Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added Support for handling multiple 'q' parameters #105

Closed
wants to merge 16 commits into from
Closed

Added Support for handling multiple 'q' parameters #105

wants to merge 16 commits into from

Conversation

Axle7XStriker
Copy link

Currently, APy doesn't support multiple q's as parameters when given for translation. I have resolved this by storing multiple responses(local) first and then sending all at once in a global response. Each parameter q is iteratively translated by a given langpair and an error is raised if at least one of the translation fails.

refs #86

@coveralls
Copy link

coveralls commented Apr 5, 2018

Pull Request Test Coverage Report for Build 886

  • 16 of 24 (66.67%) changed or added relevant lines in 2 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.1%) to 45.439%

Changes Missing Coverage Covered Lines Changed/Added Lines %
apertium_apy/handlers/translate_raw.py 0 8 0.0%
Totals Coverage Status
Change from base Build 884: 0.1%
Covered Lines: 1054
Relevant Lines: 2095

💛 - Coveralls

@Axle7XStriker
Copy link
Author

One of the problems which were there in the previous PR was how the response was structured as I was sending the whole local response in the responseData field instead of just translatedText field. The other problem was the auto_finish problem. In it, tornado automatically invokes the finish() whenever it gets the response even if its a local response. Although I still don't clearly understand how the yield is working in the code and I am also not sure if my handling of the finish() is correct or not.

@sushain97
Copy link
Member

sushain97 commented Apr 5, 2018

When you do foo = yield bar it gets stored like you want it. When you do yield bar it's like a return. You might have to do some magic to allow all the translations to occur at the same time. In JS, the equivalent would be, Promise.all. There's certainly some equivalent mechanism we can use here.

@Axle7XStriker
Copy link
Author

Finally, I was able to find the equivalent which you were referring to. I found out that tornado provides a similar functionality in the form of gen.multi_future() which does the same thing as Promise.all. I also found out the reason behind the finish() being called twice, that was because before sending the final response I didn't check, whether the response is empty (because of certain error in resolving language pair) or filled previously but now that is also resolved with this commit.

Copy link
Member

@sushain97 sushain97 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking pretty good! Some minor remarks.

@@ -157,26 +157,38 @@ def translate_and_respond(self, pair, pipeline, to_translate, mark_unknown, nosp
before = self.log_before_translation()
translated = yield pipeline.translate(to_translate, nosplit, deformat, reformat)
self.log_after_translation(before, len(to_translate))
self.send_response({
self.clean_pairs()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means that we will clean pairs for each individual q. Is that what we want?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly, I don't clearly get what clean pairs mean. Can you please explain this function.

Copy link
Member

@sushain97 sushain97 Apr 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It restarts pipes that are highly contentious and old. Take a look at the code path and it should be relatively clear. Honestly, I think translate_and_respond can be removed and its functionality collapsed into get so that cleaning pairs and logging can be done once instead of many times. Right now, we're doing the same thing a bunch of times inside the loop.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If that's the case, then I think clean pairs can be done in get only. Also, since gen.multi_future guarantees resolving futures in parallel, isn't that more efficient than collapsing translate_and_respond into get?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They're not mutually exclusive. Don't use the yield from translate_and_respond and you get a future.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So what do you suggest we do about this? and should I put the clean pairs in get only?

Copy link
Member

@sushain97 sushain97 Apr 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Collapse would be good and yes.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While collapsing should I use yield to get the translated text and create a local response in the same loop or use gen.multi_future and create the local response in a different loop before creating the global response.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The latter since then you will block on all the translations rather than a single one.

reformat=reformat)
query_list = self.get_arguments('q')
langpair = self.get_argument('langpair')
list_of_futures = []
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

list_of_futures -> translations or translation_futures would be clearer

langpair = self.get_argument('langpair')
list_of_futures = []
for query in query_list:
pair = self.get_pair_or_error(langpair, len(query))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line and the next three can go outside the loop, yes?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking of keeping it outside the loop but then I saw that we had to pass the length of the query string to be translated which is used for some logging purposes. Not sure what that is used for though. And other lines after that are dependent on this line, that is why I had to put it inside the loop instead of keeping it outside. What do you suggest?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sending the sum of the lengths should be OK.

@sushain97
Copy link
Member

Also, please add a test that verifies multiple q params works correctly. It should just be 3-5 LOC.

@Axle7XStriker
Copy link
Author

There is an Incompatible type assignment error which is coming on running mypy, so to fix this I think I should use the send_response function in the if condition only but the thing I don't get is, in the previous commits also I had done the same thing only but it passed there so why isn't it passing now?
Also, for the test case, I will open a new PR as soon as I am done with it.

@sushain97
Copy link
Member

What exactly is the type error? Also, tests should go in the same PR so I can verify that they pass.

@Axle7XStriker
Copy link
Author

By type error, I mean that in line 181, I am assigning a dict to a variable which contains a list on which an error is raised by mypy. But this error was not raised in Fixed twice Finish() call commit which also had the same line of code.

tests/test.py Outdated
response = self.fetch_translation(['welcome', 'respect', 'serve'], 'eng|spa')
translations = []
for translation_response in response['responseData']:
translations.append(translation_response['responseData']['translatedText'])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use a list comprehension or map, depending on your preference.

@sushain97
Copy link
Member

Yeah, I think mypy is being a bit trigger happy here. I suggest just using an inline ternary. i.e.

'responseData': response if len(response) > 1 else response[0]['responseData']

I'm not a huge fan of variables changing type anyway.

translation_futures.append(pipeline.translate(query, nosplit=False, deformat=deformat, reformat=reformat))
self.log_after_translation(before, len(query))

translations = yield gen.multi_future(translation_futures)
Copy link
Member

@sushain97 sushain97 Apr 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think L170 onwards needs to be indented.

if pair is not None:
    return

is also fine. I like early returns but it's really up to you.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to clear something you mean

if pair is None:
    return 

right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yeah. My bad.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fine, I'll do it right now.

@sushain97
Copy link
Member

@unhammer see any issues?

@unhammer
Copy link
Member

What about subclass users of translate_and_respond?

@sushain97
Copy link
Member

sushain97 commented Apr 14, 2018

Good point, missed that. Looks like translate_raw is the only subclass user ( and we don't have tests for it :( ). I think that we should have a new @property in the base handler along the lines of mark_unknown and replicate the functionality in translate_raw since aside from the mark_unknown = ... code, the rest is pretty standard.

@Axle7XStriker
Copy link
Author

Although, I get that we need to replicate the functionality in translate_raw, but I don't get specifying the @property for mark_unknown in the base handler. Can you please explain that.

@sushain97
Copy link
Member

I don't get specifying the @Property for mark_unknown in the base handler. Can you please explain that.

Create a method in the base translation handler called mark_unknown that returns a boolean. Then, use the @property decorator on it so that it can be referred to as self.mark_unknown (i.e. without the parens).

@@ -60,6 +60,10 @@ class BaseHandler(tornado.web.RequestHandler):
def initialize(self):
self.callback = self.get_argument('callback', default=None)

@property
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not here, in the base translation handler. Looks good otherwise

@sushain97 sushain97 requested a review from unhammer April 20, 2018 19:35
@sushain97 sushain97 force-pushed the master branch 2 times, most recently from b36ac97 to 51fb9c5 Compare January 3, 2022 10:56
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants