Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Can't go past 54 units after going through critical checks and muting them #4983

Closed
iafan opened this issue Jul 27, 2016 · 18 comments
Closed

Comments

@iafan
Copy link
Contributor

iafan commented Jul 27, 2016

I've got the same problem twice in a row. I'm on latest stable Windows / Chrome.

Here are the steps:

  1. I clicked on the red pill in the project browser to go through errors. The URL I clicked on was https://translate.evernote.com/projects/mac_v5_evernote/translate/#filter=checks&category=critical
  2. Muted the check (clicked 'X' with the mouse), then pressed Ctrl+Down to go to the next unit. Repeated 54 times.
  3. Pagination control says 54 / 194 and there are no more units loaded at the bottom.

Javascript console shows no errors. This is the log of XHR requests:

  1. Right from the start: https://translate.evernote.com/xhr/stats/checks/?path=%2Fprojects%2Fmac_v5_evernote%2F
  2. Immediately after (1st page of units): https://translate.evernote.com/xhr/units/?path=%2Fprojects%2Fmac_v5_evernote%2F&category=critical&filter=checks
  3. alternating edit/ and toggle/ requests for 18 units
  4. request for the second page of units: https://translate.evernote.com/xhr/units/?path=%2Fprojects%2Fmac_v5_evernote%2F&category=critical&filter=checks&offset=18&previous_uids=3807041&previous_uids=4992561&previous_uids=3807045&previous_uids=4992562&previous_uids=3807047&previous_uids=3807048&previous_uids=4992563&previous_uids=3807050&previous_uids=3807051&previous_uids=4992564&previous_uids=3807053&previous_uids=3807054&previous_uids=4992565&previous_uids=3807056&previous_uids=3807057&previous_uids=3807058&previous_uids=3809590&previous_uids=4992593
  5. alternating edit/ and toggle/ requests for 18 units
  6. request for the third page of units: https://translate.evernote.com/xhr/units/?path=%2Fprojects%2Fmac_v5_evernote%2F&category=critical&filter=checks&offset=25&previous_uids=3809592&previous_uids=4992594&previous_uids=3809594&previous_uids=4992595&previous_uids=3809596&previous_uids=3809597&previous_uids=4992596&previous_uids=3809599&previous_uids=3809600&previous_uids=4992597&previous_uids=3809602&previous_uids=3809603&previous_uids=4992598&previous_uids=3809605&previous_uids=3809606&previous_uids=3809607&previous_uids=4992626&previous_uids=3812141
  7. alternating edit/ and toggle/ requests for 18 units

That's all. There's no request for new units past the third page.

I still have the browser page open in this state, so if you want me to run some debug JS, let me know.

See also a totally unrelated bug: #4625

@iafan
Copy link
Contributor Author

iafan commented Jul 28, 2016

One thing I noticed is that in XHR there is a pretty strange offset at step 6: offset=25. Why 25 and not 36? Could it be that because of this offset the returned set of units confuses some client or server logic and the client in the end thinks we reached the end of the list?

@iafan
Copy link
Contributor Author

iafan commented Aug 1, 2016

From my experiments, this bug is 100% reproducible. I can never go past 54 units when muting checks.

@dwaynebailey
Copy link
Member

@iafan this is on a live site, do you have a staging site we can use? I can't seem to login to the EN staging instance.

@iafan
Copy link
Contributor Author

iafan commented Aug 1, 2016

@dwaynebailey I recreated your admin user entrу on stage — do you remember your username/password? Go to https://translate.stage.evernote.com/projects/, log in, click the "Fix critical errors" link in the action toolbar, and go through 54 units, muting the check on each of them.

@dwaynebailey
Copy link
Member

@iafan managed to replicate. Will play a bit to see if there is a simpler way to replicate.

Also noted that you can't unmute muted checks, not sure if that is related to this bug though.

@iafan
Copy link
Contributor Author

iafan commented Aug 2, 2016

Also noted that you can't unmute muted checks

Indeed. Will you file a bug or should I do this?

@dwaynebailey
Copy link
Member

@iafan if you can validate please file bug. I wasn't 100% sure if I was seeing mirage or not.

@iafan
Copy link
Contributor Author

iafan commented Aug 3, 2016

Ok, filed #5010

@julen
Copy link
Contributor

julen commented Aug 3, 2016

This might be related: while translating on MLO using the incomplete filter, I got stuck at 96/181 (I skipped some units manually in the process). Although now that I check the about page, it reports version 2.7.3b1, so I have no way to really confirm whether it's the same issue or not.

@dwaynebailey
Copy link
Member

@julen mlo is on master. The VERSION variable hasn't been updated correctly on master it seems. So possibly you saw the same issue.

@julen julen added this to the 2.8.0 milestone Aug 3, 2016
@julen
Copy link
Contributor

julen commented Aug 3, 2016

As a heads-up, I retried on the updated MLO the same what I did this morning and I could reproduce the issue again (stuck at 80/82).

@iafan
Copy link
Contributor Author

iafan commented Oct 2, 2016

I think the bug is somewhere in this code (and in the approach behind it in general): https://github.com/translate/pootle/blob/master/pootle/apps/pootle_store/unit/search.py#L171-L179

Server tries to rely on an offset to the search results that can unpredictably change from one execution to another. This can possibly work only in the case of a stable set (no synchronization in background + only navigating through units without altering their state that would affect the filter + no one else editing these units). Any single step off that path, and you can get all sorts of odd behavior.

The only reliable solution in addressing this that comes to my mind would be to keep the snapshot of uids on the client and not redo any search on requesting the new page of units. It would work as follows:

  1. whenever the user goes into translation UI, server returns them a vector of uids capped at e.g. 1000 units (this is more than enough for one pass, be it review or translation). This adds about 10KB to the unzipped page source, which should be ok.
  2. client renders the UI, and as user navigates through the units, it sends requests to the server specifying exact uids for the units it needs full information about.
  3. when/if the user reaches the 1000th unit in a single pass, and if there were more than this number of units in the filter initially, the UI can render a button at the bottom of the table, by clicking on which the user can redo the search (reapply the filter) and start with unit 1. The effect of this button would be the same as going back to the browser table and then clicking the pill link again.

In addition to fixing the original problem, this approach has several advantages:

  1. Smaller number of expensive SQL queries
  2. Ability to implement a "continuously scrolling" translation UI without showing just 9 units above/below.

@dwaynebailey
Copy link
Member

@iafan this is much like the model we moved away from for some very good technical reasons including that data "can unpredictably change from one execution to another" (search for untranslated units and suddenly you have a translated one that someone else visited), it was really expensive and killed users browsers. Choosing a bigger offset doesn't actually change this issue, you're still looking at potentially out of date data if the backend data changed while you were hanging around in your 1000 units. Starting from 1 assumes a certain type of dataset, I don't think any translator wants to revisit a few 100 units that they've already evaluated. The code already does some work to find out where the offset would restart while you were busy.

The SQL queries are limited but still expensive and still not 100% correct over time. Smaller query sets are more correct over time as their footprint is just way smaller.

Continuous scrolling is tangential to this issue as nothing in the current model would prevent that being implemented.

My view is that a more correct model is to actually allow the units in the current offset to be updated when any changes happen on the server that might remove them from the dataset. That would allow the editor to safely remove them and adjust its expected offset. This coupled with the editor being able to request the next offset while busy with the last few units of the current offset would ensure a smooth transition from offset to offset and allow continuous scrolling to happen.

@iafan
Copy link
Contributor Author

iafan commented Oct 3, 2016

Note that even if you request smaller chunks, you still get them from sorted sets, so DB has to sort the entire set of units just to return a few of them. And now you do this on each page load.

Anyway, just wanted to provide my additional input based on the analysis of the code. The main takeaway here is that this doesn't seem to be a bug which is easy to fix; this requires significant change to the approach. I think we'll internally experiment with the approach I outlined above, and we'll pay attention to browser memory use.

@julen
Copy link
Contributor

julen commented Nov 6, 2016

Just wanna raise that during the Paris hackathon I've hit this many times in MLO while going through units and editing them.

@dwaynebailey
Copy link
Member

This is reproducible with a test file of broken critical checks made up on N repeats of:

msgid "One 55 %file"
msgstr "Een 55 %leêr "

This has broken printf() and endwhitespace.

The results show some consistant patterns which I think are linked to the counts for the unit batches we are retrieving. I suspect this is an offset issue where muting checks is somehow not adjusting expectations, or over adjusting expectations.

Just needed some pretty brutal clicky testing and a painfully consistent test file to replicate consistently. Personally I don't think its hard to fix once the pattern emerges.

@dwaynebailey
Copy link
Member

Small script I used to make a file to reproduce the bug. This is designed to have one critical check and one non-critical check. The number in the unit is designed to match the unit counts in the editor. The code below will produce 19 PO units, adjust the range to N+1 as needed for more units.

# -*- coding: utf-8 -*-
from translate.storage import po

pofile = po.pofile()
for i in range(1, 20):
    unit = pofile.addsourceunit("One %s %%file" % i)
    unit.target = "Een %s %%leêr " % i

with open("54units.po", "w") as outfile:
    pofile.serialize(outfile)

@dwaynebailey
Copy link
Member

Fixed in #5798 clicky test results have been updated to reflect the outcomes with the new changes.

The issue was indeed offset and not methodology.

Bottom line: reproducible test, JS debugging, a head to work through the offset issues and patience to clicky testing was what this bug actually needed to get itself fixed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants