Bug 1240041 - Fix Top Sites concurrency issues #1642
1210e65 to
e75ee34
|
I fired up the simulator with an existing synced account, went to delete the first top site, and got a crash. I have a feeling it's crashing because I have more than 12 top sites and the deletion doesn't account for the new tile that comes in after deleting the first one. |
|
I think it might have to do with this removal: e75ee34#diff-abfc8d88f8fbb974154bce59e22c81a6L218. It checks to see if we have enough sites to add in another tile, so instead of just a deletion, it's a delete + insert. |
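To make the delete + insert case concrete, here is a minimal sketch of how the two changes could be computed for one deletion. It is not the code from the PR: `TileChanges`, `changesForDeletion`, and the parameter names are hypothetical, and it assumes a spare site is available only when the data source holds more sites than fit on screen. It is written in current Swift syntax.

```swift
/// Describes the collection view changes for one tile deletion.
/// (Hypothetical type, for illustration only.)
struct TileChanges: Equatable {
    let deletedIndex: Int
    let insertedIndex: Int?   // nil when no spare site is available
}

/// `totalSites` is the number of sites in the data source before the
/// deletion; `visibleLimit` is how many tiles fit on screen.
func changesForDeletion(at index: Int, totalSites: Int, visibleLimit: Int) -> TileChanges {
    // With more sites than tiles, the first hidden site slides into the
    // last visible slot, so the number of tiles stays the same.
    let hasSpare = totalSites > visibleLimit
    return TileChanges(deletedIndex: index,
                       insertedIndex: hasSpare ? visibleLimit - 1 : nil)
}
```

If only the deletion is reported while the item count stays at 12, the collection view sees a count that doesn't match the updates it was given, which would be consistent with the crash described above.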
|
Nice catch -- I emptied my sites from testing deleting so much and forgot to try on a full set of sites ;) |
|
Though that presents another question: what do we show? If our cache is larger than the number of thumbnails that fit, we can use the next site in the list to fill the gap (though there's no guarantee it will be the same site after the actual requery). Best case, it matches. Otherwise, it doesn't, and the top sites change on us after we click Done. Worse: what happens if the cache isn't larger than the number of thumbnails? We'll end up with gaps where the deleted sites are, and they won't be filled until we click Done. |
|
I think the most consistent thing to do is that we let you delete whatever you see, and when you exit edit mode we fill in the gaps. But that's annoying: if your results are crap, you will have to re-enter edit mode multiple times to get rid of the junk on your top sites. I think the least annoying thing to do, then, is to keep streaming in new results. As @sleroux notes, to handle that you'd need to compute the new results, and in the same animation block switch the data source, diff, remove the deleted item, and then insert the new one. We could more smoothly handle the common case by pre-fetching extras (perhaps we already do?), or pre-fetching the next N as soon as the user enters edit mode or has deleted N-M, with the goal that there's always a filled |
|
This is some gnarly stuff, but here goes (going out of order for ease of explanation): Part 2: First, we move the refresh to happen after the deletion. This gives us a consistent data set to work with during the deletion while making sure we also keep our buffer up-to-date if we delete more. To prevent multiple quick deletions from conflicting with each other, we hold the But there's a problem: since we pull in the latest sites after each deletion, and since the sites/order can change for any Sync updates or location changes, we have no guarantees about which site we're getting back from the last visible index. It's possible that some new sites have shown up in the meantime, pushing down the existing visible sites, so the "new" site we're pulling in is actually a site we're already showing. That's where Part 1 comes in: we now update, rather than replace, the existing data source -- preserving the order in the process. That means any new sites will always be appended to the existing set of sites. Finally, Part 3 is a simple extension to this |
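The "update, rather than replace" idea from Part 1 could look roughly like the sketch below. This is only an illustration under assumptions: `Site` is a stand-in for the app's model, `mergeSites` is a hypothetical name, sites are matched by URL, and dropping sites that no longer appear in the fresh query is a choice made here, not something stated above.

```swift
// Hypothetical stand-in for the app's site model; only the URL matters here.
struct Site: Equatable {
    let url: String
}

/// Merges a freshly queried list into the sites already on screen.
/// Sites that are still present keep their current position, sites missing
/// from the query are dropped (an assumption), and unseen sites are
/// appended at the end in query order.
func mergeSites(existing: [Site], fresh: [Site]) -> [Site] {
    let freshURLs = Set(fresh.map { $0.url })
    var merged = existing.filter { freshURLs.contains($0.url) }
    var seen = Set(merged.map { $0.url })
    for site in fresh where !seen.contains(site.url) {
        merged.append(site)
        seen.insert(site.url)
    }
    return merged
}
```

Because new sites only ever land after the existing ones, the site pulled into the last visible slot can't be one that is already showing.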
|
Logically it makes more sense than what was happening before but I was able to find two edge cases where this breaks down:
[bnicholson: edited to scale down images] |
|
Thanks for testing. Yeah, I came across similar issues last night. Two more parts: Part 5: After adding Part 4, I noticed that new sites would always appear after suggested sites unless I opened Top Sites a second time (open the panel, Cancel, open the panel again). The reason is that we do two queries to the DB when we open the panel: the first pass gets the stale set of sites, and the second pass gets the updated set of sites. But since we hold onto the position of the suggested sites after the first pass, the So, long story short: we should never show suggested sites before normal sites if we're simply loading the panel (unless your top history sites actually include a suggested site domain, of course). Deletions may cause suggested sites to appear before normal sites because we want the animation to be smooth, and new sites may have been added to the DB since the panel was open. But that will fix itself the next time we open Top Sites. Depending on what we decide for bug 1257592, that could also have an effect here. Also, it was possible to hit duplicates because the |
| self.deleteOrUpdateSites(result, indexPath: indexPath) | ||
| self.collection?.userInteractionEnabled = true | ||
| profile.history.removeSiteFromTopSites(site).uponQueue(dispatch_get_main_queue()) { _ in | ||
| // Remove the site from the current data source. Don't requery yet |
You should check isSuccess on the input here. If the removal failed, bail out.
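A minimal sketch of what that check could look like, assuming the callback receives a value with an `isSuccess` property. `RemovalResult` and `applyRemoval` are hypothetical stand-ins rather than the app's own types, and the syntax is current Swift rather than the version used in the diff.

```swift
// Hypothetical stand-in for the result type delivered to the callback.
enum RemovalResult {
    case success
    case failure(String)

    var isSuccess: Bool {
        if case .success = self { return true }
        return false
    }
}

/// Removes the site at `index` from the local list only if the storage
/// removal succeeded. Returns true when the list was changed.
func applyRemoval(_ result: RemovalResult, at index: Int, to sites: inout [String]) -> Bool {
    // Bail out on failure (or a stale index) so the UI keeps matching storage.
    guard result.isSuccess, sites.indices.contains(index) else {
        return false
    }
    sites.remove(at: index)
    return true
}
```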
|
I hate to say "LGTM" on a PR like this, but: modulo those nits, if it stands up to the testing you've done, this looks fine to me. |
|
Speaking of testing, I'll throw in a few UI tests that can verify some of the things here (order of incoming sites after deletions, maintaining the positioning of suggested sites, etc.). |
|
Did some more testing and I'm seeing a lot of flickering on the tiles when they are being reloaded. Seems that the tiles that come in after deletion have their favicons flicker on the next deletion. Were you seeing this as well? I'm not sure if this was happening before or not. |
|
FWIW on all branches I've seen icons flicker in every time top sites loads, sometimes. We're probably doing something wrong somewhere :/ |
|
@sleroux Yeah, I see the flicker when deleting here. This happens because we now call @rnewman Related: what's happening is we often fetch (and call Separate bug, but obvious options are:
|
Bug 1240041 - Fix Top Sites concurrency issues
|
Merged with tests and comments addressed: f39f2d8 |
As mentioned this morning, the issue here is that sync triggers while we're acting on a stale UI, causing the UI to be out-of-sync with the data source count. There are two places this can affect us:
refreshTopSites, which can result in inconsistency between the stale post-deletion data and the newly updated sync data.
To address the first case, I proposed we simply drop sync notifications while the view is visible; @rnewman suggested postponing Sync (and other) refreshes until we're done editing. I went with the latter here since that's closer to our current behavior, and we can discuss how we want to handle Sync (and location change) updates in bug 1257592.
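As a rough illustration of postponing refreshes until editing is done, the sketch below remembers that a refresh was requested and runs it once on leaving edit mode. `RefreshGate` and its methods are hypothetical names, not the PR's, and the sketch assumes all calls arrive on the main thread.

```swift
/// Defers refresh requests that arrive while the panel is in edit mode.
/// All names here are illustrative.
final class RefreshGate {
    private(set) var isEditing = false
    private var hasPendingRefresh = false
    private let refresh: () -> Void

    init(refresh: @escaping () -> Void) {
        self.refresh = refresh
    }

    /// Called when a Sync (or similar) notification asks for fresh data.
    func requestRefresh() {
        if isEditing {
            hasPendingRefresh = true   // remember it, act on it later
        } else {
            refresh()
        }
    }

    func beginEditing() {
        isEditing = true
    }

    /// Leaving edit mode runs at most one deferred refresh.
    func endEditing() {
        isEditing = false
        if hasPendingRefresh {
            hasPendingRefresh = false
            refresh()
        }
    }
}
```

Any number of notifications received during editing collapse into a single requery afterwards, so the data source count can't change underneath a deletion in progress.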
The fix to the second case is to simply act on the existing data source instead of pulling it in after a deletion, which can cause bad things to happen. Right now, for sync additions, we hit bug 1257291; for sync deletions, we'll crash.