Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

storage.sync: first cut at the core sync infrastructure #2892

Merged
merged 1 commit into from May 4, 2020

Conversation

mhammond
Copy link
Member

@mhammond mhammond commented Apr 1, 2020

Builds on #2810. This needs the trait from Lina's patch, but is setup to work with it. It implements the 3-way merge (although doesn't quite get deletions right yet, so there's still some work to do) - but comments welcome.

@mhammond mhammond changed the title First cut at the core sync infrastructure storage.sync: first cut at the core sync infrastructure Apr 1, 2020
@thomcc
Copy link
Contributor

thomcc commented Apr 1, 2020

Regarding how to do the two/three way merge, I'm pretty happy with how https://github.com/mozilla/application-services/blob/master/components/logins/src/update_plan.rs#L34-L60 works, and it's a high level description of the set of operations you perform for a three way merge.

https://github.com/mozilla/application-services/blob/master/components/logins/src/db.rs#L1061-L1114 is the code that calls it, and decides how to handle deletions vs missing mirror, etc. I think trying to handle it as you have is going to be tricky (but your comment says just about as much).

That said, handing things in bulk as this does is clearly not necessary, it could still be done piecemeal.

components/webext-storage/src/api.rs Outdated Show resolved Hide resolved
components/webext-storage/src/schema.rs Outdated Show resolved Hide resolved
components/webext-storage/src/sync/outgoing.rs Outdated Show resolved Hide resolved
components/webext-storage/sql/create_schema.sql Outdated Show resolved Hide resolved
components/webext-storage/sql/create_schema.sql Outdated Show resolved Hide resolved
components/webext-storage/src/sync/incoming.rs Outdated Show resolved Hide resolved
components/webext-storage/src/sync/incoming.rs Outdated Show resolved Hide resolved
components/webext-storage/src/sync/incoming.rs Outdated Show resolved Hide resolved
components/webext-storage/src/sync/incoming.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@rfk rfk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to my comments on #2810, I was poking around in here to see if I understood what's going on.

components/webext-storage/src/sync/outgoing.rs Outdated Show resolved Hide resolved
components/webext-storage/src/api.rs Outdated Show resolved Hide resolved
components/webext-storage/src/api.rs Outdated Show resolved Hide resolved
components/webext-storage/src/sync/incoming.rs Outdated Show resolved Hide resolved
// XXX - this needs more thought, and probably needs significant changes.
// Main problem is that it doesn't handle deletions - but to do that, we need
// something other than a simple Option<JsonMap> - we need to differentiate
// "doesn't exist" from "removed".
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naively, I would expect the three-way nature of three-way-merge to help us with deletions here. For example we could iterate over the keys in parent and find any that have been removed from other, indicating that they were deleted remotely.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we could do this, but TBH I'm not sure what the exact semantics are here - it's not documented, so we probably need to check how chrome behaves in these cases. I've just punted doing anything else here for now as hopefully @glasserc is tackling that as part of #2900

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summarizing the discussion from that thread, it sounds like Chrome's behavior is too coarse, and not really what we want...but also, the Kinto "client wins" strategy isn't right for us, and it's more important to maintain compatibility with what we had before than with Chrome's implementation. Is that accurate?

Since we have the mirror copy, @rfk's suggestion makes sense!

Copy link
Contributor

@linabutler linabutler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great! I left some comments here and there, but I really like this overall!

components/webext-storage/src/api.rs Outdated Show resolved Hide resolved
components/webext-storage/src/schema.rs Outdated Show resolved Hide resolved
components/webext-storage/src/sync/incoming.rs Outdated Show resolved Hide resolved
components/webext-storage/src/sync/incoming.rs Outdated Show resolved Hide resolved
}
(None, _, _) => {
// Deleted remotely. Server wins.
// XXX - WRONG - we want to 3 way merge here still!
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep.

components/webext-storage/src/sync/incoming.rs Outdated Show resolved Hide resolved
// XXX - need to work out how to consolidate "payload" vs "bso", particularly
// the timestamp, etc. For now, we represent this in ServerPayload, even though
// that's never actually stored in this way.
// XXX - this isn't going to work without other changes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, yeah, we're going to need to pass that over in Desktop. I wonder if we can "just" pass the raw decrypted JSON over XPCOM, and have the Desktop integration layer parse it into a sync15_traits::Payload (which has id, deleted, and data already on it) before handing it over to webext_storage.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I spent a bit of time noodling on this, and the result is #3067 and https://phabricator.services.mozilla.com/D73581. Basically, we have this thing called an "envelope", which is BSO metadata fields + the unparsed payload. An incoming envelope has a payload() method, which you can use to get at the payload—which you can then either deserialize into a record, or check if it's a tombstone via is_deleted, like you can in a-s sync15.

I think we can land your patch as-is now, and, assuming you like the envelope idea, go back and tweak this after.

// XXX - this needs more thought, and probably needs significant changes.
// Main problem is that it doesn't handle deletions - but to do that, we need
// something other than a simple Option<JsonMap> - we need to differentiate
// "doesn't exist" from "removed".
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summarizing the discussion from that thread, it sounds like Chrome's behavior is too coarse, and not really what we want...but also, the Kinto "client wins" strategy isn't right for us, and it's more important to maintain compatibility with what we had before than with Chrome's implementation. Is that accurate?

Since we have the mirror copy, @rfk's suggestion makes sense!

components/webext-storage/src/sync/mod.rs Show resolved Hide resolved
components/webext-storage/src/sync/outgoing.rs Outdated Show resolved Hide resolved
@linabutler linabutler changed the base branch from storage.sync-schema to master April 27, 2020 05:47
@linabutler linabutler self-requested a review April 27, 2020 06:14
@mhammond
Copy link
Member Author

phew! Thanks the the thorough review @linacambridge! I think I've addressed most review comments here - what's outstanding is:

  • Working out what the payload struct should look like - I figured it's best to do that as part of wiring up the sync bridge, and it should be quite easy to retrofit changes here.

  • A few outstanding items around merging, for which I'm going to wait for Ethan.

linabutler added a commit that referenced this pull request May 1, 2020
Currently, the only supported method is `wipe`, with all other
methods returning `NotImplemented`. We can wire this up to Mark's
syncing implementation in #2892. This also implements enough of the
glue to where it can be vendored and used in Desktop.
linabutler added a commit that referenced this pull request May 3, 2020
Currently, the only supported method is `wipe`, with all other
methods returning `NotImplemented`. We can wire this up to Mark's
syncing implementation in #2892. This also implements enough of the
glue to where it can be vendored and used in Desktop.
Copy link
Contributor

@linabutler linabutler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks so much, @mhammond, this is fantastic work! 😍😍 I love the cleanup with DataState, and all the extra comments and tests! I think we should get this landed as soon as we can, and file follow-up bugs for anything else we need—and that way, we can target @glasserc's merging PR to master.


/// Details about an incoming item.
#[derive(Debug, PartialEq)]
pub struct IncomingItem {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds perfect to me, thanks for taking a look and the thorough response! I agree that the alternatives are even worse, and the way you have this set up now is the cleanest.

/// record. This "state" is the input to calculating the IncomingAction.
#[derive(Debug, PartialEq)]
pub enum IncomingState {
/// There's an incoming item, but data for that extension doesn't exist
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love these comments, thank you so much for adding them! 😍

IncomingOnly { incoming: DataState },
/// There's an incoming item and we have data for the same extension in
/// our local store - but not in our mirror. This should be relatively
/// uncoming as it means:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: uncommon

// XXX - need to work out how to consolidate "payload" vs "bso", particularly
// the timestamp, etc. For now, we represent this in ServerPayload, even though
// that's never actually stored in this way.
// XXX - this isn't going to work without other changes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I spent a bit of time noodling on this, and the result is #3067 and https://phabricator.services.mozilla.com/D73581. Basically, we have this thing called an "envelope", which is BSO metadata fields + the unparsed payload. An incoming envelope has a payload() method, which you can use to get at the payload—which you can then either deserialize into a record, or check if it's a tombstone via is_deleted, like you can in a-s sync15.

I think we can land your patch as-is now, and, assuming you like the envelope idea, go back and tweak this after.

components/webext-storage/src/sync/outgoing.rs Outdated Show resolved Hide resolved
components/webext-storage/src/sync/outgoing.rs Outdated Show resolved Hide resolved
@mhammond mhammond marked this pull request as ready for review May 4, 2020 05:22
@mhammond mhammond added the checkin-needed Auto-merge this PR if there's an approved review and CI passes label May 4, 2020
@linabutler linabutler merged commit 5c52133 into master May 4, 2020
linabutler added a commit that referenced this pull request May 4, 2020
Currently, the only supported method is `wipe`, with all other
methods returning `NotImplemented`. We can wire this up to Mark's
syncing implementation in #2892. This also implements enough of the
glue to where it can be vendored and used in Desktop.
@mhammond mhammond deleted the storage.sync-sync branch May 28, 2020 22:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
checkin-needed Auto-merge this PR if there's an approved review and CI passes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants