Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add table to cross check user migration status #136

Merged
merged 3 commits into from
Feb 5, 2020
Merged

Conversation

jrconlin
Copy link
Member

This branch adds support to look in a migration table for a user's
fxa_uid record. If it is found, the server will return a 5xx error to
the client preventing sync operations.

NOTE: IT IS IMPORTANT THAT THE SERVER NOT RETURN A 2xx RESULT TO ANY
CLIENT IN MIGRATION STATUS UNLESS THE meta/global BSO IS RESET.

The migration table is created by the user_migration script, which
is part of syncstorage-rs.

@jrconlin jrconlin requested review from rfk and a team January 23, 2020 20:38
@jrconlin jrconlin added this to In Review in Services Engineering via automation Jan 23, 2020
Copy link
Contributor

@rfk rfk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "request changes" here is specifically about the risky behaviour of swallowing errors; the rest of my comments are nits.

Two other things for us to think about here:

  • This adds a db lookup on every storage request, even the ones like /info/collections that currently only hit memcache. Is that going to be a problem for the servers in practice? I'd like to get @Micheletto's (or a delegate's) thoughts on that front, since the memcache is definitely there for a reason.
  • How will this affect self-hosters? I don't think there is any SQL migration infra in this repo currently, so as-s they would have to create the new table by hand.
    • Perhaps we can gate this behind a pref in config e.g. storage.allow_user_migration, default it to False, and only do the lookup if that pref is set explicitly to True?

syncstorage/storage/sql/dbconnect.py Outdated Show resolved Hide resolved
syncstorage/tests/test_sql.py Outdated Show resolved Hide resolved
syncstorage/views/__init__.py Show resolved Hide resolved
syncstorage/views/decorators.py Outdated Show resolved Hide resolved
syncstorage/storage/sql/__init__.py Outdated Show resolved Hide resolved
syncstorage/storage/sql/__init__.py Outdated Show resolved Hide resolved
syncstorage/tests/test_sql.py Outdated Show resolved Hide resolved
syncstorage/views/decorators.py Outdated Show resolved Hide resolved
@rfk
Copy link
Contributor

rfk commented Jan 24, 2020

Please also consider the additional test from #137

Copy link
Contributor

@rfk rfk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm down to leaving nits on your docstrings, so I think this is good to go from my perspective ;-)

Two non-nit comments that I want to call out (but that don't need to go through another round of review):

  • The deletion of the "old storage" test suite
  • The unconditional addition of testing data during table setup

I realized that one of my previous concerns (self-hosters knowing to create the table) is probably going to work out OK in practice, because they typically have create_tables=True, and that will work fine in this case without any special migration logic since we're just adding a new table.

I remain lightly concerned about the perf impact of doing a db lookup on every request, even ones that were previously only hitting memcache. But that's all theoretical until we try it out, so let's push ahead and find out.

If we do find the per is unacceptable in practice, I think there's a way to avoid doing this lookup for /info/collections requests, which make up the bulk of the memcache-only requests. But let's cross that bridge if and when we come to it.

syncstorage/storage/sql/__init__.py Outdated Show resolved Hide resolved
def is_migrating(self, session, user):
"""Returns if the user is migrating.

This should probably NOT be a cached call, since migration
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In retrospect, it's not clear what you mean here by "be a cached call". Do you mean that it shouldn't be cached internally by this class, or not cached by callers of this class, or something else?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly? Either. The problem right now is that migration is independent of the server, so a migration may begin during an active sync session. Since I don't believe that the migration server has the ability to alter memcache as well as the database, I don't believe that the server can properly cache this state. (I would be thrilled to be proven wrong here.)

I'm also not going to disagree that having the server check the table on call isn't fantastic. The alternate is possibly letting some corruption leak into the user record. I'm already trying my best to limit the potential by making the potential damage radius as small as possible.

I guess it comes down to keeping the server safe or performant during migration.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. Just to be clear, I'm not trying to argue for any sort of caching mechanism here - it just wasn't clear as a reader of the docstring exactly what it was trying to tell me to do or not do. I would support a more direct imperative like "Do not cache the result of this call, since...".

syncstorage/storage/sql/dbconnect.py Outdated Show resolved Hide resolved
syncstorage/storage/sql/dbconnect.py Outdated Show resolved Hide resolved
syncstorage/tests/functional/support.py Outdated Show resolved Hide resolved
syncstorage/tests/functional/test_old_storage.py Outdated Show resolved Hide resolved
metadata,
Column("fxa_uid", String(255), primary_key=True, nullable=False),
Column("started_at", BigInteger, nullable=False),
Column("state", String(32)) # unknown/NULL, in_progress, complete
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Micheletto, any concern about the space overhead of using a string here rather than e.g. a smallint enum?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I'm fine with whatever. I just need to make sure that this is coordinated with the other PRs related to migration.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bumping this

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jrconlin do we need to open a new PR here(and issue) to get this change in since it looks like this merged already?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

syncstorage/tests/test_wsgiapp.py Show resolved Hide resolved
@jrconlin
Copy link
Member Author

jrconlin commented Feb 3, 2020

Not going to argue about the potential performance impact. I'm really hoping that the migration state table is VERY temporary and we can disable it once migration has occurred. If you want, I might be able to put it behind an option flag since most stand-alone folk are probably not going to be migrating their data to spanner anytime soon.

@jrconlin
Copy link
Member Author

jrconlin commented Feb 3, 2020

waiting on @Micheletto's response regarding the state field.

@rfk
Copy link
Contributor

rfk commented Feb 4, 2020

If you want, I might be able to put it behind an option flag since most stand-alone folk
are probably not going to be migrating their data to spanner anytime soon.

Self-hosters are probably least likely to be impacted by perf issues here, since they have so few users. It may be worth an option for Mozilla China though.

@jrconlin
Copy link
Member Author

jrconlin commented Feb 4, 2020

Agreed about China. Let me see what I can do about that.

@@ -363,6 +379,7 @@ def __init__(self, sqluri, create_tables=False, pool_size=100,
collections.create(self.engine, checkfirst=True)
user_collections.create(self.engine, checkfirst=True)
batch_uploads.create(self.engine, checkfirst=True)
migration.create(self.engine, checkfirst=True)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Micheletto How have we done migrations (if any) in the past? Is create_tables = true (triggering this block) always specified in the config file?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I understand, migrations aren't always triggered, which is why i trap around any call to the migration table. The table can also be built by the migration script, which we will have a bit more control over.

This branch adds support to look in a `migration` table for a user's
`fxa_uid` record. If it is found, the server will return a 5xx error to
the client preventing sync operations.

NOTE: IT IS IMPORTANT THAT THE SERVER NOT RETURN A 2xx RESULT TO ANY
CLIENT IN MIGRATION STATUS UNLESS THE `meta`/`global` BSO IS RESET.

The `migration` table can be created by the `user_migration` script, which
is part of `syncstorage-rs`.

To enable migration support add the following flag to the `ini` file:

```ini
[storage]
allow_migration = true
```
@jrconlin
Copy link
Member Author

jrconlin commented Feb 5, 2020

Going to land this because @Micheletto is going to be out for a while. Can address any concerns of his later.

@jrconlin jrconlin merged commit bd00249 into master Feb 5, 2020
Services Engineering automation moved this from In Review to Done Feb 5, 2020
@tublitzed tublitzed moved this from Done to Archived in Services Engineering Feb 11, 2020
@jrconlin jrconlin deleted the migration branch February 13, 2020 01:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Development

Successfully merging this pull request may close these issues.

None yet

5 participants