feat: User migration scripts #401

Merged
merged 15 commits into from
Jan 24, 2020

Conversation

jrconlin
Member

@jrconlin jrconlin commented Jan 7, 2020

Issue: #286

Description

Several scripts to assist in migrating users from MySQL to Spanner. See the README.md file for details on use and configuration.

Testing

  1. Set up a local syncserver, connect, and store some BSO data.
  2. Configure the move_dsns.lst config file to point to the correct MySQL and Spanner instances, e.g.
mysql://test:test@localhost/syncstorage
spanner://projects/sync-spanner-dev-225401/instances/spanner-test/databases/sync_schema3
  3. Store the known local syncserver userID in move_users.lst (e.g. 1).
  4. Run setup (python3 -m venv venv && . venv/bin/activate && pip install -r requirements.txt), then run the script with python3 migrate_user.py.
  5. The user's data records should be copied from MySQL to Spanner. Note the output line
    Processsing... # -> ###:###
    which displays the original MySQL userid and the Spanner fxa_uid:fxa_kid.
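For anyone setting up move_dsns.lst, here is a minimal sketch of how the two example DSNs above split into parts. It assumes the file holds one DSN per line and that entries are keyed by URL scheme (the traceback later in this thread shows `databases[dsn.scheme]` and `dsn.path[1:]`, which suggests something along these lines). The `read_dsns` helper and the `#` comment handling are hypothetical, not taken from the script.

```python
from urllib.parse import urlparse


def read_dsns(lines):
    """Map each DSN's scheme ("mysql", "spanner") to its parsed form.

    Hypothetical helper: assumes one DSN per line.
    """
    databases = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skipping blanks and "#" comments is an assumption
        dsn = urlparse(line)
        databases[dsn.scheme] = dsn
    return databases


example_lines = [
    "mysql://test:test@localhost/syncstorage",
    "spanner://projects/sync-spanner-dev-225401/instances/spanner-test/databases/sync_schema3",
]
dsns = read_dsns(example_lines)

# MySQL: credentials and host come from the netloc, database from the path.
mysql_dsn = dsns["mysql"]
mysql_database = mysql_dsn.path[1:]

# Spanner: urlparse puts "projects" in netloc, so rejoin it with the path
# to recover the full resource name.
spanner_dsn = dsns["spanner"]
spanner_resource = spanner_dsn.netloc + spanner_dsn.path

print(mysql_dsn.username, mysql_dsn.hostname, mysql_database)
print(spanner_resource)
```

Note that the Spanner DSN does not parse like a host-based URL: the first path segment lands in `netloc`, so both pieces are needed to rebuild the resource name.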

Issue(s)

Issue #286, mozilla-services/services-engineering#18

@jrconlin jrconlin requested a review from a team January 7, 2020 20:08
@tublitzed
Contributor

@jrconlin just to clarify, this is related to #18, not 186 or 286, y?

@jrconlin
Member Author

jrconlin commented Jan 7, 2020

Sigh, typo. I've updated to address the two issues I believe are related. (Hilariously, I got the issue correct in the git PR.)

@tublitzed
Contributor

np, just making sure I wasn't missing something

@jrconlin jrconlin mentioned this pull request Jan 9, 2020
@tublitzed
Contributor

@jrconlin following the setup instructions here, regarding step 2: where does the move_dsns.lst file live? I'm not seeing it here in this repo...

@tublitzed
Contributor

^ ah, nvm. I see now. Following the example here going to assume I create this file and that it's used as the dsns arg. Trying that now....

* switch to `logging` for logging
* use spanner `collections` table as one source of truth about
  collection ID numbers
* use spanner.transaction
* gracefully fail on user previously migrated.
* remove some dragons
@jrconlin jrconlin requested a review from pjenvey January 14, 2020 00:01
@tublitzed
Contributor

tublitzed commented Jan 14, 2020

When I run python3 migrate_user.py --dsns move_dsns.lst --users move_users.lst (with the .lst files in the same dir as the script) I see the following error:

  File "migrate_user.py", line 380, in <module>
    main()
  File "migrate_user.py", line 365, in main
    databases[dsn.scheme] = conf_db(dsn)
  File "migrate_user.py", line 135, in conf_db
    return conf_mysql(dsn)
  File "migrate_user.py", line 112, in conf_mysql
    database=dsn.path[1:]
  File "/Users/tublitzed/GIT/syncstorage-rs/tools/user_migration/venv/lib/python3.7/site-packages/mysql/connector/__init__.py", line 179, in connect
    return MySQLConnection(*args, **kwargs)
  File "/Users/tublitzed/GIT/syncstorage-rs/tools/user_migration/venv/lib/python3.7/site-packages/mysql/connector/connection.py", line 95, in __init__
    self.connect(**kwargs)
  File "/Users/tublitzed/GIT/syncstorage-rs/tools/user_migration/venv/lib/python3.7/site-packages/mysql/connector/abstracts.py", line 716, in connect
    self._open_connection()
  File "/Users/tublitzed/GIT/syncstorage-rs/tools/user_migration/venv/lib/python3.7/site-packages/mysql/connector/connection.py", line 210, in _open_connection
    self._ssl)
  File "/Users/tublitzed/GIT/syncstorage-rs/tools/user_migration/venv/lib/python3.7/site-packages/mysql/connector/connection.py", line 142, in _do_auth
    auth_plugin=self._auth_plugin)
  File "/Users/tublitzed/GIT/syncstorage-rs/tools/user_migration/venv/lib/python3.7/site-packages/mysql/connector/protocol.py", line 102, in make_auth
    auth_data, ssl_enabled)
  File "/Users/tublitzed/GIT/syncstorage-rs/tools/user_migration/venv/lib/python3.7/site-packages/mysql/connector/protocol.py", line 58, in _auth_response
    auth = get_auth_plugin(auth_plugin)(
  File "/Users/tublitzed/GIT/syncstorage-rs/tools/user_migration/venv/lib/python3.7/site-packages/mysql/connector/authentication.py", line 191, in get_auth_plugin
    "Authentication plugin '{0}' is not supported".format(plugin_name))
mysql.connector.errors.NotSupportedError: Authentication plugin 'caching_sha2_password' is not supported
(venv)

Anyone else seeing this? I'm running from OSX, MySQL 8.0.18 if that helps.

Edit: I also kept poking around and tried with GOOGLE_APPLICATION_CREDENTIALS set correctly too if that helps. Not sure if the mysql-connector needs some extra configuration in order to get caching_sha2_password working...

@tublitzed
Contributor

Looks like this might be a known issue with the caching_sha2_password plugin against newer MySQL versions

I have tried the suggested alter workaround to no avail. @jrconlin, what version of MySQL are you testing this against? Looks like 8.x might just be too new for caching_sha2_password... in which case I can try against an older version. Figured I'd check though before going through that :)

(FWIW, I don't think it matters if this script might require an older MySQL version, just noting this for anyone else trying to test this out)
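One thing that might be worth trying, offered as an unverified sketch rather than a fix: MySQL Connector/Python accepts an `auth_plugin` connection argument, so the plugin can be requested explicitly when connecting. Nothing in this thread confirms that it resolves the error above, and it would presumably also require the MySQL account to be set up for that plugin. The `mysql_connect_args` helper is hypothetical; only `database=dsn.path[1:]` is visible in the traceback.

```python
from urllib.parse import urlparse


def mysql_connect_args(dsn_text, auth_plugin=None):
    """Build keyword arguments for mysql.connector.connect() from a DSN.

    Hypothetical helper; the real conf_mysql() may differ.
    """
    dsn = urlparse(dsn_text)
    args = {
        "user": dsn.username,
        "password": dsn.password,
        "host": dsn.hostname,
        "port": dsn.port or 3306,  # 3306 is the MySQL default port
        "database": dsn.path[1:],  # strip the leading "/"
    }
    if auth_plugin:
        # Assumption: explicitly requesting a plugin the installed
        # connector supports may avoid the NotSupportedError.
        args["auth_plugin"] = auth_plugin
    return args


connect_args = mysql_connect_args(
    "mysql://test:test@localhost/syncstorage",
    auth_plugin="mysql_native_password",
)
print(connect_args)

# With the connector installed, the connection would then be opened as:
#   import mysql.connector
#   connection = mysql.connector.connect(**connect_args)
```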

pjenvey
pjenvey previously approved these changes Jan 14, 2020
Member

@pjenvey pjenvey left a comment

we can merge as is but this needs a couple more bits:

  • collection_id creation in transaction
  • (probably) handle users exceeding batch limit
  • how to handle errors during a user migration. can the entire user's move be in one transaction (so if an error happens, we cleanly abort)? and if we abort a user's migration should we retry? or just report the user id that failed for fixing up later?
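To make the last point concrete, here is a minimal sketch of the "report the failed user id for fixing up later" option. It assumes each user's move is wrapped in a single callable that either completes or raises; `migrate_users` and `fake_migrate` are hypothetical names and not part of this PR. How the callable achieves atomicity (for example, one Spanner transaction per user) is left open.

```python
import logging


def migrate_users(user_ids, migrate_one):
    """Run migrate_one(uid) for each user and return the ids that failed.

    migrate_one is assumed to be all-or-nothing for a single user, so a
    failure leaves nothing half-written for that user.
    """
    failed = []
    for uid in user_ids:
        try:
            migrate_one(uid)
        except Exception:
            # Log the traceback and keep going; the caller decides
            # whether to retry or fix these users up later.
            logging.exception("migration failed for user %s", uid)
            failed.append(uid)
    return failed


def fake_migrate(uid):
    """Stand-in for a real per-user migration; fails for user 2."""
    if uid == 2:
        raise RuntimeError("simulated failure")


failed_users = migrate_users([1, 2, 3], fake_migrate)
print("failed users:", failed_users)
```

A retry policy could be layered on top by feeding the returned list back into `migrate_users`, which keeps the retry decision out of the per-user code.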

@jrconlin
Member Author

@tublitzed My server is reporting it's "5.7.28-0ubuntu0.18.04.4", which is what I get out of apt/debian, which might be stable locked for ubuntu.

@jrconlin
Member Author

@pjenvey Fixed the collection_id to be in the same transaction, but I'm tempted to leave the other two points as future bugs, mostly because I'll need to create a few datasets to try to replicate those so I understand what the breaking points are.

* pass transaction handle to collection_id generator
* reserve collection_ids < 100
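As a rough illustration of the "reserve collection_ids < 100" rule, the sketch below hands out ids for unknown collection names starting at 100. It uses a plain dict in place of the Spanner `collections` table and does not model the transaction handle; `collection_id_for`, `FIRST_CUSTOM_ID`, and the sample data are hypothetical.

```python
FIRST_CUSTOM_ID = 100  # ids below this are reserved


def collection_id_for(name, collections):
    """Return the id for a collection name, allocating one if needed.

    `collections` maps name -> id and stands in for the real table.
    New ids are never allocated below FIRST_CUSTOM_ID.
    """
    if name in collections:
        return collections[name]
    highest = max(collections.values(), default=0)
    new_id = max(highest + 1, FIRST_CUSTOM_ID)
    collections[name] = new_id
    return new_id


# Illustrative data only: one known collection with a reserved id.
known = {"bookmarks": 7}
existing_id = collection_id_for("bookmarks", known)
first_new_id = collection_id_for("custom-a", known)
second_new_id = collection_id_for("custom-b", known)
print(existing_id, first_new_id, second_new_id)
```

In the real script the lookup and the insert would need to happen inside the same transaction, as discussed above, so that two concurrent migrations cannot allocate the same id.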
@tublitzed
Contributor

@jrconlin I downgraded to MySQL 5.7 and am still seeing the same errors from OSX. It'd probably be good to confirm that at least a few people can run this locally, so we can effectively test client behaviour internally before handing off to QA. IIRC the plan was to test locally first, so I don't want that to fall entirely to you if you're the only one who can run this :)

Also curious if maybe it's just my machine. @pjenvey are you able to run this locally?

@jrconlin
Member Author

Also kinda curious if this might be a mac thing vs. linux thing. (if it is, i am even LESS enamored with mysql).

IIUC, the scripts would be run from debian sh/bash shells, since that's what GCP provides. I'm running ubuntu, which is a debian fork, so the libraries should match up reasonably well.

@jrconlin jrconlin requested a review from pjenvey January 16, 2020 17:36
pjenvey
pjenvey previously approved these changes Jan 23, 2020
cursor = database.cursor()
cursor.execute(
"""create table if not exists migration (
fxa_uid varchar(255) NOT NULL PRIMARY KEY,
Member

With this being for the python syncstorage side to reference, I think it will be much easier to use its "legacy" userid vs syncstorage-rs's fxa_uid

Member Author

Yeah, but "userid" is kinda db dependent since it can come off of the db row. fxa_uid isn't. I'd rather there be zero chance of collision here, so I'd prefer to use fxa_uid.

Member

Cool, I forgot it's really not difficult to get at either (it's already setup on the user dict/object "fxa_uid" field).
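For reference, a small sketch of how a table keyed on fxa_uid could support "gracefully fail on user previously migrated". It uses an in-memory sqlite3 database as a stand-in for MySQL so it runs anywhere. Only the `fxa_uid` column comes from the snippet under review; the `migrated_at` column and the `ensure_table` and `mark_migrated` helpers are hypothetical.

```python
import sqlite3


def ensure_table(db):
    """Create the migration table if it does not exist yet."""
    db.execute(
        """create table if not exists migration (
        fxa_uid varchar(255) NOT NULL PRIMARY KEY,
        migrated_at integer
        )"""
    )  # migrated_at is an assumed column, for illustration only


def mark_migrated(db, fxa_uid, now):
    """Record a user as migrated; return False if already recorded."""
    try:
        with db:  # commits on success, rolls back on error
            db.execute(
                "insert into migration (fxa_uid, migrated_at) values (?, ?)",
                (fxa_uid, now),
            )
        return True
    except sqlite3.IntegrityError:
        # The primary key on fxa_uid rejects a second insert.
        return False


db = sqlite3.connect(":memory:")
ensure_table(db)
first_attempt = mark_migrated(db, "example-fxa-uid", 1579000000)
second_attempt = mark_migrated(db, "example-fxa-uid", 1579000001)
print(first_attempt, second_attempt)
```

Because the key is fxa_uid rather than a database row id, the same check works no matter which source database the user came from.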
