New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Store 2-letter country codes in database #113
Conversation
2d77bb9
to
b46f3e3
Compare
b46f3e3
to
fb44df0
Compare
6c6b0b9
to
57f74ee
Compare
This PR has been updated to allow both alpha2 and alpha3 codes for new rows. We can keep this version on production, update eos-activation-server to send alpha2 codes, and then merge the |
|
This PR includes migrations for ping and activation, and #112 includes a migration for metrics. It means that we don’t need to rebase these deployments, and that we can deploy #112 and #113 in any order. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I tested it in the test environment (where there are no pings nor activations) and in the dev environments (with just 2 activations and 3 pings). My only concern is the migration command of ping_v1
in production where there are 23M records, but I don't think that trying to do that in raw SQL is better.
57f74ee
to
1f5405d
Compare
for chunk_number, chunk in enumerate(query, start=1): | ||
for ping in chunk: | ||
ping.country = tranform_alpha_3_into_2(ping.country) | ||
dbsession.commit() | ||
progress(chunk_number * args.chunk_size, num_records) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Didn't we have an issue with another similar migration recently where this proved extremely slow? Might it perform better to do something like (pseudocode):
for alpha_3, alpha_2 in COUNTRIES.items():
dbsession.execute("UPDATE activation_v1 SET country = ? WHERE country = ?", alpha_2, alpha_3)
which would allow the work to live in the database, rather than having to stream all 23 million rows (in chunks) in and out of the database?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose we can try it and see.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I see @adarnimrod had the same observation above. I would naively expect the UPDATE
to perform better. I could be wrong!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose we can try it and see.
I can try on the test database.
tranform_alpha_3_into_2
Oops, there’s a typo…
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The migration takes about 3.5 days, it’s too long.
We can generate one request for each country value, that’s dirty but should be fast enough. I can try to find a better solution if you prefer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think one UPDATE per country code is a clean solution, personally :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think one UPDATE per country code is a clean solution, personally :)
😄
The new solution takes 10 minutes for pings. Tests pass, the result seems to be OK, but I’d prefer to have another review, just to be sure…
1f5405d
to
ad32d76
Compare
Countries are stored in ping_v1 and activation_v1 tables. As many third-party tools use 2-letter ISO codes, we use these commands to store alpha2 instead of alpha3 codes.
ad32d76
to
347d8d6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
This PR includes these changes:
only alpha2alpha2 and alpha3 codes when inserting new records,ping_v1
andactivation_v1
.https://phabricator.endlessm.com/T29210