Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Assign WCA IDs when posting results #5123

Open
wants to merge 1 commit into
base: master
from

Conversation

@tussosedan
Copy link
Contributor

tussosedan commented Feb 1, 2020

Fixes #21

Once merged, we can run the following code to assign all the matching WCA IDs already in the system that aren't linked yet -- almost 19 thousand of them can be linked according to my tests. The command ran for ~20 minutes on my laptop.

However, first we need to handle some weird data of 4 registrations of non-existing users. As it crashes the code in an unexpected way, we can update the registrations to be deleted for now.

Registration.where.not(user_id: User.all).update_all(deleted_at: Time.now, deleted_by: 277, accepted_at: nil) 
Competition.all.order(registration_close: :desc).each {|comp| comp.registrations.accepted.each { |registration| registration.user.maybe_assign_wca_id_by_results(comp, notify=false) } }
@Jambrose777

This comment has been minimized.

Copy link
Contributor

Jambrose777 commented Feb 2, 2020

Does this check if the WCAID is already assigned to a different account?

@tussosedan

This comment has been minimized.

Copy link
Contributor Author

tussosedan commented Feb 2, 2020

@Jambrose777 of course! The matches.first.user.nil? part checks that the person in the results is not connected to a user, meaning the WCA ID is not assigned to any account.

@SAuroux

This comment has been minimized.

Copy link
Contributor

SAuroux commented Feb 2, 2020

There are however 17233 (!!!) non-unique name/DOB combinations in the users table right now:

SELECT COUNT(*) FROM (SELECT name, dob, count(id) as num FROM users GROUP BY name, dob) t WHERE num > 1

How do you deal with this?

@tussosedan

This comment has been minimized.

Copy link
Contributor Author

tussosedan commented Feb 3, 2020

@SAuroux if the user that was used for an accepted registration matches one and only one person in the results by name + DOB + country + gender, and if that WCA ID is not assigned to any user, then that user gets assigned the WCA ID. The other possibly duplicate users are ignored here, as they were not the ones used for registration. What do you think?

@SAuroux

This comment has been minimized.

Copy link
Contributor

SAuroux commented Feb 3, 2020

Hm, that actually sounds pretty good to me. Not perfect though, as you can still have multiple such accounts. What happens in that case? My suggestion would be to take the account most recently used for such a registration.
Apart from that, how is this going to work for future registrations? I.e. when does the check get executed?

@tussosedan

This comment has been minimized.

Copy link
Contributor Author

tussosedan commented Feb 4, 2020

@SAuroux this process happens when results are posted, in the same place that so far only sent emails to newcomers to claim their WCA IDs -- so it only happens to the user account used for an accepted registration for the comp that gets its results posted.

@SAuroux

This comment has been minimized.

Copy link
Contributor

SAuroux commented Feb 4, 2020

Thanks for clarifying that @tussosedan, that sounds good too. What about the tie question?

@tussosedan

This comment has been minimized.

Copy link
Contributor Author

tussosedan commented Feb 5, 2020

@SAuroux I'm not sure what tie do you mean, could you give an example?

@SAuroux

This comment has been minimized.

Copy link
Contributor

SAuroux commented Feb 5, 2020

@tussosedan what you said was

if the user that was used for an accepted registration matches one and only one person in the results by name + DOB + country + gender, and if that WCA ID is not assigned to any user, then that user gets assigned the WCA ID.

When you run the code for a specific competition (whose results where just posted) there can of course be only one such user. But when running the code for all past competitions, you can have multiple such users for one WCA ID.

Copy link
Member

jonatanklosko left a comment

I'm glad this turned out very clean and I think the logic is perfectly fine. Happy to hear another opinion as this is a relatively significant change 🚀

@tussosedan tussosedan force-pushed the tussosedan:fix-21-link-wca-ids branch from 4f7dca0 to 9ab7f60 Feb 8, 2020
@tussosedan

This comment has been minimized.

Copy link
Contributor Author

tussosedan commented Feb 8, 2020

Merged conflicts.

Copy link
Member

viroulep left a comment

Thanks a lot for addressing this very long standing issue @tussosedan!

In general the code LGTM.

I'm a bit concerned with having something run 20 minutes against the db, we definitely need to make sure it won't interact badly with the WRT's work (ie: not have CAD and this run at the same time as regular users would probably notice some slow usage of the website).
I'm not totally familiar with the WRT workload but I'd assume most results of the weekend are posted by Wednesday and it should be an appropriate day to deploy this @SAuroux?
In which case we can schedule a time frame where we avoid posting results and we deploy this.

Registration.where.not(user_id: User.all).update_all(deleted_at: Time.now, deleted_by: 277, accepted_at: nil)

Yeah it may be better to put my id there instead of yours :p

WcaOnRails/app/models/user.rb Show resolved Hide resolved
Copy link
Member

viroulep left a comment

The previous comment was meant as an approving review, I probably messed up the selected radio when commenting around...

@tussosedan

This comment has been minimized.

Copy link
Contributor Author

tussosedan commented Feb 9, 2020

Great! It may run much faster on the AWS instances than on my laptop (hopefully...), it would be great to test it on staging.

@SAuroux

This comment has been minimized.

Copy link
Contributor

SAuroux commented Feb 9, 2020

@viroulep @tussosedan If all you need is a 20-30 minute time window feel free to go ahead right now, WRT's queue is empty right now!

@tussosedan tussosedan force-pushed the tussosedan:fix-21-link-wca-ids branch from 9ab7f60 to 1c2200e Feb 9, 2020
Copy link
Member

viroulep left a comment

Awesome thanks @tussosedan!

@SAuroux I'll check that on staging first, I'll ping the WRT for an available time frame before doing anything to prod.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

6 participants
You can’t perform that action at this time.