Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LB-240: Dump sanitized version of the user table in the stats dump #293

Merged
merged 3 commits into from Nov 15, 2017

Conversation

paramsingh
Copy link
Collaborator

This should make it easier to import the stats dump in absence of a corresponding private dump.

Ideally, this should be reviewed and merged before #274, so that I can rebase and make the changes required to the import process after the extra table is added to the stats dump.

@paramsingh
Copy link
Collaborator Author

I've removed the three columns (auth_token, last_login and latest_import) completely from the sanitized dump. Is this the correct way to do this or is there another way?

@mayhem
Copy link
Member

mayhem commented Nov 14, 2017

I think that will still make the FK's fail upon import. We'll need to keep the columns, but put blank and/or dummy values in. For instance, in MB, when we need to dump the passwd field, we dump the data with all the passwords effectively set of "mb". We should probably do the same here.

Now the stats dump will contain dummy values for each of
the sensitive columns in the table.
Copy link
Member

@mayhem mayhem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks really nice and sane now! Well done! 😋 :octocat:

# dump in the stats dump
'\'auth_token\'', # auth token
'to_timestamp(0)', # last_login
'to_timestamp(0)', # latest_import
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was too tired yesterday when I tried to understand this -- it feels a little odd, but in the general this is a clever way of solving the problem.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks!

@mayhem mayhem merged commit 4aca739 into metabrainz:master Nov 15, 2017
@paramsingh paramsingh deleted the dump/sanitize-user branch November 15, 2017 12:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants