Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change migration to use django's get_model #1313

Merged
merged 1 commit into from May 14, 2021

Conversation

daviddavis
Copy link
Contributor

fixes #8656

Please be sure you have read our documentation on creating PRs:
https://docs.pulpproject.org/contributing/pull-request-walkthrough.html

@pulpbot
Copy link
Member

pulpbot commented May 7, 2021

Attached issue: https://pulp.plan.io/issues/8656

@bmbouter
Copy link
Member

bmbouter commented May 7, 2021

I ran two tests on this. Both worked completely for me.

Upgrade and expected to remove sha1 and md5

  1. Install a fresh pulpcore==3.10.0 (the one without this migration at all) and pulp_file==1.6.1 and set ALLOWED_CONTENT_CHECKSUMS = ['md5', 'sha1', 'sha224', 'sha256', 'sha384', 'sha512']
  2. Sync a file repo
  3. verify in the db that all checksums are calculated and present
  4. Stop pulp and modify the setting to `ALLOWED_CONTENT_CHECKSUMS = ['sha224', 'sha256', 'sha384', 'sha512']
  5. upgrade to this PR and latest pulp_file which includes running this migration
  6. verify in the db that sha1 and md5 have been removed and pulp starts without complaint

Upgrade and expected to add sha1 and md5

  1. Install a fresh pulpcore==3.10.0 (the one without this migration at all) and pulp_file==1.6.1 and set ALLOWED_CONTENT_CHECKSUMS = ['sha224', 'sha256', 'sha384', 'sha512']
  2. Sync a file repo
  3. verify in the db that all checksums are calculated and present except for sha1 and md5
  4. Stop pulp and modify the setting to `ALLOWED_CONTENT_CHECKSUMS = ['md5', 'sha1', 'sha224', 'sha256', 'sha384', 'sha512']
  5. upgrade to this PR and latest pulp_file which includes running this migration
  6. verify in the db that sha1 and md5 have been calculated and are present and pulp starts without complaint

Copy link
Member

@bmbouter bmbouter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also read through the code. This all look good to me. Thank you so much @daviddavis !

if artifacts_qs:
Artifact.objects.bulk_update(objs=artifacts_qs, fields=[checksum], batch_size=1000)
if paths:
_logger.warn(
Copy link
Member

@ipanova ipanova May 10, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

potentially this warning message will be printed as many times as len of ALLOWED_CONTENT_CHECKSUMS

Applying core.0059_proxy_creds... OK
  Applying core.0060_data_migration_proxy_creds... OK
pulp [None]: pulpcore.app.migrations.0061_call_handle_artifact_checksums_command:WARNING: Missing files needed to update artifact checksums: ['ina/path/to/artifact/']. Please run 'pulpcore-manager handle-artifact-checksums'.
pulp [None]: pulpcore.app.migrations.0061_call_handle_artifact_checksums_command:WARNING: Missing files needed to update artifact checksums: ['ina/path/to/artifact/']. Please run 'pulpcore-manager handle-artifact-checksums'.
pulp [None]: pulpcore.app.migrations.0061_call_handle_artifact_checksums_command:WARNING: Missing files needed to update artifact checksums: ['ina/path/to/artifact/']. Please run 'pulpcore-manager handle-artifact-checksums'.
pulp [None]: pulpcore.app.migrations.0061_call_handle_artifact_checksums_command:WARNING: Missing files needed to update artifact checksums: ['ina/path/to/artifact/']. Please run 'pulpcore-manager handle-artifact-checksums'.
pulp [None]: pulpcore.app.migrations.0061_call_handle_artifact_checksums_command:WARNING: Missing files needed to update artifact checksums: ['ina/path/to/artifact/']. Please run 'pulpcore-manager handle-artifact-checksums'.
pulp [None]: pulpcore.app.migrations.0061_call_handle_artifact_checksums_command:WARNING: Missing files needed to update artifact checksums: ['ina/path/to/artifact/']. Please run 'pulpcore-manager handle-artifact-checksums'.
  Applying core.0061_call_handle_artifact_checksums_command... OK

you can make paths a set() and move the if paths out of the outer forloop.
But this is a neat-pick, i am happy either way.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concern is that sets use a large amount of memory. I think having duplicate warnings is probably acceptable trade off.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the error message to specify which checksum it's missing artifact files for so the error message should no longer be duplicated.

@cutwater
Copy link
Contributor

From my perspective the whole approach of running this task in the migration tree is wrong. This leads to nondeterministic migration application time, which may also lead to failure due to timeout or out of memory on large databases. It fetches entire queryset in memory and then while iterating over it fetches all related content re-calculating checksums for it. Normally these kind of tasks should be performed as a background tasks, not causing the whole service unavailability, until the migration is complete.

Also the migration has an external dependency to site settings (ALLOWED_CONTENT_CHECKSUMS).

@daviddavis
Copy link
Contributor Author

We talked about this during our pulpcore meeting. I'll try to sum it up. The three options we considered:

  1. Change the migration to a noop and have users manually run the handle-artifact-checksums command when they upgrade
  2. Run the migration but if it fails, proceed anyway and then have users manually run the handle-artifact-checksums command
  3. Run the migration and raise an error if it fails

I think no one favored option 3 since it would leave the system in a bad state where users couldn't run the handle-artifact-checksums command if there was a failure. I think people were on the fence between options 1 and 2 but in the end we agreed on 2 as a compromise since should it fail, the migrations should proceed anyway. Users can run pulpcore-manager migrate core 0061 --fake to skip the migration as well. I think a large factor was the fact that we support variety of different installations where running a manual command during an upgrade would be problematic.

@daviddavis daviddavis force-pushed the issue8656 branch 3 times, most recently from b40f944 to ce7056f Compare May 13, 2021 16:58
@daviddavis
Copy link
Contributor Author

Ok, I added a wrapper to allow the migration to proceed if there's any failure and also made the perf improvements that were suggested.

Copy link
Member

@mdellweg mdellweg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see all comments addressed, and even the management command was improved with the iterator and batching.

@daviddavis daviddavis merged commit 20fa101 into pulp:master May 14, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants