Change migration to use django's get_model #1313
Attached issue: https://pulp.plan.io/issues/8656
I ran two tests on this, and both worked completely for me:
- Upgrade, expecting sha1 and md5 to be removed
- Upgrade, expecting sha1 and md5 to be added
I also read through the code. This all looks good to me. Thank you so much, @daviddavis!
pulpcore/app/migrations/0061_call_handle_artifact_checksums_command.py
    if artifacts_qs:
        Artifact.objects.bulk_update(objs=artifacts_qs, fields=[checksum], batch_size=1000)
    if paths:
        _logger.warn(
Potentially this warning message will be printed as many times as the length of ALLOWED_CONTENT_CHECKSUMS:
Applying core.0059_proxy_creds... OK
Applying core.0060_data_migration_proxy_creds... OK
pulp [None]: pulpcore.app.migrations.0061_call_handle_artifact_checksums_command:WARNING: Missing files needed to update artifact checksums: ['ina/path/to/artifact/']. Please run 'pulpcore-manager handle-artifact-checksums'.
pulp [None]: pulpcore.app.migrations.0061_call_handle_artifact_checksums_command:WARNING: Missing files needed to update artifact checksums: ['ina/path/to/artifact/']. Please run 'pulpcore-manager handle-artifact-checksums'.
pulp [None]: pulpcore.app.migrations.0061_call_handle_artifact_checksums_command:WARNING: Missing files needed to update artifact checksums: ['ina/path/to/artifact/']. Please run 'pulpcore-manager handle-artifact-checksums'.
pulp [None]: pulpcore.app.migrations.0061_call_handle_artifact_checksums_command:WARNING: Missing files needed to update artifact checksums: ['ina/path/to/artifact/']. Please run 'pulpcore-manager handle-artifact-checksums'.
pulp [None]: pulpcore.app.migrations.0061_call_handle_artifact_checksums_command:WARNING: Missing files needed to update artifact checksums: ['ina/path/to/artifact/']. Please run 'pulpcore-manager handle-artifact-checksums'.
pulp [None]: pulpcore.app.migrations.0061_call_handle_artifact_checksums_command:WARNING: Missing files needed to update artifact checksums: ['ina/path/to/artifact/']. Please run 'pulpcore-manager handle-artifact-checksums'.
Applying core.0061_call_handle_artifact_checksums_command... OK
You can make paths a set() and move the `if paths` check out of the outer for loop.
But this is a nitpick; I am happy either way.
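The suggestion above can be sketched roughly as follows. This is a minimal, standalone illustration and not the migration's actual code: `find_missing_paths`, `artifacts_by_checksum`, and `file_exists` are hypothetical names, and plain strings stand in for artifact records.

```python
import logging

_logger = logging.getLogger(__name__)


def find_missing_paths(artifacts_by_checksum, file_exists):
    """Collect missing artifact paths across all checksums and warn once.

    Hypothetical stand-in: `artifacts_by_checksum` maps a checksum name to the
    file paths of the artifacts that need that checksum recalculated.
    """
    missing = set()  # a set drops paths that are missing for several checksums
    for checksum, paths in artifacts_by_checksum.items():
        for path in paths:
            if not file_exists(path):
                missing.add(path)

    # The check sits outside the outer loop, so the warning is emitted at most once.
    if missing:
        _logger.warning(
            "Missing files needed to update artifact checksums: %s. "
            "Please run 'pulpcore-manager handle-artifact-checksums'.",
            sorted(missing),
        )
    return sorted(missing)
```

The set keeps every missing path in memory until the loops finish, which is the cost of printing the warning only once.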
My concern is that sets use a large amount of memory. I think having duplicate warnings is probably an acceptable trade-off.
I updated the error message to specify which checksum it's missing artifact files for so the error message should no longer be duplicated.
From my perspective, the whole approach of running this task in the migration tree is wrong. It leads to nondeterministic migration application time, which may also lead to failure due to a timeout or running out of memory on large databases. It fetches the entire queryset into memory and then, while iterating over it, fetches all related content and recalculates its checksums. Normally these kinds of tasks should be performed as background tasks, so that the whole service is not unavailable until the migration completes. Also, the migration has an external dependency on site settings (
We talked about this during our pulpcore meeting. I'll try to sum it up. The three options we considered:
I think no one favored option 3, since it would leave the system in a bad state where users couldn't run the handle-artifact-checksums command if there was a failure. I think people were on the fence between options 1 and 2, but in the end we agreed on 2 as a compromise: should it fail, the migrations would proceed anyway. Users can run
b40f944 to ce7056f
fixes #8656
Ok, I added a wrapper to allow the migration to proceed if there's any failure, and also made the perf improvements that were suggested.
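For readers following along, a wrapper of this kind might look roughly like the sketch below. It is an assumption about the general shape only: `run_best_effort` is a hypothetical name, and the wrapper actually added in this PR may differ.

```python
import logging

_logger = logging.getLogger(__name__)


def run_best_effort(step, *args, **kwargs):
    """Run `step`, logging and swallowing any failure. Return True on success."""
    try:
        step(*args, **kwargs)
    except Exception:
        # Log the traceback but do not re-raise, so the caller (here, the
        # migration) can carry on and the user can retry the command later.
        _logger.exception(
            "Updating artifact checksums failed; continuing anyway. "
            "Please run 'pulpcore-manager handle-artifact-checksums'."
        )
        return False
    return True
```

Catching `Exception` rather than using a bare `except` still lets `KeyboardInterrupt` and `SystemExit` stop the process.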
I see all comments addressed, and even the management command was improved with the iterator and batching.
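As a rough illustration of the iterator-plus-batching idea mentioned here, the sketch below processes rows lazily in fixed-size chunks instead of loading everything at once. `in_batches` is a hypothetical helper, not code from this PR; in Django the lazy source would typically be `QuerySet.iterator()`, for which a plain iterable stands in.

```python
from itertools import islice


def in_batches(rows, batch_size=1000):
    """Yield lists of at most `batch_size` items from any iterable, lazily."""
    iterator = iter(rows)
    while True:
        # islice pulls only the next `batch_size` items, so at most one
        # batch is held in memory at a time.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch
```

Each yielded batch could then be updated and written back in one call, keeping peak memory proportional to the batch size and not to the table size.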