Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Taught PulpImporter to retry (once) on _import_file() failure. #1278

Merged
merged 1 commit into from May 18, 2021

Conversation

ggainey
Copy link
Contributor

@ggainey ggainey commented Apr 27, 2021

There's a race condition in django-import-export's get_or_init_instance()
that is exercised by importing repo-versions concurrently. We attempt
an import and check for errors, retrying ONCE if encountered. On a
second error, fail the attempt.

The test added for pulp_rpm #7904 cover this case.

fixes #8633
ref #7904
[nocoverage]

@ggainey ggainey force-pushed the 8633_import_export_uq_retry branch from 15a17f4 to 90c0a31 Compare April 27, 2021 13:59
@pulpbot
Copy link
Member

pulpbot commented Apr 27, 2021

Attached issue: https://pulp.plan.io/issues/8633

ggainey added a commit to ggainey/pulp_rpm that referenced this pull request Apr 27, 2021
Ordering-on-export insures we won't deadlock at import-time
when repo-versions have overlapping content.

Accepting this change without the referenced pulpcore fix
will result in fixing the deadlock, but generating unique-constraint
violations in the same place at import-time.

fixes #7904

Requires PR: pulp/pulpcore#1278
ggainey added a commit to ggainey/pulp_rpm that referenced this pull request Apr 27, 2021
Ordering-on-export insures we won't deadlock at import-time
when repo-versions have overlapping content.

Accepting this change without the referenced pulpcore fix
will result in fixing the deadlock, but generating unique-constraint
violations in the same place at import-time.

fixes #7904

Requires PR: pulp/pulpcore#1278
ggainey added a commit to ggainey/pulp_rpm that referenced this pull request Apr 27, 2021
Ordering-on-export insures we won't deadlock at import-time
when repo-versions have overlapping content.

Accepting this change without the referenced pulpcore fix
will result in fixing the deadlock, but generating unique-constraint
violations in the same place at import-time.

fixes #7904

Required PR: pulp/pulpcore#1278
)
# Second try will either succeed, or raise an error
try:
a_result = _import_file(os.path.join(rv_path, filename), res_class)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will either succeed (as in have no errors) or raise an Exception, right?
In that case, there is no use for the while True: construct.

Instead, i would put the first invocation in a try construct, catch the specific constraints exception and do one retry in the except branch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left the while-true construct in place because one thought I had was "is retrying exactly once the right thing to do?" I haven't convinced myself that there might be scenarios where multiple-retries are the Right Answer - but neither have I convinced myself one is The Way, either. Any thoughts?

RE retry-inside-except - mmm, yeah could do that. I try to avoid code that expects exceptions, inside an except: block, just because of Bad Experiences with nested exceptions in the past. May well be cleaner/more understandable here to do so tho, especially if "try just once" is the way we go.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also get rid of while loop in case we end up re-trying once.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And done - @ipanova @mdellweg re-review please?

There's a race condition in django-import-export's get_or_init_instance()
that is exercised by importing repo-versions concurrently. We attempt
an import and check for errors, retrying ONCE if encountered. On a
second error, fail the attempt.

The test added for pulp_rpm #7904 cover this case.

fixes #8633
[nocoverage]
@ggainey ggainey force-pushed the 8633_import_export_uq_retry branch from 90c0a31 to 6c013a1 Compare May 11, 2021 02:19
Copy link
Member

@mdellweg mdellweg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. One cosmetic suggestion, however.

try:
log.info(_("Importing file {}.").format(fpath))
with open(fpath, "r") as json_file:
data = Dataset().load(json_file.read(), format="json")
resource = resource_class()
log.info(_("...Importing resource {}.").format(resource.__class__.__name__))
return resource.import_data(data, raise_errors=True)
return resource.import_data(data, raise_errors=do_raise)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could call that variable raise_errors for consistency.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm. I actually don't like exposing parameter names - esp someone else's parameter-names - in this way. To me, the parameter is telling _import_files() to either raise any errors found, or to deal with them. _import_files() is choosing to implement that contract by relying on what import_data() does - but that's an implementation detail.

Parameter names belong to the API that implement them, exposing them up the tree feels like a leaky/broken encapsulation to me.

@bmbouter bmbouter merged commit d621ebf into pulp:master May 18, 2021
ggainey added a commit to pulp/pulp_rpm that referenced this pull request May 18, 2021
Ordering-on-export insures we won't deadlock at import-time
when repo-versions have overlapping content.

Accepting this change without the referenced pulpcore fix
will result in fixing the deadlock, but generating unique-constraint
violations in the same place at import-time.

fixes #7904

Required PR: pulp/pulpcore#1278
@ggainey ggainey deleted the 8633_import_export_uq_retry branch May 10, 2022 12:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants