Ensure that pulp imports can run concurrently with orphan cleanup #2021

Open
pulpbot opened this issue Jan 17, 2022 · 4 comments

@pulpbot
Member

pulpbot commented Jan 17, 2022

Author: daviddavis (daviddavis)

Redmine Issue: 8960, https://pulp.plan.io/issues/8960


I've confirmed that when an import happens, it bumps the timestamp_of_interest for artifacts and content so that when the import gets to the repository version creation step, the artifacts/content are still there.

However, I'm not sure there isn't a race condition in django-import-export between when it selects the record(s) and when it updates them. In fact, given the async problems we've seen in the past, I think it's likely.

I know we fixed another error similar to this:

https://pulp.plan.io/issues/8633

Our fix for #8633 might save us here: it added code to retry the import if it hit errors, and that may also cover this situation, where an artifact/content goes missing before it's updated. We need to confirm this, though.

@pulpbot
Member Author

pulpbot commented Jan 17, 2022

From: daviddavis (daviddavis)
Date: 2021-06-23T19:06:25Z


This needs to be done before we make orphan cleanup non-blocking but it doesn't necessarily need to be done as part of 3.14.

@pulpbot
Member Author

pulpbot commented Jan 17, 2022

From: daviddavis (daviddavis)
Date: 2021-08-09T16:13:49Z


Per our go/no-go meeting, we decided that a potential fix could be released as a z-stream and doesn't need to block the 3.15 release.

@bmbouter
Member

@ggainey do you know if this is still an issue?

@ggainey ggainey removed the Finished? label May 10, 2022
@ggainey
Contributor

ggainey commented May 10, 2022

There is a hole at import time, for two reasons.

First, suppose we import an entity that already exists and whose TOI (timestamp-of-interest) has expired, while orphan-cleanup is running. There is a hole between "I updated the existing object" and "I added the existing object to a new repo-version" during which orphan-cleanup could decide to remove the entity.

Second, base content-export doesn't exclude "timestamp_of_interest" at export. This means that, at import time, an existing entity will have its timestamp-of-interest set to whatever it was on the upstream Pulp. That value could be anything, and could "push" an existing entity into being eligible for orphan-cleanup. Then the same hole as above exists.
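
To make the hole concrete, here is a minimal sketch in plain Python. It is not Pulp code: `Entity`, `is_orphan_candidate`, `import_row` and the 24-hour protection window are all hypothetical stand-ins, and the eligibility rule (unreferenced and TOI older than the window) is an assumption about how orphan-cleanup decides.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Entity:
    """Hypothetical stand-in for an artifact or content row."""
    name: str
    timestamp_of_interest: datetime
    in_repo_version: bool = False


def is_orphan_candidate(entity, now, protection_time):
    """Assumed rule: unreferenced and TOI older than the protection window."""
    expired = entity.timestamp_of_interest < now - protection_time
    return (not entity.in_repo_version) and expired


def import_row(entity, exported_toi):
    """Mimics an import that copies the exported TOI onto the existing row."""
    entity.timestamp_of_interest = exported_toi
    return entity


now = datetime(2022, 5, 10, 12, 0, tzinfo=timezone.utc)
protection = timedelta(hours=24)

# Existing, currently unreferenced entity with a fresh TOI.
existing = Entity("pkg-1.0", timestamp_of_interest=now - timedelta(minutes=5))
candidate_before_import = is_orphan_candidate(existing, now, protection)

# The export carried a TOI from the upstream Pulp that is a month old.
import_row(existing, exported_toi=now - timedelta(days=30))
candidate_after_import = is_orphan_candidate(existing, now, protection)

# Only once the entity is in a new repo-version is it out of reach again.
existing.in_repo_version = True
candidate_after_repo_version = is_orphan_candidate(existing, now, protection)
```

Under these assumptions the entity is a cleanup candidate only in the middle state, after the import overwrote its TOI and before it was added to a repo-version. That middle state is the window described above.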

I think we can fix this by teaching BaseContentResource to update timestamp-of-interest on every row to "now", prior to saving. In any event, we def want to look into this. The timing window isn't large, and the chance of running an import at the same time as orphan-cleanup is small, but it is a window, which means someone will eventually hit it.
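
A rough sketch of that idea, again in plain Python rather than against the real classes. `Row`, `TouchingResource`, `before_save` and `save_instance` are hypothetical names used only for illustration. In BaseContentResource the equivalent logic would presumably live in whatever pre-save hook django-import-export offers, and the exact hook name and signature depend on the library version.

```python
from datetime import datetime, timezone


class Row:
    """Hypothetical stand-in for a model instance built from an import row."""

    def __init__(self, name, timestamp_of_interest):
        self.name = name
        self.timestamp_of_interest = timestamp_of_interest
        self.saved = False

    def save(self):
        self.saved = True


class TouchingResource:
    """Hypothetical resource; not the real BaseContentResource."""

    def __init__(self, clock=None):
        # Injectable clock so the behaviour can be checked deterministically.
        self._clock = clock or (lambda: datetime.now(timezone.utc))

    def before_save(self, instance):
        # Ignore whatever TOI came from the export: the importing side is
        # interested in this row right now.
        instance.timestamp_of_interest = self._clock()

    def save_instance(self, instance):
        self.before_save(instance)
        instance.save()
        return instance


fixed_now = datetime(2022, 5, 10, 12, 0, tzinfo=timezone.utc)
stale = datetime(2021, 1, 1, tzinfo=timezone.utc)

resource = TouchingResource(clock=lambda: fixed_now)
row = resource.save_instance(Row("pkg-1.0", timestamp_of_interest=stale))
```

This only closes the window if orphan-cleanup leaves recently touched rows alone for longer than the import takes to reach the repo-version step. That is an assumption about cleanup behaviour and would need confirming.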

(Just as a note for future debuggers: if the window is hit and we "lose" content, re-running the import will recreate it. So it's at least a recoverable problem.)

@dralley dralley removed the Sprint label Jul 25, 2023