Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add cleanup tasks to remove stale objects #3950

Merged
merged 1 commit into from Jun 30, 2023
Merged

Conversation

mdellweg
Copy link
Member

Uploads and PulpTemporaryFiles are cleaned up. Set the *_PROTECTION_TIME settings to None in order to disable this behaviour.

fixes #3949

@mdellweg mdellweg marked this pull request as ready for review June 26, 2023 07:26
@@ -0,0 +1 @@
Added periodically executed cleanup tasks for uploads and temporary files.
Copy link
Member

@lubosmj lubosmj Jun 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to mention the setting in the changelog as well?

Set the *_PROTECTION_TIME settings to None in order to disable this behaviour.

Besides that, I, personally, would make this behaviour disabled by default for a couple of releases. Just in case.

Copy link
Member

@lubosmj lubosmj Jun 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you also add a note to the upload workflow (I think it is in the pulp_file repository? Not sure if it is even in our docs because we use CLI and uploads are translated to artifacts automatically) that as of now we will remove uncommitted uploads?


def upload_cleanup():
expiration = timezone.now() - timezone.timedelta(minutes=settings.UPLOAD_PROTECTION_TIME)
qs = Upload.objects.filter(pulp_created__lt=expiration)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having expiration equal to timezone.now() (settings.UPLOAD_PROTECTION_TIME=0) results in pulp_create__lt always being less than expiration, so we will remove uploads even though users opted for not doing that. Is my understanding correct? If yes, we need to fix it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not as such. Having it 0 will unschedule the task, but yes, there may be a window where things go south.
I'll find a different solution.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is now checked to be > 0.

Uploads and PulpTemporaryFiles are cleaned up. Set the *_PROTECTION_TIME
settings to None in order to disable this behaviour.

fixes pulp#3949
@mdellweg mdellweg merged commit 28882ff into pulp:main Jun 30, 2023
15 checks passed
@mdellweg mdellweg deleted the cleanup_tasks branch June 30, 2023 14:53

Pulp uses ``uploads`` and ``pulp temporary files`` to pass data from the api to worker tasks.
These options allow to specify a timeinterval in minutes used for cleaning up stale entries. If
set to 0, automatic cleanup is disabled, which is the default.
Copy link
Member

@ipanova ipanova Jul 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry for my late 2c, but this a bit contradicts( or confuses?) to our similar setting orphan_protection_time. when that one set to 0 it means the orphans are not protected not that setting/cleanup is disabled .
How can we make this more consistent? The naming for all these 3 settings suggests they work in a similar way.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not a contradiction, but i can see it being confusing. And I must confess, being similar is what inspired me in the first place to use similar names. The biggest difference is probably that we do not have a scheduled task for orphan cleanup (not that this would be impossible...).
Is there any way you think we could improve the documentation here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hear where you're coming from.
Unfortunately, I don't see what can be improved in the documentation. Documentation is well written per se, and if users read it, it's ok even to have behavioral differences between the settings which have similar name.

@@ -240,8 +240,10 @@

WORKER_TTL = 30

# how long to protect orphan content in minutes
# how long to protect ephemeral items in minutes
Copy link
Member

@ipanova ipanova Jul 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

especially this comment suggests that having UPLOAD_PROTECTION_TIME = 0 uploads are not protected and removed ASAP

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add "schedulable" cleanup tasks for Uploads and PulpTemporaryFiles
4 participants