Purge the artifacts in the cache directory when they are not being used #5924

Open
adityasood opened this issue Feb 28, 2019 · 3 comments

@adityasood
Contributor

Issue Type
  • Feature enhancement
Summary

Currently, GoCD does not clean up the artifacts stored in the artifacts/cache directory. On setups where each job run generates large artifacts, this can consume a lot of disk space on the GoCD server and starve the server itself. The server then preemptively deletes artifacts, which causes some jobs to fail.

@ketan
Member

ketan commented Mar 1, 2019

I did some quick analysis of the current code and here's a summary of the existing implementation:

  1. The stages table in the DB keeps track of the stages whose artifacts have been purged. We run a query to find the oldest stages that can be purged.

  2. For each stage that can be purged, GoCD will attempt to clear the artifacts dir for that stage, and the cache dir. If a delete fails (because of open file handles, among other reasons), the cleanup will silently continue. This stage will then be marked as "purged" so we don't clean it up again in the future.

    This implementation has another side effect: if an artifact cache is created after a stage has been marked as "purged", the cache will never be cleaned up.


A possible solution:

ZipArtifactCache (via its superclass ArtifactCache) maintains a list of artifacts currently being zipped. We can possibly use this information to make the object a bit smarter by:

  • adding an LRU cache, to know when a zip file was last accessed
  • adding a reference count of which files are being borrowed (for the purpose of sending them to clients requesting the file)

When a file in the cache expires after a set (configurable) TTL, it can be safely purged (as long as the reference count for that file is 0).
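
A rough sketch of that bookkeeping, to make the idea concrete. This is not GoCD's actual `ArtifactCache` API: the class `ExpiringArtifactCache` and its methods `borrow`, `release`, and `purgeExpired` are hypothetical names, and it tracks only a last-access timestamp per file rather than a full LRU ordering. Timestamps are passed in explicitly so the behaviour is easy to test, and deleting the files from disk is left to the caller.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only: names and structure are assumptions, not GoCD code.
class ExpiringArtifactCache {
    // Per-file bookkeeping: when it was last touched and how many clients hold it.
    private static final class Usage {
        long lastAccessMillis;
        int borrowers;
    }

    private final Map<String, Usage> usages = new HashMap<>();
    private final long ttlMillis;

    ExpiringArtifactCache(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    // Called when a cached zip starts being served to a client.
    synchronized void borrow(String path, long nowMillis) {
        Usage usage = usages.computeIfAbsent(path, key -> new Usage());
        usage.lastAccessMillis = nowMillis;
        usage.borrowers++;
    }

    // Called when the client is done; the TTL clock restarts from this point.
    synchronized void release(String path, long nowMillis) {
        Usage usage = usages.get(path);
        if (usage == null || usage.borrowers == 0) {
            throw new IllegalStateException("release without matching borrow: " + path);
        }
        usage.borrowers--;
        usage.lastAccessMillis = nowMillis;
    }

    // Returns the paths whose TTL has elapsed and that nobody is borrowing,
    // and stops tracking them. The caller would delete these files from disk.
    synchronized List<String> purgeExpired(long nowMillis) {
        List<String> purged = new ArrayList<>();
        Iterator<Map.Entry<String, Usage>> it = usages.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Usage> entry = it.next();
            Usage usage = entry.getValue();
            boolean expired = nowMillis - usage.lastAccessMillis >= ttlMillis;
            if (expired && usage.borrowers == 0) {
                purged.add(entry.getKey());
                it.remove();
            }
        }
        return purged;
    }
}
```

Because a borrowed file is skipped even when its TTL has elapsed, a periodic call to `purgeExpired` would not hit the open-file-handle failure described above; the file is simply picked up on a later pass once it has been released and has aged out again.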

@bdpiprava bdpiprava modified the milestones: Release: Near term, Release 19.3.0 Mar 1, 2019
@adityasood
Contributor Author

@ketan's fix solves the problem of open file handles.

@maheshp maheshp added this to To do in 19.3.0 Mar 7, 2019
@maheshp maheshp removed this from To do in 19.3.0 Apr 22, 2019
@maheshp maheshp added this to To do in 19.4.0 Apr 22, 2019
@rajiesh rajiesh removed this from To do in 19.4.0 May 27, 2019
@maheshp maheshp added this to To do in 19.5.0 Jun 6, 2019
@maheshp maheshp removed this from To do in 19.5.0 Jun 10, 2019
@maheshp maheshp added this to To do in 19.6.0 Jun 10, 2019
@maheshp maheshp removed this from To do in 19.6.0 Jul 17, 2019
@maheshp maheshp modified the milestones: Release 19.6.0, Release: Near term Jul 17, 2019
@stale

stale bot commented Apr 1, 2020

This issue has been automatically marked as stale because it has not had activity in the last 90 days.
If you can still reproduce this error on the master branch using a local development environment or on the latest GoCD release, please reply with all of the information you have about it in order to keep the issue open.
Thank you for all your contributions.

@stale stale bot added the stale label Apr 1, 2020
@stale stale bot closed this as completed Apr 8, 2020
@chadlwilson chadlwilson added the no stalebot, enhancement, and artifacts labels and removed the stale label Jun 29, 2023
@chadlwilson chadlwilson reopened this Jun 29, 2023
@chadlwilson chadlwilson removed this from the Release: Near term milestone Jun 29, 2023