ci.jenkins.io disk almost full #3492

Closed
smerle33 opened this issue Apr 6, 2023 · 15 comments

Comments

@smerle33
Contributor

smerle33 commented Apr 6, 2023

While investigating which controller was impacted, we discovered that `ci.jenkins.io` had the same kind of disk usage:
/dev/sdb1    492G 443G  24G 95% /var/lib/jenkins

This needs to be fixed.

Originally posted by @smerle33 in #3491 (comment)

@smerle33
Contributor Author

smerle33 commented Apr 6, 2023

Planned steps:

  • create a snapshot
  • increase the disk size to 1024 GiB
  • grow the /var/lib/jenkins filesystem on /dev/sdb1
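
The last step above could look roughly like this once the disk itself has been enlarged. This is only a dry-run sketch: `print_grow_plan` is a hypothetical helper that prints the commands instead of running them, and it assumes the filesystem is ext4 or XFS and that the partition can be extended in place, neither of which is stated in this issue.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints (does not run) the commands that would
# grow partition <partnum> of <disk> and then the filesystem on <mountpoint>.
# Assumption: the filesystem is ext2/3/4 or xfs; check with `df -T` first.
print_grow_plan() {
  disk="$1"; partnum="$2"; fstype="$3"; mountpoint="$4"
  # growpart (cloud-utils) extends the partition to fill the enlarged disk.
  echo "growpart $disk $partnum"
  case "$fstype" in
    ext2|ext3|ext4) echo "resize2fs ${disk}${partnum}" ;;
    xfs)            echo "xfs_growfs $mountpoint" ;;
    *)              echo "unsupported filesystem: $fstype" >&2; return 1 ;;
  esac
}

# Example: print_grow_plan /dev/sdb 1 ext4 /var/lib/jenkins
```

Printing the plan first makes it easy to review with the team before touching the data disk.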

@smerle33
Contributor Author

smerle33 commented Apr 6, 2023

To grow beyond 512 GiB, we will have to switch to a 1024 GiB disk and then change disk tiers.

@dduportal
Contributor

We discovered that $JENKINS_HOME/config-history weighs 27 GB!

It's the directory where https://plugins.jenkins.io/jobConfigHistory/ stores the "config changes".

The current setup is the default one. We could apply some of the best practices from https://docs.cloudbees.com/docs/cloudbees-ci-kb/latest/best-practices/jobconfighistory-best-practices to decrease disk usage (move it to another drive?) and improve I/O performance for ci.jenkins.io (at the very least, by removing the nodes and tools from the history).

@dduportal
Contributor

Ping @MarkEWaite, could you share what you did in #2736 the last time this happened?

@MarkEWaite

I think that I deleted the history. I think that we should remove that plugin from the ci.jenkins.io instance and accept that job configuration history is not worth the disk space penalty.

@dduportal
Contributor

  • @smerle33 took a "full" snapshot of the data disk (ci-data), just in case.
  • A new temporary disk has been created and is being mounted by @smerle33 on the ci.jenkins.io VM, so that data can be moved without deleting it immediately.

@dduportal
Contributor

dduportal commented Apr 6, 2023

  • The Job Config History setup was updated to ignore nodes and buildtriggerbadge, and the associated directories were deleted from $JENKINS_HOME/config-history/
  • Result:
$ df -h .
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdb1       492G  436G   31G  94% /var/lib/jenkins
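
For the record, the directory removal can be sketched as below. `prune_config_history` is a hypothetical helper, and it assumes each ignored entry has a directory of the same name directly under the config-history root, which should be checked on disk before deleting anything. It defaults to a dry run.

```shell
#!/bin/sh
# Hypothetical helper: remove the stored history of entries that Job Config
# History no longer tracks. Any mode other than "delete" is a dry run.
# Assumption: each entry lives in <root>/<name>.
prune_config_history() {
  mode="$1"; root="$2"; shift 2
  # Refuse an empty root so we never build a path like "/nodes".
  [ -n "$root" ] || return 1
  for name in "$@"; do
    target="$root/$name"
    [ -d "$target" ] || continue
    if [ "$mode" = "delete" ]; then
      rm -rf -- "$target"
      echo "removed $target"
    else
      echo "would remove $target"
    fi
  done
}

# Example (dry run):
#   prune_config_history dry-run "$JENKINS_HOME/config-history" nodes buildtriggerbadge
```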

@dduportal
Contributor

  • The top-level directory "Infra" weighs 8.4 GB and can be shrunk.
  • Updated its configuration with the following changes:
    • Removed the following repositories: bigquery-uploader, cn.jenkins.io and community-functions
    • Set "Scan Organization Triggers" to 1 week
    • Enabled "Abort Builds" on the Orphaned Item Strategy
    • Set "Automatic branch project triggering" -> "Suppression strategy" to "For matching branches suppress builds triggered by indexing (continue to honor webhooks)" (same as "Disable PR-merge mode everywhere" #3474)

=> After cleanup, the directory is down to 5.6 GB.

@dduportal
Contributor

  • The top-level directory "Tools" weighs ~51 GB, with almost 45 GB for bom.

  • Updated its configuration with the following changes:

    • Set "Scan Organization Triggers" to 1 week
    • Enabled "Abort Builds" on the Orphaned Item Strategy
    • Set "Automatic branch project triggering" -> "Suppression strategy" to "For matching branches suppress builds triggered by indexing (continue to honor webhooks)" (same as "Disable PR-merge mode everywhere" #3474)
  • The jenkinsci/bom job had hundreds of builds on its master branch:

    • Short term, a cleanup was triggered by running the following pipeline using Replay (to force cleanup):
properties([disableConcurrentBuilds(abortPrevious: true), buildDiscarder(logRotator(numToKeepStr: '10'))])
echo 'https://github.com/jenkins-infra/helpdesk/issues/3492 investigation, retain 10 most recent builds'
$ df -h .
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdb1       492G  391G   76G  84% /var/lib/jenkins
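
To check that the build discarder did its job, the number of retained builds can be counted on disk. A minimal sketch, assuming the usual Jenkins layout where each build is a numbered directory under `<job>/builds`; `count_builds` is a hypothetical helper, and the example path is inferred by analogy with the jenkins.io path used later in this issue.

```shell
#!/bin/sh
# Count the build directories kept for one job or branch.
# Assumption: builds are numbered directories directly under <job_dir>/builds.
count_builds() {
  job_dir="$1"
  [ -d "$job_dir/builds" ] || { echo 0; return 0; }
  # Only directories whose name starts with a digit are counted; other
  # entries in builds/ (plain files, symlinks) are ignored.
  find "$job_dir/builds" -mindepth 1 -maxdepth 1 -type d -name '[0-9]*' | wc -l | tr -d ' '
}

# Example (path is an assumption):
#   count_builds /var/lib/jenkins/jobs/Tools/jobs/bom/branches/master
```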

dduportal added a commit to jenkinsci/bom that referenced this issue Apr 6, 2023
@dduportal
Contributor

As seen with @smerle33, the next "culprit" will be the top-level item "Websites", which weighs more than 30 GB!

@dduportal
Contributor

$ df -h /var/lib/jenkins/
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdb1       492G  388G   79G  84% /var/lib/jenkins
root@ci:~# du -sh /var/lib/jenkins/jobs/Websites/jobs/jenkins.io
36G     /var/lib/jenkins/jobs/Websites/jobs/jenkins.io

=> The space is taken up by the ZIP file archived for each build. Let's remove them:

$ df -h /var/lib/jenkins/
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdb1       492G  388G   79G  84% /var/lib/jenkins
$ du -sh /var/lib/jenkins/jobs/Websites/jobs/jenkins.io
36G     /var/lib/jenkins/jobs/Websites/jobs/jenkins.io

# Remove the ZIP archived files
$ cd /var/lib/jenkins/jobs/Websites/jobs/jenkins.io/branches && find . -type f -name "jenkins*.zip" -exec rm -f {} \;

# Result:
$ du -sh /var/lib/jenkins/jobs/Websites/jobs/jenkins.io
161M    /var/lib/jenkins/jobs/Websites/jobs/jenkins.io

$ df -h /var/lib/jenkins/
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdb1       492G  352G  115G  76% /var/lib/jenkins
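
Before running a similar `find ... -exec rm`, it can help to measure how much space the matching files actually take. A minimal sketch, assuming GNU find (for `-printf`); `matching_bytes` is a hypothetical helper, not something that was run during this cleanup.

```shell
#!/bin/sh
# Print the total size in bytes of regular files under <dir> whose name
# matches <pattern>. Requires GNU find for -printf.
matching_bytes() {
  dir="$1"; pattern="$2"
  # %s is the file size in bytes; printf "%.0f" avoids exponent notation
  # for large totals in some awk implementations.
  find "$dir" -type f -name "$pattern" -printf '%s\n' |
    awk '{ s += $1 } END { printf "%.0f\n", s }'
}

# Example:
#   matching_bytes /var/lib/jenkins/jobs/Websites/jobs/jenkins.io/branches 'jenkins*.zip'
```

Nothing is deleted here, so the result can be compared with the `du -sh` output before deciding to remove the files.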

@dduportal
Contributor

This issue can be closed, as we went below the 80% usage threshold (a requirement for good I/O performance).

A few improvements (to be handled as separate issues), as discussed with the team:
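
The 80% check could be scripted along these lines. A minimal sketch: `parse_usage_pct` and `below_threshold` are hypothetical helper names, and the parsing assumes the filesystem name contains no spaces.

```shell
#!/bin/sh
# Read `df` output on stdin and print the usage percentage of the first
# filesystem row as a bare integer (for example 76).
parse_usage_pct() {
  awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeed only when <pct> is strictly below <limit> (default: 80).
below_threshold() {
  pct="$1"; limit="${2:-80}"
  [ "$pct" -lt "$limit" ]
}

# Example:
#   pct="$(df -P /var/lib/jenkins | parse_usage_pct)"
#   below_threshold "$pct" || echo "disk usage at ${pct}%, cleanup needed"
```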

@dduportal
Contributor

Closing the issue as operation is finished!

@dduportal
Contributor

Post-cleanup: @smerle33 ran an ncdu analysis and found a 61 GB tgz file in $JENKINS_HOME/.bkp dating from 1 year ago (25 August 2022). We removed it, as it is not needed (and we have a snapshot).
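
Leftovers like this one can also be spotted without ncdu. A minimal sketch using find; `large_old_files` is a hypothetical helper, and the size and age values in the example are arbitrary choices, not figures from this cleanup.

```shell
#!/bin/sh
# List regular files under <dir> that are larger than <min_size> (find -size
# syntax, e.g. +1G with GNU find) and were last modified more than
# <min_age_days> days ago. -xdev keeps find on the same filesystem.
large_old_files() {
  dir="$1"; min_size="$2"; min_age_days="$3"
  find "$dir" -xdev -type f -size "$min_size" -mtime "+$min_age_days"
}

# Example: files over 1 GiB untouched for more than 180 days
#   large_old_files /var/lib/jenkins +1G 180
```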

@dduportal
Contributor

Final status:

$ df -h /var/lib/jenkins/
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdb1       492G  294G  173G  63% /var/lib/jenkins


3 participants