Disk space is not released after the runner completes #4056

Closed
3 of 4 tasks
XuehaoSun opened this issue Apr 24, 2025 · 2 comments
Labels
bug Something isn't working gha-runner-scale-set Related to the gha-runner-scale-set mode needs triage Requires review from the maintainers

Comments

@XuehaoSun

Checks

Controller Version

0.11.0

Deployment Method

Helm

Checks

  • This isn't a question or user support case (For Q&A and community support, go to Discussions).
  • I've read the Changelog before submitting this issue and I'm sure it's not due to any recently-introduced backward-incompatible changes

To Reproduce

Run docker compose in a CI job on a runner that uses dind mode.

Describe the bug

We use dind mode so that we can run docker compose in CI.
The outputs of df -h and sudo du -sh /* differ significantly: df -h reports about 700GB in use, while sudo du -sh /* adds up to only about 200GB. I know this is not direct evidence on its own, so I compared four k8s nodes, two of which run ARC. Only the two nodes running ARC show this problem.
After the machine is restarted, disk usage returns to normal.
We also checked the output of lsof | grep delete and found nothing.
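
For anyone who wants to repeat the deleted-file check without lsof, here is a minimal Python sketch that walks /proc and lists open descriptors whose target ends in " (deleted)". It assumes a Linux node. The function name find_deleted_open_files and its proc_root parameter are made up for illustration and are not part of ARC or of any tool mentioned in this issue. It only sees processes in the current PID namespace, so it would need to run on the node itself as root rather than inside a runner pod.

```python
import os


def find_deleted_open_files(proc_root="/proc"):
    """Return (pid, fd, target) for open descriptors whose file was unlinked.

    Hypothetical helper: roughly what `lsof | grep deleted` reports on Linux.
    """
    hits = []
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return hits  # no procfs here (not Linux, or not mounted)
    for pid in entries:
        if not pid.isdigit():
            continue  # skip non-process entries such as "meminfo" or "self"
        fd_dir = os.path.join(proc_root, pid, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we lack permission to inspect it
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # descriptor was closed while we were scanning
            # The kernel appends " (deleted)" to the link target of an
            # open file that no longer has a directory entry.
            if target.endswith(" (deleted)"):
                hits.append((int(pid), int(fd), target))
    return hits
```

An empty list from find_deleted_open_files() would agree with the lsof result above; it rules out unlinked files held open by visible processes, but not other ways space can stay allocated.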

  • df -h
    Image
  • sudo du -sh /*
    Image
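
To put a number on the gap between the two commands above, something like the following Python sketch could be run as root on an affected node. It compares the used bytes the filesystem reports (what df shows) with the allocated bytes reachable by walking the directory tree on that same filesystem (roughly what du -x shows). The names visible_bytes and usage_gap are hypothetical, and the 512-byte unit for st_blocks is a Linux assumption.

```python
import os
import shutil


def _lstat(path):
    """lstat that returns None instead of raising for vanished/unreadable entries."""
    try:
        return os.lstat(path)
    except OSError:
        return None


def visible_bytes(root):
    """Sum allocated bytes under root, staying on root's filesystem (like du -x)."""
    root_dev = os.lstat(root).st_dev
    seen = set()  # (device, inode) pairs, so hard-linked files count once
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        kept = []
        for name in dirnames:
            st = _lstat(os.path.join(dirpath, name))
            if st is not None and st.st_dev == root_dev:
                kept.append(name)
                total += st.st_blocks * 512  # assumption: 512-byte units (Linux)
        dirnames[:] = kept  # do not descend into other mounts
        for name in filenames:
            st = _lstat(os.path.join(dirpath, name))
            if st is None or st.st_dev != root_dev:
                continue
            key = (st.st_dev, st.st_ino)
            if key in seen:
                continue
            seen.add(key)
            total += st.st_blocks * 512
    return total


def usage_gap(mount_point):
    """Return (used, visible, used - visible) in bytes for one filesystem."""
    used = shutil.disk_usage(mount_point).used  # what df reports as "Used"
    visible = visible_bytes(mount_point)
    return used, visible, used - visible
```

Calling usage_gap("/") returns the bytes reported as used, the bytes found by walking the tree, and the difference. A large positive difference corresponds to the kind of df/du mismatch described here; it does not by itself say what is holding the space.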

Describe the expected behavior

Disk space is released after the runner completes, without having to restart the machine.

Additional Context

K8s Rev: v1.29.15

githubConfigUrl: ""
githubConfigSecret:
  github_token: ""
maxRunners: 16
minRunners: 1
containerMode:
  type: "dind"
template:
  spec:
    volumes:
      - name: model-path
        hostPath:
          path: /data2
    containers:
      - name: runner
        image: runner-name:latest
        command: ["/home/runner/run.sh"]
        volumeMounts:
          - name: model-path
            mountPath: /data2
        resources:
          requests:
            cpu: "40"
            memory: "200Gi"
          limits:
            cpu: "40"
            memory: "200Gi"

Controller Logs

No exceptions or errors were found

Runner Pod Logs

No exceptions or errors were found
XuehaoSun added the bug, gha-runner-scale-set, and needs triage labels on Apr 24, 2025
Contributor

Hello! Thank you for filing an issue.

The maintainers will triage your issue shortly.

In the meantime, please take a look at the troubleshooting guide for bug reports.

If this is a feature request, please review our contribution guidelines.

@nikola-jokic
Collaborator

Hey @XuehaoSun,

This issue is not related to ARC. ARC is only responsible for spinning up the runner and removing it once it is done.
As far as I can tell from the issue, once the runner is removed, the next one might land on a different machine, which would give a different result for available space (but I cannot be certain).
I will close this issue since it is unrelated to ARC.
