[FLINK-34360][ci] Adds cleanup to job_init action #24277

Merged on Feb 7, 2024 (2 commits)

XComp (Contributor) commented Feb 6, 2024

What is the purpose of the change

The disk size of the GHA runners dropped some time between Feb 2 (see 84G example workflow) and Feb 3 (see 73G example workflow), from ~84G to ~73G. As a result, the e2e tests only had 14G of free space available instead of 24G.

This seems to be an issue specific to the GHA runners. The Azure Pipelines runners also have a disk size of 73G, but less of that disk is in use from the start. That is why we're not seeing the issue in Azure Pipelines.

The runners ship with some binaries that we don't actually use. Removing them would free a considerable amount of disk space:

  • Android: ~9G
  • CodeQL: ~5G
  • .NET: ~1G

Removing these binaries frees about 15G of additional space. There is a risk that GitHub Actions itself requires some of these dependencies, though. This change was inspired by easimon/maximize-build-space:action.yml. That action does exactly this (plus some more elaborate things, like creating LVM volumes). I didn't use the action itself because I wanted to keep the impact on the runner as small as possible. Just removing the files seems to be the easiest way to gain some disk space.

Brief change log

  • Adds a step to the job_init action that deletes the folders for Android, CodeQL, and .NET and logs the remaining disk space afterwards
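
A cleanup step of this kind could look roughly like the following shell sketch. It is not the code from this PR: the function name `free_disk_space` is hypothetical, and the directory paths in the usage comment are assumptions about where hosted runner images typically keep these tools, so they should be verified against the actual image before use.

```shell
#!/usr/bin/env bash
# Sketch only: not the PR's actual step. The function name and the paths in
# the usage comment below are assumptions made for illustration.

# Removes each given directory (if it exists) and logs disk usage afterwards.
free_disk_space() {
  local dir
  for dir in "$@"; do
    if [ -d "$dir" ]; then
      echo "Removing $dir"
      rm -rf -- "$dir"
    else
      echo "Skipping $dir (not found)"
    fi
  done
  # Log the disk usage after the cleanup so the gain is visible in the job log.
  df -h
}

# Hypothetical usage on a hosted runner (would likely need elevated privileges):
# free_disk_space /usr/local/lib/android /opt/hostedtoolcache/CodeQL /usr/share/dotnet
```

Skipping missing directories keeps the step harmless if a future runner image no longer ships one of these tools.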

Verifying this change

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): no
  • The public API, i.e., is any changed class annotated with @Public(Evolving): no
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): no
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? no
  • If yes, how is the feature documented? not applicable

flinkbot (Collaborator) commented Feb 6, 2024

CI report:

Bot commands

The @flinkbot bot supports the following commands:

  • @flinkbot run azure: re-run the last Azure build

snuyanzin (Contributor) left a comment

Thanks for working on this. Looks good to me.

I have a minor comment.

XComp merged commit 6ca9ec7 into apache:master on Feb 7, 2024