static-checks: CI slowdown from unconditional job triggers on self-hosted runner #8998

Closed
BbolroC opened this issue Feb 2, 2024 · 2 comments · Fixed by #9020
Labels
enhancement Improvement to an existing feature needs-review Needs to be assessed by the team.

Comments


BbolroC commented Feb 2, 2024

Which feature do you think can be improved?

The pull_request event triggers the Static checks / build-checks workflow every time a PR is updated. The build-checks job fans out into 36 sub-jobs, so each update to an active PR queues another full set of CI jobs (after cancelling the previous set for that PR). With several PRs being updated at the same time, the number of queued jobs is:

  • 36 jobs x number of actively updated PRs

GitHub-hosted ubuntu-20.04 runners have tolerated this load, but self-hosted runners (e.g., arm-no-k8s, s390x and ppc64le) cannot, because the number of instances that can be provisioned for them is limited.

This creates a bottleneck that slows down the entire Kata Containers CI. Some workflows now take longer than 5 hours to complete, which is not what we wanted from migrating the CI from Jenkins to GHA.

How can it be improved?

To alleviate the bottleneck, we could refactor the workflow so that static-checks runs on the self-hosted runners only when triggered by the ok-to-test label. The jobs on ubuntu-20.04 should keep running as they do now: they act as an early check ahead of the other architectures and do not slow down the CI.
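A rough sketch of what such a label-gated workflow could look like is below. It is only an illustration, not the actual change: the file name, the workflow name, the matrix layout, the concurrency group and the placeholder step are all assumptions. Only the runner names (arm-no-k8s, s390x, ppc64le), the build-checks job name and the ok-to-test label come from this issue.

```yaml
# Hypothetical file: .github/workflows/static-checks-self-hosted.yaml
# Sketch only: names and layout are assumptions, not the merged workflow.
name: Static checks self-hosted
on:
  pull_request:
    types:
      - opened
      - synchronize
      - reopened
      - labeled   # lets the workflow start when a label is added to the PR

# Assumed concurrency setup: cancel the previous set of jobs for the same PR.
concurrency:
  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
  cancel-in-progress: true

jobs:
  build-checks:
    # Skip the whole job unless the PR currently carries the ok-to-test label.
    if: ${{ contains(github.event.pull_request.labels.*.name, 'ok-to-test') }}
    strategy:
      fail-fast: false
      matrix:
        instance:
          - arm-no-k8s
          - s390x
          - ppc64le
    runs-on: ${{ matrix.instance }}
    steps:
      - uses: actions/checkout@v4
      # Placeholder step: the real static checks would go here.
      - name: Run static checks
        run: echo "static checks on ${{ matrix.instance }}"
```

With a split like this, the existing workflow for ubuntu-20.04 would stay untouched and keep running on every PR update, while the self-hosted jobs are skipped until a maintainer adds the label.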

CC: @wainersm @fidencio @stevenhorsman

@BbolroC BbolroC added enhancement Improvement to an existing feature needs-review Needs to be assessed by the team. labels Feb 2, 2024

BbolroC commented Feb 2, 2024

BbolroC added a commit to BbolroC/kata-containers that referenced this issue Feb 5, 2024
Due to the restrictions on instance provisioning for self-hosted runners, performing
static checks (36 jobs at the time of writing) on them each time a PR is updated could
significantly burden them, consequently slowing down the entire CI system. To address
this, the decision is to trigger these checks only when an 'ok-to-test' label is added.
Meanwhile, the checks for x86_64, which are supported by GitHub-hosted runners, will
remain unchanged.

Fixes: kata-containers#8998

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
@katacontainersbot katacontainersbot moved this from To do to In progress in Issue backlog Feb 5, 2024

BbolroC commented Feb 5, 2024

FYI: This issue was identified after #8485 was merged.

c3d pushed a commit to c3d/kata-containers that referenced this issue Feb 23, 2024
Due to the restrictions on instance provisioning for self-hosted runners, performing
static checks (36 jobs at the time of writing) on them each time a PR is updated could
significantly burden them, consequently slowing down the entire CI system. To address
this, the decision is to trigger these checks only when an 'ok-to-test' label is added.
Meanwhile, the checks for x86_64, which are supported by GitHub-hosted runners, will
remain unchanged.

Fixes: kata-containers#8998

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>