Create a nightly pipeline for GA/EUS releases #4046

Open: tkoscieln wants to merge 12 commits into main from optimize_pipeline_execution

Contributor

@tkoscieln tkoscieln commented Mar 28, 2024

Separate testing of RHEL GA/EUS releases from the pipeline run on every PR into a nightly pipeline.

This pull request includes:

  • adequate testing for the new functionality or fixed issue

@tkoscieln tkoscieln force-pushed the optimize_pipeline_execution branch 3 times, most recently from 887bdd8 to 4905193 Compare April 18, 2024 13:41
@tkoscieln tkoscieln marked this pull request as ready for review April 18, 2024 13:43
@tkoscieln
Contributor Author

Sample successful run of the GA/EUS Nightly pipeline (debug):
https://gitlab.com/redhat/services/products/image-builder/ci/osbuild-composer/-/pipelines/1258487788

Contributor

@jrusz jrusz left a comment

Can you please rename it to just GA pipeline? The nightly pipeline is called that because it tests nightly composes, while this one is supposed to test GA composes, which is something else; "nightly GA" doesn't make sense. So please also update all instances of "ga" and "nightly" used together to just "ga".

The rules don't seem to work, I see many jobs on GA runners on the regular PR pipeline.

Contributor

@atodorov atodorov left a comment

Minor comments

Contributor

@jrusz jrusz left a comment

I'm thinking that it would be enough to just call it GA not GA/EUS because there is also ELS and idk what else and it's basically all a subset of GA. Every released RHEL version no matter the support level can still be considered GA. What do you think @atodorov ?

Also the rules don't seem to work, I see many GA jobs in the PR pipeline. The way these rules work by default is that whenever a job matches the conditions, it gets scheduled. Since you've added these new rules, the jobs are enabled to run in the scheduled GA pipeline, but there is nothing preventing them from running on PRs. For that, I think you'll need to modify the upstream rules to exclude these jobs.
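
The behaviour described above can be sketched in a few lines of Python. This is a simplified model, not GitLab's implementation: it assumes a job is added to a pipeline as soon as any of its rules matches and ignores `when: never` and `when: manual`. The `Ctx` class, the `"push"` source value and the exact shape of the pre-existing upstream rule are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Ctx:
    source: str           # stands in for $CI_PIPELINE_SOURCE
    ga_runner: bool       # whether $RUNNER is a GA/EUS runner
    nightly: str = "false"  # stands in for $NIGHTLY


Rule = Callable[[Ctx], bool]


def is_scheduled(rules: List[Rule], ctx: Ctx) -> bool:
    """Simplified model: the job is added if any rule matches."""
    return any(rule(ctx) for rule in rules)


# Assumed shape of the existing upstream rule: anything that is not a schedule.
upstream = lambda c: c.source != "schedule"
# Rule of the kind added by this PR: GA/EUS runners in the scheduled GA pipeline.
ga_schedule = lambda c: c.source == "schedule" and c.ga_runner and c.nightly == "false"
# Upstream rule with the GA/EUS exclusion applied.
upstream_excluding_ga = lambda c: c.source != "schedule" and not c.ga_runner

pr_on_ga_runner = Ctx(source="push", ga_runner=True)

# Adding the GA rule alone does not remove the job from PR pipelines.
print(is_scheduled([upstream, ga_schedule], pr_on_ga_runner))  # True
# Only the exclusion in the upstream rule does.
print(is_scheduled([upstream_excluding_ga, ga_schedule], pr_on_ga_runner))  # False
```

In this model the new schedule rule only ever adds matches, which is why the PR pipeline still contains GA jobs until the upstream rule itself is narrowed.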

@@ -9,7 +9,11 @@ fi

COMPOSE_ID=$(cat COMPOSE_ID)
COMPOSER_NVR=$(cat COMPOSER_NVR)
MESSAGE="\"Nightly pipeline execution on *$COMPOSE_ID* with *$COMPOSER_NVR* finished with status *$1* $2 \n QE: @atodorov, @jrusz\n Link to results: $CI_PIPELINE_URL\n For edge testing status please see https://url.corp.redhat.com/edge-pipelines \""
if [ "$3" == "ga" ]; then
MESSAGE="\"Nightly GA/EUS releases pipeline execution finished with status *$1* $2 \n QE: @atodorov, @jrusz, @tkosciel\n Link to results: $CI_PIPELINE_URL \""
Contributor

Suggested change
- MESSAGE="\"Nightly GA/EUS releases pipeline execution finished with status *$1* $2 \n QE: @atodorov, @jrusz, @tkosciel\n Link to results: $CI_PIPELINE_URL \""
+ MESSAGE="\"GA/EUS composes pipeline execution finished with status *$1* $2 \n QE: @atodorov, @jrusz, @tkosciel\n Link to results: $CI_PIPELINE_URL \""

@jrusz
Contributor

jrusz commented Apr 30, 2024

So what seems to work is if you take your exclude rule and add it to the upstream rules and build rules, for example:

.build_rules:
  rules:
    - if: '$CI_PIPELINE_SOURCE != "schedule" && $SKIP_CI == "false" && $RUNNER !~ /[\S]+rhel-[\S]+-(?:(?:ga)|(?:eus))[\S]+/'
    - if: '$CI_PIPELINE_SOURCE != "schedule" && $SKIP_CI == "true" && $RUNNER !~ /[\S]+rhel-[\S]+-(?:(?:ga)|(?:eus))[\S]+/'
      when: manual

.upstream_rules_all:
  rules:
    - if: '$CI_PIPELINE_SOURCE != "schedule" && $RUNNER !~ /[\S]+rhel-[\S]+-(?:(?:ga)|(?:eus))[\S]+/'

There is also the upstream rule for x86_64, which will require a slightly different regex, and then there are a few jobs which have their own rules defined, like aws.sh and some regression jobs. Use https://gitlab.com/redhat/services/products/image-builder/ci/osbuild-composer/-/ci/lint to check whether it works: paste the whole CI config there, tick the "Simulate a pipeline created for the default branch" checkbox, and then search for any -ga jobs that are still there.
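
Before pasting the config into the linter, the exclusion pattern itself can be sanity-checked locally. The sketch below copies the regex from the rules above and assumes that GitLab's `=~` behaves like an unanchored search, which `re.search` mirrors for this pattern. The runner names are hypothetical and only illustrate the naming shape the regex expects.

```python
import re

# Pattern copied verbatim from the rules above.
GA_EUS = re.compile(r"[\S]+rhel-[\S]+-(?:(?:ga)|(?:eus))[\S]+")


def is_ga_or_eus(runner: str) -> bool:
    """True if the runner name would be excluded by the `!~` rule."""
    return GA_EUS.search(runner) is not None


# Hypothetical runner names, for illustration only.
samples = [
    "aws/rhel-9.4-ga-x86_64",
    "aws/rhel-8.8-eus-aarch64",
    "aws/rhel-9.5-nightly-x86_64",
    "aws/fedora-40-x86_64",
    "rhel-9.4-ga-x86_64",
]

for name in samples:
    print(f"{name}: {is_ga_or_eus(name)}")
```

Note that the last sample does not match: the leading `[\S]+` requires at least one character before `rhel-`, so the pattern relies on runner names carrying a prefix.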

@atodorov
Contributor

> I'm thinking that it would be enough to just call it GA not GA/EUS because there is also ELS and idk what else and it's basically all a subset of GA. Every released RHEL version no matter the support level can still be considered GA. What do you think @atodorov ?

+1 for the proposed simplified naming. Let's use what you propose for now and update it later if need be.

@tkoscieln tkoscieln marked this pull request as draft April 30, 2024 12:10
Contributor

@jrusz jrusz left a comment

I've added some comments in the code. One of the main issues I see right now is that I don't think it will run on main after a PR is merged. It's defined as one of the goals in the Jira ticket, we just want to remove these jobs from running on each PR.

@thozza I wanted to ask you if you agree with this change. The goal here is to not run tests on GA releases on PRs but instead run them only on main and in a schedule similar to current nightly pipelines. This will make PR testing faster with minimal impact on quality: if an issue does occur on some of the GA releases, we'll see it reported on main and in Slack along with the nightly pipelines.

@thozza
Member

thozza commented May 3, 2024

> @thozza I wanted to ask you if you agree with this change. The goal here is to not run tests on GA releases on PRs but instead run them only on main and in a schedule similar to current nightly pipelines. This will make PR testing faster with minimal impact on quality: if an issue does occur on some of the GA releases, we'll see it reported on main and in Slack along with the nightly pipelines.

We should keep testing on PRs at least for GA releases used in our SaaS deployment. Otherwise, this LGTM.

@ondrejbudai @croissanne WDYT?

@thozza
Member

thozza commented May 21, 2024

So, for the SaaS version, we need to keep testing the latest RHEL-9 GA release (9.4).

WRT the specific test cases, the following ones are relevant for the SaaS:

@ondrejbudai @croissanne feel free to add / correct me.

This PR is stale because it has been open 30 days with no activity. Remove "Stale" label or comment or this will be closed in 7 days.

@github-actions github-actions bot added the Stale label Jun 21, 2024
@tkoscieln tkoscieln removed the Stale label Jun 24, 2024
@tkoscieln tkoscieln force-pushed the optimize_pipeline_execution branch from 098e858 to b34676c Compare June 25, 2024 13:30
@atodorov atodorov self-requested a review June 25, 2024 14:59
@atodorov
Contributor

@tkoscieln can you execute the new so-called GA pipeline and post a link here so we can compare the results?

@tkoscieln
Contributor Author

I have updated the pipeline to re-add the tests specified by @thozza earlier.
How the pipeline looked before: https://gitlab.com/redhat/services/products/image-builder/ci/osbuild-composer/-/pipelines/1347409921
How it looks now after I re-added some of the test cases: https://gitlab.com/redhat/services/products/image-builder/ci/osbuild-composer/-/pipelines/1347426312
How the "GA" pipeline looks now:
https://gitlab.com/redhat/services/products/image-builder/ci/osbuild-composer/-/pipelines/1348619307

I have left the re-added tests, which now run in the PR pipeline again, in the GA pipeline as well. What do you think, @atodorov @jrusz? Do we want to test them in both pipelines or just in the PR one?

@atodorov
Contributor

atodorov commented Jun 26, 2024

> How the pipeline looked before:

https://gitlab.com/redhat/services/products/image-builder/ci/osbuild-composer/-/pipelines/1348643668 (from pull #4232) - 217 test jobs

> How it looks now after I re-added some of the test cases: https://gitlab.com/redhat/services/products/image-builder/ci/osbuild-composer/-/pipelines/1347426312

^^^ 271 test jobs - quite a lot more actually.

  • Packer job is missing, see comment above. I'm not quite certain if it needs to be executed on PRs, but the original conditions indicate so and you are making it the opposite.

  • Not sure about Container either; a decision is needed there.

The rest of the jobs in the list are the same; I need to examine more closely whether or not the expected runners are used.

> How the "GA" pipeline looks now: https://gitlab.com/redhat/services/products/image-builder/ci/osbuild-composer/-/pipelines/1348619307

What is the expected result here? That test jobs are executed only on GA runners and not on nightly runners, isn't it?

@tkoscieln
Contributor Author

> How the pipeline looked before:
>
> https://gitlab.com/redhat/services/products/image-builder/ci/osbuild-composer/-/pipelines/1348643668 (from pull #4232) - 217 test jobs
>
> How it looks now after I re-added some of the test cases: https://gitlab.com/redhat/services/products/image-builder/ci/osbuild-composer/-/pipelines/1347426312
>
> ^^^ 271 test jobs - quite a lot more actually.

This number includes some cancelled jobs, so I checked and it is actually lower than 217.

>   • Packer job is missing, see comment above. I'm not quite certain if it needs to be executed on PRs but the original conditions indicate so and you are making it the opposite.
>   • Not sure about Container either, need decision there.
>
> The rest of the jobs in the list are the same, need to examine a bit better whether or not the expected runners are used.

I left out the Packer and Container jobs as they were not specified to be re-added, but I can add them back as well.

> How the "GA" pipeline looks now: https://gitlab.com/redhat/services/products/image-builder/ci/osbuild-composer/-/pipelines/1348619307
>
> What is the expected result here? That test jobs are executed only on GA runners and not on nightly runners, isn't it ?

Yes, it should contain jobs run on GA runners only, with some of the test cases being extracted from the PR pipeline entirely to be run only in this GA pipeline.

@atodorov
Contributor

atodorov commented Jun 27, 2024

> How the pipeline looked before:
> ^^^ 271 test jobs - quite a lot more actually.
>
> This number includes some cancelled jobs, so I checked and it is actually lower than 217.

Can you git push this PR again so we can see the real number of jobs being scheduled?

> I left Packer and Container jobs as they were not specified to be re-added, but can add them back as well.

The Container job should be executed on PRs to detect changes which break container builds.

From templates/packer/README.md:

> This directory contains a packer configuration for building osbuild-composer
> worker AMIs based on RHEL.

So it looks like this job, Packer, needs to be executed on pull requests as well; it seems important to me.

@tkoscieln
Contributor Author

tkoscieln commented Jun 27, 2024

OK, I will add back the Packer and Container jobs and in the process trigger a new pipeline.

@tkoscieln
Contributor Author

@atodorov it seems that with this optimization we are down to 136 jobs in the PR pipeline from 217, including the Packer and Container jobs.

atodorov
atodorov previously approved these changes Jun 27, 2024
Contributor

@atodorov atodorov left a comment

LGTM

@atodorov atodorov requested a review from jrusz June 27, 2024 11:40
Contributor

@jrusz jrusz left a comment

Looks good, just some cosmetic changes requested.

.gitlab-ci.yml Outdated
- if: '$CI_PIPELINE_SOURCE == "schedule" && $RUNNER =~ /[\S]+rhel-[\S]+-(?:(?:ga)|(?:eus))[\S]+/ && $NIGHTLY== "false"'
- if: '$CI_PIPELINE_SOURCE == "schedule" && $RUNNER =~ /[\S]+rhel-[\S]+-(?:(?:ga)|(?:eus))[\S]+/ && $NIGHTLY == "false"'
Contributor

What's the difference between these two rules besides the whitespace?

Contributor Author

Probably none :D will delete
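
Duplicates like the pair flagged in this thread can be spotted mechanically by comparing rule expressions with whitespace removed. The following is a minimal sketch, not part of the PR: the helper names are hypothetical, and stripping all whitespace is a deliberately crude normalization that would also hide differences inside quoted string values.

```python
import re
from itertools import combinations


def normalize(expr: str) -> str:
    """Drop all whitespace so expressions differing only in spacing compare equal."""
    return re.sub(r"\s+", "", expr)


def whitespace_duplicates(rules):
    """Return index pairs of rules that are identical once whitespace is ignored."""
    return [
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(rules), 2)
        if normalize(a) == normalize(b)
    ]


# The two expressions from the review comment above.
rules = [
    r'$CI_PIPELINE_SOURCE == "schedule" && $RUNNER =~ /[\S]+rhel-[\S]+-(?:(?:ga)|(?:eus))[\S]+/ && $NIGHTLY== "false"',
    r'$CI_PIPELINE_SOURCE == "schedule" && $RUNNER =~ /[\S]+rhel-[\S]+-(?:(?:ga)|(?:eus))[\S]+/ && $NIGHTLY == "false"',
]

print(whitespace_duplicates(rules))  # [(0, 1)]
```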

.gitlab-ci.yml Outdated
Comment on lines 88 to 87
- if: '$CI_PIPELINE_SOURCE == "schedule" && $RUNNER =~ /[\S]+rhel-[\S]+-(?:(?:ga)|(?:eus))[\S]+/ && $RUNNER =~ "/^.*(x86_64).*$/" && $NIGHTLY== "false"'
- if: '$CI_PIPELINE_SOURCE == "schedule" && $RUNNER =~ /[\S]+rhel-[\S]+-(?:(?:ga)|(?:eus))[\S]+/ && $RUNNER =~ "/^.*(x86_64).*$/" && $NIGHTLY == "false"'
Contributor

What's the difference between these two rules besides the whitespace?

.gitlab-ci.yml Outdated
Comment on lines 69 to 71
.upstream_rules_ga_all:
  rules:
    - if: '$CI_PIPELINE_SOURCE != "schedule" && $RUNNER'
Contributor

What's the point of this rule? All jobs have $RUNNER defined, why does it have ga in the name?

Contributor

After looking through the file I realized it's upstream+ga, can you update the naming to make it more obvious?

@tkoscieln tkoscieln marked this pull request as ready for review July 2, 2024 09:01
@tkoscieln tkoscieln force-pushed the optimize_pipeline_execution branch from cf80627 to e4e1338 Compare July 2, 2024 10:34