GH action workflows should ensure images are on quay before helm charts are released #6865

Closed
jmazzitelli opened this issue Nov 20, 2023 · 11 comments · Fixed by kiali/openshift-servicemesh-plugin#307
Assignees
Labels
github_actions Pull requests that update GitHub Actions code

Comments

@jmazzitelli
Collaborator

We had a problem with the latest release (1.77). The Helm Charts release failed its smoke test. Details here: kiali/helm-charts#233 (comment)

Looking at our GitHub Action release workflows, all our releases are asynchronously kicked off at the same time (Monday 7am UTC):

Kiali Server/UI: https://github.com/kiali/kiali/blob/v1.77.0/.github/workflows/release.yml#L6
Kiali Operator: https://github.com/kiali/kiali-operator/blob/v1.77.0/.github/workflows/release.yml#L6
Helm Charts: https://github.com/kiali/helm-charts/blob/v1.77.0/.github/workflows/release.yml#L6
OSSMC: https://github.com/kiali/openshift-servicemesh-plugin/blob/v1.77.0/.github/workflows/release.yaml#L6

The problem is that the Helm Charts release runs a smoke test that fails if any of the other builds (Kiali Server, Operator, OSSMC) does not yet have an image uploaded to quay.io. We added this smoke test because in the past we released the helm charts when one of the images had failed to upload to quay, which left the helm charts broken (i.e. community users who did a helm chart upgrade got failures because the images could not be pulled).

So, we need to make sure the Helm Charts are released last. We cannot release the Helm Charts without ensuring the server, operator, and ossmc images are in quay.
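A gate of this kind could be sketched roughly as follows. This is only an illustration, not what any of the release workflows do today: the job name, the `RELEASE_VERSION` variable, the image names, and the retry limits are all assumptions.

```yaml
# Sketch of a gate job for the helm-charts release workflow.
# Image names, RELEASE_VERSION, and the timing values are assumptions.
jobs:
  wait_for_images:
    runs-on: ubuntu-latest
    env:
      RELEASE_VERSION: v1.77.0   # assumed; would come from the release process
    steps:
      - name: Wait until all release images exist on quay.io
        run: |
          images="quay.io/kiali/kiali quay.io/kiali/kiali-operator quay.io/kiali/ossmconsole"
          for image in ${images}; do
            attempts=0
            # docker manifest inspect exits non-zero while the tag is missing
            until docker manifest inspect "${image}:${RELEASE_VERSION}" > /dev/null 2>&1; do
              attempts=$((attempts + 1))
              if [ "${attempts}" -ge 60 ]; then
                echo "Gave up waiting for ${image}:${RELEASE_VERSION}"
                exit 1
              fi
              sleep 60
            done
            echo "Found ${image}:${RELEASE_VERSION}"
          done
```

The job that publishes the charts would then declare `needs: wait_for_images`, so it only runs once every image tag is visible. Polling keeps each repository's workflow independent, at the cost of a runner sitting idle while it waits.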

Right now the hack suggestion is to delay the Helm Chart release process by an hour, to 8am UTC. Hopefully that gives enough time for the images to be built and released to quay (though it isn't a guarantee). See this PR that does this: kiali/helm-charts#234
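For illustration, a delayed trigger of that kind would look roughly like this. It is a sketch that assumes the schedule sits under the usual `on.schedule` key; the actual change is the one in the PR above.

```yaml
# Assumed excerpt of the helm-charts release workflow.
on:
  schedule:
    # Monday 08:00 UTC, one hour after the other releases start at 07:00 UTC
    - cron: "0 8 * * MON"
```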

But we might want to consider doing something more guaranteed. Suggestions:

@jmazzitelli jmazzitelli added enhancement This is the preferred way to describe new end-to-end features. github_actions Pull requests that update GitHub Actions code and removed enhancement This is the preferred way to describe new end-to-end features. labels Nov 20, 2023
@leandroberetta leandroberetta self-assigned this Nov 21, 2023
@ScriptingShrimp
Contributor

This is a bit out of scope for this ticket, but for the nightly pipeline running on OCP we need the same kind of check to be sure that all images from the last nightly job have been uploaded. Only then can we trigger some mechanism (webhook? TBD) for our Jenkins to pull them.

@leandroberetta
Contributor

It would be nice to coordinate the actions. I looked at the repository_dispatch event, but it seems that it only works for the default branch (meaning the workflow file must be on the default branch, master in our case).

"This event will only trigger a workflow run if the workflow file is on the default branch."

This will work for our minor releases but not for the patches, as we use the workflow definitions from the specific version we are patching.

This is probably still fine, as the minor releases are the only ones that are scheduled. When releasing a patch release, we do it manually and can take care of the order (wait for the server and operator to be pushed, then run the helm chart release, which is what I usually do).

I will try the repository_dispatch method for minor releases.
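On the receiving side, a workflow triggered by repository_dispatch could be sketched as below. The event type `images-published` and the `version` payload field are made-up names for illustration, and the file must live on the default branch for the trigger to fire.

```yaml
# Sketch of a helm-charts workflow on the default branch.
# "images-published" and client_payload.version are hypothetical names.
on:
  repository_dispatch:
    types: [images-published]

jobs:
  release:
    runs-on: ubuntu-latest
    steps:
      - name: Show the version passed by the sender
        env:
          VERSION: ${{ github.event.client_payload.version }}
        run: echo "Releasing charts for ${VERSION}"
```

The sender chooses the event type and payload when it calls the repository dispatch REST endpoint, so the last image build to finish could be the one that starts the chart release.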

@jshaughn jshaughn added the backlog Triaged Issue added to backlog label Jan 8, 2024
@jmazzitelli
Collaborator Author

This is probably still fine, as the minor releases are the only ones that are scheduled. When releasing a patch release, we do it manually and can take care of the order (wait for the server and operator to be pushed, then run the helm chart release, which is what I usually do).

+1

@leandroberetta
Contributor

I have been trying different approaches, but with the automatically generated GITHUB_TOKEN I can't start workflows in other repositories (from kiali/release to kiali-operator/release).

I saw that it might be possible with: https://docs.github.com/en/apps/creating-github-apps/authenticating-with-a-github-app/making-authenticated-api-requests-with-a-github-app-in-a-github-actions-workflow

@jmazzitelli do you remember if we had something like this in the past, i.e. a "GitHub App"? I remember that it was deprecated, but we might need something similar now if we want to coordinate actions.

The alternative is to create a Personal Access Token, which I don't like because it would belong to a single user.
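A rough sketch of the GitHub App route, assuming an app is installed on the kiali organization and its credentials are stored as secrets. The secret names, the event type, and the hard-coded version are hypothetical.

```yaml
# Sketch of final steps in a sending release workflow.
# Secret names, event type, and payload are assumptions.
steps:
  - name: Create a token scoped to the helm-charts repository
    id: app_token
    uses: actions/create-github-app-token@v1
    with:
      app-id: ${{ secrets.RELEASE_APP_ID }}
      private-key: ${{ secrets.RELEASE_APP_PRIVATE_KEY }}
      owner: kiali
      repositories: helm-charts
  - name: Send a repository_dispatch event to helm-charts
    env:
      GH_TOKEN: ${{ steps.app_token.outputs.token }}
    run: |
      curl --fail -X POST \
        -H "Accept: application/vnd.github+json" \
        -H "Authorization: Bearer ${GH_TOKEN}" \
        https://api.github.com/repos/kiali/helm-charts/dispatches \
        -d '{"event_type":"images-published","client_payload":{"version":"v1.77.0"}}'
```

Unlike a Personal Access Token, the installation token is short-lived and is not tied to an individual user account.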

@jmazzitelli
Collaborator Author

jmazzitelli commented Feb 6, 2024

@leandroberetta we used to use GitHub Apps (I can't remember which ones) but we removed all of them (this was done a long time ago, and I can't remember all the details). I don't remember ever having a setup where we coordinated Actions across repositories.

@jshaughn jshaughn removed the backlog Triaged Issue added to backlog label Feb 8, 2024
@jmazzitelli
Collaborator Author

jmazzitelli commented Mar 4, 2024

We hit this again in our latest release. The problem is OSSMC takes a long time to complete its release (right now it is over 4 hours, and counting).

Update: it took just a little over 5 hours.

I think for now, as a temporary solution, we should start the OSSMC release action several hours before the other release actions. kiali and kiali-operator and ossmc kick off at 7am Monday, and the helm-charts an hour later (to give time for quay to get the images). But that hour isn't enough. We should kick off OSSMC at 1am or something like that to give it plenty of time.

So these can stay the same:

But OSSMC should kick off at 1am:

cron: "00 1 * * MON"

jmazzitelli added a commit to jmazzitelli/openshift-servicemesh-plugin that referenced this issue Mar 4, 2024
This is a temp fix to kiali/kiali#6865 - read that issue to find out why this is being done.
@jmazzitelli
Collaborator Author

jmazzitelli commented Mar 4, 2024

Here's a suggested (temporary) fix - kiali/openshift-servicemesh-plugin#272

@jshaughn
Collaborator

jshaughn commented Mar 4, 2024

Seems like a reasonable approach.

jmazzitelli added a commit to kiali/openshift-servicemesh-plugin that referenced this issue Mar 4, 2024
This is a temp fix to kiali/kiali#6865 - read that issue to find out why this is being done.
@jshaughn
Collaborator

jshaughn commented May 6, 2024

@leandroberetta, @jmazzitelli, it seems this may not be a high priority. In general we should not block Helm based on OSSMC, since OSSMC is only installed via the operator.

We do still need an action to lazily release OSSMC.

@jmazzitelli
Collaborator Author

jmazzitelli commented May 6, 2024

Soon the OSSMC release process will happen after the server is fully released. This is because the OSSMC vX.Y release build needs to pull in the source code for the X.Y server, so OSSMC release builds need to wait for the server release to go out first. Therefore, the X.Y release smoke test can't check that the OSSMC vX.Y image exists, because at that point it will not exist yet. See kiali/helm-charts#259

@jmazzitelli
Collaborator Author

I have two PRs to address the new way we want to release OSSMC:

  1. Turn off the smoke testing of the OSSMC image (the helm chart smoke test will no longer check that the ossmc image is on quay). PR: "comment out the ossmc image check - we will not smoke test it", helm-charts#259
  2. In the OSSMC release action, copy over the Kiali server source code. By default, the version of the source code is the same as the version of the OSSMC release being built. PR: "copy the kiali source prior to building release", openshift-servicemesh-plugin#307

After the OSSMC release action finishes, what you will see happen is:
a. The release branch is created and will contain the Kiali server source code as one commit, with another commit changing the version. You can see an example here - this was the test I did in my fork: https://github.com/jmazzitelli/openshift-servicemesh-plugin/commits/v1.85/
b. The main branch will have a PR ready for merge - it will include the commits that went into the release branch as well as a commit that moves the version to the next SNAPSHOT version. Again, an example from my test run is here: https://github.com/jmazzitelli/openshift-servicemesh-plugin/pull/5/commits
