Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: leverage retry-action to increase reliability of builds. #503

Merged
merged 3 commits into from
Feb 20, 2024

Conversation

dylanmtaylor
Copy link
Contributor

This is a proposed implementation of #502. It needs to run the GitHub action to be tested, but should improve reliability without manual build re-triggers.

@dylanmtaylor dylanmtaylor marked this pull request as draft February 18, 2024 09:21
@dylanmtaylor
Copy link
Contributor Author

I need to figure out why this is failing before it's merged.

@dylanmtaylor
Copy link
Contributor Author

dylanmtaylor commented Feb 18, 2024

The issue is related to this section

        labels: |
          ${{ steps.meta.outputs.labels }}

I think the retry action is sensitive to the indentation here.

@dylanmtaylor
Copy link
Contributor Author

This is now fixed using a workaround:

I pad labels in another step and output it to the environment. This is necessary because we need it to remain a multiline string without using any character substitution. I then pull the string into the inputs for the retry-action. You can confirm the inputs work on the build step.

@dylanmtaylor
Copy link
Contributor Author

dylanmtaylor commented Feb 18, 2024

It's not the cleanest in the world, but if it gets us working retries, that is a huge win.

      # Workaround to fix indentation for the labels so that retries work.
      - name: Left pad labels
        id: left_pad_labels
        shell: bash
        env:
          LABELS: ${{steps.meta.outputs.labels}}
        run: |
          echo "Labels:"
          echo "${LABELS}"
          export TRANSFORMED=$(echo "$LABELS" | sed 's,^,  ,g')
          echo "Transformed:"
          echo "${TRANSFORMED}"
          echo "LABELS<<EOF" >> $GITHUB_ENV
          echo "$TRANSFORMED" >> $GITHUB_ENV
          echo "EOF" >> $GITHUB_ENV

^ This could be a single set of echo commands to the GITHUB_ENV but I left these in for troubleshooting purposes so it is clear what is going on.

@bsherman
Copy link
Contributor

bsherman commented Feb 18, 2024

I'm not confident we want to always retry the image build job. Usually when image build fails it is due to legit issues with package dependencies (rpmfusion etc). There are a few places where scripts can fail in ways that warrant retry, but I'd like to see those particular bits improved instead.

I DO think it would valuable to auto retry when the image push job fails since those failures are almost always due to some random ghcr/github hiccup.

@dylanmtaylor
Copy link
Contributor Author

I'm not confident we want to always retry the image builds. Usually when image builds fail it is due to legit issues with package dependencies (rpmfusion etc). There are a few places where scripts can fail in ways that warrant retry, but I'd like to see those particular bits improved instead.

I DO think it would valuable to auto retry when the image push fails since those failures are almost always due to some random ghcr/github hiccup.

This can be modified to make the push retry.

Copy link
Contributor

@bsherman bsherman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your desire to help improve the ublue build reliability!

I do have some requests which I believe will make the solution more relevant to areas where I believe we most often see spurious failures.

List of requests follows:

  1. when specifying uses: Wandalen/wretry.action@master please use an actual version selector not master (I believe @1 would work). This will cause dependabot to alert us with PRs when a new major version update is available.
  2. please add retry action to Get current version labels step (this step is known to cause spurious failures)
  3. please add retry action to Push to GHCR push step (this step is known to cause spurious failures)
  4. please remove retry action build_image step (I believe the few spurious failures I've seen in this step should be fixed in the respective shell scripts, not at the top level)
  5. since not using retry action for build_image step, please remove the left padding step

@dylanmtaylor
Copy link
Contributor Author

I have force pushed a new version of this commit. @bsherman could you please review it?

.github/workflows/build.yml Outdated Show resolved Hide resolved
.github/workflows/build.yml Outdated Show resolved Hide resolved
@dylanmtaylor
Copy link
Contributor Author

Updated with the requested changes.

@bsherman
Copy link
Contributor

@dylanmtaylor why did you just force push?

@dylanmtaylor
Copy link
Contributor Author

@dylanmtaylor why did you just force push?

I missed one line of whitespace change at the end of the file. I thought that it would be ideal to not have any changes to whitespace.

@dylanmtaylor
Copy link
Contributor Author

@dylanmtaylor
Copy link
Contributor Author

Interesting - the last attempt actually failed during the build action.

error creating build container: writing blob: storing blob to file "/var/tmp/storage1964624803/9": happened during read: unexpected EOF
time="2024-02-19T17:54:53Z" level=error msg="exit status 125"
Error: Error: buildah exited with code 125

@dylanmtaylor
Copy link
Contributor Author

I think it may be beneficial to add the retry to the build image after all. I am testing a revised version of this with it in there too. If it builds successfully the first time it'll do nothing, and if not it gets rerun only a couple of times.

@dylanmtaylor
Copy link
Contributor Author

I opened #506 as an alternative version of this which I feel is a bit better due to retrying the build action as well.

@dylanmtaylor
Copy link
Contributor Author

Converting to draft as I feel the merits of this vs #506 should be discussed before one is merged.

@bsherman
Copy link
Contributor

@dylanmtaylor why did you just force push?

I missed one line of whitespace change at the end of the file. I thought that it would be ideal to not have any changes to whitespace.

Understood.

In general, it's preferable to not force push in PRs as it removes context from code reviews, etc. We've been looking at using a tool to enforce policy across all ublue-os org repos, which would include preventing force-push, but for now, I'm just asking to please not do it anymore. :-)

Our auto-merge settings already squash commits, so it force push of pre-squashed commits doesn't really gain anything.

@dylanmtaylor dylanmtaylor marked this pull request as ready for review February 20, 2024 04:02
@dylanmtaylor
Copy link
Contributor Author

This is ready to go in, assuming it fully passes CI. @bsherman

@dylanmtaylor
Copy link
Contributor Author

I think I can see what happened here.

Error: The template is not valid. .github/workflows/build.yml (Line: 243, Col: 17): Error reading JToken from JsonReader. Path '', line 0, position 0.

^ The step output is blank because we only run the push action on the main branch and not on PRs. Signatures should also not run on PRs due to this.

@bsherman
Copy link
Contributor

Error: The template is not valid. .github/workflows/build.yml (Line: 243, Col: 17): Error reading JToken from JsonReader. Path '', line 0, position 0.

Hmm... annoying that the fromJSON function must be running in the template even if the step is skipped.

Oh well, thank you for fixing.

@bsherman bsherman requested a review from a team February 20, 2024 18:56
@bsherman bsherman added this pull request to the merge queue Feb 20, 2024
Merged via the queue into ublue-os:main with commit 4dda9bc Feb 20, 2024
24 checks passed
@dylanmtaylor dylanmtaylor deleted the patch-1 branch February 21, 2024 00:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants