Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CI step "Set toolchain version" can fail without stopping CI job #225

Closed
joshlf opened this issue Aug 3, 2023 · 3 comments · Fixed by #289
Closed

CI step "Set toolchain version" can fail without stopping CI job #225

joshlf opened this issue Aug 3, 2023 · 3 comments · Fixed by #289
Labels
compatibility-nonbreaking Changes that are (likely to be) non-breaking

Comments

@joshlf
Copy link
Member

joshlf commented Aug 3, 2023

In this CI job, it appears that a network error causes issues with the "Set toolchain version" step, but the GitHub runner does not fail at that point. Instead, later steps are executed with bad data (namely, the ZC_TOOLCHAIN environment variable is set to an empty string), causing confusing errors.

I suspect that what's happening is that the shell script executed for that step has a bug that results in errors inside of $(...) to not respect the set -e at the top of the script.

@joshlf joshlf added the compatibility-nonbreaking Changes that are (likely to be) non-breaking label Aug 12, 2023
@joshlf joshlf mentioned this issue Aug 20, 2023
@zoo868e
Copy link
Contributor

zoo868e commented Aug 25, 2023

Hi @joshlf , I would like to try it.

@joshlf
Copy link
Member Author

joshlf commented Aug 25, 2023

Hey @zoo868e, sounds good, thanks! Let me know if you have any questions at any point!

@zoo868e
Copy link
Contributor

zoo868e commented Aug 25, 2023

Hello @joshlf,

It seems that the issue is arising intermittently, possibly due to a GitHub action facing difficulties in fetching dependencies during the execution of cargo metadata. After replicating the problem, I managed to execute all the tasks successfully, including a re-run of the previously failed job.

Therefore, I'm interested to know if there is a reliable method to verify whether the issue has been resolved, given that it doesn't occur consistently.

The approach that comes to mind for me to assess the issue's current status is to run the CI process multiple times and hope that the problem shows up.

Moreover, the failures are consistently caused by difficulties in fetching the syn package. Perhaps we could consider changing the source of the crate?

Thank you.

zoo868e added a commit to zoo868e/zerocopy that referenced this issue Aug 25, 2023
Fixes google#225
The reason `set -e` failed to interrupt the workflow is due to the
output redirection to `jq`. Since `jq` returns a value of 0, indicating
success, the script continues to execute. You can try running these
scripts locally to gain insight into the issue.

To illustrate:

Script without interruption:
```sh
set -e

function pkg-meta {
  cargo metadata --format-version 1 | jq -r ".packages[] | select(.name\
  == \"zerocopy\").$1"
}

ZC_TOOLCHAIN=\"$(pkg-meta 'metadata.ci.\"pinned-nightly\"')\"

echo \"Discovered that the nightly toolchain is $ZC_TOOLCHAIN\"
echo \"ZC_TOOLCHAIN=$ZC_TOOLCHAIN\"
```

Script with interruption:
```sh
set -e

function pkg-meta {
  cargo metadata --format-version 1"
}

ZC_TOOLCHAIN=\"$(pkg-meta 'metadata.ci.\"pinned-nightly\"')\"

echo \"Discovered that the nightly toolchain is $ZC_TOOLCHAIN\"
echo \"ZC_TOOLCHAIN=$ZC_TOOLCHAIN\"
```

Additionally, you can display the return value after the command
execution. This output will confirm the successful execution of the
command.

```sh
set -e

function pkg-meta {
  cargo metadata --format-version 1 | jq -r \".packages[] | select(\
  .name == \"zerocopy\").$1\"
  echo
  echo \$?
}

ZC_TOOLCHAIN=\"$(pkg-meta 'metadata.ci.\"pinned-nightly\"')\"

echo \"Discovered that the nightly toolchain is $ZC_TOOLCHAIN\"
echo \"ZC_TOOLCHAIN=$ZC_TOOLCHAIN\"
```

Therefore, I've incorporated `cargo check` to validate the package
retrieval. Subsequent to this check, there's no necessity to fetch the
package again using `cargo metadata`. This modification should rectify
the point of failure in the Git action.
joshlf pushed a commit that referenced this issue Aug 27, 2023
To bolster reliability, incorporate the -o pipefail option, which
designates a command pipeline's overall return status as failed if any
individual command within it fails. Following this, integrating the -e
option can empower the script with the ability to promptly halt
execution.

Fixes #225
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compatibility-nonbreaking Changes that are (likely to be) non-breaking
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants