Interrupting SDK install step in CI breaks CI #1052

Closed
oleflb opened this issue Jun 9, 2024 · 7 comments · Fixed by #1053

Comments

@oleflb
Contributor

oleflb commented Jun 9, 2024

When a CI step is interrupted while it is installing an SDK, it may leave a partially installed SDK behind.
This is fixed by deleting the broken SDK and ensuring that the SDK install step runs through successfully.
Maybe we could add a separate SDK install step in the CI that cannot be interrupted. Alternatively, we could disable fail-fast entirely for the build step.

@oleflb oleflb added the tools:CI label Jun 9, 2024
@oleflb oleflb changed the title from "Aborting SDK install step in CI breaks CI" to "Interrupting SDK install step in CI breaks CI" Jun 9, 2024
@knoellle
Contributor

knoellle commented Jun 9, 2024

Another solution would be to make the SDK install atomic by installing to a temp directory first and then moving it into place, like we do with the downloads since #747.
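A minimal sketch of the install-then-move idea, written in Python for illustration only. `install_atomically` and the `populate` callback are hypothetical names, not part of pepsi, and the sketch does not address whether a real installer tolerates having its output relocated.

```python
import os
import shutil
import tempfile
from pathlib import Path


def install_atomically(target: Path, populate) -> None:
    """Build the installation in a staging directory, then rename it into place."""
    target.parent.mkdir(parents=True, exist_ok=True)
    # Stage next to the target so both paths are on the same filesystem;
    # a rename is only atomic within a single filesystem.
    staging = Path(tempfile.mkdtemp(prefix=target.name + ".", dir=target.parent))
    try:
        populate(staging)  # hypothetical callback that writes the files
        os.rename(staging, target)  # raises if a non-empty target already exists
    except BaseException:
        # Covers errors and Ctrl+C; a hard kill would still leave the
        # staging directory behind, but never a half-written target.
        shutil.rmtree(staging, ignore_errors=True)
        raise
```

With this shape, the target path either does not exist or holds a complete installation, so an "is it installed" check can stay a plain existence test.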

@oleflb
Contributor Author

oleflb commented Jun 9, 2024

Great idea @knoellle, would also fix such issues outside the CI

@knoellle knoellle self-assigned this Jun 9, 2024
@knoellle
Contributor

knoellle commented Jun 9, 2024

I tried implementing this approach but it doesn't work since the installation process bakes the installation path into many of the files.
Some alternative approaches:

  1. Create a "hey, this didn't finish correctly" marker file which is removed after the sdk installation reports success.
    Requires changing the "is sdk already installed" detection which may break if we forget to do so at some point but it is probably the best option.
  2. Remove the installation directory on error.
    This would still break in cases where pepsi dies at the same time as the installer as would likely be the case in an aborted CI job.
  3. sed -i over the directory after installation to fix the paths.
    Very hacky, does not spark joy.
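Option 1 could look roughly like the following Python sketch. The marker name `.install-incomplete`, the function names, and the `run_installer` callback are all hypothetical, and it assumes the installer accepts a target directory that already contains the marker file.

```python
import shutil
from pathlib import Path

MARKER_NAME = ".install-incomplete"  # hypothetical file name


def is_installed(sdk_dir: Path) -> bool:
    """An SDK counts as installed only if its directory exists without the marker."""
    return sdk_dir.is_dir() and not (sdk_dir / MARKER_NAME).exists()


def install(sdk_dir: Path, run_installer) -> None:
    sdk_dir.mkdir(parents=True, exist_ok=True)
    marker = sdk_dir / MARKER_NAME
    marker.touch()          # written before the installer starts
    run_installer(sdk_dir)  # hypothetical callback; may raise or be killed
    marker.unlink()         # reached only if the installer returned normally


def ensure_installed(sdk_dir: Path, run_installer) -> None:
    """Reinstall from scratch if a previous attempt left a partial SDK behind."""
    if is_installed(sdk_dir):
        return
    if sdk_dir.exists():
        shutil.rmtree(sdk_dir)  # discard the partial installation
    install(sdk_dir, run_installer)
```

Because the marker is removed only on success, no cleanup code has to run when the process dies; the next invocation sees the leftover marker and starts over.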

What do you think?

@schmidma
Member

Is there anything against a pepsi sdk install action in the CI build jobs that cannot be interrupted? If there is a released version of the SDK, it is a good idea to install it on the CI runners.

@knoellle
Contributor

Sure, that would (probably) fix the CI issues.
However, I had hoped to also fix this issue for people installing the SDK on their machines.
If you Ctrl+C a pepsi upload during sdk installation, you will likely be met with very cryptic error messages when you run the command again.

@schmidma
Member

We could also integrate such a feature into the SDK install script by patching poky.

@knoellle
Contributor

How much effort would that be? Patches break more easily when updating versions.
Also, which of the suggested solutions?

  1. The marker file would still have to be checked by pepsi before using the sdk and at that point pepsi might as well create/remove the marker file too.
  2. Removing the partial installation on error isn't reliable because the cleanup code may never be executed depending on the kind of error.
  3. I don't think we should do the sed after mv either way.
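The concern in point 2 can be reproduced with a small Python experiment. Everything here is illustrative: the child script only stands in for an installer with cleanup code, and `cleanup_ran_after_kill` is a hypothetical helper name.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Child process: does "work" inside try and records cleanup in finally.
CHILD = """
import sys, time
from pathlib import Path
try:
    print("ready", flush=True)
    time.sleep(60)  # stands in for a long-running installer
finally:
    Path(sys.argv[1]).write_text("cleaned up")
"""


def cleanup_ran_after_kill() -> bool:
    """Kill a child inside its try block and report whether its finally ran."""
    with tempfile.TemporaryDirectory() as tmp:
        flag = Path(tmp) / "cleanup-ran"
        child = subprocess.Popen(
            [sys.executable, "-c", CHILD, str(flag)],
            stdout=subprocess.PIPE,
            text=True,
        )
        child.stdout.readline()  # wait until the child is inside the try block
        child.kill()             # SIGKILL on POSIX, TerminateProcess on Windows
        child.wait()
        child.stdout.close()
        return flag.exists()
```

A forcibly killed process gets no chance to run its `finally` block, so the function returns `False`, which is the situation an aborted CI job is likely to produce.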

I'm favoring solution 1 implemented in pepsi.
