
spack ci rebuild: Don't install in a login shell #31771

Merged: 1 commit into spack:develop from pr-spack-ci-rebuild, Aug 2, 2022

Conversation

@blue42u (Contributor) commented Jul 27, 2022

On some systems (e.g. OpenSUSE Leap 15) running the shell in login mode wipes important parts of the environment, such as PATH. Currently spack ci rebuild runs its generated install.sh script in a login shell, which causes every build job to fail with a mysterious `spack: command not found` error.

This hotpatch changes spack ci rebuild to run the install.sh script in a normal shell, without any extra modifications.
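
For context, a minimal sketch of the kind of change involved (the exact invocation lives in the spack ci rebuild implementation and may differ; install.sh is the generated script named above):

# Before (sketch): the generated script runs through a login shell, which
# re-sources /etc/profile and can clobber PATH on distros like Leap 15.
bash -l ./install.sh

# After (sketch): run the script in a plain shell, keeping the PATH and
# environment already set up by the CI job.
bash ./install.sh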

@spackbot-app (bot) added the commands and core labels on Jul 27, 2022
@blue42u (Contributor, Author) commented Jul 27, 2022

/cc @scottwittenburg as original author for this code

(I don't know why GitHub randomly cancels my checks sometimes...)

@scottwittenburg (Contributor) commented

@blue42u as I recall, we used the login shell when we were working on making failed gitlab builds reproducible outside of gitlab. When you're in your docker reproducer container, you'll run the install.sh script and that will be happening in a login shell, and we wanted the two environments to be as similar as possible.

@becker33 originally suggested this approach, so maybe he'll have some suggestion to add.

But at least one other person has complained about how the install is run in a login shell, so maybe we can do without it. However, unless this PR actually rebuilds some specs, the gitlab pipeline won't help us know whether the change is going to work. So maybe a WIP commit to add a new spec to one of the tested stacks, or some WIP change to one of the package files will be needed.

@blue42u (Contributor, Author) commented Jul 27, 2022

@scottwittenburg I pulled this commit out of my other WIP PR where I'm adding hpctoolkit-* stacks. The latest pipeline there successfully rebuilt a stack with OpenSUSE Leap 15.

@blue42u (Contributor, Author) commented Jul 27, 2022

> we used the login shell when we were working on making failed gitlab builds reproducible outside of gitlab. When you're in your docker reproducer container, you'll run the install.sh script and that will be happening in a login shell, and we wanted the two environments to be as similar as possible.

Thinking about this a bit more, I think using a login shell actually makes this "reproducibility" harder in practice. In a full CI job the final spack install is run in an environment formed by scripts running in this order:

export AWS_SECRET=...                  # 1. Load environment from GitLab project variables
source spack/share/spack/setup-env.sh  # 2. Load Spack into the environment (PATH)
spack env activate -V concrete_env     # 3. Activate the concrete environment
source /etc/profile                    # 4. Bash login shell (from -l)
spack install ...

Whereas when a user launches their own container with (docker|podman) run -it, they end up running spack install in an environment formed in this order:

source /etc/bash.bashrc                # (not 4). Bash interactive shell (default with TTY)
export AWS_SECRET=...                  #~1. Manually load environment from GitLab project variables
source spack/share/spack/setup-env.sh  # 2. Load Spack into the environment (PATH)
spack env activate -V concrete_env     # 3. Activate the concrete environment
spack install ...

The big problem is that (4) has a completely different position and content (on Leap 15, and possibly in other containers).

So in context, this PR removes (4) from the CI job case, leaving us with the simpler sequence:

export AWS_SECRET=...                  # 1. Load environment from GitLab project variables
source spack/share/spack/setup-env.sh  # 2. Load Spack into the environment (PATH)
spack env activate -V concrete_env     # 3. Activate the concrete environment
spack install ...
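
If it helps to see the difference concretely, here is a minimal check (assuming Docker and the opensuse/leap:15 image; /opt/spack/bin is just a hypothetical stand-in for what setup-env.sh prepends):

docker run --rm opensuse/leap:15 bash -c '
    export PATH="/opt/spack/bin:$PATH"        # stand-in for (2) setup-env.sh
    bash -c    "echo plain shell: \$PATH"     # keeps the inherited PATH
    bash -l -c "echo login shell: \$PATH"     # /etc/profile may reset it
'

If /etc/profile rewrites PATH, the login-shell line will no longer show /opt/spack/bin, which is exactly the spack: command not found failure mode from the description.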

@blue42u force-pushed the pr-spack-ci-rebuild branch 2 times, most recently from de3fc31 to 5d3f438, on August 1, 2022 04:03
@becker33 (Member) commented Aug 1, 2022

This argument convinces me that I was wrong to put it in a login shell to begin with.

becker33 previously approved these changes Aug 1, 2022
@becker33 (Member) commented Aug 1, 2022

@scottwittenburg I'll leave this to you to merge when you're convinced that this is suitably tested by #31598

On some systems the shell in login mode wipes important parts of the
environment, such as PATH. This causes the build to fail since it can't
find `spack`.

For better robustness, don't use a login shell.
@blue42u (Contributor, Author) commented Aug 2, 2022

Rebased onto latest develop, no functional change.

@blue42u (Contributor, Author) commented Aug 2, 2022

#31598 was also rebased onto this PR last night, and the latest pipeline from it rebuilt (part of) the HPCToolkit stacks successfully on a range of OSs: https://gitlab.spack.io/spack/spack/-/pipelines/147102

@becker33 merged commit 8c50ea9 into spack:develop on Aug 2, 2022
@becker33 (Member) commented Aug 2, 2022

Thanks @blue42u

@blue42u deleted the pr-spack-ci-rebuild branch on August 2, 2022 18:59
bhatiaharsh pushed a commit to bhatiaharsh/spack that referenced this pull request Aug 8, 2022
marcost2 pushed a commit to marcost2/spack that referenced this pull request Aug 26, 2022