self-hosted spack, or container spackOS #42082

Draft
wants to merge 53 commits into
base: develop

trws
Contributor

@trws trws commented Jan 15, 2024

This is a big, messy, WIP PR right now, but it collects all the changes I made to get spack to bootstrap a toolchain from glibc to a fully functional gcc, python and spack environment that can be relocated independently of the base system. An example container for x86_64 can be found at "trws/spackos-experimental:spackos" on dockerhub. It contains nothing but spack-built software in a spack root at /spack (plus a couple of text files and symlinks), and it is also a working bootstrapped spack installation and userland. For aarch64, use trws/spackos-experimental:spackos-aarch64.

Things that work surprisingly well:

  • Relocation just worked. ALL THE WAY DOWN. I'm pleasantly gobsmacked: I built the entire stack in one path, relocated it to another in the container, and stuff just works.
  • Because relocation works, we can now build spack binary packages with dynamically linked executables that will work on any linux distro. Download them on a musl system, and they just go.
  • After slapping in a couple of symlinks to provide /bin/sh and /usr/bin/env from the spack env at /spack/base and adding super-basic passwd and shadow files, the base system in the container pretty much just works. Bash, coreutils, basically everything you would expect, and despite spack bootstrap claiming it doesn't have clingo on first entry, a "spack bootstrap now" cleans that up without actually installing anything. 🤷

Things that definitely need work:

  • Actually doing the bootstrap is a PITA right now, and could be substantially more automated. Current process:
    • On an Ubuntu 22.04 x86_64 system (yes, this is a requirement for now), activate each environment in bootstraps/stage{1-4} in sequence and build it, subject to the caveats below. To make a "spackos" container, delete the .git/* line from the .dockerignore file, then use the Dockerfile in the bootstraps folder with this command from the spack root: docker buildx build -t spackos-experimental --load -f bootstraps/Dockerfile --target=spackos .
    • In stage4 the solver really likes to build a second glibc, and because the gcc includes one by default, you end up mixing two glibc instances, which ends very, very badly. Before building stage4 the hash for the stage2/3 glibc has to be forced in the env file for stage4.
    • Also in stage4, the python build does something magical that pulls in headers from /usr/local/* or /usr/include/x86_64-linux-gnu for some reason. I currently work around this by hiding those directories with bubblewrap, like so: bwrap --dev-bind / / --tmpfs /usr/local --tmpfs /usr/include/x86_64-linux-gnu spack install -v python
    • Many stage1 items are not strictly "necessary" and can be externals, but if externals are used anywhere in here then the ubuntu and spack OS requirements get messed up and it becomes nearly impossible to build anything useful in stage3 or later. Best to just build a bunch of extra crud for now.
  • Spack currently doesn't really know what the "spack" OS is, and hasn't grown a mechanism to handle OS switches like that yet, so it has to be overridden all over the place.
  • Base system libraries need special handling to keep the wrapper from mangling them. If glibc is added to the include path with -isystem, C++ compilation will fail because the glibc headers will be moved before the libstdc++ ones. The current workaround for that is really nasty.
  • The whole thing currently only works by lying to spack and saying that glibc and libxcrypt have no dependencies... This is not good, but it works until we can work around the issues in concretizing the multiple copies of all their build deps and avoid having them stick around in the graph.
  • Spack currently doesn't provide a mechanism for a "durable" environment (a way to keep a failed rebuild attempt from killing the currently working env), or for pinning a version of a base package like glibc other than very explicitly pinning the hash. This makes the experience of using it to manage everything a little rough. I would actually recommend playing with it to see how it works/feels; we'll need something new there for sure.
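
The manual process described above can be sketched as a dry-run command planner. This is only an illustration under stated assumptions, not the PR's actual tooling: it uses `spack -D <dir>` in place of interactively activating each environment, the helper names (`bwrap_prefix`, `stage_commands`, `plan`) are hypothetical, and it does not cover forcing the glibc hash in the stage4 env file. Nothing is executed; the commands are only printed.

```python
import shlex

# Stage directories named in the description (bootstraps/stage{1-4}).
STAGES = [f"bootstraps/stage{n}" for n in range(1, 5)]

# Host header directories that the stage4 python build was seen to pick up.
MASKED_DIRS = ["/usr/local", "/usr/include/x86_64-linux-gnu"]


def bwrap_prefix(masked_dirs):
    """Build the bubblewrap prefix that hides each directory behind a tmpfs."""
    prefix = ["bwrap", "--dev-bind", "/", "/"]
    for path in masked_dirs:
        prefix += ["--tmpfs", path]
    return prefix


def stage_commands(stage_dir):
    """Return the commands for one stage as argv lists (nothing is executed)."""
    # Assumption: `spack -D <dir>` stands in for activating the environment.
    install = ["spack", "-D", stage_dir, "install"]
    if not stage_dir.endswith("stage4"):
        return [install]
    # stage4: build python inside the sandbox first, then the rest of the env.
    python = bwrap_prefix(MASKED_DIRS) + [
        "spack", "-D", stage_dir, "install", "-v", "python",
    ]
    return [python, install]


def plan():
    """Flatten the per-stage commands into one ordered list."""
    return [cmd for stage in STAGES for cmd in stage_commands(stage)]


if __name__ == "__main__":
    for argv in plan():
        print(shlex.join(argv))
```

Keeping the commands as argv lists rather than shell strings means they could later be handed to `subprocess.run` unchanged, once the plan looks right.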


spackbot-app bot commented Jan 15, 2024

Hi @trws! I noticed that the following package(s) don't yet have maintainers:

  • gmp
  • linux-headers
  • linux-pam
  • mpfr
  • ncurses
  • spackos-base

Are you interested in adopting any of these package(s)? If so, simply add the following to the package class:

    maintainers("trws")

If not, could you contact the developers of this package and see if they are interested? You can quickly see who has worked on a package with spack blame:

$ spack blame gmp

Thank you for your help! Please don't add maintainers without their consent.

You don't have to be a Spack expert or package developer in order to be a "maintainer," it just gives us a list of users willing to review PRs or debug issues relating to this package. A package can have multiple maintainers; just add a list of GitHub handles of anyone who wants to volunteer.

@haampie
Member

haampie commented Jan 18, 2024

Few comments:

  • We could use this approach https://github.com/haampie/spack-intermediate-gcc-example to express the dependencies between the different bootstrap stages, and simplify configuration, so that bootstrapping is a single make invocation.
  • You're using Docker just to get a fixed toolchain to kick off bootstrapping from, right? But in principle it can also be the user's own toolchain? Is the only issue that in Spack you can't differentiate between the just-built gcc and the system toolchain if their versions coincide?
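
As a rough illustration of the first point, the stage dependencies could be written down declaratively and the build order derived from them. The sketch below is not taken from the linked repository: the dependency map, the `bootstraps/<stage>` layout, and the generated Make rules are all assumptions chosen for the example, and `build_order` and `makefile_rules` are hypothetical helper names.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each stage lists the stages it is built with.
STAGE_DEPS = {
    "stage1": set(),
    "stage2": {"stage1"},
    "stage3": {"stage2"},
    "stage4": {"stage3"},
}


def build_order(deps):
    """Return stages in an order where every stage follows its prerequisites."""
    return list(TopologicalSorter(deps).static_order())


def makefile_rules(deps, env_root="bootstraps"):
    """Render one Make rule per stage so that `make stage4` builds the chain."""
    lines = [".PHONY: " + " ".join(sorted(deps))]
    for stage in build_order(deps):
        prereqs = " ".join(sorted(deps[stage]))
        lines.append(f"{stage}: {prereqs}".rstrip())
        # Assumption: each stage is a spack environment directory.
        lines.append(f"\tspack -D {env_root}/{stage} install")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(makefile_rules(STAGE_DEPS), end="")
```

With rules like these, asking Make for the last stage would pull in the earlier ones in order, which is the "single make invocation" idea in miniature.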

trws and others added 26 commits January 30, 2024 08:49
bootstraps/Makefile defines the full bootstrap in terms of the
environments in the four subdirectories, after converting to
requirements, this is greatly cleaned up and almost entirely declarative
now.  The remaining issues being having to enforce the glibc hash in
stages 3 and 4, and relying on only the spack built gcc being exactly
version 13.2.0 and the host offering at least one lower-version gcc.
This is getting much closer, full bootstrap works again, stage numbers
line up, depfile all the way through, much cleaner.
…working compiler that needs path tweaks to work, fails to build runtimes because of path issues
Co-authored-by: Harmen Stoppels <harmenstoppels@gmail.com>
Co-authored-by: Harmen Stoppels <harmenstoppels@gmail.com>
@trws
Contributor Author

trws commented Mar 20, 2024

Quick update here:

I've had to focus on other things for the past while, but I'm going to try to get this rebased today and at least basically straightened out. My understanding is that the main thing that needs to happen before this can be merged is that it builds in CI, at least the final stage. This should require:

  • rebase/cleanup
  • add a stage3 container or build cache, built from this, somewhere we can use it as a CI stack base
  • add a stack that builds stage3 again (stage4/stable) with the stage3 container

I'm not entirely sure how to get the image into our container registry, so that's a bit of a blocker, but I'll see what I can do.

@Mic92

Mic92 commented Apr 19, 2024

Thanks. I was waiting for it: #39712

4 participants