WIP: Ci docker use external mpich #3876

scottwittenburg
Collaborator

Build a non-spack mpich using ch3:sock:tcp, which spack doesn't support, and add it to the spack environment in the ci image as an external. The ch3:sock:tcp build should handle the oversubscription in CI much better, resulting in faster tests when building with mpich.

- change base image (smaller image and newer tools)
- clone spack branch and configure mirror to match
- update packages file:
    - so spack finds the external mpich
    - so mgard matches the required version (now an adios2 dependency)
- fix external mpich install so the mpicc compiler actually works
    - need the spack hwloc added to the mpich rpath
    - may need same with other deps it should share with adios2 ci envs
- remove adios2-ci-deps environment and extra mgard install
    - adios2 already has it
    - but spack still concretizes two hashes of mgard
    - the two mgards project to one view location and spack dies
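
For readers unfamiliar with spack externals, the packages file change listed above can be sketched roughly as follows. This is a minimal illustration, not the file from this PR: the helper name write_external_mpich, the version 3.4.3, and the prefix /opt/mpich are all assumptions, and the real packages.yaml also pins mgard.

```shell
#!/bin/sh
# Sketch: print a packages.yaml fragment that registers a hand-built MPICH
# as a spack external. Both arguments are hypothetical placeholders.
write_external_mpich() {
    version="$1"   # assumed version of the external build, e.g. 3.4.3
    prefix="$2"    # assumed install prefix of the external build, e.g. /opt/mpich
    cat <<EOF
packages:
  mpich:
    buildable: false
    externals:
    - spec: mpich@${version}
      prefix: ${prefix}
EOF
}

# Print the fragment to stdout; redirect it to a file to use it.
write_external_mpich 3.4.3 /opt/mpich
```

Setting buildable: false asks spack to use only the listed external rather than building its own mpich, which is the point of the change when spack's package cannot produce the desired device configuration.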
@scottwittenburg
Collaborator Author

It's odd that using the ch3:sock:tcp version sped things up so much on my desktop, but didn't seem to make a difference at all here. I'll run again locally now that it's built into the test image.

Collaborator

@vicentebolea left a comment

Looks good to me, great work there 👍

A few questions to help me understand the changes:


COPY packages.yaml /etc/spack/packages.yaml

# Install Base specs
RUN . /spack/share/spack/setup-env.sh && \
RUN git clone --depth 1 --single-branch --branch e4s-${E4S_VERSION} https://github.com/spack/spack && \
Collaborator

Spack should already be installed from the ecpe4s image. Is there any reason for this clone?

Collaborator Author

Spack is not already installed with the new image I used as a base.

Collaborator Author

The new base image is used in spack ci as one of the runner images, and is intended to have spack cloned before each run.

Collaborator

I see. Is there any reason to use the runner image rather than the minimal spack image we used before?

Collaborator Author

Several reasons:

  • the spack in the image we used before failed to concretize the stack with the external mpich
  • according to the maintainer of the E4S images, the runner images are:
    • better: modern gnu make, patchelf, cmake, ccache, ninja exposed in default PATH
    • smaller: 1.91GB uncompressed

Collaborator

Oh that makes sense

# # configure with "--enable-g=dbg,log" for runtime debug output
# #
RUN . /spack/share/spack/setup-env.sh && \
export HWLOC_LOCATION="$(spack location --install /$(spack find --format {hash:7} hwloc))" && \
Collaborator

Why not run spack load hwloc before this command?

Collaborator Author

I'll try that and see if it allows me to remove the specification of the hwloc path in the configure args.
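
To make the two options in this thread concrete, here is a rough sketch rather than the code from the PR. The helper rpath_flag is hypothetical, the lib subdirectory is an assumption about where hwloc installs its libraries, and the plain hwloc spec only works when exactly one hwloc is installed (the diff uses a hash to disambiguate).

```shell
#!/bin/sh
# Hypothetical helper: build the linker flag that embeds an rpath pointing at
# the lib directory under the given install prefix.
rpath_flag() {
    printf '%s' "-Wl,-rpath,$1/lib"
}

# Only query spack when it is actually available in this shell.
if command -v spack >/dev/null 2>&1; then
    # Option 1: ask spack for the install prefix and pass it on explicitly.
    HWLOC_LOCATION="$(spack location --install-dir hwloc)"
    echo "LDFLAGS=$(rpath_flag "$HWLOC_LOCATION")"

    # Option 2 (the reviewer's suggestion): load hwloc into the environment
    # and let the mpich configure step pick it up from there.
    # spack load hwloc
fi
```

Whether option 2 alone is enough to drop the explicit hwloc path from the configure arguments is exactly the open question in this thread; the sketch does not settle it.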

@scottwittenburg
Collaborator Author

I made a mistake when building the images, forgetting to switch the FROM statement in the images that build on the base. This is probably why mpich tests didn't complete any faster in the last run 🤦

I'm rebuilding/pushing now, and we'll see if it's any better.

@vicentebolea
Collaborator

@scottwittenburg I guess that this was superseded by #3883, shall we close?

@scottwittenburg
Collaborator Author

Yes, thanks for the reminder.
