Shrink #66
Conversation
More seriously, I'll take a look at it. My binutils external detection PR merged, so that will help. The biggest things left are actually getting a sparse checkout, which I want to talk over with you, and possibly dealing with this: spack/spack#29031
Ok, I'm going to add the fixes for the gold issue and some other cleanup here now that it proved out in the clang update.
This should now build a good bit faster, since it's no longer building cmake in spack, but it has more in it that downstream packages can use, and it configures spack to automatically reuse what's there. If you don't like any of this, say so and I'll pull it out, but I also reworked it to use an environment to manage the view, and configured spack through the environment to use that.

I was arguing with myself about getting rid of the git overhead for grabbing spack. We could grab the pinned commit and unpack it as a tarball instead. That drops the container size by ~100 MB, and fetches in less than 5 seconds, but of course the result is not a functioning git repo. If the point is to keep the one pinned commit, then that's probably fine, but if not... Not sure, what do you think?

The follow-on question: now that this builds in just over 3 minutes in GitHub Actions, what do you think about wiring up the containers based on this one to build against the version for the current day? I have no idea how complex this is, so treat that as a completely naive question.
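The tarball fetch mentioned above might look something like this (a sketch; the URL pattern is GitHub's standard commit-archive endpoint, while the SHA and target directory are illustrative placeholders, not the PR's actual values):

```shell
# fetch_spack: grab one pinned spack commit as a tarball, skipping the git
# history entirely. The result is the source tree only, not a git repo.
fetch_spack() {
  local commit="$1" dest="$2"   # e.g. a pinned SHA and /opt/spack
  mkdir -p "$dest"
  curl -fsSL "https://github.com/spack/spack/archive/${commit}.tar.gz" \
    | tar -xz --strip-components=1 -C "$dest"
}
```

GitHub serves these archive tarballs for any commit, which is what makes the fetch fast; the trade-off, as noted, is that the unpacked tree has no git metadata.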
Sorry, didn't see this; I've been so distracted with dwarf and gather.town today, lol! Let me take a look.
I think we probably want a functioning git repo for anyone that wants to use a different spack history commit, etc.
You mean the other matrix builds? They will already use whatever is latest for the base images. Or do you mean something else?
You've mentioned a couple of times that there's a day delay, I assume because when the build starts, each one picks up the base that exists when it's created rather than the one that would be generated that day. Perhaps I misunderstood?
That can happen, yes - but the base images workflow triggers before the matrix images, so I think they could be used the same day, though I cannot guarantee it. What does have a day delay is a non-base image (e.g., a matrix build) that uses another matrix build (since they build at the same time).
Ah, got it, so probably nothing to do then. Great! If you're happy with this behavior, I can propagate it to the other Ubuntu base images. What do you think?
I think that would be ok with me - ping @davidbeckingsale to check! Given how much redundancy there is in the chunks that we run, I'm wondering if we should have scripts to copy and run instead? 🤔
Looks good to me!
Okay, looks like I can see GitHub on my other computer! That's super weird. So for next steps: do we want to update the PR here with the other containers, and then we can see the recipes side by side and decide if there is any redundancy to put in one place?
We have a lot of redundancy right now, especially after what I added. I was actually planning to ask you about scripts: shall I add a scripts directory, either at the ubuntu level or at the top? At least the spack setup and the cmake setup could easily be scripts; I'll take a look at the others.
@trws we can remove the autamus cache; it's going to be refactored into something different, and it's probably mostly broken given all the hash changes anyway.
It seems like the clang builds are failing with a bus error again; I think that means they are not pulling the up-to-date ubuntu images, since those configure spack to reuse the system python. 🤔
…inst the current built ubuntu bases we can not fix that here
Just added some comments! Overall this is looking really good!
nvidia-ubuntu/Dockerfile
Outdated
COPY ./scripts /opt/scripts
COPY ./ubuntu/scripts/* /opt/scripts
Why do we have two script directories? E.g., it would be ideal to have them more organized in one place, and then do one copy to interact with.
Because the scripts in the upper directory aren't ubuntu specific. The spack setup and cmake fetch are both general operations. I can move them in if that's your preference, but the spack setup one would make sense for the alpine containers too if they are going to run spack.
Ah, one source with subdirectories like you mention below would work, I can do that.
How about:
./scripts/
    ubuntu/
    generic/
so we can just do one copy for everyone?
Yeah, definitely. I just put the general ones at ./scripts for now but that's basically what I'm doing.
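With a layout like that, the Dockerfiles could drop the second copy entirely. A sketch, assuming the proposed directory structure (paths are from the discussion, not verified against the repo):

```dockerfile
# One COPY picks up both the generic and the distro-specific scripts,
# assuming the proposed ./scripts/{generic,ubuntu}/ layout.
COPY ./scripts /opt/scripts
# Images then invoke e.g. /opt/scripts/generic/... or
# /opt/scripts/ubuntu/... as appropriate for the distro.
```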
nvidia-ubuntu/Dockerfile
Outdated
COPY ./scripts /opt/scripts
COPY ./ubuntu/scripts/* /opt/scripts
RUN ./scripts/apt-install-defaults-plus-args
ENV CMAKE=3.20.4
One suggestion - it's helpful to group env vars in one place, and then group RUN commands to reduce layers. For these sections:
ENV CMAKE=3.20.4
ENV PATH=/opt/spack/bin:$PATH
ENV SPACK_ENV=/opt/env
WORKDIR /opt
COPY <one script source with subdirectories> /opt/scripts
RUN ./scripts/apt-install-defaults-plus-args && \
./scripts/install-cmake-binary && \
./scripts/set-up-spack
I will need to test this; the SPACK_ENV var may have to come after the set-up-spack run, since having it set but not initialized may cause spack to fail. But I take your point, there's no reason not to merge most of it; will do.
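In Dockerfile terms, the ordering concern here might look like this (a sketch; the script name is taken from the snippets above, and the env path is assumed):

```dockerfile
# Create the spack environment first...
RUN ./scripts/set-up-spack
# ...and only then export SPACK_ENV, so spack never sees a path to an
# environment that has not been initialized yet.
ENV SPACK_ENV=/opt/env
```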
oh that's weird, I can't believe spack would do that (I did not know).
nvidia-ubuntu/Dockerfile
Outdated
cp -fR cmake-3.10.1-Linux-x86_64/* /usr && \
rm -rf cmake-3.10.1-Linux-x86_64 && \
rm cmake-3.10.1-Linux-x86_64.tar.gz
# Add appropriate paths pointing to the view |
These also should go at the top. For a Dockerfile you generally do:
- ARG/FROM
- ENV/LABEL
- RUN that doesn't require code
- WORKDIR / COPY for getting needed assets in container
- RUN that requires code, grouping into logical blocks
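Putting those five groups together, a minimal skeleton might look like the following (the base image and package list are illustrative placeholders; the script names are taken from the snippets in this PR):

```dockerfile
# 1. ARG / FROM
ARG BASE=ubuntu:20.04
FROM ${BASE}

# 2. ENV / LABEL
ENV CMAKE=3.20.4 \
    PATH=/opt/spack/bin:$PATH

# 3. RUN that doesn't require code from the build context
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl ca-certificates && \
    rm -rf /var/lib/apt/lists/*

# 4. WORKDIR / COPY to bring needed assets into the container
WORKDIR /opt
COPY ./scripts /opt/scripts

# 5. RUN that requires code, grouped into logical blocks
RUN ./scripts/apt-install-defaults-plus-args && \
    ./scripts/install-cmake-binary && \
    ./scripts/set-up-spack
```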
Ok, I think that should be fine. I normally order by when something would be valid, but since these are search paths it shouldn't matter; they just won't have anything in them yet.
scripts/set-up-spack
Outdated
@@ -0,0 +1,48 @@
#!/bin/bash |
Could we make these clearly identified as shell scripts (e.g., set-up-spack.sh)?
scripts/set-up-spack
Outdated
# Use the autamus build cache for maybe faster install?
# NOTE: disabling on 2022-2-22 to try to work around binutils fetch errors
#python3 -m pip install botocore boto3
#spack mirror add autamus s3://autamus-cache
#curl http://s3.amazonaws.com/autamus-cache/build_cache/_pgp/FFEB24B0A9D81F6D5597F9900B59588C86C41BE7.pub > key.pub
#spack gpg trust key.pub
Yeah, just delete all this for now; we will refactor to consider autamus when we know it's going to help and not hinder. And it's not autamus's fault, it's spack breaking things.
Agreed, I'm still confused about what's actually causing that binutils failure.
valgrind \
vim \
wget
WORKDIR /opt |
Same comments for here and further down about order / consolidation.
Aside from the comments, which I'll work on in a second, I need to figure out how to get the clang builds to use the ubuntu builds from this PR, or how to get them out of this entirely. They depend on things from the new base, and because they're using an older version as the base instead, they consistently fail even if they build. Trying reverting the clang dockerfile for now.
@trws for a PR, if you make changes to a file it will build, so you are 100% correct! Just remove the clang changes and we can open a PR for that separately after the base ubuntu images are merged/deployed (and then the matrix ones won't fail). It's definitely not a perfect system having base containers --> used by --> matrix containers, but if you have an idea to make it better we can definitely try! btw, if the lab had peer bonuses I'd totally give you one for this work; I am extremely grateful for the time/attention you are putting into these improvements!
Cool! And thanks, I'm happy to help get some of this stuff going again. We need a solid base to use for our testing and deployment stuff, it's well worth it, and I like how you set this up. My biggest gripe with our old way was that getting newer or "latest" versions was a pain; having uptodate in the mix should make that a lot nicer. As a side note, while we do not have peer bonuses (or real bonuses in general), comp does do "SPOT award" nominations that one can put in for other employees. Not fishing here, but I've heard group leads asking for more nominations, so it may be something to keep in mind.
Ok, I think this one is cleaned up and ready to go. There will need to be at least one downstream PR to do clang and the other ubuntu downstream containers, and if you would like, I can apply this to the alpine base as well.
This looks great! Let's merge and then first work on the matrix builds that use it. If those work and people have need for the other bases (alpine?) we can go back and add another set of bases.
Thank you so much @trws !
They are deployed! 🥳
Replaces #65