
Use Travis CI #134

Merged
merged 4 commits into master from use_travis_ci on Mar 7, 2016

Conversation

jakirkham
Member

Related: #7

This doesn't include releases or anything else at this point. All this does is try to use Travis CI to do the builds. It also makes sure all images have been refreshed so that, hopefully, Dockerfiles that are unchanged and whose dependencies haven't changed won't be rebuilt.
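
Roughly, that boils down to a couple of shell steps (a sketch only; make refresh-all is the target discussed below, while the build target name here is just a placeholder, not necessarily what this PR runs):

# Sketch of the CI job: refresh images first so unchanged Dockerfiles can reuse
# cached layers, then rebuild every stack and let any failure fail the build.
make refresh-all   # existing target (see discussion below) that refreshes the images
make build-all     # placeholder name for "build every stack in order"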

@jakirkham
Member Author

Fun. I guess Travis CI was already enabled here. So, we can just wait and see if this works or not. :)

@parente
Member

parente commented Mar 4, 2016

If this works out, it'll be a nice way to vet changes before we merge.

@parente
Member

parente commented Mar 4, 2016

Ah, it's rebuilding from scratch because of the FROM debian:jessie. It's not on the Travis box, so it gets pulled, which effectively resets the cache, possibly even if it's exactly the same image that was used in the make refresh-all step.

@jakirkham
Member Author

If this works out, it'll be a nice way to vet changes before we merge.

That's my hope. It may even double as a mechanism for releases, which would reduce the maintenance burden.

Ah, it's rebuilding from scratch because of the FROM debian:jessie.

I don't think that should be a problem. Unless, of course, debian:jessie has been updated since our last release. Though it doesn't hurt to pull it as well.

So, I saw this line in the Makefile. Is this something that exists on the VM? If so, maybe we need to create it here so the cache is not invalidated. Are there any other changes we need to make?

@jakirkham
Member Author

Yeah, we should add a .dockerignore file and add stuff to it like .git. This stuff would likely change and invalidate Docker's cache, which is exactly what we want to avoid here. We are always building from the top level directory, correct? If so, we can just add it there.
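
Something like this at the top level, just as a sketch (the entry list is a suggestion, not vetted):

# Illustrative: create a .dockerignore at the repo root so repository metadata
# never enters the build context. Add other unneeded paths as appropriate.
cat > .dockerignore <<'EOF'
.git
EOF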

@parente
Member

parente commented Mar 4, 2016

So, I saw this line in the Makefile. Is this something that exists on the VM? If so, maybe we need to create it here so the cache is not invalidated. Are there any other changes we need to make?

Yes. It was a safeguard to ensure I and others didn't accidentally release from a local box.

@jakirkham
Member Author

So, the build procedure might need to change a little bit. As the cache includes the entire repo, if something changed in some unrelated part, then the whole cache will still be invalidated and we need to build from scratch. Also, it appears that progress bars are causing our log to be too long for Travis. ( travis-ci/travis-ci#3034 )

@parente
Member

parente commented Mar 4, 2016

As the cache includes the entire repo, if something changed in some unrelated part, then the whole cache will still be invalidated and we need to build from scratch.

Does Travis cache anything across builds? In the manual process today, the "cache" comes from building on the same VM again and again, so that the history just happens to be there. When the latest source is fetched from git, none of the local files are touched except those that changed in master. So only the images that truly had changes on master break the cache. Everything else builds speedily.
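
As an illustration of that point, on the VM you can see which top-level image directories an update will actually touch before building (sketch only):

# Only the image directories that show up here will miss Docker's layer cache.
git fetch origin
git diff --name-only HEAD origin/master | cut -d/ -f1 | sort -u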

@jakirkham
Member Author

Would I be able to access the VM to take a look? I think it would help me understand what is going on a bit more.

@parente
Member

parente commented Mar 4, 2016

Added your GitHub public keys to the VM. The instructions for SSHing in and navigating around are in the README: https://github.com/jupyter/docker-stacks#maintainer-workflow.

If you go there right now and run make release-all, it should rip through every step and wind up pushing nothing, because the build is idempotent and the last build is already on Docker Hub. (In theory. :)
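
For reference, that flow is roughly the following (the host variable is a placeholder; the README link above is the authoritative version):

# Rough shape of the manual release flow on the build VM.
ssh "$BUILD_VM"        # placeholder for the actual host; keys were added above
tmux new -s release    # keep the build alive if the SSH connection drops
make release-all       # idempotent: if nothing changed, it ends up pushing nothing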

@jakirkham
Member Author

Perfect. Thanks.

@jakirkham
Member Author

So, it looks like our copy of debian:jessie in the VM is from 2 weeks ago, but the Docker Hub copy got updated 3 days ago. This is probably why everything is getting rebuilt from scratch. I am not sure whether we should do anything about this.

Admittedly, there will be times when our releases will be using an older copy of debian:jessie, so they will also need to be rebuilt from scratch. If we can get this to work on Travis, where everything is rebuilt from scratch, we will be fairly confident that our Travis setup will work in all cases.
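
(A quick way to see that kind of staleness, sketched here; not necessarily the exact commands used:)

# Compare the local base image's age against what Docker Hub currently has.
docker images debian        # the CREATED column showed ~2 weeks old locally
docker pull debian:jessie   # reports "Image is up to date" only if it already matches the Hub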

@jakirkham
Member Author

Even if this doesn't work, it looks like we will still get a few bug fixes.

@jakirkham jakirkham force-pushed the use_travis_ci branch 2 times, most recently from f2e0ddf to 2d467a5, on March 4, 2016 at 23:28
@jakirkham
Member Author

Well, unfortunately, I don't think this is going to work. We are running into the maximum time limit here. Could we maybe consider using Docker Hub's automated builds for minimal-notebook and minimal-kernel? That way at least these base images would already be built for us, and it would significantly cut down the build time here.

@parente
Member

parente commented Mar 5, 2016

The problem with the Docker Hub auto build is that there's no way to tag according to the SHA. We'd have to make a tag/branch here every time, and then manually tell Docker Hub to build those.

@jakirkham
Member Author

Sure, that's reasonable. What about an internal image that includes the apt-get step from minimal-notebook?

@jakirkham
Member Author

There were some merge conflicts, which have been resolved. Though I think it will be a while still before we need to worry about merging this (if at all).

@parente
Member

parente commented Mar 5, 2016

I'm OK with merging it once it goes green and leaving it active as a way to test PRs, even if it's not tied to the release process. It beats having nothing at all while we continue to look for ways of automating the entire workflow. Plus, we'll start to get data on how many times it actually works on Travis vs fails because of timeouts or other problems.

@jakirkham
Member Author

My concern is that the debian image gets updated so frequently that we will often have to rebuild everything from scratch through the CI and see these build failures. It would be nice to have a way to mitigate this.

A totally different thought might be to create a "mirror" of the debian image. Namely, we add a Dockerfile with just a FROM statement like so and place the built image on Docker Hub.

FROM debian:jessie

We could then just update this "mirror" image for releases. As it won't be updated between releases, we won't have to deal with cache invalidation in the CI. Releases could continue to proceed through the VM as they do now.
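
A sketch of what maintaining that mirror might look like at release time (the directory and image names are made up for illustration):

# Hypothetical release-time refresh of the "mirror" image.
docker pull debian:jessie                             # grab the current upstream base
docker build -t jupyter/debian-mirror debian-mirror/  # dir holding the one-line Dockerfile above
docker push jupyter/debian-mirror                     # the other stacks would then FROM this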

That all being said, there is a chance that some change to the debian image breaks our release, and this system would not surface that sort of problem. Though this is normally a rare enough event that it shouldn't be much of an issue. Besides, even if we merge the currently proposed change, the CI system won't be able to catch this problem anyway, because a complete rebuild doesn't fit in the time limit. Thoughts?

@parente
Member

parente commented Mar 5, 2016

My concern is that the debian image gets updated so frequently that we will often have to rebuild everything from scratch through the CI and see these build failures. It would be nice to have a way to mitigate this.

To put this in perspective, our current manual build system mitigates this problem by keeping debian cached on the build VM. It's nice and speedy, but the downside is that when debian is updated for good reason (e.g., a CVE), unless we know to re-pull it, we're building off a base image with known defects.

There's no silver bullet here. We need to pick what we're willing to trade off: security/fixes to the base for build speed or vice versa. I honestly think "we're doing it wrong" today. We should be providing users with the most secure / bug-free images we can, not optimizing for what's convenient to run on Travis or the current build VM.

A totally different thought might be to create a "mirror" of the debian image. Namely, we add a Dockerfile with just a FROM statement like so and place the built image on Docker Hub.

I feel like we're catering to limitations of the free CI systems in doing this. But that's just my opinion.

@jakirkham
Member Author

So, I think there are two different issues and it would be better if we looked at them separately.

  1. Testing changes to the stacks.
  2. Deploying changes made to the stacks.

My primary issue at present is to solve 1. Solving 2 would be nice, but IMHO it is secondary to solving 1. The reason I think 1 is more important is that it is hampering our development cycle. Having reliable automated testing gives us some confidence in changes proposed to the stacks. It also cuts down on developer time spent verifying these changes, and it gives contributors feedback sooner. Once we have resolved the bottleneck in 1, we can focus on 2. However, if we focus on 2 now, 1 remains a bottleneck, and so it doesn't really do us any good.

If our current solution is to keep the debian image cached on the VM and that is how we are doing deployments, then I think having our testing infrastructure mimic that is reasonable. To my understanding, that is what my proposal above does.

As clarification on another point, I don't think we should worry about doing releases with Travis. We should stick to the VM at present. Though we certainly can revisit this if we find we have a configuration of Travis that is reliable for automated releases.

In the long term, we do need to have a discussion about fixing our testing and release systems (and I think you have raised some valid concerns). Though I think that pertains to point 2. As stated before, making sure 1 is done reliably will speed up our development time and allow us to focus more on 2.

@parente
Member

parente commented Mar 7, 2016

I agree that we should be tackling #1 here.

To my understanding, that is what my proposal is above.

Sorry. I'm losing track. I'm assuming this is the proposal:

Namely, we add a Dockerfile with just a FROM statement like so and place the built image on Docker Hub.

How about we instead use a digest reference in the minimal-notebook Dockerfile to pin a specific debian image, rather than the tag (which floats across rebuilds) or our own internal image (which adds maintenance burden)?

# debian:jessie pinned on 2016-03-06
FROM debian@sha256:a9c958be96d7d40df920e7041608f2f017af81800ca5ad23e327bc402626b58e

# rest of Dockerfile here

When there are rebuilds of debian with critical security fixes, we do a PR with a new digest.
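
For reference, one way to get the digest to paste into that FROM line (a sketch; RepoDigests needs a reasonably recent Docker):

# Pull the current debian:jessie and print its repo digest for pinning.
docker pull debian:jessie
docker inspect --format '{{index .RepoDigests 0}}' debian:jessie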

@jakirkham
Member Author

Sure, that sounds fine. I'll update this PR to reflect that change.

This is done to more explicitly track what version of Debian Jessie is
being used as a base image. It will also ensure that it is properly
updated on the VM even if we forget. This also should help CI and VM
builds stay speedy by using the cache even when there is a newer version
of Debian Jessie. In the long run, we may wish to re-evaluate this
strategy and fix our CI and deployment systems so as to be able to use
the latest version of Debian Jessie with important CVE and other fixes.
@jakirkham
Member Author

How does this look?

@jakirkham jakirkham changed the title from "WIP: Use Travis CI (Proposal)" to "Use Travis CI" on Mar 7, 2016
@parente
Member

parente commented Mar 7, 2016

It looks like the right stuff, but Travis doesn't seem to think so. It used the cache up to the line that installs Miniconda, and at that point, it started building the layers from scratch.

@jakirkham
Member Author

I added quiet flags to every conda command. This was done because the logs were exceeding a maximum log size on Travis (~4MB). As a result, it is doing the right thing and tossing the cache at the conda install step, since the quiet flag is new.

We could do another release after merging this. Alternatively, I could backport the quiet flags and we could release that. Either should help resolve this issue for future PRs.

@parente
Member

parente commented Mar 7, 2016

Ah, you mean this hasn't been rebased? Go ahead and rebase it. We can kill and restart the build. Let's confirm that it works as expected before merging.

@@ -70,11 +71,11 @@ RUN cd /tmp && \
     echo "9ea57c0fdf481acf89d816184f969b04bc44dea27b258c4e86b1e3a25ff26aa0 *Miniconda3-3.19.0-Linux-x86_64.sh" | sha256sum -c - && \
     /bin/bash Miniconda3-3.19.0-Linux-x86_64.sh -f -b -p $CONDA_DIR && \
     rm Miniconda3-3.19.0-Linux-x86_64.sh && \
-    $CONDA_DIR/bin/conda install --yes conda==3.19.1 && \
+    $CONDA_DIR/bin/conda install --quiet --yes conda==3.19.1 && \
@jakirkham
Member Author

This is why the Miniconda install line cannot come from cache. The change is not currently in master.

@jakirkham
Member Author

Sorry, I was a bit unclear. Above is an example of the quiet flag added in this PR that was not in master. This is the issue I was running into with Travis CI that I was trying to resolve. ( travis-ci/travis-ci#3034 )

@parente
Member

parente commented Mar 7, 2016

Ah. My bad. Wrong way. OK. Let's let the build go as far as it can, then we can merge. At that point, we can rebuild the images on the build VM, because there were indeed changes (albeit superficial ones for the quiet flag). After that build/push, we can kick Travis to have it rebuild again and see what happens.

@jakirkham
Member Author

No worries. I had forgotten about the quiet flag busting the cache.

Ok, sounds good.

@jakirkham
Member Author

Appears to have passed. :)

@parente
Member

parente commented Mar 7, 2016

:shipit:

parente added a commit that referenced this pull request Mar 7, 2016
@parente parente merged commit 0cddf3e into jupyter:master Mar 7, 2016
@jakirkham jakirkham deleted the use_travis_ci branch March 7, 2016 03:19
@jakirkham
Member Author

Alright, it's building on the VM in a tmux session named release. Might need your help with the wiki though.

@parente
Member

parente commented Mar 7, 2016

The webhook automatically updates the build page on the wiki when the last image (currently all-spark-notebook) finishes pushing successfully. Nothing you should need to do.

@jakirkham
Member Author

Thanks for clarifying.

@parente
Member

parente commented Mar 7, 2016

Not a problem. I've added a similar note to the maintainer section of the root README as well.

@jakirkham
Member Author

Sounds good.

Also, it looks like we are getting a little close to the disk space limit on the VM. I cleared out some dangling images from yesterday's build. Though we probably should clear out some more space.
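
(Roughly what was run to clear them; the standard dangling-image cleanup:)

# Remove dangling (untagged) images to reclaim disk space on the VM.
docker images --quiet --filter dangling=true | xargs -r docker rmi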

@jakirkham
Member Author

Released. Wiki is updated.

@jakirkham
Member Author

If we decide to go the VM route in the future, conveyor might be worth checking out.

@parente
Member

parente commented Mar 7, 2016

Ooooo.

@jakirkham
Member Author

Yeah, I just came across it last night. Figured it would be better than us learning how to set up GitHub status updates and webhooks from scratch.

rochaporto pushed a commit to rochaporto/docker-stacks that referenced this pull request Jan 23, 2019
Making possible to specify custom claims for username in jupytherhub_config.py