Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[docker, scorecard] add docker base image, use in scorecard #5623

Merged
merged 5 commits into from
Mar 19, 2019

Conversation

cseed
Copy link
Collaborator

@cseed cseed commented Mar 18, 2019

FYI @danking @jigold @akotlar This might be contentious. The wisdom is to make docker images small, but I think this will greatly improve our docker pull/build/push times.

Watching some recent builds and deploys, it seems insane, the time we spend and the number of images we build with (mostly) duplicate contents. The goal here is to build a common base image for all containers in our cluster. I built on Ubuntu instead of Debian because it has Python 3.6 by default. It contains useful stuff I often find myself installing when debugging inside the cluster (htop, emacs, others should feel free to add tools they like), python3 and pip packages needed by any service (no conda, thank you very much), jvm and the google-cloud-sdk with kubectl. I also build a spark-base (built on base) with spark and the google hadoop connector.

Currently, I just use the base in scorecard. If we like this, I'll convert the other services (and pr-builder, which is hopefully not long for this world) on top of this. I will also make image fetcher fetch the base image as well as the notebook images.

The current scorecard image is ~900MB. The new base image is ~2GB, and scorecard adds an additional 15KB to that.

@cseed cseed changed the title add docker base image, use in scorecard [docker, scorecard] add docker base image, use in scorecard Mar 18, 2019
jigold
jigold previously requested changes Mar 18, 2019
pip decorator pylint pytest flake8 \
requests \
jinja2 \
aiohttp aiodns aiohttp_jinja2 uvloop>=0.12 \
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add asyncinit? https://github.com/kchmck/pyasyncinit I'm going to use this for the MySQL PR.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is this file used?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's used in the spark-base image to configure the Google hadoop connector.

@jigold
Copy link
Collaborator

jigold commented Mar 18, 2019

I agree about conda and having a set of docker images for the whole repo. I ran into other problems with conda and inter-project dependencies last week. The solution was to not rely on conda to specify the environment and just use setup.py. I can see this approach failing if different projects require different versions of the same package. Also, I think ci is using Python 3.7 while other projects are 3.6. We should probably pick a version for the entire project.

@cseed
Copy link
Collaborator Author

cseed commented Mar 18, 2019

I can see this approach failing if different projects require different versions of the same package. Also, I think ci is using Python 3.7 while other projects are 3.6. We should probably pick a version for the entire project.

Agreed. We just need to have the discipline to have all projects work with a fixed set of versions. What features of 3.7 does ci rely on?

@jigold
Copy link
Collaborator

jigold commented Mar 18, 2019

Not sure. @danking Do you know why ci uses Python 3.7?

@danking
Copy link
Contributor

danking commented Mar 18, 2019

It uses it because 3.7 was the newest when I started writing CI. Apparently asyncio got a lot better in 3.7. I think we should use 3.7 across the board for our non-user-distributed projects. https://docs.python.org/3/whatsnew/3.7.html#whatsnew37-asyncio

@cseed
Copy link
Collaborator Author

cseed commented Mar 18, 2019

OK, I'll upgrade to 3.7. There is a Python ppa with the latest version.

@cseed
Copy link
Collaborator Author

cseed commented Mar 18, 2019

I am (temporarily) defeated. Getting a working python 3.7 with pip on a standard distribution turns out to be non-trivial. Since ci doesn't specifically rely on 3.7 yet, I say we start with 3.6 and upgrade when it isn't too painful.

It seems everyone is onboard with this. I am happy to move forward if you are, @jigold.

@danking
Copy link
Contributor

danking commented Mar 19, 2019

We spoke about this in person, but I had this mostly typed up in this buffer so I'll leave it here for future us.

Some context on the docker build cache'ing situation.

Originally, Docker would use any image layer it had as a cache source. This was noted as a severe security vulnerability because I could make an image that claims to be the result of apt-get install curl but actually was the result of apt-get install virus.

In response, Docker banned the use of non-locally-built (i.e. from the Internet) image layers as cache sources. I believe there may have been other motivations as well, but I did not carefully investigate. moby/moby#26065 documents the desire for a way to use non-locally-built images as a cache source. moby/moby#26839 implements this. Unfortunately, and I cannot find documentation on this, --cache-from X means "cache only from X". If you pass multiple --cache-froms each one is used as a cache source, but it is not possible to say "use all local images as a cache source" (other than enumerating them all). --cache-from was included in v1.13.0, released January 2017.

Another subtlety of --cache-from is that it does not pull the image in question if it is not found locally. I only found this documented in a comment on the implementing PR.

Docker seems to be in maintenance mode and all new development is going into Moby. The replacement for docker build is called buildkit. Build Kit has a more reasonable cache'ing strategy wherein one exports and imports ones cache to a trusted repository.

@cseed cseed dismissed jigold’s stale review March 19, 2019 03:09

ready for another look

@cseed
Copy link
Collaborator Author

cseed commented Mar 19, 2019

@danking img: https://github.com/genuinetools/img

"Standalone, daemon-less, unprivileged Dockerfile and OCI compatible container image builder.

img is more cache-efficient than Docker and can also execute multiple build stages concurrently, as it internally uses BuildKit's DAG solver.

The commands/UX are the same as docker {build,tag,push,pull,login,logout,save} so all you have to do is replace docker with img in your scripts, command line, and/or life."

Oops, seems it doesn't quite work unprivileged yet, see: https://github.com/genuinetools/img#running-with-docker. Waiting on an upstream docker change, no movement in two months. Hrm.

@danking danking merged commit b33d185 into hail-is:master Mar 19, 2019
@danking
Copy link
Contributor

danking commented Mar 19, 2019

@cseed that seems good enough for our purposes. We can run it, unprivileged, on a VM outside of k8s. We have a thin service in k8s that has privilege to talk to that VM. It streams build context and docker file to said VM. That VM invokes img. That VM is secure as long as img doesn't allow builds to escalate their privileges.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants