Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add explicit gcc dependency in Dockerfile #262

Closed
wants to merge 1 commit into from

Conversation

orangejulius
Copy link
Member

@orangejulius orangejulius commented Nov 1, 2021

In pelias/docker-baseimage#23 we're leaning out our Docker baseimage used by all other Pelias images, and hopefully can remove the compiler toolchain all together.

The Polylines Docker images do need gcc (but not quite a full compiler toolchain), but didn't follow the convention in our other Dockerfiles of having an apt-get step to install it.

This PR adds such a step, and is a little clever in installing gcc only temporarily, and only for the go get step that requires it.

The polylines Docker image is already quite large (950MB uncompressed) since it includes Node.js, Go, and package dependencies for both. Skipping the installation of gcc cuts out 120MB of that.

Until pelias/docker-baseimage#23 this change won't really have any impact on the size or operation of this Docker image.

In pelias/docker-baseimage#23 we're leaning out
our Docker baseimage used by all other Pelias images, and hopefully can
remove the compiler toolchain all together.

The Polylines Docker images _do_ need `gcc` (but not quite a full
compiler toolchain), but didn't follow the convention in our other
Dockerfiles of having an `apt-get` step to install it.

This adds such a step, and is a little clever in installing `gcc` only
temporarily, and only for the `go get` step that requires it.

The polylines Docker image is already quite large (950MB uncompressed)
since it includes Node.js, Go, and package dependencies for both.
Skipping the installation of `gcc` cuts out 120MB of that.

Until pelias/docker-baseimage#23 this change
won't really have any impact on the size or operation of this Docker
image.
@missinglink
Copy link
Member

This should be fine, although I'm not convinced the cleanup is removing everything that was installed.

I believe apt remove doesn't delete some files which --purge does, maybe only user config files? I think the apt cache is also not being removed.

My preference for this sort of thing would be a multi-stage docker build where the binary is generated in an isolated stage and then just copied into the next stage.

eg: https://github.com/missinglink/pbf/blob/master/Dockerfile#L14

@missinglink
Copy link
Member

We could also drop the Go dependency in the final docker image since the go runtime is compiled into the binary?

@orangejulius
Copy link
Member Author

Oh yeah, a multistage build would work really well here. I'll take a look at that :)

@missinglink
Copy link
Member

missinglink commented Nov 1, 2021

I don't remember exactly what requires CGO in pbf but it's probably the SQLite binding, LevelDB binding etc.

In that case you'll need to have the headers installed at compile time, so like libsqlite, sqlite-dev or whatever it's called on Ubu.

They are dynamically linked by default so the runtime image will need a compatible version of the shared link library (.so on Linux, .dylib on Mac), these often have a similarly named apt package which can be much smaller since there is no source included.

@missinglink
Copy link
Member

If you have any issues with shared libs you can run ldd on the compiled binary to list all dynamically linked libs.

@orangejulius
Copy link
Member Author

Cool, I think shared libraries will be fine. I confirmed by looking at htop that it is something sqlite3 related that's building with gcc, and I don't think we use that functionality in the polylines importer. I'll test out a pelias prepare polylines but it should work, right?

@missinglink
Copy link
Member

I think it'll be fine since the baseimage will have libsqlite.so in it already anyway.

I'm actually not sure off the top of my head what would happen if that file didn't exist in the runtime docker image.

I'd assume it wouldn't even start the binary rather than erroring only on SQLite related functionality.

orangejulius added a commit that referenced this pull request Nov 1, 2021
The polylines Docker image is a bit of a large one currently, as it
includes not just Node.js and a `node_modules` directory, but a full
compiler toolchain, an install of the Go language, the dependencies of
the `pbf` repository from https://github.com/missinglink/pbf, and the
final `pbf` executable that comes from it.

All told, this brought the total image size to a whopping 950MB
uncompressed.

This PR makes use of multi stage builds to run the compiling of the
`pbf` executable in a separate container. After this, all the toolchain
and dependencies needed can be thrown away, and only the small
executable copied to the final image.

Using `container-diff` it looks like the image size, uncompressed, after
pelias/docker-baseimage#23 as well, will be only
322MB. That's a nice 600MB savings!

Before pelias/docker-baseimage#23 the image size
still drops to 500MB, still a healthy reduction.

Replaces #262
@orangejulius
Copy link
Member Author

Yup, it all worked out fine: #263

orangejulius added a commit that referenced this pull request Nov 1, 2021
The polylines Docker image is a bit of a large one currently, as it
includes not just Node.js and a `node_modules` directory, but a full
compiler toolchain, an install of the Go language, the dependencies of
the `pbf` repository from https://github.com/missinglink/pbf, and the
final `pbf` executable that comes from it.

All told, this brought the total image size to a whopping 950MB
uncompressed.

This PR makes use of multi stage builds to run the compiling of the
`pbf` executable in a separate container. After this, all the toolchain
and dependencies needed can be thrown away, and only the small
executable copied to the final image.

Using `container-diff` it looks like the image size, uncompressed, after
pelias/docker-baseimage#23 as well, will be only
322MB. That's a nice 600MB savings!

Before pelias/docker-baseimage#23 the image size
still drops to 500MB, still a healthy reduction.

Replaces #262
orangejulius added a commit that referenced this pull request Nov 1, 2021
The polylines Docker image is a bit of a large one currently, as it
includes not just Node.js and a `node_modules` directory, but a full
compiler toolchain, an install of the Go language, the dependencies of
the `pbf` repository from https://github.com/missinglink/pbf, and the
final `pbf` executable that comes from it.

All told, this brought the total image size to a whopping 950MB
uncompressed.

This PR makes use of multi stage builds to run the compiling of the
`pbf` executable in a separate container. After this, all the toolchain
and dependencies needed can be thrown away, and only the small
executable copied to the final image.

Using `container-diff` it looks like the image size, uncompressed, after
pelias/docker-baseimage#23 as well, will be only
322MB. That's a nice 600MB savings!

Before pelias/docker-baseimage#23 the image size
still drops to 500MB, still a healthy reduction.

Replaces #262
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants