Change dockerfile to use alpine instead of debian #77
This updates the docker file to use alpine.
Update Dockerfile - Alpine Base
Thanks for the pull request, @munntjlx! I will take a look at this soon.
Thank you for the PR, @munntjlx! I just took a look at this, and the alpine-based image does end up significantly smaller than the debian-based one: 57MB vs 216MB, nearly a 4x reduction.

Frustratingly, the scan performance of the alpine-based image is several times slower! Using

I suspect the performance difference is somehow due to using musl instead of glibc. Some quick searching online indicates that it may be due to the malloc implementation in alpine having much more contention in multithreaded settings than glibc's: https://www.linkedin.com/pulse/testing-alternative-c-memory-allocators-pt-2-musl-mystery-gomes/. This sounds plausible, as Nosey Parker does make heavy use of thread-based parallelism.

Interesting to see that Nosey Parker can build with musl! That may be relevant for one day producing statically-linked binaries (which is trickier than one would like here, due to a number of native-code dependencies).

I would take this PR if the performance did not drop significantly, as the images are many times smaller. But at present, I value the Nosey Parker scan throughput over container size, so I won't be merging this unless the performance can be addressed.

It may be possible to sidestep the performance drop by switching Nosey Parker to use an alternative allocator such as mimalloc instead. (Nearly all the allocation that Nosey Parker does is in Rust code, not C++.) But this would be a larger-scale investigation.
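For readers unfamiliar with how a Rust binary swaps its allocator, here is a minimal sketch of the mechanism. It is not Nosey Parker's code: `CountingAlloc`, `ALLOC_CALLS`, and `alloc_calls` are hypothetical names, and the wrapper simply forwards to the system allocator so the example builds with the standard library alone. The `mimalloc` crate documents the same `#[global_allocator]` pattern with its `MiMalloc` type in place of the wrapper.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical stand-in allocator: forwards every request to the
/// system allocator (glibc or musl malloc) and counts the calls.
struct CountingAlloc;

/// Number of allocation requests seen so far.
static ALLOC_CALLS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOC_CALLS.fetch_add(1, Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

// This attribute is the whole switch: every Box, Vec, and String in the
// program now allocates through GLOBAL instead of the default allocator.
#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Read the current allocation count.
fn alloc_calls() -> usize {
    ALLOC_CALLS.load(Ordering::Relaxed)
}

fn main() {
    let before = alloc_calls();
    let v: Vec<u64> = (0..1024).collect();
    std::hint::black_box(&v); // keep the allocation from being optimized away
    let after = alloc_calls();
    println!("{} allocation(s) while building a Vec of {} items", after - before, v.len());
}
```

Because the attribute applies to the whole binary, the change only covers allocations made from Rust code, which matches the observation above that nearly all of the allocation happens there.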
I am sad that the performance was so much slower. You CAN build glibc things in alpine; you just have to add the right packages, which people often forget (there's been a package for glibc stuff for AGES in alpine). I mostly did it for fun, to see how much smaller the image is. I will probably switch to your default, as the scans would be so much sadder otherwise.
Which OS did you test on? As the 'host' OS, that is?
@munntjlx this was running on an x86_64 macOS machine. I'll also try on a big Linux machine.
@munntjlx switching the global allocator in Nosey Parker to use
The main reason I tend to prefer alpine is the smaller base. It does have a bit of strangeness (python, for example), but once you get used to working with musl (I am an old hand at OpenWrt), most things work, with a few exceptions. Thanks for being willing to entertain a musl build! The other 'old' complaint among the k8s folks was the lack of support for DNS over TCP, which has been fixed for about 6 months now: musl now directly supports TCP-based DNS requests.
FYI I see similar performance characteristics from the Alpine image on a 32-core Ubuntu 22.04 machine -- a 6x slowdown relative to the glibc-based Docker image (~500MB/s vs ~3GB/s scan throughput). Similarly, switching the global allocator to Let me get the switchover to
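One rough way to check whether allocator contention explains a gap like this is an allocation-heavy, multithreaded microbenchmark built once against glibc and once against musl. The sketch below is an assumption-laden illustration, not the benchmark used in this thread: `churn` and `run_threads` are hypothetical helpers, and the iteration count and buffer size are arbitrary.

```rust
use std::thread;
use std::time::Instant;

/// Allocate and immediately drop `iterations` buffers of `size` bytes
/// on the current thread. Returns the total number of bytes allocated.
fn churn(iterations: usize, size: usize) -> usize {
    let mut total = 0;
    for i in 0..iterations {
        let buf = vec![(i % 251) as u8; size];
        // black_box keeps the compiler from eliding the allocation.
        total += std::hint::black_box(&buf).len();
    }
    total
}

/// Run `churn` on `threads` threads at once so they all hit the
/// allocator concurrently, and sum the bytes allocated.
fn run_threads(threads: usize, iterations: usize, size: usize) -> usize {
    let handles: Vec<_> = (0..threads)
        .map(|_| thread::spawn(move || churn(iterations, size)))
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().expect("worker thread panicked"))
        .sum()
}

fn main() {
    // Use every available core, falling back to 4 if that is unknown.
    let threads = thread::available_parallelism().map(|n| n.get()).unwrap_or(4);
    let start = Instant::now();
    let bytes = run_threads(threads, 200_000, 256);
    let secs = start.elapsed().as_secs_f64();
    println!("{threads} threads allocated {bytes} bytes in {secs:.3}s");
}
```

If the musl build of this program slows down much more than the glibc build as the thread count grows, that would be consistent with the contention explanation, though a synthetic loop like this says nothing definitive about real scan workloads.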
Now that #88 is done, can we have a 'separate' Dockerfile for alpine?
@munntjlx Yes, a separate
This was adapted from PR #77. Co-authored-by: Thomas Munn <48925191+munntjlx@users.noreply.github.com>
P.S. @munntjlx I did mark you as a co-author on that commit, so you should get credit for it.
Thank you!
I mostly did this for fun, but I prefer alpine images as they tend to be smaller.