Stripping binaries in conda-forge packages #520

Open
xhochy opened this issue Feb 4, 2018 · 4 comments

@xhochy
Member

xhochy commented Feb 4, 2018

Related to discussions in ContinuumIO/anaconda-issues#8242 and related issues, I would like us to strip binaries in conda-forge packages. For example, for Pandas 0.22 on Linux & Py3.6, the current package is 26M packaged and 108M uncompressed on disk. If we build it with CFLAGS=-Wl,-strip-all, the size drops to 11M packaged and 52M on disk.
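To put the Pandas numbers above in relative terms, here is a small sketch that computes the reduction. The sizes are the ones quoted in this comment; the helper name `reduction_percent` is just made up for the illustration.

```python
# Sizes in megabytes, as quoted above (Pandas 0.22, Linux, Py3.6).
sizes = {
    "packaged": {"unstripped": 26, "stripped": 11},
    "on disk": {"unstripped": 108, "stripped": 52},
}


def reduction_percent(before, after):
    """Return the size reduction as a percentage of the original size."""
    return 100.0 * (before - after) / before


for label, s in sizes.items():
    pct = reduction_percent(s["unstripped"], s["stripped"])
    print(f"{label}: {s['unstripped']}M -> {s['stripped']}M ({pct:.1f}% smaller)")
```

This works out to roughly 57.7% smaller packaged and 51.9% smaller on disk.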

I'm wondering whether it would be possible to add this to the environment settings for conda-forge builds in general, or whether I should raise an issue with conda-build asking for an option to strip binaries after the build.
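For illustration only, a post-build strip step could look roughly like the sketch below. This is not an existing conda-build option. The function names are hypothetical, and it assumes a GNU binutils `strip` on the PATH that accepts `--strip-all`. It defaults to a dry run so nothing is modified unless asked.

```python
import os
import subprocess

ELF_MAGIC = b"\x7fELF"  # first four bytes of every ELF binary


def is_elf(path):
    """Return True if the file starts with the ELF magic bytes."""
    try:
        with open(path, "rb") as f:
            return f.read(4) == ELF_MAGIC
    except OSError:
        return False


def find_elf_files(root):
    """Yield regular (non-symlink) ELF files below root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path) and is_elf(path):
                yield path


def strip_tree(root, strip_cmd=("strip", "--strip-all"), dry_run=True):
    """Strip every ELF file under root; return (bytes_before, bytes_after).

    Assumption: strip_cmd names a GNU binutils strip. With dry_run=True
    the files are only measured, so both totals are equal.
    """
    before = after = 0
    for path in find_elf_files(root):
        before += os.path.getsize(path)
        if not dry_run:
            subprocess.run([*strip_cmd, path], check=True)
        after += os.path.getsize(path)
    return before, after
```

Calling `strip_tree(prefix, dry_run=False)` on a copy of the install prefix would give a before/after comparison without touching the real build output.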

@ocefpaf
Member

ocefpaf commented Feb 4, 2018

I'm always on the fence when adding flags to the build env. But I'm definitely not against stripping binaries. (I already do that in a few places actually.)

@xhochy
Member Author

xhochy commented Feb 4, 2018

Is there an easy way to add this to the meta.yaml for Linux without having to create a build.sh for the many packages that currently only need script: python setup.py install --single-version-externally-managed --record=record.txt in their meta.yaml?

@jakirkham
Member

As this came up in the meeting today and there was a question as to where this was being discussed...

cc @conda-forge/core

@jakirkham
Member

At our last meeting @jjhelmus raised some concerns about strip all. I believe one concern was that it makes debugging errors/crashes hard. He also mentioned that Cython generates rather large binaries, which seems relevant in the case of Pandas. Not sure if there was a known cause for this. I don't recall all of the details ATM, so maybe he can fill us in here.

More generally, there is room for improvement even before we consider things like strip all. For instance, we haven't made very good use of optimization flags across conda-forge, so at least some level of -O in most packages would be welcome for performance increases and size reduction. The compilers that Anaconda has packaged are newer than conda-forge's default compilers and enable several useful flags by default, which should help with this problem. Also, there has been demand for static libraries (in some cases to build outside conda-forge); however, there is no real need to have these installed in production environments (like Docker images). Other pieces we could cut out include docs, debug libraries, headers, etc. So I expect that generating split packages would significantly cut down the size that conda packages take up in production environments. Raised issue ( #544 ) to discuss this further.
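As a rough way to estimate what split packages might save, one could total up the files in an environment by category. The sketch below is only an illustration: the function names and the classification rules (`.a` files as static libraries, `include/` as headers, `share/doc` and `share/man` as docs) are my assumptions about a typical Unix prefix layout, not a conda-forge convention.

```python
import os
from collections import defaultdict


def classify(relpath):
    """Assign a prefix-relative path to a rough category (assumed rules)."""
    parts = relpath.split(os.sep)
    if relpath.endswith(".a"):
        return "static libraries"
    if parts[0] == "include":
        return "headers"
    if parts[:2] in (["share", "doc"], ["share", "man"]):
        return "docs"
    return "runtime"


def size_by_category(prefix):
    """Return {category: total bytes} for all non-symlink files under prefix."""
    totals = defaultdict(int)
    for dirpath, _dirnames, filenames in os.walk(prefix):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # avoid counting the link target twice
            rel = os.path.relpath(path, prefix)
            totals[classify(rel)] += os.path.getsize(path)
    return dict(totals)
```

Everything that does not land in "runtime" is a candidate for a separate output that production environments could skip.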

More particular to Docker, it's good to make sure that you are running conda clean after each install to clean out tarballs and anything else that is unnecessary. Fusing RUN commands together with as much stuff as possible also helps to make sure that removed files aren't lingering in old layers. Picking the lightest possible base image goes a long way as well (some users like BusyBox). If you are using a recent enough version of Docker, multistage builds can be used to simply copy relevant content from the build into one layer to keep things compact. Sorry if this portion is already obvious to you.
