Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Package C libraries independently of core #7660

Open
Cadair opened this issue Jul 17, 2018 · 20 comments
Open

Package C libraries independently of core #7660

Cadair opened this issue Jul 17, 2018 · 20 comments

Comments

@Cadair
Copy link
Member

Cadair commented Jul 17, 2018

Given the modern state of Python packaging I think there is a strong argument to be made for packing the contents of cextern (at least erfa) as separate packages (at least on conda). Conda is obviously well suited to having pure-C dependencies as packages, but it's also possible with wheels to do this on pypi as well.

@bsipocz
Copy link
Member

bsipocz commented Jul 17, 2018

Wasn't the main motivation behind the bundling of those libraries to make sure the "right" version is being used, the one we actually test with, etc?

@Cadair
Copy link
Member Author

Cadair commented Jul 17, 2018

If it was then cant you just hard pin the deps on every release?

@Cadair
Copy link
Member Author

Cadair commented Jul 17, 2018

Doing this would certainly make building astropy for development quicker, as you would have to compile a lot less stuff.

@nden
Copy link
Contributor

nden commented Jul 17, 2018

Isn't this putting unnecessary burden on the end user? In the case of two different astropy versions with different C libraries how do we ensure linking to the right version? How is this going to work outside a conda environment?

@Cadair
Copy link
Member Author

Cadair commented Jul 17, 2018

@nden surely if you have two astropy versions installed, be it pip in a venv or in conda envs you can just pip/conda install the correct version of the external lib? Am I missing a different way of installing multiple astropys?

@nden
Copy link
Contributor

nden commented Jul 17, 2018

Perhaps I misunderstand something but say you have 2 wcslib libraries installed on your system. When you pip install astropy it links to the library it finds on your system or on your LD_LIBRARY_PATH. Since you have two I think you will have to tell it which one is the correct version for each astropy version by using environment variables. Astropy will install just fine using pip. Ensuring that it links to the correct C libraries at runtime is the difficult part. I think.

@bsipocz
Copy link
Member

bsipocz commented Jul 17, 2018

I'm sorry, but unless there is a very strong compelling pro argument, I'm very much -1 on this with the motto, don't fix what's not broken. There are quite a few possible ways to get users trip over on this, I'm not keen to explore those while also bringing more possible burdens on ourselves with the extra dependencies and their packaging.

@drdavella
Copy link
Contributor

drdavella commented Jul 17, 2018

I agree with the "don't fix what's not broken" sentiment above, but I am curious about what other packages are doing about this (e.g. numpy, scipy, etc.). Is this going to be the future of packaging external dependencies? If so, maybe it's worth at least investigating further.

@Cadair
Copy link
Member Author

Cadair commented Jul 17, 2018

I still think this is worth doing on conda, even if not pip. Given that most system package managers already split out erfa etc already, I cant see any issues with doing this on conda.

pip I do not know the details of, so I cant be as sure it will work.

@pllim
Copy link
Member

pllim commented Jul 17, 2018

Is this going to be the future of packaging external dependencies?

Let's see how other packages do it first then before we jump off this cliff. 😉

@namurphy
Copy link
Contributor

@Cadair's suggestion sounds especially worthwhile for those of us who are working on newer packages that might add C libraries in the future but haven't done so already. I'm going to try to remember it. The advantage of speeding up build times sounds really appealing.

@MSeifert04
Copy link
Contributor

MSeifert04 commented Jul 17, 2018

I can see the benefits of having them as separate packages but also the drawbacks for the end-users (version mismatches) and for us - the developers (do we need to test against stable and devs versions of that package and how should we spot and handle problems?)

One alternative could be to make astropy-specific packages (like the test split-offs under pytest-astropy) that we handle ourselves a la astropy-wcslib-wrapper, astropy-fitsio-wrapper that essentially just split off the external c extensions but astropy needs to have explicit version requirements (==) for releases. That would essentially make these available without astropy but using a version that mismatches from astropy makes installing astropy impossible and would cut down the build times.

It's probably not really worth the trouble but it's certainly possible to distribute them independently without bringing havok to the end-user. Depending on the setup these are probably not even useful as independent packages (for example we only distribute very specific fitsio files), except that we have shorter CI build times. And we somehow would need to solve the problem how we can safely update these packages and make sure that astropy keeps working.

But in my opinion if someone really wants to do this and we try to keep them astropy-centric (the primary purpose of these should be that astropy keeps working correctly) then it would be fine by me.

@mhvk
Copy link
Contributor

mhvk commented Jul 17, 2018

Not sure this will help much: there's four packages in cextern, only one of which we use completely "as is" (expat, seems to be used for votable); for two of the others (cfitsio, wcslib), we remove at least some of the files, while the final one (erfa) is our own version of the SOFA library.

Might it be possible to tweak the build system a bit so that rebuilds don't happen so frequently? (Clearly, rebuilds do not happen all the time, but I have not been able to figure out what exactly triggers them). SImilarly, can some form of caching help to reduce the impact on travis?

@bsipocz
Copy link
Member

bsipocz commented Jul 18, 2018

@mhvk - I suspect we can still improve the travis experience by using more of its caching infrastructure as currently we don't really use it. We can also look more into doing nightly builds that can be used in place of building the latest dev version for the affiliated packages. So, surely there are lower hanging fruits for CI that need some brainstorming (e.g. at the coordination meeting) and then someone sitting down and implement the decisions. I would be willing to work on this CI aspect, but realistically I don't have any time for it in the next few months.

@astrofrog
Copy link
Member

astrofrog commented Jul 18, 2018

We do have infrastructure to build astropy without the bundled libraries, using --use-system-cfitsio for example (and same for the other libraries). Which makes me think that maybe we should have one Travis job that checks that this works...

Anyway I agree that we don't want to do this by default. The current setup has worked for years.

@saimn
Copy link
Contributor

saimn commented Jul 18, 2018

SImilarly, can some form of caching help to reduce the impact on travis?

There is already some caching on Travis, with ccache (#6909). Affiliated packages that run tests with Astropy dev could also benefit from this, though it could be even better to have nightly wheels as suggested by @bsipocz .

For the cfitsio part, installing it with conda is already possible, but it will not change the compilation time : we are not building cfitsio, but a Python extension that read compressed files with cfitsio. So here bundling cfitsio is mostly to simplify users' life, avoiding issues with paths at compile time or runtime (and with compilation options etc.).

@jamienoss
Copy link
Contributor

jamienoss commented Oct 31, 2018

Just to chime in and add to @nden point of multiple versions of the same lib - it's a very bad idea. From what I'm aware, this is not actually possible even with conda. Not unless there's a way to be specific, i.e. the versions get mangled into the lib names and they are installed as such, e.g. libsomething-4.3.0 vs libsomething. This will break things in itself though, in fact, some libraries already do this but they are then symbolically linked to generic forms dropping the version, this is such that the other packages can version-agnostically know what to link against.

To install multiple versions you'd have to use separate environments as is intended.

Without this, every time one astropy was installed its dependency would clobber the old versions unless you set clobber=False, but then it just wouldn't install it. Basically, you'd just have the last version of the dependency installed and never both. You might be able to have multiple versions of astropy installed within a single env but perhaps not always their dependencies.

Though, TBH, I'm not sure why this would be an issue against packaging the C libs independently. You shouldn't really have multiple version of anything within the same env anyhow.

@jamienoss
Copy link
Contributor

jamienoss commented Nov 1, 2018

Doing this would certainly make building astropy for development quicker, as you would have to compile a lot less stuff.

BTW, if this is a question of build time, what's the situation with parallel builds?

@pllim
Copy link
Member

pllim commented May 11, 2020

Looks like Erfa is coming out. Is this resolved?

@mhvk
Copy link
Contributor

mhvk commented May 11, 2020

@pllim - yes, pyerfa is happening, but we haven't yet decided on whether to use or bundle in. Discussion about erfa specifically in #10329. Once we decide there, I guess this can be about the other packages. I'm a bit weary about bundling in things that are used only in small parts of astropy (expat here; jquery in astropy/extern).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests