Link with the blosc shared library #10

Closed
wants to merge 1 commit into
from

@tnorth

tnorth commented Mar 22, 2013

This patch removes the embedded blosc code and links with the existing shared library.

I'm not sure whether my fix for setup.py is good enough. Perhaps there is a better way to explicitly require an external library?
Fixes/feedback welcome!

Tested on Linux x86_64 only.

@esc

Member

esc commented Apr 2, 2013

I am 👎 on this PR in its current shape, though not on building against the shared library in general. Let me explain.

Having a shared library is a great thing and certainly a clean way of doing things. However, bundling the sources as we do has the great advantage that the extension can be installed from PyPI using pip. If linking against the shared lib were the only option, a potential user would have to compile the shared library first, possibly with non-Python tools (cmake or whatever), which is IMHO a significant hurdle.

So instead I advocate having two setup.py files: one for compiling statically and one for linking to the shared library. This way, we could still ship python-blosc from PyPI, while packagers would also have the option of linking against the shared library. The two files could even use import to share code. Alternatively, you could put both setup routines in the same file and control which one is used with, for example, environment variables.

We can talk about using submodules or subtree merge to manage the blosc sources in a different thread.
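
A minimal sketch of the single-file variant described above. It assumes a hypothetical `BLOSC_USE_SHARED` environment variable, and the source paths and extension name are illustrative only; none of them are taken from python-blosc's actual setup.py.

```python
import os

# Hypothetical names: the BLOSC_USE_SHARED variable and all paths below are
# illustrative assumptions, not python-blosc's real layout.
BUNDLED_SOURCES = ["c-blosc/blosc/blosc.c", "c-blosc/blosc/blosclz.c"]
WRAPPER_SOURCES = ["blosc/blosc_extension.c"]


def extension_kwargs(environ=None):
    """Return keyword arguments intended for setuptools.Extension.

    With BLOSC_USE_SHARED=1 the extension links against an installed
    libblosc; otherwise the bundled C sources are compiled in, which keeps
    a plain `pip install` working without an external library.
    """
    environ = os.environ if environ is None else environ
    use_shared = environ.get("BLOSC_USE_SHARED", "0") == "1"
    if use_shared:
        # Packagers' path: only the wrapper is compiled, libblosc is linked.
        return {"sources": list(WRAPPER_SOURCES), "libraries": ["blosc"]}
    # Default path: compile the wrapper together with the bundled sources.
    return {
        "sources": WRAPPER_SOURCES + BUNDLED_SOURCES,
        "include_dirs": ["c-blosc/blosc"],
    }


if __name__ == "__main__":
    # Printing keeps the sketch runnable without a C toolchain. A real
    # setup.py would pass the result on, for example:
    #   Extension("blosc.blosc_extension", **extension_kwargs())
    print(extension_kwargs())
```

Under these assumptions, a packager would run something like `BLOSC_USE_SHARED=1 python setup.py build`, while the default path still compiles the bundled sources.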

@tnorth

tnorth commented Apr 2, 2013

Makes sense. I had Linux systems in mind, where package management makes dependency resolution straightforward.
If time allows, I will have a look at your suggestions, and we can indeed discuss it further. I'll close this pull request.
Thanks!
Thibault

@tnorth tnorth closed this Apr 2, 2013

@esc

Member

esc commented Apr 3, 2013

Yeah, I am also a firm believer in the Linux way of doing everything by packages. However, much time can pass until a suitable package is built and accepted. Please do leave your branch lying around; I may try to implement the things I suggested myself.

@FrancescAlted

Owner

FrancescAlted commented Apr 3, 2013

Thanks @tnorth for the PR. Yes, I agree with @esc that we have to devise a way to make that compatible with pip installs. This PR can be handy in the future, but for the time being, including Blosc sources is best, IMO.

@avalentino

Member

avalentino commented Apr 3, 2013

Hi guys, just an idea: why aren't blosc, python-blosc and bloscpack merged into a single project?

@esc

Member

esc commented Apr 3, 2013

@avalentino: what would be the advantages?

@esc

Member

esc commented Apr 3, 2013

Incidentally, I was working on a PR to pull blosc into python-blosc using submodules today when I started experiencing hardware issues... This may simplify managing the sources in the future.

@FrancescAlted

Owner

FrancescAlted commented Apr 3, 2013

@esc that would be great. I have heard of people having issues with git subtrees. I hope submodules will work better :) Thanks for your work on this.

@esc

Member

esc commented Apr 3, 2013

@FrancescAlted: yeah? What kind of issues?

I don't mean to prejudice the discussion, but submodules are not exactly user-friendly... I see we will have to discuss this some more when the PR is ready...

@FrancescAlted

Owner

FrancescAlted commented Apr 3, 2013

Hmm, I think the issues were with using submodules, not subtrees. Apparently it is too easy (automatic) to lose your data. Sorry, I cannot give more details. Well, you are the expert, so I'll take your word on this.

@avalentino

Member

avalentino commented Apr 3, 2013

> @avalentino: what would be the advantages?

@esc, well, it feels quite natural to me. Blosc is a relatively small library that has Python bindings and a command-line tool as front ends.

> @avalentino yes, but the three libraries are pretty different in spirit. Blosc is a standalone library written in pure C. python-blosc is simply a Python wrapper around it, and I can envision other wrappers (did not know about Haskell and node.js, thanks @esc). Finally, bloscpack is more than simply a command-line interface to Blosc: it is a true format, and chances are that it can be turned into a really flexible tool for storing large amounts of binary data.

A single project means a single tarball, so no more problems with pip installation and no more problems syncing with the upstream compression library.

Yes, this is the price to pay, but that's life.

@esc

Member

esc commented Apr 3, 2013

I agree with the simplicity argument. However, Blosc is spreading rapidly: it is included in PyTables and Blaze, and there are bindings for Haskell and node.js. So I think a pure, minimal project for Blosc is really a must. But yeah, I agree the cost of pip installs and syncing with upstream is quite a drag...

@FrancescAlted

Owner

FrancescAlted commented Apr 3, 2013

@avalentino yes, but the three libraries are pretty different in spirit. Blosc is a standalone library written in pure C. python-blosc is simply a Python wrapper around it, and I can envision other wrappers (did not know about Haskell and node.js, thanks @esc). Finally, bloscpack is more than simply a command-line interface to Blosc: it is a true format, and chances are that it can be turned into a really flexible tool for storing large amounts of binary data.

And yes, the price to pay is that you have to pack many different packages, but that's life.

@FrancescAlted

Owner

FrancescAlted commented Apr 3, 2013

BTW, I hope we can improve the situation somewhat by solving the shared library issue, or by using git submodule/subtree. Let's see.

@esc

Member

esc commented Apr 3, 2013

Great... so I tried to submit a PR and the hub tool just ate the message I spent half an hour writing:

defunkt/hub#178

👎 :octocat:

not my day...

@FrancescAlted

Owner

FrancescAlted commented Apr 4, 2013

@esc sorry to hear that.
