Add support for lzma filter #76

Closed
migueldvb opened this issue Mar 17, 2015 · 8 comments

Comments

@migueldvb
Contributor

In addition to the supported zlib compression type, it would be useful to support other lossless compression algorithms such as LZMA (liblzma) and bzip2. The interfaces of these modules in the Python standard library are similar, and lzma has advantages over zlib for some types of data. If these methods are not supported by the standard, it would be useful to have user-defined filters to implement the compression.
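
To illustrate the point about similar interfaces, here is a minimal sketch using only the Python standard library. The `CODECS` table and the `round_trip` helper are hypothetical names made up for this example; they are not part of ASDF or of its implementation.

```python
import bz2
import lzma
import zlib

# Hypothetical label-to-module table; the labels are for illustration only.
CODECS = {"zlib": zlib, "bzip2": bz2, "lzma": lzma}


def round_trip(label, payload):
    """Compress and decompress payload; return (compressed size, round trip ok)."""
    module = CODECS[label]
    # All three modules expose one-shot compress() and decompress() functions.
    compressed = module.compress(payload)
    return len(compressed), module.decompress(compressed) == payload


if __name__ == "__main__":
    payload = b"ASDF array block " * 1000
    for label in CODECS:
        size, ok = round_trip(label, payload)
        print(f"{label}: {len(payload)} -> {size} bytes, round trip ok: {ok}")
```

The one-shot functions share a shape, but the tuning parameters differ between modules (for example, zlib takes a `level` and lzma takes a `preset`), so a real implementation would probably need per-codec options.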

@mdboom
Contributor

mdboom commented Mar 17, 2015

It's important to remember that ASDF aims to be an interchange format for more than just Python. Anything that we put into the spec that makes implementations in other languages more difficult is something we want to avoid.

By that standard, the criteria for including a compression algorithm in the spec need to be:

  1. widely available (i.e. libraries available for all major programming languages)
  2. patent unencumbered
  3. has an implementation under a permissive license

bzip2 probably meets those criteria, though I could only find one JavaScript implementation, whereas there are many for zlib (it's a much simpler algorithm).

lzma is less ideal. Its authors don't explicitly claim that it is patent-free, which can really matter for compression algorithms. If someone claims a patent on it, users could be obliged to pay royalties. It's also later to the Python game (only in the stdlib in 3.3 and later).

> If these methods are not supported by the standard it would be useful to have user-defined filters to implement the compression.

The point of the standard is to ensure interoperability with other libraries and implementations, so I don't think we should knowingly veer from the standard in any implementation.

@migueldvb
Contributor Author

That sounds good. I was thinking about user-defined compression filters that could be part of the standard, like the registered compression filters in HDF5, in addition to the supported zlib compression method. But I agree that zlib is a good choice of compression library because of those requirements.
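
As a rough sketch of what a filter registry might look like, the example below maps a label to a pair of compress and decompress callables. Every name here (`register_filter`, `compress_block`, `decompress_block`, and the labels) is hypothetical and chosen for illustration; none of it is taken from ASDF or from the HDF5 API.

```python
import bz2
import zlib

# Hypothetical registry: label -> (compress, decompress) callables on bytes.
_FILTERS = {}


def register_filter(label, compress, decompress):
    """Register a pair of bytes -> bytes callables under a unique label."""
    if label in _FILTERS:
        raise ValueError(f"filter {label!r} is already registered")
    _FILTERS[label] = (compress, decompress)


def compress_block(label, data):
    """Compress data with the filter registered under label."""
    return _FILTERS[label][0](data)


def decompress_block(label, data):
    """Decompress data with the filter registered under label."""
    return _FILTERS[label][1](data)


register_filter("zlib", zlib.compress, zlib.decompress)
register_filter("bzip2", bz2.compress, bz2.decompress)
# A user-supplied filter; byte reversal stands in for a real codec here.
register_filter("reverse", lambda b: b[::-1], lambda b: b[::-1])
```

The interoperability concern raised above applies directly to this design: a file written with the `"reverse"` filter can only be read by an implementation that has registered the same filter under the same label.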

@mdboom
Contributor

mdboom commented Mar 17, 2015

I think bzip2 is probably fine to add, though. There may even be an argument for supporting only bzip2, in the interest of not creating too many ways to do things. It's not too late to take zlib away, as we haven't made a release yet. @embray: Thoughts?

@embray
Contributor

embray commented Mar 17, 2015

I don't have strong opinions about it, though I agree with the guidelines @mdboom laid out. Interestingly, the talk of optional compression filters relates pretty well to my comments elsewhere today about output filters...

@migueldvb
Contributor Author

bzip2 generally compresses more effectively than zlib, but it is also generally slower. I think it would be nice if the user could choose the compression method, but I understand the guidelines and don't have a strong preference for either of them.
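
The size and speed trade-off depends on the data, so it is worth measuring on a representative payload. Below is a minimal sketch using only the standard library; the `measure` helper and the synthetic payload are assumptions made for this example, and the numbers it prints will vary by machine and input.

```python
import bz2
import time
import zlib


def measure(compress, payload):
    """Return (compressed size in bytes, elapsed seconds) for one compress call."""
    start = time.perf_counter()
    compressed = compress(payload)
    elapsed = time.perf_counter() - start
    return len(compressed), elapsed


if __name__ == "__main__":
    # Synthetic text-like payload; real array data may behave differently.
    payload = b"".join(b"%d,%d\n" % (i, i * i) for i in range(50000))
    for label, compress in (("zlib", zlib.compress), ("bzip2", bz2.compress)):
        size, elapsed = measure(compress, payload)
        print(f"{label}: {len(payload)} -> {size} bytes in {elapsed:.4f} s")
```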

@embray
Contributor

embray commented Mar 18, 2015

I don't see too much of a problem with allowing one or the other (or possibly other schemes that could be added later so long as they meet the criteria).

@mdboom
Contributor

mdboom commented Apr 1, 2015

I'm going to close this. We now have bzip2 support and I think lzma support is a little problematic for right now.

@mdboom closed this as completed Apr 1, 2015

@migueldvb
Contributor Author

That is fine. It is good to have bzip2 support.
