@aleks-v-k
Contributor

See #61

@martindurant
Member

Thank you for the PR, @aleks-v-k. I won't have time to look through this immediately, so I hope others chime in. Am I right in thinking that the intent is to apply decompress to any data/stream, and to detect the framing format in order to call the right method?

@aleks-v-k
Contributor Author

> Am I right in thinking that the intent is to apply decompress on any data/stream, and detect the framing format in order to call the right method?

No, the patch doesn't introduce any format detection. The default format remains the original framing format implemented in this python-snappy library. The patch just adds optional support for the Hadoop snappy format. To use it for (de)compression, a user must request it explicitly:

  • -t hadoop_snappy option for (de)compressing via python -m snappy
  • call snappy.hadoop_stream_compress or snappy.hadoop_stream_decompress inside a python script
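
A minimal sketch of the second option, for illustration only. The two function names come from the comment above; passing a source and a destination file-like object is an assumption about their signature, and the helper name `hadoop_roundtrip` is hypothetical.

```python
import io


def hadoop_roundtrip(payload: bytes) -> bytes:
    """Compress and then decompress payload using the Hadoop snappy format."""
    # Imported lazily so the sketch can be loaded without python-snappy installed.
    import snappy

    # Assumption: both calls take (src, dst) file-like objects opened in binary mode.
    raw = io.BytesIO(payload)
    compressed = io.BytesIO()
    snappy.hadoop_stream_compress(raw, compressed)

    # Rewind the compressed buffer before reading it back.
    compressed.seek(0)
    restored = io.BytesIO()
    snappy.hadoop_stream_decompress(compressed, restored)
    return restored.getvalue()
```

With python-snappy installed, `hadoop_roundtrip(b"some bytes")` should hand back the original bytes; real code would pass open files instead of in-memory buffers.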

@martindurant
Member

OK, I guess that's a safe way to proceed. Is there a way to determine the framing?

@aleks-v-k
Contributor Author

I think there is a way. At least, it's implemented in the snzip project. OK, if you would like format detection to be implemented, I will do it by the end of this week.

@aleks-v-k
Contributor Author

I've implemented format autodetection, please review the last commit in the PR.
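
For readers wondering what such autodetection can look like, here is a rough sketch. It is not the code from the PR. It relies on the stream identifier chunk that opens a stream in the snappy framing format, and on an assumed Hadoop layout of two 4-byte big-endian lengths (uncompressed block length, then compressed chunk length). The names `guess_snappy_format` and `MAX_BLOCK`, and the size cap itself, are hypothetical.

```python
import struct

# Stream identifier chunk that opens a stream in the snappy framing format.
FRAMING_MAGIC = b"\xff\x06\x00\x00sNaPpY"

# Hypothetical sanity cap on a single uncompressed block (256 MiB).
MAX_BLOCK = 1 << 28


def guess_snappy_format(header: bytes) -> str:
    """Guess the container format from the first bytes of a stream."""
    if header.startswith(FRAMING_MAGIC):
        return "framing"
    if len(header) >= 8:
        # Assumed Hadoop layout: uncompressed length, then compressed length.
        uncompressed_len, compressed_len = struct.unpack(">II", header[:8])
        # Snappy's worst-case output size for an input of n bytes.
        worst_case = 32 + uncompressed_len + uncompressed_len // 6
        if 0 < uncompressed_len <= MAX_BLOCK and 0 < compressed_len <= worst_case:
            return "hadoop_snappy"
    return "unknown"
```

The framing check is exact, while the Hadoop check is only a plausibility test on the two length fields, so unrelated data can occasionally pass it. A caller would typically read the first few bytes of the stream, call the function, and then rewind before decompressing.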

@martindurant
Member

Nice tests in test_formats.py.
I'll merge this now, thank you for the work.

@martindurant merged commit 2aa7353 into intake:master on Dec 29, 2017

@aleks-v-k
Contributor Author

Thanks!
