-
-
Notifications
You must be signed in to change notification settings - Fork 17.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix zlib and blosc imports in to_msgpack #9783
Conversation
Could you please add tests for these? Looks like this functionality was not tested before. |
this needs some tests they were not in the local imports originally because of perf |
What's a perf check? |
since this is importing each time in a lower level function I would like to see if that impacts things show to/from msgpack with a good sized frame with mixed types with compression and without look at prun and see if the imports cause an issue |
What is prun? In my experience, module initialization happens only on the first import. You can test this by creating
and creating
and then run |
The following code also only prints "hey" once.
|
Lastly, this variation also only prints "hey" once.
These examples hold true for python2.7 and python3.4. |
use ipython of course they are only imported once but the function call and import checks themselves are non trivial |
Here are some performance results using the following code saved as
When zlib and blosc are imported globally$ ipython -i compress.py
$ ipython -i compress.py
$ ipython -i compress.py
When zlib and blosc are imported locallyprun z = read_msgpack(table.to_msgpack())
$ ipython -i compress.py
$ ipython -i compress.py
|
I didn't expect such a quick response to the pull request. Thanks for maintaining the pandas package. |
class TestCompression(TestPackers): | ||
|
||
def setUp(self): | ||
super(TestCompression, self).setUp() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add the issue number as a comment (on the class)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
ok, seems reasonable to use the local imports.
|
This seems to be a documented issue between blosc and conda. |
hmm ok let me see if I can have this fixed |
@@ -16,6 +16,8 @@ fi | |||
"$TRAVIS_BUILD_DIR"/ci/build_docs.sh 2>&1 > /tmp/doc.log & | |||
# doc build log will be shown after tests | |||
|
|||
pip install -U blosc # See https://github.com/pydata/pandas/pull/9783 | |||
python -c 'import blosc; blosc.print_versions()' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perfect!
@invisibleroads this looks good! This needs to be tested slightly differently. BUT, you are adding the compression type to the objects themselves (again which is ok), I don't think this is actually used on decompression (its not tested that way now). AND I am not sure that you CAN use it, as you would have to decompress first, look at the compression variable, but then you are done already. So I am not sure how to get around this problem, other than having a top-level section which has some uncompressed attributes. So either we need to take off the |
I added the After realizing this, I added the
I think the |
The other alternative is to remove |
Are you at PyCon? Maybe we can negotiate this during sprints on Monday. |
sure at pycon starting tomorrow (thru Tuesday) |
@jreback |
@jreback
Since the user would not have been able to compress and decompress with to_msgpack/read_msgpack in any version of pandas, we decided that a try/except is not necessary and the code is good as is and is ready to be merged. |
6717aa0 removed zlib and blosc from the global namespace.
This pull request restores zlib and blosc compression in to_msgpack via local imports.