v1.4.3 gives error: Error -1 while decompressing data #115

Closed
solarix888 opened this issue Sep 2, 2016 · 10 comments

@solarix888

Specifically, decompress_ptr() in blosc.toplevel returns with an error. It is called from pd.read_msgpack() in pandas v0.18.1.

@FrancescAlted
Member

Hmm, I have tested decompress_ptr() quite extensively before releasing 1.4.3. Could you please post a minimal example reproducing the problem?

@jreback

jreback commented Sep 3, 2016

@FrancescAlted we just received the same issue on our builds: pandas-dev/pandas#14143

This works on Mac OS X but fails on Linux.

@jreback

jreback commented Sep 3, 2016

@FrancescAlted ahh, I see @solarix888 reported this from pandas. OK, let me see if I can repro.

@jreback

jreback commented Sep 3, 2016

Hmm, I cannot repro this locally (on a 64-bit Linux VM). Maybe a conflict with some other libs on Linux? (Tried 3.5, now trying 2.7.)

@solarix888
Author

Hmm, my specific use case is that I save a pandas DataFrame to msgpack using the pd.to_msgpack() method, compressed using blosc, and then read it back later. In this case, the DataFrame was compressed with the previous version of python-blosc, and after the version update I wasn't able to decompress it.

@jreback

jreback commented Sep 3, 2016

-> return _ext.decompress(bytesobj, as_bytearray)
(Pdb) l
480         ...                                      as_bytearray=True)) is bytearray
481         True
482  
483         """
484  
485  ->     return _ext.decompress(bytesobj, as_bytearray)
486  
487  
488     def decompress_ptr(bytesobj, address):
489         """decompress_ptr(bytesobj, address)
490  
(Pdb) p bytesobj
'\x02\x01\x01\x08\xf0\x00\x00\x00\xf0\x00\x00\x00L\x00\x00\x00\x14\x00\x00\x004\x00\x00\x00!\x00\x00\xe0\xaa\x00\x1f\xf0\x00\x08\x10\x14\x18\x1c "$&(*,.0123456789:;<=\x00?@\xe0\t\x00\x08@@@@@@@@@'

So this appears to be a compress/decompress failure that happens ONLY on 2.7/Linux. In this example we are using latin-1 encoding (but I'm not sure whether that matters).

The data is coming from a file that we created here.

I don't know the exact version of blosc that was used originally, maybe https://github.com/kawochen remembers (this was checked in 7 months ago)

And it works just fine on 1.4.1.

lmk if there is any other data I can provide
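
For readers inspecting the `bytesobj` value printed in the pdb session above: its first 16 bytes are the Blosc header. The following is a minimal sketch that decodes it with the standard library only, assuming the C-Blosc 1.x header layout (four single-byte fields, then three little-endian 32-bit sizes). The helper name `parse_blosc_header` is hypothetical and is not part of python-blosc.

```python
import struct

# Buffer copied from the pdb output above, written as a bytes literal.
BUF = (
    b'\x02\x01\x01\x08\xf0\x00\x00\x00\xf0\x00\x00\x00L\x00\x00\x00'
    b'\x14\x00\x00\x004\x00\x00\x00!\x00\x00\xe0\xaa\x00\x1f\xf0\x00'
    b'\x08\x10\x14\x18\x1c "$&(*,.0123456789:;<=\x00?@\xe0\t\x00\x08'
    b'@@@@@@@@@'
)


def parse_blosc_header(buf):
    """Decode the 16-byte header of a Blosc 1.x buffer (hypothetical helper)."""
    if len(buf) < 16:
        raise ValueError("buffer is shorter than a Blosc header")
    # Assumed layout: version, versionlz, flags, typesize (1 byte each),
    # then nbytes, blocksize, cbytes as little-endian unsigned 32-bit ints.
    version, versionlz, flags, typesize, nbytes, blocksize, cbytes = (
        struct.unpack_from('<BBBBIII', buf, 0)
    )
    return {
        "version": version,
        "versionlz": versionlz,
        "flags": flags,
        "byte_shuffle": bool(flags & 0x1),  # bit 0 marks byte shuffle
        "typesize": typesize,
        "nbytes": nbytes,        # uncompressed size
        "blocksize": blocksize,
        "cbytes": cbytes,        # compressed size, header included
    }


header = parse_blosc_header(BUF)
print(header)
# A truncated or corrupted buffer would usually fail this sanity check.
print("length matches cbytes:", len(BUF) == header["cbytes"])
```

Comparing `len(buf)` with the `cbytes` field is a cheap way to rule out a truncated file before suspecting the library version.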

@FrancescAlted
Member

Yep, I can reproduce that. There is a fix in: Blosc/c-blosc@07d4bb0. I'll try to release new versions of C-Blosc and Python-Blosc as soon as possible.

@FrancescAlted
Member

I have released new versions of C-Blosc (1.11.1) and Python-Blosc (1.4.4). Please give them a test.

@jreback

jreback commented Sep 3, 2016

thanks @FrancescAlted looks great!

@eapetitfils

I had a very similar issue with version 1.5.1: the process exited with code -1073741819 while decompressing data generated with the pandas msgpack function.

Upgrading to blosc 1.6.1 in the environment doing the decompression fixed the issue. The compression is still performed with blosc 1.5.1.
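
As an aside on the exit code quoted above: reinterpreted as an unsigned 32-bit value it is 0xC0000005, which on Windows is the access-violation status code. The comment does not state the platform, so reading it as a Windows crash is an assumption. A minimal sketch of the conversion:

```python
exit_code = -1073741819  # value reported in the comment above

# Reinterpret the signed 32-bit exit code as unsigned and format it as hex.
unsigned = exit_code & 0xFFFFFFFF
hex_code = "0x{:08X}".format(unsigned)
print(hex_code)
```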
