xrdcp remote zip extraction not working #922

Closed
mlassnig opened this issue Mar 6, 2019 · 11 comments

@mlassnig
Contributor

mlassnig commented Mar 6, 2019

Hi,

Running xrdcp root://eosatlas.cern.ch//eos/atlas/atlasscratchdisk/rucio/user/dcameron/test.zip test.zip followed by a manual unzip correctly extracts the two scripts.

If we instead use xrdcp root://eosatlas.cern.ch//eos/atlas/atlasscratchdisk/rucio/user/dcameron/test.zip?xrdcl.unzip=writefiles.sh writefiles.sh, the extracted file is scrambled binary data.

Cheers,
Mario

@mlassnig
Contributor Author

mlassnig commented Mar 6, 2019

(with v4.9.0)

@simonmichal
Contributor

Hi Mario,

From what I can see, this is a compressed ZIP file:

zipinfo test.zip 
Archive:  test.zip
Zip file size: 624 bytes, number of entries: 2
-rw-r--r--  3.0 unx      576 tx defN 16-Jul-29 13:36 writefiles.sh
-rw-r--r--  3.0 unx      238 tx defN 16-Feb-04 12:03 jobtimes.py
2 files, 814 bytes uncompressed, 298 bytes compressed:  63.4%

Currently, we support only extraction from ZIP files without decompression. This has been implemented with ROOT files in mind, which use ZIP format only for bundling (without compression).
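
The stored-versus-compressed distinction that zipinfo reports (defN in the listing above means Deflate) can also be checked programmatically on a local copy of the archive before attempting a remote extraction. The following is a minimal sketch using Python's standard zipfile module; the helper names entry_storage and demo_archive are made up for illustration and are not part of xrootd.

```python
import io
import zipfile


def entry_storage(zip_bytes):
    """Map each entry name to 'stored' or 'compressed' for an in-memory ZIP archive."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        return {
            info.filename: "stored" if info.compress_type == zipfile.ZIP_STORED else "compressed"
            for info in zf.infolist()
        }


def demo_archive(method):
    """Build a tiny one-entry archive with the given compression method (demo data only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=method) as zf:
        zf.writestr("writefiles.sh", "echo hello\n" * 20)
    return buf.getvalue()


if __name__ == "__main__":
    print(entry_storage(demo_archive(zipfile.ZIP_STORED)))
    print(entry_storage(demo_archive(zipfile.ZIP_DEFLATED)))
```

Only entries reported as stored would be candidates for extraction without decompression.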

If you have a strong need for this, please create a feature request for decompression support on GitHub.

Cheers,
Michal

@simonmichal
Contributor

@mlassnig : I suppose I can close this issue?

@mlassnig
Contributor Author

@simonmichal David told me he had a chat with @abh3 during our comp. workshop, where it was mentioned that xrdcp should actually support compressed zips (i.e., is this a bug?). For now it's fine: we can leave it as is and use uncompressed zips only. It would still be good to know whether compressed zips will be supported at some point in the future.

@simonmichal
Contributor

@mlassnig : it must have been a miscommunication; by design, the xrootd client does not support decompression. We can add support for it, but we first need to discuss how to prioritize this task and which compression methods we want to support:

4.4.5 compression method: (2 bytes)

    0 - The file is stored (no compression)
    1 - The file is Shrunk
    2 - The file is Reduced with compression factor 1
    3 - The file is Reduced with compression factor 2
    4 - The file is Reduced with compression factor 3
    5 - The file is Reduced with compression factor 4
    6 - The file is Imploded
    7 - Reserved for Tokenizing compression algorithm
    8 - The file is Deflated
    9 - Enhanced Deflating using Deflate64(tm)
   10 - PKWARE Data Compression Library Imploding (old IBM TERSE)
   11 - Reserved by PKWARE
   12 - File is compressed using BZIP2 algorithm
   13 - Reserved by PKWARE
   14 - LZMA
   15 - Reserved by PKWARE
   16 - IBM z/OS CMPSC Compression
   17 - Reserved by PKWARE
   18 - File is compressed using IBM TERSE (new)
   19 - IBM LZ77 z Architecture (PFS)
   96 - JPEG variant
   97 - WavPack compressed data
   98 - PPMd version I, Rev 1
   99 - AE-x encryption marker (see APPENDIX E)

   4.4.5.1 Methods 1-6 are legacy algorithms and are no longer
   recommended for use when compressing files.
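
The method code from section 4.4.5 is a little-endian 16-bit field at byte offset 8 of each local file header, right after the 4-byte signature, the version field, and the general-purpose flags. As a rough sketch, it can be read without any ZIP library. The function and table names below are illustrative, and the table covers only a few of the codes quoted above.

```python
import struct

LOCAL_HEADER_SIGNATURE = b"PK\x03\x04"

# Subset of the method codes quoted above; anything else is reported by number.
METHOD_NAMES = {0: "stored", 8: "deflated", 12: "bzip2", 14: "lzma"}


def first_entry_method(zip_bytes):
    """Return the compression method code of the first entry in the archive."""
    if zip_bytes[:4] != LOCAL_HEADER_SIGNATURE:
        raise ValueError("data does not start with a ZIP local file header")
    # Offset 8 = 4 (signature) + 2 (version needed) + 2 (general-purpose flags).
    (method,) = struct.unpack_from("<H", zip_bytes, 8)
    return method


def describe_method(code):
    """Return a readable name for a method code, falling back to the number."""
    return METHOD_NAMES.get(code, f"method {code}")


if __name__ == "__main__":
    # Hand-built start of a local file header: signature, version 2.0, no flags, method 8.
    header = struct.pack("<4sHHH", LOCAL_HEADER_SIGNATURE, 20, 0, 8)
    print(describe_method(first_entry_method(header)))
```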

Michal

@mlassnig
Contributor Author

Ok, it's certainly not urgent in any way. I'll discuss with the team and come back to you. For now you can close this ticket. Thanks a lot!

@simonmichal
Contributor

OK, let me know once you've discussed it internally.

Michal

@jmuf
Contributor

jmuf commented Mar 12, 2019

Hm, would it perhaps be feasible to at least throw an error if the file format (compressed/encrypted) is not handled by the client?
Leaving binary junk at the destination (with a "happy" exit code) might conceivably lead to data loss, e.g. if the user then decides to delete the source after the apparently successful transfer/extract.
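
To illustrate the behaviour being asked for, here is a minimal sketch of an extractor that refuses compressed or encrypted entries and signals failure through a non-zero exit code, writing nothing to the destination. It operates on a local archive with Python's zipfile module and says nothing about how xrdcp is implemented; UnsupportedEntryError, read_stored_member, and extract are hypothetical names.

```python
import io
import sys
import zipfile


class UnsupportedEntryError(Exception):
    """Raised when an entry cannot be extracted without decompression or decryption."""


def read_stored_member(zip_bytes, member):
    """Return the bytes of `member`, refusing compressed or encrypted entries."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        info = zf.getinfo(member)  # raises KeyError if the member is missing
        if info.flag_bits & 0x1:  # bit 0 of the general-purpose flags marks encryption
            raise UnsupportedEntryError(f"{member}: encrypted entries are not supported")
        if info.compress_type != zipfile.ZIP_STORED:
            raise UnsupportedEntryError(
                f"{member}: compression method {info.compress_type} is not supported"
            )
        return zf.read(info)


def extract(zip_bytes, member, out):
    """Write `member` to the binary file object `out`; return a shell-style exit code."""
    try:
        data = read_stored_member(zip_bytes, member)
    except (KeyError, UnsupportedEntryError) as err:
        print(f"error: {err}", file=sys.stderr)
        return 1  # nothing has been written to `out` at this point
    out.write(data)
    return 0
```

Because the check happens before any byte is written, a failed extraction leaves the destination untouched instead of filled with compressed data.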

@simonmichal
Contributor

@jmuf : sure, but let's wait for the Rucio team so we have a clear idea of when we would like to support decompression.

@mlassnig
Contributor Author

We discussed it briefly today, and in general we will continue to use uncompressed zips. A nice error message and exit code would be much appreciated though.
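
For archives that were already created with compression, one way to get back to an uncompressed bundle is to repack them. This is a small sketch with Python's zipfile module, not a description of any Rucio tooling; repack_stored is an illustrative name.

```python
import io
import zipfile


def repack_stored(zip_bytes):
    """Return a new archive with the same entries, all written without compression."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as src, \
            zipfile.ZipFile(out, "w", compression=zipfile.ZIP_STORED) as dst:
        for info in src.infolist():
            # src.read() decompresses the entry; writestr() then stores the bytes as-is.
            # Passing the name (not `info`) drops the original timestamp and permissions.
            dst.writestr(info.filename, src.read(info))
    return out.getvalue()
```

The repacked archive is larger than the compressed original, which is the trade-off for being extractable without decompression.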

@simonmichal
Contributor

@mlassnig : absolutely, we can have this for the next release. Alternatively, I could add a zip header to the compressed data; I think it should then be possible to pipe the xrdcp output into something like funzip to decompress it. Let me know if this sounds like a good idea.
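
As background to the piping idea: a Deflate-compressed ZIP entry (method 8) holds a raw Deflate stream with no zlib or gzip wrapper, which is presumably why a header would have to be added before a generic tool can process it. Assuming the bytes returned for a compressed entry are exactly that raw stream (an assumption, not something confirmed in this thread), they could also be inflated directly, for example with Python's zlib and a negative window size. The function names are illustrative.

```python
import zlib


def inflate_raw(data):
    """Decompress a raw Deflate stream (no zlib/gzip header), as used by ZIP method 8."""
    decompressor = zlib.decompressobj(-zlib.MAX_WBITS)  # negative wbits = raw stream
    return decompressor.decompress(data) + decompressor.flush()


def deflate_raw(data):
    """Produce a raw Deflate stream; used here only to generate demo input."""
    compressor = zlib.compressobj(9, zlib.DEFLATED, -zlib.MAX_WBITS)
    return compressor.compress(data) + compressor.flush()


if __name__ == "__main__":
    original = b"echo hello\n" * 20
    print(inflate_raw(deflate_raw(original)) == original)
```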
