Importing a ZIP file with PPMd compression fails to show warning box in importer UI #4112

Closed
thadguidry opened this issue Aug 19, 2021 · 4 comments · Fixed by #4286 or nikhilp3/OpenRefine#2 · May be fixed by nikhilp3/OpenRefine#4
Labels: Good First Issue, import, Type: Bug
@thadguidry
Member

thadguidry commented Aug 19, 2021

ZIP files compressed with PPMd cannot be imported into OpenRefine, and the import UI shows no friendly warning that the compression method is not supported.

To Reproduce

Steps to reproduce the behavior:

  1. Create a ZIP file using PPMd compression.
  2. Try to import it into OpenRefine.
  3. Notice that the Preview page in the UI is blank, with no warning, while a Java exception appears in the console.

Current Results

06:56:14.430 [                   refine] GET /command/core/get-languages (32ms)
06:56:14.805 [                   refine] GET /command/core/get-version (375ms)
06:56:28.177 [                   refine] GET /command/core/get-csrf-token (13372ms)
06:56:28.180 [                   refine] POST /command/core/create-importing-job (3ms)
06:56:28.205 [                   refine] POST /command/core/importing-controller (25ms)
java.util.zip.ZipException: invalid compression method
        at java.util.zip.ZipInputStream.read(ZipInputStream.java:225)
        at java.io.FilterInputStream.read(FilterInputStream.java:107)
        at com.google.refine.importing.ImportingUtilities.saveStreamToFile(ImportingUtilities.java:532)
        at com.google.refine.importing.ImportingUtilities.explodeArchive(ImportingUtilities.java:687)
        at com.google.refine.importing.ImportingUtilities.postProcessRetrievedFile(ImportingUtilities.java:556)
        at com.google.refine.importing.ImportingUtilities.retrieveContentFromPostRequest(ImportingUtilities.java:382)
        at com.google.refine.importing.ImportingUtilities.loadDataAndPrepareJob(ImportingUtilities.java:116)
        at com.google.refine.importing.DefaultImportingController.doLoadRawData(DefaultImportingController.java:118)
        at com.google.refine.importing.DefaultImportingController.doPost(DefaultImportingController.java:87)
        at com.google.refine.commands.importing.ImportingControllerCommand.doPost(ImportingControllerCommand.java:68)
        at com.google.refine.RefineServlet.service(RefineServlet.java:187)

Expected Behavior

Display a warning dialog in the UI whenever ImportingUtilities.explodeArchive fails for any reason, I think.
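
As a rough illustration of that expectation, here is a minimal sketch using only `java.util.zip`. It is not OpenRefine's code: `ArchiveWarningSketch`, `explodeOrWarn` and `sampleZip` are hypothetical names, and the sample archive is an ordinary deflate ZIP whose local-header method field is overwritten with 98 (the ID the ZIP specification assigns to PPMd) so the failure can be triggered without a real PPMd file. The idea is simply to catch the exception and turn it into a message the importer UI could display.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipException;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

class ArchiveWarningSketch {

    /**
     * Hypothetical helper: reads every entry of a ZIP stream and returns null
     * on success, or a user-facing warning if the archive cannot be expanded.
     */
    public static String explodeOrWarn(InputStream rawStream) {
        try (ZipInputStream zis = new ZipInputStream(rawStream)) {
            byte[] buffer = new byte[4096];
            while (zis.getNextEntry() != null) {
                while (zis.read(buffer) != -1) {
                    // A real importer would write these bytes to a file.
                }
            }
            return null;
        } catch (ZipException e) {
            // Raised for unsupported compression methods, among other problems.
            return "Could not expand the archive: " + e.getMessage();
        } catch (IOException e) {
            return "Could not read the archive: " + e.getMessage();
        }
    }

    /** Builds a one-entry ZIP in memory, then overwrites its compression method field. */
    public static byte[] sampleZip(int method) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(bytes)) {
            zos.putNextEntry(new ZipEntry("data.csv"));
            zos.write("a,b\n1,2\n".getBytes(StandardCharsets.UTF_8));
            zos.closeEntry();
        }
        byte[] zip = bytes.toByteArray();
        // The method is a 2-byte little-endian field at offset 8 of the local file header.
        zip[8] = (byte) (method & 0xFF);
        zip[9] = (byte) ((method >> 8) & 0xFF);
        return zip;
    }

    public static void main(String[] args) throws IOException {
        // Method 8 (deflate) is left unchanged, so no warning is produced.
        System.out.println(explodeOrWarn(new ByteArrayInputStream(sampleZip(8))));
        // Method 98 (PPMd) is not handled by ZipInputStream, so a warning is returned.
        System.out.println(explodeOrWarn(new ByteArrayInputStream(sampleZip(98))));
    }
}
```

The exact exception message depends on the JDK version and on how the archive was written, so a UI would probably want its own wording rather than the raw message.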

Screenshots

Compressed using 7-Zip, selecting the PPMd compression method for large file support

Versions

  • Operating System: Windows 10
  • Browser Version: Firefox
  • JRE or JDK Version: JDK 15
  • OpenRefine: 3.5.1 beta1

Datasets

Small test ZIP file compressed with PPMd - chicago_tribute-covid.zip

@atharvagadkari05

Hey @thadguidry, I'm interested in working on this issue. Can you guide me?

@thadguidry
Member Author

@atharvagadkari05 sorry, not really.

@wetneb
Member

wetneb commented Aug 30, 2021

Hi @atharvagadkari05, thanks for your interest in this issue!

The first step would be to do a bit of research about what compression methods are currently supported by the standard library that we are using here, java.util.zip.ZipInputStream. Then you could look for alternatives which support more compression methods. If that involves adding a new dependency to the project we need to weigh the pros and cons, checking that its license is compatible with ours, for instance.
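
To make that research step concrete, here is a small sketch that reads the compression method ID straight from the first local file header of a ZIP and checks it against the two methods `java.util.zip.ZipInputStream` can decode, `ZipEntry.STORED` (0) and `ZipEntry.DEFLATED` (8). The class and method names are hypothetical, and the ID-to-name table is a partial list taken from the ZIP format specification (APPNOTE), where PPMd is method 98.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.zip.ZipEntry;

class ZipMethodProbe {

    // Signature "PK\3\4" of a local file header, read as a little-endian int.
    static final int LOCAL_HEADER_SIGNATURE = 0x04034b50;

    /** Returns the method ID of the first entry, or -1 if no local file header is found. */
    public static int firstEntryMethod(byte[] zip) {
        if (zip.length < 10) {
            return -1;
        }
        ByteBuffer header = ByteBuffer.wrap(zip).order(ByteOrder.LITTLE_ENDIAN);
        if (header.getInt(0) != LOCAL_HEADER_SIGNATURE) {
            return -1;
        }
        // The method is an unsigned 16-bit field at offset 8.
        return header.getShort(8) & 0xFFFF;
    }

    /** ZipInputStream only decodes stored and deflated entries. */
    public static boolean supportedByZipInputStream(int method) {
        return method == ZipEntry.STORED || method == ZipEntry.DEFLATED;
    }

    /** Partial mapping of method IDs to names, for friendlier messages. */
    public static String describe(int method) {
        switch (method) {
            case 0: return "stored";
            case 8: return "deflate";
            case 9: return "deflate64";
            case 12: return "bzip2";
            case 14: return "LZMA";
            case 98: return "PPMd";
            default: return "method " + method;
        }
    }

    public static void main(String[] args) {
        // A bare 30-byte local file header with the method field set to 98.
        byte[] header = new byte[30];
        header[0] = 0x50;
        header[1] = 0x4b;
        header[2] = 0x03;
        header[3] = 0x04;
        header[8] = 98;
        int method = firstEntryMethod(header);
        System.out.println(describe(method) + " supported: " + supportedByZipInputStream(method));
    }
}
```

A check like this only inspects the first entry; a fuller version would walk the central directory, since different entries in one archive can use different methods.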

For a general guide about doing your first code contribution to the project, check out this page: https://docs.openrefine.org/technical-reference/contributing#your-first-code-pull-request

Good luck, and feel free to ping me on our Gitter channel if you get stuck!

@atharvagadkari05

Thanks @wetneb
