Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JENA-1554, JENA-1555: Support bz2 compressed files directly from Java. #427

Merged
merged 4 commits into from
Jun 5, 2018

Conversation

afs
Copy link
Member

@afs afs commented Jun 3, 2018

JENA-1555, JENA-1554: Update awaitility ; add Apache Commons compress
JENA-1554: Add bz2 compression/decompression

afs added 2 commits June 3, 2018 09:51
Add Snappy
  default 32k block
  decompress only; compressor not available

Update javadoc (RDFLanguages, BinRDF) that mentions gz.
pom.xml Outdated
@@ -68,6 +68,7 @@
<ver.commonslang3>3.4</ver.commonslang3>
<ver.commonscsv>1.5</ver.commonscsv>
<ver.commons-codec>1.11</ver.commons-codec>
<ver.commons-compress>1.16.1</ver.commons-compress>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1.17 was just released... maybe worth using it instead? Just received Stefan's announcement message about it in the commons mailing list.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! Thanks for the pointer.

Copy link
Member

@rvesse rvesse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Member

@kinow kinow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good! +1

return filename.substring(0, filename.length()-ext.length());
}
return filename;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe instead

    /** The filename without any compression extension, or the original filename.
     *  It tests for compression types handled by {@link #openFileEx}.
     */
    static public String filenameNoCompression(String filename) {
        if ( FilenameUtils.isExtension(filename, extensions) ) {
            return FilenameUtils.removeExtension(filename);
        }
        return filename;
    }

I believe we have commons-io already in the dependencies list. There's some extra check for null bytes in the extension check... but that's not so important. Just simpler I think.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -158,15 +177,18 @@ static public OutputStream openOutputFileEx(String filename) throws FileNotFound
filename = IRILib.decode(filename) ;
}
OutputStream out = new FileOutputStream(filename) ;
if ( filename.endsWith(".gz") )
out = new GZIPOutputStream(out) ;
String ext = FileOps.extension(filename);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Digressing; but as we have FilenameUtils.getExtension() in the classpath, from commons-io, perhaps this could later be marked as deprecated?

Copy link
Member Author

@afs afs Jun 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea - is it OK to separate out a "clean up FileOps/FileUtils" task and let this PR go in now? Got to finish sometime!

@asfgit asfgit merged commit 5304fa4 into apache:master Jun 5, 2018
@afs afs deleted the compressed branch June 5, 2018 20:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants