Skip to content

Conversation

@sekruse
Copy link
Contributor

@sekruse sekruse commented May 7, 2015

  • add a decorateInputStream() method as hook in FileInputFormat
  • provide a InputStreamFSInputWrapper to conveniently wrap InputStreams
  • base existing .deflate file support on these changes
  • add a test to verify the decoration

@rmetzger
Copy link
Contributor

rmetzger commented May 8, 2015

Thank you for this pull request. I'll review it now.

The first thing I saw is that the build has failed:

[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 10.728 s
[INFO] Finished at: 2015-05-07T12:08:03+00:00
[INFO] Final Memory: 25M/448M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.rat:apache-rat-plugin:0.11:check (default) on project flink-parent: Too many files with unapproved license: 1 See RAT report in: /home/travis/build/apache/flink/target/rat.txt -> [Help 1]

https://travis-ci.org/apache/flink/jobs/61608003

You can also do a local mvn clean install -DskipTests to run the rat plugin.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing license header, that's also why the build fails ;)

@rmetzger
Copy link
Contributor

rmetzger commented May 8, 2015

The change looks good, its not completely consistent with our coding guidelines. But a few runs of "mvn install -DskipTests" will resolve this ;)
(also, the indentation is done using spaces, we use tabs).

@sekruse
Copy link
Contributor Author

sekruse commented May 8, 2015

Alright, I incorporated all the feedback. The build seems to be passing now.

@rmetzger
Copy link
Contributor

rmetzger commented May 8, 2015

+1 to merge.

@StephanEwen
Copy link
Contributor

Looks good. One comment, though:

  • The position tracking in the stream wrapper is a nice idea, but since the skip() method is anyways only a hint (it may skip less if it wants) and seeking works only in one direction, why not skip seek support completely? Also simplifies the code and makes it even more lightweight.

As a followup: Should we add more built-in decompressors for other file endings, like *.gz ?

@rmetzger
Copy link
Contributor

I think Sebastian has already filed a JIRA for adding gz read support.

@sekruse
Copy link
Contributor Author

sekruse commented May 11, 2015

I wanted to integrate the gz support with the decorateStream method, therefore I was waiting for this PR to be merged.
I can of course skip the seek, but I thought that forward seeking would be useful as it enables splittable file formats.

@StephanEwen
Copy link
Contributor

That is a good point. It is a bit tricky, though, since the skip method does not guarantee efficient skipping. It also reserves the right to skip less (and return the number of bytes skipped, iirc).

@sekruse
Copy link
Contributor Author

sekruse commented May 12, 2015

I see your point and double-checked it with the Java doc. I will adjust the seek method accordingly.

Sebastian Kruse added 4 commits May 12, 2015 16:57
* add a decorateInputStream() method as hook in FileInputFormat
* provide a InputStreamFSInputWrapper to conveniently wrap InputStreams
* base existing .deflate file support on these changes
* add a test to verify the decoration
* use tabs instead of spaces
* avoid asterisk imports
* include license header
…tWrapper

* do not alter the stream position if the stream is at the end
@rmetzger
Copy link
Contributor

I think the PR is good to merge.

@StephanEwen
Copy link
Contributor

Looks good.

Will merge this in my next batch...

@asfgit asfgit closed this in 7bd8068 May 20, 2015
marthavk pushed a commit to marthavk/flink that referenced this pull request Jun 9, 2015
* add a decorateInputStream() method as hook in FileInputFormat
* provide a InputStreamFSInputWrapper to conveniently wrap InputStreams
* base existing .deflate file support on these changes
* add a test to verify the decoration

This closes apache#658
nltran pushed a commit to nltran/flink that referenced this pull request Jan 8, 2016
* add a decorateInputStream() method as hook in FileInputFormat
* provide a InputStreamFSInputWrapper to conveniently wrap InputStreams
* base existing .deflate file support on these changes
* add a test to verify the decoration

This closes apache#658
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants