[BEAM-1257] Compressed text source emit an extra element when reading records #1757

Closed

@sb2nov
sb2nov commented Jan 10, 2017 edited

When reading a compressed text file with N rows, the TextIO reader emitted an extra element, so the PCollection had N+1 elements instead of N. This change omits the trailing EOF that was being read as a record and adds unit tests to prevent a regression.
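To illustrate the symptom, here is a minimal sketch that uses only the Python standard library. It is not the Beam reader or the code in this PR; `naive_records` and `fixed_records` are hypothetical names chosen for the example, and the assumption is that the extra element comes from treating the empty remainder after the final newline as a record.

```python
import gzip
import io


def naive_records(data: bytes) -> list:
    """Hypothetical buggy reader: splitting on the newline leaves a trailing empty record."""
    text = gzip.decompress(data).decode("utf-8")
    return text.split("\n")


def fixed_records(data: bytes) -> list:
    """Hypothetical fixed reader: iteration stops at EOF, so no empty record is emitted."""
    records = []
    with gzip.open(io.BytesIO(data), "rt", encoding="utf-8") as f:
        for line in f:
            records.append(line.rstrip("\n"))
    return records


# A compressed file with N = 3 rows, each terminated by a newline.
rows = ["a", "b", "c"]
compressed = gzip.compress(("\n".join(rows) + "\n").encode("utf-8"))

print(len(naive_records(compressed)))  # N + 1 elements
print(len(fixed_records(compressed)))  # N elements
```

A regression test in this spirit would write N rows to a compressed file, read them back, and assert that exactly N records come out.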

@asfbot
asfbot commented Jan 10, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/6480/

Build result: FAILURE

[...truncated 11096 lines...]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.maven.plugin.MojoExecutionException: Command execution failed.
    at org.codehaus.mojo.exec.ExecMojo.execute(ExecMojo.java:276)
    at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:134)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:208)
    ... 31 more
Caused by: org.apache.commons.exec.ExecuteException: Process exited with an error: 1 (Exit value: 1)
    at org.apache.commons.exec.DefaultExecutor.executeInternal(DefaultExecutor.java:404)
    at org.apache.commons.exec.DefaultExecutor.execute(DefaultExecutor.java:166)
    at org.codehaus.mojo.exec.ExecMojo.executeCommandLine(ExecMojo.java:660)
    at org.codehaus.mojo.exec.ExecMojo.execute(ExecMojo.java:265)
    ... 33 more
2017-01-10T09:18:40.747 [ERROR]
2017-01-10T09:18:40.747 [ERROR] Re-run Maven using the -X switch to enable full debug logging.
2017-01-10T09:18:40.747 [ERROR]
2017-01-10T09:18:40.747 [ERROR] For more information about the errors and possible solutions, please read the following articles:
2017-01-10T09:18:40.747 [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
2017-01-10T09:18:40.747 [ERROR]
2017-01-10T09:18:40.747 [ERROR] After correcting the problems, you can resume the build with the command
2017-01-10T09:18:40.747 [ERROR] mvn -rf :beam-sdks-python
channel stopped
Setting status of 9e8d598 to FAILURE with url https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/6480/ and message: 'Build finished. '
Using context: Jenkins: Maven clean install

@sb2nov sb2nov Compressed file with missing last EOF create a fake element
c7df753
@sb2nov
sb2nov commented Jan 10, 2017

R: @chamikaramj PTAL

@asfbot
asfbot commented Jan 10, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/6483/

Build result: FAILURE

[...truncated 11096 lines...]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.maven.plugin.MojoExecutionException: Command execution failed.
    at org.codehaus.mojo.exec.ExecMojo.execute(ExecMojo.java:276)
    at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:134)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:208)
    ... 31 more
Caused by: org.apache.commons.exec.ExecuteException: Process exited with an error: 1 (Exit value: 1)
    at org.apache.commons.exec.DefaultExecutor.executeInternal(DefaultExecutor.java:404)
    at org.apache.commons.exec.DefaultExecutor.execute(DefaultExecutor.java:166)
    at org.codehaus.mojo.exec.ExecMojo.executeCommandLine(ExecMojo.java:660)
    at org.codehaus.mojo.exec.ExecMojo.execute(ExecMojo.java:265)
    ... 33 more
2017-01-10T18:59:34.574 [ERROR]
2017-01-10T18:59:34.574 [ERROR] Re-run Maven using the -X switch to enable full debug logging.
2017-01-10T18:59:34.574 [ERROR]
2017-01-10T18:59:34.574 [ERROR] For more information about the errors and possible solutions, please read the following articles:
2017-01-10T18:59:34.574 [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
2017-01-10T18:59:34.574 [ERROR]
2017-01-10T18:59:34.574 [ERROR] After correcting the problems, you can resume the build with the command
2017-01-10T18:59:34.574 [ERROR] mvn -rf :beam-sdks-python
channel stopped
Setting status of 9db83b2 to FAILURE with url https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/6483/ and message: 'Build finished. '
Using context: Jenkins: Maven clean install

@asfbot
asfbot commented Jan 10, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/6484/

@chamikaramj
Contributor

LGTM

@chamikaramj
Contributor

Please create a JIRA and update the PR description.

@sb2nov sb2nov changed the title from Compressed file with missing last EOF create a fake element to [BEAM-1257] Compressed text source emit an extra element when reading records Jan 10, 2017
@sb2nov
sb2nov commented Jan 10, 2017

Hmm, I thought I had done that.

@sb2nov
sb2nov commented Jan 10, 2017

R: @robertwb please merge

@asfgit asfgit pushed a commit that referenced this pull request Jan 11, 2017
@robertwb robertwb Closes #1757 86d4203
@sb2nov sb2nov closed this Jan 11, 2017
@sb2nov sb2nov deleted the sb2nov:BEAM-compressed-file-testing branch Jan 11, 2017