[BEAM-59] Beam FileSystems: add match(), copy(), rename(), delete() utilities. #2175

Closed
peihe wants to merge 5 commits

Contributor

@peihe peihe commented Mar 7, 2017

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

  • Make sure the PR title is formatted like:
    [BEAM-<Jira issue #>] Description of pull request
  • Make sure tests pass via mvn clean verify. (Even better, enable
    Travis-CI on your fork and ensure the whole test matrix passes).
  • Replace <Jira issue #> in the title with the actual Jira issue
    number, if there is one.
  • If this contribution is large, please file an Apache
    Individual Contributor License Agreement.

asfbot commented Mar 7, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/8159/

asfbot commented Mar 10, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/8296/

asfbot commented Mar 10, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/8303/

@coveralls

Coverage increased (+0.1%) to 70.311% when pulling 0bc6057 on peihe:file-system-FileSystems into 2c2424c on apache:master.

Contributor Author

peihe commented Mar 14, 2017

R: @dhalperi
CC: @davorbonaci

asfbot commented Mar 14, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/8379/

Build result: FAILURE

[...truncated 553.79 KB...]
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.maven.plugin.compiler.CompilationFailureException: Compilation failure
/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Java_MavenInstall@2/runners/core-java/src/test/java/org/apache/beam/runners/core/ReduceFnTester.java:[364,25] no suitable method found for add(W)
    method java.util.Collection.add(org.apache.beam.runners.core.StateNamespace) is not applicable
      (argument mismatch; W cannot be converted to org.apache.beam.runners.core.StateNamespace)
    method java.util.Set.add(org.apache.beam.runners.core.StateNamespace) is not applicable
      (argument mismatch; W cannot be converted to org.apache.beam.runners.core.StateNamespace)
    at org.apache.maven.plugin.compiler.AbstractCompilerMojo.execute(AbstractCompilerMojo.java:1029)
    at org.apache.maven.plugin.compiler.TestCompilerMojo.execute(TestCompilerMojo.java:170)
    at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:134)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:208)
    ... 31 more
2017-03-14T09:47:07.477 [ERROR]
2017-03-14T09:47:07.477 [ERROR] Re-run Maven using the -X switch to enable full debug logging.
2017-03-14T09:47:07.477 [ERROR]
2017-03-14T09:47:07.477 [ERROR] For more information about the errors and possible solutions, please read the following articles:
2017-03-14T09:47:07.477 [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
2017-03-14T09:47:07.478 [ERROR]
2017-03-14T09:47:07.478 [ERROR] After correcting the problems, you can resume the build with the command
2017-03-14T09:47:07.478 [ERROR] mvn -rf :beam-runners-core-java
channel stopped
Setting status of f40cd72 to FAILURE with url https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/8379/ and message: 'Build finished. '
Using context: Jenkins: Maven clean install

@peihe peihe force-pushed the file-system-FileSystems branch 2 times, most recently from f4c437e to bac345c March 17, 2017 02:43

asfbot commented Mar 17, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/8511/

Build result: FAILURE

[...truncated 557.75 KB...]
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.maven.plugin.compiler.CompilationFailureException: Compilation failure
/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Java_MavenInstall/runners/core-java/src/test/java/org/apache/beam/runners/core/ReduceFnTester.java:[364,25] no suitable method found for add(W)
    method java.util.Collection.add(org.apache.beam.runners.core.StateNamespace) is not applicable
      (argument mismatch; W cannot be converted to org.apache.beam.runners.core.StateNamespace)
    method java.util.Set.add(org.apache.beam.runners.core.StateNamespace) is not applicable
      (argument mismatch; W cannot be converted to org.apache.beam.runners.core.StateNamespace)
    at org.apache.maven.plugin.compiler.AbstractCompilerMojo.execute(AbstractCompilerMojo.java:1029)
    at org.apache.maven.plugin.compiler.TestCompilerMojo.execute(TestCompilerMojo.java:170)
    at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:134)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:208)
    ... 31 more
2017-03-17T02:49:16.927 [ERROR]
2017-03-17T02:49:16.927 [ERROR] Re-run Maven using the -X switch to enable full debug logging.
2017-03-17T02:49:16.927 [ERROR]
2017-03-17T02:49:16.927 [ERROR] For more information about the errors and possible solutions, please read the following articles:
2017-03-17T02:49:16.928 [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
2017-03-17T02:49:16.928 [ERROR]
2017-03-17T02:49:16.928 [ERROR] After correcting the problems, you can resume the build with the command
2017-03-17T02:49:16.928 [ERROR] mvn -rf :beam-runners-core-java
channel stopped
Setting status of bac345c to FAILURE with url https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/8511/ and message: 'Build finished. '
Using context: Jenkins: Maven clean install

asfbot commented Mar 17, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/8512/

Build result: FAILURE

[...truncated 557.79 KB...]
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.maven.plugin.compiler.CompilationFailureException: Compilation failure
/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Java_MavenInstall/runners/core-java/src/test/java/org/apache/beam/runners/core/ReduceFnTester.java:[364,25] no suitable method found for add(W)
    method java.util.Collection.add(org.apache.beam.runners.core.StateNamespace) is not applicable
      (argument mismatch; W cannot be converted to org.apache.beam.runners.core.StateNamespace)
    method java.util.Set.add(org.apache.beam.runners.core.StateNamespace) is not applicable
      (argument mismatch; W cannot be converted to org.apache.beam.runners.core.StateNamespace)
    at org.apache.maven.plugin.compiler.AbstractCompilerMojo.execute(AbstractCompilerMojo.java:1029)
    at org.apache.maven.plugin.compiler.TestCompilerMojo.execute(TestCompilerMojo.java:170)
    at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:134)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:208)
    ... 31 more
2017-03-17T03:08:40.894 [ERROR]
2017-03-17T03:08:40.894 [ERROR] Re-run Maven using the -X switch to enable full debug logging.
2017-03-17T03:08:40.894 [ERROR]
2017-03-17T03:08:40.894 [ERROR] For more information about the errors and possible solutions, please read the following articles:
2017-03-17T03:08:40.894 [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
2017-03-17T03:08:40.894 [ERROR]
2017-03-17T03:08:40.894 [ERROR] After correcting the problems, you can resume the build with the command
2017-03-17T03:08:40.894 [ERROR] mvn -rf :beam-runners-core-java
channel stopped
Setting status of 7b9d764 to FAILURE with url https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/8512/ and message: 'Build finished. '
Using context: Jenkins: Maven clean install

@coveralls

Coverage increased (+0.2%) to 70.317% when pulling a9b3c9f on peihe:file-system-FileSystems into 25b52c5 on apache:master.

asfbot commented Mar 17, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/8513/

Contributor Author

peihe commented Mar 20, 2017

PTAL

It looks to me like the Travis failure is unrelated.

@dhalperi
Contributor

Apologies for the delay; I've been traveling.

Contributor

@dhalperi dhalperi left a comment

Looks pretty good!

* <li>{@code spec} could be a glob or a uri. {@link #match} should be able to tell and
* choose efficient implementations.
* <li>The user-provided {@code spec} might refer to files or directories. It is common that
* users that wish to indicate a directory will omit the trailing {@code /}, such as in a spec of
Contributor

Maybe hint that / is not the only delimiter (for example, \ on Windows)?

Contributor Author

updated

return parseScheme(spec);
}})
.toSet();
checkArgument(schemes.size() == 1, "Expect specs have the same scheme.");
Contributor

add the set of schemes to the message for debugging?

Contributor Author

done

}})
.toSet();
checkArgument(schemes.size() == 1, "Expect specs have the same scheme.");
return getFileSystemInternal(schemes.iterator().next()).match(specs);
Contributor

Iterables.getOnlyElement or something like it?

Contributor Author

done

*/
public static List<MatchResult> match(List<String> specs) throws IOException {
checkArgument(!specs.isEmpty(), "Expect specs are not empty.");
Set<String> schemes = FluentIterable.from(specs)
Contributor

seems like it would be good to extract this logic to a common place -- "getOnlyScheme"? This would take a list of resources and assert that they all have the same scheme.

Note that we need this here for String but also for ResourceId in all the bulk APIs below.

Contributor Author

done
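
For readers following this thread, the suggested getOnlyScheme helper might look roughly like the sketch below. It is an illustration, not the code merged in this PR: the class name SchemeSketch, the regex, and the "file" default for specs without a scheme are assumptions, and it uses plain JDK collections instead of Guava's FluentIterable and Iterables.getOnlyElement. It also folds in the earlier request to put the set of schemes into the error message.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SchemeSketch {
  // Assumption: a spec without a "scheme://" prefix is treated as a local file.
  private static final Pattern SCHEME = Pattern.compile("^([a-zA-Z][-a-zA-Z0-9+.]*)://.*");

  // Hypothetical stand-in for the parseScheme(spec) call shown in the diff.
  static String parseScheme(String spec) {
    Matcher m = SCHEME.matcher(spec);
    return m.matches() ? m.group(1).toLowerCase() : "file";
  }

  // Returns the single scheme shared by all specs, or throws and names the schemes found.
  static String getOnlyScheme(List<String> specs) {
    if (specs.isEmpty()) {
      throw new IllegalArgumentException("Expect specs are not empty.");
    }
    // TreeSet keeps the schemes sorted so the error message is deterministic.
    Set<String> schemes = new TreeSet<>();
    for (String spec : specs) {
      schemes.add(parseScheme(spec));
    }
    if (schemes.size() != 1) {
      throw new IllegalArgumentException(
          "Expect specs have the same scheme, but received " + schemes + ".");
    }
    return schemes.iterator().next();
  }

  public static void main(String[] args) {
    System.out.println(getOnlyScheme(Arrays.asList("gs://bucket/a", "gs://bucket/b*")));
    try {
      getOnlyScheme(Arrays.asList("gs://bucket/a", "/tmp/b"));
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

The same check would be needed for ResourceId lists in the bulk APIs; in this sketch that would mean a second overload that reads the scheme from each resource instead of parsing a string.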

public static void copy(
List<ResourceId> srcResourceIds,
List<ResourceId> destResourceIds,
MoveOptions... moveOptions) throws IOException {
Contributor

MoveOptions or CopyOptions or something else?

Contributor Author

I was following the nio interface, which uses CopyOption in both copy() and move():
https://docs.oracle.com/javase/7/docs/api/java/nio/file/CopyOption.html
(I also used the same options in delete().)

I think CopyOptions and RenameOptions could share many configuration options.
If we need copy-specific options, we can make CopyOptions extend MoveOptions; this interface accepts both.
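
A minimal sketch of the shape described in this reply, assuming a marker interface in the style of java.nio.file.CopyOption. The nested type names, the OVERWRITE_EXISTING constant, and the copy() stand-in are illustrative assumptions, not the PR's actual declarations; the point is only that a varargs parameter typed as the shared interface accepts both kinds of option.

```java
import java.util.Arrays;
import java.util.List;

final class MoveOptionsSketch {
  // Shared marker interface, used by copy(), rename() and delete() alike.
  interface MoveOptions {}

  // Hypothetical: copy-only options can be added later without changing signatures.
  interface CopyOptions extends MoveOptions {}

  enum StandardMoveOptions implements MoveOptions { IGNORE_MISSING_FILES }

  // Hypothetical copy-specific option, for illustration only.
  enum StandardCopyOptions implements CopyOptions { OVERWRITE_EXISTING }

  // Stand-in for a bulk copy: it validates the lists and reports the options it received.
  static List<MoveOptions> copy(
      List<String> src, List<String> dest, MoveOptions... moveOptions) {
    if (src.size() != dest.size()) {
      throw new IllegalArgumentException("Source and destination lists differ in size.");
    }
    return Arrays.asList(moveOptions);
  }

  public static void main(String[] args) {
    // Both a shared option and a copy-specific option fit the same parameter.
    System.out.println(copy(
        Arrays.asList("a"),
        Arrays.asList("b"),
        StandardMoveOptions.IGNORE_MISSING_FILES,
        StandardCopyOptions.OVERWRITE_EXISTING));
  }
}
```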

srcToCopy = new ArrayList<>();
destToCopy = new ArrayList<>();

List<MatchResult> matchResults = matchResources(srcResourceIds);
Contributor

extract this logic for ignoring missing files to a common place?

Contributor Author

done
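
One way the shared "ignore missing files" step could be factored out is sketched below. This is not the PR's implementation: the helper name filterMissingFiles, the FilterResult holder, and the use of a Predicate in place of real match results are assumptions made to keep the example self-contained. It keeps only the source/destination pairs whose source exists, which is the filtering that copy() and rename() would otherwise each repeat.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

final class IgnoreMissingSketch {
  // Holds the pairs that survive filtering; index i in src maps to index i in dest.
  static final class FilterResult {
    final List<String> src = new ArrayList<>();
    final List<String> dest = new ArrayList<>();
  }

  // Drops every (src, dest) pair whose source does not exist.
  // The exists predicate is a stand-in for checking a match result's status.
  static FilterResult filterMissingFiles(
      List<String> src, List<String> dest, Predicate<String> exists) {
    if (src.size() != dest.size()) {
      throw new IllegalArgumentException("Source and destination lists differ in size.");
    }
    FilterResult result = new FilterResult();
    for (int i = 0; i < src.size(); i++) {
      if (exists.test(src.get(i))) {
        result.src.add(src.get(i));
        result.dest.add(dest.get(i));
      }
    }
    return result;
  }

  public static void main(String[] args) {
    Set<String> existing = new HashSet<>(Arrays.asList("a", "c"));
    FilterResult r = filterMissingFiles(
        Arrays.asList("a", "b", "c"), Arrays.asList("x", "y", "z"), existing::contains);
    System.out.println(r.src + " -> " + r.dest);
  }
}
```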

public static void rename(
List<ResourceId> srcResourceIds,
List<ResourceId> destResourceIds,
MoveOptions... moveOptions) throws IOException {
Contributor

RenameOptions or something else?

Contributor Author

ditto

@@ -24,16 +24,17 @@
import com.google.common.collect.ImmutableList;
import com.google.common.collect.Iterables;
import com.google.common.collect.Lists;

Contributor

drop?

Contributor Author

done

Contributor Author

@peihe peihe left a comment

PTAL

@jbonofre
Member

That's awesome! I'm going to give it a try and experiment with a filesystem (HDFS/S3 first). Thanks again!

asfbot commented Mar 28, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/8860/

Contributor

dhalperi commented Mar 28, 2017

LGTM once green. (Please self-merge.)

Right now, it seems FindBugs is flagging potential null pointer dereferences:

2017-03-28T16:36:28.974 [INFO] --- findbugs-maven-plugin:3.0.4:check (default) @ beam-sdks-java-core ---
2017-03-28T16:36:29.015 [INFO] BugInstance size is 3
2017-03-28T16:36:29.019 [INFO] Error size is 0
2017-03-28T16:36:29.044 [INFO] Total bugs: 3
2017-03-28T16:36:29.049 [INFO] resourceId must be non-null but is marked as nullable [org.apache.beam.sdk.io.FileSystems$1] At FileSystems.java:[line 130] NP_PARAMETER_MUST_BE_NONNULL_BUT_MARKED_AS_NULLABLE
2017-03-28T16:36:29.049 [INFO] matchResult must be non-null but is marked as nullable [org.apache.beam.sdk.io.FileSystems$4] At FileSystems.java:[line 271] NP_PARAMETER_MUST_BE_NONNULL_BUT_MARKED_AS_NULLABLE
2017-03-28T16:36:29.049 [INFO] resourceId must be non-null but is marked as nullable [org.apache.beam.sdk.io.FileSystems$5] At FileSystems.java:[line 329] NP_PARAMETER_MUST_BE_NONNULL_BUT_MARKED_AS_NULLABLE
2017-03-28T16:36:29.049 [INFO] 

@coveralls

Coverage increased (+0.1%) to 70.268% when pulling 89a11b6 on peihe:file-system-FileSystems into 25b52c5 on apache:master.

@coveralls

Coverage increased (+0.2%) to 70.322% when pulling 7ab3ea6 on peihe:file-system-FileSystems into 25b52c5 on apache:master.

asfbot commented Mar 30, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/8946/

@asfgit asfgit closed this in 769398e Mar 30, 2017
@peihe peihe deleted the file-system-FileSystems branch August 15, 2017 09:25