-
Notifications
You must be signed in to change notification settings - Fork 4.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BEAM-77] Move hadoop contrib as hdfs IO #96
Conversation
@@ -129,6 +129,7 @@ | |||
<module>runners</module> | |||
<module>examples/java</module> | |||
<module>sdks/java/maven-archetypes</module> | |||
<module>io</module> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will likely need a rebase / manual resolve (since #90 is in process of merging).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 I will tackle that.
Left a (minor) few comments. The only real one is that I'd prefer |
There is bunch of changes in GoogleCloudPlatform/DataflowJavaSDK#103 - should those be incorporated here or applied later on? |
@ravwojdyla definitely, the changes proposed in DataflowJavaSDK#103 should go there. |
I think it is fair to say... R: @davorbonaci |
I'm resuming my work on this PR. |
Rebase and updated according to @davorbonaci comments. |
@davorbonaci I would like to include changes proposed in GoogleCloudPlatform/DataflowJavaSDK#103 ? Does it sound reasonable to you ? |
|
||
<artifactId>hdfs</artifactId> | ||
<name>Apache Beam :: SDKs :: Java :: IO :: HDFS</name> | ||
<description>Library to read and write Hadoop file formats from Dataflow.</description> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dataflow -> Beam
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, I missed this one. Thanks.
LGTM. (Left a few super-minor comments). No problems in including GoogleCloudPlatform/DataflowJavaSDK#103 whatsoever. I'd prefer a separate pull request, which gives authorship credit to @nevillelyh / @ravwojdyla. |
+1, let me first update this PR and I will create another one to include the changes from GoogleCloudPlatform/DataflowJavaSDK#103. Thanks ! |
Rebased and updated according to @davorbonaci comments. |
Rebase. |
LGTM. Merging. |
Done. Thanks JB. |
Reduces the heartbeat interval from 5 seconds to 2 seconds so that is within the default checkpointing interval of Google Dataflow. This should cover the case when no records are produced in the stream and a checkpoint is made.
PiperOrigin-RevId: 306324714
Support fusion on the ParDos with user states
* remove v1beta1 code * remove v1beta1 unit tests * remove v1beta1 gapic tests
[BEAM-77] Move hadoop into io module as hdfs
Be sure to do all of the following to help us incorporate your contribution
quickly and easily:
mvn clean verify
. (Even better, enableTravis-CI for on your fork and ensure the whole test matrix passes).
number, if there is one.
Individual Contributor License Agreement.