[BEAM-77] Move hadoop contrib as hdfs IO #96

Closed

jbonofre
Member

[BEAM-77] Move hadoop into io module as hdfs

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

  • Make sure tests pass via mvn clean verify. (Even better, enable
    Travis-CI on your fork and ensure the whole test matrix passes.)
  • Replace "<Jira issue #>" in the title with the actual Jira issue
    number, if there is one.
  • If this contribution is large, please file an Apache
    Individual Contributor License Agreement.

@@ -129,6 +129,7 @@
 <module>runners</module>
 <module>examples/java</module>
 <module>sdks/java/maven-archetypes</module>
+<module>io</module>
Member

This will likely need a rebase or manual conflict resolution, since #90 is in the process of being merged.

Member Author

+1 I will tackle that.

@davorbonaci
Member

Left a few (minor) comments. The only real one is that I'd prefer sdks/java/io over a top-level io, given that all IOs are SDK-specific.

@ravwojdyla
Contributor

There are a bunch of changes in GoogleCloudPlatform/DataflowJavaSDK#103. Should those be incorporated here or applied later on?

@jbonofre
Member Author

@ravwojdyla definitely, the changes proposed in DataflowJavaSDK#103 should go there.
@davorbonaci thanks buddy, I will update according to your comments.

@kennknowles
Member

I think it is fair to say...

R: @davorbonaci
R: @dhalperi

@jbonofre
Member Author

jbonofre commented Apr 7, 2016

I'm resuming my work on this PR.

@jbonofre
Member Author

Rebased and updated according to @davorbonaci's comments.

@jbonofre
Member Author

@davorbonaci I would like to include the changes proposed in GoogleCloudPlatform/DataflowJavaSDK#103. Does that sound reasonable to you?


<artifactId>hdfs</artifactId>
<name>Apache Beam :: SDKs :: Java :: IO :: HDFS</name>
<description>Library to read and write Hadoop file formats from Dataflow.</description>
Member

Dataflow -> Beam

Member Author

+1, I missed this one. Thanks.

@davorbonaci
Member

LGTM. (Left a few super-minor comments).

No problem at all with including GoogleCloudPlatform/DataflowJavaSDK#103. I'd prefer a separate pull request, which gives authorship credit to @nevillelyh / @ravwojdyla.

@jbonofre
Member Author

+1, let me first update this PR and I will create another one to include the changes from GoogleCloudPlatform/DataflowJavaSDK#103.

Thanks!

@jbonofre
Member Author

Rebased and updated according to @davorbonaci's comments.

@jbonofre
Member Author

Rebased.

@davorbonaci
Member

LGTM. Merging.

@asfgit closed this in c8ed398 Apr 15, 2016
@davorbonaci
Member

Done. Thanks JB.

@jbonofre deleted the HDFS_IO branch May 8, 2016 06:14