[BEAM-79] Merge branch 'master' into gearpump_runner #750
Transform Evaluator Factories must be reused for the entire execution of a Pipeline and must not be reused across pipelines. Remove EvaluatorKey, and key explicitly by the transform application.
Ignores tests and examples
This makes the static constructors for withAllowedLateness symmetric to the PTransform builder methods. It also allows references to Window#withAllowedLateness(Duration, ClosingBehavior).
A DoFn application is the scope of reuse. Factor CloningThreadLocal as the top-level class instead of SerializableCloningThreadLocalCacheLoader, and extract the Fn from the AppliedPTransform when loading an absent element.
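One way to picture the cloning-per-thread idea in the commit message above is the sketch below. It is not the class from the commit: `CloningThreadLocalSketch` is a hypothetical name, and cloning through plain Java serialization is an assumption made only to keep the example self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;

// Hypothetical sketch: each thread lazily receives its own deep copy of one original value.
final class CloningThreadLocalSketch<T extends Serializable> extends ThreadLocal<T> {
    // The original is kept in serialized form so every clone starts from the same state.
    private final byte[] serializedOriginal;

    CloningThreadLocalSketch(T original) {
        this.serializedOriginal = serialize(original);
    }

    // Called once per thread, the first time that thread calls get().
    @Override
    @SuppressWarnings("unchecked")
    protected T initialValue() {
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(serializedOriginal))) {
            return (T) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("could not clone the original", e);
        }
    }

    private static byte[] serialize(Object value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new IllegalStateException("could not serialize the original", e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws InterruptedException {
        ArrayList<String> original = new ArrayList<>();
        original.add("state");
        CloningThreadLocalSketch<ArrayList<String>> local =
                new CloningThreadLocalSketch<>(original);

        // Mutating the main thread's copy does not touch the original or other threads.
        local.get().add("mutated on main thread");
        Thread worker = new Thread(() -> System.out.println("worker sees " + local.get()));
        worker.start();
        worker.join();
        System.out.println("original is " + original);
    }
}
```

In this sketch one instance would be held per DoFn application, so clones are reused within that application and never shared across applications.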
* Move package from io to io.gcp.bigquery
* Move from SDK core into GCP-IO module
* Fixup references and import orders
* Separate AvroUtils into generic AvroUtils and BigQueryAvroUtils
* Rewrite a unit test in sdk core to not depend on BigQueryIO
* Fixup Javadoc in SDK core that need not depend on BigQueryIO
* Make utility classes package-private
… to encoded bytes This closes apache#695
* Use the uber jar
* Remove OS classifier mumbo jumbo
* Move common dependency versioning to root pom
This closes apache#701
This removes the duplication of "DirectRunner" and "DirectOptions" classes.
* Register FileIOChannelFactory for file scheme
* Modify FileIOChannelFactory to dynamically remove the file:// scheme string.
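The scheme handling described above might look roughly like the following. This is a minimal sketch, not the code from the commit: `FileSchemeSketch` and `stripFileScheme` are hypothetical names, and the only behavior assumed is that a leading `file://` is dropped while every other spec passes through unchanged.

```java
// Hypothetical sketch of removing the file:// scheme before treating a spec as a local path.
final class FileSchemeSketch {
    private static final String FILE_SCHEME_PREFIX = "file://";

    private FileSchemeSketch() {}

    // Returns the spec without a leading "file://"; other specs are returned as-is.
    static String stripFileScheme(String spec) {
        if (spec.startsWith(FILE_SCHEME_PREFIX)) {
            return spec.substring(FILE_SCHEME_PREFIX.length());
        }
        return spec;
    }

    public static void main(String[] args) {
        System.out.println(stripFileScheme("file:///tmp/output.txt"));
        System.out.println(stripFileScheme("/tmp/output.txt"));
    }
}
```

Stripping at lookup time lets the same factory serve both `file://`-prefixed specs and bare local paths.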
Previously, the situation was this:
- All runners inherit a RunnableOnService integration-test execution referencing runnableOnServicePipelineOptions whether or not the variable was set. Basically an unbound variable reference.
- The Dataflow runner had a profile disabling it if runnableOnServicePipelineOptions was not set.
- Before they got configured, Flink and Spark had to do extra work to explicitly prevent the invalid configuration from being used.

After this change:
- All runners inherit the same integration-test execution but only if the variable it requires is present.
- Dataflow doesn't have any special profile.
- Flink and Spark are unchanged, since they do set up the variable themselves. When they move to running only as postcommit, like Dataflow does, the hardcoding is expected to either move to a profile or move to the Jenkins invocation.
Duplicate WordCount into spark examples package. Duplicate parts of TfIdf from beam examples. Better reuse of WordCount and its parts. Remove dependency on beam-examples-java
- updates the README
- repairs broken exec configuration
A new rebase is needed for the new version. I plan to do it ASAP.
…an interpretation of the Pipeline's windows This closes apache#808
I forked it and updated the version. It looks like it is passing: https://travis-ci.org/kennknowles/incubator-beam/builds/151282293. I've opened manuzhang#1 which bumps the version on this branch, or you could just do it yourself.
Timers are equal if the domain, timestamp, and namespace are equal. Compare these values in compareTo. The ordering of TimerData that are not in the same namespace or domain is arbitrary.
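To illustrate the contract in the commit message above, here is a small sketch of an `equals`/`compareTo` pair that agree with each other. `TimerSketch`, its `Domain` enum, and the string namespace are hypothetical stand-ins, not the runner's actual TimerData; ordering by timestamp first is also an assumption, since the message only says the order across namespaces or domains is arbitrary.

```java
import java.util.Objects;

// Hypothetical stand-in for a timer record; field types are simplified for illustration.
final class TimerSketch implements Comparable<TimerSketch> {
    enum Domain { EVENT_TIME, PROCESSING_TIME }

    private final String namespace;
    private final long timestampMillis;
    private final Domain domain;

    TimerSketch(String namespace, long timestampMillis, Domain domain) {
        this.namespace = Objects.requireNonNull(namespace);
        this.timestampMillis = timestampMillis;
        this.domain = Objects.requireNonNull(domain);
    }

    // Equal exactly when domain, timestamp, and namespace are all equal.
    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof TimerSketch)) {
            return false;
        }
        TimerSketch other = (TimerSketch) o;
        return timestampMillis == other.timestampMillis
                && domain == other.domain
                && namespace.equals(other.namespace);
    }

    @Override
    public int hashCode() {
        return Objects.hash(namespace, timestampMillis, domain);
    }

    // Compares the same three values as equals, so compareTo returns 0 only for equal timers.
    // The tie-break order between domains and namespaces is arbitrary but deterministic.
    @Override
    public int compareTo(TimerSketch other) {
        int byTimestamp = Long.compare(timestampMillis, other.timestampMillis);
        if (byTimestamp != 0) {
            return byTimestamp;
        }
        int byDomain = domain.compareTo(other.domain);
        if (byDomain != 0) {
            return byDomain;
        }
        return namespace.compareTo(other.namespace);
    }

    public static void main(String[] args) {
        TimerSketch a = new TimerSketch("window-1", 10L, Domain.EVENT_TIME);
        TimerSketch b = new TimerSketch("window-2", 10L, Domain.EVENT_TIME);
        System.out.println(a.equals(b) + " " + (a.compareTo(b) == 0));
    }
}
```

Keeping `compareTo` consistent with `equals` matters when timers are stored in sorted collections such as `TreeSet`, which treat a comparison result of 0 as a duplicate.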
Update Gearpump runner version to 0.3.0-incubating
Add a field that is modified per output, which should occur twice.
@kennknowles updated, but hit by a download timeout again 😞
# Conflicts:
#	runners/pom.xml
@kennknowles @jbonofre @dhalperi merged with recent master. Previous |
Guys, can we move forward with this PR?
I will be back from vacation Wednesday. I will work on it then.
Guys, any updates?
Yes, I rebased and tested. I noted some things to fix. I will comment today.
@manuzhang would you be available on Slack or Hangout? I would like to discuss a couple of topics with you. Thanks!
The tests are green. Are there any outstanding issues, or are you ready to merge it in? Since it is a feature branch, presumably you can just roll forwards addressing any issues.
I plan to discuss some topics with Manu tomorrow. I will do the merge. Thanks!
I discussed a couple of topics with Manu:
I also tested the Gearpump runner with the latest Gearpump incubating release using simple pipelines. I also checked:
@kennknowles I'm proposing to merge this runner into master. It would give more visibility and would allow people to experiment and provide feedback. If you agree, I will be more than happy to do the merge ;)
That's a different discussion for the mailing list, where we have discussed criteria a bit. This PR is merging the other way, so that the |
I have shuffled the commits for readability and pushed the changes from this PR into I recommend merging |
Fully agree. That's the plan and what I did. I think we have to move forward quickly to merge the Gearpump runner into master. It would give more visibility.
Thanks to both of you.