Skip to content

Merge branch 'master' upto commit 686b774 into jstorm-runner#2672

Merged
asfgit merged 195 commits intoapache:jstorm-runnerfrom
peihe:jstorm-runner
Apr 25, 2017
Merged

Merge branch 'master' upto commit 686b774 into jstorm-runner#2672
asfgit merged 195 commits intoapache:jstorm-runnerfrom
peihe:jstorm-runner

Conversation

@peihe
Copy link
Contributor

@peihe peihe commented Apr 25, 2017

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

  • Make sure the PR title is formatted like:
    [BEAM-<Jira issue #>] Description of pull request
  • Make sure tests pass via mvn clean verify. (Even better, enable
    Travis-CI on your fork and ensure the whole test matrix passes).
  • Replace <Jira issue #> in the title with the actual Jira issue
    number, if there is one.
  • If this contribution is large, please file an Apache
    Individual Contributor License Agreement.

mdshalda and others added 30 commits April 7, 2017 17:12
Allows a restriction type to implement HasDefaultTracker,
in that case the splittable DoFn itself does not need to
implement NewTracker - only ProcessElement and GetInitialRestriction.
1. There was a race in which pipelines without PAsserts might
   erroneously pass because they would be canceled, which would
   in turn cause the watermarks to reach max infinity, which
   would in turn (because there are no PAsserts) cause the main
   streaming poll loop think the pipeline succeeded.

   Fix this by making the error presence available to the main
   polling loop, and only canceling from there.

2. The fact we were canceling from two places meant we could
   get double-cancelations that led to test failures.

Fix both these issues (I hope).
…ith a single output.

remove use of EvaluationContext in DStream lambda, it is not serializable and also redundant in this
case.

implement pardo as multido.

cache only if this is an actual MultiDo.
jkff and others added 26 commits April 17, 2017 09:58
Remove references to DataflowPipelineOptions in PipelineOptions javadoc.
Callers should instead get the Default WindowMappingFn if no explicit
WindowMappingFn is available.

Migrate all existing callers within the SDK and runners.
Instead of using the project at job submission time, use the project at
job execution time.
This is left over from commit 78a360e

 - [x] Make sure the PR title is formatted like:
   `[BEAM-<Jira issue #>] Description of pull request`
 - [x] Make sure tests pass via `mvn clean verify`. (Even better, enable
       Travis-CI on your fork and ensure the whole test matrix passes).
 - [x] Replace `<Jira issue #>` in the title with the actual Jira issue
       number, if there is one.
 - [ ] If this contribution is large, please file an Apache
       [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
…lFactories

Additionally, drop an unnecessary use of `GcsOptions` in
`PipelineRunner`.
@peihe
Copy link
Contributor Author

peihe commented Apr 25, 2017

R: @kennknowles

@kennknowles
Copy link
Member

LGTM

@asfgit asfgit merged commit f1e170a into apache:jstorm-runner Apr 25, 2017
asfgit pushed a commit that referenced this pull request Apr 25, 2017
…orm-runner

  [BEAM-1993] Remove special unbounded Flink source/sink
  Remove flink-annotations dependency
  Fix Javadoc warnings on Flink Runner
  Enable flink dependency enforcement and make dependencies explicit
  [BEAM-59] Register standard FileSystems wherever we register IOChannelFactories
  [BEAM-1991] Sum.SumDoubleFn => Sum.ofDoubles
  clean up description for sdk_location
  Set the Project of a Table Reference at Runtime
  Only compile HIFIO ITs when compiling with java 8.
  Update assertions of source_test_utils from camelcase to underscore-separated.
  Add no-else return to pylintrc
  Remove getSideInputWindow
  Remove reference to the isStreaming flag
  Javadoc fixups after style guide changes
  Update Dataflow Worker Version
  [BEAM-1922] Close datasource in JdbcIO when possible
  Fix javadoc warnings
  Add javadoc to getCheckpointMark in UnboundedSource
  Removes final minor usages of OldDoFn outside OldDoFn itself
  [BEAM-1915] Removes use of OldDoFn from Apex
  Update Signature of PTransformOverrideFactory
  [BEAM-1964] Fix lint issues and pylint upgrade
  Rename DoFn.Context#sideOutput to output
  [BEAM-1964] Fix lint issues for linter upgrade -3
  [BEAM-1964] Fix lint issues for linter upgrade -2
  Avoi repackaging bigtable classes in dataflow runner.
  ApexRunner: register standard IOs when deserializing pipeline options
  Add PCollections Utilities
  Free PTransform Names if they are being Replaced
  [BEAM-1347] Update protos related to State API for prototyping purposes.
  Update java8 examples pom files to include maven-shade-plugin.
  fix the simplest typo
  [BEAM-1964] Fix lint issues for linter upgrade
  Merge PR#2423: Add Kubernetes scripts for clusters for Performance and Integration tests of Cassandra and ES for Hadoop Input Format IO
  Remove Triggers.java from SDK entirely
  [BEAM-1708] Improve error message when GCP not installed
  Improve gcloud logging message
  [BEAM-1101, BEAM-1068] Remove service account name credential pipeline options
  Update user_score.py
  Pin versions in tox script
  Improve Empty Create Default Coder Error Message
  Represent a Pipeline via a list of Top-level Transforms
  Test all Known Coders to ensure they Serialize via URN
  [BEAM-1950] Add missing 'static' keyword to MicrobatchSource#initReaderCache
  Move Triggers from sdk-core to runners-core-construction
  [BEAM-1222] Chunk size should be FS dependent
  Move HIFIO k8s scripts into shared dir
  Move jdbc's postgres k8s scripts into shared k8s dir
  Move travis/jenkins folders in a test-infra folder
  [BEAM-911] Mark IO APIs as @experimental
  Revert "Revert "Revert "Add ValueProvider class for FileBasedSource I/O Transforms"""
  Revert "Throw specialized exception in value providers"
  Removes FlatMapElements.MissingOutputTypeDescriptor
  Removes MapElements.MissingOutputTypeDescriptor
  [BEAM-1882] Update postgres k8 scripts & add scripts for running local dev test
  [BEAM-115] Update timer/state fields on ParDoPayload to use a map field for consistent tag usage
  Use SdkComponents in WindowingStrategy.toProto
  [BEAM-1722] Move PubsubIO into the google-cloud-platform module
  Triggers: handle missing case
  Clean HFIOWithEmbeddedCassandraTest before Execution
  DataflowRunner: remove dead code
  Throw specialized exception in value providers
  DataflowRunner: send windowing strategy using Runner API proto
  DataflowRunner misc cleanups
  Improve Work Rejection handling
  Remove Orderdness of Input, Output expansions
  Ignore more python build artifacts.
  Fix build breaks caused by overlaps between b615013 and c08b7b1
  Remove Jdk1.8-tests/.toDelete
  Improve HadoopInputFormatIO DisplayData and Cassandra tests
  Add Coder utilities for Proto conversions
  Flip dependency edge between Dataflow runner and IO-GCP
  Move HashingFn to io/common, switch to better hash
  PubsubIO: remove support for BoundedReader
  Bump Dataflow worker to 20170410
  Removes DoFn.ProcessContinuation completely
  Move WindowingStrategies to runners-core-construction
  Fix GroupByKeyInputVisitor for Direct Runner
  Skip query metrics when creating a template
  Upgrade dependencies.
  Add SdkComponents
  Create as custom source
  BEAM-1053 ApexGroupByKeyOperator serialization issues
  enable test_multi_valued_singleton_side_input test
  [BEAM-386] Move UnboundedReadFromBoundedSource to core-construction-java
  BEAM-1390 Update top level README.md to include Apex Runner
  better log message for bigquery temp tables
  [BEAM-1921] Expose connection properties in JdbcIO
  [BEAM-1294] Long running UnboundedSource Readers
  [BEAM-1737] Implement a Single-output ParDo as a Multi-output ParDo with a single output.
  Fix for potentially unclosed streams in ApexYarnLauncher
  TestDataflowRunner: better error handling
  BEAM-1887 Switch Apex ParDo to new DoFn.
  Adds tests for the watermark hold (previously untested)
  Fixes SDF issues re: watermarks and stop/resume
  Clarifies doc of ProcessElement re: HasDefaultTracker
  [BEAM-65] Adds HasDefaultTracker for RestrictionTracker inference
  Cleanup: removes two unused constants
  [BEAM-1823] Improve ValidatesRunner Test Log
  Clean up in textio and tfrecordio
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.