[BEAM-79] merge gearpump-runner into master #3611
Closed
207 commits
9478f41 [BEAM-79] add Gearpump runner (manuzhang)
02b2248 This closes #323 (kennknowles)
2a0ba61 Merge branch master into gearpump-runner (kennknowles)
1672b54 move integration tests to profile (manuzhang)
276a2e1 add package-info.java (manuzhang)
40be715 Update Gearpump runner version to 0.3.0-incubating (kennknowles)
bc1b354 Rename DoFn to OldDoFn in Gearpump runner (manuzhang)
091a15a This closess #750 (kennknowles)
fb74c93 gearpump: switch to stable version (dhalperi)
bf0a2ed Closes #895 (dhalperi)
0dfb8ff Made byteArrayCoder final static
b9f8263 CompressedSource: CompressedReader is never splittable (dhalperi)
011bea9 Do not add DataDisks to windmill service jobs. (drieber)
1d86335 Remove timeout in DirectRunnerTest (tgroh)
36a9aa2 Improve Write Error Message (tgroh)
d564155 Remove Streaming Write Overrides in DataflowRunner (tgroh)
89921c4 Remove Counter and associated code
7fc2c68 [BEAM-495] Create General Verifier for File Checksum
b47549e Add output checksum to WordCountITOptions
37ce2a3 More unit test and code style fix
046e36e Using IOChannelUtils to resolve file path
58cd781 Added unit tests and error handling in removeTemporaryTables
d99a652 [flink] add missing maven config to example pom (mxm)
39f763e Remove DataflowPipelineJob from examples (peihe)
424c4c4 [BEAM-432] Corrected BigQueryIO javadoc (mariusz89016)
b80d967 Add TransformEvaluatorFactory#cleanup (tgroh)
77c90d0 Replace CloningThreadLocal with DoFnLifecycleManager (tgroh)
d056f46 Add DoFn @Setup and @Teardown (tgroh)
6603307 Move ParDo Lifecycle tests to their own file (tgroh)
d6cf4f2 Exclude ParDoTest from Dataflow @RunnableOnService (dhalperi)
0f1f114 Exclude guava-testlib from shading relocation (swegner)
09cd1b7 ByteKeyRangeTracker: synchronize toString (dhalperi)
bd53cdc Fix repackaging exclude pattern for guava-testlib (swegner)
cc189b4 Rewrites DoFnReflector to go via DoFnSignature (jkff)
da638b6 Replace ParDo with simpler transforms where possible (kennknowles)
235bf3b Set Gcs upload buffer size to 1M in streaming mode in DataflowRunner (peihe)
530b9c0 addressed feedback (peihe)
aa541e7 fix unused imports (peihe)
bbd0e6b DatastoreIO Sink as ParDo (vikkyrk)
245c3ce Change name of result returned by BigQueryIO.Read (fyellin)
16bcf78 Fix NPE in BigQueryIO.TransformingReader (peihe)
178898f Add inEarlyPanesInGlobalWindow as a PAssert Extractor (tgroh)
6c82321 Add TestStream to the Testing package (tgroh)
f37dba8 Implement TestStream in the DirectRunner (tgroh)
dab9efc Incorporate private IP option for Dataflow runner (sammcveety)
a4053ac Datastore Sink support for writing Mutations (vikkyrk)
c996c1e Mark JAXBContext as Volatile in JAXBCoder (tgroh)
4ad78b2 Modify example dependencies to only add runners as optional dependenc… (lukecwik)
5c1b9f1 DatastoreIO v1beta3 to v1 (vikkyrk)
c3c11b1 Remove unused constant in ExecutorServiceParallelExecutor (tgroh)
64a2d51 Remove extra timer firings in WatermarkManager (tgroh)
a60806a FileBasedSink: improve parallelism in GCS copy/remove (dhalperi)
780ffcb [BEAM-574] Remove log when new records have not been read yet (KafkaIO) (iemejia)
730e7b0 Write: Remove invalid import (dhalperi)
67efb17 JUnit tests: add @RunWith annotation (dhalperi)
438d8bd Remove ParDoTest Suppression in Google Cloud Dataflow (tgroh)
67e095d Fix Emission in startBundle/finishBundle in Flink Wrappers (aljoscha)
686a286 [BEAM-253] Unify Flink-Streaming Operator Wrappers (aljoscha)
9eef8a6 Fix Checkstyle Errors in FlinkStreamingTransformTranslators (aljoscha)
9179e93 [BEAM-102] Add Side Inputs in Flink Streaming Runner (aljoscha)
de744c5 Allow DoFn Reuse in ParDoTest.TestDoFnWithContext (aljoscha)
b4a38c3 Don't Suppress Throwable in PAssert in Streaming Mode (aljoscha)
7a2cccd Fix Flink Runner Pom for Batch RunnableOnService tests (aljoscha)
f2a992e Enable Flink Streaming Runner RunnableOnService tests (aljoscha)
a07b29f Fix combine tests with Accumulation Mode (tgroh)
97e093c Use AllPanes as the PaneExtractor in IterableAssert (tgroh)
b7ba1d6 Make ParDoLifecycleTest Serializable to Fix Test with TupleTag (aljoscha)
7012a22 Fix Exception Unwrapping in TestFlinkRunner (aljoscha)
93f7955 Update checkstyle.xml to put all imports in one group
00441f8 Optimize imports
8d32196 BigQueryIO.Write: raise size limit to 11 TiB (dhalperi)
186fe28 Cleanup some javadoc that referring Dataflow (peihe)
433842b Move the samples data to gs://apache-beam-samples/ (peihe)
32928c3 [BEAM-545] Promote JobName to PipelineOptions (peihe)
f05fbe7 Update DoFn javadocs to remove references to OldDoFn and Dataflow (swegner)
4ec73d8 Make WriteTest more resilient to Randomness (tgroh)
b3be7b7 checkstyle: prohibit API client repackaged Guava (dhalperi)
1f8b534 Modified BigtableIO to use DoFn setup/tearDown methods instead of sta…
5b425ac [BEAM-294] Rename dataflow references to beam (jbonofre)
ef312e9 Added support for reporting aggregator values to Spark sinks (staslev)
0fbd9c8 travis.yml: disable updating snapshots (dhalperi)
79491eb Query latest timestamp (vikkyrk)
4023167 [BEAM-589] Fixing IO.Read transformation
bce9aef kinesis: a connector for Amazon Kinesis (przemekpastuszka)
aee5fbf Organize imports in Kinesis (dhalperi)
973081e Fix javadoc in Kinesis (dhalperi)
1c1115e [BEAM-592] Fix SparkRunner Dependency Problem in WordCount
8454d5c DataflowRunner: get PBegin from PInput (dhalperi)
07dd978 [BEAM-313] Provide a context for SparkRunner (amarouni)
435054b Update Dataflow Container Version (tgroh)
cf9ce2f [BEAM-572] Remove Spark Reference in WordCount
a58afd3 Returned KafkaIO getWatermark log line in debug mode (aviemzur)
74d0195 take advantage of setup/teardown for KafkaWriter
00b4e95 Add LeaderBoardTest (tgroh)
8007bdf [BEAM-569] Define maxNumRecords default value to Long.MAX_VALUE in JmsIO (jbonofre)
6ae4b6a Address comments of Flink Side-Input PR (aljoscha)
1524494 Fix condition in FlinkStreamingPipelineTranslator (aljoscha)
798566c Correct some accidental renames
4251761 Test that multiple instances of TestStream are supported (tgroh)
28ad44d Remove empty unused method in TestStreamEvaluatorFactory (tgroh)
6ee7b62 Add Latest CombineFn and PTransforms (swegner)
0312f15 DatastoreIO SplitQueryFn integration test (vikkyrk)
f44fa2c Cloud Datastore naming clean-up (vikkyrk)
9943fd7 Fixed Combine display data
60d8cd9 Delegate populateDipslayData to wrapped combineFn's (swegner)
4bf3a3b Put classes in runners-core package into runners.core namespace (kennknowles)
c92e45d Remove the DataflowRunner instructions from examples (peihe)
3f48566 FluentBackoff: a replacement for a variety of custom backoff implemen… (dhalperi)
9ae5cc7 [BEAM-456] Add MongoDbIO (jbonofre)
5eb44aa [BEAM-242] Enable checkstyle and fix checkstyle errors in Flink runner (jbonofre)
958f3fe BigQuery: limit max job polling time to 1 minute (dhalperi)
c8052b6 Be more accepting in UnboundedReadDeduplicatorTest (tgroh)
8f68085 Remove timeout in JAXBCoderTest (tgroh)
50c1c88 [BEAM-242] Enable and fix checkstyle in Flink runner examples (jbonofre)
b235595 Add header/footer support to TextIO.Write (staslev)
1b420db Revised according to comments following a code review. (staslev)
092a187 Reverted header and footer to be of type String. (staslev)
5084580 Added javadoc to TextIO#withHeader and TextIO#withFooter. (staslev)
e5db1c7 Added even more javadoc to TextIO#withHeader and TextIO#withFooter. (staslev)
34c731f Added even more javadoc to TextIO#withHeader and TextIO#withFooter (2). (staslev)
6cd48c4 !fixup Minor javadoc clean-up (lukecwik)
59ae94c fix import order (manuzhang)
ed7c4aa Closes #943 (dhalperi)
272fe9f [BEAM-79] fix integration-test failure (manuzhang)
8f4334c Closes #956 (dhalperi)
9dc9be9 Merge branch 'master' into gearpump-runner (manuzhang)
8f013cb post-merge fix (manuzhang)
94bd47c remove "pipeline" in runner name (manuzhang)
3f06382 upgrade gearpump-runner to 0.4.0-incubating-SNAPSHOT (manuzhang)
3933b55 Closes #1193 (dhalperi)
45570b9 [BEAM-79] Port Gearpump runner from OldDoFn to new DoFn (manuzhang)
323ec11 This closes #1234 (kennknowles)
0c36228 Merge branch 'master' into gearpump-runner (manuzhang)
2a96a17 [BEAM-79] update GearpumpPipelineResult (manuzhang)
a14927f This closes #1306 (kennknowles)
68363d0 Merge branch 'master' of https://github.com/apache/incubator-beam int… (manuzhang)
2812405 Merge branch 'master' of https://github.com/apache/incubator-beam int… (manuzhang)
86414c0 Merge remote-tracking branch 'upstream/master' into gearpump-runner-sync (manuzhang)
2afc0cd [BEAM-79] fix gearpump runner build failure (manuzhang)
88de0cb This closes #1507 (kennknowles)
46d3563 Upgrade Gearpump version (manuzhang)
85d54ab Add Window.Bound translator (manuzhang)
c37de00 Skip window assignment when windows don't change (manuzhang)
cb8c5e5 Remove print to stdout (manuzhang)
8e0e819 Fix NoOpAggregatorFactory (manuzhang)
b6e7bb6 This closes #1623: [BEAM-1086] Upgrade to latest Gearpump snapshot (kennknowles)
81d94cf Merge branch 'master' of https://github.com/apache/incubator-beam int… (manuzhang)
c2fb7c0 [BEAM-79] Update to latest Gearpump API (manuzhang)
647034c [BEAM-79] Upgrade to beam-0.5.0-incubating-SNAPSHOT (manuzhang)
4c445dd This closes #1663: Merge master (b3de17b) into gearpump-runner (kennknowles)
2155476 [BEAM-1180] Implement GearpumpPipelineResult (manuzhang)
cfdc971 update ROS configurations (manuzhang)
ea633d2 activate ROS on Gearpump by default (manuzhang)
e63d42d fix group by window (manuzhang)
3bf8263 update to latest gearpump dsl function interface (manuzhang)
f6aaf0d support OutputTimeFn (manuzhang)
364a3f0 return encoded key for GroupByKey translation (manuzhang)
b2d326f fix ParDo.BoundMulti translation (manuzhang)
7613ec4 reduce timeout to wait for result (manuzhang)
85dcfbd Remove cache for Gearpump on travis (manuzhang)
d814857 note thread is interrupted on InterruptedException (manuzhang)
1ed16f1 This closes #1661: Implement GearpumpPipelineResult (kennknowles)
4fd216b [BEAM-79] Fix PostCommit test confs for Gearpump runner (manuzhang)
4001aeb This closes #1828: Fix PostCommit test confs for Gearpump runner (kennknowles)
7af6472 [BEAM-79] Support merging windows in GearpumpRunner (manuzhang)
2d0aed9 This closes #1935: Support merging windows in GearpumpRunner (kennknowles)
4eb50d1 [BEAM-79] Add SideInput support for GearpumpRunner (manuzhang)
3dc8fc8 enable ParDoTest (manuzhang)
15a8ad6 This closes #2150: Add SideInput support for GearpumpRunner (kennknowles)
3f91798 Merge branch 'master' of https://github.com/apache/incubator-beam int… (manuzhang)
3eab6a6 [BEAM-79] Fix gearpump-runner merge conflicts and test failure (manuzhang)
555842a This closes #2241: merge master to gearpump-master and fixup (kennknowles)
eb0d333 [BEAM-972] Add unit tests to Gearpump runner (huafengw)
f4f2333 This closes #2302: Add unit tests to Gearpump runner (kennknowles)
f3138dd [BEAM-972] Add more unit test to Gearpump runner (huafengw)
ebbb613 This closes #2521: Add more unit test to Gearpump runner (kennknowles)
46c41fc Merge branch 'master' of https://github.com/apache/incubator-beam int… (manuzhang)
44d21ac Update gearpump-runner against master changes. (manuzhang)
4078c22 This closes #2610: Merge master into gearpump-runner branch (kennknowles)
9a59ea3 Merge remote-tracking branch 'upstream/master' into gearpump-runner (manuzhang)
12b9719 Update gearpump-runner against master changes (manuzhang)
58546ac This closes #2888: Merge master into gearpump-runner branch (kennknowles)
bc8da29 Merge branch 'master' of https://github.com/apache/beam into sync-master (manuzhang)
6c06967 Update gearpump-runner against master changes (manuzhang)
99221e7 This closes #3172: Sync gearpump-runner with master (kennknowles)
9e6c906 Merge branch 'master' of https://github.com/apache/beam into sync-master (manuzhang)
c9aac96 Update against master changes (manuzhang)
3c7e3e6 Activate Gearpump local-validates-runner-tests in precommit (kennknowles)
98854d4 Respect WindowFn#getOutputTime in gearpump-runner (manuzhang)
7653e7e Fix side input handling in DoFnFunction (manuzhang)
559e3c3 This closes #3292: Merge master into gearpump-runner (kennknowles)
fed98c8 Merge branch 'master' of https://github.com/apache/beam into sync-master (manuzhang)
f61822d upgrade to gearpump 0.8.4-SNAPSHOT (manuzhang)
a7b5d98 Fix PCollectionView translation (manuzhang)
11caa97 Fix kryo exception (manuzhang)
b21fa04 Remove unused codes (manuzhang)
99f4f8b This closes #3388: Sync gearpump-runner branch with master (kennknowles)
f158257 Merge branch 'master' of https://github.com/apache/beam into sync-master (manuzhang)
627ae0b This closes #3479: [BEAM-79] Merge master into gearpump-runner branch (kennknowles)
c2d3fbc Merge branch 'master' of https://github.com/apache/beam into sync-master (manuzhang)
2206827 Upgrade to gearpump 0.8.4 (manuzhang)
725f547 Fix ParDoTest#testPipelineOptionsParameter (manuzhang)
1ce60b4 This closes #3515: Sync gearpump-runner with master and upgrade to ge… (kennknowles)
e655f53 Revert accidental changes to sdks/java/pom.xml (manuzhang)
daa7566 Upgrade BEAM version to 2.2.0-SNAPSHOT in gearpump-runner (manuzhang)
49d4ed5 Add beam-runners-gearpump dependency to javadoc (manuzhang)
b0ed584 Deactivate integration-tests for gearpump-runner by default (manuzhang)
@@ -0,0 +1,61 @@
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->

## Gearpump Beam Runner

The Gearpump Beam runner allows users to execute pipelines written using the Apache Beam programming API with Apache Gearpump (incubating) as an execution engine.

##Getting Started

The following shows how to run the WordCount example that is provided with the source code on Beam.

###Installing Beam

To get the latest version of Beam with Gearpump-Runner, first clone the Beam repository:

```
git clone https://github.com/apache/beam
git checkout gearpump-runner
```

Then switch to the newly created directory and run Maven to build the Apache Beam:

```
cd beam
mvn clean install -DskipTests
```

Now Apache Beam and the Gearpump Runner are installed in your local Maven repository.

###Running Wordcount Example

Download something to count:

```
curl http://www.gutenberg.org/cache/epub/1128/pg1128.txt > /tmp/kinglear.txt
```

Run the pipeline, using the Gearpump runner:

```
cd examples/java
mvn exec:java -Dexec.mainClass=org.apache.beam.examples.WordCount -Dexec.args="--inputFile=/tmp/kinglear.txt --output=/tmp/wordcounts.txt --runner=TestGearpumpRunner" -Pgearpump-runner
```

Once completed, check the output file /tmp/wordcounts.txt-00000-of-00001
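
A note on the README steps above, sketched as one script. In the README, `git checkout gearpump-runner` appears before `cd beam`, which would fail when run from outside a repository, so this sketch moves `cd beam` ahead of the checkout. It is not part of the PR: the `run` helper and the `DRY_RUN` variable are hypothetical names introduced here so the sequence can be printed without touching the network, and `curl -o` is swapped in for the README's `>` redirect. It assumes the `gearpump-runner` branch and the `gearpump-runner` Maven profile exist as the README describes.

```shell
#!/bin/sh
# Sketch only: the README's commands in order, with `cd beam` before the checkout.
set -eu

# run (hypothetical helper): print the command when DRY_RUN=1 (the default here),
# otherwise execute it in the current shell so that `cd` takes effect.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Install Beam with the Gearpump runner into the local Maven repository.
run git clone https://github.com/apache/beam
run cd beam
run git checkout gearpump-runner
run mvn clean install -DskipTests

# Download something to count (-o replaces the README's shell redirect).
run curl -o /tmp/kinglear.txt http://www.gutenberg.org/cache/epub/1128/pg1128.txt

# Run WordCount with the Gearpump runner, from examples/java inside the clone.
run cd examples/java
run mvn exec:java -Dexec.mainClass=org.apache.beam.examples.WordCount \
  "-Dexec.args=--inputFile=/tmp/kinglear.txt --output=/tmp/wordcounts.txt --runner=TestGearpumpRunner" \
  -Pgearpump-runner
```

Running it as-is only prints the commands; setting `DRY_RUN=0` would execute them.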
Take a look in this file for the `jenkins-precommit` bits. You can add an execution for the Gearpump runner.