[BEAM-7] Initial Dataflow code drop #1

Merged
merged 1,575 commits into apache:master from francesperry:master on Feb 26, 2016

@francesperry
Member

francesperry commented Feb 26, 2016

Initial contribution of the Google Cloud Dataflow Java SDK to Apache Beam.

Caveat: There is still a lot to do before this becomes usable as Apache Beam. In particular:

  • Reorganize directories.
  • Incorporate additional drops by Google, Cloudera, and dataArtisans.
  • Make major backwards incompatible API changes.
  • Rename from Dataflow to Beam.

Beaming with joy ;-D

peihe and others added some commits Jan 14, 2016

Fix javadoc @link warnings
----Release Notes----
[]
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=112105439
Adding "DocInclude" metadata comments to the "game" example
----Release Notes----
[]
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=112118850
Adding worker ID to the upload id logging
----Release Notes----
[]
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=112173826
DataflowAssert: throw when .equals(Object) is called
Users should not need to compare DataflowAssert objects using Java equality.
A call to equals is nearly always a sign of a broken test that will silently fail.

Throw an UnsupportedOperationException instead, and direct users to
isEqualTo (Singleton) or containsInAnyOrder (Iterable).

This change caught a broken test.

----Release Notes----
[]
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=112200184
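
The guard described in this commit can be sketched as follows. This is a minimal illustration, not the SDK's DataflowAssert: the class name SingletonAssertSketch is hypothetical, and throwing from hashCode() as well is a choice made for this sketch rather than something the commit message states.

```java
import java.util.Objects;

/**
 * Hypothetical stand-in for an assertion builder such as DataflowAssert.
 * It is not the SDK class; it only illustrates the equals() guard.
 */
final class SingletonAssertSketch<T> {
    private final T actual;

    SingletonAssertSketch(T actual) {
        this.actual = actual;
    }

    /** The intended comparison: checks the wrapped value, not the builder. */
    void isEqualTo(T expected) {
        if (!Objects.equals(actual, expected)) {
            throw new AssertionError("expected " + expected + " but was " + actual);
        }
    }

    /** Comparing the builder itself is almost always a test bug, so fail loudly. */
    @Override
    public boolean equals(Object other) {
        throw new UnsupportedOperationException(
            "equals(Object) is not supported on assertion builders; use isEqualTo instead");
    }

    /** Assumption of this sketch: keep hashCode() consistent with the equals() guard. */
    @Override
    public int hashCode() {
        throw new UnsupportedOperationException(
            "hashCode() is not supported on assertion builders");
    }
}
```

With this in place, a test that accidentally writes `someAssert.equals(expected)` fails immediately instead of discarding a boolean and passing without checking anything.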
Generalize the 'game' example BigQuery write classes
Generalize the 'game' example BigQuery write classes to take a map that specifies how
to generate the output fields.

----Release Notes----

[]
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=112253306
Use .jar for staged directory packages
Some tools don't support .zip in the class path.

----Release Notes----

[]
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=112261905
DefaultProjectFactory: make it use new gcloud properties files
gcloud moved where it stores the credentials configured on the command line.
Since there is still no support in standard libraries to get the default
project, update DefaultProjectFactory to support the new location.

Note that users who have not upgraded gcloud are still supported.

----Release Notes----
The DataflowPipelineRunner will now prefer the default project configuration
produced by newer versions of the gcloud utility. Users with old gcloud clients
are still supported.
[]
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=112281533
Checkstyle: support disabling specific analyzers
From: http://stackoverflow.com/a/4023351/1715495

----Release Notes----
[]
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=112287922
Add README.md for the "game" example series
Fix Javadoc issue in HourlyTeamScore pipeline.

----Release Notes----
[]
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=112311676
Stops holding initializationStateLock while opening the reader
initializationStateLock should be held for short, bounded amounts of time,
because it is acquired on the dynamic work rebalancing code path
(requestDynamicSplit) which must be effectively non-blocking.
NativeReader.iterator() can do I/O and thus can take an unbounded amount
of time, so it shouldn't be done under the lock.
----Release Notes----
[]
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=112375806
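
The locking pattern this commit describes might look roughly like the sketch below. Only the lock name comes from the commit message; LazyReaderSketch, its Supplier-based opener, and isStarted() are hypothetical stand-ins for the worker's reader and its split-request path.

```java
import java.util.Iterator;
import java.util.function.Supplier;

/** Hypothetical sketch; only the lock name is taken from the commit message. */
final class LazyReaderSketch<T> {
    private final Object initializationStateLock = new Object();
    private final Supplier<Iterator<T>> opener; // may perform slow I/O
    private Iterator<T> iterator;               // guarded by initializationStateLock

    LazyReaderSketch(Supplier<Iterator<T>> opener) {
        this.opener = opener;
    }

    /** Opens the reader without holding the lock, then publishes it under the lock. */
    void start() {
        Iterator<T> opened = opener.get();      // slow part, no lock held
        synchronized (initializationStateLock) {
            iterator = opened;                  // short, bounded critical section
        }
    }

    /** Stand-in for a split request: takes the lock only briefly to inspect state. */
    boolean isStarted() {
        synchronized (initializationStateLock) {
            return iterator != null;
        }
    }
}
```

Because the slow open happens outside the synchronized block, a concurrent caller of isStarted() waits only as long as a field assignment takes, however slow the underlying I/O is.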
Upgrade JaCoCo to 0.7.5
----Release Notes----
[]
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=112415033
Fix typo: wrong table column name
----Release Notes----

[]
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=112466110
Fix typo in OutputTimeFn
----Release Notes----

[]
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=112480742
Implement typeFromId(DatabindContext,String) within CoderUtils
This resolves the user issue on SO:
http://stackoverflow.com/questions/34780459/runtimeexception-from-cloud-dataflow-related-to-serializing-coder
Since Jackson 2.3, TypeIdResolvers were meant to implement
this method since typeFromId(String) became deprecated.
Newer versions of Jackson enforce this.

----Release Notes----

[]
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=112487029
Fixes a bug in custom unbounded readers
Custom unbounded readers are read in bundles of at most
10k elements or 10 seconds. A recent change accidentally removed
the 10k element limit. This change reintroduces it and
adds a test.

The previous test also was passing vacuously because
the iteration limit was incorrect (it would always
have only one iteration).
----Release Notes----
[]
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=112723469
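
A bundle loop with both limits could be sketched as below. The 10,000-element and 10-second figures come from the commit message; BundleReaderSketch, readBundle, and the use of a plain Iterator in place of a custom unbounded reader are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical sketch of a bundle loop bounded by element count and elapsed time. */
final class BundleReaderSketch {
    static final int MAX_ELEMENTS = 10_000;         // limit named in the commit message
    static final long MAX_NANOS = 10_000_000_000L;  // 10 seconds, as in the commit message

    /** Reads from source until it is drained or either limit is reached. */
    static <T> List<T> readBundle(Iterator<T> source, int maxElements, long maxNanos) {
        List<T> bundle = new ArrayList<>();
        long start = System.nanoTime();
        // Both limits are checked on every iteration; dropping either check
        // reproduces the kind of regression the commit describes.
        while (bundle.size() < maxElements
                && System.nanoTime() - start < maxNanos
                && source.hasNext()) {
            bundle.add(source.next());
        }
        return bundle;
    }
}
```

A test for this kind of loop needs a source that yields more elements than the limit, otherwise the count check is never exercised and the test passes vacuously.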
Merge pull request #104 from mrunesson/maven-central
Adapt join-library module to be able to upload to maven-central
BigQueryTableRowIterator: elide columns with null values
As in 6a11a72, this makes BigQueryIO.Read work in the
DirectPipelineRunner as it does in the DataflowPipelineRunner.

----Release Notes----
[]
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=112496161
Rollback "BigQueryTableRowIterator: elide columns with null values"
----Release Notes----
[]
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=112515243
Deterministically choose freshest aggregations in pipeline results
----Release Notes----
[]
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=112529131
Split streaming status pages into servlets
Also updates /heapz so that it downloads the heapdump rather than just
telling you where on the worker it is.

----Release Notes----

[]
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=112535088
Expose dependent realtime watermark via Windmill protos
----Release Notes----
[]
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=112546981
Add ByteStringCoder, a coder for ByteStrings
This is a deterministic coder for ByteString. In the
wholeStream context, it simply writes the string. Otherwise,
it writes the string delimited with its length (encoded as a
VarInt).

----Release Notes----
[]
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=112586805
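
The two encoding contexts can be sketched as follows. This is not the SDK's ByteStringCoder: BytesCoderSketch is a hypothetical class that works on plain byte arrays instead of ByteString, and the length prefix is assumed to be a base-128 varint (7 bits per byte, low bits first, high bit as a continuation flag).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

/** Hypothetical sketch of a byte-string coder with whole-stream and nested contexts. */
final class BytesCoderSketch {
    /** Whole-stream context: raw bytes only. Otherwise: varint length, then bytes. */
    static void encode(byte[] value, ByteArrayOutputStream out, boolean wholeStream) {
        if (!wholeStream) {
            writeVarInt(value.length, out);
        }
        out.write(value, 0, value.length);
    }

    static byte[] decode(ByteArrayInputStream in, boolean wholeStream) {
        if (wholeStream) {
            // The value is everything that remains in the stream.
            byte[] rest = new byte[in.available()];
            if (rest.length > 0) {
                in.read(rest, 0, rest.length);
            }
            return rest;
        }
        int length = readVarInt(in);
        if (length < 0) {
            throw new IllegalStateException("negative length");
        }
        byte[] value = new byte[length];
        int n = length == 0 ? 0 : in.read(value, 0, length);
        if (n != length) {
            throw new IllegalStateException("truncated value");
        }
        return value;
    }

    /** Writes a non-negative int as a base-128 varint. */
    static void writeVarInt(int v, ByteArrayOutputStream out) {
        while ((v & ~0x7F) != 0) {
            out.write((v & 0x7F) | 0x80);
            v >>>= 7;
        }
        out.write(v);
    }

    /** Reads a base-128 varint of at most five bytes. */
    static int readVarInt(ByteArrayInputStream in) {
        int result = 0;
        for (int shift = 0; shift <= 28; shift += 7) {
            int b = in.read();
            if (b < 0) {
                throw new IllegalStateException("truncated varint");
            }
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
        }
        throw new IllegalStateException("varint too long");
    }
}
```

The length prefix is what lets several values share one stream. In the whole-stream context the end of the stream marks the end of the value, so the prefix is unnecessary.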
CustomSources: remove dead code
----Release Notes----
[]
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=112587034
Ignore Eclipse project files in root .gitignore
Users who check out and edit the SDK in Eclipse should
use m2e's Eclipse import wizard, and should not want to
commit their actual project configurations.

----Release Notes----
[]
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=112597945
Version management
Updating version numbers from 1.4.0-SNAPSHOT to 1.5.0-SNAPSHOT

----Release Notes----

[]
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=113022038
@coveralls

coveralls commented Sep 12, 2016

Coverage Status

Changes Unknown when pulling 2efe761 on francesperry:master into * on apache:master*.

@coveralls

coveralls commented Sep 13, 2016

Coverage Status

Changes Unknown when pulling 2efe761 on francesperry:master into * on apache:master*.

cosmoskitten pushed a commit to cosmoskitten/beam that referenced this pull request Sep 22, 2016

charlesccychen pushed a commit to cosmoskitten/beam that referenced this pull request Oct 14, 2016

@sammcveety sammcveety referenced this pull request Nov 29, 2016

Closed

Fix double-close bug #1441

cosmoskitten pushed a commit to cosmoskitten/beam that referenced this pull request Dec 29, 2016

Merge pull request apache#1 from jkff/BEAM-425-ELASTICSEARCH-IO
Various finishing touches on ElasticsearchIO

cosmoskitten pushed a commit to cosmoskitten/beam that referenced this pull request Feb 22, 2017

cosmoskitten pushed a commit to cosmoskitten/beam that referenced this pull request Feb 23, 2017

cosmoskitten pushed a commit to cosmoskitten/beam that referenced this pull request Apr 10, 2017

cosmoskitten pushed a commit to cosmoskitten/beam that referenced this pull request Apr 22, 2017

cosmoskitten pushed a commit to cosmoskitten/beam that referenced this pull request Apr 24, 2017

Merge pull request apache#1 from robertwb/do-fn-sig
Statically bind output processor.

cosmoskitten pushed a commit to cosmoskitten/beam that referenced this pull request Apr 25, 2017

cosmoskitten pushed a commit to cosmoskitten/beam that referenced this pull request May 18, 2017

cosmoskitten pushed a commit to cosmoskitten/beam that referenced this pull request Jun 5, 2017

cosmoskitten pushed a commit to cosmoskitten/beam that referenced this pull request Jun 7, 2017

charlesccychen pushed a commit to cosmoskitten/beam that referenced this pull request Jan 9, 2018

charlesccychen pushed a commit to cosmoskitten/beam that referenced this pull request Jan 24, 2018

lukecwik referenced this pull request in lukecwik/incubator-beam Mar 12, 2018

Merge pull request #1 from kennknowles/hacking-job-server
hack an Impulse thing for the flink example

charlesccychen pushed a commit to cosmoskitten/beam that referenced this pull request Apr 4, 2018

Merge pull request apache#1 from lukecwik/pr5002
[BEAM-3993] Remove duplicate definitions between .gitignore and build.gradle

charlesccychen pushed a commit to cosmoskitten/beam that referenced this pull request Apr 16, 2018

Merge pull request apache#1 from pupamanyu/add_valueprovider
[BEAM-3925] Allow ValueProvider for KafkaIO so that we can create Beam Templates using KafkaIO

charlesccychen pushed a commit to cosmoskitten/beam that referenced this pull request Apr 20, 2018

Merge pull request apache#1 from swegner/pr5180
Merge apache/beam/master into szewi/beam/analysis

mareksimunek referenced this pull request in mareksimunek/beam May 9, 2018

Merge pull request #1 from seznam/pete/readme-fixes
Correct spelling and showcase example

charlesccychen pushed a commit to cosmoskitten/beam that referenced this pull request May 22, 2018

Merge pull request apache#1 from mareksimunek/vasek/package-change-rebase

[BEAM-4294] [BEAM-4360] Join translation and ReduceByKey test suite were moved to org.apache.beam.* package.

charlesccychen pushed a commit to cosmoskitten/beam that referenced this pull request Sep 9, 2018

Merge pull request apache#1 from lukecwik/pr6328
Fix tests expectations and minor code fix up.

charlesccychen pushed a commit to cosmoskitten/beam that referenced this pull request Oct 18, 2018

Merge pull request apache#1 from akedin/timeseries-pr
Initial OrderOutput review

jasonkuster pushed a commit that referenced this pull request Oct 24, 2018

Merge pull request #1 from apache/master
Update forked repository.