Forward integrate from master to DSL_SQL #2878

Merged
asfgit merged 547 commits into DSL_SQL from master on May 4, 2017

mingmxu commented May 4, 2017

To fix the unit test failure org.apache.beam.examples.WordCountIT.testE2EWordCount

dhalperi and others added 30 commits April 28, 2017 11:44
We have assertions that the files are still big enough, so this is safe. Speeds up TextIOTest 2x.
…ampCombiner enum"

  Update Dataflow worker container to 20170428-2
  Rollforwards "Replace OutputTimeFn UDF with TimestampCombiner enum"
StandardCoder has improper connotations - mainly, "Standard" as in
"Standardized" as opposed to "Standard" as in "normal". StructuredCoder
communicates the important part of the class, which is that the coder
has some meaningful structure, and that structure can be used by a
runner.

Update Dataflow Worker Version
Known Coders will be serialized via a known path. Unknown
Coders will be serialized as a Java object.
Removes uses of Coder.toCloudObject
tgroh and others added 13 commits May 3, 2017 14:12
Move inner classes of the DirectRunner to reduce total API Surface.
This is particularly important for fallback coders that claim
to provide a coder for Object (or equivalently an unconstrained
type parameter).  See BEAM-1642.
String.class was being encoded with both StringUtf8Coder.of() and
NullableCoder.of(UserStringCoder.of()) in the same transform,
and the wrong one was being chosen.
…es at pipeline construction time

  Remove job name usages from BigQueryIO at pipeline construction time

mingmxu commented May 4, 2017

Retest this please


mingmxu commented May 4, 2017

R: @davorbonaci

dhalperi and others added 13 commits May 3, 2017 17:44
This converts FileBasedSink from IOChannelFactory to FileSystems, with
fallout changes on all existing Transforms that use WriteFiles.

We preserve the existing semantics of most transforms, simply adding the
ability for users to provide ResourceId in addition to String when
setting the outputPrefix.

Other changes:

* Rethink FilenamePolicy as a function from ResourceId (base directory)
  to ResourceId (output file), moving the base directory into the
  context. This way, FilenamePolicy logic is truly independent from the
  base directory. Using ResourceId#resolve, a filename policy can add
  multiple path components, say, base/YYYY/MM/DD/file.txt, in a
  filesystem-independent way.

  (Also add an optional extension parameter to the function, enabling an
  owning transform to pass in the suffix from a separately-configured
  compression factory or similar.)

* Make DefaultFilenamePolicy its own top-level class and move
  IOChannelUtils#constructName into it. This is the default FilenamePolicy
  used by FileBasedSink.
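
The idea of a filename policy as a function from a base directory to an output file can be sketched roughly as below. This is a minimal, self-contained illustration and not the Beam API: `Resource` is a hypothetical stand-in for `ResourceId`, the `FilenamePolicy` interface and its `filenameFor` signature are assumptions made for the example, and the `gs://bucket/output` location and `file-SSSSS-of-NNNNN` shard naming are purely illustrative.

```java
import java.time.LocalDate;
import java.util.Locale;

public class FilenamePolicySketch {

    /** Hypothetical stand-in for ResourceId: a '/'-separated location. */
    static final class Resource {
        private final String path;
        private final boolean directory;

        Resource(String path, boolean directory) {
            // Directories always carry a trailing '/', so resolving is plain concatenation.
            this.path = (directory && !path.endsWith("/")) ? path + "/" : path;
            this.directory = directory;
        }

        /** Appends one path component, loosely modeled on the ResourceId#resolve idea. */
        Resource resolve(String component, boolean asDirectory) {
            if (!directory) {
                throw new IllegalStateException("Can only resolve against a directory: " + path);
            }
            return new Resource(path + component, asDirectory);
        }

        @Override
        public String toString() {
            return path;
        }
    }

    /** Hypothetical policy: the base directory comes from the context, the output file is returned. */
    interface FilenamePolicy {
        Resource filenameFor(Resource baseDirectory, int shard, int numShards, String extension);
    }

    /** Builds base/YYYY/MM/DD/file-SSSSS-of-NNNNN<extension> one component at a time. */
    static FilenamePolicy datedPolicy(LocalDate date) {
        return (base, shard, numShards, extension) -> base
            .resolve(String.format(Locale.ROOT, "%04d", date.getYear()), true)
            .resolve(String.format(Locale.ROOT, "%02d", date.getMonthValue()), true)
            .resolve(String.format(Locale.ROOT, "%02d", date.getDayOfMonth()), true)
            .resolve(
                String.format(Locale.ROOT, "file-%05d-of-%05d%s", shard, numShards, extension),
                false);
    }

    public static void main(String[] args) {
        FilenamePolicy policy = datedPolicy(LocalDate.of(2017, 5, 4));
        Resource base = new Resource("gs://bucket/output", true);
        // prints gs://bucket/output/2017/05/04/file-00000-of-00003.txt
        System.out.println(policy.filenameFor(base, 0, 3, ".txt"));
    }
}
```

Because the policy never inspects or concatenates the base directory string itself, the same `datedPolicy` could be reused against any base location; only the `Resource` implementation needs to know how components are joined.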
This has exactly one implementation, and this is not expected to change.
@asfgit asfgit merged commit ff6bb35 into DSL_SQL May 4, 2017
asfgit pushed a commit that referenced this pull request May 4, 2017