Conversation
We have assertions that the files are still big enough, so this is safe. Speeds up TextIOTest 2x.
…tions through as PipelineOptions
…tions through as PipelineOptions This closes #2762
…ampCombiner enum" Update Dataflow worker container to 20170428-2 Rollforwards "Replace OutputTimeFn UDF with TimestampCombiner enum""
StandardCoder has improper connotations - mainly, "Standard" as in "Standardized" as opposed to "Standard" as in "normal". StructuredCoder communicates the important part of the class, which is that the coder has some meaningful structure, and that structure can be used by a runner.
Update Dataflow Worker Version
Known Coders will be serialized via a known path. Unknown Coders will be serialized as a java object.
Removes uses of Coder.toCloudObject
Move inner classes of the DirectRunner to reduce total API Surface.
This is particularly important for fallback coders that claim to provide a coder for Object (or equivalently an unconstrained type parameter). See BEAM-1642.
String.class was being encoded with both StringUtf8Coder.of() and NullableCoder.of(UserStringCoder.of()) in the same transform, and the wrong one was being chosen.
…es at pipeline construction time Remove job name usages from BigQueryIO at pipeline construction time
Author
Retest this please
Author
R: @davorbonaci
This converts FileBasedSink from IOChannelFactory to FileSystems, with fallout changes on all existing Transforms that use WriteFiles. We preserve the existing semantics of most transforms, simply adding the ability for users to provide ResourceId in addition to String when setting the outputPrefix.
Other changes:
* Rethink FilenamePolicy as a function from ResourceId (base directory) to ResourceId (output file), moving the base directory into the context. This way, FilenamePolicy logic is truly independent from the base directory. Using ResourceId#resolve, a filename policy can add multiple path components, say, base/YYYY/MM/DD/file.txt, in a filesystem-independent way. (Also add an optional extension parameter to the function, enabling an owning transform to pass in the suffix from a separately configured compression factory or similar.)
* Make DefaultFilenamePolicy its own top-level class and move IOChannelUtils#constructName into it. This is the default FilenamePolicy used by FileBasedSink.
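The idea of a filename policy as a function from a base directory to an output file can be sketched roughly as follows. This is an illustration only, not Beam's API: it uses java.nio.file.Path as a stand-in for ResourceId, and the FilenamePolicySketch class, the nested FilenamePolicy interface, its filenameFor signature, and the datedPolicy helper are all hypothetical names chosen for this example.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class FilenamePolicySketch {
    /**
     * Hypothetical stand-in for a filename policy: the base directory comes in
     * as an argument, so the policy itself never hard-codes it.
     */
    interface FilenamePolicy {
        Path filenameFor(Path baseDirectory, int shard, int numShards, String extension);
    }

    /**
     * Builds base/YYYY/MM/DD/file-SSSSS-of-NNNNN plus the extension, adding one
     * path component per resolve call (Path#resolve stands in for ResourceId#resolve).
     */
    static FilenamePolicy datedPolicy(int year, int month, int day) {
        return (base, shard, numShards, extension) -> base
            .resolve(String.format("%04d", year))
            .resolve(String.format("%02d", month))
            .resolve(String.format("%02d", day))
            .resolve(String.format("file-%05d-of-%05d%s", shard, numShards, extension));
    }

    public static void main(String[] args) {
        FilenamePolicy policy = datedPolicy(2017, 4, 28);
        // The extension is supplied by the caller, e.g. from a compression setting.
        Path out = policy.filenameFor(Paths.get("base"), 0, 3, ".txt");
        System.out.println(out);
    }
}
```

Because the policy only ever calls resolve on the directory it is given, the same policy can be reused with any base directory, and the path separator is left to the underlying filesystem abstraction.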
This has exactly one implementation, and this is not expected to change.
…construction time
To fix the unit test failure
org.apache.beam.examples.WordCountIT.testE2EWordCount