[BEAM-6114] Add isBounded() to BeamRelNode and BeamSqlTable, use for JOIN #7121

Merged
kennknowles merged 1 commit into apache:master from kennknowles:sql-isbounded
Nov 27, 2018

@kennknowles
Member

Adds boundedness information to the SQL/Calcite planner's relational algebra representation and uses it to select a join transform.

This builds on #7118, which separates the lookup join into its own PTransform. In this PR, the non-lookup side-input join and the "standard" join are also made into separate PTransforms.
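
For readers who have not seen the diff, the idea can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in this PR: the nested `IsBounded` enum is a local stand-in for Beam's `PCollection.IsBounded`, and `BoundednessSketch`, `JoinStrategy`, and `selectJoin` are hypothetical names. The selection rule (lookup join when a side is seekable, side-input join when exactly one side is unbounded, standard join otherwise) is an assumption made for illustration.

```java
final class BoundednessSketch {

  /** Local stand-in for Beam's PCollection.IsBounded (hypothetical copy). */
  enum IsBounded {
    BOUNDED,
    UNBOUNDED;

    /** A combination of two inputs is bounded only if both inputs are bounded. */
    IsBounded and(IsBounded that) {
      return (this == BOUNDED && that == BOUNDED) ? BOUNDED : UNBOUNDED;
    }
  }

  /** Hypothetical labels for the three join PTransforms mentioned above. */
  enum JoinStrategy {
    LOOKUP_JOIN,
    SIDE_INPUT_JOIN,
    STANDARD_JOIN
  }

  /**
   * Picks a join strategy from the boundedness of each input.
   *
   * <p>Assumed rule, for illustration only: a seekable side triggers a lookup join; exactly one
   * unbounded side triggers a side-input join; everything else uses the standard join.
   */
  static JoinStrategy selectJoin(IsBounded left, IsBounded right, boolean eitherSideSeekable) {
    if (eitherSideSeekable) {
      return JoinStrategy.LOOKUP_JOIN;
    }
    boolean exactlyOneUnbounded = (left == IsBounded.UNBOUNDED) != (right == IsBounded.UNBOUNDED);
    return exactlyOneUnbounded ? JoinStrategy.SIDE_INPUT_JOIN : JoinStrategy.STANDARD_JOIN;
  }

  public static void main(String[] args) {
    // A bounded table joined with an unbounded stream.
    System.out.println(selectJoin(IsBounded.BOUNDED, IsBounded.UNBOUNDED, false));
    // Boundedness of a node that combines the two inputs.
    System.out.println(IsBounded.BOUNDED.and(IsBounded.UNBOUNDED));
  }
}
```

The point of exposing boundedness on the relational node is that a decision like `selectJoin` can be made from planner-level information, without first expanding the inputs into PCollections.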

@kennknowles
Member Author

R: @apilloud I think you modified the pipeline representation here - did I keep it intact?
R: @akedin and you rejected unsupported joins; this PR should be a no-op for that, so please catch my errors

Contributor

@akedin left a comment

LGTM

POutput buildIOWriter(PCollection<Row> input);

/** Whether this table is bounded (known to be finite) or unbounded (may or may not be finite). */
PCollection.IsBounded isBounded();
Contributor

I think we probably want to ultimately get rid of BeamSqlTable and just use raw PCollections

Member Author

I'm not sure the best way to split up the metadata about a table and its implementation. I believe there might be a different way to organize BeamSqlTable and TableProvider. Good to keep in mind.

@kennknowles merged commit 048471b into apache:master on Nov 27, 2018
@kennknowles deleted the sql-isbounded branch on November 27, 2018 23:23