Skip to content

Conversation

@wuchong
Copy link
Member

@wuchong wuchong commented May 9, 2020

What is the purpose of the change

Support the LookupTableSource interface in planner.

Brief change log

Update CommonLookupJoin to both handling legacy LookupableTableSource and new LookupTableSource. We didn't copy a LegacyCommonLookupJoin, because there are too many code can be reused.

Verifying this change

  • Make the existing LookupJoinTest and LookupJoinITCase to be Parameterized, and test both on legacy lookup and new lookup interface.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): no
  • The public API, i.e., is any changed class annotated with @Public(Evolving): yes
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): yes
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: no
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? yes
  • If yes, how is the feature documented? not applicable

@flinkbot
Copy link
Collaborator

flinkbot commented May 9, 2020

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
to review your pull request. We will use this comment to track the progress of the review.

Automated Checks

Last check on commit b718cd9 (Sat May 09 03:12:28 UTC 2020)

Warnings:

  • No documentation files were touched! Remember to keep the Flink docs up to date!

Mention the bot in a comment to re-run the automated checks.

Review Progress

  • ❓ 1. The [description] looks good.
  • ❓ 2. There is [consensus] that the contribution should go into to Flink.
  • ❓ 3. Needs [attention] from.
  • ❓ 4. The change fits into the overall [architecture].
  • ❓ 5. Overall code [quality] is good.

Please see the Pull Request Review Guide for a full explanation of the review process.

Details
The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands
The @flinkbot bot supports the following commands:

  • @flinkbot approve description to approve one or more aspects (aspects: description, consensus, architecture and quality)
  • @flinkbot approve all to approve all aspects
  • @flinkbot approve-until architecture to approve everything until architecture
  • @flinkbot attention @username1 [@username2 ..] to require somebody's attention
  • @flinkbot disapprove architecture to remove an approval you gave earlier

@flinkbot
Copy link
Collaborator

flinkbot commented May 9, 2020

CI report:

Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run travis re-run the last Travis build
  • @flinkbot run azure re-run the last Azure build

private final int[][] lookupKeys;

public LookupRuntimeProviderContext(int[][] lookupKeys) {
this.lookupKeys = lookupKeys;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is lookupKeys two-dimensional array.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the Javadoc org.apache.flink.table.connector.source.LookupTableSource.Context#getKeys, it is in order to support nested lookup keys.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But IMO, two-dimensional array is not enough, what if there is a column a.b.c ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can be expressed. If we have a lookup table , and the lookup key is r.f.q2

CREATE TABLE t1 (r ROW< d DOUBLE, f ROW<q1 STRING, q2 BOOLEAN> >);

the array will be [[0, 1, 1]] which is equivalent to [["r", "f", "q2"]].

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The array is a path for selection.


public static final String IDENTIFIER = "values";
public static final AtomicInteger RESOURCE_COUNTER = new AtomicInteger();

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this used for ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is used for checking open and close are both invocked. The RESOURCE_COUNTER will be checked in https://github.com/apache/flink/pull/12047/files#diff-46716d3945346195315abf04aa338171R79

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation.

@wuchong
Copy link
Member Author

wuchong commented May 9, 2020

The build CI is passed except a python test failed which is related to FLINK-17597.

Copy link
Contributor

@danny0405 danny0405 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, I have no other objections now, except that there are many if else clause for the legacy planner, I think they are to be cleaned soon once we upgrade to the new connector totally.

@wuchong
Copy link
Member Author

wuchong commented May 10, 2020

Thanks for the reviewing @danny0405 . Yes, they can be cleaned once legacy source is removed.
Will merge this one.

@wuchong wuchong merged commit 39d70f0 into apache:master May 10, 2020
@wuchong wuchong deleted the lookup-planner branch May 10, 2020 02:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants