[FLINK-13503][API] Add contract in LookupableTableSource
to specify the behavior when lookupKeys contains null.
#9335
Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
Automated Checks
Last check on commit e134e61 (Tue Aug 27 09:12:47 UTC 2019)
Warnings:
Mention the bot in a comment to re-run the automated checks.
Review Progress
Please see the Pull Request Review Guide for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required.
Bot commands
The @flinkbot bot supports the following commands:
If I understand this issue correctly, should it just convert the |
@lincoln-lil The issue aims to specify the behavior when the lookupKeys contains null value. It has no relationship with |
@@ -101,7 +105,7 @@ public JDBCLookupFunction(
this.maxRetryTimes = lookupOptions.getMaxRetryTimes();
this.keySqlTypes = Arrays.stream(keyTypes).mapToInt(JDBCTypeUtil::typeInformationToSqlType).toArray();
this.outputSqlTypes = Arrays.stream(fieldTypes).mapToInt(JDBCTypeUtil::typeInformationToSqlType).toArray();
this.query = options.getDialect().getSelectFromStatement(
this.query = options.getDialect().getSelectNotDistinctFromStatement(
Will this query perform worse than the old one?
Good question. I would add two prepared statements with two query templates: if the arguments do not contain any null value, use the original query template; otherwise use the new one. OK?
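The two-template idea above could be sketched roughly as follows. This is only an illustration: the class LookupQueryChooser, its method names, and the exact SQL text are assumptions made for this sketch, not the statements a Flink dialect actually generates. The null-safe predicate mentions each key twice, which is why the diff later binds every key at two indexes.

```java
import java.util.Arrays;
import java.util.Objects;
import java.util.stream.Collectors;

/** Hypothetical helper (not part of Flink): picks a query template per lookup request. */
class LookupQueryChooser {

    /** Plain equality predicate; it can never match a row whose key column is NULL. */
    static String nullUnsafeQuery(String table, String[] keyNames) {
        String where = Arrays.stream(keyNames)
            .map(k -> k + " = ?")
            .collect(Collectors.joining(" AND "));
        return "SELECT * FROM " + table + " WHERE " + where;
    }

    /** Null-safe predicate (assumed SQL shape); every key placeholder appears twice. */
    static String nullSafeQuery(String table, String[] keyNames) {
        String where = Arrays.stream(keyNames)
            .map(k -> "(" + k + " = ? OR (" + k + " IS NULL AND ? IS NULL))")
            .collect(Collectors.joining(" AND "));
        return "SELECT * FROM " + table + " WHERE " + where;
    }

    /** Returns true if at least one lookup key of this request is null. */
    static boolean containsNull(Object[] keys) {
        return Arrays.stream(keys).anyMatch(Objects::isNull);
    }

    /** Uses the cheaper template unless the request really contains a null key. */
    static String chooseQuery(String table, String[] keyNames, Object[] keys) {
        return containsNull(keys)
            ? nullSafeQuery(table, keyNames)
            : nullUnsafeQuery(table, keyNames);
    }
}
```

In a real lookup function both templates would presumably be prepared once when the function is opened, and only the choice between the two prepared statements would happen per request.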
* @param lookupKeys the chosen field names as lookup keys, it is in the defined order
*
* <p>IMPORTANT:
* If the returned {@link TableFunction} receives a lookup request with null value in lookup keys, expect
A more detailed comment here may be easier to understand:
- lookupKeys are nullable
- each concrete storage engine should handle null value lookups carefully, e.g., in MySQL this can be an is null query
...
OK, I'll try to explain more clearly in the comment
LGTM! Only minor comments here.
...onnectors/flink-jdbc/src/main/java/org/apache/flink/api/java/io/jdbc/JDBCLookupFunction.java
...e/flink-table-common/src/main/java/org/apache/flink/table/sources/LookupableTableSource.java
… the behavior when lookupKeys contains null.
…or to avoid illegalArgument exception when create Get object.
… when there is null value in input data of eval.
Thanks @beyond1920, I left some comments.
* name | STRING
* -----------------
* For the external system which does not support null value (E.g, HBase does not support null value on rowKey),
* it could throw an exception or discard the request when receiving a request with null value on lookup key.
Please never throw an exception. We should discard the request because HBase doesn't have null rowkeys.
* @param lookupKeys the chosen field names as lookup keys, it is in the defined order
*/
TableFunction<T> getLookupFunction(String[] lookupKeys);

/**
* Gets the {@link AsyncTableFunction} which supports async lookup one key at a time.
*
Please update the javadoc of this method too.
Get get;
try {
get = readHelper.createGet(row);
} catch (IllegalArgumentException e) {
We shouldn't use try/catch for this in performance-critical code. We can return early if the length of row is zero.
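A minimal sketch of that early return, under the assumption that a null lookup key is serialized to an empty byte array. The class NullKeyDiscardingLookup and its in-memory map are hypothetical stand-ins for the HBase lookup function and table; no HBase API is used here.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for an HBase-style lookup; the map plays the role of the table. */
class NullKeyDiscardingLookup {
    private static final byte[] EMPTY = new byte[0];

    private final Map<String, String> table = new HashMap<>();

    /** Collects emitted values, standing in for the collector of a table function. */
    final List<String> collected = new ArrayList<>();

    void put(String rowKey, String value) {
        table.put(rowKey, value);
    }

    /** Serializes a row key; a null key becomes an empty byte array (assumption). */
    static byte[] serialize(Object rowKey) {
        return rowKey == null ? EMPTY : rowKey.toString().getBytes(StandardCharsets.UTF_8);
    }

    /** Emits the matching value, or silently discards a request whose key is null. */
    void eval(Object rowKey) {
        byte[] key = serialize(rowKey);
        if (key.length == 0) {
            // No stored row can have a null row key, so nothing can match:
            // discard the request with a cheap length check instead of try/catch.
            return;
        }
        String value = table.get(new String(key, StandardCharsets.UTF_8));
        if (value != null) {
            collected.add(value);
        }
    }
}
```

Note that in this sketch an empty string key is discarded as well, since it also serializes to zero bytes.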
* @return serialize bytes.
*/
public byte[] serialize(Object rowKey) {
byte[] key = HBaseTypeUtils.serializeFromObject(
return HBaseTypeUtils.serializeFromObject(...
rowKeyType,
charset);
Get get = new Get(rowkey);
public Get createGet(byte[] row) {
row -> rowkey
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Mutation;
useless import
@@ -20,6 +20,7 @@

import org.apache.flink.addons.hbase.util.HBaseConfigurationUtil;
import org.apache.flink.addons.hbase.util.HBaseReadWriteHelper;
import org.apache.flink.addons.hbase.util.HBaseTypeUtils;
useless import
JDBCUtils.setField(statement, keySqlTypes[i], keys[i], i);
if (containsNull) {
JDBCUtils.setField(statement, keySqlTypes[i], keys[i], 2 * i);
JDBCUtils.setField(statement, keySqlTypes[i], keys[i], 2 * i + 1);
We can keep this special logic for now, but please open a JIRA to improve it. We can introduce a custom NamedPreparedStatement to pass each field only once. See https://www.javaworld.com/article/2077706/named-parameters-for-preparedstatement.html
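One way such a named-parameter wrapper might work is sketched below. The class NamedParameterQuery and the :name placeholder syntax are assumptions for illustration, not an existing Flink or JDBC API. It rewrites named placeholders to ? and records the 1-based positions of each name, so a caller could bind a field once by name and have the wrapper set it at every recorded position.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: maps :name placeholders to positional JDBC parameters. */
class NamedParameterQuery {
    // Naive pattern: it does not skip string literals or "::" casts.
    private static final Pattern NAMED = Pattern.compile(":([A-Za-z_][A-Za-z0-9_]*)");

    /** SQL text with every named placeholder replaced by '?'. */
    final String sql;

    /** Parameter name -> 1-based positions where the value must be bound. */
    final Map<String, List<Integer>> positions = new HashMap<>();

    NamedParameterQuery(String namedSql) {
        Matcher matcher = NAMED.matcher(namedSql);
        StringBuffer out = new StringBuffer();
        int index = 0;
        while (matcher.find()) {
            index++;
            positions.computeIfAbsent(matcher.group(1), k -> new ArrayList<>()).add(index);
            matcher.appendReplacement(out, "?");
        }
        matcher.appendTail(out);
        this.sql = out.toString();
    }

    /** Returns all positions of a name, or an empty list if the name is unknown. */
    List<Integer> indexesOf(String name) {
        return positions.getOrDefault(name, new ArrayList<>());
    }
}
```

With this, a key that appears twice in a null-safe predicate would still be passed only once by the caller; the wrapper would loop over indexesOf(name) when setting the value on the underlying PreparedStatement.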
@@ -101,7 +113,8 @@ public JDBCLookupFunction(
this.maxRetryTimes = lookupOptions.getMaxRetryTimes();
this.keySqlTypes = Arrays.stream(keyTypes).mapToInt(JDBCTypeUtil::typeInformationToSqlType).toArray();
this.outputSqlTypes = Arrays.stream(fieldTypes).mapToInt(JDBCTypeUtil::typeInformationToSqlType).toArray();
this.query = options.getDialect().getSelectFromStatement(
this.nonNullableQuery = options.getDialect().getSelectFromStatement(options.getTableName(), fieldNames, keyNames);
this.nullableQuery = options.getDialect().getSelectNotDistinctFromStatement(
If we reach a consensus on the new methods of SqlDialect, we can call this field nullSafeQuery and rename nonNullableQuery -> nullUnsafeQuery.
@@ -76,7 +87,8 @@
private final int maxRetryTimes;

private transient Connection dbConn;
private transient PreparedStatement statement;
private transient PreparedStatement fastStatement;
private transient PreparedStatement slowStatement;
If we reach a consensus on the new methods of SqlDialect, we can rename these fields to nullSafeStatement and nullUnsafeStatement. We can also add a comment on the fields saying that nullUnsafeStatement should be used as much as possible because it is faster.
What is the purpose of the change
Add a contract in LookupableTableSource to specify the behavior when lookupKeys contains null, and update the existing connectors to comply with this contract.
Brief change log
Add a contract in LookupableTableSource to specify the behavior when lookupKeys contains null.
Verifying this change
Existing IT.
Does this pull request potentially affect one of the following parts:
@Public(Evolving): yes
Documentation