New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FLINK-12237][hive]Support Hive table stats related operations in HiveCatalog #8636
Conversation
cc @bowenli86 @lirui-apache @xuefuz to review this pr. More test cases are needed to consider. |
Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community Review Progress
Please see the Pull Request Review Guide for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commandsThe @flinkbot bot supports the following commands:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @zjuwangg for your PR. I just roughly went thru it really quickly, and found it's too big and hard to review and question on. Given that stats is a big area, can you break the PR (as well as the JIRA) down into smaller ones? to 1) make it easy to review 2) allow you to add enough test coverage for each new functionality.
I'd recommend to open 1st PR for table stats and all the stats type conversions, 2nd for partition stats, and 3rd for table/partition column stats. 2nd and 3rd PR won't be blocked since you can always base them on the 1st.
...tors/flink-connector-hive/src/main/java/org/apache/flink/table/catalog/hive/HiveCatalog.java
Outdated
Show resolved
Hide resolved
...tors/flink-connector-hive/src/main/java/org/apache/flink/table/catalog/hive/HiveCatalog.java
Outdated
Show resolved
Hide resolved
...tors/flink-connector-hive/src/main/java/org/apache/flink/table/catalog/hive/HiveCatalog.java
Outdated
Show resolved
Hide resolved
...k-connector-hive/src/main/java/org/apache/flink/table/catalog/hive/util/HiveCatalogUtil.java
Outdated
Show resolved
Hide resolved
...k-connector-hive/src/main/java/org/apache/flink/table/catalog/hive/util/HiveCatalogUtil.java
Outdated
Show resolved
Hide resolved
...tors/flink-connector-hive/src/main/java/org/apache/flink/table/catalog/hive/HiveCatalog.java
Outdated
Show resolved
Hide resolved
...tors/flink-connector-hive/src/main/java/org/apache/flink/table/catalog/hive/HiveCatalog.java
Outdated
Show resolved
Hide resolved
...tors/flink-connector-hive/src/main/java/org/apache/flink/table/catalog/hive/HiveCatalog.java
Outdated
Show resolved
Hide resolved
...tors/flink-connector-hive/src/main/java/org/apache/flink/table/catalog/hive/HiveCatalog.java
Outdated
Show resolved
Hide resolved
...tors/flink-connector-hive/src/main/java/org/apache/flink/table/catalog/hive/HiveCatalog.java
Outdated
Show resolved
Hide resolved
...tors/flink-connector-hive/src/main/java/org/apache/flink/table/catalog/hive/HiveCatalog.java
Outdated
Show resolved
Hide resolved
...tors/flink-connector-hive/src/main/java/org/apache/flink/table/catalog/hive/HiveCatalog.java
Outdated
Show resolved
Hide resolved
...tors/flink-connector-hive/src/main/java/org/apache/flink/table/catalog/hive/HiveCatalog.java
Outdated
Show resolved
Hide resolved
...tors/flink-connector-hive/src/main/java/org/apache/flink/table/catalog/hive/HiveCatalog.java
Outdated
Show resolved
Hide resolved
...k-connector-hive/src/main/java/org/apache/flink/table/catalog/hive/util/HiveCatalogUtil.java
Outdated
Show resolved
Hide resolved
...k-connector-hive/src/main/java/org/apache/flink/table/catalog/hive/util/HiveCatalogUtil.java
Outdated
Show resolved
Hide resolved
...k-connector-hive/src/main/java/org/apache/flink/table/catalog/hive/util/HiveCatalogUtil.java
Outdated
Show resolved
Hide resolved
...k-connector-hive/src/main/java/org/apache/flink/table/catalog/hive/util/HiveCatalogUtil.java
Outdated
Show resolved
Hide resolved
...k-table/flink-table-common/src/test/java/org/apache/flink/table/catalog/CatalogTestBase.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the contribution. I left some minor comments.
Thanks for @bowenli86 and @xuefuz detailed review, I'll address comments later and break the PR into smaller pieces. |
ccc @xuefuz again to review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR! I left some comments
One issue I found is that, APIs that alter stats will first convert existing table stats from string to long in needToUpdateStatistics()
and then convert new ones from long to string in updateStatisticsParameters()
. Can we just convert new stats from long to string at the first place, and just compare and update the strings if needed?
...tors/flink-connector-hive/src/main/java/org/apache/flink/table/catalog/hive/HiveCatalog.java
Outdated
Show resolved
Hide resolved
...tors/flink-connector-hive/src/main/java/org/apache/flink/table/catalog/hive/HiveCatalog.java
Outdated
Show resolved
Hide resolved
...tors/flink-connector-hive/src/main/java/org/apache/flink/table/catalog/hive/HiveCatalog.java
Show resolved
Hide resolved
...k-table/flink-table-common/src/test/java/org/apache/flink/table/catalog/CatalogTestBase.java
Outdated
Show resolved
Hide resolved
...tors/flink-connector-hive/src/main/java/org/apache/flink/table/catalog/hive/HiveCatalog.java
Outdated
Show resolved
Hide resolved
...tors/flink-connector-hive/src/main/java/org/apache/flink/table/catalog/hive/HiveCatalog.java
Outdated
Show resolved
Hide resolved
...k-table/flink-table-common/src/test/java/org/apache/flink/table/catalog/CatalogTestBase.java
Outdated
Show resolved
Hide resolved
...k-table/flink-table-common/src/test/java/org/apache/flink/table/catalog/CatalogTestBase.java
Outdated
Show resolved
Hide resolved
...tors/flink-connector-hive/src/main/java/org/apache/flink/table/catalog/hive/HiveCatalog.java
Show resolved
Hide resolved
...tors/flink-connector-hive/src/main/java/org/apache/flink/table/catalog/hive/HiveCatalog.java
Outdated
Show resolved
Hide resolved
@zjuwangg can you rebase the PR? |
1 similar comment
@zjuwangg can you rebase the PR? |
Thanks for your guys' detailed review. @bowenli86 @lirui-apache. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you rebase to the latest master?
...tors/flink-connector-hive/src/main/java/org/apache/flink/table/catalog/hive/HiveCatalog.java
Outdated
Show resolved
Hide resolved
...tors/flink-connector-hive/src/main/java/org/apache/flink/table/catalog/hive/HiveCatalog.java
Outdated
Show resolved
Hide resolved
The last commit as "Merge branch 'master' into FLINK-12237" is not right. Instead of merging master into dev branch in a big chunk, it's better for us to rebase dev branch to latest master. Otherwise the changed files would include whatever you merged from master, not just showing what you modified in this PR. I'd recommend revert that commit, rebase to latest master, and force push this PR. |
rebase and force to push |
@bowenli86 Sorry a little late to update this pr, please review again when you have time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice job! LGTM overall, with just a few minor comments
|
||
@Override | ||
public void alterTable(IMetaStoreClient client, String databaseName, String tableName, Table table) throws InvalidOperationException, MetaException, TException { | ||
// For Hive-1.2.1, we need to tell HMS not to update stats. Otherwise, the stats we put in the table |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
Can you also add comments to explain how Hive-2.3.4 handles it too? Will Hive 2.3.4 just not update any table stats in parameters when client.alter_table is invoked?
flink-table/flink-table-common/src/test/java/org/apache/flink/table/catalog/CatalogTest.java
Show resolved
Hide resolved
flink-table/flink-table-common/src/test/java/org/apache/flink/table/catalog/CatalogTest.java
Show resolved
Hide resolved
LGTM, thanks for the contribution! @flinkbot approve all Merging |
What is the purpose of the change
Brief change log
Verifying this change
This change added tests and can be verified as follows:
Does this pull request potentially affect one of the following parts:
@Public(Evolving)
: (no)Documentation