[FLINK-32760][Connectors/Hive] Reshade parquet in flink-sql-connector-hive #23166
| <relocations> | ||
| <relocation> | ||
| <pattern>org.apache.parquet</pattern> | ||
| <shadedPattern>org.apache.hive.shaded.parquet</shadedPattern> |
Nit: the shading naming convention looks inconsistent with the one in flink-sql-connector-2.3.9.
How about changing this to org.apache.flink.hive.shaded.parquet?
No, I don't think we need to. Otherwise it'll cause some problems.
See https://issues.apache.org/jira/browse/FLINK-23074?focusedCommentId=17374459&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17374459
luoyuxia
left a comment
@dongwoo6kim Thanks for the PR, and sorry for the late reply. I left some comments. PTAL.
| <relocations> | ||
| <relocation> | ||
| <pattern>org.apache.parquet</pattern> | ||
| <shadedPattern>org.apache.hive.shaded.parquet</shadedPattern> |
| </relocation> | ||
| <relocation> | ||
| <pattern>shaded.parquet</pattern> | ||
| <shadedPattern>org.apache.flink.hive.reshaded.parquet</shadedPattern> |
I'm wondering whether this is a good shade pattern here. It's the same as the shade pattern in the connector-hive jar, so it may overwrite the classes in the connector-hive jar.
How about reshading to org.apache.hive.reshaded.parquet, like we did in FLINK-23074?
@luoyuxia Thanks for the reply. I understand that the pattern should be different from the one in the connector-hive jar.
org.apache.hive.reshaded.parquet looks good to me.
However, I have one more question about the part below:
| <shadedPattern>org.apache.flink.hive.shaded.parquet</shadedPattern> |
Don't we also need to change flink-sql-connector-hive-2.3.9 to avoid overwriting classes from connector-hive.jar?
This part seems to have been changed by this PR. I'm wondering whether you were aware of that earlier change?
I wasn't aware of that before. But judging from the comment, it seems we should change it as well.
@luoyuxia I've applied your review. Thanks for the help, PTAL
e01183c to 1719a6f
luoyuxia
left a comment
LGTM. Have you checked that the modification still works? If so, I'll merge.
@luoyuxia I have checked that it still works for current master branch and also
Query:
SET table.sql-dialect = default;
CREATE TEMPORARY TABLE IF NOT EXISTS test_table
(
    id STRING,
    status STRING,
    click_id STRING
) WITH (
    'connector' = 'filesystem',
    'path' = '{hdfs_path}',
    'format' = 'parquet'
);
SELECT id
FROM test_table
GROUP BY id;
Before fixing this issue: [content not preserved]
After fixing this issue: Can get the result
@dongwoo6kim Thanks for verifying. Merging. Also, could you please help backport this to 1.16, 1.17, and 1.18? Thanks.

What is the purpose of the change
To fix thrift dependency conflict when flink-sql-connector-hive and flink-parquet both exist in the path.
Brief change log
Reshade dependencies that are already shaded in parquet. For naming I referred to this
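As a rough illustration of the reshading discussed in the review, the relocation section of the maven-shade-plugin configuration might look like the sketch below. The two pattern pairs are taken from the review thread; the surrounding plugin and execution wrapper is an assumption about a typical shade setup, not a copy of the merged diff.

```xml
<!-- Sketch only: wrapper elements are assumed, relocation values come from the review thread. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <relocations>
          <!-- Parquet's own classes: kept under the Hive shaded namespace. -->
          <relocation>
            <pattern>org.apache.parquet</pattern>
            <shadedPattern>org.apache.hive.shaded.parquet</shadedPattern>
          </relocation>
          <!-- Dependencies that parquet already shades under shaded.parquet
               are moved again, to a namespace that differs from the one
               used in the connector-hive jar. -->
          <relocation>
            <pattern>shaded.parquet</pattern>
            <shadedPattern>org.apache.hive.reshaded.parquet</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

Because the second target namespace differs from the one used by the connector-hive jar, the reshaded classes should not overwrite classes of the same name coming from that jar, which is the concern raised in the review.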
Verifying this change
This change is about dependency without code change, so I believe it should be covered by existing tests
Does this pull request potentially affect one of the following parts:
@Public(Evolving): no
Documentation