[FLINK-1919] Add HCatOutputFormat #1064
Conversation
Adds the Java API and Scala API, fixes a Scala `HCatInputFormat` bug for complex types, and pulls in the Cloudera HCatalog jar for the end-to-end test.
flink-staging/flink-hcatalog/pom.xml
I'm not an HCatalog expert, but I'm not sure that this third-party repository is needed.
We should not depend on vendor-specific repositories/versions in the normal builds.
In the parent pom, there is a profile to enable vendor repositories.
Hi @jamescao, thanks for your pull request! Regarding the version of the HCatalog release, it would be better to use the vanilla release.
@chiwanpark @rmetzger
I need to work offline to debug the Travis builds, so I'm closing the PR for now. Thanks for all your time and comments! I will reopen once all the tests are fixed.
@jamescao: It seems that you also wrote tests for the HCatInputFormat, right? Is it possible to split the PR into an OutputFormat part and open a separate PR for the HCatInputFormat tests? I'm still working on FLINK-2167 and require an HCatalog testing infrastructure; otherwise I have to write it on my own. Anyway, I wonder why all HCat I/O format classes have no tests so far...
@twalthr: sorry, I missed your message; this PR is reopened in
[FLINK-1919]
Add `HCatOutputFormat` for Tuple data types for the Java and Scala APIs, and fix a bug in the Scala API's `HCatInputFormat` for Hive complex types.

The Java API includes a check of whether the schema of the HCatalog table and the Flink tuples match if the user provides a `TypeInformation` in the constructor. For data types other than tuples, the OutputFormat requires a preceding Map function that converts them to `HCatRecord`s. The Scala API likewise checks whether the schema of the HCatalog table and the Scala tuples match. For data types other than Scala `Tuple`, the OutputFormat requires a preceding Map function that converts them to `HCatRecord`s. The Scala API requires the user to import `org.apache.flink.api.scala._` so that the type can be captured by the Scala macro.

The HCatalog jar in Maven Central is compiled against Hadoop 1, which is not compatible with the Hive jars used for testing, so a Cloudera HCatalog jar is pulled into the pom for testing purposes. It can be removed if not required.
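The schema check described above can be sketched as follows. This is a minimal, self-contained illustration of the idea only; the class and method names are hypothetical, not the actual Flink or HCatalog API, and a real check would compare `TypeInformation` against the table's `HCatSchema` rather than raw classes.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified sketch of a tuple-vs-table schema check:
// each HCatalog column is represented here by the Java class expected
// for the corresponding tuple field.
public class SchemaCheckSketch {

    static void checkSchema(List<Class<?>> tableSchema, Object... tupleFields) {
        if (tableSchema.size() != tupleFields.length) {
            throw new IllegalArgumentException("Field count mismatch: table has "
                    + tableSchema.size() + " columns, tuple has " + tupleFields.length);
        }
        for (int i = 0; i < tupleFields.length; i++) {
            if (!tableSchema.get(i).isInstance(tupleFields[i])) {
                throw new IllegalArgumentException("Field " + i + " expected "
                        + tableSchema.get(i).getName() + " but got "
                        + tupleFields[i].getClass().getName());
            }
        }
    }

    public static void main(String[] args) {
        // Table with a string column and an int column.
        List<Class<?>> schema = Arrays.asList(String.class, Integer.class);

        checkSchema(schema, "alice", 42); // matches, no exception

        try {
            checkSchema(schema, "bob", "not-an-int"); // second field has the wrong type
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        System.out.println("done");
    }
}
```

Failing fast like this at construction time surfaces a schema mismatch before any records are written to the table.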
Java `List` and `Map` cannot be cast to Scala `List` and `Map`, so `JavaConverters` is used to fix a bug in the `HCatInputFormat` Scala API.
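The `JavaConverters` fix works by wrapping a Java collection behind the Scala collection interface rather than casting it, since the two hierarchies are unrelated. A minimal Java sketch of the same wrap-vs-cast idea, using a hypothetical `MySeq` interface as a stand-in for Scala's `Seq`:

```java
import java.util.Arrays;
import java.util.List;

// MySeq is a hypothetical stand-in for scala.collection.Seq: an unrelated
// interface that java.util.List does not implement.
interface MySeq<T> {
    T apply(int i);
    int length();
}

public class WrapVsCast {

    // JavaConverters-style adaptation: wrap the Java list behind the
    // target interface instead of casting it.
    static <T> MySeq<T> asMySeq(List<T> javaList) {
        return new MySeq<T>() {
            public T apply(int i) { return javaList.get(i); }
            public int length() { return javaList.size(); }
        };
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("a", "b");

        // A direct cast would fail at runtime with ClassCastException,
        // because ArrayList is not a MySeq:
        //   MySeq<String> bad = (MySeq<String>) (Object) names;

        MySeq<String> seq = asMySeq(names); // wrapping works
        System.out.println(seq.length() + ":" + seq.apply(0)); // prints 2:a
    }
}
```

In the actual fix, `scala.collection.JavaConverters` provides exactly this kind of wrapper (`asScala`), so the Hive-produced `java.util.List`/`java.util.Map` values can be exposed through the Scala collection API without copying.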