[HUDI-1662] Failed to query real-time view use hive/spark-sql when hudi mor table contains dateType #2634

Merged
merged 1 commit into from Mar 8, 2021

xiarixiaoyao
Contributor

Tips

What is the purpose of the pull request

Fixes a bug where querying the real-time view through Hive or Spark SQL fails when a Hudi MOR table contains a DateType column.
Steps to reproduce:
Step 1: prepare a raw DataFrame with a DateType column and insert it into the Hudi MOR table

df_raw.withColumn("date", lit(Date.valueOf("2020-11-10")))

merge(df_raw, "bulk_insert", "huditest.bulkinsert_mor_10g")

Step 2: prepare an update DataFrame with a DateType column and upsert it into the Hudi MOR table

val df_update = sql("select * from huditest.bulkinsert_mor_10g_rt").withColumn("date", lit(Date.valueOf("2020-11-11")))

merge(df_update, "upsert", "huditest.bulkinsert_mor_10g")

Step 3: query the MOR _rt table with Hive beeline or spark-sql

In beeline or spark-sql, execute the statement: select * from huditest.bulkinsert_mor_10g_rt where primary_key = 10000000;

The following error then occurs:

java.lang.ClassCastException: org.apache.hadoop.io.IntWritable cannot be cast to org.apache.hadoop.hive.serde2.io.DateWritableV2
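
The failure mode can be shown in miniature without Hive or Hadoop on the classpath. The sketch below is only an illustration: `IntValue` and `DateValue` are hypothetical stand-ins for `IntWritable` and `DateWritableV2`, and `inspectAsDate` stands in for whatever consumer trusts the table schema and casts the value it receives. It is not the actual Hive or Hudi code path.

```java
import java.time.LocalDate;

public class DateCastFailureSketch {

    /** Hypothetical stand-in for Hadoop's IntWritable. */
    static final class IntValue {
        final int value;
        IntValue(int value) { this.value = value; }
    }

    /** Hypothetical stand-in for Hive's date writable; holds days since 1970-01-01. */
    static final class DateValue {
        final int daysSinceEpoch;
        DateValue(int daysSinceEpoch) { this.daysSinceEpoch = daysSinceEpoch; }
    }

    /** Mimics a reader that ignores the logical type and wraps every int as IntValue. */
    static Object readIgnoringLogicalType(int rawInt) {
        return new IntValue(rawInt);
    }

    /** Mimics a consumer that expects a date column and casts unconditionally. */
    static LocalDate inspectAsDate(Object writable) {
        DateValue date = (DateValue) writable; // throws ClassCastException for IntValue
        return LocalDate.ofEpochDay(date.daysSinceEpoch);
    }

    /** Returns true when the read-then-inspect round trip fails with a ClassCastException. */
    static boolean failsWithClassCast(int rawInt) {
        try {
            inspectAsDate(readIgnoringLogicalType(rawInt));
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // The date from step 2, encoded as days since the Unix epoch.
        int raw = (int) LocalDate.of(2020, 11, 11).toEpochDay();
        System.out.println("cast failed: " + failsWithClassCast(raw));
    }
}
```

The point of the sketch is that the stored value itself is fine; only the wrapper type handed to the consumer is wrong.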

Brief change log

When Hudi reads a log file and converts an Avro INT record to a Writable, the logicalType is not respected, so a date value is converted to an IntWritable. The conversion from Avro INT to Writable must therefore take the logicalType into account.
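
A minimal sketch of a logical-type-aware conversion, under stated assumptions: the Avro specification defines the `date` logical type as an `int` counting days since 1970-01-01, so the converter can branch on the logical type name. The class names `IntValue` and `DateValue` and the method `convertInt` are hypothetical stand-ins chosen so the example runs with the JDK alone; the real change in this PR operates on Avro schemas and Hive writables and may differ in detail.

```java
import java.time.LocalDate;

public class DateLogicalTypeSketch {

    /** Hypothetical stand-in for Hadoop's IntWritable. */
    static final class IntValue {
        final int value;
        IntValue(int value) { this.value = value; }
    }

    /** Hypothetical stand-in for Hive's date writable; holds days since 1970-01-01. */
    static final class DateValue {
        final int daysSinceEpoch;
        DateValue(int daysSinceEpoch) { this.daysSinceEpoch = daysSinceEpoch; }
        LocalDate toLocalDate() { return LocalDate.ofEpochDay(daysSinceEpoch); }
    }

    /**
     * Converts an Avro INT value to a wrapper object.
     * logicalTypeName is the name of the field's logical type, or null if it has none.
     */
    static Object convertInt(int value, String logicalTypeName) {
        if ("date".equals(logicalTypeName)) {
            return new DateValue(value); // respect the logical type
        }
        return new IntValue(value);      // plain integer column
    }

    public static void main(String[] args) {
        // The date from step 1, encoded the way Avro stores a date.
        int stored = (int) LocalDate.of(2020, 11, 10).toEpochDay();

        Object plain = convertInt(stored, null);
        Object date = convertInt(stored, "date");

        System.out.println(plain.getClass().getSimpleName());
        System.out.println(date.getClass().getSimpleName()
                + " " + ((DateValue) date).toLocalDate());
    }
}
```

With this branching in place, a consumer that expects a date wrapper for a date column receives one, while ordinary integer columns are unaffected.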

Verify this pull request

Covered by existing unit tests.

Committer checklist

  • Has a corresponding JIRA in PR title & commit

  • Commit message is descriptive of the change

  • CI is green

  • Necessary doc changes done or have another open PR

  • For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

@xiarixiaoyao
Contributor Author

xiarixiaoyao commented Mar 5, 2021

cc @garyli1019, could you take a look?

@garyli1019 garyli1019 self-assigned this Mar 5, 2021
Member

@garyli1019 garyli1019 left a comment

@xiarixiaoyao thanks for your contribution, LGTM.
Looks like the CI is not triggered, would you try to force push again?

@xiarixiaoyao
Contributor Author

Thank you @garyli1019, but I don't have permission to trigger CI. Could you help?
BTW, could you give me contributor permission? Thank you.

@garyli1019
Member

@xiarixiaoyao your force push already triggered the CI. Do you mean JIRA contributor access? If so, would you send an email to the dev mailing list with your JIRA ID? That's how we usually grant access to new contributors. Thanks

@codecov-io

codecov-io commented Mar 5, 2021

Codecov Report

Merging #2634 (f57ce9c) into master (899ae70) will increase coverage by 17.90%.
The diff coverage is n/a.

@@              Coverage Diff              @@
##             master    #2634       +/-   ##
=============================================
+ Coverage     51.58%   69.48%   +17.90%     
+ Complexity     3285      363     -2922     
=============================================
  Files           446       53      -393     
  Lines         20409     1963    -18446     
  Branches       2116      235     -1881     
=============================================
- Hits          10528     1364     -9164     
+ Misses         9003      465     -8538     
+ Partials        878      134      -744     
Flag                  Coverage Δ            Complexity Δ
hudicli               ?                     ?
hudiclient            ?                     ?
hudicommon            ?                     ?
hudiflink             ?                     ?
hudihadoopmr          ?                     ?
hudisparkdatasource   ?                     ?
hudisync              ?                     ?
huditimelineservice   ?                     ?
hudiutilities         69.48% <ø> (+0.04%)   0.00 <ø> (ø)

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ Complexity Δ
...apache/hudi/utilities/deltastreamer/DeltaSync.java 70.00% <0.00%> (-0.72%) 52.00% <0.00%> (ø%)
...s/deltastreamer/HoodieMultiTableDeltaStreamer.java 78.39% <0.00%> (ø) 18.00% <0.00%> (ø%)
...src/main/java/org/apache/hudi/cli/TableHeader.java
.../common/table/view/RocksDbBasedFileSystemView.java
...penJ9MemoryLayoutSpecification64bitCompressed.java
.../apache/hudi/common/config/SerializableSchema.java
...pache/hudi/hadoop/HoodieColumnProjectionUtils.java
...che/hudi/operator/partitioner/BucketAssigners.java
...pache/hudi/io/storage/HoodieFileReaderFactory.java
...rg/apache/hudi/cli/commands/CompactionCommand.java
... and 386 more

@xiarixiaoyao
Contributor Author

cc @garyli1019. Sorry for the late reply. CI is passing now; could you check and merge?

@garyli1019 garyli1019 merged commit 0207323 into apache:master Mar 8, 2021
@xiarixiaoyao xiarixiaoyao deleted the date branch December 3, 2021 02:51