[SUPPORT] when using bootstrap partitioned table, partition column return null when select table #6517

Closed
peanut-chenzhong opened this issue Aug 27, 2022 · 5 comments
Labels
priority:critical production down; pipelines stalled; Need help asap. writer-core Issues relating to core transactions/write actions

Comments

@peanut-chenzhong
Contributor

To Reproduce

Steps to reproduce the behavior:

1. Create a bootstrap partitioned table.
2. Run a SELECT on the table; the partition column is returned as null.
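
A minimal sketch of these two steps in spark-shell, not the exact commands used for this report. The paths, the table name, and the column names `key` and `partition` are hypothetical placeholders; the option keys are the standard Hudi datasource and bootstrap configs as string keys, and the Hudi Spark bundle is assumed to be on the classpath.

```scala
import org.apache.spark.sql.SaveMode

// Hypothetical locations; replace with real paths.
val srcPath  = "hdfs:///tmp/source_parquet_table" // existing partitioned parquet table
val basePath = "hdfs:///tmp/hudi_bootstrap_table" // target Hudi table

// Assumed column names: `key` as record key, `partition` as partition column.
val bootstrapOptions = Map(
  "hoodie.table.name"                           -> "bootstrap_test",
  "hoodie.datasource.write.operation"           -> "bootstrap",
  "hoodie.datasource.write.recordkey.field"     -> "key",
  "hoodie.datasource.write.partitionpath.field" -> "partition",
  "hoodie.bootstrap.base.path"                  -> srcPath,
  "hoodie.bootstrap.keygen.class"               -> "org.apache.hudi.keygen.SimpleKeyGenerator"
)

// Step 1: bootstrap the existing parquet table into a Hudi table.
// The DataFrame is empty because the data is taken from srcPath.
spark.emptyDataFrame.write
  .format("hudi")
  .options(bootstrapOptions)
  .mode(SaveMode.Overwrite)
  .save(basePath)

// Step 2: read the table back and inspect the partition column.
val df = spark.read.format("hudi").load(basePath)
df.select("_hoodie_partition_path", "partition").show(false)
```

On an affected version, the `partition` column in the second step would be expected to show null, as described above.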

Environment Description

  • Hudi version: 0.11.0

  • Spark version: 3.1.1

  • Hive version: 3.1.0

  • Hadoop version: 3.3.1

  • Storage (HDFS/S3/GCS..): HDFS

  • Running on Docker? (yes/no): no

@nsivabalan nsivabalan added priority:critical production down; pipelines stalled; Need help asap. writer-core Issues relating to core transactions/write actions labels Aug 27, 2022
@yihua
Contributor

yihua commented Sep 12, 2022

@peanut-chenzhong I assume the source parquet table uses Hive-style partition paths. This should be related to HUDI-4783.

@yihua
Contributor

yihua commented Sep 12, 2022

I'm going to fix it.

@nsivabalan
Contributor

If we have a tracking Jira, can we close this issue, since you plan to fix it in the next 2 to 3 weeks?

@peanut-chenzhong
Contributor Author

Sure, I will recheck after this Jira ticket has been fixed. Thanks for the help.

@yihua
Contributor

yihua commented Sep 16, 2022

#6673 and #6676 have fixed the problem of reading the partition column from a bootstrap table, and I verified that it works (see the df.show result below, taken after bootstrap). Closing this issue. @peanut-chenzhong feel free to reopen this if you still see the problem.

scala> df.show
+-------------------+--------------------+--------------------+----------------------+--------------------+--------------------+-------------+--------------------+------------+--------------------+--------------------+--------------------+-----+---------+
|_hoodie_commit_time|_hoodie_commit_seqno|  _hoodie_record_key|_hoodie_partition_path|   _hoodie_file_name|                 key|           ts|           textField|decimalField|           longField|          arrayField|            mapField|round|partition|
+-------------------+--------------------+--------------------+----------------------+--------------------+--------------------+-------------+--------------------+------------+--------------------+--------------------+--------------------+-----+---------+
|     00000000000002|  00000000000002_1_0|000-416e-f335-1f3...|             2022/1/31|356f2b69-6958-465...|000-416e-f335-1f3...|1643949407427|abcdefghijklmnopq...|   0.5398461| 4486089480226173414|[0, 1, 2, 3, 4, 5...|{4a19-ff6d-95f87c...|    0|2022/1/31|
|     00000000000002|  00000000000002_1_1|000-4638-bd51-7ce...|             2022/1/31|356f2b69-6958-465...|000-4638-bd51-7ce...|1643949404254|abcdefghijklmnopq...|    0.542539|-7250499539432824960|[0, 1, 2, 3, 4, 5...|{4a1c-6792-b6a852...|    0|2022/1/31|

@yihua yihua closed this as completed Sep 16, 2022