Skip to content

[HUDI-4724]Add function of skip the _rt suffix for read snapshot#6510

Open
linfey90 wants to merge 7 commits intoapache:masterfrom
linfey90:deepnova_skip_rt
Open

[HUDI-4724]Add function of skip the _rt suffix for read snapshot#6510
linfey90 wants to merge 7 commits intoapache:masterfrom
linfey90:deepnova_skip_rt

Conversation

@linfey90
Copy link
Contributor

Change Logs

add function of skip the _rt suffix for read snapshot when hive sync meta.

Impact

Describe any public API or user-facing feature change or any performance impact.

Risk level: none | low | medium | high

Choose one. If medium or high, explain what verification was done to mitigate the risks.

Contributor's checklist

  • Read through contributor's guide
  • Change Logs and Impact were stated clearly
  • Adequate tests were added if applicable
  • CI passed

public static final ConfigOption<Boolean> HIVE_SYNC_SKIP_RT_SUFFIX = ConfigOptions
.key("hive_sync.skip_rt_suffix")
.booleanType()
.defaultValue(false)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

may you add sinceValue("0.12.1") to track in the doc when to use?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I'd love to, but there's no since method. Any other suggestions?

.withDocumentation("Skip the _ro suffix for Read optimized table, when registering");
public static final ConfigProperty<String> HIVE_SKIP_RT_SUFFIX_FOR_READ_SNAPSHOT_TABLE = ConfigProperty
.key("hoodie.datasource.hive_sync.skip_rt_suffix")
.defaultValue("false")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here (sinceVersion)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done it!

@yihua yihua requested review from danny0405 and yuzhaojing August 30, 2022 05:54
@yihua yihua added priority:medium Moderate impact; usability gaps engine:flink Flink integration component:catalog-sync Catalog-sync related labels Aug 30, 2022
this.snapshotTableName = tableName + SUFFIX_SNAPSHOT_TABLE;
this.snapshotTableName = config.getBoolean(HIVE_SKIP_RT_SUFFIX_FOR_READ_SNAPSHOT_TABLE)
? tableName
: tableName + SUFFIX_SNAPSHOT_TABLE;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the situation if both skip_ro_suffix and skip_rt_suffix are true.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your reply.That's very thoughtful of you. Can we throw an exception in this case? Any suggestion do you have.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can throw exception if both options are valuable, but before do that, let's figure out the background why HIVE_SKIP_RO_SUFFIX_FOR_READ_OPTIMIZED_TABLE is introduced ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

During Hive query, first, we prefer to use the original table name, and second, we query the latest data.

if (config.getBoolean(HIVE_SKIP_RT_SUFFIX_FOR_READ_SNAPSHOT_TABLE)
&& config.getBoolean(HIVE_SKIP_RO_SUFFIX_FOR_READ_OPTIMIZED_TABLE)) {
throw new HoodieHiveSyncException("can not be set both skip _rt and _ro are true at the same time!");
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can not be set both skip _rt and _ro are true

Can not skip _ro and _rt suffix both at the same time

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok,thanks!

@hudi-bot
Copy link
Collaborator

hudi-bot commented Sep 1, 2022

CI report:

Bot commands @hudi-bot supports the following commands:
  • @hudi-bot run azure re-run the last Azure build

@XuQianJin-Stars
Copy link
Contributor

XuQianJin-Stars commented Sep 23, 2022

hi @linfey90 What is your WeChat ID? I want to communicate with you.

@github-actions github-actions bot added the size:S PR with lines of changes in (10, 100] label Feb 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

component:catalog-sync Catalog-sync related engine:flink Flink integration priority:medium Moderate impact; usability gaps size:S PR with lines of changes in (10, 100]

Projects

Status: 🚧 Needs Repro
Status: 🏗 Under discussion

Development

Successfully merging this pull request may close these issues.

6 participants