[MINOR] Refactor hive realtime config to extend from HoodieConfig #3307

Closed
leoluan2009 wants to merge 5 commits into apache:master from leoluan2009:config

@leoluan2009
Contributor

Tips

What is the purpose of the pull request

(For example: This pull request adds quick-start document.)

Brief change log

(for example:)

  • Modify AnnotationLocation checkstyle rule in checkstyle.xml

Verify this pull request

(Please pick either of the following options)

This pull request is a trivial rework / code cleanup without any test coverage.

(or)

This pull request is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

(example:)

  • Added integration tests for end-to-end.
  • Added HoodieClientWriteTest to verify the change.
  • Manually verified the change by running a job locally.

Committer checklist

  • Has a corresponding JIRA in PR title & commit

  • Commit message is descriptive of the change

  • CI is green

  • Necessary doc changes done or have another open PR

  • For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

@codecov-commenter

codecov-commenter commented Jul 20, 2021

Codecov Report

Merging #3307 (5ad4271) into master (a086d25) will decrease coverage by 2.08%.
The diff coverage is 97.14%.

@@             Coverage Diff              @@
##             master    #3307      +/-   ##
============================================
- Coverage     47.74%   45.65%   -2.09%     
- Complexity     5591     5596       +5     
============================================
  Files           938      999      +61     
  Lines         41823    43790    +1967     
  Branches       4213     4403     +190     
============================================
+ Hits          19968    19992      +24     
- Misses        20070    22015    +1945     
+ Partials       1785     1783       -2     
Flag                  Coverage Δ
hudicli               39.97% <ø> (ø)
hudiclient            34.55% <ø> (ø)
hudicommon            48.65% <ø> (+0.01%) ⬆️
hudiflink             59.62% <100.00%> (+0.18%) ⬆️
hudihadoopmr          52.40% <95.65%> (+0.37%) ⬆️
hudiintegtest         0.00% <ø> (?)
hudisparkdatasource   67.12% <100.00%> (+0.01%) ⬆️
hudisync              55.97% <ø> (ø)
huditimelineservice   64.07% <ø> (ø)
hudiutilities         59.87% <ø> (ø)

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
...pache/hudi/hadoop/config/HoodieRealtimeConfig.java 88.88% <88.88%> (+88.88%) ⬆️
...java/org/apache/hudi/table/format/FormatUtils.java 89.65% <100.00%> (-3.68%) ⬇️
...hadoop/realtime/RealtimeCompactedRecordReader.java 77.77% <100.00%> (+0.96%) ⬆️
.../hadoop/realtime/RealtimeUnmergedRecordReader.java 97.67% <100.00%> (+0.11%) ⬆️
.../hadoop/utils/HoodieRealtimeRecordReaderUtils.java 72.03% <100.00%> (+0.23%) ⬆️
...n/scala/org/apache/hudi/HoodieMergeOnReadRDD.scala 90.55% <100.00%> (-0.21%) ⬇️
...he/hudi/sink/partitioner/BucketAssignFunction.java 80.00% <0.00%> (ø)
...e/hudi/integ/testsuite/writer/DeltaWriteStats.java 0.00% <0.00%> (ø)
.../integ/testsuite/writer/DFSDeltaWriterAdapter.java 0.00% <0.00%> (ø)
... and 64 more

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update a086d25...5ad4271.

Member

@vinothchandar left a comment

Just one comment: please avoid the dependency change. With it, this is no longer a MINOR PR :)

<artifactId>hudi-common</artifactId>
<version>${project.version}</version>
</dependency>
<dependency>
Member

Hmm, not sure we would want hadoop-mr to depend on client-common.

.withReverseReader(false)
.withBufferSize(jobConf.getInt(HoodieRealtimeConfig.MAX_DFS_STREAM_BUFFER_SIZE_PROP, HoodieRealtimeConfig.DEFAULT_MAX_DFS_STREAM_BUFFER_SIZE))
.withSpillableMapBasePath(jobConf.get(HoodieRealtimeConfig.SPILLABLE_MAP_BASE_PATH_PROP, HoodieRealtimeConfig.DEFAULT_SPILLABLE_MAP_BASE_PATH))
.withBufferSize(jobConf.getInt(HoodieMemoryConfig.MAX_DFS_STREAM_BUFFER_SIZE_PROP.key(),
Member

HoodieMemoryConfig has some stuff that is specific to writing, like WriteStatus. Can we leave these configs as-is in HoodieRealtimeConfig?
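
For readers following the diff above, here is a minimal sketch of the pattern under discussion: replacing a separate key constant and default constant with one typed property object that is read through `key()` and `defaultValue()`. The `Property` class, the key string, and the default value below are hypothetical stand-ins, not Hudi's actual `ConfigProperty` or its real settings, and a plain `Map` stands in for the Hadoop `JobConf`.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a simplified stand-in for a typed config property.
// This is NOT Hudi's ConfigProperty; the key and default below are made up.
class RealtimeConfigSketch {

  /** Pairs a config key with its default so the two cannot drift apart. */
  static final class Property<T> {
    private final String key;
    private final T defaultValue;

    Property(String key, T defaultValue) {
      this.key = key;
      this.defaultValue = defaultValue;
    }

    String key() {
      return key;
    }

    T defaultValue() {
      return defaultValue;
    }
  }

  // Hypothetical key and default, chosen only for this example.
  static final Property<Integer> MAX_BUFFER_SIZE =
      new Property<>("example.realtime.dfs.buffer.max.size", 1024 * 1024);

  /** Reads an int setting from a plain map standing in for a job configuration. */
  static int getInt(Map<String, String> conf, Property<Integer> property) {
    String raw = conf.get(property.key());
    return raw == null ? property.defaultValue() : Integer.parseInt(raw.trim());
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    // No explicit value set: the property's default is used.
    System.out.println(getInt(conf, MAX_BUFFER_SIZE));
    // Explicit value set: it overrides the default.
    conf.put(MAX_BUFFER_SIZE.key(), "4096");
    System.out.println(getInt(conf, MAX_BUFFER_SIZE));
  }
}
```

Which class such a property lives in only decides which module the reader has to depend on; the lookup at the call site looks the same either way.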

@vinothchandar self-assigned this Sep 23, 2021
@hudi-bot
Collaborator

hudi-bot commented Nov 5, 2021

CI report:

Bot commands: @hudi-bot supports the following commands:
  • @hudi-bot run azure: re-run the last Azure build

@yihua added the priority:medium (Moderate impact; usability gaps), engine:hive (Hive integration), and reader-core labels Sep 13, 2022
@yihua
Contributor

yihua commented Sep 13, 2022

@leoluan2009 do you still have bandwidth to fix this PR?

@github-actions bot added the size:M (PR with lines of changes in (100, 300]) label Feb 26, 2024
Contributor

@yihua left a comment

This PR is no longer needed as the latest code on master has covered similar changes already. Closing this PR.

string2Boolean(
config.get(HoodieRealtimeConfig.COMPACTION_LAZY_BLOCK_READ_ENABLED_PROP,
HoodieRealtimeConfig.DEFAULT_COMPACTION_LAZY_BLOCK_READ_ENABLED)))
.withReadBlocksLazily(config.getBoolean(HoodieRealtimeConfig.COMPACTION_LAZY_BLOCK_READ_ENABLED_PROP.key(),
Contributor

.withReadBlocksLazily has already been removed.

</dependency>
<dependency>
<groupId>org.apache.hudi</groupId>
<artifactId>hudi-client-common</artifactId>
Contributor

HoodieMemoryConfig, HoodieCommonConfig, and HoodieReaderConfig, which contain all the "real-time" configs, are already in the hudi-common module on master, so there is no need to add the hudi-client-common module dependency.

import org.apache.hudi.common.config.ConfigProperty;
import org.apache.hudi.common.config.HoodieConfig;

public class HoodieRealtimeConfig extends HoodieConfig {
Contributor

HoodieRealtimeConfig no longer exists; these configs have been moved to HoodieMemoryConfig, HoodieCommonConfig, and HoodieReaderConfig.

@yihua closed this Sep 7, 2024

Labels

engine:hive (Hive integration), priority:medium (Moderate impact; usability gaps), size:M (PR with lines of changes in (100, 300])

Projects

Status: 🏁 Triaged
Status: ✅ Done

5 participants