Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[HUDI-2491] hoodie.datasource.hive_sync.mode=hms mode is supported in… #3799

Closed
wants to merge 7 commits into from

Conversation

fuyun2024
Copy link
Contributor

@fuyun2024 fuyun2024 commented Oct 14, 2021

… spark writer option

Tips

What is the purpose of the pull request

(For example: This pull request adds quick-start document.)

Brief change log

(for example:)

  • Modify AnnotationLocation checkstyle rule in checkstyle.xml

Verify this pull request

(Please pick either of the following options)

This pull request is a trivial rework / code cleanup without any test coverage.

(or)

This pull request is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

(example:)

  • Added integration tests for end-to-end.
  • Added HoodieClientWriteTest to verify the change.
  • Manually verified the change by running a job locally.

Committer checklist

  • Has a corresponding JIRA in PR title & commit

  • Commit message is descriptive of the change

  • CI is green

  • Necessary doc changes done or have another open PR

  • For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

@fuyun2024
Copy link
Contributor Author

@codepe Are you free to take a look at it for me ? This is my new commit .

Copy link
Member

@codope codope left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fuyun2024 Thanks for reworking on this. Left a few comments.

HiveConf hiveConf = new HiveConf(conf, HiveConf.class);
if (!DataSourceWriteOptions.METASTORE_URIS().defaultValue().equals(hiveSyncConfig.metastoreUris)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to check this? Why not simply set whatever user has passed irrespective of the default value?

@@ -159,10 +159,14 @@ public void execute() throws IOException {
* Sync to Hive.
*/
private void syncHive() {
if (cfg.enableHiveSync) {
if (cfg.enableHiveSync || cfg.enableHiveSync) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Redundant condition. Did you mean to add some other condition?

@@ -49,6 +49,9 @@
@Parameter(names = {"--jdbc-url"}, description = "Hive jdbc connect url")
public String jdbcUrl;

@Parameter(names = {"--metastore-uris"}, description = "Hive metastore uris")
public String metastoreUris;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be used somewhere right. I mean in HoodieHiveClient or HiveSyncTool. I don't see it is being used anywhere in hudi-hive-sync module.

correction parameters
@vinothchandar vinothchandar self-assigned this Oct 15, 2021
@vinothchandar vinothchandar added this to Nearing Landing in PR Tracker Board Oct 15, 2021
…ies/deltastreamer/DeltaSync.java to hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastream
@fuyun2024
Copy link
Contributor Author

@codope @vinothchandar anything else should i do, to achive this commit

Copy link
Member

@codope codope left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fuyun2024 Why is DeltaSync.java renamed?

…ream to hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/DeltaSync.java
@fuyun2024
Copy link
Contributor Author

it was my first time modifying PR online in github, not familiar enough with the operation. I correct it a moment ago, please check it again.

@codope
Copy link
Member

codope commented Oct 29, 2021

@hudi-bot run azure

Copy link
Member

@codope codope left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fuyun2024 Overall, the code is in better shape. One issue is compatibility: users previously passed metastore uri in the same jdbc uri config even in hms mode. Now that the patch introduces a new config for metastore uri, we should ensure that it is backwards compatible with older config as well. Maybe, give precedence to metastore uri and if it's not available then also check jdbc uri. That way existing pipelines won't break if Hudi is upgraded. What do you think?

@fuyun2024
Copy link
Contributor Author

@codope Thank you for your comments. Yes, I looked at the code and judged that it is compatible in houdiehiveclient()

@nsivabalan
Copy link
Contributor

@fuyun2024 : looks like there are some conflicts with master. Can you rebase with latest.

@fuyun2024
Copy link
Contributor Author

@nsivabalan Yes, with pleasure

@codope
Copy link
Member

codope commented Nov 3, 2021

@hudi-bot run azure

@codope
Copy link
Member

codope commented Nov 3, 2021

@fuyun2024 Looks like this change is breaking a few tests that make use of HoodieHiveSync in deltastreamer. Can you please check that?

[ERROR] Errors: 
[ERROR]   TestHoodieDeltaStreamer.testBulkInsertsAndUpsertsWithSQLBasedTransformerFor2StepPipeline:1240 » HoodieHiveSync
[ERROR]   TestHoodieDeltaStreamer.testPayloadClassUpdate:1328 » HoodieHiveSync Got runti...
[ERROR]   TestHoodieDeltaStreamer.testPayloadClassUpdateWithCOWTable:1354 » HoodieHiveSync

@fuyun2024
Copy link
Contributor Author

@codope
hello codope, I'm sorry, I'm busy these days and can't reply in time.
I carefully checked the Java CI log and found no error. Can you tell me where the error is and the error stack information

@hudi-bot
Copy link

hudi-bot commented Nov 5, 2021

CI report:

Bot commands @hudi-bot supports the following commands:
  • @hudi-bot run azure re-run the last Azure build

@fuyun2024 fuyun2024 closed this Nov 5, 2021
@fuyun2024 fuyun2024 deleted the HUDI-2491 branch November 5, 2021 15:32
PR Tracker Board automation moved this from Nearing Landing to Done Nov 5, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Development

Successfully merging this pull request may close these issues.

None yet

5 participants