
[HUDI-1226] Fix ComplexKeyGenerator for non-partitioned tables #2037

Merged
n3nash merged 1 commit into apache:master from satishkotha:sk/emptyPartition
Aug 26, 2020


@satishkotha
Member

What is the purpose of the pull request

ComplexKeyGenerator's getPartitionPath doesn't work well with non-partitioned tables. This PR fixes it and adds a test case.

Brief change log

  • Ignore empty string for partition path fields
  • Return "" for partitionPath if there are no partition path fields defined
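The two changes above can be illustrated with a minimal sketch. This is not the code from this PR: the class and method names (PartitionPathSketch, parsePartitionFields, partitionPath) are hypothetical, the "/" separator is an assumption, and the "default" fallback for a missing value is assumed from the behavior described in the JIRA.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch; names and details are assumptions, not Hudi's actual code.
final class PartitionPathSketch {

    // Split the comma-separated config value and drop blank entries,
    // so both "" and "," yield an empty field list.
    static List<String> parsePartitionFields(String configValue) {
        return Arrays.stream(configValue.split(","))
                .map(String::trim)
                .filter(field -> !field.isEmpty())
                .collect(Collectors.toList());
    }

    // Build the partition path from the record's values. With no partition
    // fields defined (a non-partitioned table), return the empty string.
    static String partitionPath(Map<String, String> record, List<String> fields) {
        if (fields.isEmpty()) {
            return "";
        }
        return fields.stream()
                // "default" for a missing value is an assumption for this sketch.
                .map(field -> record.getOrDefault(field, "default"))
                .collect(Collectors.joining("/"));
    }

    public static void main(String[] args) {
        Map<String, String> record = Collections.singletonMap("region", "us");
        System.out.println("[" + partitionPath(record, parsePartitionFields("")) + "]");
        System.out.println("[" + partitionPath(record, parsePartitionFields(",")) + "]");
        System.out.println("[" + partitionPath(record, parsePartitionFields("region")) + "]");
    }
}
```

Filtering blank entries at parse time means the path-building step never sees an empty field name, so the "no fields" case can be handled by a single early return.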

Verify this pull request

This change added tests

Committer checklist

  • Has a corresponding JIRA in PR title & commit

  • Commit message is descriptive of the change

  • CI is green

  • Necessary doc changes done or have another open PR

  • For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

Contributor

@n3nash n3nash left a comment


@satishkotha was it throwing an exception before this change?

@satishkotha
Member Author

@satishkotha was it throwing an exception before this change?

From https://issues.apache.org/jira/browse/HUDI-1226,

  1. If we pass an empty string (-hoodie-conf hoodie.datasource.write.partitionpath.field=), the generator returns 'default' as the partition path.
  2. If we pass the delimiter alone (-hoodie-conf hoodie.datasource.write.partitionpath.field=,), it throws:

    java.lang.StringIndexOutOfBoundsException: String index out of range: -1
        at java.lang.AbstractStringBuilder.deleteCharAt(AbstractStringBuilder.java:824)
        at java.lang.StringBuilder.deleteCharAt(StringBuilder.java:253)
        at org.apache.hudi.keygen.KeyGenUtils.getRecordPartitionPath(KeyGenUtils.java:80)
        at org.apache.hudi.keygen.ComplexKeyGenerator.getPartitionPath(ComplexKeyGenerator.java:52)
        at org.apache.hudi.keygen.BuiltinKeyGenerator.getKey(BuiltinKeyGenerator.java:75)
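
The deleteCharAt frames in the trace suggest a common pattern: append a separator after every field, then trim the last character. The sketch below reproduces that failure mode in isolation. It is an assumption about the shape of the bug inferred from the trace, not the actual KeyGenUtils code, and the names (TrailingSeparatorSketch, joinUnsafe, joinSafe) are hypothetical.

```java
// Hypothetical sketch of the failure mode; not the actual KeyGenUtils code.
final class TrailingSeparatorSketch {

    // Appends "value/" per entry, then removes the trailing separator.
    // With zero entries the builder is empty, so the index passed to
    // deleteCharAt is -1 and a StringIndexOutOfBoundsException is thrown.
    static String joinUnsafe(String[] values) {
        StringBuilder sb = new StringBuilder();
        for (String value : values) {
            sb.append(value).append("/");
        }
        sb.deleteCharAt(sb.length() - 1);
        return sb.toString();
    }

    // String.join places separators only between entries, so an empty
    // array simply yields the empty string.
    static String joinSafe(String[] values) {
        return String.join("/", values);
    }

    public static void main(String[] args) {
        // ",".split(",") returns an empty array: trailing empty strings are dropped.
        String[] fields = ",".split(",");
        System.out.println("safe: [" + joinSafe(fields) + "]");
        try {
            joinUnsafe(fields);
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("unsafe threw: " + e.getClass().getName());
        }
    }
}
```

Note that the exact exception message differs across Java versions, so the sketch only reports the exception type.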

@satishkotha
Member Author

@n3nash ^

@n3nash n3nash self-requested a review August 26, 2020 03:55
@n3nash n3nash merged commit f468c20 into apache:master Aug 26, 2020
@satishkotha satishkotha deleted the sk/emptyPartition branch August 26, 2020 04:45