Skip to content

Add support for composite key#862

Closed
afilipchik wants to merge 1 commit intoapache:masterfrom
afilipchik:af-composite-key
Closed

Add support for composite key#862
afilipchik wants to merge 1 commit intoapache:masterfrom
afilipchik:af-composite-key

Conversation

@afilipchik
Copy link
Contributor

No description provided.

config.getString(Config.TIMESTAMP_INPUT_DATE_FORMAT_PROP));
this.inputDateFormat.setTimeZone(TimeZone.getTimeZone("GMT"));
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@afilipchik : Can you document some examples of how this is configured and used

.map(
recordKeyField ->
DataSourceUtils.getNestedFieldValAsString(record, recordKeyField))
.collect(Collectors.joining(".")),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are joining by "." here. Is there any assumption here ? Wanted to see if this is generic enough to be put it in this class.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it can be anything.
HoodieKey(String recordKey, String partitionPath)

so I'm just creating a long key: bla.bla1.bla2. Can be bla:bla1:bla2

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi I think this kind of key creates ambiguity refer this PR for details #728

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bvaradar any thoughts? lets make a call on this PR?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jaimin-shah : Wondering if there is a necessity to decode the generated record key. If this is a one-way concatenation of fields to record-key, it should be fine. right ? As we are storing individual fields that constitute the record key separately, there wont be any need to decode this record key and ambiguity during decoding should be ok. Isn't it ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bvaradar By ambiguity I meant two different records having same key. For example US, .A => US..A and US. , A => US..A Keeping recordKeyField as part of key resolves this. Although I agree these kind of cases are quite rare.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense @jaimin-shah. Didn't think about this possibility. @afilipchik : Based on this information, this may not be safe in general sense.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey, not sure I understood the edge case example. Here is how we are using it:
We have an object with id and a version. Version suppose to monotonically increase, and we want to be able to dedub based on key and the version (there should be at most 1 object with the same id and version).

Our config looks like:
hoodie.datasource.write.recordkey.field=object.id,object.version for insert table and

What can go wrong with this one?

Copy link
Contributor

@jaimin-shah jaimin-shah Sep 24, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @afilipchik for your use case I don't think there will be any problem but there can be problem when there no restrictions on recordKeyField ( e.g. Version suppose to monotonically increase ). You can take a look at https://github.com/apache/incubator-hudi/blob/master/hudi-spark/src/main/java/org/apache/hudi/ComplexKeyGenerator.java it has generic implementation of complex key for quite similar requirement.

Copy link
Member

@vinothchandar vinothchandar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also can you please file a JIRA for this and include it in the PR title as documented here https://hudi.apache.org/contributing.html#lifecycle

@vinothchandar
Copy link
Member

@bvaradar @afilipchik can we make a call on this?

@bvaradar
Copy link
Contributor

bvaradar commented Nov 6, 2019

Closing this for now due to inactivity. @afilipchik, please open it if you think otherwise.

@bvaradar bvaradar closed this Nov 6, 2019
vinishjail97 pushed a commit to vinishjail97/hudi that referenced this pull request Aug 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants