Adding support for complex keys #728
Resolving the issue related to ambiguity in recordKey by creating and parsing json object as string.
Hi, I think the scope for Gson is currently limited to test; we might need to change the scope to compile to make this work.
This reverts commit 1c4e60b.
Hi @jaimin-shah, I will take a look at this sometime over the weekend. We have been having some Travis CI issues lately, which is keeping a few of us pretty busy. :(
vinothchandar
left a comment
Can you please add a small unit test?
}

JsonObject recordKeyJson = new JsonObject();

recordKeyJson.addProperty(recordKeyField, DataSourceUtils.getNestedFieldValAsString(record, recordKeyField));
}
Gson gson = new Gson();
String recordKey = gson.toJson(recordKeyJson);
Instead of JSON, can we just concatenate the recordKeyFields? This adds additional JSON building and parsing in the fast path. Any specific reason you chose JSON for the recordKey?
}
partitionPath.delete(partitionPath.length() - 1, partitionPath.length());
} catch (HoodieException e) {
// TODO : optimize this since throwing and catching exception is cpu intensive
But is that a common scenario? It's probably OK to do this when things are misconfigured, etc., right? If you agree, we can remove this TODO.
partitionPath.append(DataSourceUtils.getNestedFieldValAsString(record, partitionPathField));
partitionPath.append(DEFAULT_PARTITION_PATH_SEPARATOR);
}
partitionPath.delete(partitionPath.length() - 1, partitionPath.length());
Use StringBuilder#deleteCharAt(int)?
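For readers following the suggestion, here is a minimal sketch of trimming a trailing separator with `deleteCharAt`. The class name, method name, and the `"/"` separator value are assumptions made for illustration (the separator is inferred from the partition path shown later in this thread); only `StringBuilder.delete` and `StringBuilder.deleteCharAt` are standard JDK calls.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the PR: joins values and trims the trailing separator.
class TrailingSeparatorSketch {
    // Assumed single-character separator, for illustration only.
    static final String SEPARATOR = "/";

    static String join(List<String> values) {
        StringBuilder sb = new StringBuilder();
        for (String value : values) {
            sb.append(value).append(SEPARATOR);
        }
        if (sb.length() > 0) {
            // For a one-character separator this has the same effect as
            // sb.delete(sb.length() - 1, sb.length()).
            sb.deleteCharAt(sb.length() - 1);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(join(Arrays.asList("rider-002", "driver-002")));
    }
}
```

The length check guards the empty case, where `deleteCharAt(-1)` would throw a `StringIndexOutOfBoundsException`.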
The record key now looks like this: row_key:16bf0b32-7557-42ac-b367-9fe32ae4795e,timestamp:0.0. It is generated by concatenation instead of JSON.
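A rough sketch of the concatenation approach described here, assuming `field:value` pairs joined with commas. The class and method names are hypothetical, and the field names and values are borrowed from the examples in this thread; this is not the code from the PR.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of a concatenated record key; not the PR's implementation.
class ConcatenatedKeySketch {
    // Builds "field:value,field:value" in the insertion order of the map.
    static String buildRecordKey(Map<String, String> fieldValues) {
        StringBuilder key = new StringBuilder();
        for (Map.Entry<String, String> entry : fieldValues.entrySet()) {
            key.append(entry.getKey()).append(':').append(entry.getValue()).append(',');
        }
        if (key.length() > 0) {
            key.deleteCharAt(key.length() - 1); // drop the trailing comma
        }
        return key.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the configured field order stable.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("_row_key", "16bf0b32-7557-42ac-b367-9fe32ae4795e");
        fields.put("timestamp", "0.0");
        System.out.println(buildRecordKey(fields));
    }
}
```

Including the field name and a separator keeps value pairs such as ("ab", "c") and ("a", "bc") distinct, which plain concatenation of the values would not. Values that themselves contain `,` or `:` could still collide; this sketch does not handle that case.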
- Resolving the issue related to ambiguity in recordKey by creating and parsing json object as string.
- added unit test for ComplexKeyGenerator
- minor changes
…napshotLoadQuerySplitter (apache#728)
Resolving the issue related to ambiguity in recordKey by creating and parsing json object as string.
Now HoodieKey looks like this:
HoodieKey { recordKey={"_row_key":"16bf0b32-7557-42ac-b367-9fe32ae4795e","timestamp":"0.0"} partitionPath=rider-002/driver-002}