
[SUPPORT] Order rows with same key before precombine #11041

@ghost

Description

I have a use case where I would like to use Hudi. I have to process several inserts, updates and deletes listed in a file. The file can contain many rows for the same key, and I have to combine them in order using a file. I have developed my own Payload, but I cannot process the rows in order. I set the option hoodie.datasource.write.precombine.field to indicate the precombine field and hoodie.payload.ordering.field to order by the same field, but it didn't work.
I also tried the repartition and sort functions in the Spark code before saving to Hudi, but that didn't work either.
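
To make the requirement concrete, here is a minimal plain-Python sketch of the behaviour I am after: all rows for a key are applied in order and collapsed to one final row per key. This is not Hudi or Spark API. The function name fold_changes, the columns id, seq, op and data, and the op codes I/U/D are hypothetical names chosen only for illustration. The same fold could presumably be applied per key group in Spark before the write, so that the result would not depend on the order in which rows with the same key reach the payload.

```python
from itertools import groupby
from operator import itemgetter


def fold_changes(rows, key="id", order="seq"):
    """Collapse change rows to one final row per key, applying them in order.

    Each row is a dict with the key column, the ordering column, an "op"
    column ("I", "U" or "D") and a "data" dict of column values.
    All of these names are illustrative assumptions.
    """
    # Sort by key first so groupby sees each key once, then by ordering field.
    ordered = sorted(rows, key=lambda r: (r[key], r[order]))
    final_rows = []
    for k, group in groupby(ordered, key=itemgetter(key)):
        state = None       # current image of the record, None if absent
        last_order = None  # ordering value of the last change applied
        for row in group:
            last_order = row[order]
            if row["op"] == "D":
                # Delete: the record no longer exists.
                state = None
            elif row["op"] == "I" or state is None:
                # Insert, or update of a missing record: take the row as-is.
                state = dict(row["data"])
            else:
                # Update: overlay the changed columns on the current image.
                state = {**state, **row["data"]}
        final_rows.append(
            {key: k, order: last_order, "deleted": state is None, "data": state or {}}
        )
    return final_rows
```

With this shape, an insert followed by an update for the same key yields a single merged row, and an insert followed by a delete yields a single row flagged as deleted, whatever order the rows had in the input.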

Labels: type:feature (New features and enhancements)
Status: Triaged
