
[SUPPORT] What would be the best, high-performance, production-level hoodie configs for an unpartitioned dataset? #8820

@gtwuser

Description


Describe the problem you faced

I'd like to know from the Hudi gurus what the best configuration would be for high-performance read/write operations. In my scenario I have multiple files, each around 25 MB on average; the total size of all files together is about 6 GB, across 354 files. It's all JSON data. We want to ingest it into Hudi as quickly as possible, with the Hudi metadata.

Please note we don't have a fixed field to partition on. So, if possible, can you tell us the configs for unpartitioned data?

So the ask is: what are the precise hoodie configs we can use to get fast writes and reads?
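For reference, here is a minimal sketch of what a non-partitioned Hudi write config might look like from PySpark. This is not an authoritative recipe for this issue, just a starting point: the table name, record key (`id`), and precombine field (`ts`) are hypothetical placeholders that would need to match the actual JSON schema. It pairs an empty partition-path field with `NonpartitionedKeyGenerator`, and uses `bulk_insert` since the goal is fast initial ingestion.

```python
# Sketch only: Hudi write options for an unpartitioned table, assuming a
# Spark DataFrame `df` loaded from the JSON files. Field names `id` and
# `ts` are hypothetical stand-ins for real columns in the data.
hudi_options = {
    "hoodie.table.name": "my_unpartitioned_table",    # hypothetical name
    "hoodie.datasource.write.recordkey.field": "id",  # assumed unique key
    "hoodie.datasource.write.precombine.field": "ts", # assumed ordering field
    # Empty partition path + NonpartitionedKeyGenerator => unpartitioned table
    "hoodie.datasource.write.partitionpath.field": "",
    "hoodie.datasource.write.keygenerator.class":
        "org.apache.hudi.keygen.NonpartitionedKeyGenerator",
    # bulk_insert avoids index lookups during the initial load, trading
    # some file sizing for faster writes
    "hoodie.datasource.write.operation": "bulk_insert",
    # keep the metadata table enabled, per the requirement above
    "hoodie.metadata.enable": "true",
}

# The actual write (requires Spark with the Hudi bundle on the classpath):
# df.write.format("hudi").options(**hudi_options).mode("overwrite").save(base_path)
```

Whether `bulk_insert` beats plain `insert` here depends on whether the dataset is written once or upserted repeatedly, which is part of what the maintainers would need to weigh in on.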

^^ @nsivabalan @vinothchandar @umehrot2 or anybody who may have some insights.

Metadata

Assignees: No one assigned

Labels: priority:medium (Moderate impact; usability gaps), type:feature (New features and enhancements)

Status: 👤 User Action

Milestone: No milestone

Development: No branches or pull requests