Configuration Improvements #16930

@hudi-bot

Description

h3. Known Configuration Issues

Key Generator Conflicts

Changing {{hoodie.datasource.write.keygenerator.class}} after the table has been created can lead to runtime exceptions. For instance, switching from {{SimpleKeyGenerator}} to {{GlobalDeleteKeyGenerator}} without recreating the table may raise a {{HoodieException}}, because the key generator supplied by the writer no longer matches what the existing table metadata expects.
🔗 https://medium.com/@life-is-short-so-enjoy-it/apache-hudi-exception-raised-when-using-different-keygenerator-d307d8efe7a1
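
A pre-write guard can surface this conflict before the job starts. The sketch below is illustrative only: it assumes the table records its key generator in {{hoodie.properties}} under {{hoodie.table.keygenerator.class}} (an assumption, not stated in this issue), and {{check_keygen}} and {{parse_properties}} are hypothetical helpers, not Hudi APIs.

```python
# Assumed property name in the table's hoodie.properties file.
TABLE_KEYGEN_PROP = "hoodie.table.keygenerator.class"
# Writer option named in this issue.
WRITE_KEYGEN_OPT = "hoodie.datasource.write.keygenerator.class"


def parse_properties(text):
    """Parse Java-style key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_keygen(table_props_text, write_options):
    """Raise ValueError if the writer asks for a different key generator
    than the one recorded for the table; otherwise return the effective one."""
    stored = parse_properties(table_props_text).get(TABLE_KEYGEN_PROP)
    requested = write_options.get(WRITE_KEYGEN_OPT)
    if stored and requested and stored != requested:
        raise ValueError(
            f"key generator mismatch: table has {stored}, writer requests {requested}"
        )
    return requested or stored
```

Calling {{check_keygen}} with the contents of the table's properties file and the writer options fails fast on a mismatch instead of failing later inside the write path.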

Partition Path Field Data Type Conflicts

If you specify a data type for a partition field (e.g., {{inserted_at:TIMESTAMP}}) in one ingestion run and omit the type in another (e.g., just {{inserted_at}}), the inconsistency can cause schema mismatch issues or ingestion failures.
🔗 [https://github.com//issues/8372]
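
One way to catch this drift is to compare the partition field specification between runs. The following is a minimal sketch, assuming the specification is a comma-separated list of {{field}} or {{field:TYPE}} entries; {{parse_partition_spec}} and {{partition_spec_mismatches}} are hypothetical helper names, not part of Hudi.

```python
ABSENT = "<absent>"  # marker for a field missing from one of the specs


def parse_partition_spec(spec):
    """Split 'a:TYPE,b' into {'a': 'TYPE', 'b': None}."""
    fields = {}
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, ftype = part.partition(":")
        # None means the field was given without an explicit type.
        fields[name.strip()] = ftype.strip().upper() if sep else None
    return fields


def partition_spec_mismatches(previous_spec, current_spec):
    """Return (field, previous_type, current_type) for every field whose
    type annotation differs between two runs."""
    prev = parse_partition_spec(previous_spec)
    curr = parse_partition_spec(current_spec)
    mismatches = []
    for name in sorted(set(prev) | set(curr)):
        before = prev.get(name, ABSENT)
        after = curr.get(name, ABSENT)
        if before != after:
            mismatches.append((name, before, after))
    return mismatches
```

For the case described above, comparing {{inserted_at:TIMESTAMP}} with {{inserted_at}} reports one mismatch for {{inserted_at}}, with the type going from {{TIMESTAMP}} to unspecified.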

Flink SQL vs Hudi Config Overlap

When using Flink SQL, specifying {{PRIMARY KEY}} and {{PARTITIONED BY}} can silently override {{hoodie.datasource.write.recordkey.field}} and {{hoodie.datasource.write.partitionpath.field}}, leading to confusing or unexpected ingestion behavior.
🔗 [https://github.com//issues/12024]
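
Because the override is silent, it can help to check for disagreement before submitting the DDL. This sketch assumes the caller already knows the columns used in {{PRIMARY KEY}} and {{PARTITIONED BY}}; {{find_overlap_conflicts}} is a hypothetical helper and does not parse SQL.

```python
RECORD_KEY_OPT = "hoodie.datasource.write.recordkey.field"
PARTITION_PATH_OPT = "hoodie.datasource.write.partitionpath.field"


def split_fields(value):
    """Turn 'a, b' into ['a', 'b'], dropping empty entries."""
    return [f.strip() for f in value.split(",") if f.strip()]


def find_overlap_conflicts(primary_key, partitioned_by, options):
    """Report explicit Hudi options that disagree with the DDL clauses.

    primary_key / partitioned_by: column lists taken from the DDL.
    options: the table's WITH (...) options as a dict of strings.
    """
    conflicts = {}
    pairs = ((RECORD_KEY_OPT, primary_key), (PARTITION_PATH_OPT, partitioned_by))
    for opt, ddl_fields in pairs:
        if opt not in options:
            continue  # nothing explicit to conflict with
        explicit = split_fields(options[opt])
        if explicit != list(ddl_fields):
            conflicts[opt] = {"ddl": list(ddl_fields), "option": explicit}
    return conflicts
```

An empty result means the explicit options and the DDL clauses agree, or that no explicit options were set.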

Embedded Timeline Server with Flink

Enabling the embedded timeline server ({{hoodie.embed.timeline.server=true}}) can lead to performance degradation or connectivity issues in environments like AWS Managed Flink, where cross-task communication is restricted. Disabling it is recommended in such environments.
🔗 [https://docs.aws.amazon.com/managed-flink/latest/java/troubleshooting-hudi.html]
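
As a rough illustration of applying that recommendation, the sketch below forces the option to {{false}} in a dictionary of table options and renders a Flink SQL {{WITH}} clause from it. Both function names are hypothetical, and the other options a real table needs are left to the caller.

```python
TIMELINE_SERVER_OPT = "hoodie.embed.timeline.server"


def with_timeline_server_disabled(options):
    """Return a copy of the options with the embedded timeline server off."""
    merged = dict(options)
    merged[TIMELINE_SERVER_OPT] = "false"
    return merged


def render_with_clause(options):
    """Render options as a Flink SQL WITH (...) clause.

    Single quotes inside keys or values are doubled, the usual SQL escape.
    """
    def quote(text):
        return "'" + str(text).replace("'", "''") + "'"

    lines = [f"  {quote(k)} = {quote(v)}" for k, v in options.items()]
    return "WITH (\n" + ",\n".join(lines) + "\n)"
```

Passing the table's existing options through {{with_timeline_server_disabled}} before rendering keeps the setting explicit in the DDL, so it does not depend on a default.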
