Report metrics on the number of duplicates after dedup #15668

@hudi-bot

Description

In Deltastreamer, users can set --filter-dupes to deduplicate the input records. Currently, Hudi does not report any metrics on the duplicates that are filtered out. It would be useful to report such metrics so that the number of records filtered in each commit can be monitored.
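To make the request concrete, here is a minimal, language-neutral sketch of the counting involved. It is not Hudi code and uses no Hudi API. The names `filter_dupes`, `DedupResult`, and `existing_keys` are hypothetical. It also assumes that a record counts as a duplicate when its key is already present in a set of known keys, which may not match the exact semantics of --filter-dupes.

```python
from dataclasses import dataclass
from typing import Any, Hashable, Iterable, List, Set, Tuple

Record = Tuple[Hashable, Any]  # (record key, payload); hypothetical shape


@dataclass(frozen=True)
class DedupResult:
    kept: List[Record]      # records that survive the filter
    num_input: int          # records read in this batch
    num_duplicates: int     # records dropped as duplicates


def filter_dupes(records: Iterable[Record], existing_keys: Set[Hashable]) -> DedupResult:
    """Drop records whose key is already known and count how many were dropped.

    Assumption: "duplicate" means the key is in existing_keys.
    """
    kept: List[Record] = []
    num_input = 0
    for key, payload in records:
        num_input += 1
        if key not in existing_keys:
            kept.append((key, payload))
    # The metric the issue asks for is the difference between input and output counts.
    return DedupResult(kept, num_input, num_input - len(kept))
```

A caller could then emit `num_duplicates` once per commit through whatever metrics reporter is in use, for example as a counter or gauge tagged with the commit time.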

JIRA info

Metadata

Assignees

No one assigned
