Add a new config that controls whether the input RDD is persisted before insert (#15596)

When the input data comes from a complex RDD lineage, a Hudi write can trigger repeated computation of that lineage.
For example, the write path deduplicates the input records by key and, in the tag-location step, collects all partitions that will be written. Each of these passes reads the input again, so I think we should cache the data to be written for downstream use.
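To illustrate the recomputation, here is a minimal sketch in plain Python. It is not Hudi or Spark code: `LazyDataset`, `write`, `persist_input`, and `count_lineage_runs` are hypothetical names invented for this example, and the two passes only loosely mimic the dedup and partition-collection steps described above. It assumes a dataset that re-runs its lineage on every action unless it has been persisted.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

Record = Dict[str, str]


class LazyDataset:
    """Toy stand-in for an RDD: every action re-runs the lineage unless persisted."""

    def __init__(self, compute: Callable[[], List[Record]]):
        self._compute = compute
        self._persisted = False
        self._cached: Optional[List[Record]] = None

    def persist(self) -> "LazyDataset":
        self._persisted = True
        return self

    def unpersist(self) -> None:
        self._persisted = False
        self._cached = None

    def collect(self) -> List[Record]:
        if not self._persisted:
            return self._compute()          # recompute the whole lineage
        if self._cached is None:
            self._cached = self._compute()  # compute once, then reuse
        return self._cached


def write(records: LazyDataset, persist_input: bool) -> Tuple[Dict[str, Record], Set[str]]:
    """Hypothetical write path that reads its input twice."""
    if persist_input:
        records.persist()
    try:
        # Pass 1: deduplicate by record key (last record wins).
        deduped = {r["key"]: r for r in records.collect()}
        # Pass 2: collect the partitions that will be written.
        partitions = {r["partition"] for r in records.collect()}
        return deduped, partitions
    finally:
        if persist_input:
            records.unpersist()  # release the cache once the write is done


def count_lineage_runs(persist_input: bool) -> int:
    """Counts how many times the (pretend) expensive lineage is evaluated."""
    runs = 0

    def expensive_lineage() -> List[Record]:
        nonlocal runs
        runs += 1
        return [
            {"key": "a", "partition": "2022/11/01"},
            {"key": "a", "partition": "2022/11/01"},
            {"key": "b", "partition": "2022/11/02"},
        ]

    write(LazyDataset(expensive_lineage), persist_input)
    return runs
```

With the flag off, `count_lineage_runs(False)` evaluates the lineage once per pass; with it on, `count_lineage_runs(True)` evaluates it a single time. Keeping the behaviour behind a flag lets users with cheap lineages, or little spare memory, skip the cache.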

## JIRA info

- Link: https://issues.apache.org/jira/browse/HUDI-5284
- Type: New Feature
