Currently, all jobs are ingested with Trigger.Once, i.e. all data is written to a single Parquet file (per Kafka partition). Certain jobs may produce very large output files, leading to out-of-memory errors.
To prevent this, Trigger.ProcessingTime should be used.
New configuration property: writer.parquet.trigger
The expected value is the trigger interval in milliseconds.
If the value is not a number, or the property is not present, all data should be ingested at once, as is the case now.
The change should be available for both ParquetStreamWriter and ParquetPartitioningStreamWriter.
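
A minimal sketch of the intended fallback behaviour, assuming the writers are Scala code on Spark Structured Streaming (resolveTrigger and the config map are hypothetical names, not part of the existing codebase):

```scala
import scala.util.Try

import org.apache.spark.sql.streaming.Trigger

// Hypothetical helper: map the writer.parquet.trigger property to a Spark trigger.
// A missing property, or a value that does not parse as a number, falls back to
// Trigger.Once(), preserving the current ingest-everything-at-once behaviour.
def resolveTrigger(config: Map[String, String]): Trigger =
  config
    .get("writer.parquet.trigger")
    .flatMap(value => Try(value.trim.toLong).toOption)
    .map(millis => Trigger.ProcessingTime(millis))
    .getOrElse(Trigger.Once())
```

Both writers would then pass the resolved trigger to DataStreamWriter.trigger(...) when starting the streaming query.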