-
Notifications
You must be signed in to change notification settings - Fork 144
APEXMALHAR-2096: Add property blocksThreshold to limit input rate #283
Conversation
|
@ChandniSingh can you please review this? |
| dag.setAttribute(blockReader, Context.OperatorContext.PARTITIONER, new StatelessPartitioner<FSSliceReader>(readersCount)); | ||
| fileSplitter.setBlocksThreshold(readersCount); | ||
| } | ||
| if (blocksThreshold != 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you need this check as blocksThreshold is always >=1 according to constraint?
|
Test case? |
|
@DT-Priyanka |
|
As per my experience so far, 2 or 4 looks like a good number. @vrozov or anyone has any other idea please let me know. I would remove min=1 constraint and put default in AbstractFileSplitter. |
|
@DT-Priyanka |
|
Need to find a good default value and also update tests if required, closing this PR for time being. |
|
I don't think there is a good default value. It depends on a run-time environment and the amount of processing necessary. In general, it will be better to change the behavior of the downstream operator to process as many records as it can in a single window, instead of trying to process all tuples emitted by FileSplitter. |
|
That breaks idempotency of the downstream. File splitter and block reader work in conjunction so to keep things simple and not break idempotency, this property IMO should be on splitter. Priyanka, how about setting the default to 16? |
|
If user end up setting static partitioning on Readers, and sets readers count to a really small value say 1 or 2, the readers will be overloaded, reading 16 blocks in a window, and we can still see this problem.
|
|
Priyanka, I agree with point 1. |
|
@ChandniSingh can you please elaborate how item 2 is already handled. |
|
@vrozov dynamic partition of block readers is explained in the documentation of this operator. For changing the block threshold dynamically on splitter there is a ticket open. |
The blocksThreshold is already exposed by FileSplitter operator, exposing same using module.