Skip to content

Conversation

@ddanielr
Copy link
Contributor

@ddanielr ddanielr commented Feb 3, 2024

Adds a property to configure the max wait interval in the compactor as well as the minimum wait period.

Changes the logic in compaction-coordinator to use this new property when calculating the wait period for sending warning messages

closes #4217

Adds a property to configure the max wait interval in the compactor
Also changes the logic in compaction-coordinator to use this new
property when calculating the wait period for sending warning messages
@ddanielr ddanielr linked an issue Feb 3, 2024 that may be closed by this pull request
Refactored check property to match elasticity property.
Adds in min wait property for compaction job queue checks
Also use the MAX_JOB_WAIT_TIME prop for the thrift retry interval when
the compactor is unable to communicate with the compaction-coordinator.
@ddanielr ddanielr changed the title Add Compaction Max Wait property Add Compaction Job Min & Max Wait properties Feb 5, 2024
@ddanielr
Copy link
Contributor Author

Talked with @EdColeman about this PR.
The two main concerns:

  1. If the job min and max wait properties were changing the thrift communication call.
    This is not the case as the RetryableThriftCall class uses these options to configure a Retry object.
    public RetryableThriftCall(long start, long maxWaitTime, int maxNumRetries,
    RetryableThriftFunction<T> function) {
    this.function = function;
    NeedsRetryDelay builder = null;
    if (maxNumRetries == 0) {
    builder = Retry.builder().infiniteRetries();
    } else {
    builder = Retry.builder().maxRetries(maxNumRetries);
    }
    this.retry = builder.retryAfter(start, MILLISECONDS).incrementBy(0, MILLISECONDS)
    .maxWait(maxWaitTime, MILLISECONDS).backOffFactor(2).logInterval(1, TimeUnit.MINUTES)
    .createRetry();

The underlying thrift call is not using these properties and remains unaffected.

  1. Not setting maxRetries would leave the compactor in a state where it appeared hung unless a logging level of trace was used.

Attempted to correctly calculate the maxRetries period and pass that value.
Found that the calculation is prone to errors. Opened #4261 to handle that issue separately.

Upon further review, the RetryableThriftCall has logging logic that would display log messages of warn during this infinite wait state. This meets the expectation that the user can determine that the compactor is not hung and is looking for work.

Use the minJobWait value instead of a hardcoded one second value.
@ddanielr ddanielr merged commit 3298d6d into apache:2.1 Feb 16, 2024
@ddanielr ddanielr deleted the feature/4217-add-max-wait-prop branch February 16, 2024 22:02
@ctubbsii ctubbsii added this to the 2.1.3 milestone Jul 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add Compactor max wait property.

4 participants