Skip to content

Conversation

@Yuqi-Du
Copy link
Contributor

@Yuqi-Du Yuqi-Du commented Sep 16, 2025

What this PR does:
When there is a faulty row in the entire batch of insertMany, and ordered as true.
Every row before the faulty row should succeed, and all the faulty row errors should propagate to the result.

Which issue(s) this PR fixes:
fixes: #1672

Checklist

  • Changes manually tested
  • Automated Tests added/updated
  • Documentation added/updated
  • CLA Signed: DataStax CLA

@Yuqi-Du Yuqi-Du requested a review from a team as a code owner September 16, 2025 19:24
Copy link
Contributor

@tatu-at-datastax tatu-at-datastax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@hemidactylus
Copy link
Contributor

LGTM (I just tested from the perspective of a client, it's all good :)
Should there be specific testing perhaps? (clients CI can also take care of such testing)

@Yuqi-Du Yuqi-Du merged commit 8de2f9a into main Sep 24, 2025
5 of 6 checks passed
@Yuqi-Du Yuqi-Du deleted the yuqi/tables-insertMany-task-order-true branch September 24, 2025 20:39
Copy link
Contributor

@amorton amorton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not have stict ordering guarantees in the TaskGroup, and the predicate for finding the tasks before the current task can be stricter.

/**
* Checks if, given the config of the container and the current state of the tasks, the container
* should fail fast and stop processing any further tasks.
* Determines whether the given {@code targetTask} should fail fast based on the container's
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: replace container with TaskGroup

* shouldFailFast(e) → true ('c' failed earlier, so 'e' must fail fast)
* </pre>
*
* @param targetTask the task to evaluate
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we clarify, should this be the next ask to be run or is it the last task that was run ?

* <p>If sequential processing is enabled:
*
* <ul>
* <li>If the {@code targetTask} itself is in error state, it should fail fast.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this can happen if the task generated an error as it was being created, such as one insert in an insertMany with a unknown column name

* </pre>
*
* @param targetTask the task to evaluate
* @return {@code true} if the container is configured for sequential processing and the {@code
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this part is implementation logic and can be removed "if the container is configured for sequential processing"

return true;
}
// In sequential processing, if any prior task is in error state, we should fail fast
return stream()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because we are subclassing the array list it is mutatable AND there is no explicit ordering based on the position... it relies on the caller inserting the Tasks in position() order and no one else changing them.

It would be good to either 1) sort each time we iterate 2) stop subclassing and use something that supports custom order 3) stop subclassing, use an array list and sort on insertion 4) change the predicate as below.

But even if we change the predicate we still has not strictly ordered

task -> {
var failFast = taskGroup.shouldFailFast();
var failFast = taskGroup.shouldFailFast(task);
LOGGER.debug(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can we move this to be DEBUG if failFast == true otherwise TRACE

}
// In sequential processing, if any prior task is in error state, we should fail fast
return stream()
.takeWhile(task -> task != targetTask)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it may be safer and stricter that we want to get tasks that strictly have a position() that is before the position() of this task.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ordered insertMany incorrectly fails if an row has pre-execution error

5 participants