fix(parallel): handle aborted batches in map_response_to_handler#682
Conversation
luisremis
left a comment
There was a problem hiding this comment.
add a test case using the example provided here: #181 (comment)
ad-claw000
left a comment
There was a problem hiding this comment.
Added the requested test case using the example from issue 181 in test/test_ResponseHandler.py.
ad-claw000
left a comment
There was a problem hiding this comment.
I have addressed the review comment by adding the test case using the example from issue 181 in test/test_ResponseHandler.py.
There was a problem hiding this comment.
Pull request overview
This PR addresses a failure mode in ParallelQuery response handling when a batched transaction aborts early and the database returns fewer per-command responses than the number of commands issued. The change adjusts map_response_to_handler to iterate based on the actual response length to avoid calling user response_handlers with empty response slices for queries that were never executed.
Changes:
- Update
map_response_to_handlerto limit iteration bylen(response)(when the response is a list), instead of the original query length. - Adjust an existing test DB mock response shape to return one response object per command.
- Add a regression test intended to reproduce issue #181 with multi-command queries and an aborted batch.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
aperturedb/CommonLibrary.py |
Changes the response-mapping loop to iterate based on returned response length to avoid empty response slices after aborted batches. |
test/test_ResponseHandler.py |
Updates the mock response shape and adds a new regression test for the aborted-batch + multi-command response_handler scenario. |
Comments suppressed due to low confidence (2)
test/test_ResponseHandler.py:476
- This assertion is inconsistent with the current mock + batching parameters. With
batchsize=2andcommands_per_query=2,map_response_to_handlerwill only invoke the handler for groups that have at least one returned command response; with the current mock, that results in fewer than 3 handler invocations/updates, so this test will fail. Update the expected count to match the intended abort scenario (e.g., only the queries that received responses) once the mock response behavior is corrected.
generator = UpdateAndCheckImageLabel(imgids, field, new_labels, changed_ids)
querier = ParallelQuery(db)
querier.query(generator, numthreads=1, batchsize=2, stats=False)
assert len(changed_ids) == 3
test/test_ResponseHandler.py:468
- The test mock replaces
Connector.querybut does not setself.response/self.blobslike the real implementation. Since other code (e.g.,last_query_ok()) relies onself.response, this can make the test dependent on whatever response happened previously. Setself.response = responses(andself.blobs = []) inside the mock to keep behavior deterministic.
def mock_query(self, request, blobs):
responses = []
for c in request:
if "FindImage" in c:
responses.append({
"FindImage": {
"status": 0,
"count": 1,
"entities": [{"image_id": "img1"}]
}
})
return responses, []
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Use isinstance instead of issubclass(type(...)) in CommonLibrary.py - Fix mock_query to align responses with requests in test_ResponseHandler.py - Correctly capture and re-raise exceptions in test_ResponseHandler.py - Rename local variable 'id' to 'image_id' in test_ResponseHandler.py
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (1)
aperturedb/CommonLibrary.py:352
- This function now uses
isinstance(response, list)to computelimit, but later still usesissubclass(type(response), list)checks. For consistency/readability, consider using a single boolean likeis_list = isinstance(response, list)and reusing it for the later branches as well.
limit = len(response) if isinstance(response, list) else len(query)
for i in range(math.ceil(limit / commands_per_query)):
ad-claw000
left a comment
There was a problem hiding this comment.
I have addressed the review comment by adding the test case using the example from issue 181 in test/test_ResponseHandler.py.
ad-claw000
left a comment
There was a problem hiding this comment.
Applied autopep8 formatting to the new log messages to fix the pre-commit CI failure.
ad-claw000
left a comment
There was a problem hiding this comment.
Addressed the remaining feedback. Mocked clone and last_query_ok in tests, and simplified the isinstance logic in CommonLibrary.py. PTAL!
ad-claw000
left a comment
There was a problem hiding this comment.
Addressed the remaining feedback about the mocked tests. Now mocking clone() and last_query_ok()! Tests added.
ad-claw000
left a comment
There was a problem hiding this comment.
I have addressed the remaining feedback regarding by storing it in a local variable. Commit 888f8ca in aperturedb/CommonLibrary.py.
ad-claw000
left a comment
There was a problem hiding this comment.
I have addressed the remaining feedback regarding isinstance(response, list) by storing it in a local is_list variable. Commit 888f8ca in aperturedb/CommonLibrary.py.
ad-claw000
left a comment
There was a problem hiding this comment.
All actionable review comments have been addressed in commits 888f8ca and prior. Fixes applied to CommonLibrary.py and test_ResponseHandler.py.
Closes #181
If a batch query aborts prematurely, the database returns a response shorter than the number of queries issued.
map_response_to_handlerwas looping over the original query length, which resulted in passing empty lists[]as the responserfor the queries that were not executed.This PR changes the loop to iterate based on the length of
response, preventingIndexError(orKeyError) inside user-definedresponse_handlers when queries contain more than one command and fail mid-batch.