fix: resolve e2e import test concurrency races#1067
Conversation
7ad4f06 to
83ae514
Compare
83ae514 to
75b490d
Compare
75b490d to
975c5ec
Compare
Fix two independent concurrency issues causing flaky e2e import tests: 1. TOCTOU race in evaluator import (import-evaluator.ts): The beforeConfigWrite hook lists all online eval configs then fetches details for each with Promise.all. If a config is deleted between the list and get calls, the API throws 'Online evaluation configuration not found' and the entire import fails. Fixed by using Promise.allSettled and filtering out disappeared configs. 2. Resource name collisions across parallel CI shards (setup_*.py): Python setup scripts generated resource names using int(time.time()) (second-level precision). Parallel CI shards starting in the same second would collide with ConflictException. The test already passes a unique RESOURCE_SUFFIX env var but scripts ignored it for naming. Added NAME_SUFFIX to common.py that prefers RESOURCE_SUFFIX when set, and updated all setup scripts to use it.
975c5ec to
f46d632
Compare
|
Reviewed the changes — looks good to merge. The two fixes are well-scoped and correct:
Very minor nit (not blocking): |
|
removed unused time import. Good call review agent! |
Description
Two issues in running the e2e tests in parallel:
naming conflict
Resource names are using time (in seconds) to suffix, but this creates an issue when they create at the same time. Switch to UUID or millisecond precision in resource name suffix.
list then get race condition
The code performs a check that the given evaluator isn't already in an eval config, because if it is, the import will fail.
When iterating through all online eval configs in the account, we call list then get, but its possible an eval config is deleted between the list and get. This will cause the promise to reject with a ResourceNotFoundError. Therefore we can ignore these errors.
Related Issue
Closes ##1066
Documentation PR
N/A
Type of Change
Testing
How have you tested the change?
npm run test:unitandnpm run test:integnpm run typechecknpm run lintsrc/assets/, I rannpm run test:update-snapshotsand committed the updated snapshotsChecklist
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the
terms of your choice.