test: add e2e test for evaluations lifecycle by Hweinstock · Pull Request #628 · aws/agentcore-cli

Hweinstock · 2026-03-24T21:23:52Z

Description

The evaluations feature (custom evaluators, online eval configs, on-demand evals, pause/resume) is missing e2e test coverage. This PR adds a standalone e2e suite that tests the full lifecycle against real AWS infrastructure.

Also refactors the e2e test setup into reusable utility functions rather than a single monolithic function.

Known limitation: `logs evals` not included

Online eval log processing is asynchronous, and the CloudWatch log group (`/aws/bedrock-agentcore/evaluations/results/{configId}`) was not populated within 8+ minutes of polling across multiple test runs. This makes `logs evals` assertions unreliable for e2e testing.
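For context, the kind of bounded polling described above can be sketched as follows. This is a minimal illustration, not the code in this PR: `pollUntil`, `PollOptions`, and the `check` callback are hypothetical names, and the actual CloudWatch lookup is left to the caller.

```typescript
// Hypothetical helper: names and defaults are illustrative, not taken from the PR.
interface PollOptions {
  timeoutMs: number; // total time budget for all attempts
  intervalMs: number; // delay between consecutive attempts
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Calls `check` until it yields a non-null value or the time budget runs out.
// Returns null on timeout so the caller can decide whether to fail or skip.
async function pollUntil<T>(
  check: () => Promise<T | null>,
  { timeoutMs, intervalMs }: PollOptions,
): Promise<T | null> {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    const result = await check();
    if (result !== null) return result;
    // Stop if the next attempt would start after the deadline.
    if (Date.now() + intervalMs > deadline) return null;
    await sleep(intervalMs);
  }
}
```

In an e2e test, `check` would wrap whatever call reads events from the log group, and an 8-minute budget would be `timeoutMs: 8 * 60 * 1000`. Returning null instead of throwing keeps the decision to fail or skip with the test.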

Related Issue

Closes #

Documentation PR

N/A — test-only change.

Type of Change

Bug fix
New feature
Breaking change
Documentation update
Other (please describe): e2e test coverage for evaluations lifecycle (user stories 2.4, 2.5, 10.1, 10.3, 10.4)

Testing

How have you tested the change?

I ran `npm run test:unit` and `npm run test:integ`
I ran `npm run typecheck`
I ran `npm run lint`
If I modified `src/assets/`, I ran `npm run test:update-snapshots` and committed the updated snapshots

Verified against a dev account (us-east-1). All 6 tests pass in ~4 minutes.

Checklist

I have read the CONTRIBUTING document
I have added any necessary tests that prove my fix is effective or my feature works
I have updated the documentation accordingly
I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
My changes generate no new warnings
Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

tejaskash

Looks good. Clean refactoring of e2e helpers and solid evals lifecycle test coverage.

github-actions bot added the size/m PR size: M label Mar 24, 2026

Hweinstock force-pushed the feat/e2e-evals-lifecycle branch from 6eed8a7 to 61b6f9f · March 24, 2026 21:29