LCORE-1472: Add e2e retry capability #1500
Conversation
Walkthrough: The changes introduce opt-in auto-retry infrastructure for flaky e2e tests and apply the `@flaky` tag to known unstable scenarios.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~12 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (inconclusive)
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/e2e/features/environment.py`:
- Around lines 38-39: The `_E2E_FEATURE_PERF_START_ATTR` constant and its timing documentation are stale, because no timestamp is ever recorded or used. Either remove the constant and docs, or wire up the timing: set `feature._lightspeed_e2e_feature_perf_start` in the Behave `before_feature` hook, compute the duration in `after_feature` (using `time.time()` or `datetime.now()`), and store or log the result (e.g., on `feature` or `context`) so the declared feature-duration tracking is actually recorded and consumed. Update or remove any related docstrings and references accordingly.
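If the timing is wired up rather than removed, a minimal sketch of the hook pair could look like the following. The attribute name mirrors the constant flagged above, `time.monotonic()` is used here as a robustness choice over `time.time()`, and all names are illustrative rather than the PR's actual code:

```python
import time

# Illustrative attribute name mirroring the flagged constant;
# the real codebase may name it differently.
_PERF_START_ATTR = "_lightspeed_e2e_feature_perf_start"


def before_feature(context, feature):
    """Record a start timestamp on the feature before it runs."""
    setattr(feature, _PERF_START_ATTR, time.monotonic())


def after_feature(context, feature):
    """Compute and store the feature duration so it is actually consumed."""
    start = getattr(feature, _PERF_START_ATTR, None)
    if start is None:
        return  # Timestamp was never recorded; nothing to report.
    duration = time.monotonic() - start
    # Store the duration on the feature so reporting code can pick it up.
    feature.perf_duration_seconds = duration
    print(f"Feature {feature.name!r} took {duration:.2f}s")
```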
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 7c1d5cb2-4f4e-4894-83eb-952383ac648d
📒 Files selected for processing (7)
- tests/e2e/features/environment.py
- tests/e2e/features/mcp.feature
- tests/e2e/features/query.feature
- tests/e2e/features/responses.feature
- tests/e2e/features/responses_streaming.feature
- tests/e2e/features/steps/llm_query_response.py
- tests/e2e/features/streaming_query.feature
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
- GitHub Check: unit_tests (3.13)
- GitHub Check: unit_tests (3.12)
- GitHub Check: build-pr
- GitHub Check: integration_tests (3.12)
- GitHub Check: list_outdated_dependencies
- GitHub Check: integration_tests (3.13)
- GitHub Check: Pylinter
- GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
- GitHub Check: E2E: library mode / ci
- GitHub Check: E2E: server mode / ci
- GitHub Check: E2E Tests for Lightspeed Evaluation job
🧰 Additional context used
📓 Path-based instructions (2)
tests/e2e/features/**/*.feature
📄 CodeRabbit inference engine (AGENTS.md)
Use behave (BDD) framework with Gherkin feature files for end-to-end tests
Files:
- tests/e2e/features/query.feature
- tests/e2e/features/responses.feature
- tests/e2e/features/streaming_query.feature
- tests/e2e/features/responses_streaming.feature
- tests/e2e/features/mcp.feature
tests/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
tests/**/*.py: Use pytest for all unit and integration tests; do not use unittest
Use the `pytest.mark.asyncio` marker for async unit tests
Files:
- tests/e2e/features/steps/llm_query_response.py
- tests/e2e/features/environment.py
🧠 Learnings (9)
📓 Common learnings
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-04-05T12:19:36.009Z
Learning: Applies to tests/e2e/features/**/*.feature : Use behave (BDD) framework with Gherkin feature files for end-to-end tests
📚 Learning: 2026-04-05T12:19:36.009Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-04-05T12:19:36.009Z
Learning: Applies to tests/e2e/features/**/*.feature : Use behave (BDD) framework with Gherkin feature files for end-to-end tests
Applied to files:
tests/e2e/features/query.feature
📚 Learning: 2026-04-05T12:19:36.009Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-04-05T12:19:36.009Z
Learning: Place E2E step definitions in `tests/e2e/features/steps/`
Applied to files:
tests/e2e/features/query.feature
📚 Learning: 2026-04-05T12:19:36.009Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-04-05T12:19:36.009Z
Learning: Maintain E2E test list in `tests/e2e/test_list.txt`
Applied to files:
tests/e2e/features/query.feature
📚 Learning: 2025-09-02T11:09:40.404Z
Learnt from: radofuchs
Repo: lightspeed-core/lightspeed-stack PR: 485
File: tests/e2e/features/environment.py:87-95
Timestamp: 2025-09-02T11:09:40.404Z
Learning: In the lightspeed-stack e2e tests, noop authentication tests use the default lightspeed-stack.yaml configuration, while noop-with-token tests use the Authorized tag to trigger a config swap to the specialized noop-with-token configuration file.
Applied to files:
- tests/e2e/features/query.feature
- tests/e2e/features/mcp.feature
📚 Learning: 2025-09-02T11:15:02.411Z
Learnt from: radofuchs
Repo: lightspeed-core/lightspeed-stack PR: 485
File: tests/e2e/test_list.txt:2-3
Timestamp: 2025-09-02T11:15:02.411Z
Learning: In the lightspeed-stack e2e tests, the Authorized tag is intentionally omitted from noop authentication tests because they are designed to test against the default lightspeed-stack.yaml configuration rather than the specialized noop-with-token configuration.
Applied to files:
tests/e2e/features/query.feature
📚 Learning: 2026-04-07T09:20:26.590Z
Learnt from: radofuchs
Repo: lightspeed-core/lightspeed-stack PR: 1467
File: tests/e2e/features/steps/common.py:36-49
Timestamp: 2026-04-07T09:20:26.590Z
Learning: For Behave-based Python tests, rely on Behave’s Context layered stack for attribute lifecycle: Behave pushes a new Context layer when entering feature scope (before_feature) and again for scenario scope (before_scenario). Attributes assigned inside given/when/then steps live on the current scenario layer and are automatically removed when the scenario ends. As a result, step-set attributes should not be expected to persist across scenarios or features, and manual cleanup in after_scenario/after_feature is generally unnecessary for attributes set in step functions. Only perform manual cleanup for attributes that you set explicitly in before_feature/before_scenario, since those live on the respective feature/scenario layers.
Applied to files:
- tests/e2e/features/steps/llm_query_response.py
- tests/e2e/features/environment.py
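The layered-stack lifecycle described in this learning can be pictured with a `ChainMap` analogy — a rough model only, not Behave's actual `Context` implementation:

```python
from collections import ChainMap

# Rough analogy: Behave pushes a new layer per scope; attributes set in
# steps land on the innermost (scenario) layer and vanish when it pops.
feature_layer = {"feature_config": "default.yaml"}  # set in before_feature
scenario_layer = {}                                 # pushed in before_scenario
context_view = ChainMap(scenario_layer, feature_layer)

# A step assigns an attribute: it lands on the scenario layer.
scenario_layer["response"] = "hello"
assert context_view["response"] == "hello"

# Scenario ends: the scenario layer is popped; step-set attributes are gone,
# so no manual after_scenario cleanup is needed for them.
context_view = ChainMap(feature_layer)
assert "response" not in context_view
assert context_view["feature_config"] == "default.yaml"
```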
📚 Learning: 2026-04-13T13:39:54.963Z
Learnt from: radofuchs
Repo: lightspeed-core/lightspeed-stack PR: 1490
File: tests/e2e/features/environment.py:206-211
Timestamp: 2026-04-13T13:39:54.963Z
Learning: In lightspeed-stack E2E tests under tests/e2e/features, it is intentional to set context.feature_config inside Background/step functions (scenario-scoped Behave layer). The environment.py after_scenario restore logic should only restore configuration when context.scenario_lightspeed_override_active is True; this flag is set by configure_service only when a real config switch occurs (so restore does not run for scenarios without a switch). Additionally, steps/common.py’s module-level _active_lightspeed_stack_config_basename is used to prevent re-applying the same config across subsequent scenarios, ensuring scenario_lightspeed_override_active stays False after the first apply. Therefore, reviewers should not “fix” this flow as if feature_config were incorrectly scoped or if after_scenario restoration is missing—config switching and restoration are meant to happen exactly once per actual switch, not redundantly per scenario.
Applied to files:
- tests/e2e/features/steps/llm_query_response.py
- tests/e2e/features/environment.py
📚 Learning: 2026-02-19T10:06:58.708Z
Learnt from: radofuchs
Repo: lightspeed-core/lightspeed-stack PR: 1181
File: tests/e2e-prow/rhoai/manifests/lightspeed/mock-jwks.yaml:32-34
Timestamp: 2026-02-19T10:06:58.708Z
Learning: In tests/e2e-prow/rhoai/, ConfigMaps like mock-jwks-script and mcp-mock-server-script are created dynamically by the pipeline.sh deployment script using `oc create configmap` commands, rather than being defined as static ConfigMap resources in the manifest YAML files.
Applied to files:
tests/e2e/features/mcp.feature
🔇 Additional comments (7)
tests/e2e/features/steps/llm_query_response.py (1)
28-30: Good consistency improvement for fragment assertions. The case-insensitive helper is applied consistently across all three fragment-checking steps, which should reduce brittle failures caused only by response capitalization while preserving clear failure output.
Also applies to: 246-247, 259-262, 272-273, 298-301, 311-312, 322-325
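A case-insensitive fragment assertion of the kind praised here can be sketched as follows. The helper name and signature are assumptions; the PR's actual helper in `llm_query_response.py` may differ:

```python
def assert_response_contains(response_text: str, fragment: str) -> None:
    """Assert that a fragment appears in the response, ignoring case.

    Keeps failure output clear by echoing both the fragment and the
    full response in the assertion message.
    """
    if fragment.lower() not in response_text.lower():
        raise AssertionError(
            f"Fragment {fragment!r} not found in response: {response_text!r}"
        )
```

For non-ASCII text, `str.casefold()` would be a slightly more robust comparison than `str.lower()`.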
tests/e2e/features/responses.feature (1)
337-337: `@flaky` tagging is correctly scoped. Nice targeted application of retry behavior to the known unstable scenario without changing its test intent.
tests/e2e/features/query.feature (1)
13-13: Flaky annotations look appropriate for these query scenarios. The change is focused and keeps scenario behavior unchanged while enabling retry only where needed.
Also applies to: 26-26
tests/e2e/features/responses_streaming.feature (1)
328-328: Good targeted retry enablement for streaming conversation-flow scenarios. These `@flaky` tags are applied narrowly to the high-variance paths and preserve existing assertions.
Also applies to: 359-359, 381-381
tests/e2e/features/streaming_query.feature (1)
22-22: Scoped `@flaky` updates look good. Retry is enabled only for the unstable streaming-query prompt-variant scenarios, with no behavioral drift.
Also applies to: 36-36
tests/e2e/features/environment.py (1)
374-382: Flaky autoretry wiring is clean and well-guarded. The env override parsing with safe fallback plus targeted patching of `@flaky` scenarios is a solid implementation.
tests/e2e/features/mcp.feature (1)
21-21: `@flaky` tagging across MCP success-path scenarios is consistent and appropriate. This is a focused reliability change that preserves scenario logic while enabling retry where intermittent infrastructure variance is expected.
Also applies to: 38-38, 130-130, 147-147, 241-241, 261-261, 291-291, 308-308, 409-409, 429-429
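The environment.py pattern commented on above — an env override with safe fallback, plus patching only `@flaky` scenarios — can be sketched like this. The env var name and default below are assumptions; `patch_scenario_with_autoretry` is Behave's real contrib helper:

```python
import os

# Hypothetical env var name and default; the PR may use different ones.
DEFAULT_MAX_ATTEMPTS = 2


def resolve_max_attempts(env_var: str = "E2E_FLAKY_MAX_ATTEMPTS") -> int:
    """Parse the retry budget from the environment, falling back safely."""
    raw = os.environ.get(env_var, "")
    try:
        attempts = int(raw)
    except ValueError:
        # Unset or malformed value: fall back rather than crash the run.
        return DEFAULT_MAX_ATTEMPTS
    # Guard against nonsensical values such as 0 or negatives.
    return attempts if attempts >= 1 else DEFAULT_MAX_ATTEMPTS


def before_feature(context, feature):
    """Patch only @flaky scenarios so Behave re-runs them on failure."""
    from behave.contrib.scenario_autoretry import patch_scenario_with_autoretry

    for scenario in feature.scenarios:
        if "flaky" in scenario.effective_tags:
            patch_scenario_with_autoretry(
                scenario, max_attempts=resolve_max_attempts()
            )
```

Keeping the patching inside `before_feature` means untagged scenarios run exactly once, so retry behavior stays opt-in per scenario.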
* add retry functionality to flaky e2e tests
---------
Co-authored-by: Radovan Fuchs <rfuchs@rfuchs-thinkpadp1gen7.tpb.csb>