LCORE-758: add additional e2e test for streaming query endpoint and enable color output #745

radofuchs · 2025-10-31T11:34:22Z

add additional e2e test for streaming query endpoint and enable color output

Description

Type of change

Related Tickets & Documents

Related Issue #
Closes #

Checklist before requesting a review

I have performed a self-review of my code.
PR has passed all pre-merge test jobs.
If it is a core feature, I have added thorough tests.

Testing

Please provide detailed steps to perform tests related to this code change.
How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

Bug Fixes
- Improved error messages and assertions for streaming response validation
- Refined token handling in response streaming to skip initial protocol messages
Tests
- Added authorization header validation to streaming query tests
- Introduced authentication failure scenario testing
- Enhanced test output formatting with colored, pretty-printed results

…ed output

coderabbitai · 2025-10-31T11:34:35Z

Walkthrough

This PR enhances e2e tests with authorization header validation for streaming queries, improves terminal output visibility through environment configuration, refines streaming response token processing with first-token skipping logic, and adds validation for unauthenticated requests.

Changes

Cohort / File(s)	Summary
Test Environment Configuration `.github/workflows/e2e_tests.yaml`, `Makefile`	Adds TERM and FORCE_COLOR environment variables to e2e test execution and wraps command with script utility for enhanced terminal output and color formatting
Streaming Query Test Scenarios `tests/e2e/features/streaming_query.feature`	Adds Authorization header requirements across streaming query test scenarios and introduces new scenario to validate unauthenticated request rejection with 400 status
Streaming Response Step Implementation `tests/e2e/features/steps/llm_query_response.py`	Introduces first_token flag to skip initial token in streaming responses and adds explicit error messages to assertion validations

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

The first_token flag logic should be verified to handle all streaming response edge cases correctly
Authorization header changes across multiple test scenarios should be confirmed for consistency
Ensure terminal output configuration (TERM, FORCE_COLOR, script wrapper) doesn't inadvertently mask or obscure test failures

Possibly related PRs

LCORE-488: Add E2E tests for query endpoint #616: Directly modifies e2e streaming query tests and llm_query_response.py with complementary authorization and streaming token handling changes
LCORE-303: fixes made in E2E tests #228: Modifies the same llm_query_response.py file with related streaming assertion and precondition handling
LCORE-489: E2E tests for streaming query endpoint #539: Modifies streaming query feature and step files with related streaming response parsing and test scenario enhancements

Suggested reviewers

tisnik

Poem

🐰 Streams flow with auth so bright,
Colors dance in TERM's light,
Skip the first, take the rest,
E2E tests now pass the test! ✨

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check	✅ Passed	The pull request title "LCORE-758: add additional e2e test for streaming query endpoint and enable color output" directly aligns with the main objectives and changes in the PR. The title accurately captures the two primary changes: the addition of new e2e tests for the streaming query endpoint (visible in the streaming_query.feature file with new authorization scenarios and an unauthenticated request test) and the enablement of color output (visible in the .github/workflows/e2e_tests.yaml and Makefile modifications). The supporting changes to the streaming parser logic in llm_query_response.py are implementation details that enable the e2e tests to function correctly, rather than primary objectives. The title is specific, concise, and clearly conveys the main purpose of the changeset.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (2)

Makefile (1)
24-24: Consider the impact of removing PYTHONDONTWRITEBYTECODE.

The previous command set PYTHONDONTWRITEBYTECODE=1 to prevent Python bytecode (.pyc) files from being created. Removing this may allow .pyc files to be generated during e2e test runs, which could affect test reproducibility or cause clutter in the test directories.

If you want to maintain the previous behavior, apply this diff:
-	script -q -e -c "uv run behave --color --format pretty --tags=-skip -D dump_errors=true @tests/e2e/test_list.txt"
+	PYTHONDONTWRITEBYTECODE=1 script -q -e -c "uv run behave --color --format pretty --tags=-skip -D dump_errors=true @tests/e2e/test_list.txt"
tests/e2e/features/steps/llm_query_response.py (1)
182-185: Consider adding validation before skipping the first token.

Instead of unconditionally skipping the first token, consider validating that it actually contains the expected shield/status message before discarding it. This would make the code more robust and provide better error messages if the expected format changes.

Apply this diff to add validation:
                 elif event == "token":
-                    # Skip the first token (shield status message)
                     if first_token:
                         first_token = False
-                        continue
+                        # Skip the first token if it's a shield/status message
+                        token = data["data"]["token"]
+                        # Add your shield message detection logic here
+                        # For now, skip if empty or matches known pattern
+                        if not token or token.strip() == "":
+                            continue
                     full_response_split.append(data["data"]["token"])

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 980e871 and 09fbe04.

📒 Files selected for processing (4)

.github/workflows/e2e_tests.yaml (1 hunks)
Makefile (1 hunks)
tests/e2e/features/steps/llm_query_response.py (3 hunks)
tests/e2e/features/streaming_query.feature (9 hunks)

🧰 Additional context used

📓 Path-based instructions (4)

tests/e2e/features/**/*.feature

📄 CodeRabbit inference engine (CLAUDE.md)

Write E2E tests as Gherkin feature files for behave

Files:

tests/e2e/features/streaming_query.feature

**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: All modules start with descriptive module-level docstrings explaining purpose
Use logger = logging.getLogger(name) for module logging after import logging
Define type aliases at module level for clarity
All functions require docstrings with brief descriptions
Provide complete type annotations for all function parameters and return types
Use typing_extensions.Self in model validators where appropriate
Use modern union syntax (str | int) and Optional[T] or T | None consistently
Function names use snake_case with descriptive, action-oriented prefixes (get_, validate_, check_)
Avoid in-place parameter modification; return new data structures instead of mutating arguments
Use appropriate logging levels: debug, info, warning, error with clear messages
All classes require descriptive docstrings explaining purpose
Class names use PascalCase with conventional suffixes (Configuration, Error/Exception, Resolver, Interface)
Abstract base classes should use abc.ABC and @AbstractMethod for interfaces
Provide complete type annotations for all class attributes
Follow Google Python docstring style for modules, classes, and functions, including Args, Returns, Raises, Attributes sections as needed

Files:

tests/e2e/features/steps/llm_query_response.py

tests/e2e/features/steps/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Place behave step definitions under tests/e2e/features/steps/

Files:

tests/e2e/features/steps/llm_query_response.py

tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Use pytest-mock to create AsyncMock objects for async interactions in tests
Use the shared auth mock constant: MOCK_AUTH = ("mock_user_id", "mock_username", False, "mock_token") in tests

Files:

tests/e2e/features/steps/llm_query_response.py

🧠 Learnings (5)

📚 Learning: 2025-09-02T11:15:02.411Z

Learnt from: radofuchs
Repo: lightspeed-core/lightspeed-stack PR: 485
File: tests/e2e/test_list.txt:2-3
Timestamp: 2025-09-02T11:15:02.411Z
Learning: In the lightspeed-stack e2e tests, the Authorized tag is intentionally omitted from noop authentication tests because they are designed to test against the default lightspeed-stack.yaml configuration rather than the specialized noop-with-token configuration.

Applied to files:

tests/e2e/features/streaming_query.feature

📚 Learning: 2025-09-02T11:09:40.404Z

Learnt from: radofuchs
Repo: lightspeed-core/lightspeed-stack PR: 485
File: tests/e2e/features/environment.py:87-95
Timestamp: 2025-09-02T11:09:40.404Z
Learning: In the lightspeed-stack e2e tests, noop authentication tests use the default lightspeed-stack.yaml configuration, while noop-with-token tests use the Authorized tag to trigger a config swap to the specialized noop-with-token configuration file.

Applied to files:

tests/e2e/features/streaming_query.feature

📚 Learning: 2025-09-18T16:46:33.353Z

Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-09-18T16:46:33.353Z
Learning: Applies to tests/e2e/test_list.txt : Maintain the E2E test list in tests/e2e/test_list.txt

Applied to files:

.github/workflows/e2e_tests.yaml
Makefile

📚 Learning: 2025-09-18T16:46:33.353Z

Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-09-18T16:46:33.353Z
Learning: Applies to tests/e2e/features/**/*.feature : Write E2E tests as Gherkin feature files for behave

Applied to files:

.github/workflows/e2e_tests.yaml
Makefile

📚 Learning: 2025-09-18T16:46:33.353Z

Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-09-18T16:46:33.353Z
Learning: Applies to tests/{unit,integration}/**/*.py : Use pytest for all unit and integration tests

Applied to files:

Makefile

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)

GitHub Check: build-pr
GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
GitHub Check: e2e_tests (azure)
GitHub Check: e2e_tests (ci)

🔇 Additional comments (8)

.github/workflows/e2e_tests.yaml (1)

196-198: LGTM! Proper configuration for color output in CI.

The environment variables correctly enable 256-color terminal support and force color output in the CI environment, which aligns with the Makefile changes that add --color and --format pretty flags to behave.

tests/e2e/features/streaming_query.feature (4)

1-1: LGTM! Feature-level authorization tag added.

The @Authorized tag correctly enables authorization testing for all scenarios in this feature. Based on learnings, this triggers the specialized noop-with-token configuration for these tests.

46-65: Verify authorization header usage in conversation continuation scenario.

Line 48 sets the Authorization header, but lines 49 and 55 use the plain "ask question" step instead of "ask question with authorization header" like other scenarios. This inconsistency may indicate:

The authorization header should be automatically carried over in the same conversation (if that's the intended behavior)

These steps should be updated to use "with authorization header" for consistency

Please clarify the intended behavior for authorization in multi-turn conversations.

125-135: LGTM! Proper validation of missing authorization.

This new scenario correctly tests that requests without the Authorization header are rejected with a 400 status and appropriate error message. This complements the authenticated scenarios and ensures proper authorization enforcement.

11-12: Truncated JWT token is an intentional test fixture.

The token eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6Ikpva is consistently reused across all scenarios in this file and other feature files in the test suite (query.feature, conversations.feature, feedback.feature, authorized_noop*.feature). This uniformity confirms it is a deliberate test fixture, not an accidental truncation.

The tests verify API endpoint behavior when the Authorization header is provided or absent—they do not validate JWT token structure or content. Using a simplified token for testing is standard practice when the auth mechanism is not the focus of the test. The actual JWT validation occurs in the application's auth middleware, which is outside the scope of these endpoint tests.

Likely an incorrect or invalid review comment.

tests/e2e/features/steps/llm_query_response.py (2)

158-161: LGTM! Enhanced assertions improve test failure diagnostics.

The explicit error messages provide clear information when tests fail, making it easier to diagnose issues. The comparison output showing both responses is particularly helpful for debugging mismatches.

171-171: Document or conditionally handle the first-token skip logic in the e2e streaming response parser.

The code unconditionally skips the first "token" event in _parse_streaming_response, citing a "shield status message." However:

No documentation explains why or which providers/configurations require this skip

No unit tests verify this behavior

Different LLM providers may have different streaming response formats, making this assumption unreliable

The "turn_complete" event already contains the full response, so this skip only affects intermediate tokens

Either document the requirement with specific provider details, make the behavior configurable based on the provider being tested, or verify token content before skipping (e.g., check if it matches a known shield/status pattern).
Makefile (1)
24-24: ---

script utility is not available in CI—this will cause e2e tests to fail.

The script utility from util-linux is not installed in the GitHub Actions runner. The command script -q -e -c "uv run behave ..." will fail immediately.

Either install util-linux before running tests, or remove the script wrapper and handle output/exit codes directly:
uv run behave --color --format pretty --tags=-skip -D dump_errors=true @tests/e2e/test_list.txt
⛔ Skipped due to learnings
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-09-18T16:46:33.353Z
Learning: Applies to tests/e2e/test_list.txt : Maintain the E2E test list in tests/e2e/test_list.txt
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-09-18T16:46:33.353Z
Learning: Applies to tests/e2e/features/**/*.feature : Write E2E tests as Gherkin feature files for behave

tisnik

LGTM

add additional e2e test for streaming query endpoint and enable color…

09fbe04

…ed output

radofuchs requested a review from tisnik October 31, 2025 11:34

coderabbitai bot reviewed Oct 31, 2025

View reviewed changes

tisnik approved these changes Oct 31, 2025

View reviewed changes

tisnik merged commit 9cce972 into lightspeed-core:main Oct 31, 2025
20 of 22 checks passed

coderabbitai bot mentioned this pull request Nov 4, 2025

fix RHELAI tests and add assert messages to make test debugging easier #757

Merged

18 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

LCORE-758: add additional e2e test for streaming query endpoint and enable color output #745

LCORE-758: add additional e2e test for streaming query endpoint and enable color output #745

Uh oh!

radofuchs commented Oct 31, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Oct 31, 2025 •

edited

Loading

Uh oh!

coderabbitai bot left a comment

Uh oh!

tisnik left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

LCORE-758: add additional e2e test for streaming query endpoint and enable color output #745

LCORE-758: add additional e2e test for streaming query endpoint and enable color output #745

Uh oh!

Conversation

radofuchs commented Oct 31, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Type of change

Related Tickets & Documents

Checklist before requesting a review

Testing

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Oct 31, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

Pre-merge checks and finishing touches

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

tisnik left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

radofuchs commented Oct 31, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Oct 31, 2025 •

edited

Loading