Conversation

@omertuc omertuc commented Aug 14, 2025

Description

The query endpoint would treat an empty conversation ID as not provided,
but the streaming query endpoint would treat it as if it were an ID
(which doesn't belong to anyone, so it would fail).

Align the behavior of the streaming query endpoint with the query
endpoint, so when users provide a "conversation_id" with an empty
string, it will be as if they didn't provide it at all.
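The difference between the two guard styles can be sketched as follows. The helper names `should_validate_strict` and `should_validate_truthy` are hypothetical and exist only for illustration; they are not taken from the codebase.

```python
from typing import Optional


def should_validate_strict(conversation_id: Optional[str]) -> bool:
    """Previous streaming behavior: only a missing (None) ID skips validation."""
    return conversation_id is not None


def should_validate_truthy(conversation_id: Optional[str]) -> bool:
    """Aligned behavior: both None and "" skip validation."""
    return bool(conversation_id)


# Compare the two guards for an absent, an empty, and a non-empty ID.
for value in (None, "", "123e4567-e89b-12d3-a456-426614174000"):
    print(repr(value), should_validate_strict(value), should_validate_truthy(value))
```

Only the empty string is classified differently: the strict guard sends it to ownership validation, while the truthy guard treats it as absent.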

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement

Related Tickets & Documents

  • Related Issue #
  • Closes #

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

Manual testing

Summary by CodeRabbit

  • Bug Fixes
    • Adjusted conversation ID handling in streaming queries: validation now occurs only when a non-empty ID is provided.
    • Prevents unintended 403 errors for requests with empty or missing conversation IDs; valid, non-empty IDs continue to be checked as before.
    • Requests with empty strings or other falsy IDs are treated as having no conversation context.
    • No changes to streaming behavior or response format beyond this validation refinement.

coderabbitai bot commented Aug 14, 2025

Walkthrough

Adjusted a guard condition in streaming_query_endpoint_handler to use a truthiness check on conversation_id instead of an explicit None comparison, which changes when ownership validation executes. All other logic, including 403 handling within the validation block and the downstream streaming flow, remains unchanged.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Streaming query endpoint: src/app/endpoints/streaming_query.py | Changed condition from if query_request.conversation_id is not None to if query_request.conversation_id, affecting execution of conversation ownership validation for falsy values (e.g., empty string). No other flow changes. |

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant Endpoint
  participant OwnershipValidator as Ownership Validator
  participant Streamer as Streaming Logic

  Client->>Endpoint: POST /streaming_query (conversation_id?)
  alt conversation_id is truthy
    Endpoint->>OwnershipValidator: validate(conversation_id)
    alt not owner
      OwnershipValidator-->>Endpoint: Forbidden (403)
      Endpoint-->>Client: 403
    else owner
      OwnershipValidator-->>Endpoint: OK
      Endpoint->>Streamer: execute streaming query
      Streamer-->>Endpoint: stream chunks
      Endpoint-->>Client: stream response
    end
  else conversation_id is falsy (None, "", 0, etc.)
    Note over Endpoint: Skip ownership validation
    Endpoint->>Streamer: execute streaming query
    Streamer-->>Endpoint: stream chunks
    Endpoint-->>Client: stream response
  end
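The flow in the diagram can be approximated in a few lines of plain Python. This is a minimal sketch under stated assumptions: `OWNERS`, `Forbidden`, and `handle_streaming_query` are hypothetical stand-ins, not the project's actual validator, exception type, or handler.

```python
from typing import Iterator, Optional

# Hypothetical ownership table: conversation ID -> owning user ID.
OWNERS = {"conv-1": "alice"}


class Forbidden(Exception):
    """Stand-in for an HTTP 403 response."""


def handle_streaming_query(user_id: str, conversation_id: Optional[str]) -> Iterator[str]:
    # Validate ownership only when a non-empty ID was supplied.
    if conversation_id:
        if OWNERS.get(conversation_id) != user_id:
            raise Forbidden(f"user {user_id!r} does not own {conversation_id!r}")
    # Falsy IDs (None, "") skip validation and proceed straight to streaming.
    return iter(["chunk-1", "chunk-2"])
```

With this sketch, `handle_streaming_query("bob", "")` streams normally, while `handle_streaming_query("bob", "conv-1")` raises `Forbidden`.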

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~7 minutes

Poem

I twitch my ears at a tiny tweak,
A truthy hop, a falsy squeak—
Guarded paths now shift their way,
Empty burrows skip the fray.
Streams still flow, carrots still sweet,
Commit hops in—change complete! 🥕🐇

@eranco74 eranco74 left a comment

/lgtm

@tisnik tisnik left a comment

It looks OK. But is this just a /streaming_query problem and not a /query one?

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🔭 Outside diff range comments (1)
src/app/endpoints/streaming_query.py (1)

386-393: Add regression tests for empty conversation_id ('') and parity with /query — existing tests cover None but not ""

Short summary: rg output shows existing tests exercise conversation_id=None for both /query and /streaming_query, but I couldn't find tests that send conversation_id == "" (empty string). Add tests to ensure empty string is treated like None (no 403, starts a new conversation) and that /streaming_query matches /query behavior.

Places to update/add tests:

  • tests/unit/app/endpoints/test_streaming_query.py — existing None coverage (see tests around the get_agent/retrieve_response assertions); add a test that posts QueryRequest(conversation_id="") and asserts that no 403 is raised and that a new conversation_id is returned in the stream start event.
  • tests/unit/app/endpoints/test_query.py — existing None coverage; add a test that calls query_endpoint_handler with QueryRequest(conversation_id="") and assert response.conversation_id is a new ID (no 403).
  • tests/unit/utils/test_endpoints.py — there are get_agent tests for conversation_id=None (e.g., test_get_agent_no_conversation_id); add a corresponding test for conversation_id="".

Relevant code locations that show current behavior (verify before changing tests):

  • src/models/requests.py @field_validator("conversation_id") (lines ~187–193) — validator uses truthiness (if value) so "" skips UUID check.
  • src/app/endpoints/query.py (lines ~161–177) — endpoint checks if query_request.conversation_id: (truthy check), so "" will be treated like no conversation_id.
  • src/app/endpoints/streaming_query.py (lines ~401–416) — same truthy check and 403 path only taken when conversation_id is truthy and ownership fails.

Suggested minimal test templates (adapt to existing fixtures/mocks):

  • For /query:
    • Patch retrieve_response to return ("LLM answer", "new_conv_id")
    • Call query_endpoint_handler(QueryRequest(query="x", conversation_id=""), auth=MOCK_AUTH)
    • Assert response.conversation_id == "new_conv_id" and no HTTPException raised
  • For /streaming_query:
    • Patch retrieve_response to return (mock_streaming_iterable, "new_conv_id")
    • Call streaming_query_endpoint_handler(..., query_request=QueryRequest(query="x", conversation_id=""), auth=MOCK_AUTH, mcp_headers={})
    • Assert the returned StreamingResponse yields a start event containing conversation_id == "new_conv_id" and no 403
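The shape of such a regression test might look like the sketch below. It is deliberately self-contained: `FakeQueryRequest`, `FakeForbidden`, and `fake_query_handler` are hypothetical stubs standing in for the project's `QueryRequest`, `HTTPException`, and endpoint handlers, whose fixtures and mocks are not shown in this thread. A real test would patch `retrieve_response` and call the actual handler instead.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeQueryRequest:
    """Hypothetical stand-in for QueryRequest."""
    query: str
    conversation_id: Optional[str] = None


class FakeForbidden(Exception):
    """Stand-in for an HTTPException with status 403."""


def fake_query_handler(request: FakeQueryRequest) -> str:
    """Stub endpoint: returns the conversation ID that would be used."""
    if request.conversation_id:
        # A real handler would check ownership here; the stub rejects every ID.
        raise FakeForbidden(request.conversation_id)
    return "new_conv_id"


def test_empty_conversation_id_is_treated_as_absent() -> None:
    # "" and None must behave identically: no 403, a new conversation starts.
    for cid in ("", None):
        request = FakeQueryRequest(query="x", conversation_id=cid)
        assert fake_query_handler(request) == "new_conv_id"


test_empty_conversation_id_is_treated_as_absent()
```

Running the same parametrized check against both handlers is one way to keep /query and /streaming_query from drifting apart again.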

🧹 Nitpick comments (2)
src/app/endpoints/streaming_query.py (2)

25-26: Normalize conversation_id at the model layer for global consistency.

To guarantee uniform behavior across all endpoints and callers, normalize empty/blank conversation_id to None in QueryRequest. This removes the need for endpoint-level guards and prevents accidental propagation of blank IDs.

Example with Pydantic v2:

from pydantic import BaseModel, field_validator

class QueryRequest(BaseModel):
    conversation_id: str | None = None
    # ... other fields ...

    @field_validator('conversation_id', mode='before')
    @classmethod
    def empty_str_to_none(cls, v):
        if v is None:
            return None
        if isinstance(v, str) and not v.strip():
            return None
        return v

If you’re on Pydantic v1, use @validator('conversation_id', pre=True) with equivalent logic.
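To see what the validator body does with typical inputs, the same normalization can be exercised as a plain function with no Pydantic dependency. `normalize_conversation_id` is a hypothetical name used here for illustration only.

```python
from typing import Optional


def normalize_conversation_id(value: Optional[str]) -> Optional[str]:
    """Map None, "" and whitespace-only strings to None; pass other values through."""
    if value is None:
        return None
    if isinstance(value, str) and not value.strip():
        return None
    return value


print(normalize_conversation_id(""))     # None
print(normalize_conversation_id("   "))  # None
print(normalize_conversation_id("abc"))  # abc
```

Note that this variant also treats whitespace-only strings as absent, which is slightly broader than the truthiness check in the endpoint, where `"   "` is truthy.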


489-489: Set correct SSE Content-Type for StreamingResponse.

Explicitly mark the response as Server-Sent Events to avoid client/parsing issues.

-        return StreamingResponse(response_generator(response))
+        return StreamingResponse(
+            response_generator(response),
+            media_type="text/event-stream",
+        )

Optionally consider headers like Cache-Control: no-cache and Connection: keep-alive depending on your deployment/proxy.
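For context, a Server-Sent Events body is a sequence of `data:` lines, each event terminated by a blank line. The following is a minimal, framework-free sketch of that framing together with the headers mentioned above. The payload shape and the names `SSE_HEADERS` and `format_sse_event` are assumptions for illustration, not the project's actual event format.

```python
import json

# Headers commonly sent with SSE responses; whether the last two are needed
# depends on the deployment and any proxies in front of the service.
SSE_HEADERS = {
    "Content-Type": "text/event-stream",
    "Cache-Control": "no-cache",
    "Connection": "keep-alive",
}


def format_sse_event(payload: dict) -> str:
    """Serialize one payload as a single SSE "data:" event."""
    # json.dumps without indent emits no literal newlines, so the payload
    # fits on one "data:" line; the trailing blank line ends the event.
    return f"data: {json.dumps(payload)}\n\n"


print(format_sse_event({"event": "start", "conversation_id": "new_conv_id"}), end="")
```

A client that sees `text/event-stream` can parse the stream incrementally, whereas a missing or generic content type may lead some clients and proxies to buffer the whole response.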

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between d09b328 and e2c910f.

📒 Files selected for processing (1)
  • src/app/endpoints/streaming_query.py (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: e2e_tests
🔇 Additional comments (2)
src/app/endpoints/streaming_query.py (2)

402-405: Treating empty conversation_id as absent is correct and aligns behavior.

Switching to a truthiness check fixes the bug where an empty string was treated as a valid ID. This brings streaming_query in line with the non-streaming endpoint.


537-545: No change required — get_agent treats empty string as “no conversation”

get_agent uses a truthy check (if conversation_id:), so an empty string is falsy and it will NOT call client.agents.retrieve; it falls through to creating/using a new agent and session.

  • src/utils/endpoints.py:78-116 — get_agent signature and logic (if conversation_id: → conditional retrieve; else sets conversation_id = agent.agent_id and creates session).
  • src/app/endpoints/streaming_query.py:537-545 — streaming endpoint forwards query_request.conversation_id verbatim (your snippet).
  • src/app/endpoints/query.py:448-452 — non-streaming endpoint normalizes conversation_id to cid (os.path.normpath + lstrip), which yields "" for empty input; behavior is consistent.

Conclusion: get_agent already treats '' the same as None — no fix required.

omertuc commented Aug 14, 2025

> It looks OK. But is this just a /streaming_query problem and not a /query one?

Depends on how you want to define the behavior... Is empty the same as absent, or is it a value that should be treated like any other? Rather arbitrary in a not-yet-v1 library.

tisnik commented Aug 14, 2025

> > It looks OK. But is this just a /streaming_query problem and not a /query one?
>
> Depends on how you want to define the behavior... Is empty the same as absent, or is it a value that should be treated like any other? Rather arbitrary in a not-yet-v1 library.

IMHO both endpoints should behave the same; the only difference is that the response is either one string or a sequence of chunks.

omertuc commented Aug 14, 2025

> > > It looks OK. But is this just a /streaming_query problem and not a /query one?
> >
> > Depends on how you want to define the behavior... Is empty the same as absent, or is it a value that should be treated like any other? Rather arbitrary in a not-yet-v1 library.
>
> IMHO both endpoints should behave the same; the only difference is that the response is either one string or a sequence of chunks.

This PR is aligning their behavior, so it makes them the same.

@tisnik tisnik left a comment

LGTM

@tisnik tisnik merged commit 7a531cb into lightspeed-core:main Aug 14, 2025
18 checks passed
3 participants