
Conversation

@rawagner rawagner commented Jul 1, 2025

Description

Before this change, a new session was created for every query, which made the assistant hardly usable.

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement

Related Tickets & Documents

  • Related Issue #
  • Closes #

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

@manstis manstis left a comment

I had a quick look out of curiosity really.

I think how a conversation_id is created needs some thought as it's really the llama-stack session_id now.

@rawagner rawagner force-pushed the fix_session branch 3 times, most recently from 25b9c40 to 215f914 on July 2, 2025 06:09

eranco74 commented Jul 2, 2025

@onmete how is this done in OLS?

tisnik commented Jul 2, 2025

@onmete how is this done in OLS?

@eranco74:

there is no Llama Stack involved in OLS.
So just conversation_id is needed:

  • when not provided in query endpoint, it is created
  • when provided, it is used as is

Nothing super special there
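
Roughly (a sketch, not OLS's actual code; the query_request name is illustrative):

import uuid

# If the client supplied a conversation_id, keep it; otherwise mint one.
conversation_id = query_request.conversation_id or str(uuid.uuid4())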

rawagner commented Jul 2, 2025

Thanks for the quick review.

I've looked into this a bit further and realized that a new Agent is created on every request, which results in empty sessions. I've added code to cache the Agents so that we can reuse the instances in subsequent requests.
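
A minimal sketch of the caching idea (assuming a module-level dict and a hypothetical get_or_create_agent helper; the actual PR may structure this differently):

from llama_stack_client.lib.agents.agent import Agent

_agents: dict[str, Agent] = {}  # conversation_id -> cached Agent

def get_or_create_agent(conversation_id: str, create_agent) -> Agent:
    # Reuse the cached Agent so follow-up requests continue the same
    # llama-stack session instead of opening a new, empty one each time.
    if conversation_id not in _agents:
        _agents[conversation_id] = create_agent()
    return _agents[conversation_id]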

I don't think this is the solution you'd want to go with, and I'm not familiar with the codebase, but at least it helps to demonstrate the problem.

manstis commented Jul 2, 2025

@rawagner I think this is a step in the right direction.

Personally, if a conversation_id was not present in the QueryRequest, I'd first create an Agent and session, returning the session_id as the conversation_id. They are one and the same thing, just with different terms: conversation_id, IIRC, is a road-core term carried into lightspeed-core, and session_id is a llama-stack term.

You are correct that we need to re-use the Agent for a conversation.

Looking at the llama-stack source code for an Agent, the persistent store (for the conversation) is created when the Agent is instantiated. If we create a new Agent for each QueryRequest, a new agent_id is assigned and a new persistent store is created, meaning we effectively lose any conversational history.
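
A hedged sketch of that flow, using the llama-stack-client Agent API (the start_conversation helper is illustrative, not the PR's code):

from llama_stack_client.lib.agents.agent import Agent

def start_conversation(client, model_id: str, system_prompt: str):
    # Instantiating the Agent creates its persistent store; create_session()
    # returns the llama-stack session_id, which can be handed back to the
    # caller as the conversation_id - they name the same thing.
    agent = Agent(client, model=model_id, instructions=system_prompt)
    session_id = agent.create_session("chat_session")
    return agent, session_id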

Similar changes will be needed for https://github.com/lightspeed-core/lightspeed-stack/blob/main/src/app/endpoints/streaming_query.py

I suspect this PR also addresses #122

Finally, and conscious of my participation on lightspeed-core, I'm just a contributor like yourself. I am not part of the lightspeed-core team and therefore review/approval/rejection of this PR is really for the likes of @tisnik to decide.

I do however believe that this PR is important, addressing a serious flaw in the existing code.

@tisnik tisnik left a comment

I'm perfectly ok with an update like this, but we'd need some integration and/or end-to-end tests for it, to be able to simulate multiple requests from different clients etc. Gimme some time please.

@rawagner rawagner force-pushed the fix_session branch 2 times, most recently from 177c221 to 8100bd1 on July 2, 2025 12:29

@manstis manstis left a comment

LGTM 👍

Thank you, @rawagner. I think this is a great improvement on your original PR.

This will however need applying to streaming_query.py too.

@tisnik tisnik left a comment

Looks correct overall, just have some nitpicks. Pls update, then we'll merge. TYVM

tisnik commented Jul 3, 2025

@rawagner it looks nice now! You'd need to rebase/resolve conflicts - the same code was changed in the meantime. But it should be easy. Thanks a lot in advance!

manstis commented Jul 3, 2025

it looks nice now! You'd need to rebase/resolve conflicts - the same code was changed in the meantime. But it should be easy. Thanks a lot in advance!

@rawagner @tisnik

And we need the same changes made to the streaming_query.py endpoint handler 👍

rawagner commented Jul 3, 2025

Looking into it. Thanks for the reviews!

rawagner commented Jul 3, 2025

For now, I've just quick-fixed the streaming_query. I'm looking for a proper solution.

@rawagner rawagner force-pushed the fix_session branch 2 times, most recently from 7cf0493 to 9d60493 on July 3, 2025 09:16

@tisnik tisnik left a comment

ok

@umago umago left a comment

Code-wise it looks good, thanks very much for this addition.

I'm however a bit skeptical about the use of the expiringdict library; it seems like an abandoned project. Fortunately, there's another well-maintained project with a similar syntax that we can use. Left more details inline.

"uvicorn>=0.34.3",
"llama-stack>=0.2.13",
"rich>=14.0.0",
"expiringdict>=1.2.2",

umago commented:

Not sure if we should be using expiringdict; the project looks abandoned [0], the last release was in 2022, and the code repository is marked as not active [1].

May I suggest using cachetools for this? Very similar syntax and a well-maintained project [2]:

from cachetools import TTLCache
import time

cache = TTLCache(maxsize=100, ttl=5)  # max 100 items, TTL 5 seconds

cache['foo'] = 'bar'
print(cache['foo'])   # Prints bar

time.sleep(6)
print(cache.get('foo'))  # -> None (expired)

[0] https://pypi.org/project/expiringdict/#history
[1] https://app.travis-ci.com/github/mailgun/expiringdict/
[2] https://pypi.org/project/cachetools/#history

rawagner (author) replied:

Great suggestion. Switched to cachetools.
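
For illustration, the cached Agents might then be held in a TTLCache roughly like this (a sketch; the cache name, size, and TTL are made-up values, not the PR's actual settings):

from cachetools import TTLCache

# Evict conversations idle for an hour; both numbers are illustrative.
_agent_cache: TTLCache = TTLCache(maxsize=1000, ttl=3600)

def cache_agent(conversation_id: str, agent) -> None:
    _agent_cache[conversation_id] = agent

def get_cached_agent(conversation_id: str):
    return _agent_cache.get(conversation_id)  # None if absent or expired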

import logging
import os
from pathlib import Path
from typing import Any

Nit: add an empty line separating the built-in imports from the third-party libraries.

@manstis manstis left a comment

The changes, as they are, are generally fine 👍

However, it does introduce an edge case that could be problematic. I propose a solution.

The edge case would only manifest itself if different requests had different system_prompt values.

Whether:

  • It's considered an unlikely scenario to worry about
  • It should be fixed in this PR
  • It should be fixed in a new PR

I leave to you.

agent = Agent(
    client,
    model=model_id,
    instructions=system_prompt,

manstis commented:

This represents an interesting scenario I missed earlier.

system_prompt is part of the QueryRequest class that is used per request.

The Agent is now cached with whatever system_prompt was supplied on the first request.

Meaning QueryRequest.system_prompt effectively becomes redundant.

I think we should probably remove instructions=system_prompt from here and ...

@eranco74 eranco74 Jul 6, 2025

TBH, it seems strange that the system prompt is passed in the request.

I think that we should keep this and, in case the query_request.system_prompt isn't empty, we should add it to the messages (as you suggested).

Unsure whether that is the right thing to do here; perhaps it's better to defer this and solve it in another PR (along with #123).


vector_db_ids = [vector_db.identifier for vector_db in client.vector_dbs.list()]
response = agent.create_turn(
    messages=[UserMessage(role="user", content=query_request.query)],

manstis commented:

... and change this to be:

messages=[
    UserMessage(role="user", content=query_request.query),
    SystemMessage(role="system", content=query_request.system_prompt),
]

This is what llama-stack is doing already:

https://github.com/meta-llama/llama-stack/blob/main/llama_stack/providers/inline/agents/meta_reference/agent_instance.py#L223

https://github.com/meta-llama/llama-stack/blob/main/llama_stack/providers/inline/agents/meta_reference/agent_instance.py#L168-L175
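
Putting the suggestion together, a turn might then look roughly like this (a sketch assuming llama-stack-client's UserMessage/SystemMessage types and create_turn's session_id parameter; the SystemMessage is appended only when a prompt was supplied):

from llama_stack_client.types import SystemMessage, UserMessage

messages = [UserMessage(role="user", content=query_request.query)]
if query_request.system_prompt:
    # Pass the per-request prompt on the turn itself, so the cached
    # Agent's baked-in instructions no longer matter.
    messages.append(
        SystemMessage(role="system", content=query_request.system_prompt)
    )
response = agent.create_turn(
    messages=messages,
    session_id=conversation_id,  # reuse the cached conversation's session
)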

agent = AsyncAgent(
    client,  # type: ignore[arg-type]
    model=model_id,
    instructions=system_prompt,

manstis commented:

Likewise; remove this line.

    vector_db.identifier for vector_db in await client.vector_dbs.list()
]
response = await agent.create_turn(
    messages=[UserMessage(role="user", content=query_request.query)],

manstis commented:

Likewise, add a SystemMessage here for the system_prompt.

omertuc added a commit to omertuc/assisted-chat that referenced this pull request Jul 4, 2025
There's two PRs that haven't been merged yet, but are useful:

lightspeed-core/lightspeed-stack#163

(tl;dr - conversation_id retention fix)

openshift-assisted/assisted-installer-ui#3016

(tl;dr - draft UI for the chatbot)

This commit updates the submodules to checkout those PRs

Also updates the assisted-chat-pod.yaml to correctly set up the UI
pod so it's accessible at http://localhost:8080 and can communicate with
the lightspeed-stack container. This still doesn't fully work because
the proxy in the container cannot be configured to use the token / URL
yet.

@tisnik tisnik left a comment

Please:

  1. rebase
  2. remove pdm.lock completely; we switched to uv in the meantime (sorry, too many changes occurred in the last week)

rawagner commented Jul 7, 2025

I've rebased and switched to cachetools as suggested.
For now, I've not made any changes to system_prompt, as the discussion on the proper solution is ongoing and I'd rather see those changes in a separate PR.

@tisnik tisnik closed this pull request by merging all changes into lightspeed-core:main in 5737dbf on Jul 7, 2025.