fix: simplify ContextVar
and fix sub-span attribution when delegated to thread
#13700
Conversation
```python
from contextvars import ContextVar
from typing import Optional

from llama_index.core.instrumentation.span.base import BaseSpan
from llama_index.core.instrumentation.span.simple import SimpleSpan

# ContextVar for managing active spans
active_span_id: ContextVar[Optional[str]] = ContextVar("active_span_id")
```
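A minimal standalone demo (not from the PR) of why a single module-level `ContextVar` is enough for async: each asyncio task gets its own copy of the context at creation, so concurrent tasks cannot clobber each other's active span id.

```python
import asyncio
from contextvars import ContextVar
from typing import Optional

active_span_id: ContextVar[Optional[str]] = ContextVar("active_span_id", default=None)

async def traced(name: str) -> Optional[str]:
    active_span_id.set(name)     # visible only within this task's context
    await asyncio.sleep(0)       # yield so the two tasks interleave
    return active_span_id.get()  # still this task's own value

async def main() -> list:
    # Each task created by gather() snapshots the context independently.
    return await asyncio.gather(traced("span-a"), traced("span-b"))

print(asyncio.run(main()))  # ['span-a', 'span-b']
```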
Ha. This is all we need eh? I didn't know that a single global ContextVar can be used to manage span_id's across both async tasks as well as threads. This definitely cleans up the manual shenanigans I resorted to using. Thanks!
For threads, it also needs that `Thread` wrapper I added. For async it's automatic.
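A sketch of the idea behind the wrapper: snapshot the caller's context at construction time with `contextvars.copy_context()` and run the thread body inside it, so `ContextVar` values (like the active span id) cross the thread boundary. The class name here is illustrative, not the library's code.

```python
import contextvars
import threading

class ContextThread(threading.Thread):  # hypothetical name, mirrors the wrapper's idea
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._ctx = contextvars.copy_context()  # snapshot the caller's context

    def run(self) -> None:
        # Execute the normal Thread.run inside the copied context so the
        # target sees the parent's ContextVar values.
        self._ctx.run(super().run)

var = contextvars.ContextVar("demo", default="unset")
var.set("parent-value")

result = []
t = ContextThread(target=lambda: result.append(var.get()))
t.start(); t.join()
print(result)  # ['parent-value']; a plain threading.Thread would see 'unset'
```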
Right, I saw that. Really nice. So we should use `llama_index.core.types.Thread` from now on (CC: @logan-markewich).
```python
token = active_span_id.set(id_)
parent_id = token.old_value
```
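The snippet above recovers the parent span id from the `Token` returned by `set()`. A hedged, self-contained sketch of nested span entry/exit built on that trick (helper names are illustrative):

```python
from contextvars import ContextVar, Token
from typing import Optional, Tuple

active_span_id: ContextVar[Optional[str]] = ContextVar("active_span_id")

def enter_span(id_: str) -> Tuple[Token, Optional[str]]:
    # set() returns a Token whose old_value is the previous span id,
    # which serves as this span's parent id.
    token = active_span_id.set(id_)
    parent_id = None if token.old_value is Token.MISSING else token.old_value
    return token, parent_id

def exit_span(token: Token) -> None:
    active_span_id.reset(token)  # restore the parent span id on exit

outer, outer_parent = enter_span("outer")  # outer_parent is None
inner, inner_parent = enter_span("inner")  # inner_parent == "outer"
exit_span(inner)
print(active_span_id.get())  # "outer"
```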
@RogerHYang: I'm running into some trouble when trying to execute your code with the Basic Usage Notebook, particularly when it gets to the Streaming section.
Here `token.old_value` returns the `Token.MISSING` marker object, which indicates that the context var had not been set before (https://docs.python.org/3/library/contextvars.html#contextvars.Token.old_value), but I don't believe that to be true.
Any ideas as to why I might be running into this? The async `stream_chat` works fine...
I'll look into it. I suspect it's because of the async nature of the notebook. That's my oversight.
No worries at all! Thanks very much.
OK. The root cause of the issue is that the `llama-index-agent-openai` package is not from this PR and so is still using the regular `Thread`. Therefore the contextvar value of `None` didn't get carried across the thread barrier when the OpenAI agent runs. I have updated this PR to handle `Token.MISSING` so that it is backward-compatible.
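An illustration (an assumption about the mechanism, not the PR's code) of why the plain `threading.Thread` broke attribution: a new thread starts with a fresh context, so the caller's `ContextVar` value never arrives, and `set()` inside the thread yields a `Token` whose `old_value` is the `Token.MISSING` sentinel rather than the parent span id.

```python
import threading
from contextvars import ContextVar, Token

active_span_id: ContextVar[str] = ContextVar("active_span_id")
active_span_id.set("parent-span")  # set in the main thread only

seen = {}

def worker() -> None:
    token = active_span_id.set("child-span")
    # In a plain Thread the parent's value never crossed the boundary,
    # so the previous value is the MISSING sentinel, not "parent-span".
    seen["old"] = token.old_value

t = threading.Thread(target=worker)
t.start(); t.join()
print(seen["old"] is Token.MISSING)  # True
```

This is the case the PR's `Token.MISSING` handling guards against for packages still using the regular `Thread`.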
Ahh, makes sense. Thanks Roger for digging into that!
This PR already updated all the `import Thread` statements via search and replace.
Ah okay. Great thanks!
I should have also bumped the minimum version requirement on `llama-index-core`. Sorry about that.
Let me submit a new PR for the version bump.
Thanks. That's my miss too for not noticing.
Thanks again @RogerHYang for this!
I had tried using `Thread` and `context.run()` before (not nearly as nice as what you got here, though) and remember having some issues at that time with the async version (I believe).
Really happy about this PR and the really great code improvements you made here! 🙏
FYI: going to cut a new release of
fix: simplify ContextVar and fix sub-span attribution when delegated to thread (run-llama#13700)
* fix: simplify contextvar
* test trees in parallel
* fix Token.MISSING
Issue
Sub-spans delegated to threads cannot find their parents.
For example, this can happen here:
llama_index/llama-index-core/llama_index/core/chat_engine/simple.py
Lines 122 to 124 in 56a0e70
Furthermore, in the thread shown above, the following event fires with a `span_id` from outside the thread, but the outside span may no longer be open when the event fires (when streaming is done).
llama_index/llama-index-core/llama_index/core/llms/callbacks.py
Lines 97 to 98 in 56a0e70
Before
Sub-spans delegated to threads are separated from the parent span, resulting in multiple traces.
After
Sub-spans can now find their correct `parent_span_id`.
Original Behavior
The original behavior was captured in this notebook.
Code
Below is code producing the screenshots above.