Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add span_id attribute to Events (instrumentation) #12417

Merged
merged 18 commits into from Apr 2, 2024

Conversation

nerdai
Copy link
Contributor

@nerdai nerdai commented Mar 29, 2024

Description

  • This PR aims to associate an Event with the Span enclosure where it takes place
  • Specifically a span_id field is added to BaseEvent
  • Coroutine safety is handled by assigning a current_span_id to dispatcher which we use asyncio.Lock to update when working with async funcs
  • When firing an event with dispatcher, we first get dispatcher.current_span_id and store it into a variable that can be re-used within the scope of the func
  • Also added a SpanDropEvent that fires if a Span is being dropped due to an encountered Exception.
    • First image: BaseQueryEngine.aquery span IS NOT dropped so QueryEndEvent is observed
    • Second image: BaseQueryengine.aquery span IS dropped so SpanDropEvent is observed and QueryEndEvent is not

image
image

Fixes # (issue)

New Package?

Did I fill in the tool.llamahub section in the pyproject.toml and provide a detailed README.md for my new integration or package?

  • Yes
  • No

Version Bump?

Did I bump the version in the pyproject.toml file of the package I am updating? (Except for the llama-index-core package)

  • Yes
  • No

Type of Change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)

How Has This Been Tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration

  • Added new unit/integration tests
  • Ran notebooks (that tests end-to-end)
  • I stared at the code and made sure it makes sense

@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. size:XL This PR changes 500-999 lines, ignoring generated files. and removed size:L This PR changes 100-499 lines, ignoring generated files. labels Mar 29, 2024
@nerdai nerdai enabled auto-merge (squash) March 30, 2024 19:24
@@ -365,7 +365,8 @@ def _run_step(
**kwargs: Any,
) -> TaskStepOutput:
"""Execute step."""
dispatcher.event(AgentRunStepStartEvent())
span_id = dispatcher.current_span_id
dispatcher.event(AgentRunStepStartEvent(span_id=span_id))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rather than getting the span-id and passing it in, could event() attach this automatically?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(would make development easier and less error-prone)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually why the callback manager used a context manager
with callback_manager.event() as event:

So if the dispatcher needs a similar pattern, we could use that again, or if possible, it just gets handled automatically

Copy link
Contributor Author

@nerdai nerdai Mar 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. We could maybe try to adopt a context manager approach:

with dispatcher.span_session() as dispatch_event:
     dispatch_event()
     # perform other logic
     dispatch_event()

So, here I think we add span_id arg to event() i.e., dispatcher.event(span_id, event). Thus, span_id will added to the event within .event(). And then make it so that the context manager yields a partial of dispatcher.event(span_id, event) i.e., dispatcher_event = partial(dispatcher.event, span_id=dispatcher.current_span_id)

# dispatcher.py
def event(event: BaseEvent, span_id: str = ""):
     # modify span_id field of event
     event.span_id = span_id
     ...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we already decorated the function, I wonder if we can do something like this
https://stackoverflow.com/questions/12435765/can-a-decorated-function-access-variables-of-the-decorator

It would completely hide this from the implementation (and maintains a cleaner look to the code for readability)

Thoughts?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This way, when .event() is called, it can automatically pull the span ID from globals? Maybe? (if the injected global is a function/lambda?)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If that doesn't work, then context manager it is (even though its slightly less neat)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For visibility (lol we're discussing via slack on this too -- my b):

I think this will work! Attach the attr to the wrapper which should still be coroutine safe I think. Will give it a shot.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alright, wrapt complicated things for me a bit. Resorted to using context manager approach as described previously in this thread.

@nerdai nerdai disabled auto-merge March 31, 2024 16:22
@dosubot dosubot bot added size:XXL This PR changes 1000+ lines, ignoring generated files. and removed size:XL This PR changes 500-999 lines, ignoring generated files. labels Apr 1, 2024
@nerdai nerdai force-pushed the nerdai/instrumentation-event-span-tying branch from e07e42c to f090642 Compare April 1, 2024 03:40
@nerdai nerdai force-pushed the nerdai/instrumentation-event-span-tying branch from 91c7efc to 92144e7 Compare April 1, 2024 16:42
@dosubot dosubot bot added size:XL This PR changes 500-999 lines, ignoring generated files. and removed size:XXL This PR changes 1000+ lines, ignoring generated files. labels Apr 1, 2024
@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Apr 2, 2024
@nerdai nerdai merged commit 95be930 into main Apr 2, 2024
8 checks passed
@nerdai nerdai deleted the nerdai/instrumentation-event-span-tying branch April 2, 2024 04:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lgtm This PR has been approved by a maintainer size:XL This PR changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants