Coroutine-safe Spans #12589

Merged
merged 5 commits into main from nerdai/coroutine-safe-span on Apr 7, 2024

Contributor

@nerdai nerdai commented Apr 4, 2024

Description

  • This PR adds coroutine (i.e., async task) safety for entering, exiting, and dropping spans.
  • For jobs that use asyncio.gather(), we should now use our async_utils, which is decorated with dispatcher.span and whose nested (semaphore-bounded) worker function is also decorated with the correct parent_id.
  • This PR also fixes the issue of asyncio.Lock() not being lazily constructed, which breaks jobs that don't have an async event loop.
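
To make the first and third points concrete, here is a minimal sketch of the general technique, not the code in this PR. It assumes spans are tracked in one stack per asyncio task (keyed via asyncio.current_task()) and that the lock guarding those stacks is only built on first use inside a coroutine. The names TaskSpanStacks, enter, exit, worker and demo are hypothetical and exist only for illustration.

```python
import asyncio
from typing import Dict, List, Optional, Tuple


class TaskSpanStacks:
    """Hypothetical helper: one span stack per asyncio task, guarded by a lazy lock."""

    def __init__(self) -> None:
        # The lock is NOT built here, because this object may be created
        # in code that never starts an event loop.
        self._lock: Optional[asyncio.Lock] = None
        self._stacks: Dict[int, List[str]] = {}

    @property
    def lock(self) -> asyncio.Lock:
        # Built on first access, which only happens from inside a coroutine.
        if self._lock is None:
            self._lock = asyncio.Lock()
        return self._lock

    async def enter(self, span_id: str) -> Optional[str]:
        """Push span_id on the current task's stack and return its parent, if any."""
        key = id(asyncio.current_task())
        async with self.lock:
            stack = self._stacks.setdefault(key, [])
            parent = stack[-1] if stack else None
            stack.append(span_id)
            return parent

    async def exit(self, span_id: str) -> None:
        """Remove span_id from the current task's stack; drop the stack when empty."""
        key = id(asyncio.current_task())
        async with self.lock:
            stack = self._stacks[key]
            stack.remove(span_id)
            if not stack:
                del self._stacks[key]


async def worker(stacks: TaskSpanStacks, name: str, log: Dict[str, Tuple]) -> None:
    outer_parent = await stacks.enter(f"{name}-outer")
    await asyncio.sleep(0)  # yield control so the two tasks interleave
    inner_parent = await stacks.enter(f"{name}-inner")
    log[name] = (outer_parent, inner_parent)
    await stacks.exit(f"{name}-inner")
    await stacks.exit(f"{name}-outer")


def demo() -> Dict[str, Tuple]:
    stacks = TaskSpanStacks()  # created outside any event loop
    log: Dict[str, Tuple] = {}

    async def main() -> None:
        await asyncio.gather(worker(stacks, "a", log), worker(stacks, "b", log))

    asyncio.run(main())
    return log
```

With a single shared stack, the interleaving above would let one task's span be recorded as the parent of another task's span; keying the stacks by task keeps each task's nesting separate.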

Type of Change

  • Bug fix (non-breaking change which fixes an issue)

How Has This Been Tested?

  • Added new notebook (that tests end-to-end)
  • I stared at the code and made sure it makes sense

@nerdai changed the title from "add async task stack" to "Coroutine-safe Spans" on Apr 4, 2024
@nerdai marked this pull request as ready for review on April 6, 2024, 02:30
@dosubot bot added the "size:L" label (this PR changes 100-499 lines, ignoring generated files) on Apr 6, 2024
asyncio_mod = get_asyncio_module(show_progress=show_progress)
semaphore = asyncio.Semaphore(workers)

@dispatcher.async_span_with_parent_id(parent_id=parent_span_id)
Collaborator


So basically, any span that spawns many jobs should use this decorator, right?

Contributor Author


Yes, with async that's right. The function enclosing this one should be decorated, so that we can extract the parent_id from it.

In fact, to be safe, we should use run_jobs() whenever we want to instrument the "simultaneous" execution of async tasks.
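
As a rough illustration of the pattern discussed in this thread (not the actual run_jobs() implementation), the sketch below captures the parent span id in the outer function before any task is spawned, then hands it to a semaphore-bounded worker through a closure. SPAN_PARENTS, open_span, run_jobs_sketch, square and demo are all hypothetical names used only for this example.

```python
import asyncio
import uuid
from typing import Any, Awaitable, Dict, List, Optional

# Hypothetical in-memory record of spans: span id -> parent span id.
SPAN_PARENTS: Dict[str, Optional[str]] = {}


def open_span(name: str, parent_id: Optional[str]) -> str:
    """Record a new span with an explicit parent and return its id."""
    span_id = f"{name}-{uuid.uuid4()}"
    SPAN_PARENTS[span_id] = parent_id
    return span_id


async def run_jobs_sketch(jobs: List[Awaitable[Any]], workers: int = 4) -> List[Any]:
    # The outer function owns the parent span; its id is captured
    # before any worker task exists.
    parent_span_id = open_span("run_jobs", parent_id=None)
    semaphore = asyncio.Semaphore(workers)

    async def worker(job: Awaitable[Any]) -> Any:
        # Each worker runs in its own task, so the parent id is passed in
        # explicitly instead of being read from shared "current span" state.
        open_span("worker", parent_id=parent_span_id)
        async with semaphore:  # at most `workers` jobs run at once
            return await job

    # gather() returns results in the same order as the jobs were passed in.
    return await asyncio.gather(*(worker(job) for job in jobs))


async def square(x: int) -> int:
    await asyncio.sleep(0)  # stand-in for real async work
    return x * x


def demo() -> List[int]:
    return asyncio.run(run_jobs_sketch([square(i) for i in range(5)], workers=2))
```

Because every worker span is given the same explicit parent, the resulting span tree does not depend on the order in which the event loop happens to schedule the tasks.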

@dosubot bot added the "lgtm" label (this PR has been approved by a maintainer) on Apr 7, 2024
@nerdai merged commit 9163067 into main on Apr 7, 2024
8 checks passed
@nerdai deleted the nerdai/coroutine-safe-span branch on April 7, 2024, 22:55
chrisalexiuk-nvidia pushed a commit to chrisalexiuk-nvidia/llama_index that referenced this pull request Apr 25, 2024
* add async task stack

* just use asyncio.current_task

* wip

* delay asyncio Lock assignment in dispatcher

* fix async unit tests