Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix annotations and spans leaking between threads #10367

Merged
merged 8 commits into from Jun 22, 2023

Conversation

j-bennet
Copy link
Contributor

@j-bennet j-bennet commented Jun 20, 2023

Switch global annotations to use ContextVar instead of setting values in dask config.

Dependent PR in distributed: dask/distributed#7935

cc @fjetter @crusaderky

@j-bennet j-bennet marked this pull request as draft June 20, 2023 21:12
@j-bennet j-bennet changed the title Global annotations use ContextVar. Fix annotations and spans leaking between threads Jun 20, 2023
@j-bennet j-bennet force-pushed the j-bennet/10340-annotations-leak branch from e8ac766 to dfe06a4 Compare June 20, 2023 21:51
j-bennet added a commit to j-bennet/distributed that referenced this pull request Jun 20, 2023
@j-bennet j-bennet marked this pull request as ready for review June 20, 2023 22:03
@j-bennet j-bennet force-pushed the j-bennet/10340-annotations-leak branch from dfe06a4 to ff86f8f Compare June 20, 2023 22:10
@j-bennet
Copy link
Contributor Author

Initially, I was going to create a module for this code called dask.annotations. Unfortunately, we already have dask.annotations here:

from __future__ import annotations

The name would be clashing, so for now, I left it in base.py. If you have better suggestions, let me know.

Copy link
Member

@fjetter fjetter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please add a test for this as well somewhere? #10340 provides an example that triggers this leakage w/out spans and I'd like to test this behavior.

dask/base.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@crusaderky crusaderky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing unit tests:

  • the code inside with dask.annotate() raises
  • annotations in two different asyncio tasks or threads[1] running in parallel do not interfere with each other

[1] only need one. Pick whatever is easier/cleaner to write.

dask/base.py Outdated Show resolved Hide resolved
dask/base.py Outdated Show resolved Hide resolved
dask/base.py Outdated Show resolved Hide resolved
dask/base.py Outdated Show resolved Hide resolved
dask/base.py Outdated Show resolved Hide resolved
dask/dataframe/io/parquet/core.py Outdated Show resolved Hide resolved
dask/highlevelgraph.py Outdated Show resolved Hide resolved
dask/base.py Outdated Show resolved Hide resolved
dask/tests/test_highgraph.py Outdated Show resolved Hide resolved
dask/tests/test_highgraph.py Outdated Show resolved Hide resolved
@crusaderky
Copy link
Collaborator

Please add line to docs/source/api.rst

@j-bennet j-bennet requested a review from crusaderky June 21, 2023 18:50
with dask.annotate(banana=5):
with dask.annotate(apple=3):
with pytest.raises(ZeroDivisionError):
_ = 1 / 0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The exception is being swallowed by pytest.raises and never reaches dask.annotate.

@@ -186,10 +186,19 @@ def test_annotation_and_config_collision():
with dask.config.set({"foo": 1}):
with dask.annotate(foo=2):
assert dask.config.get("foo") == 1
assert dask.config.get("annotations") == {"foo": 2}
assert dask.get_annotations() == {"foo": 2}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test no longer makes sense

@github-actions github-actions bot added the documentation Improve or add to documentation label Jun 21, 2023
@crusaderky crusaderky force-pushed the j-bennet/10340-annotations-leak branch from 0f67385 to 561f501 Compare June 21, 2023 21:42
Copy link
Collaborator

@crusaderky crusaderky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing unit test:
annotations in two different asyncio tasks or threads[1] running in parallel do not interfere with each other.
[1] only need one. Pick whatever is easier/cleaner to write.

@j-bennet j-bennet force-pushed the j-bennet/10340-annotations-leak branch from 296ff00 to 26d6f0a Compare June 21, 2023 22:02
docs/source/api.rst Outdated Show resolved Hide resolved
@j-bennet
Copy link
Contributor Author

Missing unit test: annotations in two different asyncio tasks or threads[1] running in parallel do not interfere with each other. [1] only need one. Pick whatever is easier/cleaner to write.

I added a test called test_annotations_threaded, let me know if that looks right.

@j-bennet
Copy link
Contributor Author

test_datetime_std_with_larger_dataset is failing, but it's unrelated.

Copy link
Collaborator

@crusaderky crusaderky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is ready to be merged as soon as you revert the temporary CI changes in dask/distributed#7935

@crusaderky crusaderky merged commit 0454f34 into dask:main Jun 22, 2023
23 of 24 checks passed
@j-bennet j-bennet deleted the j-bennet/10340-annotations-leak branch June 22, 2023 19:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dataframe documentation Improve or add to documentation io
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants