[tracing] Fix issue where actor/task is defined before `ray.init` is called #38323

edoakes · 2023-08-10T16:24:54Z

Why are these changes needed?

Fixes an issue where the `_ray_trace_ctx` kwarg isn't injected into the function signature if `ray.init` is called with a tracing hook after the function is defined (see issue for repro).

The cause: we were checking `_is_tracing_enabled` at function definition time and selectively injecting the kwarg, but this variable isn't set until `ray.init` is called. I modified it to always inject the kwarg (matching the existing behavior for actor methods).

I've updated the tests to not explicitly call `ray.init` before defining the task.

Related issue number

Closes #26019

Checks

- I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
- I've run scripts/format.sh to lint the changes in this PR.
- I've included any doc changes needed for https://docs.ray.io/en/master/.
  - I've added any new APIs to the API Reference. For example, if I added a
    method in Tune, I've added it in doc/source/tune/api/ under the
    corresponding .rst file.
- I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  - Testing Strategy
    - Unit tests
    - Release tests
    - This PR is not tested :(

Signed-off-by: Edward Oakes <ed.nmi.oakes@gmail.com>


edoakes · 2023-08-15T19:36:04Z

python/ray/actor.py

@@ -163,6 +163,7 @@ def remote(self, *args, **kwargs):

        return FuncWrapper()

+    @wrap_auto_init


Auto-init needs to happen before the tracing decorator runs because the tracing decorator calls `get_runtime_context`.

edoakes · 2023-08-15T19:38:33Z

python/ray/util/tracing/tracing_helper.py

-    if not _is_tracing_enabled():
-        return function


This is the main change: if `ray.init` hasn't been called, then `_is_tracing_enabled` isn't set properly yet.

So we need to always inject this kwarg optimistically. This shouldn't affect anything unless users have a kwarg called `_ray_trace_ctx` in their function.

Note that this is already how it worked for actor methods.
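
For readers unfamiliar with signature injection, here is a minimal sketch of the general technique using the standard `inspect` module. It is not the code in `tracing_helper.py`: `inject_trace_kwarg` is a hypothetical helper, and raising on a name collision is a choice made for this sketch (the comment above only notes that a user kwarg named `_ray_trace_ctx` would be affected).

```python
import inspect
from functools import wraps

TRACE_KWARG = "_ray_trace_ctx"  # kwarg name taken from the PR text


def inject_trace_kwarg(fn):
    """Hypothetical helper: wrap fn so its signature accepts a trace kwarg."""
    sig = inspect.signature(fn)
    if TRACE_KWARG in sig.parameters:
        # Sketch-only policy for the collision case.
        raise ValueError(f"{fn.__name__} already defines {TRACE_KWARG}")

    params = list(sig.parameters.values())
    new_param = inspect.Parameter(
        TRACE_KWARG, inspect.Parameter.KEYWORD_ONLY, default=None
    )
    # A keyword-only parameter must come before **kwargs, if present.
    if params and params[-1].kind is inspect.Parameter.VAR_KEYWORD:
        params.insert(len(params) - 1, new_param)
    else:
        params.append(new_param)

    @wraps(fn)
    def wrapper(*args, _ray_trace_ctx=None, **kwargs):
        # A real tracer would open a span from the context here;
        # this sketch simply drops it before calling the user function.
        return fn(*args, **kwargs)

    # An explicit __signature__ takes precedence over the wrapped function's.
    wrapper.__signature__ = sig.replace(parameters=params)
    return wrapper


def add(a, b=1, **extra):
    return a + b


traced_add = inject_trace_kwarg(add)
print(inspect.signature(traced_add))
```

Because the kwarg defaults to `None`, callers that never pass a trace context are unaffected, which is why always injecting is safe apart from the name collision case.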


rkooo567

Do we need a unit test, or is it already covered?

edoakes · 2023-08-16T14:29:20Z

@rkooo567 it's covered by removing the ray.init in the fixture in the tests


#39362) The single_client_tasks_and_get_batch benchmark saw a ~0.5-1k tasks/s average regression (2k tasks/s on a local machine) due to #38323, which changed some tracing logic to unconditionally change the signature of every remote function to accommodate tracing during _inject_tracing_into_function. Make the signature change conditional again, but move it to the execution portion of RemoteFunction rather than the definition. Also make sure the injection only happens once even when the remote function is executed multiple times.
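
The follow-up approach described above (conditional injection, deferred to execution time, performed at most once) can be sketched as follows. This is an illustration under stated assumptions, not the code from #39362: `LazyRemoteFunction`, `enable_tracing`, and `injection_count` are hypothetical names, and the sketch assumes the wrapped function has no `**kwargs` parameter.

```python
import inspect

_tracing_enabled = False  # hypothetical stand-in for Ray's tracing state


def enable_tracing() -> None:
    global _tracing_enabled
    _tracing_enabled = True


class LazyRemoteFunction:
    """Toy stand-in for a remote function wrapper; not Ray's RemoteFunction."""

    def __init__(self, fn):
        self._fn = fn
        self._signature = inspect.signature(fn)  # untouched at definition
        self._injected = False
        self.injection_count = 0  # only here to show injection runs once

    def _maybe_inject(self) -> None:
        # Conditional: skip when tracing is off. Once-only: skip if done.
        if self._injected or not _tracing_enabled:
            return
        params = list(self._signature.parameters.values())
        # Assumes no **kwargs parameter, so appending keeps a valid order.
        params.append(
            inspect.Parameter(
                "_ray_trace_ctx", inspect.Parameter.KEYWORD_ONLY, default=None
            )
        )
        self._signature = self._signature.replace(parameters=params)
        self._injected = True
        self.injection_count += 1

    def remote(self, *args, **kwargs):
        self._maybe_inject()  # checked at execution time, not definition
        bound = self._signature.bind(*args, **kwargs)
        bound.arguments.pop("_ray_trace_ctx", None)  # not passed to user code
        return self._fn(*bound.args, **bound.kwargs)


square = LazyRemoteFunction(lambda x: x * x)
square.remote(3)                      # tracing off: signature unchanged
enable_tracing()
square.remote(3, _ray_trace_ctx={})   # first traced call injects the kwarg
square.remote(4)                      # later calls reuse the signature
```

Deferring the check to the call path means a later `enable_tracing()` is still honored, while functions that never run with tracing enabled keep their original signature.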


edoakes added 9 commits August 10, 2023 11:24

- ff1c719 WIP maybe?
- 5cd3d7e Merge branch 'master' of https://github.com/ray-project/ray into fix-…tracing-bug
- 00f1a02 Merge branch 'master' of https://github.com/ray-project/ray into fix-…tracing-bug
- 2807217 Merge branch 'master' of https://github.com/ray-project/ray into fix-…tracing-bug
- 26b9691 TMP
- 3bfc690 Merge branch 'master' of https://github.com/ray-project/ray into fix-…tracing-bug
- 38c4177 add auto init wrapper
- 5cdaa81 fix actor
- 3f8cc73 fix again

edoakes changed the title ~~[WIP] Attempt to fix tracing issue~~ [tracing] Fix issue where actor/task is defined before ray.init is called Aug 15, 2023


edoakes requested review from jjyao and rkooo567 August 15, 2023 19:45

- 0bea033 Merge branch 'master' of https://github.com/ray-project/ray into fix-…tracing-bug

rkooo567 approved these changes Aug 16, 2023

- 0db796a Merge branch 'master' of https://github.com/ray-project/ray into fix-…tracing-bug

edoakes assigned rkooo567 Aug 16, 2023

edoakes merged commit d52282e into ray-project:master Aug 16, 2023
92 of 98 checks passed

matthewdeng mentioned this pull request Aug 18, 2023

[tune] skip tracing context in test #38620

Merged

8 tasks

This was referenced Sep 6, 2023

[Core][Performance] single_client_tasks_and_get_batch regression #39259

Closed

[core] Fix performance regression in single_client_tasks_and_get_batch #39362

Merged

vitsai mentioned this pull request Sep 8, 2023

[core] Fix performance regression in single_client_tasks_and_get_batc… #39429

Merged

8 tasks
