Unified management of thread local variables #28520

albanD · 2019-10-23T17:36:04Z

We should add centralized handling of thread-local state (in c10).
This would have two main advantages:

- Make the code cleaner, as all thread-local elements can easily be identified and worked with.
- Allow improvements, for example in at::parallel_for, which would then be able to handle any Tensor operation within the loop.

(Setting high priority only so that this gets discussed during triage.)

cc @ezyang @gchanan @zou3519 @jerryzh168 @ssnl @albanD @gqchen @VitalyFedyunin @ngimel @mruberry


v0dro · 2019-12-14T09:23:48Z

@albanD can you provide some samples of what exactly you want solved with this issue and approaches that you might have in mind?

albanD · 2019-12-16T09:49:18Z

This is discussed in #28370 for example.
The problem is that both our no grad logic and no variable type logic are thread local. And that leads to unsafe behavior in the current codebase: #28980
Beyond fixing the current behavior, we want to actually allow the use of Tensors in such constructs to enable use cases like this one for example.
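
The underlying hazard can be shown in a few lines. This is a minimal sketch, not PyTorch code: `grad_enabled` is a hypothetical stand-in for a thread-local mode flag, and the helper names are made up for illustration. A value set on the calling thread is not visible to a newly started thread, which gets its own freshly initialized copy.

```cpp
#include <thread>

// Hypothetical stand-in for a thread-local mode flag such as grad mode.
thread_local bool grad_enabled = true;

// Starts a fresh thread and returns the flag value that thread observes.
bool flag_seen_by_worker() {
  bool seen = false;
  std::thread worker([&seen] { seen = grad_enabled; });
  worker.join();
  return seen;
}

// Disables the flag on the calling thread, then asks a worker what it sees.
bool worker_sees_flag_after_caller_disables_it() {
  grad_enabled = false;               // "no grad" on the calling thread only
  bool seen = flag_seen_by_worker();  // the worker has its own copy, still true
  grad_enabled = true;                // restore the caller's value
  return seen;
}
```

Here the worker still observes the default value even though the caller disabled the flag, which is the kind of mismatch a parallel loop over Tensors would run into.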

zdevito · 2020-03-10T18:19:17Z

We probably want something like:

std::unique_ptr<SavedThreadLocalState> saveThreadLocalState();
void restoreThreadLocalState(const SavedThreadLocalState&);

#34360 is an instance that needs this centralized state.
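
One way the two functions above could fit together is sketched below. Only the two function names and the `SavedThreadLocalState` type name come from the comment; the struct's fields, the `tl_*` variables, and `stateSeenByWorker` are assumptions made up for this illustration and do not reflect PyTorch's actual state.

```cpp
#include <memory>
#include <thread>

// Hypothetical thread-local settings (not PyTorch's real variables).
thread_local bool tl_grad_enabled = true;
thread_local int tl_excluded_keys = 0;

// Assumed layout: one field per thread-local setting that must travel together.
struct SavedThreadLocalState {
  bool grad_enabled;
  int excluded_keys;
};

// Copies the calling thread's settings into a heap-allocated snapshot.
std::unique_ptr<SavedThreadLocalState> saveThreadLocalState() {
  auto saved = std::make_unique<SavedThreadLocalState>();
  saved->grad_enabled = tl_grad_enabled;
  saved->excluded_keys = tl_excluded_keys;
  return saved;
}

// Overwrites the calling thread's settings with a snapshot.
void restoreThreadLocalState(const SavedThreadLocalState& saved) {
  tl_grad_enabled = saved.grad_enabled;
  tl_excluded_keys = saved.excluded_keys;
}

// Illustration helper: installs the caller's snapshot on a worker thread and
// reports the settings the worker observes afterwards.
SavedThreadLocalState stateSeenByWorker() {
  const auto snapshot = saveThreadLocalState();  // taken on the calling thread
  SavedThreadLocalState seen{};
  std::thread worker([&snapshot, &seen] {
    restoreThreadLocalState(*snapshot);
    seen = *saveThreadLocalState();
  });
  worker.join();
  return seen;
}
```

Keeping every setting in one struct means a new thread-local only has to be added in one place for it to be carried across threads.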

ilia-cher · 2020-03-10T21:51:22Z

It seems we basically already have this, in the form of ThreadLocalDebugInfoBase and DebugInfoGuard. The name is misleading, though: it doesn't have to be debug info. We can extend this as we want, and we already take care of fork and autograd threads. I'd suggest:

1. refactor DebugInfoGuard into ThreadLocalStateGuard (and so on)
2. there's been a discussion on passing extra bits of info along model execution and allowing a python interface (context manager) to do it from python
3. no_grad context manager may as well be rewritten to use this thread local (+propagated across threads) struct; at the moment it simply uses https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/core/grad_mode.cpp#L13
4. I think we may find more cases of thread local vars that can be included within this struct
5. add propagation to the threads that execute at::parallel_for (straightforward, similarly to how we pass it into 'fork')
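
The guard and the propagation into a parallel loop could look roughly like the sketch below. The name `ThreadLocalStateGuard` comes from the suggestion above; everything else (`ThreadLocalState`, `tl_state`, `toy_parallel_for`, `grad_flags_seen`) is hypothetical, and the loop is a toy that spawns raw threads rather than at::parallel_for itself.

```cpp
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical struct gathering the thread-local settings in one place.
struct ThreadLocalState {
  bool grad_enabled = true;
};

thread_local ThreadLocalState tl_state;

// RAII guard: installs a state on the current thread and puts the previous
// one back when the guard goes out of scope.
class ThreadLocalStateGuard {
 public:
  explicit ThreadLocalStateGuard(const ThreadLocalState& state) : prev_(tl_state) {
    tl_state = state;
  }
  ~ThreadLocalStateGuard() { tl_state = prev_; }
  ThreadLocalStateGuard(const ThreadLocalStateGuard&) = delete;
  ThreadLocalStateGuard& operator=(const ThreadLocalStateGuard&) = delete;

 private:
  ThreadLocalState prev_;
};

// Toy parallel loop: captures the caller's state once, then installs it on
// every worker before the loop body runs there.
void toy_parallel_for(std::size_t begin, std::size_t end, std::size_t num_threads,
                      const std::function<void(std::size_t)>& body) {
  const ThreadLocalState parent = tl_state;
  std::vector<std::thread> workers;
  for (std::size_t t = 0; t < num_threads; ++t) {
    workers.emplace_back([&, t] {
      ThreadLocalStateGuard guard(parent);
      // Strided split: worker t handles begin + t, begin + t + num_threads, ...
      for (std::size_t i = begin + t; i < end; i += num_threads) {
        body(i);
      }
    });
  }
  for (auto& worker : workers) {
    worker.join();
  }
}

// Illustration helper: records, per index, whether the body saw grad enabled.
std::vector<int> grad_flags_seen(std::size_t n, std::size_t num_threads) {
  std::vector<int> seen(n, -1);
  toy_parallel_for(0, n, num_threads, [&seen](std::size_t i) {
    seen[i] = tl_state.grad_enabled ? 1 : 0;
  });
  return seen;
}
```

With the guard in place, a body that runs on a worker sees the same settings as the thread that started the loop.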

albanD · 2020-03-10T22:42:28Z

For 4., are the thread_local variables associated with the dispatcher already there? If not, they should be added as well.

ngimel · 2020-03-11T04:44:00Z

No, dispatcher keys are not there yet.

peterbell10 · 2020-03-27T16:02:28Z

It looks like ThreadLocalDebugInfo only supports one struct of values for the whole program. I think it should be possible to have a ThreadLocal<T> class that automatically stores everything in one global context e.g.

ThreadLocal<bool> GradMode_enabled = true;

Thoughts?
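
A rough sketch of how such a class might work is given below, assuming C++17 for `std::any`. Only the `ThreadLocal<T>` name and the `GradMode_enabled` declaration come from the comment; the slot registry, `Context`, `current_context`, and `value_seen_by_worker` are hypothetical. Each variable claims a slot in a single per-thread context, so the whole context can be copied to another thread in one step.

```cpp
#include <any>
#include <atomic>
#include <cstddef>
#include <thread>
#include <utility>
#include <vector>

// One per-thread context that holds every registered value, indexed by slot.
using Context = std::vector<std::any>;

inline Context& current_context() {
  thread_local Context ctx;
  return ctx;
}

// Hands out a distinct slot index to each ThreadLocal<T> variable.
inline std::size_t next_slot() {
  static std::atomic<std::size_t> counter{0};
  return counter++;
}

template <typename T>
class ThreadLocal {
 public:
  ThreadLocal(T initial) : slot_(next_slot()), initial_(std::move(initial)) {}

  // Returns this thread's value, falling back to the initial value.
  T get() const { return *std::any_cast<T>(&slot()); }
  void set(T value) { slot() = std::move(value); }

 private:
  // Finds this variable's entry in the current thread's context, creating it on first use.
  std::any& slot() const {
    Context& ctx = current_context();
    if (ctx.size() <= slot_) ctx.resize(slot_ + 1);
    if (!ctx[slot_].has_value()) ctx[slot_] = initial_;
    return ctx[slot_];
  }

  std::size_t slot_;
  T initial_;
};

ThreadLocal<bool> GradMode_enabled = true;

// Illustration helper: optionally copies the caller's whole context to a
// worker thread, then reports the value the worker reads.
bool value_seen_by_worker(bool propagate) {
  const Context parent = current_context();
  bool seen = false;
  std::thread worker([&] {
    if (propagate) current_context() = parent;
    seen = GradMode_enabled.get();
  });
  worker.join();
  return seen;
}
```

The trade-off raised in the replies applies here: the indirection through a slot lookup costs more than a hard-coded struct field.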

ezyang · 2020-03-30T13:26:21Z

@peterbell10 I don't think we have any requirement for extensible thread local state, in which case hard coding the struct is simpler and more efficient. Do you have a use case in mind?

peterbell10 · 2020-03-30T14:58:42Z

The original PR that added ThreadLocalDebugInfo (#22365) uses it for a test case. #35523 seems to support that by holding a linked list of thread local states with a DebugInfoKind key value.
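
The linked-list idea can be illustrated as follows. This is a sketch of the general technique only and does not reproduce the code in #35523; `InfoKind`, `InfoNode`, and the three functions are hypothetical names. Each push prepends a node keyed by kind, and a lookup walks the list from the most recent entry.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical keys; the real DebugInfoKind values are not reproduced here.
enum class InfoKind { UserId, TestInfo };

// One entry in a singly linked list of thread-local infos.
struct InfoNode {
  InfoKind kind;
  std::string value;
  std::shared_ptr<InfoNode> parent;  // next-older entry, or nullptr
};

thread_local std::shared_ptr<InfoNode> tl_head;

// Prepends an entry; older entries stay reachable through `parent`.
void push_info(InfoKind kind, std::string value) {
  tl_head = std::make_shared<InfoNode>(InfoNode{kind, std::move(value), tl_head});
}

// Drops the most recent entry, if any.
void pop_info() {
  if (tl_head) tl_head = tl_head->parent;
}

// Returns the most recent value stored under `kind`, or nullptr if absent.
const std::string* find_info(InfoKind kind) {
  for (const InfoNode* node = tl_head.get(); node != nullptr; node = node->parent.get()) {
    if (node->kind == kind) return &node->value;
  }
  return nullptr;
}
```

Because the head is a single shared pointer, handing the whole list to another thread only requires copying that one pointer.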

ilia-cher · 2020-03-30T19:54:15Z

#35523 addresses this: it cleans up our internal interfaces and propagates thread locals (e.g. grad_mode, dispatch key) as well as the thread-local debug info that is set by the user.

ilia-cher · 2020-03-30T19:55:41Z

> @peterbell10 I don't think we have any requirement for extensible thread local state, in which case hard coding the struct is simpler and more efficient. Do you have a use case in mind?

@ezyang we already have a use case for this in prod, e.g. passing user_id from the user down to logging operator observers

mattip · 2020-05-21T14:00:03Z

> #35523 addresses this

gh-35523 is merged. @ilia-cher will you be continuing to work on this?

albanD added the feature, triage review, and high priority labels Oct 23, 2019

ilia-cher mentioned this issue Mar 18, 2020

[autograd] allow PyNode to persist error message #34845

Closed

ngimel mentioned this issue Mar 27, 2020

Unify management of thread local settings #35523

Closed

ezyang assigned ilia-cher Mar 30, 2020

hameerabbasi mentioned this issue May 27, 2020

RFC-0001: Add method __torch_function__ RFC. pytorch/rfcs#3

Merged

ilia-cher removed their assignment Sep 10, 2021
