[DISCUSSION] RPC server-side ThreadLocalState #38510
Labels
module: multithreading
Related to issues that occur when running on multiple CPU threads
module: rpc
Related to RPC, distributed autograd, RRef, and distributed optimizer
oncall: jit
Add this issue/PR to JIT oncall triage queue
triaged
This issue has been looked at by a team member, and triaged and prioritized into an appropriate module
This is a follow-up discussion of #38439. We fixed #38439 by restoring `ThreadLocalState` and the distributed autograd context separately. Creating this issue to track discussion on whether the distributed autograd context id belongs in `ThreadLocalState.h`, or whether we should create `RpcThreadLocalState.h` or something else. Below are some concerns and notes from an offline discussion with @xush6528, @ilia-cher, and @pritamdamania87:

- [...] `ThreadLocalState.h` to set/skip autograd context id.
- [...] `ThreadLocalState.h`, we should also dedup the similar logic added for RPC's TorchScript support. See "[DistAutograd x JIT] Capture global state, dist autograd current context id, before thread switching triggered by JIT future.wait()" (#36395).

cc @suo @pietern @mrshenli @pritamdamania87 @zhaojuanmao @satgera @gqchen @aazzolini @rohan-varma @xush6528 @jjlilley @osalpekar
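
To make the two options easier to compare, here is a minimal Python sketch of the idea, not PyTorch code. It assumes a toy thread-local store with a `grad_mode` flag and a `context_id`. Every name in it (`GeneralState`, `RpcState`, `include_context_id`, `run_on_worker`) is hypothetical and only stands in for the C++ pieces discussed above. `GeneralState` plays the role of a general snapshot that knows nothing about RPC, while `RpcState` wraps it and optionally carries the context id, roughly the "set/skip" option mentioned in the notes.

```python
import threading

# Toy thread-local store. These are stand-ins for illustration only,
# not PyTorch APIs.
_tls = threading.local()


def get_grad_mode():
    return getattr(_tls, "grad_mode", True)


def set_grad_mode(value):
    _tls.grad_mode = value


def get_context_id():
    return getattr(_tls, "context_id", None)


def set_context_id(value):
    _tls.context_id = value


class GeneralState:
    """Snapshot of general thread-local state; unaware of RPC."""

    def __init__(self):
        # Capture happens on the thread that constructs the snapshot.
        self.grad_mode = get_grad_mode()

    def restore(self):
        # Restore happens on whichever thread calls this method.
        set_grad_mode(self.grad_mode)


class RpcState:
    """RPC-specific snapshot: general state plus an optional context id."""

    def __init__(self, include_context_id=True):
        self.general = GeneralState()
        self.include_context_id = include_context_id
        self.context_id = get_context_id() if include_context_id else None

    def restore(self):
        self.general.restore()
        if self.include_context_id:
            set_context_id(self.context_id)


def run_on_worker(state):
    """Restore a snapshot on a fresh thread and report what that thread sees."""
    seen = {}

    def work():
        state.restore()
        seen["grad_mode"] = get_grad_mode()
        seen["context_id"] = get_context_id()

    worker = threading.Thread(target=work)
    worker.start()
    worker.join()
    return seen


if __name__ == "__main__":
    set_grad_mode(False)
    set_context_id(42)
    print(run_on_worker(RpcState()))
    print(run_on_worker(RpcState(include_context_id=False)))
```

In this sketch the context id only travels to the worker thread when the RPC-specific wrapper asks for it, so non-RPC callers of the general snapshot never touch it. Folding the field into `GeneralState` instead would remove the wrapper at the cost of making every snapshot carry RPC state.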