Re-sync with internal repository #6
Merged
chadaustin approved these changes on Oct 15, 2018
lgtm
facebook-github-bot pushed a commit that referenced this pull request on Mar 17, 2021
Summary: This can cause a deadlock if `func()` calls `create_callsite`. That is normally rare, but a (Python) signal handler makes it realistic.

Example backtrace:

Traceback (most recent call first):
  File "/opt/fb/mercurial/edenscm/tracing.py", line 337, in event
  File "/opt/fb/mercurial/edenscm/mercurial/ui.py", line 1275, in debug
    tracing.debug(msg.rstrip("\n"), depth=1)
  File "/opt/fb/mercurial/edenscm/mercurial/commandserver.py", line 608, in _reapworkers
    self.ui.debug("worker process exited (pid=%d)\n" % pid)
  File "/opt/fb/mercurial/edenscm/mercurial/commandserver.py", line 591, in _sigchldhandler
    self._reapworkers(os.WNOHANG)
  File "/opt/fb/mercurial/edenscm/tracing.py", line 337, in event
  File "/opt/fb/mercurial/edenscm/mercurial/ui.py", line 1275, in debug
    tracing.debug(msg.rstrip("\n"), depth=1)
  File "/opt/fb/mercurial/edenscm/mercurial/commandserver.py", line 608, in _reapworkers
    self.ui.debug("worker process exited (pid=%d)\n" % pid)
  File "/opt/fb/mercurial/edenscm/mercurial/commandserver.py", line 591, in _sigchldhandler
    self._reapworkers(os.WNOHANG)

#0 syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
#1 0x000055d0d65ba339 in <parking_lot::raw_rwlock::RawRwLock>::lock_upgradable_slow ()
#2 0x000055d0d55b5814 in tracing_runtime_callsite::create_callsite::<tracing_runtime_callsite::callsite_info::EventKindType, pytracing::new_callsite<tracing_runtime_callsite::callsite_info::EventKindType>::{closure#2}> ()
#3 0x000055d0d5584cb9 in <pytracing::EventCallsite>::__new__ ()
#4 0x000055d0d55a3eaa in std::panicking::try::<*mut python3_sys::object::PyObject, cpython::function::handle_callback<<pytracing::EventCallsite>::create_instance::TYPE_OBJECT::wrap_newfunc::{closure#0}, pytracing::EventCallsite, cpython::function::PyObjectCallbackConverter>::{closure#0}> ()
#5 0x000055d0d5589365 in cpython::function::handle_callback::<<pytracing::EventCallsite>::create_instance::TYPE_OBJECT::wrap_newfunc::{closure#0}, pytracing::EventCallsite,
cpython::function::PyObjectCallbackConverter> () #6 0x000055d0d55856e1 in <pytracing::EventCallsite>::create_instance::TYPE_OBJECT::wrap_newfunc () #7 0x00007ff88d576230 in type_call ( kwds={'obj': Frame 0x7ff87c1f8c40, for file /opt/fb/mercurial/edenscm/mercurial/commandserver.py, line 608, in _reapworkers (self=<unixforkingservice(ui=<ui(_buffers=[], _bufferstates=[], _bufferapplylabels=None, _outputui=None, callhooks=True, insecureconnections=False, _colormode=None, _styles={}, _terminaloutput=None, cmdname=None, _uiconfig=<uiconfig(quiet=False, verbose=False, debugflag=False, tracebackflag=False, logmeasuredtimes=False, _rcfg=<localrcfg(_rcfg=<bindings.configparser.config at remote 0x7ff87d7325d0>) at remote 0x7ff87eb73e80>, _unserializable={}, _pinnedconfigs=set(), _knownconfig={'alias': <itemregister(_generics={<configitem(section='alias', name='.*', default=None, alias=[], generic=True, priority=0, _re=<re.Pattern at remote 0x7ff87d69bed0>) at remote 0x7ff87d690a60>}) at remote 0x7ff87dbfde50>, 'annotate': <itemregister(_generics=set()) at remote 0x7ff87d6ad9f0>, 'auth': <itemregister(_generics=set()) at remote 0x7ff87dc037c0>, 'blackbox': <itemregister(_generics=set()) at remote 0x7ff87d...(truncated), args=(), type=0x55d0d8ea5b40 <_RNvNvMs1R_CsgCrAUYYhx1D_9pytracingNtB8_13EventCallsite15create_instance11TYPE_OBJECT.llvm.4665269759137401160>) at Objects/typeobject.c:974 #8 _PyObject_MakeTpCall (callable=<type at remote 0x55d0d8ea5b40>, args=<optimized out>, nargs=<optimized out>, keywords=<optimized out>) at Objects/call.c:159 #9 0x00007ff88d56dc81 in _PyObject_Vectorcall ( kwnames=('obj', 'name', 'target', 'level', 'fieldnames'), nargsf=<optimized out>, args=<optimized out>, callable=<type at remote 0x55d0d8ea5b40>) at ./Include/cpython/abstract.h:125 #10 _PyObject_Vectorcall (kwnames=('obj', 'name', 'target', 'level', 'fieldnames'), nargsf=<optimized out>, args=<optimized out>, callable=<type at remote 0x55d0d8ea5b40>) at ./Include/cpython/abstract.h:115 
#11 call_function (kwnames=('obj', 'name', 'target', 'level', 'fieldnames'), oparg=<optimized out>, pp_stack=<synthetic pointer>, tstate=<optimized out>) at Python/ceval.c:4963 #12 _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3515 #13 0x00007ff88d566268 in PyEval_EvalFrameEx (throwflag=0, f=Frame 0x7ff87cced010, for file /opt/fb/mercurial/edenscm/tracing.py, line 337, in event (message='worker process exited (pid=3953080)', name=None, target=None, level=1, depth=1, meta={}, frame=Frame 0x7ff87c1f8c40, for file /opt/fb/mercurial/edenscm/mercurial/commandserver.py, line 608, in _reapworkers (self=<unixforkingservice(ui=<ui(_buffers=[], _bufferstates=[], _bufferapplylabels=None, _outputui=None, callhooks=True, insecureconnections=False, _colormode=None, _styles={}, _terminaloutput=None, cmdname=None, _uiconfig=<uiconfig(quiet=False, verbose=False, debugflag=False, tracebackflag=False, logmeasuredtimes=False, _rcfg=<localrcfg(_rcfg=<bindings.configparser.config at remote 0x7ff87d7325d0>) at remote 0x7ff87eb73e80>, _unserializable={}, _pinnedconfigs=set(), _knownconfig={'alias': <itemregister(_generics={<configitem(section='alias', name='.*', default=None, alias=[], generic=True, priority=0, _re=<re.Pattern at remote 0x7ff87d69bed0>) at remote 0x7ff87d690a60>}) at remote 0x7ff87dbfde50>, 'annotate': ...(truncated)) at Python/ceval.c:741 #14 _PyEval_EvalCodeWithName (_co=<optimized out>, globals=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=<optimized out>, kwnames=<optimized out>, kwargs=0x7ff87c0fdc78, kwcount=<optimized out>, kwstep=1, defs=0x7ff87d572558, defcount=4, kwdefs=0x0, closure=0x0, name='event', qualname='event') at Python/ceval.c:4298 #15 0x00007ff88d57fdce in _PyFunction_Vectorcall (func=<optimized out>, stack=0x7ff87c0fdc70, nargsf=<optimized out>, kwnames=<optimized out>) at Objects/call.c:435 #16 0x00007ff88d57574f in _PyObject_FastCallDict (callable=<function at remote 
0x7ff87d5741f0>, args=0x7ff87eb63838, nargsf=<optimized out>, kwargs=<optimized out>) at Objects/call.c:104 #17 0x00007ff88d66d1ab in partial_fastcall (kwargs={'level': 1, 'depth': 1}, nargs=<optimized out>, args=<optimized out>, pto=0x7ff87d572630) at ./Modules/_functoolsmodule.c:169 #18 partial_call (pto=0x7ff87d572630, args=<optimized out>, kwargs=<optimized out>) at ./Modules/_functoolsmodule.c:224 #19 0x00007ff88d576331 in _PyObject_MakeTpCall (callable=<functools.partial at remote 0x7ff87d572630>, args=<optimized out>, nargs=<optimized out>, keywords=<optimized out>) at Objects/object.c:2207 #20 0x00007ff88d56dc81 in _PyObject_Vectorcall (kwnames=('depth',), nargsf=<optimized out>, args=<optimized out>, callable=<functools.partial at remote 0x7ff87d572630>) at ./Include/cpython/abstract.h:125 #21 _PyObject_Vectorcall (kwnames=('depth',), nargsf=<optimized out>, args=<optimized out>, callable=<functools.partial at remote 0x7ff87d572630>) at ./Include/cpython/abstract.h:115 #22 call_function (kwnames=('depth',), oparg=<optimized out>, pp_stack=<synthetic pointer>, tstate=<optimized out>) at Python/ceval.c:4963 #23 _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3515 #24 0x00007ff88d566268 in PyEval_EvalFrameEx (throwflag=0, f=Frame 0x7ff87c0e43c0, for file /opt/fb/mercurial/edenscm/mercurial/ui.py, line 1275, in debug (self=<ui(_buffers=[], _bufferstates=[], _bufferapplylabels=None, _outputui=None, callhooks=True, insecureconnections=False, _colormode=None, _styles={}, _terminaloutput=None, cmdname=None, _uiconfig=<uiconfig(quiet=False, verbose=False, debugflag=False, tracebackflag=False, logmeasuredtimes=False, _rcfg=<localrcfg(_rcfg=<bindings.configparser.config at remote 0x7ff87d7325d0>) at remote 0x7ff87eb73e80>, _unserializable={}, _pinnedconfigs=set(), _knownconfig={'alias': <itemregister(_generics={<configitem(section='alias', name='.*', default=None, alias=[], generic=True, priority=0, _re=<re.Pattern at remote 
0x7ff87d69bed0>) at remote 0x7ff87d690a60>}) at remote 0x7ff87dbfde50>, 'annotate': <itemregister(_generics=set()) at remote 0x7ff87d6ad9f0>, 'auth': <itemregister(_generics=set()) at remote 0x7ff87dc037c0>, 'blackbox': <itemregister(_generics=set()) at remote 0x7ff87dc039a0>, 'bookmarks': <itemregister(_generics=se...(truncated)) at Python/ceval.c:741 #25 _PyEval_EvalCodeWithName (_co=<optimized out>, globals=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=<optimized out>, kwnames=<optimized out>, kwargs=0x7ff87c1f8de8, kwcount=<optimized out>, kwstep=1, defs=0x0, defcount=0, kwdefs=0x0, closure=0x0, name='debug', qualname='ui.debug') at Python/ceval.c:4298 #26 0x00007ff88d57fdce in _PyFunction_Vectorcall (func=<optimized out>, stack=0x7ff87c1f8dd8, nargsf=<optimized out>, kwnames=<optimized out>) at Objects/call.c:435 #27 0x00007ff88d56821a in _PyObject_Vectorcall (kwnames=0x0, nargsf=<optimized out>, args=0x7ff87c1f8dd8, callable=<function at remote 0x7ff87d57e9d0>) at ./Include/cpython/abstract.h:127 #28 call_function (kwnames=0x0, oparg=<optimized out>, pp_stack=<synthetic pointer>, tstate=0x7ff88ca08780) at Python/ceval.c:4963 #29 _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3486 #30 0x00007ff88d57fd38 in PyEval_EvalFrameEx (throwflag=0, f=Frame 0x7ff87c1f8c40, for file /opt/fb/mercurial/edenscm/mercurial/commandserver.py, line 608, in _reapworkers (self=<unixforkingservice(ui=<ui(_buffers=[], _bufferstates=[], _bufferapplylabels=None, _outputui=None, callhooks=True, insecureconnections=False, _colormode=None, _styles={}, _terminaloutput=None, cmdname=None, _uiconfig=<uiconfig(quiet=False, verbose=False, debugflag=False, tracebackflag=False, logmeasuredtimes=False, _rcfg=<localrcfg(_rcfg=<bindings.configparser.config at remote 0x7ff87d7325d0>) at remote 0x7ff87eb73e80>, _unserializable={}, _pinnedconfigs=set(), _knownconfig={'alias': <itemregister(_generics={<configitem(section='alias', 
name='.*', default=None, alias=[], generic=True, priority=0, _re=<re.Pattern at remote 0x7ff87d69bed0>) at remote 0x7ff87d690a60>}) at remote 0x7ff87dbfde50>, 'annotate': <itemregister(_generics=set()) at remote 0x7ff87d6ad9f0>, 'auth': <itemregister(_generics=set()) at remote 0x7ff87dc037c0>, 'blackbox': <itemregister(_generics=set()) at remote 0x7ff87dc039a0>,...(truncated)) at Python/ceval.c:738 #31 function_code_fastcall (globals=<optimized out>, nargs=2, args=<optimized out>, co=<optimized out>) at Objects/call.c:283 #32 _PyFunction_Vectorcall (func=<optimized out>, stack=<optimized out>, nargsf=<optimized out>, kwnames=<optimized out>) at Objects/call.c:410 #33 0x00007ff88d56821a in _PyObject_Vectorcall (kwnames=0x0, nargsf=<optimized out>, args=0x7ff87c0f26d8, callable=<function at remote 0x7ff87d73b310>) at ./Include/cpython/abstract.h:127 #34 call_function (kwnames=0x0, oparg=<optimized out>, pp_stack=<synthetic pointer>, tstate=0x7ff88ca08780) at Python/ceval.c:4963 #35 _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3486 #36 0x00007ff88d57fd38 in PyEval_EvalFrameEx (throwflag=0, f=Frame 0x7ff87c0f2550, for file /opt/fb/mercurial/edenscm/mercurial/commandserver.py, line 591, in _sigchldhandler (self=<unixforkingservice(ui=<ui(_buffers=[], _bufferstates=[], _bufferapplylabels=None, _outputui=None, callhooks=True, insecureconnections=False, _colormode=None, _styles={}, _terminaloutput=None, cmdname=None, _uiconfig=<uiconfig(quiet=False, verbose=False, debugflag=False, tracebackflag=False, logmeasuredtimes=False, _rcfg=<localrcfg(_rcfg=<bindings.configparser.config at remote 0x7ff87d7325d0>) at remote 0x7ff87eb73e80>, _unserializable={}, _pinnedconfigs=set(), _knownconfig={'alias': <itemregister(_generics={<configitem(section='alias', name='.*', default=None, alias=[], generic=True, priority=0, _re=<re.Pattern at remote 0x7ff87d69bed0>) at remote 0x7ff87d690a60>}) at remote 0x7ff87dbfde50>, 'annotate': <itemreg 
ister(_generics=set()) at remote 0x7ff87d6ad9f0>, 'auth': <itemregister(_generics=set()) at remote 0x7ff87dc037c0>, 'blackbox': <itemregister(_generics=set()) at remote 0x7ff87dc039a...(truncated)) at Python/ceval.c:738 #37 function_code_fastcall (globals=<optimized out>, nargs=3, args=<optimized out>, co=<optimized out>) at Objects/call.c:283 #38 _PyFunction_Vectorcall (func=<optimized out>, stack=<optimized out>, nargsf=<optimized out>, kwnames=<optimized out>) at Objects/call.c:410 #39 0x00007ff88d592153 in _PyObject_Vectorcall (kwnames=<optimized out>, nargsf=<optimized out>, args=<optimized out>, callable=<optimized out>) at ./Include/cpython/abstract.h:115 #40 method_vectorcall (method=<optimized out>, args=<optimized out>, nargsf=<optimized out>, kwnames=<optimized out>) at Objects/classobject.c:67 #41 0x00007ff88d5963fb in PyVectorcall_Call (kwargs=0x0, tuple=<optimized out>, callable=<method at remote 0x7ff87c70d5c0>) at Objects/dictobject.c:1802 #42 PyObject_Call (callable=<method at remote 0x7ff87c70d5c0>, args=<optimized out>, kwargs=0x0) at Objects/call.c:227 #43 0x00007ff88d6405ea in _PyErr_CheckSignals () at ./Modules/signalmodule.c:1689 #44 0x00007ff88d5a41a1 in _PyErr_CheckSignals () at Objects/object.c:577 #45 PyErr_CheckSignals () at ./Modules/signalmodule.c:1649 #46 PyObject_Str (v='_reapworkers') at Objects/object.c:561 #47 0x000055d0d557c821 in pytracing::tostr_opt () #48 0x000055d0d55b5a7d in tracing_runtime_callsite::create_callsite::<tracing_runtime_callsite::callsite_info::EventKindType, pytracing::new_callsite<tracing_runtime_callsite::callsite_info::EventKindType>::{closure#2}> () #49 0x000055d0d5584cb9 in <pytracing::EventCallsite>::__new__ () #50 0x000055d0d55a3eaa in std::panicking::try::<*mut python3_sys::object::PyObject, cpython::function::handle_callback<<pytracing::EventCallsite>::create_instance::TYPE_OBJECT::wrap_newfunc::{closure#0}, pytracing::EventCallsite, cpython::function::PyObjectCallbackConverter>::{closure#0}> () #51 
0x000055d0d5589365 in cpython::function::handle_callback::<<pytracing::EventCallsite>::create_instance::TYPE_OBJECT::wrap_newfunc::{closure#0}, pytracing::EventCallsite, cpython::function::PyObjectCallbackConverter> () #52 0x000055d0d55856e1 in <pytracing::EventCallsite>::create_instance::TYPE_OBJECT::wrap_newfunc () #53 0x00007ff88d576230 in type_call ( kwds={'obj': Frame 0x7ff87c1f8440, for file /opt/fb/mercurial/edenscm/mercurial/commandserver.py, line 608, in _reapworkers (self=<unixforkingservice(ui=<ui(_buffers=[], _bufferstates=[], _bufferapplylabels=None, _outputui=None, callhooks=True, insecureconnections=False, _colormode=None, _styles={}, _terminaloutput=None, cmdname=None, _uiconfig=<uiconfig(quiet=False, verbose=False, debugflag=False, tracebackflag=False, logmeasuredtimes=False, _rcfg=<localrcfg(_rcfg=<bindings.configparser.config at remote 0x7ff87d7325d0>) at remote 0x7ff87eb73e80>, _unserializable={}, _pinnedconfigs=set(), _knownconfig={'alias': <itemregister(_generics={<configitem(section='alias', name='.*', default=None, alias=[], generic=True, priority=0, _re=<re.Pattern at remote 0x7ff87d69bed0>) at remote 0x7ff87d690a60>}) at remote 0x7ff87dbfde50>, 'annotate': <itemregister(_generics=set()) at remote 0x7ff87d6ad9f0>, 'auth': <itemregister(_generics=set()) at remote 0x7ff87dc037c0>, 'blackbox': <itemregister(_generics=set()) at remote 0x7ff87d...(truncated), args=(), type=0x55d0d8ea5b40 <_RNvNvMs1R_CsgCrAUYYhx1D_9pytracingNtB8_13EventCallsite15create_instance11TYPE_OBJECT.llvm.4665269759137401160>) at Objects/typeobject.c:974 #54 _PyObject_MakeTpCall (callable=<type at remote 0x55d0d8ea5b40>, args=<optimized out>, nargs=<optimized out>, keywords=<optimized out>) at Objects/call.c:159 #55 0x00007ff88d56dc81 in _PyObject_Vectorcall ( kwnames=('obj', 'name', 'target', 'level', 'fieldnames'), nargsf=<optimized out>, args=<optimized out>, callable=<type at remote 0x55d0d8ea5b40>) at ./Include/cpython/abstract.h:125 #56 _PyObject_Vectorcall 
(kwnames=('obj', 'name', 'target', 'level', 'fieldnames'), nargsf=<optimized out>, args=<optimized out>, callable=<type at remote 0x55d0d8ea5b40>) at ./Include/cpython/abstract.h:115 #57 call_function (kwnames=('obj', 'name', 'target', 'level', 'fieldnames'), oparg=<optimized out>, pp_stack=<synthetic pointer>, tstate=<optimized out>) at Python/ceval.c:4963 #58 _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3515 #59 0x00007ff88d566268 in PyEval_EvalFrameEx (throwflag=0, f=Frame 0x7ff87ccec890, for file /opt/fb/mercurial/edenscm/tracing.py, line 337, in event (message='worker process exited (pid=3953122)', name=None, target=None, level=1, depth=1, meta={}, frame=Frame 0x7ff87c1f8440, for file /opt/fb/mercurial/edenscm/mercurial/commandserver.py, line 608, in _reapworkers (self=<unixforkingservice(ui=<ui(_buffers=[], _bufferstates=[], _bufferapplylabels=None, _outputui=None, callhooks=True, insecureconnections=False, _colormode=None, _styles={}, _terminaloutput=None, cmdname=None, _uiconfig=<uiconfig(quiet=False, verbose=False, debugflag=False, tracebackflag=False, logmeasuredtimes=False, _rcfg=<localrcfg(_rcfg=<bindings.configparser.config at remote 0x7ff87d7325d0>) at remote 0x7ff87eb73e80>, _u nserializable={}, _pinnedconfigs=set(), _knownconfig={'alias': <itemregister(_generics={<configitem(section='alias', name='.*', default=None, alias=[], generic=True, priority=0, _re=<re.Pattern at remote 0x7ff87d69bed0>) at remote 0x7ff87d690a60>}) at remote 0x7ff87dbfde50>, 'annotate': ...(truncated)) at Python/ceval.c:741 #60 _PyEval_EvalCodeWithName (_co=<optimized out>, globals=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=<optimized out>, kwnames=<optimized out>, kwargs=0x7ff87c0fdd78, kwcount=<optimized out>, kwstep=1, defs=0x7ff87d572558, defcount=4, kwdefs=0x0, closure=0x0, name='event', qualname='event') at Python/ceval.c:4298 #61 0x00007ff88d57fdce in _PyFunction_Vectorcall (func=<optimized out>, 
stack=0x7ff87c0fdd70, nargsf=<optimized out>, kwnames=<optimized out>) at Objects/call.c:435 #62 0x00007ff88d57574f in _PyObject_FastCallDict (callable=<function at remote 0x7ff87d5741f0>, args=0x7ff87eb59d18, nargsf=<optimized out>, kwargs=<optimized out>) at Objects/call.c:104 #63 0x00007ff88d66d1ab in partial_fastcall (kwargs={'level': 1, 'depth': 1}, nargs=<optimized out>, args=<optimized out>, pto=0x7ff87d572630) at ./Modules/_functoolsmodule.c:169 #64 partial_call (pto=0x7ff87d572630, args=<optimized out>, kwargs=<optimized out>) at ./Modules/_functoolsmodule.c:224 #65 0x00007ff88d576331 in _PyObject_MakeTpCall (callable=<functools.partial at remote 0x7ff87d572630>, args=<optimized out>, nargs=<optimized out>, keywords=<optimized out>) at Objects/object.c:2207 #66 0x00007ff88d56dc81 in _PyObject_Vectorcall (kwnames=('depth',), nargsf=<optimized out>, args=<optimized out>, callable=<functools.partial at remote 0x7ff87d572630>) at ./Include/cpython/abstract.h:125 #67 _PyObject_Vectorcall (kwnames=('depth',), nargsf=<optimized out>, args=<optimized out>, callable=<functools.partial at remote 0x7ff87d572630>) at ./Include/cpython/abstract.h:115 #68 call_function (kwnames=('depth',), oparg=<optimized out>, pp_stack=<synthetic pointer>, tstate=<optimized out>) at Python/ceval.c:4963 #69 _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3515 #70 0x00007ff88d566268 in PyEval_EvalFrameEx (throwflag=0, f=Frame 0x7ff87c0e4200, for file /opt/fb/mercurial/edenscm/mercurial/ui.py, line 1275, in debug (self=<ui(_buffers=[], _bufferstates=[], _bufferapplylabels=None, _outputui=None, callhooks=True, insecureconnections=False, _colormode=None, _styles={}, _terminaloutput=None, cmdname=None, _uiconfig=<uiconfig(quiet=False, verbose=False, debugflag=False, tracebackflag=False, logmeasuredtimes=False, _rcfg=<localrcfg(_rcfg=<bindings.configparser.config at remote 0x7ff87d7325d0>) at remote 0x7ff87eb73e80>, _unserializable={}, 
_pinnedconfigs=set(), _knownconfig={'alias': <itemregister(_generics={<configitem(section='alias', name='.*', default=None, alias=[], generic=True, priority=0, _re=<re.Pattern at remote 0x7ff87d69bed0>) at remote 0x7ff87d690a60>}) at remote 0x7ff87dbfde50>, 'annotate': <itemregister(_generics=set()) at remote 0x7ff87d6ad9f0>, 'auth': <itemregister(_generics=set()) at remote 0x7ff87dc037c0>, 'blackbox': <itemregister(_generics=set()) at remote 0x7ff87dc039a0>, 'bookmarks': <itemregister(_generics=se...(truncated)) at Python/ceval.c:741 #71 _PyEval_EvalCodeWithName (_co=<optimized out>, globals=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=<optimized out>, kwnames=<optimized out>, kwargs=0x7ff87c1f85e8, kwcount=<optimized out>, kwstep=1, defs=0x0, defcount=0, kwdefs=0x0, closure=0x0, name='debug', qualname='ui.debug') at Python/ceval.c:4298 #72 0x00007ff88d57fdce in _PyFunction_Vectorcall (func=<optimized out>, stack=0x7ff87c1f85d8, nargsf=<optimized out>, kwnames=<optimized out>) at Objects/call.c:435 #73 0x00007ff88d56821a in _PyObject_Vectorcall (kwnames=0x0, nargsf=<optimized out>, args=0x7ff87c1f85d8, callable=<function at remote 0x7ff87d57e9d0>) at ./Include/cpython/abstract.h:127 #74 call_function (kwnames=0x0, oparg=<optimized out>, pp_stack=<synthetic pointer>, tstate=0x7ff88ca08780) at Python/ceval.c:4963 #75 _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3486 #76 0x00007ff88d57fd38 in PyEval_EvalFrameEx (throwflag=0, f=Frame 0x7ff87c1f8440, for file /opt/fb/mercurial/edenscm/mercurial/commandserver.py, line 608, in _reapworkers (self=<unixforkingservice(ui=<ui(_buffers=[], _bufferstates=[], _bufferapplylabels=None, _outputui=None, callhooks=True, insecureconnections=False, _colormode=None, _styles={}, _terminaloutput=None, cmdname=None, _uiconfig=<uiconfig(quiet=False, verbose=False, debugflag=False, tracebackflag=False, logmeasuredtimes=False, 
_rcfg=<localrcfg(_rcfg=<bindings.configparser.config at remote 0x7ff87d7325d0>) at remote 0x7ff87eb73e80>, _unserializable={}, _pinnedconfigs=set(), _knownconfig={'alias': <itemregister(_generics={<configitem(section='alias', name='.*', default=None, alias=[], generic=True, priority=0, _re=<re.Pattern at remote 0x7ff87d69bed0>) at remote 0x7ff87d690a60>}) at remote 0x7ff87dbfde50>, 'annotate': <itemregister(_generics=set()) at remote 0x7ff87d6ad9f0>, 'auth': <itemregister(_generics=set()) at remote 0x7ff87dc037c0>, 'blackbox': <itemregister(_generics=set()) at remote 0x7ff87dc039a0>,...(truncated)) at Python/ceval.c:738 #77 function_code_fastcall (globals=<optimized out>, nargs=2, args=<optimized out>, co=<optimized out>) at Objects/call.c:283 #78 _PyFunction_Vectorcall (func=<optimized out>, stack=<optimized out>, nargsf=<optimized out>, kwnames=<optimized out>) at Objects/call.c:410 #79 0x00007ff88d56821a in _PyObject_Vectorcall (kwnames=0x0, nargsf=<optimized out>, args=0x7ff87c0f2528, callable=<function at remote 0x7ff87d73b310>) at ./Include/cpython/abstract.h:127 #80 call_function (kwnames=0x0, oparg=<optimized out>, pp_stack=<synthetic pointer>, tstate=0x7ff88ca08780) at Python/ceval.c:4963 #81 _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3486 #82 0x00007ff88d57fd38 in PyEval_EvalFrameEx (throwflag=0, f=Frame 0x7ff87c0f23a0, 
for file /opt/fb/mercurial/edenscm/mercurial/commandserver.py, line 591, in _sigchldhandler (self=<unixforkingservice(ui=<ui(_buffers=[], _bufferstates=[], _bufferapplylabels=None, _outputui=None, callhooks=True, insecureconnections=False, _colormode=None, _styles={}, _terminaloutput=None, cmdname=None, _uiconfig=<uiconfig(quiet=False, verbose=False, debugflag=False, tracebackflag=False, logmeasuredtimes=False, _rcfg=<localrcfg(_rcfg=<bindings.configparser.config at remote 0x7ff87d7325d0>) at remote 0x7ff87eb73e80>, _unserializable={}, _pinnedconfigs=set(), _knownconfig={'alias': <itemregister(_generics={<configitem(section='alias', name='.*', default=None, alias=[], generic=True, priority=0, _re=<re.Pattern at remote 0x7ff87d69bed0>) at remote 0x7ff87d690a60>}) at remote 0x7ff87dbfde50>, 'annotate': <itemregister(_generics=set()) at remote 0x7ff87d6ad9f0>, 'auth': <itemregister(_generics=set()) at remote 0x7ff87dc037c0>, 'blackbox': <itemregister(_generics=set()) at remote 0x7ff87dc039a...(truncated)) at Python/ceval.c:738 #83 function_code_fastcall (globals=<optimized out>, nargs=3, args=<optimized out>, co=<optimized out>) at Objects/call.c:283 #84 _PyFunction_Vectorcall (func=<optimized out>, stack=<optimized out>, nargsf=<optimized out>, kwnames=<optimized out>) at Objects/call.c:410 #85 0x00007ff88d592153 in _PyObject_Vectorcall (kwnames=<optimized out>, nargsf=<optimized out>, args=<optimized out>, callable=<optimized out>) at ./Include/cpython/abstract.h:115 #86 method_vectorcall (method=<optimized out>, args=<optimized out>, nargsf=<optimized out>, kwnames=<optimized out>) at Objects/classobject.c:67 #87 0x00007ff88d5963fb in PyVectorcall_Call (kwargs=0x0, tuple=<optimized out>, callable=<method at remote 0x7ff87c70d5c0>) at Objects/dictobject.c:1802 #88 PyObject_Call (callable=<method at remote 0x7ff87c70d5c0>, args=<optimized out>, kwargs=0x0) at Objects/call.c:227 #89 0x00007ff88d6405ea in _PyErr_CheckSignals () at ./Modules/signalmodule.c:1689 #90 
0x00007ff88d59b7fd in _PyErr_CheckSignals () at ./Modules/signalmodule.c:1660
#91 PyErr_CheckSignals () at ./Modules/signalmodule.c:1649
....

Reviewed By: DurhamG
Differential Revision: D27111187
fbshipit-source-id: 1aa29ab24088b57b98de3741eb81c0a7be01237d
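
The re-entrancy pattern in the backtrace above (a callback run under a lock ends up calling back into the same locked function) can be sketched in Python as follows. This is only an illustration under stated assumptions: `CallsiteRegistry`, `create_locked`, and `create_unlocked` are hypothetical names, a `threading.Lock` with a timeout stands in for the Rust lock so the example fails fast instead of hanging, and running `func()` outside the lock is one common remedy, not necessarily what the commit does.

```python
import threading


class CallsiteRegistry:
    """Toy registry guarded by a non-reentrant lock (hypothetical)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._callsites = {}

    def create_locked(self, key, func, timeout=0.1):
        # Problematic shape: func() runs while the lock is held, so a
        # re-entrant call from func() (e.g. via a signal handler) blocks.
        # The timeout turns the would-be deadlock into an error.
        if not self._lock.acquire(timeout=timeout):
            raise RuntimeError("would deadlock: lock already held")
        try:
            if key not in self._callsites:
                self._callsites[key] = func()
            return self._callsites[key]
        finally:
            self._lock.release()

    def create_unlocked(self, key, func):
        # Safer shape: look up under the lock, run func() with no lock
        # held, then insert under the lock again.
        with self._lock:
            if key in self._callsites:
                return self._callsites[key]
        value = func()
        with self._lock:
            # setdefault keeps the first value if another caller won.
            return self._callsites.setdefault(key, value)
```

With `create_locked`, a `func` that itself calls `create_locked` on the same registry raises the `RuntimeError`; the same nesting through `create_unlocked` completes normally.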
facebook-github-bot pushed a commit that referenced this pull request on Jul 12, 2022
Summary: During shutdown, the overlay background thread is terminated and the overlay is closed, but EdenServer timers are still running. One of these is the manageOverlay timer, which performs maintenance on the Overlay. In some cases, shutdown races with this timer, so the timer runs after the overlay has been closed, causing a use-after-free that crashes EdenFS.

To solve this, we can either add locks around closing the overlay and the maintenance to guarantee that they do not race, or move the maintenance operation to a thread that is known to be running only while the overlay is open. This diff takes the second approach.

The bug manifests itself like so:

I0712 01:16:48.253296 21620 EdenServiceHandler.cpp:3394] [000002517432E1E0] initiateShutdown() took 368 µs
I0712 01:16:48.253838 24700 PrjfsChannel.cpp:1185] Stopping PrjfsChannel for: C:\\cygwin\\tmp\\eden_test.stop_at_ea.m3931hyz\\mounts\\main
I0712 01:16:48.258533 19188 EdenServer.cpp:1624] mount point "C:\\cygwin\\tmp\\eden_test.stop_at_ea.m3931hyz\\mounts\\main" stopped
V0712 01:16:48.258814 19188 EdenMount.cpp:851] beginning shutdown for EdenMount C:\\cygwin\\tmp\\eden_test.stop_at_ea.m3931hyz\\mounts\\main
V0712 01:16:48.259895 19188 EdenMount.cpp:855] shutdown complete for EdenMount C:\\cygwin\\tmp\\eden_test.stop_at_ea.m3931hyz\\mounts\\main
V0712 01:16:48.287378 19188 EdenMount.cpp:861] successfully closed overlay at C:\\cygwin\\tmp\\eden_test.stop_at_ea.m3931hyz\\mounts\\main
Unhandled win32 exception code=0xC0000005.
Fatal error detected at:
#0 00007FF707BA3D81 (7a686d9) facebook::eden::SqliteDatabase::checkpoint Z:\shipit\eden\eden\fs\sqlite\SqliteDatabase.cpp:99
#1 00007FF7072D4090 facebook::eden::EdenServer::manageOverlay Z:\shipit\eden\eden\fs\service\EdenServer.cpp:2142
#2 00007FF7074765D0 facebook::eden::PeriodicTask::timeoutExpired Z:\shipit\eden\eden\fs\service\PeriodicTask.cpp:32
#3 00007FF707500F93 folly::HHWheelTimerBase<std::chrono::duration<__int64,std::ratio<1,1000> > >::timeoutExpired Z:\shipit\folly\folly\io\async\HHWheelTimer.cpp:286
#4 00007FF70757BB44 folly::AsyncTimeout::libeventCallback Z:\shipit\folly\folly\io\async\AsyncTimeout.cpp:174
#5 00007FF708E7AD45 (645b6fc) event_priority_set
#6 00007FF708E7AA3A event_priority_set
#7 00007FF708E77343 event_base_loop
#8 00007FF707515FC5 folly::EventBase::loopMain Z:\shipit\folly\folly\io\async\EventBase.cpp:405
#9 00007FF707515C62 folly::EventBase::loopBody Z:\shipit\folly\folly\io\async\EventBase.cpp:326
#10 00007FF7072DC5EE facebook::eden::EdenServer::performCleanup Z:\shipit\eden\eden\fs\service\EdenServer.cpp:1212
#11 00007FF707219BED facebook::eden::runEdenMain Z:\shipit\eden\eden\fs\service\EdenMain.cpp:395
#12 00007FF7071C624A main Z:\shipit\eden\eden\fs\service\oss\main.cpp:23
#13 00007FF708E87A94 __scrt_common_main_seh d:\A01\_work\12\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:288
#14 00007FFC96DC7034 BaseThreadInitThunk
#15 00007FFC98C5CEC1 RtlUserThreadStart

Reviewed By: fanzeyi
Differential Revision: D37793444
fbshipit-source-id: cd33302789c2c7a29d566d5bac6e119eccf0a5f2
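
The second approach described in this summary (maintenance runs on a thread that exists only while the overlay is open) might look roughly like the Python sketch below. It is a toy model, not EdenFS code: the `Overlay` class, its fields, and the `errors` counter standing in for a use-after-free are all hypothetical. The point is that `close()` stops and joins the maintenance thread before marking the overlay closed, so maintenance cannot run afterwards.

```python
import threading


class Overlay:
    """Toy overlay that owns its own maintenance thread (hypothetical)."""

    def __init__(self, interval=0.01):
        self.closed = False
        self.checkpoints = 0
        self.errors = 0  # counts maintenance runs on a closed overlay
        self._interval = interval
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _maintain(self):
        if self.closed:
            # Stand-in for the use-after-free seen in the crash.
            self.errors += 1
            return
        self.checkpoints += 1

    def _run(self):
        # Event.wait returns False on timeout and True once stop is set,
        # so the loop exits promptly when close() is called.
        while not self._stop.wait(self._interval):
            self._maintain()

    def close(self):
        # Stop and join the maintenance thread first; only then mark
        # the overlay closed.
        self._stop.set()
        self._thread.join()
        self.closed = True
```

Because the thread's lifetime is bounded by `__init__` and `close()`, no external timer needs to know whether the overlay is still open.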
facebook-github-bot pushed a commit that referenced this pull request on Oct 20, 2022
Summary: It turns out that initializing objc types both before fork() (disabling appnap) and after it (via libcurl [2]) aborts the program with a message like:

objc[<pid>]: +[__NSCFConstantString initialize] may have been in progress in another thread when fork() was called.

It seems objc really dislikes fork. Disabling the objc logic before fork seems to be the only way to unblock the issue.

Other approaches considered:
- Avoid `fork()`: not really feasible for performance (startup, Python GIL) reasons.
- Ensure chgserver does not create threads (see https://bugs.python.org/issue33725). Not possible since appnope implicitly creates a (short-lived?) thread.
- Disable AppNap without using objc: does not seem trivial.
- Set `OBJC_DISABLE_INITIALIZE_FORK_SAFETY` to `YES`: aborts with a different message [1].

[1]:
```
The process has forked and you cannot use this CoreFoundation functionality safely. You MUST exec().
Break on __THE_PROCESS_HAS_FORKED_AND_YOU_CANNOT_USE_THIS_COREFOUNDATION_FUNCTIONALITY___YOU_MUST_EXEC__() to debug.
```

[2]:
```
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
  * frame #0: 0x00007ff81395f067 libobjc.A.dylib`objc_initializeAfterForkError
    frame #1: 0x00007ff81395f187 libobjc.A.dylib`performForkChildInitialize(objc_class*, objc_class*) + 274
    frame #2: 0x00007ff81394a479 libobjc.A.dylib`initializeNonMetaClass + 617
    frame #3: 0x00007ff813949f12 libobjc.A.dylib`initializeAndMaybeRelock(objc_class*, objc_object*, mutex_tt<false>&, bool) + 232
    frame #4: 0x00007ff813949c93 libobjc.A.dylib`lookUpImpOrForward + 1087
    frame #5: 0x00007ff81394e02f libobjc.A.dylib`object_getMethodImplementation + 153
    frame #6: 0x00007ff813b12934 CoreFoundation`_NSIsNSString + 55
    frame #7: 0x00007ff813b128dc CoreFoundation`-[NSTaggedPointerString isEqual:] + 36
    frame #8: 0x00007ff813b05609 CoreFoundation`CFEqual + 533
    frame #9: 0x00007ff813b0b2a3 CoreFoundation`_CFBundleCopyBundleURLForExecutableURL + 220
    frame #10: 0x00007ff813b069cb CoreFoundation`CFBundleGetMainBundle + 116
    frame #11: 0x00007ff813b28ade CoreFoundation`_CFPrefsGetCacheStringForBundleID + 71
    frame #12: 0x00007ff813b2f52f CoreFoundation`-[CFPrefsPlistSource setDomainIdentifier:] + 92
    frame #13: 0x00007ff813b2f46c CoreFoundation`-[CFPrefsPlistSource initWithDomain:user:byHost:containerPath:containingPreferences:] + 99
    frame #14: 0x00007ff813b2f351 CoreFoundation`__85-[_CFXPreferences(PlistSourceAdditions) withManagedSourceForIdentifier:user:perform:]_block_invoke + 156
    frame #15: 0x00007ff813c622a7 CoreFoundation`-[_CFXPreferences withSources:] + 60
    frame #16: 0x00007ff813cac80f CoreFoundation`-[_CFXPreferences withManagedSourceForIdentifier:user:perform:] + 240
    frame #17: 0x00007ff813b2a1ab CoreFoundation`-[CFPrefsSearchListSource addManagedSourceForIdentifier:user:] + 98
    frame #18: 0x00007ff813c83a18 CoreFoundation`__108-[_CFXPreferences(SearchListAdditions) withSearchListForIdentifier:container:cloudConfigurationURL:perform:]_block_invoke.160 + 287
    frame #19: 0x00007ff813c836e8 CoreFoundation`-[_CFXPreferences withSearchLists:] + 60
    frame #20: 0x00007ff813b29f50 CoreFoundation`__108-[_CFXPreferences(SearchListAdditions) withSearchListForIdentifier:container:cloudConfigurationURL:perform:]_block_invoke + 279
    frame #21: 0x00007ff813c83879 CoreFoundation`-[_CFXPreferences withSearchListForIdentifier:container:cloudConfigurationURL:perform:] + 374
    frame #22: 0x00007ff813b29a42 CoreFoundation`-[_CFXPreferences copyAppValueForKey:identifier:container:configurationURL:] + 137
    frame #23: 0x00007ff813b29978 CoreFoundation`_CFPreferencesCopyAppValueWithContainerAndConfiguration + 101
    frame #24: 0x00007ff8145c672b SystemConfiguration`SCDynamicStoreCopyProxiesWithOptions + 155
    frame #25: 0x000000010272a315 hg`Curl_resolv(data=0x00007f8dea888a00, hostname="api.github.com", port=443, allowDOH=true, entry=0x000000030a12be10) at hostip.c:675:30 [opt]
    frame #26: 0x000000010272a68c hg`Curl_resolv_timeout(data=<unavailable>, hostname=<unavailable>, port=<unavailable>, entry=<unavailable>, timeoutms=<unavailable>) at hostip.c:908:8 [opt] [artificial]
    frame #27: 0x0000000102753e1e hg`create_conn at url.c:3440:12 [opt]
    frame #28: 0x0000000102753cfe hg`create_conn(data=0x00007f8dea888a00, in_connect=0x000000030a12bed8, async=0x000000030a12bf8f) at url.c:4077:12 [opt]
    frame #29: 0x000000010275026b hg`Curl_connect(data=0x00007f8dea888a00, asyncp=0x000000030a12bf8f, protocol_done=0x000000030a12bfb2) at url.c:4156:12 [opt]
    frame #30: 0x000000010273dbd1 hg`multi_runsingle(multi=<unavailable>, nowp=0x000000030a12c020, data=0x00007f8dea888a00) at multi.c:1858:16 [opt]
    frame #31: 0x000000010273d6ae hg`curl_multi_perform(multi=0x00007f8deb0699c0, running_handles=0x000000030a12c074) at multi.c:2636:14 [opt]
    frame #32: 0x0000000102726ada hg`curl_easy_perform at easy.c:599:15 [opt]
    frame #33: 0x0000000102726aa8 hg`curl_easy_perform [inlined] easy_perform(data=0x00007f8dea888a00, events=false) at easy.c:689:42 [opt]
    frame #34: 0x00000001027269a4 hg`curl_easy_perform(data=0x00007f8dea888a00) at easy.c:708:10 [opt]
    frame #35: 0x00000001025e1cf6 hg`http_client::request::Request::send::h13a4e600f6bc5508 [inlined] curl::easy::handler::Easy2$LT$H$GT$::perform::h2ba0ae1da25a8852(self=<unavailable>) at handler.rs:3163:37 [opt]
```

Reviewed By: bolinfest
Differential Revision: D40538471
fbshipit-source-id: cd8611c8082fbe2d610efb78cb84defdb16d7980
facebook-github-bot pushed a commit that referenced this pull request on Aug 21, 2023
Summary: Creating an initial version of the gitexport CLI (for more context on why we need it, see T160586594). This tool is supposed to take a repo and a list of paths as input and export all the history of those paths into a git repo.

## What does it do now?

Currently, this binary doesn't do anything useful. It just gets the history of a single path to be exported and prints the changeset ids and commit messages (for manual debugging).

The main point of this diff is to **set most of the structure/flow of the tool to get some early feedback** before I start implementing anything more complex. Most of the functions don't have an actual implementation, but just do something simple (e.g. returning the first element of a vector) so it typechecks.

## What's my current plan?

1) Get the history of all the given paths. (This is mostly done in this diff already)
2) Merge the changesets into a single, topologically sorted list of changesets.
3) Strip irrelevant changes from every commit (T161205476).
4) Create a CommitGraph from this list (T161204758).
5) Export that CommitGraph to a new, temporary Mononoke repo (T160787114).
6) Use existing tools to export that temporary repo to a git repo (T160787114).

The tricky bits are steps 2, 3 and 4, which is where I expect to spend most of my time.

First, I'm not sure if I even need to create a CommitGraph at all to be able to export the processed changesets to a new repo. If I do need to, I'm not sure (a) whether I should strip the irrelevant file changes before or after creating the graph and (b) how to create a new repo and populate it with the commits from the graph I created. (b) is more of an implementation detail, so I'm not worrying about it now...

The main unknowns for me are #2 and #4. Basically, how can I create a proper commit graph from a set of commits that are not direct descendants of each other? Assuming a linear history, I don't think it would be very complicated, but we also have to support branching, so I'm not sure how to do this efficiently...

## Examples

Let me put a simple example below. Commits with uppercase letters are relevant (i.e. should be exported) and commits with lowercase letters should not be exported.

```
A -> b -> C -> D -> e |-> f -> G
```

In this case, I want to have the following commit graph in the end:

```
A' -> C' -> D' |-> G'
```

where X' is X stripped of irrelevant changes.

## RFC

- This is my first Rust diff ever, so please LMK what horrible things I'm doing, bc I'm very likely doing a few 😂
- Does the plan I described above make sense?
- Any suggestions/ideas on how to efficiently stitch the changesets together would be appreciated! But I'll probably set up some time to discuss this problem specifically once I spend more time thinking about it...

## Next steps

- Implement steps #5 and #6 (T160787114) to get the entire E2E solution working with the simplest case (i.e. one path with linear history). This is basically exporting the commit graph to a git repo (maybe through a temporary Mononoke repo).
- Update the integration test case to actually run and test the tool with the simple case.
- Figure out how to properly create a commit graph from a list of changeset lists.
- Add test cases for multiple paths and edge cases, like having multiple branches.

Reviewed By: RajivTS
Differential Revision: D48226070
fbshipit-source-id: eed970a8e4697ab10682e3b93863e6d621adaacc
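One possible way to approach the stitching question raised in the gitexport summary (steps 2 and 4) is to walk the original graph in topological order and give every relevant commit its nearest relevant ancestors as parents. The sketch below is only an illustration in Python, not the gitexport implementation: `stitch`, `parents`, and `relevant` are hypothetical names, and because the example in the summary does not say where `f` branches off, it is assumed here to be a child of `C`.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def stitch(parents, relevant):
    """Return new parent lists for the relevant commits only.

    parents:  dict mapping each commit to a list of its parents.
    relevant: set of commits that should survive the export.
    """
    # static_order() yields parents before children.
    order = TopologicalSorter(parents).static_order()
    nearest = {}      # commit -> nearest relevant ancestors (or itself)
    new_parents = {}  # relevant commit -> rewritten parent list
    for commit in order:
        inherited = set()
        for parent in parents.get(commit, ()):
            inherited |= nearest[parent]
        if commit in relevant:
            new_parents[commit] = sorted(inherited)
            nearest[commit] = {commit}
        else:
            # Irrelevant commits are skipped: children see through them.
            nearest[commit] = inherited
    return new_parents


# Assumed shape of the example: f is taken to branch off C.
parents = {
    "A": [], "b": ["A"], "C": ["b"], "D": ["C"],
    "e": ["D"], "f": ["C"], "G": ["f"],
}
result = stitch(parents, {"A", "C", "D", "G"})
print(result)
```

Under that assumption, `C` is rewritten to have parent `A`, and both `D` and `G` get parent `C`. With merge commits this simple version can report a redundant parent (one rewritten parent that is already an ancestor of another), so a real tool would likely need an extra pruning step.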
facebook-github-bot pushed a commit that referenced this pull request on May 1, 2024
Summary: We have discovered a deadlock in EdenFS in S412223. The deadlock appears when all FsChannelThreads have a stack trace that looks like:

```
thread #225, name = 'FsChannelThread', stop reason = signal SIGSTOP
  frame #0: 0x00007f7ca85257b9
  frame #1: 0x0000000007d73aa4 edenfs`bool folly::DynamicBoundedQueue<folly::CPUThreadPoolExecutor::CPUTask, false, false, true, 8ul, 7ul, folly::DefaultWeightFn<folly::CPUThreadPoolExecutor::CPUTask>, std::atomic>::canEnqueue<std::chrono::_V2::steady_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l>>>(std::chrono::time_point<std::chrono::_V2::steady_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l>>> const&, unsigned long) + 212
  frame #2: 0x0000000007d73557 edenfs`bool folly::DynamicBoundedQueue<folly::CPUThreadPoolExecutor::CPUTask, false, false, true, 8ul, 7ul, folly::DefaultWeightFn<folly::CPUThreadPoolExecutor::CPUTask>, std::atomic>::tryEnqueueUntilSlow<std::chrono::_V2::steady_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l>>, folly::CPUThreadPoolExecutor::CPUTask>(folly::CPUThreadPoolExecutor::CPUTask&&, std::chrono::time_point<std::chrono::_V2::steady_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l>>> const&) + 39
  frame #3: 0x0000000007d722c0 edenfs`facebook::eden::EdenTaskQueue::add(folly::CPUThreadPoolExecutor::CPUTask) + 576
  frame #4: 0x0000000005172cbd edenfs`folly::CPUThreadPoolExecutor::add(folly::Function<void ()>) + 557
  frame #5: 0x000000000503c4d8 edenfs`void folly::Executor::KeepAlive<folly::Executor>::add<folly::Function<void (folly::Executor::KeepAlive<folly::Executor>&&)>>(folly::Function<void (folly::Executor::KeepAlive<folly::Executor>&&)>&&) && + 216
  frame #6: 0x000000000503c257 edenfs`folly::futures::detail::DeferredExecutor::addFrom(folly::Executor::KeepAlive<folly::Executor>&&, folly::Function<void (folly::Executor::KeepAlive<folly::Executor>&&)>) + 135
  frame #7: 0x000000000503baf5 edenfs`folly::futures::detail::CoreBase::doCallback(folly::Executor::KeepAlive<folly::Executor>&&, folly::futures::detail::State) + 565
  frame #8: 0x00000000052e5e0d edenfs`folly::futures::detail::CoreBase::proxyCallback(folly::futures::detail::State) + 493
  frame #9: 0x00000000052e5bb2 edenfs`folly::futures::detail::CoreBase::setProxy_(folly::futures::detail::CoreBase*) + 66
  ...
```

This stack means that an FsChannelThread is blocked waiting to enqueue a task to the EdenTaskQueue. The EdenTaskQueue is the task queue that FsChannelThreads pull work from. So the deadlock is that all FsChannelThreads are blocked trying to add to a full queue which can only be emptied by FsChannelThreads.

There are two contributing reasons for this happening:

1. The FsChannelThread will queue a bunch of network fetches into the Sapling queue. When those all finish, the future callback chain runs on the FsChannelThreads. So the backing store accumulates a bunch of work, then when the fetches complete it dumps a bunch of tasks onto the FsChannelThread's queue. So the backing store is filling up the FsChannelThread task queue. Other threads could theoretically do this too, but backingstore is the main one I have seen.

2. The FsChannelThreads might enqueue to their own threads. Folly futures have some smarts to try to prevent a task running on executor a from enqueueing work onto executor a (and instead just run the callback inline), see: https://www.internalfb.com/code/fbsource/[c7c20340562d2eab5f5d2f7f45805546687942d9]/fbcode/folly/futures/detail/Core.cpp?lines=147-148. However, that does not prevent a future that is unknowingly running on an executor's thread from enqueueing to that thread's executor queue. I believe that kind of thing happens when we do stuff like this: https://www.internalfb.com/code/fbsource/[824f6dc95f161e141bf9b821a7826c40b570ddc3]/fbcode/eden/fs/inodes/TreeInode.cpp?lines=375-376 The outer lambda is aware which executor it's running on, but the future we start inside the lambda is not, so when we add that thenValue, it doesn't realize it's enqueuing to its own executor. I wrote up this toy program to show when folly will enqueue vs run inline: https://www.internalfb.com/intern/commit/cloud/FBS/bce3a906f53913ab8dc74944b8b50d09d78baf9a. script: P1222930602, output: P1222931093. This shows that if you return a future from a future callback, the next callback is enqueued, and if you have a callback on a future returned by another future's callback, the callback on the returned future is enqueued.

So in summary, backingstore fills up the FsChannelThread work queue, then FsChannelThread tries to push work onto its own queue and deadlocks.

Potential solutions:

**1- Queued Immediate Executor.** This behavior was likely introduced here: D51302394. We moved from queued immediate executor to the fschannelthreads. Queued immediate executor uses an unbounded queue, it looks like: https://www.internalfb.com/code/fbsource/[c7c20340562d2eab5f5d2f7f45805546687942d9]/fbcode/folly/executors/QueuedImmediateExecutor.h?lines=39 so that is why we didn't have the problem before. I don't love going back to this solution because we are moving away from queued immediate executor because it makes it easy to cause deadlocks. For example the one introduced in D50199539.

**2- Make the queue error instead of block when full.** We used to do that for the Eden CPU thread pool, and it resulted in a lot of errors from eden that caused issues for clients. See: D6490979. I think we are kicking the can down the road if we error rather than block.

**3- Don't bother switching to the fschannelthreads to complete the fuse request.** This is likely going to be the same as 1, unless we are going to undo the semifuture-ization we have been doing. Or perhaps we could start the fuse request on the fschannelthreads, then finish it on the eden cpu threads. That is pretty much the same thing as this diff except with more sets of threads involved. So I prefer this change.

**4- Add backpressure somewhere else.** If we prevent the backingstore/other threads from being able to fill up the fschannelthread queue, then it should be impossible for the queue to fill up, because there would be no fan out (one task out, then one task in). However, this is fragile: we could easily introduce fan out again and then end up back here. Also this would mean we basically block all requests in the fuse layer instead of the lower layers of eden. We would need to redo the queueing in the backing store layer. The fragility and complexity make me not like this solution.

**5 - Linearize all future chains.** The reason that the fschannelthreads are enqueueing to their own queue is that we have nested futures. If we didn't nest, then folly futures would run callbacks inline. So if we de-nest all our futures this issue should theoretically go away, because the fschannelthreads will not enqueue to their own queue. However, I don't know if we can completely get rid of returning a future in a future callback. I don't love this solution as I know there are some explicit places where we choose to nest (I'm looking at you PrjFSDispatcher), so it would be VERY confusing where I am supposed to nest and where I am not. It would be easy to do the wrong thing and re-introduce this bug. Also this is a ton of places we need to change and they are not easy to find. So I don't like this option because it's not very "pit of success": too fragile and too easy to get wrong the first time.

**6 - Unbound the fschannelthread queue.** It's not great. But there is precedent for this. We unbounded the eden cpu thread pool for basically the same reason. See D6513572. The risk here is that we are opening ourselves up to OOM. The queue might grow super duper large and then get eden OOM killed. We probably should add a config to this change so we can roll this out carefully and watch for OOMs as we roll out. Additionally, long term we likely want to rethink how we do threading to architect eden away from this. I prefer this solution the most. That's what I have implemented here.

---------------------------

note: I am removing the limit on the number of outstanding fs requests we process at once in this diff. That config was not exactly working how we wanted anyway. Queueing in the backing store let us handle essentially infinite requests at once, as the Sapling request queue does not have a max size. I can follow up with a semaphore in the fuse/nfs layers to rate limit the number of active requests. Though fwiw I will likely bump the limit at least initially by a lot when I do that, since we realistically were allowing clients to do infinite requests previously.

Reviewed By: jdelliot, genevievehelsel
Differential Revision: D56553375
fbshipit-source-id: 9c6c8a76bd7c93b00d48654cd5fc31d1a68dc0b2
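The deadlock shape described in the EdenFS summary, and why option 6 avoids it, can be sketched with a toy model. This is Python with the standard `queue` module, not folly or EdenFS code: `run_worker`, `fan_out`, `leaf`, and `demo` are hypothetical names, a single worker stands in for the whole FsChannelThread pool, and a short `put` timeout stands in for an enqueue that would otherwise block forever.

```python
import queue


def run_worker(q, enqueue_timeout=0.05):
    """Drain q on one worker; tasks may enqueue follow-ups into the same queue.

    Returns (completed_tasks, stalled). stalled=True means the worker could
    not enqueue into its own full queue, which is where a real blocking
    enqueue would hang, since this worker is the queue's only consumer.
    """
    completed = 0
    while True:
        try:
            task = q.get_nowait()
        except queue.Empty:
            return completed, False
        for follow_up in task():
            try:
                q.put(follow_up, timeout=enqueue_timeout)
            except queue.Full:
                return completed, True
        completed += 1


def leaf():
    # A task with no follow-up work.
    return []


def fan_out():
    # One task in, two tasks out: the fan out that fills the queue.
    return [leaf, leaf]


def demo(maxsize):
    # maxsize=0 means an unbounded queue in queue.Queue.
    q = queue.Queue(maxsize=maxsize)
    q.put(fan_out)
    return run_worker(q)


bounded = demo(maxsize=1)    # worker stalls on its own full queue
unbounded = demo(maxsize=0)  # all three tasks complete
print(bounded, unbounded)
```

In this model the bounded queue stalls as soon as one task produces more follow-ups than the queue has room for, while the unbounded queue always drains. The cost is the one named in option 6: nothing limits how large the unbounded queue can grow.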
facebook-github-bot pushed a commit that referenced this pull request on Oct 30, 2024
Summary: Fixes this (which was polluting my `arc rust-check` output):

```
warning: unused variable: `time`
   --> fbcode/eden/scm/saplingnative/bindings/modules/pymetalog/src/lib.rs:119:38
    |
119 |     def commit(&self, message: &str, time: Option<u64> = None, pending: bool = false) -> PyResult<Bytes> {
    |                                      ^^^^
    |
help: `time` is captured in macro and introduced a unused variable
   --> third-party/rust/vendor/cpython-0.7.2/src/py_class/py_class.rs:478:1
    |
    = note: in this expansion of `py_class!` (#1)
   ::: third-party/rust/vendor/cpython-0.7.2/src/py_class/py_class.rs:537:9
    |
    = note: in this macro invocation (#2)
   ::: third-party/rust/vendor/cpython-0.7.2/src/py_class/py_class.rs:543:1
    |
    = note: in this expansion of `$crate::py_class_impl_item!` (#18)
   --> third-party/rust/vendor/cpython-0.7.2/src/py_class/py_class_impl3.rs:30:1
    |
    = note: in this expansion of `$crate::py_class_impl!` (#2)
    |
    = note: in this expansion of `$crate::py_class_impl!` (#3)
    |
    = note: in this expansion of `$crate::py_class_impl!` (#4)
    |
    = note: in this expansion of `$crate::py_class_impl!` (#5)
    |
    = note: in this expansion of `$crate::py_class_impl!` (#6)
    |
    = note: in this expansion of `$crate::py_class_impl!` (#7)
    |
    = note: in this expansion of `$crate::py_class_impl!` (#8)
    |
    = note: in this expansion of `$crate::py_class_impl!` (#9)
    |
    = note: in this expansion of `$crate::py_class_impl!` (#10)
    |
    = note: in this expansion of `$crate::py_class_impl!` (#11)
    |
    = note: in this expansion of `$crate::py_class_impl!` (#12)
    |
    = note: in this expansion of `$crate::py_class_impl!` (#13)
   ::: third-party/rust/vendor/cpython-0.7.2/src/py_class/py_class_impl3.rs:249:5
    |
    = note: in this macro invocation (#3)
    |
    = note: in this macro invocation (#4)
   ::: third-party/rust/vendor/cpython-0.7.2/src/py_class/py_class_impl3.rs:2392:5
    |
    = note: in this macro invocation (#5)
   ::: third-party/rust/vendor/cpython-0.7.2/src/py_class/py_class_impl3.rs:2970:5
    |
    = note: in this macro invocation (#7)
    |
    = note: in this macro invocation (#13)
   ::: third-party/rust/vendor/cpython-0.7.2/src/py_class/py_class_impl3.rs:2999:13
    |
    = note: in this macro invocation (#14)
   ::: third-party/rust/vendor/cpython-0.7.2/src/py_class/py_class_impl3.rs:3005:5
    |
    = note: in this macro invocation (#8)
    |
    = note: in this macro invocation (#11)
    |
    = note: in this macro invocation (#12)
   ::: third-party/rust/vendor/cpython-0.7.2/src/py_class/py_class_impl3.rs:3120:5
    |
    = note: in this macro invocation (#6)
    |
    = note: in this macro invocation (#9)
    |
    = note: in this macro invocation (#10)
   --> third-party/rust/vendor/cpython-0.7.2/src/argparse.rs:196:1
    |
    = note: in this expansion of `$crate::py_argparse_parse_plist_impl!` (#14)
    |
    = note: in this expansion of `$crate::py_argparse_parse_plist_impl!` (#15)
    |
    = note: in this expansion of `$crate::py_argparse_parse_plist_impl!` (#16)
    |
    = note: in this expansion of `$crate::py_argparse_parse_plist_impl!` (#17)
   ::: third-party/rust/vendor/cpython-0.7.2/src/argparse.rs:201:9
    |
    = note: in this macro invocation (#18)
   ::: third-party/rust/vendor/cpython-0.7.2/src/argparse.rs:271:9
    |
    = note: in this macro invocation (#15)
   ::: third-party/rust/vendor/cpython-0.7.2/src/argparse.rs:331:9
    |
    = note: in this macro invocation (#16)
    |
    = note: in this macro invocation (#17)
    |
   ::: fbcode/eden/scm/saplingnative/bindings/modules/pymetalog/src/lib.rs:36:1
    |
36  | / py_class!(pub class metalog |py| {
37  | |     data log: Arc<RwLock<MetaLog>>;
38  | |     data fspath: String;
...   |
226 | |     }
227 | | });
    | |____- in this macro invocation (#1)
    = note: `#[warn(unused_variables)]` on by default
```

Reviewed By: quark-zju
Differential Revision: D65218833
fbshipit-source-id: fa69c1a24a32b7eff857070528f6337ec0b3711c
The internal and external repositories are out of sync. This attempts to bring them back in sync by patching the GitHub repository. Please carefully review this patch. You must disable ShipIt for your project in order to merge this pull request. DO NOT IMPORT this pull request. Instead, merge it directly on GitHub using the MERGE BUTTON. Re-enable ShipIt after merging.