adding license url #2

Closed · wants to merge 2 commits into from

Conversation

KrishMunot

No description provided.

@ghost ghost added the CLA Signed label Aug 8, 2016
@bolinfest
Contributor

This is not the authoritative source of the license. https://github.com/facebookexperimental/eden/blob/master/LICENSE is.

@bolinfest bolinfest closed this Aug 8, 2016
@KrishMunot
Author

Sorry, I rectified the mistake in another PR - https://github.com/facebookexperimental/eden/blob/master/LICENSE

ghost pushed a commit that referenced this pull request Sep 9, 2016
Summary:
We can't allow ~EdenServer to delete the memory until we're sure that
the other threads are done.  To ensure that, we need to notify the condition
variable while the aux thread still holds the lock.  This guarantees that the
thread destroying the EdenServer waits for the aux thread to release the lock
before it checks the predicate and proceeds to delete the memory (a minimal
sketch of this protocol follows the sanitizer output below).

```
SUMMARY  ThreadSanitizer: data race //common/concurrency/Event.cpp:107 in facebook::common::concurrency::Event::set() const
==================
I0909 14:51:18.543072 4147554 main.cpp:173] edenfs performing orderly shutdown
I0909 14:51:18.555794 4148654 Channel.cpp:177] session completed
I0909 14:51:18.556011 4148654 EdenServer.cpp:192] mount point "/tmp/eden_test.0ostuc90/mounts/main" stopped
==================
WARNING: ThreadSanitizer: data race (pid=4147554)
  Write of size 8 at 0x7fff9e182d90 by main thread:
    #0 pthread_cond_destroy <null> (edenfs+0x00000007671a)
    #1 facebook::eden::EdenServer::~EdenServer() //eden/fs/service/EdenServer.cpp:93 (libeden_fs_service_server.so+0x0000000b96cd)
    #2 main //eden/fs/service/main.cpp:176 (edenfs+0x000000018515)

  Previous read of size 8 at 0x7fff9e182d90 by thread T73:
    #0 pthread_cond_broadcast <null> (edenfs+0x0000000765b7)
    #1 __gthread_cond_broadcast /home/engshare/third-party2/libgcc/4.9.x/src/gcc-4_9/x86_64-facebook-linux/libstdc++-v3/include/x86_64-facebook-linux/bits/gthr-default.h:852 (libstdc++.so.6+0x0000000e14f8)
    #2 std::condition_variable::notify_all() /home/engshare/third-party2/libgcc/4.9.x/src/gcc-4_9/x86_64-facebook-linux/libstdc++-v3/src/c++11/../../../.././libstdc++-v3/src/c++11/condition_variable.cc:72 (libstdc++.so.6+0x0000000e14f8)
    #3 facebook::eden::EdenServer::mount(std::shared_ptr<facebook::eden::EdenMount>, std::unique_ptr<facebook::eden::ClientConfig, std::default_delete<facebook::eden::ClientConfig> >)::$_0::operator()() const //eden/fs/service/EdenServer.cpp:145 (libeden_fs_service_server.so+0x0000000bcdb5)
    #4 std::_Function_handler<void (), facebook::eden::EdenServer::mount(std::shared_ptr<facebook::eden::EdenMount>, std::unique_ptr<facebook::eden::ClientConfig, std::default_delete<facebook::eden::ClientConfig> >)::$_0>::_M_invoke(std::_Any_data const&) //third-party-buck/gcc-4.9-glibc-2.20-fb/build/libgcc/include/c++/trunk/functional:2039 (libeden_fs_service_server.so+0x0000000bcab0)
    #5 std::function<void ()>::operator()() const //third-party-buck/gcc-4.9-glibc-2.20-fb/build/libgcc/include/c++/trunk/functional:2439 (libeden_fuse_fusell.so+0x00000020fbb9)
    #6 facebook::eden::fusell::MountPoint::start(bool, std::function<void ()> const&)::$_0::operator()() const //eden/fuse/MountPoint.cpp:69 (libeden_fuse_fusell.so+0x000000237447)
    #7 void std::_Bind_simple<facebook::eden::fusell::MountPoint::start(bool, std::function<void ()> const&)::$_0 ()>::_M_invoke<>(std::_Index_tuple<>) //third-party-buck/gcc-4.9-glibc-2.20-fb/build/libgcc/include/c++/trunk/functional:1699 (libeden_fuse_fusell.so+0x000000237048)
    #8 std::_Bind_simple<facebook::eden::fusell::MountPoint::start(bool, std::function<void ()> const&)::$_0 ()>::operator()() //third-party-buck/gcc-4.9-glibc-2.20-fb/build/libgcc/include/c++/trunk/functional:1688 (libeden_fuse_fusell.so+0x000000236ff8)
    #9 std::thread::_Impl<std::_Bind_simple<facebook::eden::fusell::MountPoint::start(bool, std::function<void ()> const&)::$_0 ()> >::_M_run() //third-party-buck/gcc-4.9-glibc-2.20-fb/build/libgcc/include/c++/trunk/thread:115 (libeden_fuse_fusell.so+0x000000236d8c)
    #10 execute_native_thread_routine /home/engshare/third-party2/libgcc/4.9.x/src/gcc-4_9/x86_64-facebook-linux/libstdc++-v3/src/c++11/../../../.././libstdc++-v3/src/c++11/thread.cc:84 (libstdc++.so.6+0x0000000e6ec0)
```
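
For illustration, here is a minimal sketch of that shutdown protocol - written in Rust with hypothetical names (the actual code is C++ in EdenServer.cpp). Notifying while the lock is still held guarantees the destroying thread cannot pass its predicate check until the aux thread has released the lock:

```
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

struct Shutdown {
    done: Mutex<bool>,
    cv: Condvar,
}

fn main() {
    let state = Arc::new(Shutdown { done: Mutex::new(false), cv: Condvar::new() });

    let aux = {
        let state = Arc::clone(&state);
        thread::spawn(move || {
            let mut done = state.done.lock().unwrap();
            *done = true;
            // Notify while still holding the lock: the waiting thread cannot
            // re-check the predicate until this guard is released.
            state.cv.notify_all();
        })
    };

    // "Destructor" side: block until the aux thread is finished.
    let mut done = state.done.lock().unwrap();
    while !*done {
        done = state.cv.wait(done).unwrap();
    }
    drop(done);
    aux.join().unwrap();
    // Only now is it safe to tear down the mutex and condition variable.
}
```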

Reviewed By: simpkins

Differential Revision: D3844846

fbshipit-source-id: 545474bc1aff8621dbeb487dcd6b54c82828ff3b
facebook-github-bot pushed a commit that referenced this pull request Apr 10, 2019
…t (attempt #2)

Summary:
The file descriptor API here needs to go away, so switch this API to NetworkSocket

It is expected that this commit will cause a number of Open Source projects to temporarily show up as broken. This is due to the fact that not all projects get synced to Github at the exact same time, so the builds may temporarily be fetching an older version of their dependencies than they need to :) It should fix itself quickly.

Reviewed By: yfeldblum

Differential Revision: D14673328

fbshipit-source-id: c5842fa5dc383d50043e0d8228e35d03b10a1c6b
facebook-github-bot pushed a commit that referenced this pull request Mar 12, 2020
Summary:
Let's use functionality added in D20389226 so that we can generate
diffs even for large files.

I considered two approaches: either "silently" generate a
placeholder diff for files that are over the limit, or add a new option
that lets the client request these placeholders. I chose the latter for a few
reasons:

1) Option #1 might cause surprises, e.g. diffing a single large file
doesn't fail, but diffing two files whose size is (LIMIT / 2 + 1) will fail.
Option #2 lets the client be very explicit about what it needs - and it also
won't return a placeholder when the actual content is expected!
2) Option #2 makes the client think about what it wants returned,
which seems in line with source control thrift API design in general, i.e.
commit_file_diffs is not trivial to use by design, to force the client to
think about edge cases. Forcing the client to think about an additional
important edge case (i.e. large files) makes sense.
3) The code change is simpler with option #2 (a small sketch of this shape
follows below).

I also thought about adding generate_placeholder_diff parameter to CommitFileDiffsParams,
but that makes integration with scs_helper.py harder.
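
A hypothetical sketch of option #2's shape (not the real Thrift API or Mononoke code): the client must explicitly opt in to placeholders, and a too-large file without the opt-in stays an error rather than silently degrading:

```
struct FileDiffRequest {
    size_bytes: u64,
    generate_placeholder_diff: bool, // explicit client opt-in
}

enum Diff {
    Full(String),
    Placeholder, // "file changed, too large to render"
}

fn diff_file(req: &FileDiffRequest, limit: u64, render: impl FnOnce() -> String) -> Result<Diff, String> {
    if req.size_bytes > limit {
        if req.generate_placeholder_diff {
            Ok(Diff::Placeholder)
        } else {
            Err(format!("file size {} exceeds diff limit {}", req.size_bytes, limit))
        }
    } else {
        Ok(Diff::Full(render()))
    }
}
```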

Reviewed By: markbt

Differential Revision: D20417787

fbshipit-source-id: ab9b32fd7a4768043414ed7d8bf39e3c6f4eb53e
facebook-github-bot pushed a commit that referenced this pull request Sep 24, 2020
Summary:
Our current megarepo configuration is in a bit of a mess:
1) We have LiveCommitSyncConfig, which fetches the latest version of configs
from configerator and should be used in all cases
2) However, we still have an old commit sync config that's stored in mononoke config. It
shouldn't really be used at all.

Unfortunately there are a few places where #2 is still used. This diff removes
one of them.

Reviewed By: ikostia

Differential Revision: D23845297

fbshipit-source-id: aa2d591223cc4b8fe5ef264147457fcb3d1faad7
facebook-github-bot pushed a commit that referenced this pull request Sep 25, 2020
Summary:
As described in D23845720 (5de500b), we are making a pretty significant change to the
CommitSyncer. Previously it stored static Movers and BookmarkRenamers, but this
needs to change: they would need to be fetched from LiveCommitSyncConfig.

Unfortunately we already have a bunch of tests that create weird movers that
would be very hard to model via LiveCommitSyncConfig. So we have 2 options:

1) Delete or rewrite all these tests
2) Create a wrapper that would return a mover/bookmark renamer.

Option #2 is preferable, and that's what this diff does. In production it would
use LiveCommitSyncConfig, but it also lets tests specify whatever weird movers
they need (a sketch of this wrapper shape follows below).
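
A minimal sketch of that wrapper idea - all types here are hypothetical stand-ins, not the actual CommitSyncer API:

```
use std::sync::Arc;

type Mover = Arc<dyn Fn(&str) -> Option<String> + Send + Sync>;

enum MoverSource {
    // Production: derive the mover from the latest LiveCommitSyncConfig.
    FromLiveConfig,
    // Tests: inject an arbitrary hand-written mover directly.
    Static(Mover),
}

impl MoverSource {
    fn mover(&self) -> Mover {
        match self {
            // Stand-in for consulting the live config on every call.
            MoverSource::FromLiveConfig => Arc::new(|path: &str| Some(format!("prefix/{path}"))),
            MoverSource::Static(m) => Arc::clone(m),
        }
    }
}

fn main() {
    // A test can use a "weird" mover that would be hard to express as config.
    let test_source = MoverSource::Static(Arc::new(|p: &str| p.strip_prefix("x/").map(String::from)));
    assert_eq!(test_source.mover()("x/file"), Some("file".to_string()));
}
```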

Reviewed By: ikostia

Differential Revision: D23909432

fbshipit-source-id: 83fb627812f625e07f7e40044e2f69274cd2d768
facebook-github-bot pushed a commit that referenced this pull request Mar 17, 2021
Summary:
This can cause a deadlock if `func()` calls `create_callsite`.

Usually this is rare, but a (Python) signal handler makes it realistic (a minimal reduction of the hazard is sketched after the backtrace below).

Example backtrace:

  Traceback (most recent call first):
    File "/opt/fb/mercurial/edenscm/tracing.py", line 337, in event
    File "/opt/fb/mercurial/edenscm/mercurial/ui.py", line 1275, in debug
      tracing.debug(msg.rstrip("\n"), depth=1)
    File "/opt/fb/mercurial/edenscm/mercurial/commandserver.py", line 608, in _reapworkers
      self.ui.debug("worker process exited (pid=%d)\n" % pid)
    File "/opt/fb/mercurial/edenscm/mercurial/commandserver.py", line 591, in _sigchldhandler
      self._reapworkers(os.WNOHANG)
    File "/opt/fb/mercurial/edenscm/tracing.py", line 337, in event
    File "/opt/fb/mercurial/edenscm/mercurial/ui.py", line 1275, in debug
      tracing.debug(msg.rstrip("\n"), depth=1)
    File "/opt/fb/mercurial/edenscm/mercurial/commandserver.py", line 608, in _reapworkers
      self.ui.debug("worker process exited (pid=%d)\n" % pid)
    File "/opt/fb/mercurial/edenscm/mercurial/commandserver.py", line 591, in _sigchldhandler
      self._reapworkers(os.WNOHANG)

  #0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
  #1  0x000055d0d65ba339 in <parking_lot::raw_rwlock::RawRwLock>::lock_upgradable_slow ()
  #2  0x000055d0d55b5814 in tracing_runtime_callsite::create_callsite::<tracing_runtime_callsite::callsite_info::EventKindType, pytracing::new_callsite<tracing_runtime_callsite::callsite_info::EventKindType>::{closure#2}> ()
  #3  0x000055d0d5584cb9 in <pytracing::EventCallsite>::__new__ ()
  #4  0x000055d0d55a3eaa in std::panicking::try::<*mut python3_sys::object::PyObject, cpython::function::handle_callback<<pytracing::EventCallsite>::create_instance::TYPE_OBJECT::wrap_newfunc::{closure#0}, pytracing::EventCallsite, cpython::function::PyObjectCallbackConverter>::{closure#0}> ()
  #5  0x000055d0d5589365 in cpython::function::handle_callback::<<pytracing::EventCallsite>::create_instance::TYPE_OBJECT::wrap_newfunc::{closure#0}, pytracing::EventCallsite, cpython::function::PyObjectCallbackConverter> ()
  #6  0x000055d0d55856e1 in <pytracing::EventCallsite>::create_instance::TYPE_OBJECT::wrap_newfunc ()
  #7  0x00007ff88d576230 in type_call (
      kwds={'obj': Frame 0x7ff87c1f8c40, for file /opt/fb/mercurial/edenscm/mercurial/commandserver.py, line 608, in _reapworkers (self=<unixforkingservice(ui=<ui(_buffers=[], _bufferstates=[], _bufferapplylabels=None, _outputui=None, callhooks=True, insecureconnections=False, _colormode=None, _styles={}, _terminaloutput=None, cmdname=None, _uiconfig=<uiconfig(quiet=False, verbose=False, debugflag=False, tracebackflag=False, logmeasuredtimes=False, _rcfg=<localrcfg(_rcfg=<bindings.configparser.config at remote 0x7ff87d7325d0>) at remote 0x7ff87eb73e80>, _unserializable={}, _pinnedconfigs=set(), _knownconfig={'alias': <itemregister(_generics={<configitem(section='alias', name='.*', default=None, alias=[], generic=True, priority=0, _re=<re.Pattern at remote 0x7ff87d69bed0>) at remote 0x7ff87d690a60>}) at remote 0x7ff87dbfde50>, 'annotate': <itemregister(_generics=set()) at remote 0x7ff87d6ad9f0>, 'auth': <itemregister(_generics=set()) at remote 0x7ff87dc037c0>, 'blackbox': <itemregister(_generics=set()) at remote 0x7ff87d...(truncated), args=(),
      type=0x55d0d8ea5b40 <_RNvNvMs1R_CsgCrAUYYhx1D_9pytracingNtB8_13EventCallsite15create_instance11TYPE_OBJECT.llvm.4665269759137401160>) at Objects/typeobject.c:974
  #8  _PyObject_MakeTpCall (callable=<type at remote 0x55d0d8ea5b40>, args=<optimized out>,
      nargs=<optimized out>, keywords=<optimized out>) at Objects/call.c:159
  #9  0x00007ff88d56dc81 in _PyObject_Vectorcall (
      kwnames=('obj', 'name', 'target', 'level', 'fieldnames'), nargsf=<optimized out>,
      args=<optimized out>, callable=<type at remote 0x55d0d8ea5b40>) at ./Include/cpython/abstract.h:125
  #10 _PyObject_Vectorcall (kwnames=('obj', 'name', 'target', 'level', 'fieldnames'),
      nargsf=<optimized out>, args=<optimized out>, callable=<type at remote 0x55d0d8ea5b40>)
      at ./Include/cpython/abstract.h:115
  #11 call_function (kwnames=('obj', 'name', 'target', 'level', 'fieldnames'), oparg=<optimized out>,
      pp_stack=<synthetic pointer>, tstate=<optimized out>) at Python/ceval.c:4963
  #12 _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3515
  #13 0x00007ff88d566268 in PyEval_EvalFrameEx (throwflag=0,
      f=Frame 0x7ff87cced010, for file /opt/fb/mercurial/edenscm/tracing.py, line 337, in event (message='worker process exited (pid=3953080)', name=None, target=None, level=1, depth=1, meta={}, frame=Frame 0x7ff87c1f8c40, for file /opt/fb/mercurial/edenscm/mercurial/commandserver.py, line 608, in _reapworkers (self=<unixforkingservice(ui=<ui(_buffers=[], _bufferstates=[], _bufferapplylabels=None, _outputui=None, callhooks=True, insecureconnections=False, _colormode=None, _styles={}, _terminaloutput=None, cmdname=None, _uiconfig=<uiconfig(quiet=False, verbose=False, debugflag=False, tracebackflag=False, logmeasuredtimes=False, _rcfg=<localrcfg(_rcfg=<bindings.configparser.config at remote 0x7ff87d7325d0>) at remote 0x7ff87eb73e80>, _unserializable={}, _pinnedconfigs=set(), _knownconfig={'alias': <itemregister(_generics={<configitem(section='alias', name='.*', default=None, alias=[], generic=True, priority=0, _re=<re.Pattern at remote 0x7ff87d69bed0>) at remote 0x7ff87d690a60>}) at remote 0x7ff87dbfde50>, 'annotate': ...(truncated))
      at Python/ceval.c:741
  #14 _PyEval_EvalCodeWithName (_co=<optimized out>, globals=<optimized out>, locals=<optimized out>,
      args=<optimized out>, argcount=<optimized out>, kwnames=<optimized out>, kwargs=0x7ff87c0fdc78,
      kwcount=<optimized out>, kwstep=1, defs=0x7ff87d572558, defcount=4, kwdefs=0x0, closure=0x0,
      name='event', qualname='event') at Python/ceval.c:4298
  #15 0x00007ff88d57fdce in _PyFunction_Vectorcall (func=<optimized out>, stack=0x7ff87c0fdc70,
      nargsf=<optimized out>, kwnames=<optimized out>) at Objects/call.c:435
  #16 0x00007ff88d57574f in _PyObject_FastCallDict (callable=<function at remote 0x7ff87d5741f0>,
      args=0x7ff87eb63838, nargsf=<optimized out>, kwargs=<optimized out>) at Objects/call.c:104
  #17 0x00007ff88d66d1ab in partial_fastcall (kwargs={'level': 1, 'depth': 1}, nargs=<optimized out>,
      args=<optimized out>, pto=0x7ff87d572630) at ./Modules/_functoolsmodule.c:169
  #18 partial_call (pto=0x7ff87d572630, args=<optimized out>, kwargs=<optimized out>)
      at ./Modules/_functoolsmodule.c:224
  #19 0x00007ff88d576331 in _PyObject_MakeTpCall (callable=<functools.partial at remote 0x7ff87d572630>,
      args=<optimized out>, nargs=<optimized out>, keywords=<optimized out>) at Objects/object.c:2207
  #20 0x00007ff88d56dc81 in _PyObject_Vectorcall (kwnames=('depth',), nargsf=<optimized out>,
      args=<optimized out>, callable=<functools.partial at remote 0x7ff87d572630>)
      at ./Include/cpython/abstract.h:125
  #21 _PyObject_Vectorcall (kwnames=('depth',), nargsf=<optimized out>, args=<optimized out>,
      callable=<functools.partial at remote 0x7ff87d572630>) at ./Include/cpython/abstract.h:115
  #22 call_function (kwnames=('depth',), oparg=<optimized out>, pp_stack=<synthetic pointer>,
      tstate=<optimized out>) at Python/ceval.c:4963
  #23 _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3515
  #24 0x00007ff88d566268 in PyEval_EvalFrameEx (throwflag=0,
      f=Frame 0x7ff87c0e43c0, for file /opt/fb/mercurial/edenscm/mercurial/ui.py, line 1275, in debug (self=<ui(_buffers=[], _bufferstates=[], _bufferapplylabels=None, _outputui=None, callhooks=True, insecureconnections=False, _colormode=None, _styles={}, _terminaloutput=None, cmdname=None, _uiconfig=<uiconfig(quiet=False, verbose=False, debugflag=False, tracebackflag=False, logmeasuredtimes=False, _rcfg=<localrcfg(_rcfg=<bindings.configparser.config at remote 0x7ff87d7325d0>) at remote 0x7ff87eb73e80>, _unserializable={}, _pinnedconfigs=set(), _knownconfig={'alias': <itemregister(_generics={<configitem(section='alias', name='.*', default=None, alias=[], generic=True, priority=0, _re=<re.Pattern at remote 0x7ff87d69bed0>) at remote 0x7ff87d690a60>}) at remote 0x7ff87dbfde50>, 'annotate': <itemregister(_generics=set()) at remote 0x7ff87d6ad9f0>, 'auth': <itemregister(_generics=set()) at remote 0x7ff87dc037c0>, 'blackbox': <itemregister(_generics=set()) at remote 0x7ff87dc039a0>, 'bookmarks': <itemregister(_generics=se...(truncated))
      at Python/ceval.c:741
  #25 _PyEval_EvalCodeWithName (_co=<optimized out>, globals=<optimized out>, locals=<optimized out>,
      args=<optimized out>, argcount=<optimized out>, kwnames=<optimized out>, kwargs=0x7ff87c1f8de8,
      kwcount=<optimized out>, kwstep=1, defs=0x0, defcount=0, kwdefs=0x0, closure=0x0, name='debug',
      qualname='ui.debug') at Python/ceval.c:4298
  #26 0x00007ff88d57fdce in _PyFunction_Vectorcall (func=<optimized out>, stack=0x7ff87c1f8dd8,
      nargsf=<optimized out>, kwnames=<optimized out>) at Objects/call.c:435
  #27 0x00007ff88d56821a in _PyObject_Vectorcall (kwnames=0x0, nargsf=<optimized out>,
      args=0x7ff87c1f8dd8, callable=<function at remote 0x7ff87d57e9d0>)
      at ./Include/cpython/abstract.h:127
  #28 call_function (kwnames=0x0, oparg=<optimized out>, pp_stack=<synthetic pointer>,
      tstate=0x7ff88ca08780) at Python/ceval.c:4963
  #29 _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3486
  #30 0x00007ff88d57fd38 in PyEval_EvalFrameEx (throwflag=0,
      f=Frame 0x7ff87c1f8c40, for file /opt/fb/mercurial/edenscm/mercurial/commandserver.py, line 608, in _reapworkers (self=<unixforkingservice(ui=<ui(_buffers=[], _bufferstates=[], _bufferapplylabels=None, _outputui=None, callhooks=True, insecureconnections=False, _colormode=None, _styles={}, _terminaloutput=None, cmdname=None, _uiconfig=<uiconfig(quiet=False, verbose=False, debugflag=False, tracebackflag=False, logmeasuredtimes=False, _rcfg=<localrcfg(_rcfg=<bindings.configparser.config at remote 0x7ff87d7325d0>) at remote 0x7ff87eb73e80>, _unserializable={}, _pinnedconfigs=set(), _knownconfig={'alias': <itemregister(_generics={<configitem(section='alias', name='.*', default=None, alias=[], generic=True, priority=0, _re=<re.Pattern at remote 0x7ff87d69bed0>) at remote 0x7ff87d690a60>}) at remote 0x7ff87dbfde50>, 'annotate': <itemregister(_generics=set()) at remote 0x7ff87d6ad9f0>, 'auth': <itemregister(_generics=set()) at remote 0x7ff87dc037c0>, 'blackbox': <itemregister(_generics=set()) at remote 0x7ff87dc039a0>,...(truncated))
      at Python/ceval.c:738
  #31 function_code_fastcall (globals=<optimized out>, nargs=2, args=<optimized out>, co=<optimized out>)
      at Objects/call.c:283
  #32 _PyFunction_Vectorcall (func=<optimized out>, stack=<optimized out>, nargsf=<optimized out>,
      kwnames=<optimized out>) at Objects/call.c:410
  #33 0x00007ff88d56821a in _PyObject_Vectorcall (kwnames=0x0, nargsf=<optimized out>,
      args=0x7ff87c0f26d8, callable=<function at remote 0x7ff87d73b310>)
      at ./Include/cpython/abstract.h:127
  #34 call_function (kwnames=0x0, oparg=<optimized out>, pp_stack=<synthetic pointer>,
      tstate=0x7ff88ca08780) at Python/ceval.c:4963
  #35 _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3486
  #36 0x00007ff88d57fd38 in PyEval_EvalFrameEx (throwflag=0,
      f=Frame 0x7ff87c0f2550, for file /opt/fb/mercurial/edenscm/mercurial/commandserver.py, line 591, in _sigchldhandler (self=<unixforkingservice(ui=<ui(_buffers=[], _bufferstates=[], _bufferapplylabels=None, _outputui=None, callhooks=True, insecureconnections=False, _colormode=None, _styles={}, _terminaloutput=None, cmdname=None, _uiconfig=<uiconfig(quiet=False, verbose=False, debugflag=False, tracebackflag=False, logmeasuredtimes=False, _rcfg=<localrcfg(_rcfg=<bindings.configparser.config at remote 0x7ff87d7325d0>) at remote 0x7ff87eb73e80>, _unserializable={}, _pinnedconfigs=set(), _knownconfig={'alias': <itemregister(_generics={<configitem(section='alias', name='.*', default=None, alias=[], generic=True, priority=0, _re=<re.Pattern at remote 0x7ff87d69bed0>) at remote 0x7ff87d690a60>}) at remote 0x7ff87dbfde50>, 'annotate': <itemregister(_generics=set()) at remote 0x7ff87d6ad9f0>, 'auth': <itemregister(_generics=set()) at remote 0x7ff87dc037c0>, 'blackbox': <itemregister(_generics=set()) at remote 0x7ff87dc039a...(truncated))
      at Python/ceval.c:738
  #37 function_code_fastcall (globals=<optimized out>, nargs=3, args=<optimized out>, co=<optimized out>)
      at Objects/call.c:283
  #38 _PyFunction_Vectorcall (func=<optimized out>, stack=<optimized out>, nargsf=<optimized out>,
      kwnames=<optimized out>) at Objects/call.c:410
  #39 0x00007ff88d592153 in _PyObject_Vectorcall (kwnames=<optimized out>, nargsf=<optimized out>,
      args=<optimized out>, callable=<optimized out>) at ./Include/cpython/abstract.h:115
  #40 method_vectorcall (method=<optimized out>, args=<optimized out>, nargsf=<optimized out>,
      kwnames=<optimized out>) at Objects/classobject.c:67
  #41 0x00007ff88d5963fb in PyVectorcall_Call (kwargs=0x0, tuple=<optimized out>,
      callable=<method at remote 0x7ff87c70d5c0>) at Objects/dictobject.c:1802
  #42 PyObject_Call (callable=<method at remote 0x7ff87c70d5c0>, args=<optimized out>, kwargs=0x0)
      at Objects/call.c:227
  #43 0x00007ff88d6405ea in _PyErr_CheckSignals () at ./Modules/signalmodule.c:1689
  #44 0x00007ff88d5a41a1 in _PyErr_CheckSignals () at Objects/object.c:577
  #45 PyErr_CheckSignals () at ./Modules/signalmodule.c:1649
  #46 PyObject_Str (v='_reapworkers') at Objects/object.c:561
  #47 0x000055d0d557c821 in pytracing::tostr_opt ()
  #48 0x000055d0d55b5a7d in tracing_runtime_callsite::create_callsite::<tracing_runtime_callsite::callsite_info::EventKindType, pytracing::new_callsite<tracing_runtime_callsite::callsite_info::EventKindType>::{closure#2}> ()
  #49 0x000055d0d5584cb9 in <pytracing::EventCallsite>::__new__ ()
  #50 0x000055d0d55a3eaa in std::panicking::try::<*mut python3_sys::object::PyObject, cpython::function::handle_callback<<pytracing::EventCallsite>::create_instance::TYPE_OBJECT::wrap_newfunc::{closure#0}, pytracing::EventCallsite, cpython::function::PyObjectCallbackConverter>::{closure#0}> ()
  #51 0x000055d0d5589365 in cpython::function::handle_callback::<<pytracing::EventCallsite>::create_instance::TYPE_OBJECT::wrap_newfunc::{closure#0}, pytracing::EventCallsite, cpython::function::PyObjectCallbackConverter> ()
  #52 0x000055d0d55856e1 in <pytracing::EventCallsite>::create_instance::TYPE_OBJECT::wrap_newfunc ()
  #53 0x00007ff88d576230 in type_call (
      kwds={'obj': Frame 0x7ff87c1f8440, for file /opt/fb/mercurial/edenscm/mercurial/commandserver.py, line 608, in _reapworkers (self=<unixforkingservice(ui=<ui(_buffers=[], _bufferstates=[], _bufferapplylabels=None, _outputui=None, callhooks=True, insecureconnections=False, _colormode=None, _styles={}, _terminaloutput=None, cmdname=None, _uiconfig=<uiconfig(quiet=False, verbose=False, debugflag=False, tracebackflag=False, logmeasuredtimes=False, _rcfg=<localrcfg(_rcfg=<bindings.configparser.config at remote 0x7ff87d7325d0>) at remote 0x7ff87eb73e80>, _unserializable={}, _pinnedconfigs=set(), _knownconfig={'alias': <itemregister(_generics={<configitem(section='alias', name='.*', default=None, alias=[], generic=True, priority=0, _re=<re.Pattern at remote 0x7ff87d69bed0>) at remote 0x7ff87d690a60>}) at remote 0x7ff87dbfde50>, 'annotate': <itemregister(_generics=set()) at remote 0x7ff87d6ad9f0>, 'auth': <itemregister(_generics=set()) at remote 0x7ff87dc037c0>, 'blackbox': <itemregister(_generics=set()) at remote 0x7ff87d...(truncated), args=(),
      type=0x55d0d8ea5b40 <_RNvNvMs1R_CsgCrAUYYhx1D_9pytracingNtB8_13EventCallsite15create_instance11TYPE_OBJECT.llvm.4665269759137401160>) at Objects/typeobject.c:974
  #54 _PyObject_MakeTpCall (callable=<type at remote 0x55d0d8ea5b40>, args=<optimized out>,
      nargs=<optimized out>, keywords=<optimized out>) at Objects/call.c:159
  #55 0x00007ff88d56dc81 in _PyObject_Vectorcall (
      kwnames=('obj', 'name', 'target', 'level', 'fieldnames'), nargsf=<optimized out>,
      args=<optimized out>, callable=<type at remote 0x55d0d8ea5b40>) at ./Include/cpython/abstract.h:125
  #56 _PyObject_Vectorcall (kwnames=('obj', 'name', 'target', 'level', 'fieldnames'),
      nargsf=<optimized out>, args=<optimized out>, callable=<type at remote 0x55d0d8ea5b40>)
      at ./Include/cpython/abstract.h:115
  #57 call_function (kwnames=('obj', 'name', 'target', 'level', 'fieldnames'), oparg=<optimized out>,
      pp_stack=<synthetic pointer>, tstate=<optimized out>) at Python/ceval.c:4963
  #58 _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3515
  #59 0x00007ff88d566268 in PyEval_EvalFrameEx (throwflag=0,
      f=Frame 0x7ff87ccec890, for file /opt/fb/mercurial/edenscm/tracing.py, line 337, in event (message='worker process exited (pid=3953122)', name=None, target=None, level=1, depth=1, meta={}, frame=Frame 0x7ff87c1f8440, for file /opt/fb/mercurial/edenscm/mercurial/commandserver.py, line 608, in _reapworkers (self=<unixforkingservice(ui=<ui(_buffers=[], _bufferstates=[], _bufferapplylabels=None, _outputui=None, callhooks=True, insecureconnections=False, _colormode=None, _styles={}, _terminaloutput=None, cmdname=None, _uiconfig=<uiconfig(quiet=False, verbose=False, debugflag=False, tracebackflag=False, logmeasuredtimes=False, _rcfg=<localrcfg(_rcfg=<bindings.configparser.config at remote 0x7ff87d7325d0>) at remote 0x7ff87eb73e80>, _unserializable={}, _pinnedconfigs=set(), _knownconfig={'alias': <itemregister(_generics={<configitem(section='alias', name='.*', default=None, alias=[], generic=True, priority=0, _re=<re.Pattern at remote 0x7ff87d69bed0>) at remote 0x7ff87d690a60>}) at remote 0x7ff87dbfde50>, 'annotate': ...(truncated))
      at Python/ceval.c:741
  #60 _PyEval_EvalCodeWithName (_co=<optimized out>, globals=<optimized out>, locals=<optimized out>,
      args=<optimized out>, argcount=<optimized out>, kwnames=<optimized out>, kwargs=0x7ff87c0fdd78,
      kwcount=<optimized out>, kwstep=1, defs=0x7ff87d572558, defcount=4, kwdefs=0x0, closure=0x0,
      name='event', qualname='event') at Python/ceval.c:4298
  #61 0x00007ff88d57fdce in _PyFunction_Vectorcall (func=<optimized out>, stack=0x7ff87c0fdd70,
      nargsf=<optimized out>, kwnames=<optimized out>) at Objects/call.c:435
  #62 0x00007ff88d57574f in _PyObject_FastCallDict (callable=<function at remote 0x7ff87d5741f0>,
      args=0x7ff87eb59d18, nargsf=<optimized out>, kwargs=<optimized out>) at Objects/call.c:104
  #63 0x00007ff88d66d1ab in partial_fastcall (kwargs={'level': 1, 'depth': 1}, nargs=<optimized out>,
      args=<optimized out>, pto=0x7ff87d572630) at ./Modules/_functoolsmodule.c:169
  #64 partial_call (pto=0x7ff87d572630, args=<optimized out>, kwargs=<optimized out>)
      at ./Modules/_functoolsmodule.c:224
  #65 0x00007ff88d576331 in _PyObject_MakeTpCall (callable=<functools.partial at remote 0x7ff87d572630>,
      args=<optimized out>, nargs=<optimized out>, keywords=<optimized out>) at Objects/object.c:2207
  #66 0x00007ff88d56dc81 in _PyObject_Vectorcall (kwnames=('depth',), nargsf=<optimized out>,
      args=<optimized out>, callable=<functools.partial at remote 0x7ff87d572630>)
      at ./Include/cpython/abstract.h:125
  #67 _PyObject_Vectorcall (kwnames=('depth',), nargsf=<optimized out>, args=<optimized out>,
      callable=<functools.partial at remote 0x7ff87d572630>) at ./Include/cpython/abstract.h:115
  #68 call_function (kwnames=('depth',), oparg=<optimized out>, pp_stack=<synthetic pointer>,
      tstate=<optimized out>) at Python/ceval.c:4963
  #69 _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3515
  #70 0x00007ff88d566268 in PyEval_EvalFrameEx (throwflag=0,
      f=Frame 0x7ff87c0e4200, for file /opt/fb/mercurial/edenscm/mercurial/ui.py, line 1275, in debug (self=<ui(_buffers=[], _bufferstates=[], _bufferapplylabels=None, _outputui=None, callhooks=True, insecureconnections=False, _colormode=None, _styles={}, _terminaloutput=None, cmdname=None, _uiconfig=<uiconfig(quiet=False, verbose=False, debugflag=False, tracebackflag=False, logmeasuredtimes=False, _rcfg=<localrcfg(_rcfg=<bindings.configparser.config at remote 0x7ff87d7325d0>) at remote 0x7ff87eb73e80>, _unserializable={}, _pinnedconfigs=set(), _knownconfig={'alias': <itemregister(_generics={<configitem(section='alias', name='.*', default=None, alias=[], generic=True, priority=0, _re=<re.Pattern at remote 0x7ff87d69bed0>) at remote 0x7ff87d690a60>}) at remote 0x7ff87dbfde50>, 'annotate': <itemregister(_generics=set()) at remote 0x7ff87d6ad9f0>, 'auth': <itemregister(_generics=set()) at remote 0x7ff87dc037c0>, 'blackbox': <itemregister(_generics=set()) at remote 0x7ff87dc039a0>, 'bookmarks': <itemregister(_generics=se...(truncated))
      at Python/ceval.c:741
  #71 _PyEval_EvalCodeWithName (_co=<optimized out>, globals=<optimized out>, locals=<optimized out>,
      args=<optimized out>, argcount=<optimized out>, kwnames=<optimized out>, kwargs=0x7ff87c1f85e8,
      kwcount=<optimized out>, kwstep=1, defs=0x0, defcount=0, kwdefs=0x0, closure=0x0, name='debug',
      qualname='ui.debug') at Python/ceval.c:4298
  #72 0x00007ff88d57fdce in _PyFunction_Vectorcall (func=<optimized out>, stack=0x7ff87c1f85d8,
      nargsf=<optimized out>, kwnames=<optimized out>) at Objects/call.c:435
  #73 0x00007ff88d56821a in _PyObject_Vectorcall (kwnames=0x0, nargsf=<optimized out>,
      args=0x7ff87c1f85d8, callable=<function at remote 0x7ff87d57e9d0>)
      at ./Include/cpython/abstract.h:127
  #74 call_function (kwnames=0x0, oparg=<optimized out>, pp_stack=<synthetic pointer>,
      tstate=0x7ff88ca08780) at Python/ceval.c:4963
  #75 _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3486
  #76 0x00007ff88d57fd38 in PyEval_EvalFrameEx (throwflag=0,
      f=Frame 0x7ff87c1f8440, for file /opt/fb/mercurial/edenscm/mercurial/commandserver.py, line 608, in _reapworkers (self=<unixforkingservice(ui=<ui(_buffers=[], _bufferstates=[], _bufferapplylabels=None, _outputui=None, callhooks=True, insecureconnections=False, _colormode=None, _styles={}, _terminaloutput=None, cmdname=None, _uiconfig=<uiconfig(quiet=False, verbose=False, debugflag=False, tracebackflag=False, logmeasuredtimes=False, _rcfg=<localrcfg(_rcfg=<bindings.configparser.config at remote 0x7ff87d7325d0>) at remote 0x7ff87eb73e80>, _unserializable={}, _pinnedconfigs=set(), _knownconfig={'alias': <itemregister(_generics={<configitem(section='alias', name='.*', default=None, alias=[], generic=True, priority=0, _re=<re.Pattern at remote 0x7ff87d69bed0>) at remote 0x7ff87d690a60>}) at remote 0x7ff87dbfde50>, 'annotate': <itemregister(_generics=set()) at remote 0x7ff87d6ad9f0>, 'auth': <itemregister(_generics=set()) at remote 0x7ff87dc037c0>, 'blackbox': <itemregister(_generics=set()) at remote 0x7ff87dc039a0>,...(truncated))
      at Python/ceval.c:738
  #77 function_code_fastcall (globals=<optimized out>, nargs=2, args=<optimized out>, co=<optimized out>)
      at Objects/call.c:283
  #78 _PyFunction_Vectorcall (func=<optimized out>, stack=<optimized out>, nargsf=<optimized out>,
      kwnames=<optimized out>) at Objects/call.c:410
  #79 0x00007ff88d56821a in _PyObject_Vectorcall (kwnames=0x0, nargsf=<optimized out>,
      args=0x7ff87c0f2528, callable=<function at remote 0x7ff87d73b310>)
      at ./Include/cpython/abstract.h:127
  #80 call_function (kwnames=0x0, oparg=<optimized out>, pp_stack=<synthetic pointer>,
      tstate=0x7ff88ca08780) at Python/ceval.c:4963
  #81 _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3486
  #82 0x00007ff88d57fd38 in PyEval_EvalFrameEx (throwflag=0,
      f=Frame 0x7ff87c0f23a0, for file /opt/fb/mercurial/edenscm/mercurial/commandserver.py, line 591, in _sigchldhandler (self=<unixforkingservice(ui=<ui(_buffers=[], _bufferstates=[], _bufferapplylabels=None, _outputui=None, callhooks=True, insecureconnections=False, _colormode=None, _styles={}, _terminaloutput=None, cmdname=None, _uiconfig=<uiconfig(quiet=False, verbose=False, debugflag=False, tracebackflag=False, logmeasuredtimes=False, _rcfg=<localrcfg(_rcfg=<bindings.configparser.config at remote 0x7ff87d7325d0>) at remote 0x7ff87eb73e80>, _unserializable={}, _pinnedconfigs=set(), _knownconfig={'alias': <itemregister(_generics={<configitem(section='alias', name='.*', default=None, alias=[], generic=True, priority=0, _re=<re.Pattern at remote 0x7ff87d69bed0>) at remote 0x7ff87d690a60>}) at remote 0x7ff87dbfde50>, 'annotate': <itemregister(_generics=set()) at remote 0x7ff87d6ad9f0>, 'auth': <itemregister(_generics=set()) at remote 0x7ff87dc037c0>, 'blackbox': <itemregister(_generics=set()) at remote 0x7ff87dc039a...(truncated))
      at Python/ceval.c:738
  #83 function_code_fastcall (globals=<optimized out>, nargs=3, args=<optimized out>, co=<optimized out>)
      at Objects/call.c:283
  #84 _PyFunction_Vectorcall (func=<optimized out>, stack=<optimized out>, nargsf=<optimized out>,
      kwnames=<optimized out>) at Objects/call.c:410
  #85 0x00007ff88d592153 in _PyObject_Vectorcall (kwnames=<optimized out>, nargsf=<optimized out>,
      args=<optimized out>, callable=<optimized out>) at ./Include/cpython/abstract.h:115
  #86 method_vectorcall (method=<optimized out>, args=<optimized out>, nargsf=<optimized out>,
      kwnames=<optimized out>) at Objects/classobject.c:67
  #87 0x00007ff88d5963fb in PyVectorcall_Call (kwargs=0x0, tuple=<optimized out>,
      callable=<method at remote 0x7ff87c70d5c0>) at Objects/dictobject.c:1802
  #88 PyObject_Call (callable=<method at remote 0x7ff87c70d5c0>, args=<optimized out>, kwargs=0x0)
      at Objects/call.c:227
  #89 0x00007ff88d6405ea in _PyErr_CheckSignals () at ./Modules/signalmodule.c:1689
  #90 0x00007ff88d59b7fd in _PyErr_CheckSignals () at ./Modules/signalmodule.c:1660
  #91 PyErr_CheckSignals () at ./Modules/signalmodule.c:1649
  ....
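
A minimal reduction of the hazard, sketched in Rust with hypothetical names (std's `Mutex` stands in for parking_lot's lock here; neither is reentrant):

```
use std::collections::HashMap;
use std::sync::Mutex;

// The callsite registry lock. Re-locking from the same thread hangs
// (or may panic), exactly the failure mode in the backtrace above.
static CALLSITES: Mutex<Option<HashMap<String, usize>>> = Mutex::new(None);

// Deadlock-prone shape: the lock is held across the caller-supplied closure,
// so if `func` re-enters create_callsite_* (e.g. a signal handler firing
// mid-call), the second acquisition never succeeds.
fn create_callsite_bad(name: &str, func: impl FnOnce() -> usize) -> usize {
    let mut guard = CALLSITES.lock().unwrap();
    let id = func(); // re-entry here deadlocks
    guard.get_or_insert_with(HashMap::new).insert(name.to_string(), id);
    id
}

// Safer shape: run the closure first, then take the lock only for the
// insertion, so a re-entrant call never observes the lock held by its
// own thread.
fn create_callsite_good(name: &str, func: impl FnOnce() -> usize) -> usize {
    let id = func();
    let mut guard = CALLSITES.lock().unwrap();
    guard.get_or_insert_with(HashMap::new).insert(name.to_string(), id);
    id
}

fn main() {
    assert_eq!(create_callsite_good("event", || 42), 42);
    let _ = create_callsite_bad; // shown only to contrast the two shapes
}
```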

Reviewed By: DurhamG

Differential Revision: D27111187

fbshipit-source-id: 1aa29ab24088b57b98de3741eb81c0a7be01237d
facebook-github-bot pushed a commit that referenced this pull request Mar 19, 2021
Summary:
This introduces a basic building block for query batching. I called this
rendezvous, since it's about multiple queries meeting up in the same place :)

There are a few (somewhat conflicting) goals this tries to satisfy, so let's go
over them:

1), we'd like to reduce the total number of queries made by batch jobs. For
example, group hg bonsai lookups made by the walker. Those jobs are
characterized by the fact that they have a lot of queries to make, all the
time. Here's an example: https://fburl.com/ods/zuiep7yh.

2), we'd like to reduce the overall number of connections held to MySQL by
our tasks. The main way we achieve this is by reducing the maximum number of
concurrent queries. Indeed, a high total number of queries doesn't necessarily
result in a lot of connections as long as they're not concurrent, because we
can reuse connections. On the other hand, if you dispatch 100 concurrent
queries, that _does_ use 100 connections. This is something that applies to
batch jobs due to their query volume, but also to "interactive" jobs like
Mononoke Server or SCS, just not all the time. Here's an example:
https://fburl.com/ods/o6gp07qp (you can see the query count is overall low, but
sometimes spikes substantially).

2.1) It's also worth noting that concurrent queries are often the result of
many clients wanting the same data, so deduplication is also useful here.

3), we also don't want to impact the latency of interactive jobs when they
need to do a little query here or there (i.e. it's largely fine if our jobs all
hold a few connections to MySQL and use them somewhat consistently).

4), we'd like this to make it easier to do batching right. For example, if
you have 100 Bonsais to map to hg, you should be able to just map and call
`future::try_join_all` and have that do the right thing.

5), we don't want "bad" queries to affect other queries negatively. One
example would be the occasional queries we make to Bonsai <-> Hg mapping in
`known` for thousands (if not more) of rows.

6), we want this to be easy to incorporate into the codebase.

So, how do we try to address all of this? Here's how:

- We ... do batching, and we deduplicate requests in a batch. This is the
  easier bit and should address #1, #2, #2.1 and #4.
- However, batching is conditional. We notably don't batch very large requests
  with the rest (addresses #5). We also don't batch small queries all the time:
  we only batch if we are observing a throughput of queries that suggests we
  can find some benefit in batching (this targets #3).
- Finally, we have some utilities for common cases like having to group by repo
  id (this is `MultiRendezVous`), and this is all configurable via tunables
  (and the default is to not do anything).
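
To make the batching-plus-deduplication contract concrete, here is a minimal synchronous sketch with hypothetical names - the real rendezvous is async, decides whether to batch based on observed throughput, and exempts very large requests:

```
use std::collections::{HashMap, HashSet};

fn fetch_batched<F>(keys: &[u64], backend: F) -> HashMap<u64, String>
where
    F: Fn(&HashSet<u64>) -> HashMap<u64, String>,
{
    // Deduplicate: concurrent callers often want the same rows (goal 2.1).
    let unique: HashSet<u64> = keys.iter().copied().collect();
    // One backend query instead of keys.len() concurrent ones (goals 1 and 2).
    let results = backend(&unique);
    keys.iter()
        .filter_map(|k| results.get(k).map(|v| (*k, v.clone())))
        .collect()
}

fn main() {
    let rows = fetch_batched(&[1, 2, 2, 3], |wanted| {
        // Stand-in for a single `SELECT ... WHERE id IN (...)` query.
        wanted.iter().map(|k| (*k, format!("row{k}"))).collect()
    });
    assert_eq!(rows.len(), 3);
}
```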

Reviewed By: StanislavGlebik

Differential Revision: D27010317

fbshipit-source-id: 4a2397255f9785c6722c02e4d419438fd0aafa07
facebook-github-bot pushed a commit that referenced this pull request Jul 12, 2021
Summary:
Implement uploading of changesets in the `hg cloud upload` command.

This is the last part for `hg cloud upload` - uploading changesets via Edenapi.

Test:

```
# machine #2
liubovd 🍒  ~/fbsource
 [15] → hg pull -r 0b6075b4bda143d5212c1525323fb285d96a1afb
pulling from mononoke://mononoke.c2p.facebook.net/fbsource
connected to twshared27150.03.cln3.facebook.com session RaIPDgvF6l8rmXkA
abort: 0b6075b4bda143d5212c1525323fb285d96a1afb not found!
```

```
# machine #1
devvm1006.cln0 👀   ~/fbsource/fbcode/eden/scm
 [6] →  EDENSCM_LOG="edenapi::client=info" ./hg cloud upload
Jul 11 13:26:26.322  INFO edenapi::client: Requesting lookup for 1 item(s)
commitcloud: head '0b6075b4bda1' hasn't been uploaded yet
Jul 11 13:26:26.472  INFO edenapi::client: Requesting lookup for 6 item(s)
commitcloud: queue 1 commit for upload
Jul 11 13:26:26.648  INFO edenapi::client: Requesting lookup for 1 item(s)
commitcloud: queue 0 files for upload
Jul 11 13:26:26.698  INFO edenapi::client: Requesting lookup for 4 item(s)
commitcloud: queue 4 trees for upload
Jul 11 13:26:27.393  INFO edenapi::client: Requesting trees upload for 4 item(s)
commitcloud: uploaded 4 trees
commitcloud: uploading commit '0b6075b4bda143d5212c1525323fb285d96a1afb'...
Jul 11 13:26:28.426  INFO edenapi::client: Requesting changesets upload for 1 item(s)
commitcloud: uploaded 1 commit
```

```
# machine #2
liubovd 🍒  ~/fbsource
 [16] → hg pull -r 0b6075b4bda143d5212c1525323fb285d96a1afb
pulling from mononoke://mononoke.c2p.facebook.net/fbsource
connected to twshared16001.08.cln2.facebook.com session QCpy1x9yrflRF6xF
searching for changes
adding commits
adding manifests
adding file changes
added 895 commits with 0 changes to 0 files
(running background incremental repack)
prefetching trees for 4 commits

liubovd 🍒  ~/fbsource
 [17] → hg up 0b6075b4bda143d5212c1525323fb285d96a1afb
warning: watchman has recently started (pid 93231) - operation will be slower than usual
connected to twshared32054.08.cln2.facebook.com session Hw91G8kRYzt4c5BV
1 files updated, 0 files merged, 0 files removed, 0 files unresolved

liubovd 🍒  ~/fbsource
 [18] → hg diff -c .
connected to twshared0965.07.cln2.facebook.com session rrYSvRM6pnBYZ2Fn
 diff --git a/fbcode/eden/scm/test b/fbcode/eden/scm/test
new file mode 100644
 --- /dev/null
+++ b/fbcode/eden/scm/test
@@ -0,0 +1,1 @@
+test
```

Initial perf wins:

With a large stack of 6 commits (24 files changed in total), I tested adding a single line to a file in the top commit. We can see at least a 2X win, and it should be more in practice, because I tested with a local instance of the edenapi service that runs on my devserver.

```
╷
╷ @  5582fc8ee  6 minutes ago  liubovd
╷ │  test
╷ │
╷ o  d55f9bb65  86 minutes ago  liubovd  D29644738
╷ │  [hg] edenapi: Implement using of uploading changesets in `hg cloud upload` command
╷ │
╷ o  561149783  Friday at 15:10  liubovd  D29644797
╷ │  [hg] edenapi: Add request handler for uploading hg changesets
╷ │
╷ o  c3dda964a  Friday at 15:10  liubovd  D29644800
╷ │  [edenapi_service] Add new /:repo/upload/changesets endpoint
╷ │
╷ o  28ce2fa0c  Friday at 15:10  liubovd  D29644799
╷ │  [hg] edenapi/edenapi_service: Add new API for uploading Hg Changesets
╷ │
╷ o  13325b361  Yesterday at 15:23  liubovd  D29644798
╭─╯  [edenapi_service] Implement uploading of hg changesets
```

```
# adding a new line to the file 'test' in the test commit, and then run:
devvm1006.cln0 👀   ~/fbsource/fbcode/eden/scm
 [8] → time hg cloud upload
commitcloud: head '4e4f947d73e6' hasn't been uploaded yet
commitcloud: queue 1 commit for upload
commitcloud: queue 0 files for upload
commitcloud: queue 4 trees for upload
commitcloud: uploaded 4 trees
commitcloud: uploading commit '4e4f947d73e676b63df7c90c4e707d38e6d0a93b'...
commitcloud: uploaded 1 commit

real	0m3.778s
user	0m0.017s
sys	0m0.027s
```

```
# adding another new line to the file 'test' in the test commit, and then run:
devvm1006.cln0 👀   ~/fbsource/fbcode/eden/scm
 [11] → time hg cloud backup
connected to twshared30574.02.cln2.facebook.com session uvOvhxtBfeM7pMgl
backing up stack rooted at 13325b3612d2
commitcloud: backed up 1 commit

real	0m7.507s
user	0m0.013s
sys	0m0.030s
```

Test the force mode of the new command, which reuploads everything:
```
devvm1006.cln0 👀   ~/fbsource/fbcode/eden/scm
 [13] → time hg cloud upload --force
commitcloud: head '5582fc8ee382' hasn't been uploaded yet
commitcloud: queue 6 commits for upload
commitcloud: queue 24 files for upload
commitcloud: uploaded 24 files
commitcloud: queue 61 trees for upload
commitcloud: uploaded 61 trees
commitcloud: uploading commit '13325b3612d20c176923d1aab8a28383cea2ba9a'...
commitcloud: uploading commit '28ce2fa0c6a02de57cdc732db742fd5c8f2611ad'...
commitcloud: uploading commit 'c3dda964a71b65f01fc4ccadc9429ee887ea982c'...
commitcloud: uploading commit '561149783e2fb5916378fe27757dcc2077049f8c'...
commitcloud: uploading commit 'd55f9bb65a0829b1731baa686cb8a6e0c5500cc2'...
commitcloud: uploading commit '5582fc8ee382c4c367a057db2a1781377bf55ba4'...
commitcloud: uploaded 6 commits

real	0m7.830s
user	0m0.011s
sys	0m0.032s
```

We can see the time is similar to the current `hg cloud backup` command.

Reviewed By: markbt

Differential Revision: D29644738

fbshipit-source-id: cbbfcb2e8018f83f323f447848b3b6045baf47c5
facebook-github-bot pushed a commit that referenced this pull request Jul 28, 2021
…mputed

Summary:
# Goal of the stack

The goal of this stack is to make the megarepo API safer to use. In particular, we want to achieve:
1) If the same request is executed a few times, then it won't corrupt the repo in any way (i.e. it won't create commits that the client didn't intend to create, and it won't move bookmarks to unpredictable places).
2) If a request finished successfully, but we failed to send the success to the client, then repeating the same request will finish successfully.

Achieving #1 is necessary because async_requests_worker might execute a few requests at the same time (though this should be rare). Achieving #2 is necessary because if we fail to send a successful response to the client (e.g. because of network issues), we want the client to retry and get this successful response back, so that the client can continue with their next request.

In order to achieve #1 we make all bookmark moves conditional, i.e. we move a bookmark only if its current location is the place where the client expects it. This should help achieve goal #1, because even if we have two requests executing at the same time, only one of them will successfully move the bookmark (a schematic sketch of this conditional move follows the stack list below).

However, once we achieve #1 we have a problem with #2 - if a request was successful, but we failed to send a successful reply back to the client, then the client will retry the request, and it will fail, because the bookmark is already at the new location (because the previous request was successful), but the client expects it to be at the old location (because the client doesn't know that the request was successful). To fix this issue, before executing the request we check whether it was already successful, and we do it heuristically by checking the request parameters and verifying the commit remapping state. This doesn't protect against malicious clients, but it should protect from issue #2 described above.

So the whole stack of diffs is the following:
1) take a method from megarepo api
2) implement a diff that makes bookmark moves conditional
3) Fix the problem #2 by checking if a previous request was successful or not
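
A schematic of the conditional bookmark move, with hypothetical types (in Mononoke the condition is enforced transactionally by the bookmarks store, not by an in-memory map):

```
use std::collections::HashMap;

type ChangesetId = u64;

struct Bookmarks {
    by_name: HashMap<String, ChangesetId>,
}

impl Bookmarks {
    /// Compare-and-swap: move `name` to `target` only if it is still at
    /// `expected`. Of two identical racing requests, only one can win.
    fn move_if_at(&mut self, name: &str, expected: ChangesetId, target: ChangesetId) -> bool {
        match self.by_name.get_mut(name) {
            Some(current) if *current == expected => {
                *current = target;
                true
            }
            _ => false, // bookmark was moved (or created) by someone else
        }
    }
}

fn main() {
    let mut bookmarks = Bookmarks { by_name: HashMap::from([("target".to_string(), 1)]) };
    assert!(bookmarks.move_if_at("target", 1, 2)); // first request wins
    assert!(!bookmarks.move_if_at("target", 1, 2)); // retry/race loses
}
```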

# This diff

If a previous add_sync_target() call was successful on the mononoke side, but we
failed to deliver this result to the client (e.g. due to network issues), then the
client would just retry this call. Before this diff that wouldn't work (i.e. we'd
just fail to create the bookmark because it's already created). This diff fixes
it by looking at the commit the bookmark points to and checking whether it looks
like it was created by a previous add_sync_target call. In particular, it checks
that the remapping state file matches the request parameters, and that the config
version is the same (a schematic of this check follows below).
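
A schematic of that retry check, again with hypothetical types:

```
struct RemappingState {
    request_params_hash: u64,
    config_version: String,
}

// Before failing because the bookmark already exists, check whether the
// commit it points to carries a remapping state matching this exact request.
fn looks_like_previous_success(
    state_on_commit: &RemappingState,
    request_params_hash: u64,
    config_version: &str,
) -> bool {
    state_on_commit.request_params_hash == request_params_hash
        && state_on_commit.config_version == config_version
}

fn main() {
    let s = RemappingState { request_params_hash: 7, config_version: "v1".into() };
    assert!(looks_like_previous_success(&s, 7, "v1"));
}
```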

Differential Revision: D29848377

fbshipit-source-id: 16687d975748929e5eea8dfdbc9e206232ec9ca6
facebook-github-bot pushed a commit that referenced this pull request Jul 28, 2021
…as computed

Summary:
# Goal of the stack

The goal of this stack is to make the megarepo API safer to use. In particular, we want to achieve:
1) If the same request is executed a few times, then it won't corrupt the repo in any way (i.e. it won't create commits that the client didn't intend to create, and it won't move bookmarks to unpredictable places).
2) If a request finished successfully, but we failed to send the success to the client, then repeating the same request will finish successfully.

Achieving #1 is necessary because async_requests_worker might execute a few requests at the same time (though this should be rare). Achieving #2 is necessary because if we fail to send a successful response to the client (e.g. because of network issues), we want the client to retry and get this successful response back, so that the client can continue with their next request.

In order to achieve #1 we make all bookmark moves conditional, i.e. we move a bookmark only if its current location is the place where the client expects it. This should help achieve goal #1, because even if we have two requests executing at the same time, only one of them will successfully move the bookmark.

However, once we achieve #1 we have a problem with #2 - if a request was successful, but we failed to send a successful reply back to the client, then the client will retry the request, and it will fail, because the bookmark is already at the new location (because the previous request was successful), but the client expects it to be at the old location (because the client doesn't know that the request was successful). To fix this issue, before executing the request we check whether it was already successful, and we do it heuristically by checking the request parameters and verifying the commit remapping state. This doesn't protect against malicious clients, but it should protect from issue #2 described above.

So the whole stack of diffs is the following:
1) take a method from megarepo api
2) implement a diff that makes bookmark moves conditional
3) Fix the problem #2 by checking if a previous request was successful or not

# This diff

Same as with D29848377 - if the result was already computed and the client retries the
same request, then return it.

Differential Revision: D29874802

fbshipit-source-id: ebc2f709bc8280305473d6333d0725530c131872
facebook-github-bot pushed a commit that referenced this pull request Jul 28, 2021
Summary:
# Goal of the stack

The goal of this stack is to make the megarepo API safer to use. In particular, we want to achieve:
1) If the same request is executed a few times, then it won't corrupt the repo in any way (i.e. it won't create commits that the client didn't intend to create, and it won't move bookmarks to unpredictable places).
2) If a request finished successfully, but we failed to send the success to the client, then repeating the same request will finish successfully.

Achieving #1 is necessary because async_requests_worker might execute a few requests at the same time (though this should be rare). Achieving #2 is necessary because if we fail to send a successful response to the client (e.g. because of network issues), we want the client to retry and get this successful response back, so that the client can continue with their next request.

In order to achieve #1 we make all bookmark moves conditional, i.e. we move a bookmark only if its current location is the place where the client expects it. This should help achieve goal #1, because even if we have two requests executing at the same time, only one of them will successfully move the bookmark.

However, once we achieve #1 we have a problem with #2 - if a request was successful, but we failed to send a successful reply back to the client, then the client will retry the request, and it will fail, because the bookmark is already at the new location (because the previous request was successful), but the client expects it to be at the old location (because the client doesn't know that the request was successful). To fix this issue, before executing the request we check whether it was already successful, and we do it heuristically by checking the request parameters and verifying the commit remapping state. This doesn't protect against malicious clients, but it should protect from issue #2 described above.

So the whole stack of diffs is the following:
1) take a method from megarepo api
2) implement a diff that makes bookmark moves conditional
3) Fix the problem #2 by checking if a previous request was successful or not

# This diff

We already have it for change_target_config, and it's useful to prevent races
and inconsistencies. That's especially important given that our async request
worker might run a few identical sync_changeset methods at the same time, and
target_location can help process this situation correctly.

Let's add target_location to sync_changeset, and while there I also updated the
comment for these fields in other methods. The comment said

```
// This operation will succeed only if the
// `target`'s bookmark is still at the same location
// when this operation tries to advance it
```

This is not always the case - the operation might also succeed if the same operation has been
re-sent twice; see previous diffs for more explanation/motivation.

Reviewed By: krallin

Differential Revision: D29875242

fbshipit-source-id: c14b2148548abde984c3cb5cc62d04f920240657
facebook-github-bot pushed a commit that referenced this pull request Jul 28, 2021
Summary:
# Goal of the stack

The goal of this stack is to make the megarepo API safer to use. In particular, we want to achieve:
1) If the same request is executed a few times, then it won't corrupt the repo in any way (i.e. it won't create commits that the client didn't intend to create, and it won't move bookmarks to unpredictable places).
2) If a request finished successfully, but we failed to send the success to the client, then repeating the same request will finish successfully.

Achieving #1 is necessary because async_requests_worker might execute a few requests at the same time (though this should be rare). Achieving #2 is necessary because if we fail to send a successful response to the client (e.g. because of network issues), we want the client to retry and get this successful response back, so that the client can continue with their next request.

In order to achieve #1 we make all bookmark moves conditional, i.e. we move a bookmark only if its current location is the place where the client expects it. This should help achieve goal #1, because even if we have two requests executing at the same time, only one of them will successfully move the bookmark.

However, once we achieve #1 we have a problem with #2 - if a request was successful, but we failed to send a successful reply back to the client, then the client will retry the request, and it will fail, because the bookmark is already at the new location (because the previous request was successful), but the client expects it to be at the old location (because the client doesn't know that the request was successful). To fix this issue, before executing the request we check whether it was already successful, and we do it heuristically by checking the request parameters and verifying the commit remapping state. This doesn't protect against malicious clients, but it should protect from issue #2 described above.

So the whole stack of diffs is the following:
1) take a method from megarepo api
2) implement a diff that makes bookmark moves conditional
3) Fix the problem #2 by checking if a previous request was successful or not

# This diff

Now that we have target_location in the sync_changeset() method,
let's move the bookmark in sync_changeset conditionally, just as in D29874803 (5afc48a).

This prevents race conditions when the same sync_changeset
method executes twice.

Reviewed By: krallin

Differential Revision: D29876413

fbshipit-source-id: c076e14171c6615fba2cedf4524d442bd25f83ab
facebook-github-bot pushed a commit that referenced this pull request Jul 29, 2021
…lying

Summary:
We had somewhat inconsistent behaviour in the multiplexed blobstore:
1) If on_put handlers are too slow (i.e. they are slower than all blobstores), then we
succeed as soon as all blobstores were successful (regardless of the value of
minimum_successful_writes). It doesn't matter whether the on_put handlers fail or
succeed; we've already returned success to our user.
2) However, if all writes to the queue fail quickly, then we return a failure
even if the writes to all blobstores were successful.

#2 seems like a change in behaviour from an old diff D17421208 (9de1de2), and not a
desirable one - if the blobstore sync queue is unavailable and responds with
failures quickly, then blobstore writes will always fail even if all blobstores
are healthy.

So this diff makes it so that we always succeed if all blobstore puts were
successful, regardless of the success or failure of the on_put handlers (a toy model follows below).
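
A toy model of the resulting success rule (the quorum branch below is schematic, not the exact multiplexer logic):

```
fn put_succeeded(
    successful_writes: usize,
    total_blobstores: usize,
    minimum_successful_writes: usize,
    queue_write_ok: bool,
) -> bool {
    // All blobstores succeeded: success, even if every queue write failed.
    if successful_writes == total_blobstores {
        return true;
    }
    // Otherwise a quorum of writes plus a recorded queue entry is needed.
    successful_writes >= minimum_successful_writes && queue_write_ok
}

fn main() {
    // Queue down, but all 3 blobstores healthy: the put still succeeds.
    assert!(put_succeeded(3, 3, 2, false));
}
```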

Reviewed By: liubov-dmitrieva

Differential Revision: D29985084

fbshipit-source-id: 64338d552be45a70d9b1d16dfbe7d10346ab539c
facebook-github-bot pushed a commit that referenced this pull request Aug 19, 2021
Summary:
The current fb303 counters only report aggregated latency, while we want to track Eden performance under different versions, OSes, channels, and configs. So I am setting up a new logging mechanism for this purpose.

This diff introduces the class `FsEventLogger` for sampling and logging. There are 3 configs introduced by this diff. The configs are reloaded every 30 minutes.
1. `telemetry:request-sampling-config-allowlist`
A list of config keys that we want to attach to scuba events.

2. `telemetry:request-samples-per-minute`
Max number of events logged to scuba per minute per mount.

3. `telemetry:request-sampling-group-denominators`
* Each type of operation has a "sampling group" (defaulted to 0, which drops everything).
* We use this sampling group as an index to look up its denominator in this config.
* The denominator is then used for sampling, e.g. `1/x` of the events are sent to scuba, if we haven't reached the cap specified by #2 (a toy sketch of this decision follows the workflow below).

Example workflow:
1. receive tracing event
2. look up denominator of the sampling group of the operation type
3. sample based on the denominator
4. check that we have not exceeded the logging cap per min
5. create sample and send to scribe
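Below is a minimal Rust sketch of that decision flow (assumed shapes and names; FsEventLogger itself is C++ and its details may differ):

```
// Map the operation to its sampling group, look up the group's denominator,
// sample 1/denominator of events, then enforce the per-minute per-mount cap.
fn should_log(
    sampling_group: usize,
    denominators: &[u64],          // telemetry:request-sampling-group-denominators
    samples_this_minute: &mut u64, // per-mount counter, reset every minute
    samples_per_minute: u64,       // telemetry:request-samples-per-minute
    random_draw: u64,              // uniform random value
) -> bool {
    let denominator = match denominators.get(sampling_group) {
        Some(&d) if d > 0 => d,
        _ => return false, // group 0 (or an unknown group) drops everything
    };
    if random_draw % denominator != 0 {
        return false; // sampled out: only ~1/denominator events pass
    }
    if *samples_this_minute >= samples_per_minute {
        return false; // per-mount rate cap reached
    }
    *samples_this_minute += 1;
    true
}
```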

Reviewed By: xavierd

Differential Revision: D30288054

fbshipit-source-id: 8f2b95c11c718550a8162f4d1259a25628f499ff
facebook-github-bot pushed a commit that referenced this pull request Oct 1, 2021
Summary:
This diff adds a way to do snapshot tests in SC by simply calling a macro, and uses two wire objects as examples to show that it works.

Motivation:
- Using snapshot testing on wire objects will make it super clear what changes are happening to the wire format, which is useful to prevent breakages. (see quip)
- I'm going with approach #3 from the proposal, but both #2 and #3 would need snapshot testing.

How?
- I'm using [insta](https://docs.rs/insta/1.8.0/insta/), a rust crate for snapshot testing. Unfortunately it depends on some cargo-specific stuff that doesn't work with buck, but I was able to get it working without patching the package by bypassing the Cargo-dependent parts.
- Insta also provides some cargo extensions on top of it to compare snapshots, which we can't have, so I made it clear in the test errors (see test plan) how to update the snapshots (via a flag).

Future
- In future diffs I'll add a snapshot test for all the wire objects; this just adds a first few to show it works and sets up the plumbing.
- Writing objects manually for snapshot tests for **every** wire object would be a large amount of code, and troublesome when adding new (especially big) wire objects. We already have wire objects implementing `Arbitrary`, so I think it should be possible to use this to generate simple snapshot tests on automatically generated objects. (It will need some extra work to make sure they are created consistently.)
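For reference, this is roughly what an insta-based snapshot test looks like (a standalone sketch with a hypothetical wire type, not the actual macro output):

```
#[cfg(test)]
mod tests {
    // Stand-in for a real wire object; the macro in this diff would generate
    // an equivalent test per type. The first run records a snapshot file;
    // later runs fail if the serialized form drifts.
    #[derive(Debug)]
    struct WireExample {
        version: u32,
        payload: Vec<u8>,
    }

    #[test]
    fn wire_example_snapshot() {
        let value = WireExample { version: 1, payload: vec![0xde, 0xad] };
        insta::assert_debug_snapshot!(value);
    }
}
```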

Reviewed By: markbt

Differential Revision: D30958077

fbshipit-source-id: 3d8663e7897e5f6eb4b97c24f47b37ef2f138e5a
facebook-github-bot pushed a commit that referenced this pull request Oct 11, 2021
Summary:
I've investigated the deadlock we hit recently and attempted to fix it with D31310626 (8ca8816).
What we have now is a set of facts, an unproven theory, and a potential fix. Even though the theory is not proven and hence the fix might be wrong, the theory sounds plausible and the fix should be safe to land even if the theory is wrong.

But first, the facts,

### Facts

a) There's a repro for a deadlock - running `buck-out/gen/eden/mononoke/git/gitimport/gitimport#binary/gitimport --use-mysql-client --repo-name aosp --mononoke-config-path configerator://scm/mononoke/repos/tiers/prod --panic-fate=exit --cachelib-only-blobstore /data/users/stash/gitrepos/boot_images git-range 46cdf6335e1c737f6cf22b89f3438ffabe13b8ae b4106a25a4d8a1168ebe9b7f074740237932b82e` very often deadlocks (maybe not every time, but at least every 3rd run). So even though D31310626 (8ca8816) unblocked the mega_grepo_sync, it doesn't seem to have fixed or even consistently mitigated the issue. Note that I'm running it on a devbig which has quite a lot of CPUs - this may or may not make the deadlock more likely.

b) The code deadlocks in the [`find_file_changes()`](https://fburl.com/code/7i6tt7om) function, and inside this function we do two things - first we find what needs to be uploaded using the [`bonsai_diff()`](https://fburl.com/code/az3v3sbk) function, and then we use the [`do_upload()`](https://fburl.com/code/kgb98kg9) function to upload contents to the mononoke repo. Note that both `bonsai_diff()` and `do_upload()` use git_pool. `bonsai_diff()` produces a stream; we then apply `do_upload()` to each element of the stream and finally apply [`buffer_unordered(100)`](https://fburl.com/code/6tuqp3jd) to upload them all in parallel.

In simplified pseudo-code it looks like
```
bonsai_diff(..., git_pool, ...)
  .map(|entry| do_upload(..., git_pool, entry, ...))
  .try_buffer_unordered(100)
```

c) I've added a few printlns in P462159232 and run the repro command until it, well, repros. These printlns just show when there was an attempt to grab the semaphore and when it was successfully grabbed, and also who the caller was - the `bonsai_diff()` or the `do_upload()` function. This is the example output I'm getting: P462159671. The output is a mess, but what's interesting is that there are exactly 100 more entries of "grabbing semaphore uploading" than "grabbed semaphore uploading". And 100 is exactly the size of the buffer in buffer_unordered.

### Theory
So above are the facts, and below is the theory that fits these facts (but as I said above, the theory is as yet unproven). Let me start with a few assumptions:

1) Fact `c)` seems to suggest that when we run into the deadlock there was a full buffer of `do_upload()` futures that were all waiting on a semaphore.
1) If we look at the [`buffer_unordered` code](https://docs.rs/futures-util/0.3.17/src/futures_util/stream/stream/buffer_unordered.rs.html#71) we can see that if the buffer is full then it stops polling the underlying stream until any of the buffered futures is done.
1) Once the semaphore is released, it [wakes up just a handful of futures](https://docs.rs/tokio/1.12.0/src/tokio/sync/batch_semaphore.rs.html#242) that wait for this semaphore.

Given these assumptions, I believe the following situation is happening:
1) We get into a state where we have 100 `do_upload()` futures in the buffer_unordered stream, all waiting for a semaphore. At the same time all the semaphore permits are held by the `bonsai_diff()` stream (because of assumption #1)
2) Because of assumption #2 we don't poll the underlying `bonsai_diff()` stream. However this stream has already [spawned a few tasks](https://fburl.com/code/9iq3tfad) which successfully grabbed a semaphore permit and are executing
3) Once these `bonsai_diff()` tasks finish, they [drop the semaphore permit](https://fburl.com/code/sw5fwccw), which in turn releases the semaphore and wakes up one of the waiting tasks (see assumption #3). But now two things can happen - either a `do_upload()` future is woken up or a `bonsai_diff()` future is woken up. If the former happens, then `buffer_unordered` polls the `do_upload()` future and execution continues. However, if the latter happens (i.e. a `bonsai_diff()` future was woken up), then it causes a deadlock - `buffer_unordered()` isn't going to poll the `bonsai_diff()` stream, so the `bonsai_diff()` stream won't be able to make any progress. At the same time the `do_upload()` futures aren't going to be polled at all because they weren't woken up in the first place, and so we deadlock.

There are a few things that seem unlikely in this theory (e.g. why we never wake up any of the `do_upload()` futures), so I do think it's worth validating. We could either add a bunch of println! calls to the buffer_unordered and semaphore code, or try to set up [tokio-console](https://github.com/tokio-rs/console).

### Potential fix

The potential fix is simple - let's just add another spawn around the `GitPool::with()` function. If the theory is right, then this fix should work. Once the semaphore is released and a `bonsai_diff()` future is woken up, that future will be spawned and hence will be able to complete regardless of whether `buffer_unordered()` has its buffer full or not. And once `bonsai_diff()` fully completes, the `do_upload()` futures will finally be woken up and `buffer_unordered()` will progress until completion.

If the theory is wrong, then we've just added one useless `tokio::spawn()`. We have a lot of spawns all over the place and `tokio::spawn()` calls are cheap. So landing this fix shouldn't make things worse, but it might actually improve them (and the testing I did suggests that it indeed should improve them - see the `Test plan` section below). I don't think it should be seen as a reason not to find the root cause.
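The rough shape of the fix, as a hedged Rust sketch (assumed signatures, not the real GitPool code):

```
use std::sync::Arc;
use tokio::sync::Semaphore;

// Wrap the semaphore-guarded work in tokio::spawn so that a woken
// bonsai_diff() future runs to completion even while buffer_unordered()
// has a full buffer and is not polling its inner stream.
async fn with_pool<F, T>(semaphore: Arc<Semaphore>, work: F) -> T
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    tokio::spawn(async move {
        // The permit is held only for the duration of the spawned task.
        let _permit = semaphore.acquire_owned().await.expect("semaphore closed");
        work()
    })
    .await
    .expect("task panicked")
}
```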

Reviewed By: mojsarn

Differential Revision: D31541432

fbshipit-source-id: 0260226a21e6e5e0b41736f86fb54a3abccd0a0b
facebook-github-bot pushed a commit that referenced this pull request Oct 13, 2021
Summary:
Background: I've been looking into derived data performance and found that
while overall performance is good, it depends quite a lot on the blobstore
latency, i.e. the higher the latency, the slower the derivation. What's worse is
that increasing blobstore latency even by 100ms might increase the time to
derive 100 commits from 12 to 65 secs! [1]

However we have ways to mitigate it:
* **Option 1** If we use "backfill" mode then it makes derived data derivation less
sensitive to the put() latency
* **Option 2** If we use "parallel" mode then it makes derived data derivation less
sensitive to the get() latency.

We can use "backfill" mode for almost all derived data types (only exception is
filenodes), however "parallel" only enabled for a few derived data types (e.g.
fsnodes, skeleton manifests, filenodes).

In particular, we didn't have a way to do batch derived data derivation for
unodes, and so unode derivation can be quite sensitive to the blobstore
get() latency. This diff tries to address that.

I considered three options:
* **Option 1** The simplest way to implement "parallel" mode for unodes is to just
do a unode warmup before we start a sequential derivation for a stack of commits. After the
warmup all necessary entries should be in cache, so derivation should be less latency sensitive.
This could work, but it has a few disadvantages, namely:
* We do an additional traversal - not the end of the world, but it might get
expensive for large commits
* We might fetch large directories that don't fit in cache more often than we
need to.

That said, because of its simplicity it might be a reasonable option to keep
in mind, and I might get back to it later.

* **Option 2** Do a derivation for a stack of commits. We have a function to derive a
manifest for a single commit, but we could write a similar function to derive the whole stack at once.
That means for each changed file or directory we generate not a single change
but a stack of changes.
I was able to implement it, but the code was too complicated. There were quite
a few corner cases (particularly when a file was replaced with a directory, or
when deriving a merge commit), and dealing with all of them was a pain.
Moreover, we need to make sure it works correctly in all scenarios, and that
wouldn't be an easy thing to do.

* **Option 3** Do a derivation for a "simple" stack of commits. That's basically a
simplified version of option #2. Let's allow batch derivation only for
stacks that have:
a) no merges
b) no path changes that are ancestors of each other (which cause file/dir
conflicts).

This implementation is significantly simpler than option #2, and it should
cover most of the cases and hopefully bring perf benefits (though this is
something I have yet to measure). So this is what this diff implements.
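A hypothetical sketch of the "simple stack" check described above (invented types; the real implementation operates on Mononoke changesets and paths):

```
// A stack qualifies for batch derivation only if no commit is a merge and
// no changed path is an ancestor directory of another changed path.
struct StackCommit {
    parents: usize,
    changed_paths: Vec<String>, // e.g. "dir/subdir/file"
}

fn is_simple_stack(stack: &[StackCommit]) -> bool {
    // a) no merges
    if stack.iter().any(|c| c.parents > 1) {
        return false;
    }
    // b) no file/dir conflicts: a quadratic check is fine for a sketch
    let paths: Vec<&str> = stack
        .iter()
        .flat_map(|c| c.changed_paths.iter().map(|p| p.as_str()))
        .collect();
    paths
        .iter()
        .all(|p| paths.iter().all(|q| !q.starts_with(&format!("{}/", p))))
}
```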

Reviewed By: yancouto

Differential Revision: D30989888

fbshipit-source-id: 2c50dfa98300a94a566deac35de477f18706aca7
facebook-github-bot pushed a commit that referenced this pull request Jun 1, 2022
…e (try #2)

Reviewed By: chadaustin

Differential Revision: D36567197

fbshipit-source-id: 154ae388491de43ee00d76f0aff9e1df476400c2
facebook-github-bot pushed a commit that referenced this pull request Jun 14, 2022
…if possible (try #2)"

Summary:
We need to release the InodePtr after markUnlinked on the child -- otherwise there is a potential deadlock. InodePtr's destructor can end up acquiring the parent's contents lock.

Original commit changeset: 154ae388491d

Original Phabricator Diff: D36567197 (27f27c0)

Reviewed By: miaoyipu, xavierd

Differential Revision: D37144357

fbshipit-source-id: d182e9bbe8fb0ef00232ca65c52cb7a2a1a6fde7
facebook-github-bot pushed a commit that referenced this pull request Jul 12, 2022
Summary:
During shutdown, the overlay background thread is terminated and the overlay
closed, but EdenServer timers are still running. One of these is the
manageOverlay one, which performs maintenance on the Overlay. In some cases,
shutdown and this timer race each other, causing the timer to run after
the overlay has been closed, leading to a use-after-free that crashes EdenFS.

To solve this, we can either add locks around closing the overlay and the
maintenance to guarantee that they do not race, or we can move the maintenance
operation to a thread that is known to be running only when the overlay is
open. This diff takes the second approach.

The bug manifests itself like so:

  I0712 01:16:48.253296 21620 EdenServiceHandler.cpp:3394] [000002517432E1E0] initiateShutdown() took 368 µs
  I0712 01:16:48.253838 24700 PrjfsChannel.cpp:1185] Stopping PrjfsChannel for: C:\\cygwin\\tmp\\eden_test.stop_at_ea.m3931hyz\\mounts\\main
  I0712 01:16:48.258533 19188 EdenServer.cpp:1624] mount point "C:\\cygwin\\tmp\\eden_test.stop_at_ea.m3931hyz\\mounts\\main" stopped
  V0712 01:16:48.258814 19188 EdenMount.cpp:851] beginning shutdown for EdenMount C:\\cygwin\\tmp\\eden_test.stop_at_ea.m3931hyz\\mounts\\main
  V0712 01:16:48.259895 19188 EdenMount.cpp:855] shutdown complete for EdenMount C:\\cygwin\\tmp\\eden_test.stop_at_ea.m3931hyz\\mounts\\main
  V0712 01:16:48.287378 19188 EdenMount.cpp:861] successfully closed overlay at C:\\cygwin\\tmp\\eden_test.stop_at_ea.m3931hyz\\mounts\\main
  Unhandled win32 exception code=0xC0000005. Fatal error detected at:
  #0 00007FF707BA3D81 (7a686d9) facebook::eden::SqliteDatabase::checkpoint Z:\shipit\eden\eden\fs\sqlite\SqliteDatabase.cpp:99
  #1 00007FF7072D4090 facebook::eden::EdenServer::manageOverlay Z:\shipit\eden\eden\fs\service\EdenServer.cpp:2142
  #2 00007FF7074765D0 facebook::eden::PeriodicTask::timeoutExpired Z:\shipit\eden\eden\fs\service\PeriodicTask.cpp:32
  #3 00007FF707500F93 folly::HHWheelTimerBase<std::chrono::duration<__int64,std::ratio<1,1000> > >::timeoutExpired Z:\shipit\folly\folly\io\async\HHWheelTimer.cpp:286
  #4 00007FF70757BB44 folly::AsyncTimeout::libeventCallback Z:\shipit\folly\folly\io\async\AsyncTimeout.cpp:174
  #5 00007FF708E7AD45 (645b6fc) event_priority_set
  #6 00007FF708E7AA3A event_priority_set
  #7 00007FF708E77343 event_base_loop
  #8 00007FF707515FC5 folly::EventBase::loopMain Z:\shipit\folly\folly\io\async\EventBase.cpp:405
  #9 00007FF707515C62 folly::EventBase::loopBody Z:\shipit\folly\folly\io\async\EventBase.cpp:326
  #10 00007FF7072DC5EE facebook::eden::EdenServer::performCleanup Z:\shipit\eden\eden\fs\service\EdenServer.cpp:1212
  #11 00007FF707219BED facebook::eden::runEdenMain Z:\shipit\eden\eden\fs\service\EdenMain.cpp:395
  #12 00007FF7071C624A main Z:\shipit\eden\eden\fs\service\oss\main.cpp:23
  #13 00007FF708E87A94 __scrt_common_main_seh d:\A01\_work\12\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:288
  #14 00007FFC96DC7034 BaseThreadInitThunk
  #15 00007FFC98C5CEC1 RtlUserThreadStart

Reviewed By: fanzeyi

Differential Revision: D37793444

fbshipit-source-id: cd33302789c2c7a29d566d5bac6e119eccf0a5f2
facebook-github-bot pushed a commit that referenced this pull request Aug 29, 2022
Summary:
X-link: facebookarchive/sapling-staging#2

Instead of hard coding 0.1, let's take the version as a parameter.

Reviewed By: sggutier

Differential Revision: D39068760

fbshipit-source-id: 1eda642b19ac961544161fa9f5ecacda9a292f6b
facebook-github-bot pushed a commit that referenced this pull request Oct 20, 2022
Summary:
It turns out that initializing objc types both before (disabling App Nap) and after
(via libcurl [2]) fork() would abort the program like:

  objc[<pid>]: +[__NSCFConstantString initialize] may have been in progress in another thread when fork() was called.

It seems objc really dislikes fork. Disabling the objc logic before fork seems
to be the only way to unblock the issue.

Other approaches considered:
- Avoid `fork()`: not really feasible for performance (startup, Python GIL) reasons.
- Ensure chgserver does not create threads (see https://bugs.python.org/issue33725).
  Not possible since appnope implicitly creates a (short-lived?) thread.
- Disable App Nap without using objc: does not seem trivial.
- Set `OBJC_DISABLE_INITIALIZE_FORK_SAFETY` to `YES`. Aborts with a different
  message [1].

[1]:

```
The process has forked and you cannot use this CoreFoundation functionality safely. You MUST exec(). Break on __THE_PROCESS_HAS_FORKED_AND_YOU_CANNOT_USE_THIS_COREFOUNDATION_FUNCTIONALITY___YOU_MUST_EXEC__() to debug.
```

[2]:

```
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
  * frame #0: 0x00007ff81395f067 libobjc.A.dylib`objc_initializeAfterForkError
    frame #1: 0x00007ff81395f187 libobjc.A.dylib`performForkChildInitialize(objc_class*, objc_class*) + 274
    frame #2: 0x00007ff81394a479 libobjc.A.dylib`initializeNonMetaClass + 617
    frame #3: 0x00007ff813949f12 libobjc.A.dylib`initializeAndMaybeRelock(objc_class*, objc_object*, mutex_tt<false>&, bool) + 232
    frame #4: 0x00007ff813949c93 libobjc.A.dylib`lookUpImpOrForward + 1087
    frame #5: 0x00007ff81394e02f libobjc.A.dylib`object_getMethodImplementation + 153
    frame #6: 0x00007ff813b12934 CoreFoundation`_NSIsNSString + 55
    frame #7: 0x00007ff813b128dc CoreFoundation`-[NSTaggedPointerString isEqual:] + 36
    frame #8: 0x00007ff813b05609 CoreFoundation`CFEqual + 533
    frame #9: 0x00007ff813b0b2a3 CoreFoundation`_CFBundleCopyBundleURLForExecutableURL + 220
    frame #10: 0x00007ff813b069cb CoreFoundation`CFBundleGetMainBundle + 116
    frame #11: 0x00007ff813b28ade CoreFoundation`_CFPrefsGetCacheStringForBundleID + 71
    frame #12: 0x00007ff813b2f52f CoreFoundation`-[CFPrefsPlistSource setDomainIdentifier:] + 92
    frame #13: 0x00007ff813b2f46c CoreFoundation`-[CFPrefsPlistSource initWithDomain:user:byHost:containerPath:containingPreferences:] + 99
    frame #14: 0x00007ff813b2f351 CoreFoundation`__85-[_CFXPreferences(PlistSourceAdditions) withManagedSourceForIdentifier:user:perform:]_block_invoke + 156
    frame #15: 0x00007ff813c622a7 CoreFoundation`-[_CFXPreferences withSources:] + 60
    frame #16: 0x00007ff813cac80f CoreFoundation`-[_CFXPreferences withManagedSourceForIdentifier:user:perform:] + 240
    frame #17: 0x00007ff813b2a1ab CoreFoundation`-[CFPrefsSearchListSource addManagedSourceForIdentifier:user:] + 98
    frame #18: 0x00007ff813c83a18 CoreFoundation`__108-[_CFXPreferences(SearchListAdditions) withSearchListForIdentifier:container:cloudConfigurationURL:perform:]_block_invoke.160 + 287
    frame #19: 0x00007ff813c836e8 CoreFoundation`-[_CFXPreferences withSearchLists:] + 60
    frame #20: 0x00007ff813b29f50 CoreFoundation`__108-[_CFXPreferences(SearchListAdditions) withSearchListForIdentifier:container:cloudConfigurationURL:perform:]_block_invoke + 279
    frame #21: 0x00007ff813c83879 CoreFoundation`-[_CFXPreferences withSearchListForIdentifier:container:cloudConfigurationURL:perform:] + 374
    frame #22: 0x00007ff813b29a42 CoreFoundation`-[_CFXPreferences copyAppValueForKey:identifier:container:configurationURL:] + 137
    frame #23: 0x00007ff813b29978 CoreFoundation`_CFPreferencesCopyAppValueWithContainerAndConfiguration + 101
    frame #24: 0x00007ff8145c672b SystemConfiguration`SCDynamicStoreCopyProxiesWithOptions + 155
    frame #25: 0x000000010272a315 hg`Curl_resolv(data=0x00007f8dea888a00, hostname="api.github.com", port=443, allowDOH=true, entry=0x000000030a12be10) at hostip.c:675:30 [opt]
    frame #26: 0x000000010272a68c hg`Curl_resolv_timeout(data=<unavailable>, hostname=<unavailable>, port=<unavailable>, entry=<unavailable>, timeoutms=<unavailable>) at hostip.c:908:8 [opt] [artificial]
    frame #27: 0x0000000102753e1e hg`create_conn at url.c:3440:12 [opt]
    frame #28: 0x0000000102753cfe hg`create_conn(data=0x00007f8dea888a00, in_connect=0x000000030a12bed8, async=0x000000030a12bf8f) at url.c:4077:12 [opt]
    frame #29: 0x000000010275026b hg`Curl_connect(data=0x00007f8dea888a00, asyncp=0x000000030a12bf8f, protocol_done=0x000000030a12bfb2) at url.c:4156:12 [opt]
    frame #30: 0x000000010273dbd1 hg`multi_runsingle(multi=<unavailable>, nowp=0x000000030a12c020, data=0x00007f8dea888a00) at multi.c:1858:16 [opt]
    frame #31: 0x000000010273d6ae hg`curl_multi_perform(multi=0x00007f8deb0699c0, running_handles=0x000000030a12c074) at multi.c:2636:14 [opt]
    frame #32: 0x0000000102726ada hg`curl_easy_perform at easy.c:599:15 [opt]
    frame #33: 0x0000000102726aa8 hg`curl_easy_perform [inlined] easy_perform(data=0x00007f8dea888a00, events=false) at easy.c:689:42 [opt]
    frame #34: 0x00000001027269a4 hg`curl_easy_perform(data=0x00007f8dea888a00) at easy.c:708:10 [opt]
    frame #35: 0x00000001025e1cf6 hg`http_client::request::Request::send::h13a4e600f6bc5508 [inlined] curl::easy::handler::Easy2$LT$H$GT$::perform::h2ba0ae1da25a8852(self=<unavailable>) at handler.rs:3163:37 [opt]
```

Reviewed By: bolinfest

Differential Revision: D40538471

fbshipit-source-id: cd8611c8082fbe2d610efb78cb84defdb16d7980
facebook-github-bot pushed a commit that referenced this pull request Mar 9, 2023
Summary:
There are 2 reasons why this test is disabled:

1) `eden chown` crashes on macOS
2) Even if `eden chown` worked on macOS, it would require passwordless `sudo`, which is not available on the macOS Sandcastle hosts where we run our contbuild and diff-time tests.

Since #2 is a big issue that we can't work around, let's disable the test.

Due to the test war room, I won't be fixing #1 in this diff. However, I created T147554174 to track fixing it.

Reviewed By: fanzeyi

Differential Revision: D43926151

fbshipit-source-id: f1572d2b876d45affdcdebf9b567e23c7730076a
facebook-github-bot pushed a commit that referenced this pull request Aug 21, 2023
Summary:
Creating an initial version of the gitexport CLI (for more context on why we need it, see T160586594).

This tool is supposed to take a repo and a list of paths as input, and it should export all the history of those paths to a git repo.

## What does it do now?
Currently, this binary doesn't do anything useful. It just gets the history of a single path to be exported and prints the changeset ids and commit messages (for manual debugging).

The main point of this diff is to **set most of the structure/flow of the tool to get some early feedback** before I start implementing anything more complex.

Most of the functions don't have an actual implementation, but just do something simple (e.g. returning the first element of a vector) so it typechecks.

## What's my current plan?
1) Get the history of all the given paths. (This is mostly done in this diff already)
2) Merge the changesets into a single, topologically sorted, list of changesets
3) Strip irrelevant changes from every commit (T161205476).
4) Create a CommitGraph from this list (T161204758).
5) Export that CommitGraph to a new, temporary, Mononoke repo (T160787114).
6) Use existing tools to export that temporary repo to a git repo (T160787114).

The tricky bits are steps 2, 3 and 4, which is where I expect to spend most of my time.

First, I'm not sure if I even need to create a CommitGraph at all to be able to export the processed changesets to a new repo. If I do need to, I'm not sure (a) whether I should strip the irrelevant file changes before or after creating the graph and (b) how to create a new repo and populate it with the commits from the graph I created. (b) is more of an implementation detail, so I won't worry about it now...

The main unknowns for me are #2 and #4. Basically, how can I create a proper commit graph from a set of commits that are not direct descendants of each other? Assuming a linear history, I don't think it would be very complicated, but we also have to support branching, so I'm not sure how to do this efficiently...

## Examples
Let me put a simple example below. Commits with uppercase letters are relevant (i.e. should be exported) and ones with lowercase letters should not.

```
A -> b -> C -> D -> e
|-> f -> G
```

In this case, I want to have the following commit graph in the end:
```
A' -> C' -> D'
|-> G'
```
where X' is X stripped of irrelevant changes

## RFC
- This is my first Rust diff ever, so please LMK what horrible things I'm doing, because I'm very likely doing a few 😂
- Does the plan I described above make sense?
- Any suggestions/ideas on how to efficiently stitch the changesets together would be appreciated! But I'll probably set up some time to discuss this problem specifically once I spend more time thinking about it...

## Next steps
- Implement steps #5 and #6 (T160787114) to get the entire E2E solution working with the simplest case (i.e. one path with linear history). This is basically exporting the commit graph to a git repo (maybe through a temporary mononoke repo).
  - Update integration test case to actually run and test the tool with the simple case.
- Figure out how to properly create a commit graph from a list of changeset lists.
- Add test cases for multiple paths and edge cases, like having multiple branches.

Reviewed By: RajivTS

Differential Revision: D48226070

fbshipit-source-id: eed970a8e4697ab10682e3b93863e6d621adaacc
facebook-github-bot pushed a commit that referenced this pull request Oct 9, 2023
Summary:
## Problem
If an export path was created by renaming another, the user might want to follow the history from the old path name. An example of this is `eden/mononoke`, which was renamed in 2020 from `scm/mononoke` (D19722832).

See T164121717 and its attached diffs for more details about this problem.

## Solutions considered

**This diff implements solution #2.**

### 1. Try to automatically detect and follow renames
I spent some time implementing this, but there are a lot of edge cases; the implementation would still take significant time and would likely have correctness issues (e.g. exporting what we don't want).

We decided it wasn't worth pursuing this given that this tool will be used by engineers.

### 2. Warn user and support passing rename revision manually (this diff)
During an export, we can (a) warn users when a changeset is likely creating the export path by renaming another one and (b) provide a way for the user to specify that we should export commits from a path but **only up to a specific changeset** (i.e. a HEAD commit).

This is what this diff and the one above implements.

## This diff
This diff implements support in the library, i.e. when export paths are given with specific head commits, it builds the proper `GitExportGraphInfo` and commits are copied properly given this `GitExportGraphInfo`.

The next diff adds arguments to the CLI so that this functionality can be used.

There's still one small improvement I want to do to the unit tests, but I figured I could publish this to speed up review while I update the tests (and possibly think of more test cases).

## TODO:
- Update unit tests to check copy_from field

Reviewed By: mitrandir77

Differential Revision: D49682305

fbshipit-source-id: 535f314c76214cfaa6aca90c9b48a4c7523b95de
facebook-github-bot pushed a commit that referenced this pull request Oct 19, 2023
Summary:
## Context
Implicit deletes for a given changeset are found by:
- Iterating over all the file changes and keeping only the ones that modify/add,
- For each of these file changes, looking up an entry in the parent's manifest for that path.
  - If an entry is found and it's a leaf, there are no implicit deletes.
  - If an entry is found and it's a tree, returning all the leaf entries from that tree, i.e. all the files/directories under the directory being implicitly deleted.

## Problem
`find_entries` ([code](https://fburl.com/code/ugxvq3jv)) seems to be a more efficient way to do manifest lookups for multiple paths.

The current implementation iterates over each file change and looks up its entry using `find_entry`, which just calls `find_entries` passing a single path.

**Improvement #1:** Pass all the paths directly to `find_entries`.

The other problem is that each call to `list_leaf_entries` returns a stream, and the streams were being flattened using `try_flatten_stream`, which IIUC preserves the order of the streams and **executes them serially**.

**Improvement #2:** Use `try_flatten_unordered` to introduce concurrency in the execution of the streams.

#thanks mitrandir77 for helping me flatten streams concurrently and without forcing execution!

## Notes
I initially used 100 as the concurrency limit, but increasing it to 10000 led to a big improvement, so I decided to go with that for now.

`get_implicit_deletes_single_parent` gets called for each parent of the changeset. I'm assuming that most changesets don't have a high number of parents and, because there's already a high degree of concurrency in `get_implicit_deletes_single_parent`, I decided to set the concurrency limit to 10 when flattening the results of each parent.
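For illustration, this is the general shape of flattening fallible sub-streams concurrently with futures-util (a standalone sketch, not the Mononoke code; `try_flatten_unordered` is available in recent versions of the futures crate):

```
use futures::stream::{self, StreamExt, TryStreamExt};

// try_flatten_unordered polls up to `limit` inner streams at once and yields
// items as they become ready, instead of draining each stream serially.
#[tokio::main]
async fn main() {
    let sub_streams = (0..3).map(|parent| {
        // stand-in for list_leaf_entries(parent): a fallible stream of paths
        stream::iter((0..4).map(move |i| Ok::<_, String>(format!("p{parent}/f{i}"))))
    });

    let leaves: Vec<String> = stream::iter(sub_streams)
        .map(Ok::<_, String>)
        .try_flatten_unordered(10) // concurrency limit across inner streams
        .try_collect()
        .await
        .expect("no stream errors in this example");

    assert_eq!(leaves.len(), 12);
}
```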

Reviewed By: markbt

Differential Revision: D50083311

fbshipit-source-id: 04f20f691597001b0f60651a0b635bff2d6e1596
facebook-github-bot pushed a commit that referenced this pull request Nov 8, 2023
Summary:
I found this code still using structopt, as the `arg_enum!` macro caused a clippy warning:
```
warning: useless use of `vec!`
  --> third-party/rust/vendor/clap-2.34.0/src/macros.rs:317:1
   |
   = note: in this expansion of `arg_enum!` (#1)
   |
   = note: in this expansion of `arg_enum!` (#2)
  ::: third-party/rust/vendor/clap-2.34.0/src/macros.rs:380:9
   |
   = note: in this macro invocation (#2)
   |
  ::: fbcode/eden/mononoke/facebook/derived_data_service/client/main.rs:33:1
   |
33 | / arg_enum! {
34 | |     #[derive(Debug)]
35 | |     enum ArgDerivationTypes {
36 | |         Single,
...  |
39 | |     }
40 | | }
   | |_- in this macro invocation (#1)
   |
   = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#useless_vec
   = note: `#[warn(clippy::useless_vec)]` on by default

warning: 1 warning emitted
```

Time to move on.

We try to minimize behaviour changes as this diff is a refactoring.
However, the derivation type now expects a lowercase argument, as that's the default behaviour for ValueEnum. We're not going out of our way to stick to the old behaviour; we update the tests instead.
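For reference, here is a hedged sketch of the clap v4 equivalent of the old `arg_enum!` declaration (the variant besides `Single` is an illustrative placeholder):

```
use clap::{Parser, ValueEnum};

// Deriving ValueEnum replaces the clap v2 arg_enum! macro; variant names
// are accepted in lowercase on the command line by default.
#[derive(Clone, Debug, ValueEnum)]
enum ArgDerivationTypes {
    Single,
    Stack, // placeholder: the real enum has other variants
}

#[derive(Parser, Debug)]
struct Args {
    /// parsed as "single" or "stack"
    #[arg(long, value_enum)]
    derivation_type: ArgDerivationTypes,
}

fn main() {
    let args = Args::parse();
    println!("{:?}", args.derivation_type);
}
```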

Reviewed By: mitrandir77

Differential Revision: D51109785

fbshipit-source-id: 8ad55f842466c2305e97b85c7a733d2840d41b90
facebook-github-bot pushed a commit that referenced this pull request Dec 14, 2023
Summary:
Synced with pierrec offline about this.

The current TW spec uses this flag and we haven't been able to deploy a new version since I migrated the forward syncer to the Mononoke app and clap v4 in D51583743.

We'll fix this by:

1. Enabling logging to the table by default and adding a dummy `--log-to-scuba` that doesn't do anything but maintains backwards compatibility with the command.
2. After all tasks are updated to the new version, update the tw spec and remove the `--log-to-scuba` flag and add the `--scuba-dataset "mononoke_x_repo_sync"` arg, which is the new way to enable scuba logging.
3. Then disable logging by default. Prod will still log because we enabled it in step #2.

Differential Revision: D51846944

fbshipit-source-id: f106b33fa91a1f227ce6ca7e8f589a106edc459d
facebook-github-bot pushed a commit that referenced this pull request Mar 21, 2024
Summary:
## This stack
See D55064943.

## This diff
Fixes issue #2 surfaced in D55068950.

Differential Revision: D55126396

fbshipit-source-id: 43df6c1f61625f947e3f850133ced3dca50c1555
facebook-github-bot pushed a commit that referenced this pull request Mar 26, 2024
Summary:
# Context

We received several reports of scm-aware Watchman queries returning incorrect results when FilteredFS was being used and filters were actively enabled[1]. In these reports, users expected modified files between commits to be reported as changed; however, Watchman (err, EdenFS in this case) reported these files as unmodified between commits. This was causing build tools to use stale file content.

After further investigation, I found that disabling filters (running `hg filteredfs disable`) caused Watchman to report correct results. Therefore, the bug only occurred when a non-null filter was enabled on FilteredFS. This significantly narrowed down the possible causes.

# Root Cause

The root cause was a bug in the ObjectID comparison logic for FilteredBackingStores. FilteredBackingStores use FilteredObjectIDs. When determining differences between these IDs, we must take into account three things:

1) Do the FilteredObjectIDs have the same type (Blob, Tree, or Unfiltered Tree)?
2) Do the Filters associated w/ each FilteredObjectID resolve to the exact same coverage (i.e. do they filter the same descendants)?
3) Do the underlying ObjectIDs associated w/ each FilteredObjectID refer to the same object(s)?

We were successfully determining #1 and #2, but we failed to determine #3 correctly in one specific edge case. This was causing us to incorrectly treat two FilteredObjectIDs as identical when they were actually different.

# This diff

This diff introduces a test that shows the bad behavior. We would expect this test to fail if we EXPECT_NE() instead of EXPECT_EQ() on line 824 of FilteredBackingStoreTest.cpp

Reviewed By: kmancini

Differential Revision: D55345652

fbshipit-source-id: 998427cd09c9f8965ec9c84cb21f3c25222ff7d0
facebook-github-bot pushed a commit that referenced this pull request Mar 26, 2024
Summary:
# Context

We received several reports of scm-aware Watchman queries returning incorrect results when FilteredFS was being used and filters were actively enabled[1]. In these reports, users expected modified files between commits to be reported as changed; however, Watchman (err, EdenFS in this case) reported these files as unmodified between commits. This was causing build tools to use stale file content.

After further investigation, I found that disabling filters (running `hg filteredfs disable`) caused Watchman to report correct results. Therefore, the bug only occurred when a non-null filter was enabled on FilteredFS. This significantly narrowed down the possible causes.

# Root Cause

The root cause was a bug in the ObjectID comparison logic for FilteredBackingStores. FilteredBackingStores use FilteredObjectIDs. When determining differences between these IDs, we must take into account three things:

1) Do the FilteredObjectIDs have the same type (Blob, Tree, or Unfiltered Tree)?
2) Do the Filters associated w/ each FilteredObjectID resolve to the exact same coverage (i.e. do they filter the same descendants)?
3) Do the underlying ObjectIDs associated w/ each FilteredObjectID refer to the same object(s)?

We were successfully determining #1 and #2, but we failed to determine #3 correctly in one specific edge case. This was causing us to incorrectly treat two FilteredObjectIDs as identical when they were actually different.

# This diff

This diff ensures that object equality (aka #3 from above) is checked in all cases. Therefore we don't hit a scenario where #1 and #2 are equal and #3 is ignored altogether.
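To make the fixed rule concrete, here is a hedged Rust sketch (the real code is C++ in EdenFS, and real filter-coverage comparison is more involved than a string equality):

```
#[derive(PartialEq)]
enum FilteredType {
    Blob,
    Tree,
    UnfilteredTree,
}

struct FilteredObjectId {
    ty: FilteredType,
    filter_id: String,      // stand-in for the filter-coverage check
    underlying_id: Vec<u8>, // the wrapped ObjectID
}

// Two FilteredObjectIds are equal only if all three checks pass; the
// underlying object identity (#3) is never skipped.
fn filtered_ids_equal(a: &FilteredObjectId, b: &FilteredObjectId) -> bool {
    a.ty == b.ty                              // 1) same type
        && a.filter_id == b.filter_id         // 2) same filter coverage (simplified)
        && a.underlying_id == b.underlying_id // 3) same underlying object
}
```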

Reviewed By: kmancini

Differential Revision: D55350885

fbshipit-source-id: bdafab99e56ddfa98446b4ba26dc0bde96121dad
facebook-github-bot pushed a commit that referenced this pull request Apr 24, 2024
Summary:
# Context

Our last attempt at removing the Python version of `eden redirect` failed (see the SEV attached to D55426809). Our next attempt should be done in a phased manner to avoid similar breakages.

# Solution

We will slowly remove all possible fallbacks to Python so that we can be sure no commands are running the Python version of `eden redirect` prior to deleting the code. The new rollout plan is as follows:

1) Remove the Chef recipe that writes `/etc/eden/edenfsctl_rollout.json` (Done)
2) Remove "redirect" from the default list of experimental commands in the Rust code.
3) Add a fallback to Rust if Python parsing fails for experimental commands
4) Add an EdenFS event that's logged every time we fallback to the Python `eden redirect` codepath
5) Monitor Scuba data to ensure the Python codepath isn't being hit
6) Delete Python `eden redirect` altogether

# This diff

This diff accomplishes #2 from the above list, which should make it pretty much impossible to hit the Python code path *unless* some other bug exists in the Rust redirect code that causes valid commands to fail to parse.

Reviewed By: kmancini

Differential Revision: D56277269

fbshipit-source-id: 9ca14fe960fe0e7deff5a424a9b7074983473d03
facebook-github-bot pushed a commit that referenced this pull request May 1, 2024
Summary:
We have discovered a deadlock in EdenFS in S412223. The deadlock appears when
all fsChannelThreads have a stack trace that looks like:

```
  thread #225, name = 'FsChannelThread', stop reason = signal SIGSTOP
    frame #0: 0x00007f7ca85257b9
    frame #1: 0x0000000007d73aa4 edenfs`bool folly::DynamicBoundedQueue<folly::CPUThreadPoolExecutor::CPUTask, false, false, true, 8ul, 7ul, folly::DefaultWeightFn<folly::CPUThreadPoolExecutor::CPUTask>, std::atomic>::canEnqueue<std::chrono::_V2::steady_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l>>>(std::chrono::time_point<std::chrono::_V2::steady_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l>>> const&, unsigned long) + 212
    frame #2: 0x0000000007d73557 edenfs`bool folly::DynamicBoundedQueue<folly::CPUThreadPoolExecutor::CPUTask, false, false, true, 8ul, 7ul, folly::DefaultWeightFn<folly::CPUThreadPoolExecutor::CPUTask>, std::atomic>::tryEnqueueUntilSlow<std::chrono::_V2::steady_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l>>, folly::CPUThreadPoolExecutor::CPUTask>(folly::CPUThreadPoolExecutor::CPUTask&&, std::chrono::time_point<std::chrono::_V2::steady_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l>>> const&) + 39
    frame #3: 0x0000000007d722c0 edenfs`facebook::eden::EdenTaskQueue::add(folly::CPUThreadPoolExecutor::CPUTask) + 576
    frame #4: 0x0000000005172cbd edenfs`folly::CPUThreadPoolExecutor::add(folly::Function<void ()>) + 557
    frame #5: 0x000000000503c4d8 edenfs`void folly::Executor::KeepAlive<folly::Executor>::add<folly::Function<void (folly::Executor::KeepAlive<folly::Executor>&&)>>(folly::Function<void (folly::Executor::KeepAlive<folly::Executor>&&)>&&) && + 216
    frame #6: 0x000000000503c257 edenfs`folly::futures::detail::DeferredExecutor::addFrom(folly::Executor::KeepAlive<folly::Executor>&&, folly::Function<void (folly::Executor::KeepAlive<folly::Executor>&&)>) + 135
    frame #7: 0x000000000503baf5 edenfs`folly::futures::detail::CoreBase::doCallback(folly::Executor::KeepAlive<folly::Executor>&&, folly::futures::detail::State) + 565
    frame #8: 0x00000000052e5e0d edenfs`folly::futures::detail::CoreBase::proxyCallback(folly::futures::detail::State) + 493
    frame #9: 0x00000000052e5bb2 edenfs`folly::futures::detail::CoreBase::setProxy_(folly::futures::detail::CoreBase*) + 66
    ...
```

This stack means that an FsChannelThread is blocked waiting to enqueue a task to
the EdenTaskQueue. The EdenTaskQueue is the task queue that FsChannelThreads
pull work from. So the deadlock is that all FsChannelThreads are blocked trying
to add to a full queue which can only be emptied by FsChannelThreads.
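The shape of the deadlock, as a minimal standalone Rust sketch (an illustration of the pattern, not EdenFS code; it hangs by design):

```
use std::sync::mpsc::sync_channel;
use std::thread;

// A worker drains a bounded queue, but running a task pushes follow-up
// tasks back onto the same queue. Once the queue is full, the only thread
// that could drain it is the one blocked trying to push - a deadlock.
fn main() {
    let (tx, rx) = sync_channel::<Box<dyn FnOnce() + Send>>(1);
    let tx2 = tx.clone();
    let worker = thread::spawn(move || {
        while let Ok(task) = rx.recv() {
            task();
        }
    });
    tx.send(Box::new(move || {
        tx2.send(Box::new(|| ())).unwrap(); // fills the queue
        tx2.send(Box::new(|| ())).unwrap(); // blocks forever: self-enqueue
    }))
    .unwrap();
    worker.join().unwrap(); // never returns
}
```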

There are two contributing reasons for this happening:
1. The FsChannelThreads will queue a bunch of network fetches into the Sapling
queue. When those all finish, the future callback chain runs on the
FsChannelThreads. So the backing store accumulates a bunch of work, then when
the fetches complete it dumps a bunch of tasks onto the FsChannelThreads' queue.
So the backing store is filling up the FsChannelThread task queue. Other threads
could theoretically do this too, but the backing store is the main one I have seen.

2. The FsChannelThreads might enqueue to their own queue. Folly futures have
some smarts to try to prevent a task running on executor a from enqueueing work
onto executor a (and instead just run the callback inline), see: https://www.internalfb.com/code/fbsource/[c7c20340562d2eab5f5d2f7f45805546687942d9]/fbcode/folly/futures/detail/Core.cpp?lines=147-148.
However, that does not prevent a future that is unknowingly running on an
executor's thread from enqueueing to that thread's executor queue. I believe that
kind of thing happens when we do stuff like this: https://www.internalfb.com/code/fbsource/[824f6dc95f161e141bf9b821a7826c40b570ddc3]/fbcode/eden/fs/inodes/TreeInode.cpp?lines=375-376
The outer lambda is aware of which executor it's running on, but the future we start
inside the lambda is not, so when we add that thenValue, it doesn't realize it's
enqueueing to its own executor. I wrote up this toy program to show when folly
will enqueue vs run inline: https://www.internalfb.com/intern/commit/cloud/FBS/bce3a906f53913ab8dc74944b8b50d09d78baf9a.
script: P1222930602, output: P1222931093. This shows that if you return a future
from a future callback, the next callback is enqueued; and if you have a callback
on a future returned by another future's callback, the callback on the returned
future is enqueued.

So in summary, the backing store fills up the FsChannelThread work queue, then an
FsChannelThread tries to push work onto its own queue and deadlocks.

Potential solutions:
**1- Queued Immediate Executor.**

This behavior was likely introduced here: D51302394. We moved from the queued
immediate executor to the fschannelthreads. The queued immediate executor uses an
unbounded queue: https://www.internalfb.com/code/fbsource/[c7c20340562d2eab5f5d2f7f45805546687942d9]/fbcode/folly/executors/QueuedImmediateExecutor.h?lines=39
so that is why we didn't have the problem before.

I don't love going back to this solution because we are moving away from the
queued immediate executor precisely because it makes it easy to cause deadlocks. For
example the one introduced in D50199539.

**2-  Make the queue error instead of block when full.**

We used to do that for the Eden CPU thread pool, and it resulted in a lot of
errors from eden that caused issues for clients. See: D6490979. I think we are
kicking the can down the road if we error rather than block.

**3- Don't bother switching to the fschannelthreads to complete the fuse request.**

This is likely going to be the same as 1, unless we are going to undo the
semifuture-ization we have been doing. Or perhaps we could start the fuse request
on the fschannel threads, then finish it on the eden cpu threads, which is pretty
much the same thing as this diff except with more sets of threads involved. So I
prefer this change.

**4- add backpressure somewhere else.**

If we prevent the backingstore/other threads from being able to fill up the
fschannelthread queue, then it should be impossible for the queue to fill up,
because there would be no fan-out (one task out, then one task in).

However, this is fragile: we could easily introduce fan-out again and then
end up back here. Also, this would mean we basically block all requests in the
fuse layer instead of the lower layers of eden. We would need to redo the
queueing in the backing store layer.

The fragility and complexity make me dislike this solution.

**5 - linearize all future chains.**
The reason that the fschannelthreads are enqueueing to their own queue is that we
have nested futures. If we didn't nest, then folly futures would run callbacks
inline. So if we de-nest all our futures this issue should theoretically go away,
because the fschannelthreads will not enqueue to their own queue.

However, I don't know if we can completely get rid of returning a future in
a future callback.

I don't love this solution as I know there are some explicit places where we
choose to nest (I'm looking at you, PrjFSDispatcher), so it would be VERY
confusing where I am supposed to nest and where I am not. It would be easy to
do the wrong thing and re-introduce this bug. Also, this is a ton of places we
need to change, and they are not easy to find. So I don't like this option because
it's not very "pit of success" - too fragile and too easy to get wrong the first
time.

**6 - unbound the fschannelthread queue.**

It's not great, but there is precedent for this. We unbounded the eden cpu
thread pool for basically the same reason. See D6513572.

The risk here is that we are opening ourselves up to OOMs. The queue might
grow super large and then get eden OOM-killed. We should probably add a
config to this change so we can roll it out carefully and watch for OOMs as
we roll out. Additionally, long term we likely want to rethink how we do threading
to architect eden away from this.

I prefer this solution the most. That's what I have implemented here.

---------------------------
note: I am removing the limit on the number of outstanding fs requests we
process at once in this diff. That config was not exactly working how we wanted
anyway. Queueing in the backing store lets us handle essentially infinite
requests at once, as the Sapling request queue does not have a max size.
I can follow up with a semaphore in the fuse/nfs layers to rate-limit the
number of active requests. Though fwiw I will likely bump the limit a lot, at
least initially, when I do that, since we were realistically allowing clients
to do infinite requests previously.

Reviewed By: jdelliot, genevievehelsel

Differential Revision: D56553375

fbshipit-source-id: 9c6c8a76bd7c93b00d48654cd5fc31d1a68dc0b2
facebook-github-bot pushed a commit that referenced this pull request May 15, 2024
Summary:
## This stack
See D57204338.

## This diff
As mentioned in D56826268 and D56881703, there's currently a bug in `get_git_submodule_action_by_version` (just renamed from `get_strip_git_submodules_by_version`) that leads to using the default submodule action (i.e. STRIP) when it's called during backsyncing (large to small).

Luckily this led to behaviour very close to what we're looking for **now**, while backsyncing submodule changes is not supported:
1. If the commit doesn't touch submodule expansions, everything works normally.
2. If it does, it will sync it, but the synced commit will be "broken": it will fail to derive fsnodes and other manifests.

We still want to keep #1, but we don't want #2. What we want right now is to **explicitly fail sync if someone tries to backsync any changes modifying submodule expansions**.

This diff implements this.
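A hedged sketch of the validation this implies (invented helper names; the real check works on bonsai changesets and config-derived expansion paths):

```
// Refuse to backsync a commit that touches any submodule expansion path.
fn validate_backsync(
    changed_paths: &[&str],
    expansion_prefixes: &[&str],
) -> Result<(), String> {
    for path in changed_paths {
        for prefix in expansion_prefixes {
            if path == prefix || path.starts_with(&format!("{prefix}/")) {
                return Err(format!(
                    "changeset modifies submodule expansion {path}; \
                     backsyncing submodule changes is not supported"
                ));
            }
        }
    }
    Ok(())
}
```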

Reviewed By: mitrandir77

Differential Revision: D57151944

fbshipit-source-id: b4931d3cc58815b2beaeaae90de99dda15b5fbb9
facebook-github-bot pushed a commit that referenced this pull request Jun 15, 2024
Summary:
Start writing version #2 of the format.

Format v2 was introduced a good while ago (more than a month), and new code capable of
reading the new format is now everywhere. Let's move on and write the v2 format.

Reviewed By: muirdm

Differential Revision: D58591864

fbshipit-source-id: d86fe4c8521d2cd66cd5f7a905da64f82d1eb204
facebook-github-bot pushed a commit that referenced this pull request Jul 1, 2024
Summary: implement support for eagerepo - add a store

Reviewed By: quark-zju

Differential Revision: D58870596

fbshipit-source-id: 4d2f9d7cdaa3993f17304711188cea1cfb41887f
facebook-github-bot pushed a commit that referenced this pull request Sep 24, 2024
Summary: implement split algo

Reviewed By: muirdm

Differential Revision: D62855251

fbshipit-source-id: 2e24b94eb2fa681cfb46907d8abba0aaf8075724
facebook-github-bot pushed a commit that referenced this pull request Oct 2, 2024
Summary:
# Context

We're very inconsistent with the way we refer to extra information that's attached to blobs or trees. In some places we use "metadata", and in others we use "aux data". For example, in Eden we have used BlobMetadata and TreeMetadata to refer to this extra data. In Sapling, we also refer to this extra data as File and Tree Aux Data, while "metadata" is used there to refer to an entirely different type of data. Mononoke uses Metadata in most places. Lastly, EdenFS uses "Metadata" in a number of places, the most confusing being InodeMetadata, which refers to the information returned by a `stat` call (ex. mtime, ctime, blk_size, etc).

Given all the above information, during the RE CAS focus room we decided that unifying the clients on "Aux data" would be the best path forward.

# This Diff

This diff migrates as many references from {Tree,Blob}Metadata -> {Tree,Blob}AuxData as possible. A few key places are left out:

1) Thrift definitions cannot be changed. This is unfortunate, but we can possibly use aliases? That decision is open for discussion.
2) Any ODS counter names (i.e. the actual strings used to differentiate counters when querying ODS) are left as "metadata". I'd like to eventually change these, but we're stuck for now. We might be able to start duplicating some of those counters and then slowly migrate after ~90 days (or whatever our retention is).
3) The Markdown files for Eden stats are left unchanged due to #2 above.
4) The SerializedBlobMetadata fuzzing target and file are left unchanged. There's some weird security setup that I can't figure out, and it prevents me from modifying the target name or the sources of the target.

As far as I can tell, everything else is changed. Please let me know if you think I missed anything.

Reviewed By: jdelliot

Differential Revision: D63152972

fbshipit-source-id: 91d935bdb92a7d0f59db5fbbd8c54572b3585a1d
facebook-github-bot pushed a commit that referenced this pull request Oct 29, 2024
Summary:
# Context
Processes holding open handles to files when source control actions happen can leave eden in a corrupted state.

# This Diff
This diff adds test runners for each open flag

# Technical Details
Default state: commits A->B->C
There is a test file in commit B that is modified from commit A. In commit C, the file is removed.

Each test performs the following:
 - checks if the test file can be read
 - checks if checking out commit A succeeds
 - checks if checking out commit C succeeds

# Discussion Points
The following bugs are revealed by this integration test:
 - When a process has a handle with only share_read, checking out a commit where that file does not exist will succeed. The file will not be changed, and diff/status will not report that the file is modified from the expected state (aka the file shouldn't exist on that commit, so it should show up as a new file, or the checkout should fail / leave a partial checkout state)
 - When a process has a handle with share_read and share_delete, checkouts will succeed completely. I'm not sure if this is actually a bug or if share_delete is a superset of share_write.

Reviewed By: ViniGupta08

Differential Revision: D63797373

fbshipit-source-id: b00b342300d54cd08765655e2e63bdd764b8988e
facebook-github-bot pushed a commit that referenced this pull request Oct 30, 2024
Summary:
Fixes this (which was polluting my `arc rust-check` output):

```
warning: unused variable: `time`
   --> fbcode/eden/scm/saplingnative/bindings/modules/pymetalog/src/lib.rs:119:38
    |
119 |     def commit(&self, message: &str, time: Option<u64> = None, pending: bool = false) -> PyResult<Bytes> {
    |                                      ^^^^
    |
help: `time` is captured in macro and introduced a unused variable
   --> third-party/rust/vendor/cpython-0.7.2/src/py_class/py_class.rs:478:1
    |
    = note: in this expansion of `py_class!` (#1)
   ::: third-party/rust/vendor/cpython-0.7.2/src/py_class/py_class.rs:537:9
    |
    = note: in this macro invocation (#2)
   ::: third-party/rust/vendor/cpython-0.7.2/src/py_class/py_class.rs:543:1
    |
    = note: in this expansion of `$crate::py_class_impl_item!` (#18)
   --> third-party/rust/vendor/cpython-0.7.2/src/py_class/py_class_impl3.rs:30:1
    |
    = note: in this expansion of `$crate::py_class_impl!` (#2)
    |
    = note: in this expansion of `$crate::py_class_impl!` (#3)
    |
    = note: in this expansion of `$crate::py_class_impl!` (#4)
    |
    = note: in this expansion of `$crate::py_class_impl!` (#5)
    |
    = note: in this expansion of `$crate::py_class_impl!` (#6)
    |
    = note: in this expansion of `$crate::py_class_impl!` (#7)
    |
    = note: in this expansion of `$crate::py_class_impl!` (#8)
    |
    = note: in this expansion of `$crate::py_class_impl!` (#9)
    |
    = note: in this expansion of `$crate::py_class_impl!` (#10)
    |
    = note: in this expansion of `$crate::py_class_impl!` (#11)
    |
    = note: in this expansion of `$crate::py_class_impl!` (#12)
    |
    = note: in this expansion of `$crate::py_class_impl!` (#13)
   ::: third-party/rust/vendor/cpython-0.7.2/src/py_class/py_class_impl3.rs:249:5
    |
    = note: in this macro invocation (#3)
    |
    = note: in this macro invocation (#4)
   ::: third-party/rust/vendor/cpython-0.7.2/src/py_class/py_class_impl3.rs:2392:5
    |
    = note: in this macro invocation (#5)
   ::: third-party/rust/vendor/cpython-0.7.2/src/py_class/py_class_impl3.rs:2970:5
    |
    = note: in this macro invocation (#7)
    |
    = note: in this macro invocation (#13)
   ::: third-party/rust/vendor/cpython-0.7.2/src/py_class/py_class_impl3.rs:2999:13
    |
    = note: in this macro invocation (#14)
   ::: third-party/rust/vendor/cpython-0.7.2/src/py_class/py_class_impl3.rs:3005:5
    |
    = note: in this macro invocation (#8)
    |
    = note: in this macro invocation (#11)
    |
    = note: in this macro invocation (#12)
   ::: third-party/rust/vendor/cpython-0.7.2/src/py_class/py_class_impl3.rs:3120:5
    |
    = note: in this macro invocation (#6)
    |
    = note: in this macro invocation (#9)
    |
    = note: in this macro invocation (#10)
   --> third-party/rust/vendor/cpython-0.7.2/src/argparse.rs:196:1
    |
    = note: in this expansion of `$crate::py_argparse_parse_plist_impl!` (#14)
    |
    = note: in this expansion of `$crate::py_argparse_parse_plist_impl!` (#15)
    |
    = note: in this expansion of `$crate::py_argparse_parse_plist_impl!` (#16)
    |
    = note: in this expansion of `$crate::py_argparse_parse_plist_impl!` (#17)
   ::: third-party/rust/vendor/cpython-0.7.2/src/argparse.rs:201:9
    |
    = note: in this macro invocation (#18)
   ::: third-party/rust/vendor/cpython-0.7.2/src/argparse.rs:271:9
    |
    = note: in this macro invocation (#15)
   ::: third-party/rust/vendor/cpython-0.7.2/src/argparse.rs:331:9
    |
    = note: in this macro invocation (#16)
    |
    = note: in this macro invocation (#17)
    |
   ::: fbcode/eden/scm/saplingnative/bindings/modules/pymetalog/src/lib.rs:36:1
    |
36  | /   py_class!(pub class metalog |py| {
37  | |       data log: Arc<RwLock<MetaLog>>;
38  | |       data fspath: String;
...   |
226 | |       }
227 | |   });
    | |____- in this macro invocation (#1)
    = note: `#[warn(unused_variables)]` on by default

```

Reviewed By: quark-zju

Differential Revision: D65218833

fbshipit-source-id: fa69c1a24a32b7eff857070528f6337ec0b3711c