Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix several issues uncovered while investigating recent CI failures #155

Merged
merged 6 commits into from
Nov 22, 2023

Commits on Nov 17, 2023

  1. Remove unneeded code

    These variables are unused, as `ProcessManager.create_from_pid` now
    parses the maps file itself; the caller doesn't need to.
    
    Signed-off-by: Matt Wozniski <mwozniski@bloomberg.net>
    godlygeek committed Nov 17, 2023
    Configuration menu
    Copy the full SHA
    e262ce2 View commit details
    Browse the repository at this point in the history
  2. Fetch maps after pausing the process

    Previously we were pausing the process only after we had already fetched
    the processes mappings, leading to a potential TOCTOU issue: the fetched
    mappings might be stale by the time the process was paused and we were
    ready to use them.
    
    Handle this by creating a new `ProcessTracer` class responsible for
    pausing the process. This class matches the behavior that was previously
    found in `BlockingProcessMemoryManager`, except that it's not also an
    instance of `AbstractRemoteMemoryManager`: it handles only tracing the
    process on construction, and detaching on destruction.
    
    In blocking mode, we create a `ProcessTracer` before reading the
    mappings. Later, we pass it to the `ProcessManager`, which takes over
    ownership of it, destroying it when we're done analyzing the process and
    want to detach.
    
    Signed-off-by: Matt Wozniski <mwozniski@bloomberg.net>
    godlygeek committed Nov 17, 2023
    Configuration menu
    Copy the full SHA
    d895cfc View commit details
    Browse the repository at this point in the history
  3. Avoid a redundant call to get the process's TIDs

    By having the tracer expose the list of thread IDs that it stopped, the
    manager can avoid a call to find the list of thread IDs itself. Since
    every thread must be paused when there is a tracer, the tracer's list of
    threads must be accurate.
    
    Signed-off-by: Matt Wozniski <mwozniski@bloomberg.net>
    godlygeek committed Nov 17, 2023
    Configuration menu
    Copy the full SHA
    1972249 View commit details
    Browse the repository at this point in the history
  4. Handle threads dying or starting while we attach

    Previously, we were fetching the process's list of threads once, and
    iterating over that list attaching to each thread. This is racy, though:
    after we get the list of threads and before we've suspended every
    thread, it's still possible for new threads to start or old threads to
    finish.
    
    Handle this by fetching the list of threads at least twice. When we
    encounter an error trying to attach to a thread, don't report it unless
    the next call to list all of the threads says that thread still belongs
    to the process we're attaching to (if the thread has disappeared from
    the list, then the reason we failed to attach to that thread is that it
    has died).
    
    Similarly, if any new threads show up in the second list of threads,
    we'll try to pause those, repeating this until the set of threads that
    we've paused contains every thread that the process currently owns.
    
    Signed-off-by: Matt Wozniski <mwozniski@bloomberg.net>
    godlygeek committed Nov 17, 2023
    Configuration menu
    Copy the full SHA
    da0f4ae View commit details
    Browse the repository at this point in the history
  5. Mark our Cython logging handler as noexcept

    This is called from C++ code that doesn't know how to propagate a Python
    exception if one should occur. Tell Cython not to allow an exception to
    escape from this function, and to instead print and discard one if it
    should occur.
    
    Signed-off-by: Matt Wozniski <mwozniski@bloomberg.net>
    godlygeek committed Nov 17, 2023
    Configuration menu
    Copy the full SHA
    3938788 View commit details
    Browse the repository at this point in the history
  6. Improve handling of non-UTF-8 log messages

    Leverage our existing `_try_to_decode_string` function which uses
    `errors="replace"` for decoding, guaranteeing that log messages won't be
    lost entirely if they contain non-UTF-8 data.
    
    Also, pass the log message around as a C++ string rather than
    a C string, which is more efficient anyway (since the size is known and
    doesn't need to be computed).
    
    Signed-off-by: Matt Wozniski <mwozniski@bloomberg.net>
    godlygeek committed Nov 17, 2023
    Configuration menu
    Copy the full SHA
    db30c2f View commit details
    Browse the repository at this point in the history