Skip to content
Permalink

Comparing changes

Choose two branches to see what’s changed or to start a new pull request. If you need to, you can also or learn more about diff comparisons.

Open a pull request

Create a new pull request by comparing changes across two branches. If you need to, you can also . Learn more about diff comparisons here.
base repository: git/git
base: 11e5d0a262c9ba664b7203a279513385bf055745
Choose a base ref
...
head repository: git/git
compare: 2d3491b117c6dd08e431acc3904a546c4304d276
Choose a head ref
  • 6 commits
  • 2 files changed
  • 1 contributor

Commits on Sep 7, 2021

  1. tr2: remove NEEDSWORK comment for "non-procfs" implementations

    I'm fairly sure that there is no way on Linux to inspect the process
    tree without using procfs, any tool such as ps(1), top(1) etc. that
    shows this sort of information ultimately looks the information up in
    procfs.
    
    So let's remove this comment added in 2f732bf (tr2: log parent
    process name, 2021-07-21), it's setting us up for an impossible task.
    
    Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
    Acked-by: Taylor Blau <me@ttaylorr.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    avar authored and gitster committed Sep 7, 2021
    Copy the full SHA
    7d9c80f View commit details
    Browse the repository at this point in the history
  2. tr2: clarify TRACE2_PROCESS_INFO_EXIT comment under Linux

    Rewrite a comment added in 2f732bf (tr2: log parent process name,
    2021-07-21) to describe what we might do under
    TRACE2_PROCESS_INFO_EXIT in the future, instead of vaguely referring
    to "something extra".
    
    Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
    Acked-by: Taylor Blau <me@ttaylorr.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    avar authored and gitster committed Sep 7, 2021
    Copy the full SHA
    f2cc888 View commit details
    Browse the repository at this point in the history
  3. tr2: stop leaking "thread_name" memory

    Fix a memory leak introduced in ee4512e (trace2: create new
    combined trace facility, 2019-02-22), we were doing a free() of other
    memory allocated in tr2tls_create_self(), but not the "thread_name"
    "struct strbuf".
    
    Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
    Acked-by: Taylor Blau <me@ttaylorr.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    avar authored and gitster committed Sep 7, 2021
    Copy the full SHA
    48f6871 View commit details
    Browse the repository at this point in the history
  4. tr2: leave the parent list empty upon failure & don't leak memory

    In a subsequent commit I'll be replacing most of this code to log N
    parents, but let's first fix bugs introduced in the recent
    2f732bf (tr2: log parent process name, 2021-07-21).
    
    It was using the strbuf_read_file() in the wrong way, its return value
    is either a length or a negative value on error. If we didn't have a
    procfs, or otherwise couldn't access it we'd end up pushing an empty
    string to the trace2 ancestry array.
    
    It was also using the strvec_push() API the wrong way. That API always
    does an xstrdup(), so by detaching the strbuf here we'd leak
    memory. Let's instead pass in our pointer for strvec_push() to
    xstrdup(), and then free our own strbuf. I do have some WIP changes to
    make strvec_push_nodup() non-static, which makes this and some other
    callsites nicer, but let's just follow the prevailing pattern of using
    strvec_push() for now.
    
    We'll also need to free that "procfs_path" strbuf whether or not
    strbuf_read_file() succeeds, which was another source of memory leaks
    in 2f732bf, i.e. we'd leak that memory as well if we weren't on a
    system where we could read the file from procfs.
    
    Let's move all the freeing of the memory to the end of the
    function. If we're still at STRBUF_INIT with "name" due to not having
    taken the branch where the strbuf_read_file() succeeds freeing it is
    redundant. So we could move it into the body of the "if", but just
    handling freeing the same way for all branches of the function makes
    it more readable.
    
    In combination with the preceding commit this makes all of
    t[0-9]*trace2*.sh pass under SANITIZE=leak on Linux.
    
    Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
    Acked-by: Taylor Blau <me@ttaylorr.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    avar authored and gitster committed Sep 7, 2021
    Copy the full SHA
    6eccfc3 View commit details
    Browse the repository at this point in the history
  5. tr2: do compiler enum check in trace2_collect_process_info()

    Change code added in 2f732bf (tr2: log parent process name,
    2021-07-21) to use a switch statement without a "default" branch to
    have the compiler error if this code ever drifts out of sync with the
    members of the "enum trace2_process_info_reason".
    
    Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
    Acked-by: Taylor Blau <me@ttaylorr.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    avar authored and gitster committed Sep 7, 2021
    Copy the full SHA
    326460a View commit details
    Browse the repository at this point in the history
  6. tr2: log N parent process names on Linux

    In 2f732bf (tr2: log parent process name, 2021-07-21) we started
    logging parent process names, but only logged all parents on Windows.
    on Linux only the name of the immediate parent process was logged.
    
    Extend the functionality added there to also log full parent chain on
    Linux.
    
    This requires us to lookup "/proc/<getppid()>/stat" instead of
    "/proc/<getppid()>/comm". The "comm" file just contains the name of the
    process, but the "stat" file has both that information, and the parent
    PID of that process, see procfs(5). We parse out the parent PID of our
    own parent, and recursively walk the chain of "/proc/*/stat" files all
    the way up the chain. A parent PID of 0 indicates the end of the
    chain.
    
    It's possible given the semantics of Linux's PID files that we end up
    getting an entirely nonsensical chain of processes. It could happen if
    e.g. we have a chain of processes like:
    
        1 (init) => 321 (bash) => 123 (git)
    
    Let's assume that "bash" was started a while ago, and that as shown
    the OS has already cycled back to using a lower PID for us than our
    parent process. In the time it takes us to start up and get to
    trace2_collect_process_info(TRACE2_PROCESS_INFO_STARTUP) our parent
    process might exit, and be replaced by an entirely different process!
    
    We'd racily look up our own getppid(), but in the meantime our parent
    would exit, and Linux would have cycled all the way back to starting
    an entirely unrelated process as PID 321.
    
    If that happens we'll just silently log incorrect data in our ancestry
    chain. Luckily we don't need to worry about this except in this
    specific cycling scenario, as Linux does not have PID
    randomization. It appears it once did through a third-party feature,
    but that it was removed around 2006[1]. For anyone worried about this
    edge case raising PID_MAX via "/proc/sys/kernel/pid_max" will mitigate
    it, but not eliminate it.
    
    One thing we don't need to worry about is getting into an infinite
    loop when walking "/proc/*/stat". See 353d3d7 (trace2: collect
    Windows-specific process information, 2019-02-22) for the related
    Windows code that needs to deal with that, and [2] for an explanation
    of that edge case.
    
    Aside from potential race conditions it's also a bit painful to
    correctly parse the process name out of "/proc/*/stat". A simpler
    approach is to use fscanf(), see [3] for an implementation of that,
    but as noted in the comment being added here it would fail in the face
    of some weird process names, so we need our own parse_proc_stat() to
    parse it out.
    
    With this patch the "ancestry" chain for a trace2 event might look
    like this:
    
        $ GIT_TRACE2_EVENT=/dev/stdout ~/g/git/git version | grep ancestry | jq -r .ancestry
        [
          "bash",
          "screen",
          "systemd"
        ]
    
    And in the case of naughty process names like the following. This uses
    perl's ability to use prctl(PR_SET_NAME, ...). See
    Perl/perl5@7636ea95c5 (Set the legacy process name with prctl() on
    assignment to $0 on Linux, 2010-04-15)[4]:
    
        $ perl -e '$0 = "(naughty\nname)"; system "GIT_TRACE2_EVENT=/dev/stdout ~/g/git/git version"' | grep ancestry | jq -r .ancestry
        [
          "sh",
          "(naughty\nname)",
          "bash",
          "screen",
          "systemd"
        ]
    
    1. https://grsecurity.net/news#grsec2110
    2. https://lore.kernel.org/git/48a62d5e-28e2-7103-a5bb-5db7e197a4b9@jeffhostetler.com/
    3. https://lore.kernel.org/git/87o8agp29o.fsf@evledraar.gmail.com/
    4. Perl/perl5@7636ea9
    
    Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
    Acked-by: Taylor Blau <me@ttaylorr.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    avar authored and gitster committed Sep 7, 2021
    Copy the full SHA
    2d3491b View commit details
    Browse the repository at this point in the history