Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix issues with go binary version detect by aligning it with the go cli implementation #1605

Conversation

ddelnano
Copy link
Member

Summary: Fix issues with go binary version detect by aligning it with the go cli implementation

Our previous go version detection resulted in a number of bugs -- 32 bit binaries caused memory allocation errors (#1300) and failed with binaries built with go 1.20.4 and later (#1318). This change revamps the implementation to follow how the go version cli command works (the core logic can be found here). This should be more maintainable moving forward since our previous implementation relied on unstable ELF symbols and fixes the outstanding bugs.

Relevant Issues: Closes #1300 #1318

Type of change: /kind feature

Test Plan: New tests verify functionality for endian agnostic and little endian case (big endian case untested)

cli implementation

Signed-off-by: Dom Del Nano <ddelnano@pixielabs.ai>
@ddelnano ddelnano marked this pull request as ready for review July 11, 2023 18:27
@ddelnano ddelnano requested review from etep and a team July 11, 2023 18:27
Comment on lines +101 to +136
switch (endianness) {
case 0x0: {
if (ptr_size == 4) {
read_ptr = [&](u8string_view str_view) {
return utils::LEndianBytesToInt<uint32_t, 4>(str_view);
};
} else if (ptr_size == 8) {
read_ptr = [&](u8string_view str_view) {
return utils::LEndianBytesToInt<uint64_t, 8>(str_view);
};
} else {
return error::NotFound(absl::Substitute(
"Binary reported pointer size=$0, refusing to parse non go binary", ptr_size));
}
break;
}
case 0x1:
if (ptr_size == 4) {
read_ptr = [&](u8string_view str_view) {
return utils::BEndianBytesToInt<uint64_t, 4>(str_view);
};
} else if (ptr_size == 8) {
read_ptr = [&](u8string_view str_view) {
return utils::BEndianBytesToInt<uint64_t, 8>(str_view);
};
} else {
return error::NotFound(absl::Substitute(
"Binary reported pointer size=$0, refusing to parse non go binary", ptr_size));
}
break;
default: {
auto msg =
absl::Substitute("Invalid endianness=$0, refusing to parse non go binary", endianness);
DCHECK(false) << msg;
return error::NotFound(msg);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to consolidate these two cases, e.g. binding either LEndianBytesToInt or BEndianBytesToInt to a function then following the same logical path?

Copy link
Member Author

@ddelnano ddelnano Jul 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't able to find a solution that did what you described. Unfortunately LEndianBytesToInt and BEndianBytesToInt each have two cases (4 and 8 byte pointers) and I wasn't sure how to address dynamically selecting the pointer size at runtime (since the function pointer has to bind the pointer size at a later time).

I did have a previous implementation that relied on promoting 4 byte pointers to a 8 byte return value in #1590. That simplified the branching but not by the way you asked about. However, there were concerns about encoding that in a core utility function to simplify the caller (discussion).

src/stirling/obj_tools/go_syms.cc Outdated Show resolved Hide resolved
}

// Reads the buildinfo header embedded in the .go.buildinfo ELF section in order to determine the go
// toolchain version. This function emulates what the go version cli performs as seen
// https://github.com/golang/go/blob/cb7a091d729eab75ccfdaeba5a0605f05addf422/src/debug/buildinfo/buildinfo.go#L151-L221
StatusOr<std::string> ReadBuildVersion(ElfReader* elf_reader) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity (I'm assuming there is good reason...) but why is the ElfReader passed in as a param., i.e. rather than the binary path? (could create the elf reader inside this function).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's because many of the uprobe deployment helpers need to perform some form of ELF parsing. At the start of each UProbe deployment function (Go, NodeJS, OpenSSL), we create a shared ELFReader and pass it to all of the functions that need it.

Signed-off-by: Dom Del Nano <ddelnano@pixielabs.ai>
@ddelnano ddelnano requested a review from a team July 13, 2023 18:13
@JamesMBartlett JamesMBartlett merged commit 42177b7 into pixie-io:main Jul 13, 2023
@ddelnano ddelnano deleted the ddelnano/implement-fully-featured-go-buildinfo-parsing branch July 13, 2023 18:44
ddelnano added a commit that referenced this pull request Aug 15, 2024
…ries (#1976)

Summary: Address PEM crash caused by parsing certain older Go
application binaries

This PR fixes a crash caused by certain older Go application binaries.
In addition, this change includes the
`//src/stirling/binaries:go_binary_parse_profiling` cli tool. This tool
was helpful for debugging the previous 32 bit issue and aided in
debugging this problem (see Background section for more details). This
change is best reviewed commit by commit.

**Background**

Our Golang binary parsing was revamped in
#1605 to support Go 1.20.4
applications and later (#1318)
in addition to fixing a PEM crash caused by 32 bit go binaries
(#1300). While this solved the
aforementioned issues, it resulted in a new crash that we weren't able
to reproduce (#1646).

I was able to work with a Pixie Community Slack user to track down where
one of these issues originate from. The overview is that Go embeds
virtual addresses within the `.go.buildinfo` ELF section. These virtual
addresses are used in certain cases to read the build settings used when
the binary was created (toolchain version, go experiments, etc). In
order to properly read these strings, these virtual addresses need to be
converted into file offsets (binary addresses).

This bug presents itself when the `LOAD` ELF segments in the binary are
not contiguous or ordered by increasing virtual memory address. Meaning
if there are LOAD segments for segments 1, 2 and 3, this bug occurs if
those segments aren't adjacent to each other or don't have increasing
virtual memory addresses (vaddr of segment 1 < vaddr of segment 2 <
vaddr of segment 3). Instead the virtual address that needs to be looked
up, should be matched against the relevant segment and that segment's
virtual address offset should be used.

Relevant Issues: Partially addresses #1646 -- there is one more known
case, which must be investigated further

Type of change: /kind bug

Test Plan: Verified this change through the following
- [x] User from the community slack
[verified](https://pixie-community.slack.com/archives/CQ63KEVFY/p1722271309767939?thread_ts=1721315312.198319&cid=CQ63KEVFY)
that the issue was fixed.
- [x] New ElfReader function is covered with a test
- [x] go 1.17 test case ("[little
endian](https://github.com/pixie-io/pixie/blob/50ddcd32eb217e1aa5e87124883ee284a36052a1/src/stirling/obj_tools/go_syms_test.cc#L51)"
case) still works despite it not triggering this bug
- I was unable to recreate a binary that had the segments in an
unordered fashion.

Changelog Message: Fixed an issue with Go uprobe attachment that
previously caused crashes for a subset of older Go applications (Go 1.17
and earlier)

---------

Signed-off-by: Dom Del Nano <ddelnano@pixielabs.ai>
Signed-off-by: Dom Del Nano <ddelnano@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Parsing 32 bit Go binaries causes large memory allocations
3 participants