Skip to content

Conversation

andrurogerz
Copy link
Collaborator

@andrurogerz andrurogerz commented Oct 31, 2024

Purpose

In response to a qModuleInfo packet, send a crc32 of the file contents as the uuid when an ELF file has no build ID. The change mimics the behavior observed in lldb-server, enabling a number of additional lldb test cases to pass against ds2 running on Linux and Android.

Overview

  • When an ELF file has no build ID in the note section, look for a .gnu_debuglink section and return the crc32 value it contains as the "uuid"
  • When an ELF file has no build ID and has no .gnu_debuglink section, calculate a crc32 checksum of the entire file contents and return it as the "uuid"
  • Rather than take a dependency on zlib or another crc32 provider, import a small C crc32 implementation from freebsd to use
  • Implement a stub version of File::crc32 for Windows following the pattern of the other File methods.
  • Enable lldb test cases that now pass on Android and Linux
  • Disable TestGlobalModuleCache.GlobalModuleCacheTestCase.test_OneTargetOneDebugger test cases on Linux because the test now fails (appears to be a bug in the test case)

Validation

Ran the build & test GitHub workflow

@andrurogerz andrurogerz force-pushed the module-crc branch 6 times, most recently from 5198467 to 334e9cb Compare November 5, 2024 01:12
@andrurogerz andrurogerz changed the title return a crc32 of the file contents as uuid when an ELF file has no build ID return a crc32 uuid when an ELF file has no build ID Nov 5, 2024
@andrurogerz andrurogerz marked this pull request as ready for review November 5, 2024 02:07
@andrurogerz andrurogerz requested a review from compnerd November 5, 2024 02:07

template <typename ELFHeader, typename SectionHeader>
static bool ReadStringTable(int fd, const ELFHeader &ehdr,
SectionHeader &shdr, std::vector<char> &table);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how I feel about this. This is effectively a full copy of the string table. Do we need a copy? Is the mapped view not guaranteed to outlive this the reference? That is, more specifically, can we get away with a std::span<char> &?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no mmap of the file at this point, just an open fd. This string table should just include section names so I don't expect it to ever be very large-- just one string per section.

Let me rename the method to ReadSectionNameStringTable for clarity.

// return a crc32 of the contents of the file. The documentation for
// `qModuleInfo` suggests an md5 hash can be returned as "md5" in the
// response instead of "uuid." However, returning a crc32 is consistent
// with behavior of lldb-server.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically, MD5 is the right hash here. The default build ID hash was MD5, though it is not required to be MD5. The modern default has changed to SHA1, though UUID is becoming somewhat common too. The user also has the ability to specify a custom (hex) string of an even length as well. Is there some expectation that it is the CRC? I'd prefer MD5/SHA1 over CRC32.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to follow the docs and use md5 here, but there are a number of lldb tests that that start failing when we report an md5 hash instead of uuid. I don't know if sha1 is allowed-- there is nothing in the documentation.

This crc implementation is what I can observe lldb-server doing while running tests, and it gets the most tests passing. I will go back to md5 and see if there's anything more I can do to get tests passing.

// response instead of "uuid." However, returning a crc32 is consistent
// with behavior of lldb-server.
uint32_t crc;
CHK(File::crc32(path, crc));
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically, the computation is over the normative portions of the file, not the entire file. I suppose that there is the performance aspect - digesting the entire file is faster than parsing and processing the normative parts. Not doing the slower process means that we would not be able to correctly identify files, which leaves this partially incorrect.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I've observed, the CRCs calculated this way exactly match those sent by lldb-server on the same files. Am I getting lucky? Is there any documentation on what parts of the file should be covered by the crc/hash?

Copy link
Owner

@compnerd compnerd Nov 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Documentation? Ha! The source code to the linker is the documentation 😢. IIRC, it was .text, .data, and any custom sections that are digested.

continue;

const std::string sectionName(&table[shdr.sh_name]);
if (sectionName != ".gnu_debuglink")
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should create a global constant for the section name.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will. That was lazy.

@andrurogerz
Copy link
Collaborator Author

I don't think making these changes will get us much, and the majority of the tests enabled in this PR were enabled without it in #177.

@andrurogerz andrurogerz closed this Nov 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants