Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ElfObject::debug_link #450

Merged
merged 5 commits into from
Nov 22, 2021
Merged

Add ElfObject::debug_link #450

merged 5 commits into from
Nov 22, 2021

Conversation

dureuill
Copy link
Contributor

@dureuill dureuill commented Nov 18, 2021

Hello 👋,

I needed ElfObject to expose the debug link of the ELF. The debug link is one of the means (together with the build ID) for GDB to specify the location of a separate debug info file (see here for more information). This allows eg debuggers to find the debug files even when the build id is not specified.

This PR adds support for exposing the debug link as a CStr with its CRC.

Tests

Tested manually on binaries of two systems (manjaro current and Debian stretch): made sure that the executable below would return the same debug_link filename as readelf --string-dump=.gnu_debuglink <binary>. I'm not sure how to automate the tests though.

# Cargo.toml
[package]
name = "test_symbolic"
version = "0.1.0"
edition = "2021"

[dependencies]
anyhow = "1"
symbolic = {git = "https://github.com/dureuill/symbolic.git", branch = "debug_link"}
// main.rs
use anyhow::Context;

fn main() -> anyhow::Result<()> {
    let filename = std::env::args_os()
        .nth(1)
        .context("missing binary name as first argument")?;
    let display_filename = filename.to_string_lossy();
    let data = std::fs::read(&filename)
        .with_context(|| format!("could not open filename '{}'", display_filename))?;
    let elf = symbolic::debuginfo::elf::ElfObject::parse(&data)
        .with_context(|| format!("could not parse '{}' as ELF", display_filename))?;
    let (debug_link, crc) = elf
        .debug_link()
        .unwrap_or_default()
        .map(|d| (d.filename().to_string_lossy().into_owned(), d.crc()))
        .unwrap_or_default();
    println!(
        "{} | {} | {} ({})",
        display_filename,
        elf.code_id().unwrap_or_default(),
        debug_link, crc
    );
    Ok(())
}

Thank you for your time and consideration!

@dureuill dureuill requested a review from a team November 18, 2021 08:17
@dureuill dureuill force-pushed the debug_link branch 2 times, most recently from 7eea04d to 3c3a8b9 Compare November 18, 2021 08:48
@dureuill
Copy link
Contributor Author

Apologies, I had missed a bare link. cargo doc passes locally now.

Copy link
Contributor

@flub flub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally looks good I think, though I'm not very comfortable with all the .expect() calls and would prefer if this would return an error code instead, even though these should never happen (programming errors always happen at some point).


// let's be liberal and assume that the padding is correct and all 0s,
// and just check that we have enough remaining length for the CRC.
if data.len() <= filename_len_with_nul + 4 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC the expected length is supposed to be filename_len_with_null + (filename_len_with_nul % 4) + 4. Does it not make sense to check that size exactly here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You understand correctly.

Doing the check depends on where you stand wrt strictness of parsers. For example, in PE we often see binaries that don't completely respect the specification, but are still run successfully by Windows. A strict parser will not be able to retrieve information from these PE files though. OTOH, a liberal parser might occasionally retrieve wrong information (here, we could get a truncated CRC with a byte of padding if the CRC is not actually of 4 bytes and we have a byte of padding).

I have no idea of the prevalence of this problem in the gnu_debuglink section though.

If you are more comfortable with a strict parser, I can update the code to do a stricter length check, and let the user do their own parsing if they want to recover for malformed sections (this is why the error type exposes the original data section contents).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, you have a good point about being lenient when parsing. I'm not sure what usually the symbolic behaviour is, maybe @jan-auer can chime in?


let crc = match endianity {
Endian::Little => {
u32::from_le_bytes(data[data.len() - 4..].try_into().expect("wrong size"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of an abundance of caution I"d be tempted to use data.get(...) instead of direct slicing here. Though the checks before do make sure this can't happen but hey.

I'm also a bit uncomfortable about using .expect() for the same reason, another error variant can do this without panicking code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that seeing expect() is not a reassuring sight, especially in code by a new contributor 😅.

I'll try to refactor the code because I believe we can remove the expects here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed a new version of the diff that removes all expect() calls.

@dureuill
Copy link
Contributor Author

Thank you for the swift review! I will take another try to remove the expect calls!

@dureuill
Copy link
Contributor Author

I pushed a new version of the diff that removes all expect() calls. I had to refactor a bit the code, and the vector is no longer reused when the Cow was originally owned (I couldn't remove that particular instance of expect otherwise, as I couldn't recover the original data after truncating the vector).

DebugLinkError has been split into its kind (without lifetime, for my sanity), and its original data.

I see no difference in behavior with the previous version when running my (manual) tests again

@dureuill
Copy link
Contributor Author

If the new version is good for you, I can squash my commits

Copy link
Contributor

@flub flub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think this looks good, thanks for making the changes!

If you're squashing the changes anyway, mind adding an entry to the changelog with Unreleased as version name in the section?

@flub
Copy link
Contributor

flub commented Nov 19, 2021

oh, i forgot to write about tests, it would still be very nice to add some. What I'd propose is that you make a tiny hello world binary compiled with the debug link, hopefully the resulting debug info is not too large to put into symbolic-debuginfo/tests/fixtures/ and then you can write a test in rust extracting the expected debug link from it. maybe even once as owned and once as borrowed data.

Do you think that's reasonable?

@dureuill
Copy link
Contributor Author

Thanks again for the review!

mind adding an entry to the changelog with Unreleased as version name in the section?

Sure! Will do!

oh, i forgot to write about tests, it would still be very nice to add some.

Yes, your proposition looks good. I'm just unsure how the fixtures work exactly. Do you have any pointer on how I should add new fixtures? Otherwise I can attach the binaries (and/or its source, if you prefer to build it yourself) when I have it, so that you can "fixturize" it?

First, I will write the tests and produce the binaries and get back to you.

@flub
Copy link
Contributor

flub commented Nov 19, 2021

I'm just unsure how the fixtures work exactly. Do you have any pointer on how I should add new fixtures?

You save the files and open them directly in the test code, e.g. looking at an existing file in the fixtures directory yields

std::fs::read("tests/fixtures/2d10c42f-591d-3265-b147-78ba0868073f.plist").unwrap();
which I expect is very similar to what you will need.

Otherwise I can attach the binaries (and/or its source, if you prefer to build it yourself) when I have it, so that you can "fixturize" it?

It would be nice to have the source of the fixture in the repo as well i guess with a tiny script or comment on how to produce the .debug file that's committed. But in general we have been putting binary objects in the test suite to test with so there's nothing special to do in order to run the tests.

@dureuill
Copy link
Contributor Author

dureuill commented Nov 19, 2021

OK, I was thinking they were generated in some way due to the UUID in the filename 😅

I will do that and then update!

@flub
Copy link
Contributor

flub commented Nov 19, 2021

OK, I was thinking they were generated in some way due to the UUID in the filename 😅

Heh, in that case the filename is important because the filename is a DebugID and it is part of the test that the filename is parsed correctly. But otherwise the filename has no significance, whatever works for your test and makes sense.

@dureuill
Copy link
Contributor Author

I pushed a new version:

  • Rebase on master
  • Update changelog
  • Make DebugLinkError fields public (this was the initial intent)
  • Add a script to generate fixtures
  • Add the generated fixtures
  • Add tests using the generated fixtures

Tell me what you think :-)

Copy link
Contributor

@flub flub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very thorough testing, thanks!

cd $OUTPUT

# 1. compile our C example. To keep size low, let's compile the simplest program we can write.
cc -x c -Os -o elf_without_debuglink - << EOF
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't the debuglink a GCC-specific thing? should this be directly gcc? Or is this a POSIX thing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the debuginfo is added to the ELF afterwards using objcopy and is fake anyway (for testing purposes), I'm not sure if this can pose a problem.

It is no harm to change it though, since apparently I failed to run clippy on the tests and need to update 👀.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed (FTR, I probably missed the clippy error due to running cargo clippy instead of cargo clippy --all-features --workspace --tests --examples -- -D clippy::all like the CI)

@flub flub merged commit 6bc23c1 into getsentry:master Nov 22, 2021
@flub
Copy link
Contributor

flub commented Nov 22, 2021

Thanks you very much for the PR and following up on everything! ❤️

@dureuill
Copy link
Contributor Author

Well thank you for the review, it has been my pleasure! 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants