elf: Define Go Build ID constants #549
What do you need this for? I don't think this change is correct.
Yeah, I wasn't sure whether it was correct to expose this in the same function. I just need to read this build id when looking at Go binaries, and they do use this form xor the GNU form (gccgo produces the GNU form).
What do you need to do with it? I can make a better decision about the API when I know what it is used for. Currently my preference would be to not add any specific knowledge of these notes. If users require it, then they can use the existing ELF note parser to read it. It is likely that the current API exposed by
We want to use this to identify the binary. By default, the value of the note is a content hash of the binary.
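For readers unfamiliar with the note layout being discussed, here is a minimal sketch of reading a build ID out of raw note bytes using only the standard library. It is not this crate's API: `Note`, `parse_notes`, `align4`, and this `build_id` are hypothetical names, and the sketch assumes little-endian data with 4-byte alignment. The `"Go"` name and type 4 are taken from the code later in this thread; the GNU build ID type is 3.

```rust
/// One parsed note record (hypothetical type, not from the `object` crate).
#[derive(Debug, PartialEq)]
struct Note<'a> {
    /// Owner name with trailing NUL bytes removed.
    name: &'a [u8],
    n_type: u32,
    desc: &'a [u8],
}

const ELF_NOTE_GNU: &[u8] = b"GNU";
const ELF_NOTE_GO: &[u8] = b"Go";
const NT_GNU_BUILD_ID: u32 = 3;
const NT_GO_BUILD_ID: u32 = 4;

/// Round `n` up to the next multiple of 4 (assumed note alignment).
fn align4(n: usize) -> usize {
    (n + 3) & !3
}

/// Read a little-endian u32 at `offset`; the caller checks bounds.
fn read_u32_le(data: &[u8], offset: usize) -> u32 {
    u32::from_le_bytes([
        data[offset],
        data[offset + 1],
        data[offset + 2],
        data[offset + 3],
    ])
}

/// Parse note records from the raw bytes of a note section or segment.
/// Parsing stops silently at the first truncated record.
fn parse_notes(mut data: &[u8]) -> Vec<Note<'_>> {
    let mut notes = Vec::new();
    // Each record starts with three 32-bit words: namesz, descsz, type.
    while data.len() >= 12 {
        let namesz = read_u32_le(data, 0) as usize;
        let descsz = read_u32_le(data, 4) as usize;
        let n_type = read_u32_le(data, 8);
        let desc_start = 12 + align4(namesz);
        let desc_end = desc_start + descsz;
        if desc_end > data.len() {
            break;
        }
        // Strip the NUL terminator and any NUL padding from the name.
        let mut name = &data[12..12 + namesz];
        while let [rest @ .., 0] = name {
            name = rest;
        }
        notes.push(Note { name, n_type, desc: &data[desc_start..desc_end] });
        // The next record starts after the padded descriptor.
        data = &data[align4(desc_end).min(data.len())..];
    }
    notes
}

/// Return the descriptor of the first GNU or Go build ID note, if any.
fn build_id(data: &[u8]) -> Option<&[u8]> {
    parse_notes(data)
        .into_iter()
        .find_map(|note| match (note.name, note.n_type) {
            (ELF_NOTE_GNU, NT_GNU_BUILD_ID) | (ELF_NOTE_GO, NT_GO_BUILD_ID) => Some(note.desc),
            _ => None,
        })
}
```

A real reader would also need to locate the note sections or segments and honor the file's endianness, which is what the crate's existing parsers handle.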
Indeed, the current constellation of APIs doesn't make this pleasant:
I think you'd have to:
Are you imagining an API that makes this possible starting from FWIW, the current
I think we need to add this.
This is what
There exists We could also add a method to
This should already be possible. Do you think this is unsuitable? If so, what could be done to improve it?
I think both should be made possible. Use
It's special because this crate originally existed only to find and load debug info for gimli, and when it was added the lower level API didn't exist.
Today
True, if we could go
That would work, but with the above, I think it wouldn't be strictly necessary.
Yeah, it is possible today (that's what I was saying). I think the ideas above would make this more ergonomic and wouldn't require parsing the file header twice.
In this case what I'm doing is specific to ELF (other formats don't have notes AFAIK) but it would be nice to start with
Perhaps that's a good reason to remove it (once it's possible to ergonomically go from I think redefining Let me know if this sounds sensible, and I'll rework this PR.
It's something I've considered, and it's okay, but ideally I would like something better. The shortcoming of this approach is that it means you must be working with
Why do you need to parse the file header twice? Do you mean you would still need the parsing that
It's not worth breaking existing users for.
Yeah, I think you're right - we wouldn't need to use
this code doesn't compile because the match arms have different types. Should this crate provide an enum to unify the 32-bit and 64-bit headers? I tried writing this in my code, but ran into problems:
I sent #551 to remove the
Instead of writing code like that, can you call a function that is generic over
object/crates/examples/src/readobj/mod.rs Lines 192 to 207 in ffe0cee
object/crates/examples/src/readobj/elf.rs Lines 6 to 20 in ffe0cee
If that is not possible, I agree that it might be nice to have an enum for both 32-bit and 64-bit (this is what Instead of trying to implement
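To illustrate the suggestion of dispatching to a generic function rather than unifying the match arms, here is a rough, self-contained sketch. Everything in it is hypothetical (`Header`, `Header32`, `Header64`, `Class`, `describe`, `describe_file`) and only stands in for the crate's real header types. The point is that each arm calls the same generic function, so both arms produce the same return type even though the header types differ.

```rust
/// Hypothetical trait implemented by both header widths.
trait Header {
    /// Address width in bits.
    const BITS: u32;
    /// Entry point, widened to 64 bits.
    fn entry(&self) -> u64;
}

struct Header32 {
    entry: u32,
}

struct Header64 {
    entry: u64,
}

impl Header for Header32 {
    const BITS: u32 = 32;
    fn entry(&self) -> u64 {
        u64::from(self.entry)
    }
}

impl Header for Header64 {
    const BITS: u32 = 64;
    fn entry(&self) -> u64 {
        self.entry
    }
}

/// The real work is written once, generic over the header type.
fn describe<H: Header>(header: &H) -> String {
    format!("ELF{} entry={:#x}", H::BITS, header.entry())
}

/// Hypothetical file class, decided once at the top level.
enum Class {
    Elf32,
    Elf64,
}

/// Only the dispatch knows the class; both arms return `String`,
/// so the match type-checks without a unifying enum.
fn describe_file(class: Class, entry: u32) -> String {
    match class {
        Class::Elf32 => describe(&Header32 { entry }),
        Class::Elf64 => describe(&Header64 { entry: u64::from(entry) }),
    }
}
```

The trade-off is that the parsed header cannot escape the match arm; if the caller needs to hold on to it, a wrapper enum like the one discussed here is still needed.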
Sort of. The code I ended up writing (which didn't require any patches to this crate) is:

    fn notes<'data, Elf, R>(
        elf: &object::read::elf::ElfFile<'data, Elf, R>,
    ) -> object::read::Result<
        impl Iterator<Item = object::read::Result<object::read::elf::NoteIterator<'data, Elf>>>,
    >
    where
        Elf: object::read::elf::FileHeader,
        R: object::ReadRef<'data>,
    {
        use object::read::elf::{ProgramHeader as _, SectionHeader as _};

        let endian = elf.endian();
        let data = elf.data();
        let header = elf.raw_header();

        let sections = header.sections(endian, data)?;
        let sections = sections
            .iter()
            .filter_map(move |header| header.notes(endian, data).transpose());

        let segments = header.program_headers(endian, data)?;
        let segments = segments
            .iter()
            .filter_map(move |header| header.notes(endian, data).transpose());

        Ok(std::iter::empty().chain(sections).chain(segments))
    }

    // TODO(https://github.com/gimli-rs/object/pull/549): Simplify this.
    fn build_id<'data, Elf, R>(
        elf: &object::read::elf::ElfFile<'data, Elf, R>,
    ) -> object::read::Result<Option<&'data [u8]>>
    where
        Elf: object::read::elf::FileHeader,
        R: object::ReadRef<'data>,
    {
        let endian = elf.endian();

        const ELF_NOTE_GNU: &[u8] = b"GNU";
        const ELF_NOTE_GO: &[u8] = b"Go";
        const NT_GO_BUILD_ID: u32 = 4;

        let mut notes = notes(elf)?;
        while let Some(mut notes) = notes.next().transpose()? {
            while let Some(note) = notes.next()? {
                let mut name = note.name();
                while let [rest @ .., 0] = name {
                    name = rest;
                }
                match (name, note.n_type(endian)) {
                    (ELF_NOTE_GNU, object::elf::NT_GNU_BUILD_ID) | (ELF_NOTE_GO, NT_GO_BUILD_ID) => {
                        return Ok(Some(note.desc()));
                    }
                    _ => {}
                }
            }
        }
        Ok(None)
    }

    enum ElfFile<'data, Endian: object::Endian, R: object::ReadRef<'data>> {
        Elf32(object::read::elf::ElfFile<'data, object::elf::FileHeader32<Endian>, R>),
        Elf64(object::read::elf::ElfFile<'data, object::elf::FileHeader64<Endian>, R>),
    }

    impl<'data, Endian: object::Endian, R: object::ReadRef<'data>> ElfFile<'data, Endian, R> {
        fn endian(&self) -> Endian {
            match self {
                Self::Elf32(elf) => elf.endian(),
                Self::Elf64(elf) => elf.endian(),
            }
        }

        fn section_data_by_name(
            &self,
            section_name: &str,
        ) -> Option<object::Result<std::borrow::Cow<'_, [u8]>>> {
            use object::Object as _;

            match self {
                Self::Elf32(elf) => elf
                    .section_by_name(section_name)
                    .as_ref()
                    .map(object::ObjectSection::uncompressed_data),
                Self::Elf64(elf) => elf
                    .section_by_name(section_name)
                    .as_ref()
                    .map(object::ObjectSection::uncompressed_data),
            }
        }

        fn build_id(&self) -> object::read::Result<Option<&'data [u8]>> {
            match self {
                Self::Elf32(elf) => build_id(elf),
                Self::Elf64(elf) => build_id(elf),
            }
        }
    }
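The code above leans on `transpose()` in two places: inside `filter_map` to drop `Ok(None)` results while keeping errors, and in `while let Some(..) = iter.next().transpose()?` to stop at the first error. A small standard-library-only sketch of the same idiom follows. The names `parse_entry`, `entries`, and `first_above` are hypothetical and unrelated to this crate.

```rust
/// Hypothetical parser: `Ok(None)` means "nothing here", `Err` means malformed input.
fn parse_entry(raw: &str) -> Result<Option<u32>, String> {
    if raw.is_empty() {
        return Ok(None);
    }
    raw.parse::<u32>()
        .map(Some)
        .map_err(|e| format!("bad entry {:?}: {}", raw, e))
}

/// `Result<Option<T>, E>::transpose()` yields `Option<Result<T, E>>`, so
/// `filter_map` skips the `Ok(None)` cases but still passes errors through.
fn entries<'a>(raws: &'a [&'a str]) -> impl Iterator<Item = Result<u32, String>> + 'a {
    raws.iter().filter_map(|raw| parse_entry(raw).transpose())
}

/// Return the first value greater than `min`, failing on the first error seen.
fn first_above(raws: &[&str], min: u32) -> Result<Option<u32>, String> {
    let mut iter = entries(raws);
    // `Option<Result<T, E>>::transpose()` goes the other way, which lets `?`
    // propagate an error while `while let` handles the end of iteration.
    while let Some(value) = iter.next().transpose()? {
        if value > min {
            return Ok(Some(value));
        }
    }
    Ok(None)
}
```

Because the iterator is lazy, a malformed entry that comes after the first match is never parsed, which mirrors how the build ID lookup returns as soon as it finds a matching note.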
Yeah, I agree. I think we can perhaps move that discussion to #551. Apart from the last commit in this PR, I think it's an uncontroversial cleanup - do you agree? If yes, I can either:
Let me know what you think, and hopefully we can get this landed. Thanks for helping me iterate here.
Is that satisfactory? Are you doing things with the file in addition to reading the build id? That is more code than I hoped would be needed simply to read a build id, and it could be simplified, but if you are doing other things too then maybe it is needed.
I think we should do this. I've added some review comments.
Yeah, I go on to pass this into gimli to read the debug info.
Done, thanks for the comments.
This allows these items to be used in match expressions.
Add Note::name_bytes which returns the entire content of the name field.
elf: Define Go Build ID constants
See individual commits.