Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial support for BCSymbolMap #336

Merged
merged 36 commits into from
Mar 30, 2021
Merged

Initial support for BCSymbolMap #336

merged 36 commits into from
Mar 30, 2021

Conversation

flub
Copy link
Contributor

@flub flub commented Mar 8, 2021

There is one TODO left to decide, but otherwise this is pretty much done.

symbolic-debuginfo/src/bcsymbolmap.rs Outdated Show resolved Hide resolved
symbolic-debuginfo/src/bcsymbolmap.rs Outdated Show resolved Hide resolved
symbolic-debuginfo/src/bcsymbolmap.rs Outdated Show resolved Hide resolved
symbolic-debuginfo/src/object.rs Outdated Show resolved Hide resolved
symbolic-debuginfo/src/plist.rs Outdated Show resolved Hide resolved
symbolic-debuginfo/src/plist.rs Outdated Show resolved Hide resolved
symbolic-debuginfo/src/plist.rs Outdated Show resolved Hide resolved
symbolic-debuginfo/src/plist.rs Outdated Show resolved Hide resolved
symbolic-debuginfo/src/macho.rs Outdated Show resolved Hide resolved
symbolic-debuginfo/src/macho.rs Outdated Show resolved Hide resolved
@flub flub changed the title Initial support for BCSymbolMap and PList Initial support for BCSymbolMap Mar 24, 2021
symbolic-debuginfo/src/macho.rs Outdated Show resolved Hide resolved
@flub flub marked this pull request as ready for review March 24, 2021 11:30
@flub flub requested a review from a team March 24, 2021 11:31
Comment on lines 82 to 84
/// let object_data =
/// std::fs::read("tests/fixtures/2d10c42f-591d-3265-b147-78ba0868073f.dwarf-hidden")
/// .unwrap();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This example is great! It could be confusing to see the "test/fixtures..." path in a doc comment, though, so what if we hide that line (and same for the symbolmap below)?

Suggested change
/// let object_data =
/// std::fs::read("tests/fixtures/2d10c42f-591d-3265-b147-78ba0868073f.dwarf-hidden")
/// .unwrap();
/// # let object_data =
/// # std::fs::read("tests/fixtures/2d10c42f-591d-3265-b147-78ba0868073f.dwarf-hidden")
/// # .unwrap();

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, it would make the object_data variable come out of nowhere with no idea on what its type would be. I prefer this as it's fully self contained, I don't think it's a big leap to infer that's just a path name. Plus anyone who knows doctests, which will be most people reading this, will know exactly what's going on.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could be convinced to do something ugly like:

/// let object_data = std::fs::read("some/path/object.dwarf");
/// # let object_data = <the full thing that actually works>

I guess, if you really prefer. But I don't see it as much better to be honest.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turns out, in the standard library they annotate such examples with norun and simply ensure it compiles, since there will be tests covering this anyway. Given that this is quite an odd path to read, we could do that, too?

See std::fs::File source.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to keep the example as a test-case too. I've gone with some compromise where I fake-read the data from a path like it appears in an .xcarchive in a comment while hiding where we actually load it from.

Floris Bruynooghe added 22 commits March 24, 2021 14:25
This is still copying data, so not great.
This is for object files, these are not object files.
We now actually look up the hidden symbols correctly
And fixup utf-8 errors
this test is weird
Doesn't belong here, certainly not in the public api
I kind of liked it, but whatever.
Since the iterator does not depend on the lifetime of the object we
need to clone and there's no reasonable way around.  This clones a
vector of pointers, so it is not as bad as it could be.
Floris Bruynooghe added 4 commits March 24, 2021 14:25
this simplifies the macho symbol iterator as well now which always
returns slices of the same lifetime.
This makes usage more convenient, see the huge symplification on the
MachObject symbols iterator.
I don't understand why this only surfaced in this PR, but whatever.
symbolic-debuginfo/src/bcsymbolmap.rs Outdated Show resolved Hide resolved
///
/// If the name matches the `__hidden#NNN_` pattern that indicates a [`BCSymbolMap`]
/// lookup it will be looked up the resolved name will be returned. Otherwise the name
/// is returned unchanged.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Bonus points for giving the lookup methods an illustrative # Example.

symbolic-debuginfo/src/lib.rs Outdated Show resolved Hide resolved
Comment on lines 82 to 84
/// let object_data =
/// std::fs::read("tests/fixtures/2d10c42f-591d-3265-b147-78ba0868073f.dwarf-hidden")
/// .unwrap();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turns out, in the standard library they annotate such examples with norun and simply ensure it compiles, since there will be tests covering this anyway. Given that this is quite an odd path to read, we could do that, too?

See std::fs::File source.

Copy link
Member

@Swatinem Swatinem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍🏻 This looks really straight forward so far.

Can you double check my question regarding filenames?

Comment on lines 86 to 88
if pattern.len() > bytes.len() {
pattern = &pattern[..bytes.len()];
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means that a truncated file would pass this check. Is this really what we want? Or is this a limitation that test does get called with a small slice of the file only?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, actually all you want to do is bytes.starts_with(pattern), right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the assumption is that this only gets called with a small slice. This is kind of mimicking what ObjectLike::test does, this test is sufficient to decide between "is this an object? (Archive::test)", "is this a BCSymbolMap (BCSymbolMap::test)" or "is this a PList? (PList::test)". So in a way the API assumption here is very weak and test is only useful in relation to the other test functions.
On the other hand to call ::parse(), which is the only thing that actually gives you a full answer to this question, you need to read all the data from disk. So this is useful as a quick check if it's even worth reading the entire file from disk. Again this only makes sense if a BCSymbolMap is regularly larger than the filesystem block size, but it seems that is the case.
Anyway, TL;DR is that this is symmetric with ObjectLike and makes sense when using it in some of the workflows we do.

symbolic-debuginfo/src/bcsymbolmap.rs Outdated Show resolved Hide resolved
}

#[test]
fn test_data_lifetime() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice!

Comment on lines +3655 to +3656
/Users/philipphofmann/git-repos/sentry-cocoa/Sources/SentryCrash/Recording/Monitors/SentryCrashMonitor_Signal.c
Sources/SentryCrash/Recording/Monitors/SentryCrashMonitor_Signal.c
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m a bit surprised to find (absolute) filenames here. Are those being obfuscated as well? Do we need to resolve them when we extract file/line info?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

huh, intersting. i must admit i did not check this file... Yes, I imagine that this is also obfuscated. I guess question is whether we need to de-obfuscate other things than just symbols as well I guess. We should look at this sometime.

symbolic-debuginfo/src/macho.rs Show resolved Hide resolved
Floris Bruynooghe and others added 9 commits March 29, 2021 14:34
If users want this they'll have to newtype it.
Co-authored-by: Arpad Borsos <arpad.borsos@sentry.io>
Use some indicators of the pathnames inside an xcarchive instead which
is more commonly what people might be finding themselves navigating.
Otherwise we clone the entire vector again which could be a large
number of entries, even if the entries are small.  There is no real
downside to using an Arc here as this is entirely internal.
@flub flub merged commit 51b639b into master Mar 30, 2021
@flub flub deleted the feat/bcsymbolmap branch March 30, 2021 08:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants