Split binary format parsing into a shared crate #59

Closed
Shnatsel opened this issue Jul 30, 2020 · 6 comments

Comments

@Shnatsel
Contributor

I'm currently working on a project that requires very basic parsing of the ELF, Mach-O and PE formats - essentially enumerating sections and extracting the data from a section with a specific name. libgoblin seems to be overkill for this task, and I'd like to avoid unsafe code, so the parsers in cargo-bloat seem to fit the bill nicely.

Would you be interested in splitting the parsers into a shared crate? I'll harden the crate against untrusted input (panics, OOMs, etc), but that is probably not a concern for cargo-bloat. Perhaps wider testing of the shared code would be beneficial.

@RazrFalcon
Owner

I'm not sure I have time to do this, especially to support it later. But if you like, you can simply copy the current code into your library and I will happily switch to it, as long as it doesn't have many dependencies (proc-macros in particular).

@Shnatsel
Contributor Author

Great! I will have to audit and support some code anyway, so I don't mind doing the work. I'll let you know once I have something usable.

@Shnatsel
Contributor Author

I've split the parsers into a shared library https://github.com/Shnatsel/binfarce, made the ELF parsers more general and hardened them against untrusted input so that they don't over-allocate or panic.

The changes this entails for cargo-bloat can be seen here: https://github.com/RazrFalcon/cargo-bloat/compare/master...Shnatsel:shared-library?expand=1 (it still has a bit of file format detection code, which I'm going to drop as well)

I'm open to feedback. If you don't have any objections, I'll give the Mach-O and PE parsers a similar treatment and open a pull request against cargo-bloat. I'm happy to add you as an owner on the git repo and on crates.io.
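
To illustrate what "don't over-allocate or panic" can look like in practice, here is a minimal sketch. It is not code from binfarce or cargo-bloat; the helper names `read_u32_le` and `slice_at` are made up for this example. The idea is that every offset and length taken from the file is checked with `checked_add` and `slice::get`, so truncated or malicious input yields `None` rather than an out-of-bounds panic, and data is borrowed from the input instead of being copied into a buffer sized by an untrusted field.

```rust
/// Hypothetical helper (not part of binfarce): read a little-endian u32 at
/// `offset`, returning `None` instead of panicking on truncated input.
fn read_u32_le(data: &[u8], offset: usize) -> Option<u32> {
    // `checked_add` guards against integer overflow of the end offset.
    let end = offset.checked_add(4)?;
    // `get` returns `None` when the range is out of bounds.
    let b = data.get(offset..end)?;
    // `b` is exactly 4 bytes long here, so the indexing cannot panic.
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

/// Hypothetical helper: borrow `len` bytes starting at `offset` without
/// allocating, so an attacker-controlled length cannot trigger a huge
/// allocation.
fn slice_at(data: &[u8], offset: usize, len: usize) -> Option<&[u8]> {
    let end = offset.checked_add(len)?;
    data.get(offset..end)
}
```

A caller would then propagate `None` (or convert it into an error) all the way up instead of unwrapping.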

@RazrFalcon
Owner

Yes, looking good.

@Shnatsel
Contributor Author

Mach-O is now converted too. I know section extraction works, but I couldn't test the full cargo-bloat workflow because I don't have a Mac.

@Shnatsel
Contributor Author

Shnatsel commented Sep 5, 2020

The code looks pretty much complete to me now. I'll publish on crates.io and send a PR to cargo-bloat soon-ish.

I've refactored all 4 parsers, switched section extraction from allocations to iterators and callbacks, and fuzzed all 4 parsers. Removing allocations didn't affect the APIs used by cargo-bloat at all (symbol extraction still allocates).
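
For readers unfamiliar with the pattern, the following is a rough sketch of what "iterators and callbacks" instead of allocations can mean. It does not reflect the actual binfarce API; the `Entry` layout (two little-endian `u32` fields), the `Entries` iterator and `for_each_payload` are all assumptions invented for this example. The iterator decodes one table entry at a time from a borrowed slice, and the callback variant passes each in-bounds payload to the caller without building a `Vec` first.

```rust
/// Hypothetical fixed-size table entry: two little-endian u32 fields.
#[derive(Debug, PartialEq, Clone, Copy)]
struct Entry {
    offset: u32,
    size: u32,
}

/// Lazily decodes entries from a borrowed table; nothing is allocated.
struct Entries<'a> {
    data: &'a [u8],
}

impl<'a> Iterator for Entries<'a> {
    type Item = Entry;

    fn next(&mut self) -> Option<Entry> {
        // Stop when fewer than 8 bytes remain; a trailing partial entry is ignored.
        if self.data.len() < 8 {
            return None;
        }
        let (head, rest) = self.data.split_at(8);
        self.data = rest;
        Some(Entry {
            offset: u32::from_le_bytes([head[0], head[1], head[2], head[3]]),
            size: u32::from_le_bytes([head[4], head[5], head[6], head[7]]),
        })
    }
}

/// Callback style: pass the bytes each entry points at to `f`,
/// silently skipping entries whose range falls outside `file`.
fn for_each_payload<'a>(file: &'a [u8], table: &[u8], mut f: impl FnMut(&'a [u8])) {
    let entries = Entries { data: table };
    for e in entries {
        let start = e.offset as usize;
        if let Some(end) = start.checked_add(e.size as usize) {
            if let Some(bytes) = file.get(start..end) {
                f(bytes);
            }
        }
    }
}
```

A caller that does want owned data can still call `.collect()` on the iterator, which keeps the allocation decision on the caller's side.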


2 participants