stabilize ELF analysis and prevent crashes/hangs #2914
akshat4703 wants to merge 4 commits into mandiant:master
Please add bug fixes, new features, breaking changes and anything else you think is worthwhile mentioning to the master (unreleased) section of CHANGELOG.md. If no CHANGELOG update is needed add the following to the PR description: [x] No CHANGELOG update needed
Summary of Changes
Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request significantly enhances the stability and robustness of ELF file analysis within capa. It addresses several long-standing issues that led to crashes, hangs, or excessive resource consumption when processing ELF binaries, particularly those with unsupported architectures or large symbol tables. By implementing graceful error handling, optimizing vivisect workspace loading, and introducing analysis timeouts and scope limits, the changes ensure a more reliable and predictable analysis experience for ELF files.
CHANGELOG updated or no update needed, thanks! 😄
Code Review
This pull request significantly improves the stability of ELF file analysis by introducing timeouts, gracefully handling unsupported architectures, and implementing workarounds for performance issues in vivisect. The changes are logical and well-supported by new tests. My review includes a few suggestions to address code duplication and improve readability, which will enhance the long-term maintainability of this new functionality.
@akshat4703 in #2913 I asked you to discuss ideas with the maintainers before opening PRs. Less than 12h later you've opened this PR with multiple unexpected changes. We are already spread thin, and these PRs make it even harder to maintain capa. Please discuss ideas and proposals with us first; otherwise, we will close PRs without discussion.
I am sorry for being so naive, understood. Since #2780 already exists for the stability issues, would you prefer that I post the proposed approaches and discussion there, open a separate issue to outline the ideas before any PR, or open a discussion? Happy to follow whichever workflow you prefer.
Summary
This PR improves the stability of ELF analysis in capa and resolves several
issues reported in #2780 where ELF binaries could cause crashes or hangs.
Problems addressed
exceptions during analysis.
Changes
Clean handling of unsupported architectures
When encountering unsupported ELF architectures, capa now exits gracefully
with E_INVALID_FILE_ARCH instead of raising a fatal exception.
Prevent viv loader stalls
Adjusted ELF workspace loading to avoid cases where vivisect section-symbol
parsing causes the loader to get stuck.
Disable problematic viv modules for ELF
Viv modules known to trigger unstable behavior during ELF analysis are
disabled for ELF workspaces.
Bound analysis scope for large ELF binaries
Introduced a safety bound for viv function analysis:
CAPA_ELF_MAX_FUNCTIONS (default: 1000)
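As a rough illustration only (this is not the PR's diff), a bound like this might be applied as sketched below. It assumes CAPA_ELF_MAX_FUNCTIONS is read as an environment variable, which the description does not state explicitly; the helper names `get_max_functions` and `bound_functions` are hypothetical.

```python
import os
from typing import Iterable, List, Mapping, Optional

DEFAULT_MAX_FUNCTIONS = 1000  # default value stated in the PR description


def get_max_functions(env: Optional[Mapping[str, str]] = None) -> int:
    """Read the bound from the environment (an assumption), else use the default."""
    env = os.environ if env is None else env
    raw = env.get("CAPA_ELF_MAX_FUNCTIONS", "")
    try:
        value = int(raw)
    except ValueError:
        # unset or non-numeric: fall back to the documented default
        return DEFAULT_MAX_FUNCTIONS
    # treat zero or negative values as invalid instead of disabling analysis
    return value if value > 0 else DEFAULT_MAX_FUNCTIONS


def bound_functions(function_addresses: Iterable[int], limit: int) -> List[int]:
    """Keep the `limit` lowest addresses so the selection is deterministic."""
    return sorted(function_addresses)[:limit]
```

Sorting before truncating is a design choice of this sketch: it makes repeated runs analyze the same subset. Any bound of this kind means functions beyond the limit are not analyzed, so results for very large binaries may be incomplete.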
This prevents excessive analysis in very large ELF binaries and avoids
hangs similar to those observed with /usr/bin/gimp.
Testing
Added tests:
tests/test_loader_segfault.py
Results:
pytest tests/test_loader_segfault.py -q → 6 tests passed
Manual verification:
capa /bin/ls -> RC 0
capa /usr/bin/gimp -> RC 0 (completes successfully)
capa --debug /usr/bin/gimp -> RC 0
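The manual checks above could be scripted so that a hang shows up as a timeout rather than a stuck terminal. The following is a minimal sketch, not part of the PR: it assumes `capa` is on PATH, the 600 second budget is arbitrary, and `run_with_timeout` and `report` are hypothetical helper names.

```python
import subprocess
import sys
from typing import List, Optional, Sequence, TextIO


def run_with_timeout(cmd: List[str], timeout_s: float) -> Optional[int]:
    """Run cmd; return its exit code, or None if it exceeded timeout_s."""
    try:
        completed = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before re-raising
        return None
    return completed.returncode


def report(paths: Sequence[str], timeout_s: float = 600.0, out: TextIO = sys.stdout) -> None:
    """Run capa on each path and print its return code or a timeout marker."""
    for path in paths:
        rc = run_with_timeout(["capa", path], timeout_s)
        status = "timed out" if rc is None else f"RC {rc}"
        print(f"capa {path} -> {status}", file=out)
```

Calling `report(["/bin/ls", "/usr/bin/gimp"])` would repeat the first two checks listed above, with a timeout reported separately from a nonzero return code.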
Additionally verified behavior on an aarch64 sample (yara-x):
E_INVALID_FILE_ARCH
Files changed
capa/loader.py
capa/main.py
tests/test_loader_segfault.py
Result
ELF analysis is now more robust and avoids: