New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add light weight ElfFeatureExtractor #770
Conversation
capa/features/extractors/elffile.py
Outdated
args: | ||
elf (elftools.elf.elffile.ELFFile): the parsed ELFFile | ||
buf: the raw sample bytes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use type hints for this
b769dea
to
06d238a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this looks great!
@@ -344,7 +344,6 @@ def freeze_deserialize(cls, args): | |||
|
|||
class Arch(Feature): | |||
def __init__(self, value: str, description=None): | |||
assert value in VALID_ARCH |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we remove this check then we should probably add checks within the rule parser to validate what a user specifies
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed, reason I removed them here is because our arch
extraction fails for packed samples, e.g. via UPX
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -9,7 +9,7 @@ | |||
import abc | |||
from typing import Tuple, Iterator, SupportsInt | |||
|
|||
from capa.features.basicblock import Feature | |||
from capa.features.common import Feature |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is better?!
# see https://github.com/eliben/pyelftools/blob/0664de05ed2db3d39041e2d51d19622a8ef4fb0f/scripts/readelf.py#L372 | ||
symbol_tables = [(idx, s) for idx, s in enumerate(elf.iter_sections()) if isinstance(s, SymbolTableSection)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you want to be independent or is it ok to rely on elftools?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, this makes sense. and, maybe we want to migrate the OS detection to pyelftools? well, it works as is, but its an option.
@@ -344,7 +344,6 @@ def freeze_deserialize(cls, args): | |||
|
|||
class Arch(Feature): | |||
def __init__(self, value: str, description=None): | |||
assert value in VALID_ARCH |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
||
|
||
def extract_file_arch(elf, **kwargs): | ||
# TODO merge with capa.features.extractors.elf.detect_elf_arch() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there's some overlap/dup between some of the functions in capa.features.extractors.elf and what's provided by elftools
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, happy to use elftools
merging so i can get a medium-scale run against ELF files before v3 |
Add a basic ELF feature extractor similar to the pefile one in support of #699.
Checklist