add light weight ElfFeatureExtractor #770

Merged
merged 4 commits into from Sep 13, 2021

Collaborator

@mr-tz commented Sep 10, 2021

Add a basic ELF feature extractor similar to the pefile one in support of #699.

Checklist

  • No CHANGELOG update needed
  • No new tests needed
  • No documentation update needed

Comment on lines 84 to 86
    args:
      elf (elftools.elf.elffile.ELFFile): the parsed ELFFile
      buf: the raw sample bytes
Collaborator

use type hints for this
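
A minimal sketch of what this suggestion could look like. The function under review is not shown in this excerpt, so `extract_file_format`, its return type, and the yielded feature name are hypothetical; only the `elf` and `buf` parameters come from the docstring above. The `ELFFile` import is guarded so the sketch runs even without pyelftools installed.

```python
from typing import TYPE_CHECKING, Iterator, Tuple

if TYPE_CHECKING:
    # imported for type checkers only; the annotation below is a string,
    # so pyelftools is not needed at runtime for this sketch
    from elftools.elf.elffile import ELFFile

ELF_MAGIC = b"\x7fELF"


def extract_file_format(elf: "ELFFile", buf: bytes) -> Iterator[Tuple[str, int]]:
    """
    hypothetical extractor: yield a (feature name, offset) pair for ELF samples.

    the parameter types live in the signature, so the docstring
    no longer needs an `args:` section that spells them out.
    """
    if buf.startswith(ELF_MAGIC):
        yield "format: elf", 0x0
```

With the types in the signature, tools such as mypy can check call sites, and the docstring only has to describe behavior.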

Collaborator

@williballenthin left a comment

this looks great!

@@ -344,7 +344,6 @@ def freeze_deserialize(cls, args):

 class Arch(Feature):
     def __init__(self, value: str, description=None):
-        assert value in VALID_ARCH
Collaborator

if we remove this check then we should probably add checks within the rule parser to validate what a user specifies

Collaborator Author

agreed, the reason I removed them here is that our arch extraction fails for packed samples, e.g. those packed with UPX
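
A rough sketch of the idea discussed in this thread: validate the value where the rule is parsed instead of asserting inside the feature constructor. Everything here is an assumption for illustration: the contents of `VALID_ARCH`, the `InvalidRuleError` type, and the `parse_arch_statement` helper are hypothetical and are not taken from the PR.

```python
VALID_ARCH = ("i386", "amd64", "any")  # assumed values, for illustration only


class InvalidRuleError(ValueError):
    """hypothetical error raised when a rule specifies an unsupported value."""


def parse_arch_statement(value: str) -> str:
    """
    validate the `arch` value written by a rule author.

    extracted features are not checked here, so an extractor can still
    report an arch outside VALID_ARCH (e.g. for a packed sample)
    without tripping an assertion.
    """
    if value not in VALID_ARCH:
        raise InvalidRuleError(
            f"unexpected arch: {value!r}, expected one of {VALID_ARCH}"
        )
    return value
```

This keeps strict checking for what a user types into a rule while letting extraction stay permissive.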

@@ -9,7 +9,7 @@
 import abc
 from typing import Tuple, Iterator, SupportsInt

-from capa.features.basicblock import Feature
+from capa.features.common import Feature
Collaborator Author

I think this is better?!

Comment on lines +23 to +24
# see https://github.com/eliben/pyelftools/blob/0664de05ed2db3d39041e2d51d19622a8ef4fb0f/scripts/readelf.py#L372
symbol_tables = [(idx, s) for idx, s in enumerate(elf.iter_sections()) if isinstance(s, SymbolTableSection)]
Collaborator Author

Did you want to be independent or is it ok to rely on elftools?

Collaborator

yeah, this makes sense. and maybe we want to migrate the OS detection to pyelftools? well, it works as is, but it's an option.

def extract_file_arch(elf, **kwargs):
    # TODO merge with capa.features.extractors.elf.detect_elf_arch()
Collaborator Author

there's some overlap/duplication between some of the functions in capa.features.extractors.elf and what's provided by elftools

Collaborator

yeah, happy to use elftools
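
For context on the overlap mentioned in this thread, here is a stdlib-only sketch of reading the architecture straight from the ELF header. It is neither capa's `detect_elf_arch()` nor pyelftools code; `detect_arch_from_header` and the arch strings in the mapping are hypothetical. The field offsets and the `EM_386`/`EM_X86_64` constants follow the ELF specification.

```python
import struct
from typing import Optional

# e_machine values from the ELF specification
EM_386 = 3
EM_X86_64 = 62

# the arch strings on the right are assumed names, for illustration only
ARCH_BY_MACHINE = {EM_386: "i386", EM_X86_64: "amd64"}


def detect_arch_from_header(buf: bytes) -> Optional[str]:
    """hypothetical helper: map e_machine to an arch string, or None if unknown."""
    if len(buf) < 20 or not buf.startswith(b"\x7fELF"):
        return None

    # e_ident[EI_DATA] at offset 5: 1 = little endian, 2 = big endian
    ei_data = buf[5]
    if ei_data == 1:
        endian = "<"
    elif ei_data == 2:
        endian = ">"
    else:
        return None

    # e_machine is a 16-bit field at offset 18,
    # right after e_ident (16 bytes) and e_type (2 bytes)
    (e_machine,) = struct.unpack_from(endian + "H", buf, 18)
    return ARCH_BY_MACHINE.get(e_machine)
```

Returning `None` for anything unrecognized, rather than raising, fits the earlier point that extraction should not fail hard on unusual or packed samples.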

@williballenthin
Collaborator

merging so i can get a medium-scale run against ELF files before v3

@williballenthin merged commit 297d9aa into master on Sep 13, 2021
@williballenthin deleted the elffile-extractor branch on September 13, 2021 19:27