static analysis and feature extraction of Portable Executable files

patrickarmengol/pegreet

pegreet

Greet your malware samples before you tear them apart.

pegreet performs static analysis and feature extraction on Portable Executable (PE) files. As a CLI app, it helps with the first steps of malware analysis and reverse engineering. As a library, it can extract information from samples in bulk for exploratory data analysis or for building malware classification models.


Features

Implemented

  • dump general file information
  • compute hashes (MD5, SHA1, SHA256, Imphash, SSDEEP)
  • calculate entropy
  • detect packers via PEiD signatures
  • dump info from headers
  • dump info from sections
  • dump imports and exports
  • annotate suspicious Windows API functions
  • display file parsing warnings
  • disassemble code from entry point
  • find strings
  • categorize strings
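
Two of the features above, file hashing and entropy calculation, can be illustrated with the standard library alone. The following is a minimal sketch and not pegreet's actual implementation; the function names `file_hashes` and `shannon_entropy` are hypothetical. Imphash and SSDEEP are left out because they need PE parsing and a third-party library, respectively.

```python
import hashlib
import math
from collections import Counter


def file_hashes(data: bytes) -> dict[str, str]:
    """Return hex digests of the raw file bytes (hypothetical helper)."""
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha1": hashlib.sha1(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte: 0.0 for constant data, 8.0 for uniform."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # iterating over bytes yields integer byte values
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

Entropy close to 8 bits per byte is commonly treated as a hint that data is compressed or encrypted, which is why it is a useful first-pass indicator for packed samples.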

To Do

  • recognize known malicious section names
  • annotate suspicious entropy and size mismatches
  • extract resources
  • lookup on VirusTotal
  • lookup for public sandbox reports
  • check file against YARA rules
  • check digital signature
  • sort strings with StringSifter
  • extract obfuscated strings with FLOSS
  • custom output (csv, json, markdown)

Installation

as a module

python -m venv .venv
source .venv/bin/activate  # on Windows: .venv\Scripts\activate
pip install pegreet

as a cli app

pipx install pegreet

Usage

as a module

from pathlib import Path
import pegreet

pe = pegreet.load(Path('data/samples/petya.exe'))

info_data = pegreet.info(pe)
print(info_data)
print(pegreet.pretty_info(info_data))

strings_data = pegreet.find_strings(pe)
print(strings_data)
print(pegreet.pretty_strings(strings_data))

print(pegreet.disasm(pe, num_lines=40))
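
For bulk extraction, the same `pegreet.load` and `pegreet.info` calls can be wrapped in a loop over a sample directory. The sketch below is one possible way to do that, under some assumptions: the helper names (`find_samples`, `collect_info`, `extract_from_directory`) are hypothetical, the return value of `pegreet.info` is treated as an opaque object, and any exception raised while parsing a sample is recorded rather than re-raised.

```python
from pathlib import Path
from typing import Any, Callable, Iterable


def find_samples(root: Path, suffixes: Iterable[str] = (".exe", ".dll")) -> list[Path]:
    """Recursively collect files under root whose suffix matches (case-insensitive)."""
    wanted = {s.lower() for s in suffixes}
    return sorted(p for p in root.rglob("*") if p.is_file() and p.suffix.lower() in wanted)


def collect_info(
    paths: Iterable[Path],
    load: Callable[[Path], Any],
    info: Callable[[Any], Any],
) -> tuple[dict[Path, Any], dict[Path, Exception]]:
    """Apply load() then info() to each path, keeping failures separately."""
    results: dict[Path, Any] = {}
    errors: dict[Path, Exception] = {}
    for path in paths:
        try:
            results[path] = info(load(path))
        except Exception as exc:  # malformed samples are common; keep going
            errors[path] = exc
    return results, errors


def extract_from_directory(root: Path):
    """Run pegreet over every sample found under root (hypothetical wrapper)."""
    import pegreet  # imported lazily so the helpers above work without it

    return collect_info(find_samples(root), pegreet.load, pegreet.info)
```

Calling `extract_from_directory(Path('data/samples'))` would then return the parsed results and a separate mapping of paths that failed to parse.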

as a cli app

$ pegreet --help

 Usage: pegreet [OPTIONS] COMMAND [ARGS]...

╭─ Options ───────────────────────────────────────────╮
│ --help                        Show this message and │
│                               exit.                 │
╰─────────────────────────────────────────────────────╯
╭─ Commands ──────────────────────────────────────────╮
│ disassemble  disassemble a specified number         │
│              instructions from entry point          │
│ info         print useful info                      │
│ strings      print strings                          │
╰─────────────────────────────────────────────────────╯


$ pegreet info data/samples/petya.exe
...


$ pegreet strings --show-uncategorized data/samples/petya.exe
...


$ pegreet disassemble data/samples/petya.exe 40
...
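
To give a feel for what finding and categorizing strings involves, here is a minimal standard-library sketch. It is not pegreet's implementation: the function names, the minimum length of 4, and the three category patterns are all assumptions chosen for illustration, and only ASCII strings are handled.

```python
import re


def find_ascii_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of at least min_len printable ASCII characters."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


# Hypothetical categories; a real tool would likely use many more patterns.
CATEGORY_PATTERNS = {
    "url": re.compile(r"https?://\S+", re.IGNORECASE),
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "registry": re.compile(r"\b(?:HKEY_[A-Z_]+|HKLM|HKCU)\\"),
}


def categorize(strings: list[str]) -> dict[str, list[str]]:
    """Group strings by every pattern they match; the rest are 'uncategorized'."""
    result: dict[str, list[str]] = {name: [] for name in CATEGORY_PATTERNS}
    result["uncategorized"] = []
    for s in strings:
        matched = [name for name, pat in CATEGORY_PATTERNS.items() if pat.search(s)]
        for name in matched:
            result[name].append(s)
        if not matched:
            result["uncategorized"].append(s)
    return result
```

Keeping an explicit "uncategorized" bucket mirrors the idea behind the `--show-uncategorized` flag: most strings in a binary match no pattern, so they are kept but separated from the interesting ones.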

Notes

I started this project in 2020 in an attempt to learn about PE files and feature extraction for use in malware data science.

There are many other (better) tools that implement similar functionality (see below). With pegreet, I tried to focus only on the features that are useful for malware analysis, so that the output is easier to digest. pegreet also annotates suspicious indicators that can serve as jumping-off points for an investigation.

PE parsing relies heavily on the pefile library. I would like to explore using the LIEF project instead, since it supports multiple executable formats and was used in the EMBER dataset. Maybe I'll follow this project up with an 'ELFgreet'.

Resources

Similar Tools

PE file info

License

pegreet is distributed under the terms of any of the following licenses:

  • Apache-2.0 (LICENSE-APACHE)
  • MIT (LICENSE-MIT)