Skip to content

marcusbotacin/ELF.Classifier

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

9 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Forseti

Forseti is a prototype for binary classification (malware/goodware) targeting ELF/Linux binaries.

Authors

Forseti was authored by Lucas Galante under supervision of Marcus Botacin, André Grégio and Paulo de Geus.

Motivation

Forseti is inspired by the lack of didactic, academic tools for exploring Linux binaries.

Repository Organization

The repository is organized as follows:

  • code: Contains Forseti's Python scripts.
  • data: Contains Forseti's configuration files, databases and configuration files' samples.
  • paper: Contains a copy of our white-paper.
  • tests: Contains Forseti's test-case files.

Papers

Forseti is supported by a series of developments and described in multiple papers:

  • Forseti feature extraction mechanisms are described in the course Introdução à Engenharia Reversa de Aplicações Maliciosas em Ambientes Linux, published in the XIX SBSEG. Check Here
  • Forseti feature extraction capabilities were used to describe the landscape of Linux malware presented in the paper Malicious Linux Binaries: A Landscape, published in the XVIII SBSEG Check Here
  • Forseti classification capabilities are described in the paper Forseti: Extração de características e classificação de binários ELF, published in the XIX SBSEG Check Here
  • Forseti evaluation was presented in the paper Machine Learning for Malware Detection: Beyond Accuracy Rates, published in the XIX SBSEG Check Here

Installation

Install the following dependencies to run Forseti:

pip install pyelftools
pip install pickle
pip install configparser
pip install sklearn

Execution

Forseti can be trained by providing it with a list of goodware and malware files:

python main.py -g goodware.txt -m malware.txt

The list should look like:

$> cat data/malware.txt 
> tests/static_malicious.bin
> tests/upx.bin
> tests/fork.bin

Alternatively, one can also provide a list of suspicious files to be classified:

python main.py -g goodware.txt -m malware.txt -s suspicious.txt

Or, automate everything using our script:

./run-forsite.sh

Usage Example

If you start Forseti using the previously presented parameters, you should see a screen like this:

Forseti Execution Example

Notice that Forseti: (i) initially displays the feature vectors for all considered binaries; (ii) display all parameters used for the selected classifier; and (iii) finally starts displaying the classification metrics for each folding step.

Forseti Internals

If you want to take a look on how Forseti extracts features, you might want to look:

  • static.py: Static feature extraction.
  • dynamic.py: Dynamic feature extraction.

If you want to take a look on Forseti training, you might want to look:

  • kfold.py: Folding training implementation.
  • ml.py: Classifiers implementation.

More specifically, you might want to change the implemented classifiers by implementing a new class that inherits the MachineLearning class, as Forseti does to implement its classifiers. The currently implemented classifiers are:

class RandomForest(MachineLearing)
class Svm(MachineLearing)
class MLP(MachineLearing)

About

Additional Material for the FORSETI paper

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published