
MohsenEbadpour/Windows-malware-detection-based-on-dynamic-behaviors-APIs-call-using-Multilayer-Perceptron-MLP


Windows malware detection based on dynamic behaviors (API call similarity) using a Multilayer Perceptron (MLP)

This project trains an MLP prediction model to detect Windows malware based on API call counts.

Description

We collected about 12,000 Windows executable files from various public sources, by web scraping and by combining existing datasets. About 79% of them are malware.
We then ran the Cuckoo sandbox locally to collect a dynamic behavior report for each file. After organizing the reports, we selected the API call counts as the feature vector fed to the network. Based on this decision, we built a CSV dataset and an SQLite database. The database has 3 tables:

  1. "APIs": list of the APIs seen across all of our reports (311 APIs, which is the length of the feature vector).
  2. "Reports": list of reports with their MD5 hash and VirusTotal rank ("positive" column).
  3. "APIs_Reports": a many-to-many relationship between the two tables above, plus a "repetition" column that records how many times the given API was called in the given report.
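
As a rough illustration of how the three tables could be turned into per-report feature vectors, here is a minimal sketch using Python's built-in sqlite3 module. Only the table names and the "positive" and "repetition" columns come from the description above; the key columns (`id`, `api_id`, `report_id`), the `name` and `md5` column names, and the sample rows are assumptions made for the example, not the project's actual schema or data.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with an ASSUMED schema and made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE APIs (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE Reports (id INTEGER PRIMARY KEY, md5 TEXT, positive INTEGER);
        CREATE TABLE APIs_Reports (
            api_id INTEGER REFERENCES APIs(id),
            report_id INTEGER REFERENCES Reports(id),
            repetition INTEGER,
            PRIMARY KEY (api_id, report_id)
        );
        """
    )
    # Illustrative rows only; the real database holds 311 APIs.
    conn.executemany(
        "INSERT INTO APIs (id, name) VALUES (?, ?)",
        [(1, "NtCreateFile"), (2, "RegOpenKeyExW"), (3, "NtClose")],
    )
    conn.executemany(
        "INSERT INTO Reports (id, md5, positive) VALUES (?, ?, ?)",
        [(1, "md5-placeholder-a", 25), (2, "md5-placeholder-b", 0)],
    )
    conn.executemany(
        "INSERT INTO APIs_Reports (api_id, report_id, repetition) VALUES (?, ?, ?)",
        [(1, 1, 4), (3, 1, 7), (2, 2, 1)],
    )
    return conn


def feature_vectors(conn):
    """Return {report_id: [call count per API]}, with APIs ordered by id."""
    api_ids = [row[0] for row in conn.execute("SELECT id FROM APIs ORDER BY id")]
    position = {api_id: i for i, api_id in enumerate(api_ids)}
    report_ids = [row[0] for row in conn.execute("SELECT id FROM Reports ORDER BY id")]
    # APIs that a report never called keep a count of 0.
    vectors = {report_id: [0] * len(api_ids) for report_id in report_ids}
    rows = conn.execute("SELECT report_id, api_id, repetition FROM APIs_Reports")
    for report_id, api_id, repetition in rows:
        vectors[report_id][position[api_id]] = repetition
    return vectors


if __name__ == "__main__":
    print(feature_vectors(build_demo_db()))
```

With the real database, each vector would have 311 entries, one per row of "APIs".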

The CSV dataset was generated from this database. The "OUTPUT" column is the label, indicating whether the given file is malware. Files with a VirusTotal rank of 10 or higher are labeled 1, and files with a VirusTotal rank of 0 are labeled 0.
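
The labeling rule can be sketched as a small Python function. The function name `label_from_rank` and its parameter are hypothetical. The description does not say what happens to files with a rank between 1 and 9, so returning `None` for them is an assumption of this sketch.

```python
def label_from_rank(positive, malware_threshold=10):
    """Map a VirusTotal rank (the "positive" column) to an OUTPUT label."""
    if positive >= malware_threshold:
        return 1  # rank of 10 or higher: malware
    if positive == 0:
        return 0  # rank of 0: not malware
    # ASSUMPTION: ranks 1..9 are not covered by the stated rule,
    # so this sketch leaves them unlabeled.
    return None


if __name__ == "__main__":
    for rank in (0, 5, 10, 42):
        print(rank, label_from_rank(rank))
```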

Result of the trained model

[Figure: accuracy result (Accuracy.png)]

[Figure: loss result]

Confusion matrix on unseen test data (20% of the dataset):

[Figure: confusion matrix on unseen test data]
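
For readers who want to compute a binary confusion matrix from their own predictions, a minimal plain-Python sketch follows. The function names are hypothetical, and the labels in the usage block are toy values chosen for illustration; they are not the project's results.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels (1 = malware, 0 = not malware)."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            counts["tp"] += 1
        elif truth == 0 and pred == 1:
            counts["fp"] += 1
        elif truth == 0 and pred == 0:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    return counts


def accuracy(counts):
    """Fraction of predictions that match the true label."""
    total = sum(counts.values())
    return (counts["tp"] + counts["tn"]) / total if total else 0.0


if __name__ == "__main__":
    # Toy labels for illustration only.
    y_true = [1, 1, 0, 0, 1]
    y_pred = [1, 0, 0, 1, 1]
    counts = confusion_counts(y_true, y_pred)
    print(counts, accuracy(counts))
```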

Contributor

Reports

For the reports that the Cuckoo sandbox generated for the executable files (78~93 GB), please contact mohsenebadpour@outlook.com
