This is the repository for our deep-learning-based malware detection model. Please read the paper for the details of our research (https://doi.org/10.1016/j.ins.2020.05.026). Note that this source code works on fully runnable 32-bit PE files to which no preprocessing has been applied. The dataset used in our experiments consists of 1,000 malware files and 1,000 benign files. Because releasing fully working malware is extremely dangerous, we uploaded only the file names or SHA1 hash values of the binary files in the dataset.
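Since only file names and SHA1 hashes are published, anyone rebuilding the dataset has to match their own samples against that list and make sure each one is a 32-bit PE file. The following is a minimal sketch of both checks using only the Python standard library. The helper names `sha1_of_file` and `is_pe32` are hypothetical and not part of this repository, and the header check only inspects the PE headers; it does not prove that a file is fully runnable.

```python
import hashlib
import struct


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA1 digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_pe32(data):
    """Return True if `data` (bytes) carries the headers of a 32-bit x86 PE file."""
    # DOS header: starts with 'MZ' and stores the PE header offset at 0x3C.
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    # We need the 4-byte signature, the 20-byte COFF header,
    # and the first 2 bytes of the optional header.
    if len(data) < pe_offset + 26:
        return False
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        return False
    (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
    (magic,) = struct.unpack_from("<H", data, pe_offset + 24)
    # 0x014C is IMAGE_FILE_MACHINE_I386; 0x010B is the PE32 optional header magic.
    return machine == 0x014C and magic == 0x010B
```

A sample would then be kept only if `sha1_of_file(path)` appears in the published hash list and `is_pe32(open(path, "rb").read())` is true.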
- Seungho Jeon
- Jongsub Moon
Python 3.6, numpy-1.16.2, tensorflow-1.13.1, scikit-learn-0.20.3, capstone-4.0.1, pefile-2019.4.18
This is the core source code for our research. You only need main.py to run our deep-learning-based malware detection model. By changing some of the parameters in main.py, you can train or test our model.
Below is part of main.py:
    ...
    for strategy in ['noguided', 'random', 'prob_f', 'prob_b']:
        DetectionModel.run(n_sampled_path,
                           'train',
                           '.\\indexed-paths',  # data root
                           strategy,
                           ...
Pass the string 'train' as the second parameter of DetectionModel.run to train the DRNN, 'train_lc' to train the linear classifier that is applied after the DRNN, or 'test' to test the performance of our model.
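As a rough illustration of how the three modes combine with the four sampling strategies, the sketch below enumerates every (strategy, mode) pair. It does not call DetectionModel.run, because the remaining arguments are elided in the excerpt from main.py. The helper name `run_plan` is hypothetical, and the order train, then train_lc, then test is an assumption based on the linear classifier being applied after the DRNN.

```python
from itertools import product

# Strategy names as they appear in the main.py excerpt.
STRATEGIES = ['noguided', 'random', 'prob_f', 'prob_b']
# Values accepted as the second parameter of DetectionModel.run.
# The order here is an assumption: DRNN first, then the linear classifier, then testing.
MODES = ['train', 'train_lc', 'test']


def run_plan(strategies=STRATEGIES, modes=MODES):
    """Yield (strategy, mode) pairs, finishing all modes of one strategy before the next."""
    for strategy, mode in product(strategies, modes):
        yield strategy, mode


for strategy, mode in run_plan():
    # In main.py, `mode` would go in the second parameter of DetectionModel.run
    # and `strategy` in the position shown in the excerpt.
    print(strategy, mode)
```

If you use the uploaded trained parameters, passing `modes=['test']` to `run_plan` leaves only the evaluation runs.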
For your convenience, we uploaded the trained model parameters, so you can skip the training part :)