This is a project I did during my 2019 internship at GSI Technology. This repo provides an implementation of the Gemini network for binary code similarity detection, as described in this paper.
Unzip the data by running:
unzip data.zip
The network is written in Python 2.7 using TensorFlow 1.4. You can install the dependencies by running:
pip install -r requirements.txt
The model is implemented in graphnnSiamese.py.
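As a rough illustration of what a Siamese setup like Gemini computes when it compares two functions, the sketch below scores a pair of embeddings with cosine similarity. It is not taken from graphnnSiamese.py: the function name cosine_similarity and the example vectors are made up for illustration, and in the real model the embeddings come from a graph neural network written in TensorFlow.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical embeddings of three binary functions (values are made up).
emb_a = [1.0, 2.0, 3.0]
emb_b = [2.0, 4.0, 6.0]      # points in the same direction as emb_a
emb_c = [-1.0, -2.0, -3.0]   # points in the opposite direction

print(cosine_similarity(emb_a, emb_b))  # close to 1: treated as similar
print(cosine_similarity(emb_a, emb_c))  # close to -1: treated as dissimilar
```

A score near 1 means the two embeddings point in nearly the same direction, and a score near -1 means they point in opposite directions.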
Run the following command to train the model:
python train.py
To list the optional arguments, run python train.py -h.
After training, run the following command to evaluate the model:
python eval.py
To list the optional arguments, run python eval.py -h.
The graphEmbeddings notebook documents my attempt to visualize the embeddings with t-SNE in the TensorFlow Projector. The notebook uses the model that I trained, which is included in the repo.
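The standalone TensorFlow Projector can load embeddings from tab-separated files: one file of vectors and an optional file of labels. The sketch below writes both using only the standard library. The function name write_projector_tsv, the file names, and the sample data are placeholders, and the notebook may export its embeddings in a different way.

```python
import os
import tempfile


def write_projector_tsv(embeddings, labels, out_dir):
    """Write one vector per line and one label per line as TSV files."""
    if len(embeddings) != len(labels):
        raise ValueError("need exactly one label per embedding")
    vec_path = os.path.join(out_dir, "vectors.tsv")
    meta_path = os.path.join(out_dir, "metadata.tsv")
    with open(vec_path, "w") as f:
        for vec in embeddings:
            # Components of one embedding are separated by tabs.
            f.write("\t".join(str(x) for x in vec) + "\n")
    with open(meta_path, "w") as f:
        for label in labels:
            # A single-column metadata file has no header row.
            f.write(label + "\n")
    return vec_path, meta_path


# Made-up embeddings and labels, written to a temporary directory.
embeddings = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
labels = ["func_a", "func_b"]
out_dir = tempfile.mkdtemp()
vec_path, meta_path = write_projector_tsv(embeddings, labels, out_dir)
print(vec_path)
print(meta_path)
```

The two files can then be loaded in the Projector, where t-SNE is one of the available projection methods.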