Results

This folder contains the results of the experiments and the IPython notebooks to extract the different metrics and generate the plots.

Download the output data for each model we tested

Warning: the following steps will require about 13GB of free disk space.
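Before starting the download, it can help to confirm that the target filesystem has enough room. The following is a minimal sketch using only the standard library; the `has_enough_space` helper and the exact byte threshold are illustrative assumptions, not part of the repository.

```python
import shutil

# Approximate requirement taken from the warning above (about 13GB).
REQUIRED_BYTES = 13 * 1024**3


def has_enough_space(path=".", required=REQUIRED_BYTES):
    """Return True if the filesystem holding `path` has at least `required` free bytes."""
    return shutil.disk_usage(path).free >= required


if __name__ == "__main__":
    free_gib = shutil.disk_usage(".").free / 1024**3
    print(f"Free space: {free_gib:.1f} GiB")
    if not has_enough_space("."):
        print("Warning: less than about 13GB free; the download may not fit.")
```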

To download the data from Google Drive, use the gdrive_download.py Python3 script as follows:

  1. Install the Python3 virtualenv

  2. Create a new virtualenv and install the required packages

# create a new "env" environment
python3 -m venv ../env
# enter the virtual environment
source ../env/bin/activate

# Install the requirements in the current environment
pip install -r ../requirements.txt
  3. Download and unzip the data in the corresponding folders:
python3 ../gdrive_download.py --results

The data will be unzipped in the following directories:

Results/data/Dataset-1
Results/data/Dataset-1-CodeCMR
Results/data/Dataset-2
Results/data/Dataset-Vulnerability
Results/data/raw_results
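Once the download finishes, a quick check that all five directories exist can catch an interrupted or partial unzip. This is a small sketch, not part of the repository: the `missing_dirs` helper is hypothetical, and it assumes the script is run from inside the Results folder so that the data sits under ./data.

```python
from pathlib import Path

# Directory names taken from the list above.
EXPECTED_DIRS = [
    "Dataset-1",
    "Dataset-1-CodeCMR",
    "Dataset-2",
    "Dataset-Vulnerability",
    "raw_results",
]


def missing_dirs(data_root):
    """Return the expected sub-directories that are not present under `data_root`."""
    root = Path(data_root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    # Assumption: the current working directory is the Results folder.
    missing = missing_dirs("data")
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("All expected directories are present.")
```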

Process the data to extract the different metrics and generate the plots

Most of the model implementations directly return the similarity between the function pairs for each dataset we tested. The CSV files with the results are saved in the corresponding Dataset folder under the data directory.

All the CSV files use the same header:

idb_path_1,fva_1,idb_path_2,fva_2,sim
  • Each (idb_path, fva) pair acts as a "primary key" that identifies a single function: idb_path_1 and fva_1 identify the first function of a pair, idb_path_2 and fva_2 the second.
  • The sim column contains the similarity (or distance) value, computed with the metric specific to each approach.
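As an illustration of the layout described above, the sketch below reads a results CSV with the standard library, checks the header, and indexes each row by its pair of function keys. The `load_similarities` helper and the sample row are made up for this example; fva values are kept as strings because their exact formatting is not specified here.

```python
import csv
import io

EXPECTED_HEADER = ["idb_path_1", "fva_1", "idb_path_2", "fva_2", "sim"]


def load_similarities(fileobj):
    """Map ((idb_path_1, fva_1), (idb_path_2, fva_2)) to the sim value as a float."""
    reader = csv.DictReader(fileobj)
    if reader.fieldnames != EXPECTED_HEADER:
        raise ValueError(f"unexpected header: {reader.fieldnames}")
    results = {}
    for row in reader:
        left = (row["idb_path_1"], row["fva_1"])
        right = (row["idb_path_2"], row["fva_2"])
        results[(left, right)] = float(row["sim"])
    return results


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the real results.
    sample = io.StringIO(
        "idb_path_1,fva_1,idb_path_2,fva_2,sim\n"
        "a.i64,0x1000,b.i64,0x2000,0.75\n"
    )
    for (left, right), sim in load_similarities(sample).items():
        print(left, right, sim)
```

To read a real file, pass an open file handle instead of the in-memory sample, for example `open(path, newline="")`.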

However, some models require an intermediate step to convert the output to this standard form. The data/raw_results folder includes the output from Asm2vec/Doc2vec, Catalog1, CodeCMR and FunctionSimSearch.
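The raw output formats of these tools are not described here, so the following is only a sketch of what such a conversion step could look like. It assumes, purely for illustration, that a model emits one embedding vector per function and that similarity is taken as the cosine between the two vectors of a pair; the function names and the sample data are hypothetical.

```python
import csv
import io
import math

HEADER = ["idb_path_1", "fva_1", "idb_path_2", "fva_2", "sim"]


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def write_standard_csv(embeddings, pairs, out):
    """Write one standard-form row per function pair.

    `embeddings` maps (idb_path, fva) to a vector; `pairs` is a list of
    ((idb_path_1, fva_1), (idb_path_2, fva_2)) tuples. Both are assumed inputs.
    """
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(HEADER)
    for left, right in pairs:
        sim = cosine_similarity(embeddings[left], embeddings[right])
        writer.writerow([left[0], left[1], right[0], right[1], f"{sim:.6f}"])


if __name__ == "__main__":
    # Hypothetical embeddings and pairs, for demonstration only.
    emb = {("a.i64", "0x1000"): [1.0, 0.0], ("b.i64", "0x2000"): [1.0, 0.0]}
    buf = io.StringIO()
    write_standard_csv(emb, [(("a.i64", "0x1000"), ("b.i64", "0x2000"))], buf)
    print(buf.getvalue(), end="")
```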

Finally, three IPython notebooks in this folder extract the metrics for all the experiments.

The output is saved in the metrics_and_plots folder.