CommissarSilver/CVT: Replication package for the paper "Assessing the Security of GitHub Copilot’s Generated Code - A Targeted Replication Study"

This repository contains the code and results needed to replicate the findings of our paper "Assessing the Security of GitHub Copilot’s Generated Code - A Targeted Replication Study".

A collaboration between researchers from Polytechnique Montreal (Canada) and Massey University (New Zealand)

Authors: Vahid Majdinasab, Michael Joshua Bishop, Arghavan Moradidakhel, Shawn Rasheed, Amjed Tahir, Foutse Khomh

The study aims to replicate the findings of Pearce et al. (2022), "Asleep at the Keyboard? Assessing the Security of GitHub Copilot’s Code Contributions".

Structure of the repository

The repository is structured as follows:

CWE_replication contains the results of the CWE replication study.
CodeQL_results contains the results of the CodeQL analysis of the files inside CWE_replication.
The repository root contains the code for generating the results with Copilot and for comparing them.

CWE_replication structure tree

├── CWE_replication
│   ├── Category of the CWE (e.g. cwe-20)
│   │   ├── Sub-scenarios for the CWE (e.g. codeql-eg-IncompleteHostnameRegExp)
│   │   │   ├── copilot_raw (Contains the codes generated by Copilot)
│   │   │   ├── unique_solutions (Contains all the unique solutions extracted from copilot_raw)
│   │   │   │   ├── comparison_results.csv (Raw results of the similarity comparison between the solutions)
│   │   │   ├── scenario_code_ql_results.csv (Results of CodeQL analysis on the solutions)
│   │   │   ├── scenario.py (Original CWE scenario)
...
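
The layout above can be explored programmatically. The following is a minimal sketch, not part of the repository: the helper name list_scenarios is hypothetical, and it assumes only what the tree shows, namely that each directory directly under CWE_replication is a CWE category and each directory under a category is a sub-scenario.

```python
from pathlib import Path
from typing import Dict, List


def list_scenarios(root: Path) -> Dict[str, List[str]]:
    """Map each CWE category directory to the names of its sub-scenario directories.

    Hypothetical helper: it relies only on the directory layout shown in the
    tree above and ignores plain files at either level.
    """
    scenarios: Dict[str, List[str]] = {}
    for category in sorted(p for p in root.iterdir() if p.is_dir()):
        scenarios[category.name] = sorted(
            s.name for s in category.iterdir() if s.is_dir()
        )
    return scenarios


if __name__ == "__main__":
    root = Path("CWE_replication")
    if root.is_dir():
        # Print one line per CWE category with the number of sub-scenarios found.
        for category, subs in list_scenarios(root).items():
            print(category, len(subs), "scenario(s)")
    else:
        print("CWE_replication not found; run this from the repository root.")
```

Run from the repository root, this prints one line per CWE category. The same traversal could be extended to locate each scenario's copilot_raw and unique_solutions folders.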

How to run the code

All experiments were run on Python 3.8 and Copilot version 1.77.922.
Install the required libraries using pip install -r requirements.txt
Run python clank_loop.py to generate the results using Copilot. The script automatically collects the results and checks the similarity between the solutions.
- IMPORTANT: clank_loop requires you to keep the display on and not interact with the computer while it is running.
Run python mark.py to generate the necessary queries for CodeQL.
Run python collate_results.py to create all the results from running CodeQL.
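
The steps above can be sketched as a small driver script. This is an illustration under stated assumptions, not part of the repository: the names build_commands and run_pipeline are hypothetical, the scripts are assumed to take no command-line arguments, and the sketch does not run CodeQL itself, which the steps imply happens between mark.py and collate_results.py. For that reason it defaults to a dry run that only prints the commands.

```python
import subprocess
import sys
from typing import List

# Script names are taken from the steps above; invoking them without
# arguments is an assumption.
SCRIPTS: List[str] = ["clank_loop.py", "mark.py", "collate_results.py"]


def build_commands(python: str = sys.executable) -> List[List[str]]:
    """Return the install command followed by one command per script, in order."""
    commands = [[python, "-m", "pip", "install", "-r", "requirements.txt"]]
    commands.extend([python, script] for script in SCRIPTS)
    return commands


def run_pipeline(dry_run: bool = True) -> None:
    """Print each command and, unless dry_run is set, execute it.

    Execution stops at the first failing command because check=True raises
    subprocess.CalledProcessError on a non-zero exit status.
    """
    for cmd in build_commands():
        print("$", " ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run by default: the CodeQL analysis has to be run separately
    # before collate_results.py has anything to collate.
    run_pipeline(dry_run=True)
```

Printing the commands first makes it easy to run each step by hand, which suits clank_loop.py in particular since it needs the display on and the machine left alone while it runs.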

Folders and files

CWE_replication
CodeQL_results
.gitignore
README.md
Seperate_suggestions.py
clank.py
clank_loop.py
collate_results.py
config.py
mark.py
pycode_similar.py
python_comparison.py
requirements.txt
similarity_chcker.py
