
This repository contains the replication package of our paper "Assessing the Security of GitHub Copilot’s Generated Code - A Targeted Replication Study"


CommissarSilver/CVT


This repository contains the code and results needed to replicate the findings of our paper "Assessing the Security of GitHub Copilot’s Generated Code - A Targeted Replication Study"

A collaboration between researchers from Polytechnique Montreal (Canada) and Massey University (New Zealand)

Authors: Vahid Majdinasab, Michael Joshua Bishop, Arghavan Moradidakhel, Shawn Rasheed, Amjed Tahir, Foutse Khomh

The study aims to replicate the findings of Pearce et al. (2022), "Asleep at the Keyboard? Assessing the Security of GitHub Copilot’s Code Contributions".

Structure of the repository

The repository is structured as follows:

  • CWE_replication contains the results of the CWE replication study.
  • CodeQL_results contains the results of CodeQL analysis on files inside CWE_replication.
  • The top level of the repository contains the code for generating the results with Copilot and the code for comparing the results.

CWE_replication structure tree

├── CWE_replication
│   ├── Category of the CWE (e.g. cwe-20)
│   │   ├── Sub-scenarios for the CWE (e.g. codeql-eg-IncompleteHostnameRegExp)
│   │   │   ├── copilot_raw (Contains the codes generated by Copilot)
│   │   │   ├── unique_solutions (Contains all the unique solutions extracted from copilot_raw)
│   │   │   │   ├── comparison_results.csv (Raw results of similarity comparison between the solutions)
│   │   │   ├── scenario_code_ql_results.csv (Results of CodeQL analysis on the solutions)
│   │   │   ├── scenario.py (Original CWE scenario)
...
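
The layout above can be walked programmatically to get a quick overview of what each scenario contains. The following is a minimal sketch, not part of the repository: the helper names `count_files` and `list_scenarios` are hypothetical, and it assumes that every file in `copilot_raw` and `unique_solutions` other than `.csv` files is a solution.

```python
from pathlib import Path


def count_files(directory, skip_suffixes=(".csv",)):
    """Count regular files in a directory, ignoring result files such as CSVs.

    Assumption: everything that is not a .csv file is a generated solution.
    """
    directory = Path(directory)
    if not directory.is_dir():
        return 0
    return sum(
        1
        for p in directory.iterdir()
        if p.is_file() and p.suffix not in skip_suffixes
    )


def list_scenarios(root="CWE_replication"):
    """Return (cwe, scenario, n_raw, n_unique) tuples for each scenario folder."""
    root = Path(root)
    if not root.is_dir():
        return []
    rows = []
    # First level: CWE categories (e.g. cwe-20).
    for cwe_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        # Second level: sub-scenarios of that CWE.
        for scenario_dir in sorted(p for p in cwe_dir.iterdir() if p.is_dir()):
            rows.append(
                (
                    cwe_dir.name,
                    scenario_dir.name,
                    count_files(scenario_dir / "copilot_raw"),
                    count_files(scenario_dir / "unique_solutions"),
                )
            )
    return rows


if __name__ == "__main__":
    for cwe, scenario, n_raw, n_unique in list_scenarios():
        print(f"{cwe}/{scenario}: {n_raw} raw, {n_unique} unique")
```

Run from the repository root, this prints one line per scenario. If the `CWE_replication` directory is not found, it prints nothing.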

How to run the code

  • All experiments were run on Python 3.8 and Copilot version 1.77.922.
  • Install the required libraries using pip install -r requirements.txt
  • Run python clank_loop.py to generate the results using Copilot. The script automatically collects the results and checks the similarity between the solutions.
    • IMPORTANT: clank_loop requires you to keep the display on and not interact with the computer while it is running.
  • Run python mark.py to generate the necessary queries for CodeQL.
  • Run python collate_results.py to collate all the results produced by running CodeQL.
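
This README does not say how clank_loop.py measures similarity between solutions. As an illustration only, the sketch below shows one possible way to compare generated solutions pairwise and keep the unique ones. It uses `difflib.SequenceMatcher` from the Python standard library as a stand-in metric; the function names and the `threshold` parameter are hypothetical and are not taken from the repository.

```python
from difflib import SequenceMatcher
from itertools import combinations


def pairwise_similarity(solutions):
    """Return (name_a, name_b, ratio) for every pair of solutions.

    `solutions` maps a file name to its source text. The ratio is
    difflib's similarity score in [0, 1]; 1.0 means identical text.
    """
    return [
        (a, b, SequenceMatcher(None, solutions[a], solutions[b]).ratio())
        for a, b in combinations(sorted(solutions), 2)
    ]


def unique_solutions(solutions, threshold=1.0):
    """Keep a solution only if it scores below `threshold` against all kept ones.

    With the default threshold of 1.0, only exact duplicates are dropped.
    A lower threshold (an assumption, not a value from the paper) would
    also drop near-duplicates.
    """
    kept = {}
    for name in sorted(solutions):
        text = solutions[name]
        if all(
            SequenceMatcher(None, text, other).ratio() < threshold
            for other in kept.values()
        ):
            kept[name] = text
    return kept
```

For example, given three solutions of which two are identical, `unique_solutions` returns two entries and `pairwise_similarity` reports a ratio of 1.0 for the duplicate pair.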
