Introduction

This repository contains the data used by the paper "Automated Repair of Code from Language Models". The repository is organized into the following main folders, each explained below:

  • APR_Patches
  • Defects_Classifications
  • LMDefects

LMDefects

The LMDefects folder contains the LMDefects dataset, split into two main folders: Codex_Generated_Solutions and Codex_Generated_Solutions_Ground_Truth. The former holds the originally generated solutions; the latter holds the fixed versions, for those cases where a fix was found.
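Because a ground-truth fix exists only for some solutions, it can help to check which generated files have a counterpart. The following is a minimal sketch, not part of the repository. It assumes (unconfirmed by this README) that the two folders mirror each other's relative paths; `pair_solutions` is a hypothetical helper name.

```python
from pathlib import Path


def pair_solutions(lmdefects_root):
    """Map each generated file to its ground-truth counterpart, or None.

    ASSUMPTION (not stated in the README): a ground-truth file, when it
    exists, sits at the same relative path as the generated solution.
    """
    root = Path(lmdefects_root)
    generated = root / "Codex_Generated_Solutions"
    ground_truth = root / "Codex_Generated_Solutions_Ground_Truth"

    pairs = {}
    for path in sorted(generated.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(generated)
        candidate = ground_truth / rel
        # None marks a solution for which no fixed version is present.
        pairs[rel.as_posix()] = candidate if candidate.is_file() else None
    return pairs
```

Calling `pair_solutions("LMDefects")` from the repository root would return a dictionary keyed by relative path. If the actual layout differs from the assumption above, the matching rule needs to be adapted.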

APR_Patches

The APR_Patches folder contains the fault localization information used by both Recoder and TBar, the patches generated by these two tools, and the patches generated by the three versions of Codex-e we used.

Defects_Classifications

The Defects_Classifications sheet contains our classification of each solution: whether it compiles, whether it is plausible, and the type of fix needed.

Folder hierarchy

.
├── APR_Patches // Contains all patches generated by APR tools
│   ├── Codex_Edit_Patches
│   │   ├── Codex_Edit_Bug  // All correct patches produced by Codex_e_bug
│   │   ├── Codex_Edit_Line // All correct patches produced by Codex_e_line
│   │   ├── Codex_Edit_Stmt // All correct patches produced by Codex_e_stmt
│   │   └── raw_data // All patches produced by Codex Edit Mode
│   ├── Codex_fl_result // Fault localization info used by APR tools
│   ├── Recoder_Patches
│   └── TBar_Patches
├── Defects_Classifications // Defect category classifications for the incorrect solutions produced by Codex
├── LMDefects 
│   ├── Codex_Generated_Solutions // All solutions generated by Codex
│   └── Codex_Generated_Solutions_Ground_Truth // All constructed ground truth
└── README.md
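After cloning, the hierarchy above can be sanity-checked with a short script. This is a sketch only: the path list is copied from the tree in this README, while `EXPECTED_PATHS` and `missing_paths` are hypothetical names introduced for illustration.

```python
from pathlib import Path

# Paths taken from the folder hierarchy listed in this README.
EXPECTED_PATHS = [
    "APR_Patches/Codex_Edit_Patches/Codex_Edit_Bug",
    "APR_Patches/Codex_Edit_Patches/Codex_Edit_Line",
    "APR_Patches/Codex_Edit_Patches/Codex_Edit_Stmt",
    "APR_Patches/Codex_Edit_Patches/raw_data",
    "APR_Patches/Codex_fl_result",
    "APR_Patches/Recoder_Patches",
    "APR_Patches/TBar_Patches",
    "Defects_Classifications",
    "LMDefects/Codex_Generated_Solutions",
    "LMDefects/Codex_Generated_Solutions_Ground_Truth",
    "README.md",
]


def missing_paths(repo_root):
    """Return the expected entries that are absent under repo_root."""
    root = Path(repo_root)
    # exists() rather than is_dir(): the check does not assume whether
    # an entry is a folder or a file.
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]


if __name__ == "__main__":
    for path in missing_paths("."):
        print(f"missing: {path}")
```

Run from the repository root, the script prints one line per missing entry and nothing when the checkout matches the listed hierarchy.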