
APR-COMP Autocode Python 2024

Description

This repository contains a collection of 20 problems taken from Leetcode for the 1st edition of the Automated Program Repair Competition (APR-COMP 2024). Each problem has 5 solutions generated by OpenAI's GPT-3.5 Turbo and GPT-4 chat models. The solutions are located in the respective problem folders.

The benchmark's difficulty distribution is 1:2:1 across Easy:Medium:Hard.

Problem setup

The problems are set up as Python projects targeting Python 3.9, with tests run via the pytest package.
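As an illustration of this layout, a problem's pytest file might look like the sketch below. The function and test names here are hypothetical, not taken from the benchmark itself:

```python
# Hypothetical example of a problem test in the pytest style used by this
# benchmark; actual module, function, and test names vary per problem.

def two_sum(nums, target):
    # Toy stand-in for a generated solution: return indices of two numbers
    # that sum to target, or an empty list if no pair exists.
    seen = {}
    for i, n in enumerate(nums):
        if target - n in seen:
            return [seen[target - n], i]
        seen[n] = i
    return []

def test_two_sum_basic():
    assert two_sum([2, 7, 11, 15], 9) == [0, 1]

def test_two_sum_no_answer():
    assert two_sum([1, 2], 100) == []
```

Running `pytest` in such a problem directory would collect and execute the `test_*` functions.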

Criterion for selection

The (in)correctness of each solution has been manually evaluated, ensuring that every solution passes at least one test case and fails at least one test case in Leetcode's judging system.

Test suite generation

The test suite used for evaluation is generated from the public test cases provided by Leetcode, augmented by a generator-based fuzzer (built on source code from the Fuzzing Book) run over a reference solution. Every problem's directory contains a reference.py file, a reference solution collected from Leetcode's forums. The test suite generation is implemented in the testcases folder; execute the generate_tests.sh script to generate both the public and private test suites.
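The generator-based approach can be sketched as follows: random inputs are produced by a generator, fed to the reference solution, and the resulting (input, expected output) pairs become test cases. The names and the toy reference solution below are illustrative, not the actual fuzzer.py API:

```python
import random

# Minimal sketch of a generator-based fuzzer in the spirit of the Fuzzing
# Book: random inputs are run through a reference solution and the
# (input, output) pairs are recorded as test cases.

def reference_max_subarray(nums):
    # Stand-in reference solution (Kadane's algorithm for max subarray sum).
    best = cur = nums[0]
    for n in nums[1:]:
        cur = max(n, cur + n)
        best = max(best, cur)
    return best

def generate_input(rng):
    # Generator: random-length integer arrays, like a simple input grammar.
    return [rng.randint(-100, 100) for _ in range(rng.randint(1, 20))]

def fuzz_testcases(n_cases, seed=0):
    rng = random.Random(seed)
    cases = []
    for _ in range(n_cases):
        nums = generate_input(rng)
        cases.append((nums, reference_max_subarray(nums)))
    return cases
```

Seeding the random generator keeps the generated suite reproducible across runs.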

For further information, the files testcases/<PROBLEM>/fuzzer.py and testcases/<PROBLEM>/bug.py contain the fuzzer and test harness for each problem, both built around testcases/<PROBLEM>/reference.py.

Metadata generation

  • To regenerate the metadata, execute the metadata-generator.py script. It traverses all problems, inserts the subject scripts (run_test, setup_subject, install_deps), and generates metadata entries with all required information. For the script to run correctly, Java 11 and Maven must be installed, and this repository must be located in the user's home directory because of the relative paths used; to lift that requirement, modify line 6 in run_test_local.
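The traversal step can be sketched as below. The field names and directory layout here are assumptions for illustration, not the actual schema produced by metadata-generator.py:

```python
from pathlib import Path

# Sketch of the kind of traversal metadata-generator.py performs: walk all
# problem directories and build one metadata entry per generated solution.
# Field names and file patterns are illustrative assumptions.

def collect_metadata(root):
    entries = []
    for problem_dir in sorted(Path(root).iterdir()):
        if not problem_dir.is_dir():
            continue
        for solution in sorted(problem_dir.glob("solution_*.py")):
            entries.append({
                "problem": problem_dir.name,
                "solution": solution.name,
                # Subject scripts referenced relative to the problem folder.
                "run_test": f"{problem_dir.name}/run_test",
            })
    return entries
```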

Dataset generation

The repository contains a folder "crawler" with the code used to generate solutions with GPT-3.5 and GPT-4. To execute the crawler, add an OpenAI API key at line 44 of crawler.py; the key must have access to GPT-4.
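At its core, such a crawler sends one chat-completion request per problem. The sketch below shows a request against OpenAI's chat completions endpoint; the prompt text, payload helper, and environment-variable handling are illustrative assumptions, not the actual contents of crawler.py:

```python
import json
import os
import urllib.request

# Illustrative sketch of requesting a solution from OpenAI's chat
# completions endpoint; crawler.py may differ in prompt and parameters.
API_URL = "https://api.openai.com/v1/chat/completions"

def build_request(problem_statement, model="gpt-4"):
    # Construct the JSON payload for a single solution request.
    return {
        "model": model,
        "messages": [
            {"role": "user",
             "content": "Solve this LeetCode problem in Python:\n"
                        + problem_statement},
        ],
    }

def request_solution(problem_statement, model="gpt-4"):
    # Hypothetical: read the key from the environment instead of hardcoding
    # it at crawler.py line 44.
    key = os.environ.get("OPENAI_API_KEY")
    if key is None:
        raise RuntimeError("OPENAI_API_KEY is not set")
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_request(problem_statement, model)).encode(),
        headers={"Authorization": f"Bearer {key}",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```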

APR-COMP reproduction

To reproduce the results from APR-COMP, ensure that the benchmark is on the apr-comp-<YEAR> branch. Then invoke cerberus with all tool configs, with valkyrie.cerberus.config last in order so that it runs as a validation pass over the generated patches. After all runs have finished, run the process_results.py script to produce the final data in a file called aggregated.json.
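The aggregation step amounts to merging per-tool result files into one JSON document. The sketch below assumes one JSON result file per tool; the actual file layout consumed by process_results.py may differ:

```python
import json
from pathlib import Path

# Hedged sketch of aggregating per-tool result files into aggregated.json;
# the result-file layout is an assumption, not process_results.py's format.

def aggregate_results(results_dir, out_path="aggregated.json"):
    aggregated = {}
    for result_file in sorted(Path(results_dir).glob("*.json")):
        # Key each tool's results by the result file's stem (tool name).
        aggregated[result_file.stem] = json.loads(result_file.read_text())
    Path(out_path).write_text(json.dumps(aggregated, indent=2))
    return aggregated
```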
