
CodeS: Natural Language to Code Repository via Multi-Layer Sketch

Paper | Video Demo

What is this about?

The impressive performance of large language models (LLMs) on code-related tasks has shown the potential of fully automated software development. In light of this, we introduce a new software engineering task, namely Natural Language to code Repository (NL2Repo). This task aims to generate an entire code repository from its natural language requirements. To address this task, we propose a simple yet effective framework CodeS, which decomposes NL2Repo into multiple sub-tasks by a multi-layer sketch. Specifically, CodeS includes three modules: RepoSketcher, FileSketcher, and SketchFiller. RepoSketcher first generates a repository's directory structure for given requirements; FileSketcher then generates a file sketch for each file in the generated structure; SketchFiller finally fills in the details for each function in the generated file sketch. To rigorously assess CodeS on the NL2Repo task, we carry out evaluations through both automated benchmarking and manual feedback analysis. For benchmark-based evaluation, we craft a repository-oriented benchmark, SketchEval, and design an evaluation metric, SketchBLEU. For feedback-based evaluation, we develop a VSCode plugin for CodeS and engage 30 participants in conducting empirical studies. Extensive experiments prove the effectiveness and practicality of CodeS on the NL2Repo task.
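To make the decomposition concrete, the sketch below strings the three modules together in plain Python. It is a minimal illustration under assumed names: llm_generate stands in for whatever model backend is used, the prompts are invented, and the actual SketchFiller works function by function rather than one call per file.

from typing import Callable

# Hypothetical sketch of the CodeS three-layer decomposition (names and prompts
# are illustrative assumptions, not the repository's actual code).
def generate_repository(requirements: str,
                        llm_generate: Callable[[str], str]) -> dict[str, str]:
    # 1) RepoSketcher: requirements -> repository directory structure
    repo_sketch = llm_generate(
        "Write the directory structure for a repository that satisfies:\n" + requirements)

    repo: dict[str, str] = {}
    # Assume the repository sketch lists one relative file path per line.
    for file_path in [p.strip() for p in repo_sketch.splitlines() if p.strip().endswith(".py")]:
        # 2) FileSketcher: requirements + structure -> per-file sketch
        #    (imports, class/function signatures, docstrings, bodies left empty)
        file_sketch = llm_generate(
            "Requirements:\n" + requirements + "\nRepository sketch:\n" + repo_sketch
            + "\nWrite a file sketch for " + file_path + ":")
        # 3) SketchFiller: complete the unfinished functions in the file sketch
        #    (collapsed here into one call per file for brevity)
        repo[file_path] = llm_generate(
            "Fill in the bodies of all functions in this sketch:\n" + file_sketch)
    return repo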

Project Directory

.
├── assets
├── clean_repo.py # ./repos/ -> ./cleaned_repos/
├── cleaned_repos
├── craft_train_data.py # ./outputs/ -> ./training_data/
├── extract_sketch.py # ./cleaned_repos/ -> ./outputs/
├── outputs
├── projects # two projects
├── prompt_construction_utils.py
├── repos
├── requirements.txt
├── run_step1_clean.sh # runs ./clean_repo.py
├── run_step2_extract_sketch.sh # runs ./extract_sketch.py
├── run_step3_make_data.sh # runs ./craft_train_data.py
├── scripts
├── train # scripts to *train the CodeS model*
├── training_data
└── validation # *evaluation* scripts

Creating Instruction Data for 100 Repositories

  1. Download the selected repositories to the ./repos directory and unzip them.
  2. Preprocess the repositories:
bash run_step1_clean.sh
  3. Extract instruction training data for RepoSketcher, FileSketcher, and SketchFiller (an illustrative record is sketched after these steps):
bash run_step2_extract_sketch.sh
bash run_step3_make_data.sh
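For orientation, a single RepoSketcher training record might look roughly like the following. The field names, prompt wording, and contents are assumptions for illustration, not the exact format produced by craft_train_data.py.

# Hypothetical shape of one RepoSketcher instruction record (field names assumed).
example_record = {
    "instruction": "Generate the repository sketch for the following requirements.",
    "input": "A command-line tool that converts Markdown files to HTML ...",
    "output": ".\n|-- README.md\n|-- setup.py\n|-- mdconv/\n|   |-- cli.py\n|   `-- converter.py",
}
# FileSketcher and SketchFiller records follow the same instruction/input/output
# layout, conditioned on the repository sketch and file sketch respectively.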

Training

  1. Place the created instruction data into ./train/data and configure dataset_info.json following the structure described at https://github.com/hiyouga/LLaMA-Factory/tree/main/data (a minimal example is sketched after these steps).

  2. Adjust the launch script as needed, then start training:

vim ./train/run_train_multi_gpu.sh
bash ./train/run_train_multi_gpu.sh
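For reference, a minimal dataset_info.json entry in the alpaca-style layout that LLaMA-Factory expects could be written as follows; the dataset name and file name are placeholders for the created CodeS data.

import json

# Minimal dataset_info.json entry; see the LLaMA-Factory data README for the
# authoritative schema. Dataset and file names below are placeholders.
dataset_info = {
    "codes_instructions": {
        "file_name": "codes_instructions.json",
        "columns": {"prompt": "instruction", "query": "input", "response": "output"},
    }
}

with open("./train/data/dataset_info.json", "w") as f:
    json.dump(dataset_info, f, indent=2)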

Evaluation

  1. Install SketchBLEU; its setup mirrors that of CodeBLEU.

  2. Perform inference on SketchEval:

python ./codes/validation/evaluation-scripts/from_scratch_inference.py
  3. Assemble the inference results into a complete repository:
python ./codes/validation/evaluation-scripts/transfer_output_to_repo.py
  4. Evaluate the generated repository, as with CodeBLEU (a sketch of the composite score follows):
python ./codes/validation/evaluation-scripts/batch_eval/get_metric.py
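Since SketchBLEU follows CodeBLEU's composite design, the overall score can be pictured as a weighted sum of four sub-scores. The component names and equal default weights below mirror CodeBLEU and are an assumption about get_metric.py, not a transcript of it.

# Hedged sketch of a CodeBLEU-style composite score; the four sub-scores are
# assumed to be computed elsewhere (e.g. by the batch_eval scripts) in [0, 1].
def sketch_bleu(ngram: float, weighted_ngram: float,
                syntax: float, dataflow: float,
                weights=(0.25, 0.25, 0.25, 0.25)) -> float:
    a, b, c, d = weights
    return a * ngram + b * weighted_ngram + c * syntax + d * dataflow

# Arbitrary placeholder sub-scores, only to show the call shape.
score = sketch_bleu(ngram=0.41, weighted_ngram=0.44, syntax=0.62, dataflow=0.58)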
