qzydustin/LLM4CodeGeneration

LLM4CodeGeneration

A concise guide to populating data/ and code/ and to generating LLM solutions with IO-level testing.


📦 Setup

# Install dependencies
pip install -r requirements.txt

🚀 Three-Step Workflow

Step 1: Download Dataset → data/dataset/

Download the complete benchmark dataset (626 problems) from Hugging Face:

python hf_dataset.py download

This creates data/dataset/ with 626 JSON files containing:

  • Problem descriptions
  • Canonical solutions (7 languages)
  • Test case generators
  • Evaluators
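To inspect what was downloaded without assuming a schema, a minimal sketch like the following counts the JSON files and lists the top-level keys of the first one. The helper name `summarize_dataset` is hypothetical and not part of the repository; the only thing taken from this guide is the data/dataset/ location.

```python
import json
from pathlib import Path


def summarize_dataset(dataset_dir="data/dataset"):
    """Return (number of *.json files, sorted top-level keys of the first file).

    Hypothetical helper: it makes no assumption about the field names
    inside each problem file, it only reports what it finds.
    """
    files = sorted(Path(dataset_dir).glob("*.json"))
    if not files:
        return 0, []
    with files[0].open(encoding="utf-8") as fh:
        first = json.load(fh)
    # Only a JSON object has named top-level keys.
    keys = sorted(first) if isinstance(first, dict) else []
    return len(files), keys


if __name__ == "__main__":
    count, keys = summarize_dataset()
    print(f"{count} problem files found")
    print("top-level keys of the first file:", keys)
```

If the download completed, the reported count should match the 626 files mentioned above.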

Step 2: Generate Code Files → code/

Generate executable code files from the dataset:

python evaluate_solution.py evaluate -o data/evaluation

This creates code/{python,cpp,java,javascript,golang,ruby}/ with complete runnable code files including:

  • Solution implementation
  • Test framework
  • Auto-generated test runners
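As a quick sanity check after this step, the sketch below counts the files in each language subfolder of code/. The folder names are the ones listed above; the helper `count_code_files` is hypothetical, and the file extensions used by the generator are not assumed.

```python
from pathlib import Path

# Subfolder names as listed in this guide.
LANGUAGES = ["python", "cpp", "java", "javascript", "golang", "ruby"]


def count_code_files(code_dir="code", languages=LANGUAGES):
    """Return {language: number of regular files in code_dir/language}.

    Hypothetical helper: a missing subfolder is reported as 0
    rather than raising an error.
    """
    root = Path(code_dir)
    counts = {}
    for lang in languages:
        lang_dir = root / lang
        if lang_dir.is_dir():
            counts[lang] = sum(1 for p in lang_dir.iterdir() if p.is_file())
        else:
            counts[lang] = 0
    return counts


if __name__ == "__main__":
    for lang, n in count_code_files().items():
        print(f"{lang:12s} {n}")
```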

Step 3: Generate IO Solutions → out-gpt-4o/

Use an LLM to generate and test solutions for IO-type problems:

Configure the API key by editing io_generate_and_test.py:

OPENAI_API_KEY = "sk-your-api-key-here"
OPENAI_API_URL = "https://api.openai.com/v1"
MODEL_NAME = "gpt-4o"
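Hard-coding a key in a tracked file makes it easy to commit by accident. One possible alternative, sketched below, reads the same three settings from environment variables and falls back to the defaults shown above. This is an assumption about how the script could be adapted, not how io_generate_and_test.py currently works; the function `load_openai_config` is hypothetical.

```python
import os


def load_openai_config(env=None):
    """Read the three settings from environment variables.

    Hypothetical helper. The variable names mirror the constants in
    io_generate_and_test.py; the script itself would need to be
    adapted to call something like this.
    """
    env = os.environ if env is None else env
    return {
        "api_key": env.get("OPENAI_API_KEY", ""),
        "api_url": env.get("OPENAI_API_URL", "https://api.openai.com/v1"),
        "model_name": env.get("MODEL_NAME", "gpt-4o"),
    }


if __name__ == "__main__":
    cfg = load_openai_config()
    # Never print the key itself, only whether it is set.
    print("API key set:", bool(cfg["api_key"]))
    print("API URL:", cfg["api_url"])
    print("Model:", cfg["model_name"])
```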

Run generation + testing:

# Generate solutions and test them (lite mode: stop after first failure)
python io_generate_and_test.py both

# Full mode: test all cases
python io_generate_and_test.py both --full

# Generate only (no testing)
python io_generate_and_test.py generate

# Test only (existing solutions)
python io_generate_and_test.py test

Output structure:

out-gpt-4o/
├── {problem_name}/
│   ├── solution.py       # Generated solution
│   ├── prompt.txt        # LLM prompt
│   └── raw_output.txt    # Raw LLM response
├── generation_results.json
├── testing_results.json
└── all_results.json      # Combined results
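Given the layout above, a short sketch can report which problem folders contain a solution.py. It relies only on the directory structure shown here and makes no assumption about the contents of the JSON result files; the helper `list_solutions` is hypothetical.

```python
from pathlib import Path


def list_solutions(out_dir="out-gpt-4o"):
    """Return {problem_name: True if solution.py exists in that folder}.

    Hypothetical helper: only subdirectories of out_dir are treated as
    problems, so the top-level *.json result files are ignored.
    """
    root = Path(out_dir)
    if not root.is_dir():
        return {}
    return {
        p.name: (p / "solution.py").is_file()
        for p in sorted(root.iterdir())
        if p.is_dir()
    }


if __name__ == "__main__":
    status = list_solutions()
    missing = [name for name, ok in status.items() if not ok]
    print(f"{len(status)} problem folders, {len(missing)} without solution.py")
```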

📊 What Gets Generated

Folder          Content                           Command
data/dataset/   626 problem files from HF Hub     python hf_dataset.py download
code/           Executable code (7 languages)     python evaluate_solution.py evaluate
out-gpt-4o/     LLM-generated IO solutions        python io_generate_and_test.py both

🔍 Verify Installation

# Check dataset
ls data/dataset/*.json | wc -l  # Should show 626

# Check code folder
ls code/python/*.py | wc -l     # Should show multiple files

# Check results (after Step 3)
python -m json.tool out-gpt-4o/all_results.json | head -30

📖 Full Documentation

See README_SETUP.md for detailed documentation.


💡 Notes

  • IO vs Functional: The dataset contains both IO-type problems (stdin/stdout) and functional-type problems (function calls).
  • Lite Mode: Stops testing a problem after its first failing test case (faster).
  • Full Mode: Runs every test case for each problem (comprehensive).
  • Model Configuration: Change MODEL_NAME in io_generate_and_test.py to use a different LLM; output goes to out-{MODEL_NAME}/.
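The difference between lite and full mode can be illustrated with a minimal sketch of IO-level testing: each case feeds text to the solution on stdin and compares its stdout with the expected output. The function `run_io_tests`, the case format (pairs of input and expected output), and the whitespace-stripping comparison are all assumptions made for illustration; the repository's actual runner may differ.

```python
import subprocess
import sys


def run_io_tests(solution_path, cases, full=False, timeout=10):
    """Run a Python solution against (stdin_text, expected_stdout) pairs.

    Hypothetical sketch. Returns one bool per executed case.
    In lite mode (full=False) the loop stops at the first failure;
    in full mode every case is executed.
    """
    results = []
    for stdin_text, expected in cases:
        try:
            proc = subprocess.run(
                [sys.executable, solution_path],
                input=stdin_text,
                capture_output=True,
                text=True,
                timeout=timeout,
            )
            # Assumed comparison rule: ignore leading/trailing whitespace.
            passed = (
                proc.returncode == 0
                and proc.stdout.strip() == expected.strip()
            )
        except subprocess.TimeoutExpired:
            # A case that exceeds the time limit counts as a failure.
            passed = False
        results.append(passed)
        if not passed and not full:
            break
    return results
```

With three cases where the second one fails, lite mode would return two results and full mode three, which is why lite mode is faster but reports less detail per problem.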
