Forked from OpenAI's repo. Should run out of the box with python 3.11 (as of 11/23)
To generate the completions (after pip installing the requirements), run:
mkdir results
python run.py
Then to evaluate the completion results, run
python human_eval/evaluate_functional_correctness "YOUR RESULTS FILE.jsonl"