Add: learning performance-improving code edits 🥧 #65
Nice! Some comments:
@Muennighoff Those are test cases and are needed in order to evaluate the correctness of the generated program.
I think they should be uploaded to a dataset on the HF Hub that is then loaded like it's done for the other eval tasks
lm_eval/tasks/pie_perf.py
cmd = "git clone https://huggingface.co/datasets/rootacess/pie-perf-testcases lm_eval/tasks/custom_metrics/pie_perf_metric/public_test_cases"
process = subprocess.Popen(cmd.split(), stdout=subprocess.PIPE)
output, error = process.communicate()
logging.error(f'An error occurred: {error}')

# running evaluations
res = compute(generations, references, dataset=self.get_dataset()[:limit])

# cleaning up
cmd = "rm -rf lm_eval/tasks/custom_metrics/pie_perf_metric/public_test_cases"
process = subprocess.Popen(cmd.split(), stdout=subprocess.PIPE)
output, error = process.communicate()
logging.error(error)
Can't we load the test cases with HF datasets?
If not, I think we should at least check that the path doesn't already exist before cloning.
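A minimal sketch of what such a check could look like, assuming the clone target is a local directory and that reusing an existing checkout is acceptable. The helper names `ensure_test_cases` and `remove_test_cases` and the injectable `clone` parameter are hypothetical and are not part of the PR:

```python
import shutil
import subprocess
from pathlib import Path


def ensure_test_cases(repo_url, dest, clone=None):
    """Clone repo_url into dest unless dest already exists.

    Returns True if a clone was performed, False if an existing path was reused.
    `clone` is a hypothetical hook so the download step can be swapped out.
    """
    dest = Path(dest)
    if dest.exists():
        # Path is already there: skip the clone instead of failing.
        return False
    if clone is None:
        # Default: shell out to git; check=True raises on a non-zero exit code.
        def clone(url, target):
            subprocess.run(["git", "clone", url, str(target)], check=True)
    clone(repo_url, dest)
    return True


def remove_test_cases(dest):
    """Delete dest if present; a missing path is not treated as an error."""
    shutil.rmtree(dest, ignore_errors=True)
```

Using `subprocess.run(..., check=True)` surfaces a failed clone as an exception, so an error is only reported when one actually occurs, and `shutil.rmtree` avoids shelling out to `rm -rf` for the cleanup step.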
Because the number of test cases varies per type of problem, it is not suitable to convert them into the Dataset format. I can add a condition to check whether the path exists.
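For illustration only: a varying number of test cases per problem can in principle be stored in a tabular "long" layout, with one row per test case. The sketch below uses plain Python and assumes each test case is an (input, expected output) pair of strings; the field names `problem_id`, `test_index`, `input`, and `output` are hypothetical and do not reflect the layout of the actual test-case repository.

```python
def flatten_test_cases(problems):
    """Turn {problem_id: [(input, expected_output), ...]} into one row per test case."""
    rows = []
    for problem_id, cases in problems.items():
        for index, (test_input, expected) in enumerate(cases):
            rows.append(
                {
                    "problem_id": problem_id,
                    "test_index": index,
                    "input": test_input,
                    "output": expected,
                }
            )
    return rows


def group_test_cases(rows):
    """Rebuild the per-problem lists from the flat rows, preserving test order."""
    grouped = {}
    for row in sorted(rows, key=lambda r: (r["problem_id"], r["test_index"])):
        grouped.setdefault(row["problem_id"], []).append((row["input"], row["output"]))
    return grouped
```

Whether this layout is practical here depends on details the thread does not settle, such as the size of the test files and how the metric code expects to read them from disk.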
PR to add the Learning Performance-Improving Code Edits task with the PIE dataset: few-shot evaluations of program performance improvement.