Evaluation code #8

sbmaruf · 2023-11-07T17:10:26Z

Sample scripts for the evaluation.

Since running gen_program_synthesis.py can be costly, I have uploaded the OpenAI projected samples in this temporary Dropbox. If you are reviewing this code or just browsing, please do not upload this data anywhere, to preserve the integrity of the benchmark (cross-site data contamination for LLMs). You may use this projected data only for testing on your local machine.

yazdanbakhsh · 2024-09-19T08:44:06Z

Sounds good.

sbmaruf and others added 14 commits November 8, 2023 01:08

generation script

3109bd5

execeval eval request

c575a1e

pass@k calculation

e6bc4a4

update readme

6f6da07

Add evaluation code for code_translation, apr

ca3726b

Add env setup

f62e4ae

Update env and README

0248e98

Update README.md

50282b5

Update README.md

073bd65

Fix bug

9821269

Fix: folder path bug

9bc7fa1

add jsonline dependency

e9639b1

fig: update linking

b607cd5

update README

392f05e

Jackal1586 merged commit 4993b99 into main Sep 17, 2024

Jackal1586 mentioned this pull request Sep 17, 2024

Prompts? #12

Open
