Closed
Description
Problem Description
In eval mode, --limit 100 is set, but all samples are still evaluated.
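As a rough sanity check, the request counts in the log below can be compared with what a per-task limit of 100 documents would produce. This is a minimal sketch, not code from auto-round or lm-eval: `expected_requests` is a hypothetical helper, and the per-document request counts (1 for lambada_openai, 2 for piqa) are assumptions inferred from the progress bars in the log.

```python
from typing import Optional


def expected_requests(num_docs: int, requests_per_doc: int, limit: Optional[int]) -> int:
    """Hypothetical helper: requests expected when at most `limit` docs are evaluated."""
    docs = num_docs if limit is None else min(num_docs, limit)
    return docs * requests_per_doc


# (documents, requests per document), inferred from the log (assumption):
# lambada_openai shows 5153 docs and 5153 requests; piqa shows 1838 docs and 3676 requests.
tasks = {"lambada_openai": (5153, 1), "piqa": (1838, 2)}

# Request counts actually reported in the log.
observed = {"lambada_openai": 5153, "piqa": 3676}

for name, (num_docs, per_doc) in tasks.items():
    want = expected_requests(num_docs, per_doc, limit=100)
    got = observed[name]
    status = "limit applied" if got == want else "limit ignored"
    print(f"{name}: expected {want} requests, observed {got} ({status})")
```

Under these assumptions, a limit of 100 would give 100 requests for lambada_openai and 200 for piqa, whereas the log reports the full 5153 and 3676.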
Reproduction Steps
auto-round facebook/opt-125m --eval --eval_task_by_task --tasks lambada_openai,piqa --limit 100
Environment Information
Linux
Error Logs
root@ip-10-0-146-1: auto-round facebook/opt-125m --eval --eval_task_by_task --tasks lambada_openai,piqa --limit 100
100%|██████████| 5153/5153 [00:05<00:00, 901.36it/s]
Running loglikelihood requests: 0%| | 0/5153 [00:00<?, ?it/s]
Passed argument batch_size = auto:8.0. Detecting largest batch size
Determined largest batch size: 64
Running loglikelihood requests: 11%|█         | 578/5153 [00:01<00:07, 642.86it/s]
Passed argument batch_size = auto:8.0. Detecting largest batch size
Determined largest batch size: 64
Running loglikelihood requests: 100%|██████████| 5153/5153 [00:03<00:00, 1550.92it/s]
bootstrapping for stddev: perplexity
100%|██████████| 100/100 [00:00<00:00, 185.14it/s]
|    Tasks     |Version|Filter|n-shot|  Metric  |   | Value |   |Stderr|
|--------------|------:|------|-----:|----------|---|------:|---|-----:|
|lambada_openai|      1|none  |     0|acc       |↑  | 0.3788|±  |0.0068|
|              |       |none  |     0|perplexity|↓  |26.0217|±  |0.9382|
100%|██████████| 1838/1838 [00:01<00:00, 1832.37it/s]
Running loglikelihood requests: 100%|██████████| 3676/3676 [00:01<00:00, 3374.97it/s]
|    Tasks     |Version|Filter|n-shot|  Metric  |   | Value |   |Stderr|
|--------------|------:|------|-----:|----------|---|------:|---|-----:|
|lambada_openai|      1|none  |     0|acc       |↑  | 0.3788|±  |0.0068|
|              |       |none  |     0|perplexity|↓  |26.0217|±  |0.9382|
|piqa          |      1|none  |     0|acc       |↑  | 0.6295|±  |0.0113|
|              |       |none  |     0|acc_norm  |↑  | 0.6197|±  |0.0113|
total eval time: 34.39299297332764
Additional Context
No response
Labels
bug