Open
Labels
bug: Something isn't working
Description
As with Issue #18 (which involves the eval stage with the Correctness metric), if the OpenAI calls fail for any reason, the metacoder runner also gets caught in a try/fail/retry loop until it is manually aborted. There may be an internal retry limit, but none was apparent in my testing. The fix is likely similar to the one for the cited issue.
This is reproducible with the same test case as in parent Issue #19.
(metacoder) PS C:\Users\CTParker\PycharmProjects\metacoder> uv run metacoder eval .\tests\input\goose_no_server_test.yaml
🔬 Running evaluations from: tests\input\goose_no_server_test.yaml
📊 Loaded dataset: pubmed tools evals
Models: claude-sonnet
Coders: goose, dummy (all available)
Cases: 1
Total evaluations: 2
🚀 Starting evaluations...
Progress: 1/2 - goose/claude-sonnet/PMID_28027860_Full_Text (no servers)
Running goose with claude-sonnet on case 'PMID_28027860_Full_Text'
📁 Preparing workdir: eval_workdir\claude-sonnet_goose_PMID_28027860_Full_Text_no_servers\claude-sonnet_goose_PMID_28027860_Full_Text
🔒 Obtaining lock for eval_workdir\claude-sonnet_goose_PMID_28027860_Full_Text_no_servers\claude-sonnet_goose_PMID_28027860_Full_Text; current_dir=C:\Users\CTParker\PycharmProjects\metacoder
🔧 Writing config object: .config/goose/config.yaml type=yaml
🔓 Releasing lock for eval_workdir\claude-sonnet_goose_PMID_28027860_Full_Text_no_servers\claude-sonnet_goose_PMID_28027860_Full_Text; current_dir=C:\Users\CTParker\PycharmProjects\metacoder
🔒 Obtaining lock for eval_workdir\claude-sonnet_goose_PMID_28027860_Full_Text_no_servers\claude-sonnet_goose_PMID_28027860_Full_Text; current_dir=C:\Users\CTParker\PycharmProjects\metacoder
🦆 Running command: goose run -t What is the first sentence of section 2 in PMID: 28027860?
🦆 Command took 27.97 seconds
🔓 Releasing lock for eval_workdir\claude-sonnet_goose_PMID_28027860_Full_Text_no_servers\claude-sonnet_goose_PMID_28027860_Full_Text; current_dir=C:\Users\CTParker\PycharmProjects\metacoder
Evaluating with CorrectnessMetric
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
OpenAI API Key has insufficient quota: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}
OpenAI quota exhausted; downgrading to Claude...
HTTP Request: POST https://api.anthropic.com/v1/messages "HTTP/1.1 200 OK"
✓ Tests finished 🎉! Run 'deepeval view' to analyze, debug, and save evaluation results on Confident AI.
Progress: 2/2 - goose/claude-sonnet/PMID_28027860_Full_Text with servers: mcp-simple-pubmed
Running goose with claude-sonnet on case 'PMID_28027860_Full_Text'
📁 Preparing workdir: eval_workdir\claude-sonnet_goose_PMID_28027860_Full_Text_mcp-simple-pubmed\claude-sonnet_goose_PMID_28027860_Full_Text
🔒 Obtaining lock for eval_workdir\claude-sonnet_goose_PMID_28027860_Full_Text_mcp-simple-pubmed\claude-sonnet_goose_PMID_28027860_Full_Text; current_dir=C:\Users\CTParker\PycharmProjects\metacoder
🔧 Writing config object: .config/goose/config.yaml type=yaml
🔓 Releasing lock for eval_workdir\claude-sonnet_goose_PMID_28027860_Full_Text_mcp-simple-pubmed\claude-sonnet_goose_PMID_28027860_Full_Text; current_dir=C:\Users\CTParker\PycharmProjects\metacoder
🔒 Obtaining lock for eval_workdir\claude-sonnet_goose_PMID_28027860_Full_Text_mcp-simple-pubmed\claude-sonnet_goose_PMID_28027860_Full_Text; current_dir=C:\Users\CTParker\PycharmProjects\metacoder
🦆 Running command: goose run -t What is the first sentence of section 2 in PMID: 28027860?
🦆 Command took 25.19 seconds
🔓 Releasing lock for eval_workdir\claude-sonnet_goose_PMID_28027860_Full_Text_mcp-simple-pubmed\claude-sonnet_goose_PMID_28027860_Full_Text; current_dir=C:\Users\CTParker\PycharmProjects\metacoder
Evaluating with CorrectnessMetric
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
Retrying request to /chat/completions in 0.465641 seconds
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
Retrying request to /chat/completions in 0.780335 seconds
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
OpenAI Error: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}} Retrying: 1 time(s)...
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
Retrying request to /chat/completions in 0.445647 seconds
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
Retrying request to /chat/completions in 0.816545 seconds
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
OpenAI Error: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}} Retrying: 2 time(s)...
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
Retrying request to /chat/completions in 0.395725 seconds
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
Retrying request to /chat/completions in 0.944088 seconds
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
OpenAI Error: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}} Retrying: 3 time(s)...
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
Retrying request to /chat/completions in 0.386913 seconds
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
Retrying request to /chat/completions in 0.798842 seconds
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
OpenAI Error: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}} Retrying: 4 time(s)...
Aborted!
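One possible direction is sketched below. It is a minimal illustration, not metacoder's or deepeval's actual code. The idea is to treat the `insufficient_quota` code seen in the log as non-retryable and fail fast, and to cap retries for everything else. The names `call_with_bounded_retries`, `error_code`, and `QuotaExhausted` are hypothetical. The sketch also assumes the raised exception exposes either a `code` attribute or a `body` dict shaped like the payload in the log.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")

# Error codes that retrying cannot fix; "insufficient_quota" is taken from the log above.
NON_RETRYABLE_CODES = {"insufficient_quota"}


class QuotaExhausted(RuntimeError):
    """Hypothetical error raised when retries cannot help."""


def error_code(exc: Exception) -> Optional[str]:
    """Best-effort lookup of an error code on an exception.

    Assumption: the exception has a `code` attribute, or a `body` dict
    like {'error': {'code': ...}} as printed in the log.
    """
    code = getattr(exc, "code", None)
    if code:
        return code
    body = getattr(exc, "body", None)
    if isinstance(body, dict):
        err = body.get("error", body)
        if isinstance(err, dict):
            return err.get("code")
    return None


def call_with_bounded_retries(
    fn: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying with exponential backoff up to max_attempts times."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception as exc:
            # Quota errors will not clear on their own, so stop immediately.
            if error_code(exc) in NON_RETRYABLE_CODES:
                raise QuotaExhausted(str(exc)) from exc
            # Out of attempts: surface the last error instead of looping.
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")
```

With a wrapper like this around the metric's model call, a quota failure would surface as a single error that the runner can report, or use to fall back to another model, instead of repeating the request.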