You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
style of call hangs at this point for many minutes, even though EVALS_THREAD_TIMEOUT is set to 10 seconds. This is devastating to eval turnaround time.
Code snippets
No response
OS
macOS Ventura (13.4)
Python version
3.11.3
Library version
1.0.3
The text was updated successfully, but these errors were encountered:
Indeed, I have encountered a similar issue. It appears that the output might take an extended period to return. In my specific experience, the model only completed its return after generating all 4097 tokens. To address this problem, I found success in augmenting the EVALS_THREAD_TIMEOUT to 500, as the previous setting of 100 was insufficient for my requirements.
As has been brought up before (#1384, #1292,
#270), evals suffer from a hanging
issue, where an evaluation run will hang for a very long time (if not
indefinitely) at the end of a run (say, on the 99th sample of out 100).
This PR addresses this issue, by replacing a seemingly redundant
single-threaded thread creation that was happening when making requests,
nested inside the already multi-threaded eval loop. My impression is
that this nested multithreading was causing overhead that resulted in
the hanging experienced.
I had also noticed this hanging issue in `EVALS_SEQUENTIAL=1` mode
(where it no longer occurs at the end, but instead randomly in the
middle of the run).
I was able to identify the source of this issue though debugging print
statements that ultimately pointed to the `request_with_timeout`
function as the culprit.
We have tested the new `request_with_timeout` code on a fork where we
have run multiple new and pre-existing evals, including with 3rd party
solvers, and found no change in behaviour or errors, and a clear
improvement on the hanging issue.
Describe the bug
oaieval
hangs near the end, before reporting, a lot.To Reproduce
style of call hangs at this point for many minutes, even though
EVALS_THREAD_TIMEOUT
is set to 10 seconds. This is devastating to eval turnaround time.Code snippets
No response
OS
macOS Ventura (13.4)
Python version
3.11.3
Library version
1.0.3
The text was updated successfully, but these errors were encountered: