## Description

- This is actually a bug report.
- I am not getting good LLM results.
- I have tried asking for help in the community on Discord or in Discussions and have not received a response.
- I have tried searching the documentation and have not found an answer.
## What Model are you using?

- gpt-3.5-turbo
- gpt-4-turbo
- gpt-4
- Other (Ollama mistral-small:24B)
## Describe the bug

I tested the bugfix from #1597. Unfortunately, the bug still persists: the client iterates through all retries instead of honoring the global timeout.
## To Reproduce

I tested it on a slightly modified example from docs/integrations/ollama.md, namely:

```python
import logging

logging.basicConfig(level=logging.DEBUG)

from openai import OpenAI
from pydantic import BaseModel

import instructor


class Character(BaseModel):
    name: str
    age: int


client = instructor.from_openai(
    OpenAI(
        base_url="http://10.10.10.115:11434/v1",
        api_key="ollama",  # required, but unused
    ),
    mode=instructor.Mode.JSON,
)

resp = client.chat.completions.create(
    model="mistral-small:24b-9k",
    messages=[
        {
            "role": "user",
            "content": "Tell me about Harry Potter",
        }
    ],
    response_model=Character,
    max_retries=2,
    timeout=1.0,  # Total timeout across all retry attempts
)
```
Here are the logs:
DEBUG:instructor:Patching client.chat.completions.create
with mode=<Mode.JSON: 'json_mode'>
DEBUG:instructor:Instructor Request: mode.value='json_mode', response_model=<class '__main__.Character'>, new_kwargs={'messages': [{'role': 'system', 'content': '\n As a genius expert, your task is to understand the content and provide\n the parsed objects in json that match the following json_schema:\n\n\n {\n "properties": {\n "name": {\n "title": "Name",\n "type": "string"\n },\n "age": {\n "title": "Age",\n "type": "integer"\n }\n },\n "required": [\n "name",\n "age"\n ],\n "title": "Character",\n "type": "object"\n}\n\n Make sure to return an instance of the JSON, not the schema itself\n'}, {'role': 'user', 'content': 'Tell me about Harry Potter'}], 'model': 'mistral-small:24b-9k', 'timeout': 1.0, 'response_format': {'type': 'json_object'}}
DEBUG:instructor:max_retries: 2, timeout: 1.0
DEBUG:instructor:Retrying, attempt: 1
DEBUG:openai._base_client:Request options: {'method': 'post', 'url': '/chat/completions', 'timeout': 1.0, 'files': None, 'idempotency_key': 'stainless-python-retry-583e0c32-bf69-4644-8654-9b148cf38734', 'json_data': {'messages': [{'role': 'system', 'content': '\n As a genius expert, your task is to understand the content and provide\n the parsed objects in json that match the following json_schema:\n\n\n {\n "properties": {\n "name": {\n "title": "Name",\n "type": "string"\n },\n "age": {\n "title": "Age",\n "type": "integer"\n }\n },\n "required": [\n "name",\n "age"\n ],\n "title": "Character",\n "type": "object"\n}\n\n Make sure to return an instance of the JSON, not the schema itself\n'}, {'role': 'user', 'content': 'Tell me about Harry Potter'}], 'model': 'mistral-small:24b-9k', 'response_format': {'type': 'json_object'}}}
DEBUG:openai._base_client:Sending HTTP Request: POST http://10.10.10.115:11434/v1/chat/completions
DEBUG:httpcore.connection:connect_tcp.started host='10.10.10.115' port=11434 local_address=None timeout=1.0 socket_options=None
DEBUG:httpcore.connection:connect_tcp.complete return_value=<httpcore._backends.sync.SyncStream object at 0x7be5ae49a000>
DEBUG:httpcore.http11:send_request_headers.started request=<Request [b'POST']>
DEBUG:httpcore.http11:send_request_headers.complete
DEBUG:httpcore.http11:send_request_body.started request=<Request [b'POST']>
DEBUG:httpcore.http11:send_request_body.complete
DEBUG:httpcore.http11:receive_response_headers.started request=<Request [b'POST']>
DEBUG:httpcore.http11:receive_response_headers.failed exception=ReadTimeout(TimeoutError('timed out'))
DEBUG:httpcore.http11:response_closed.started
DEBUG:httpcore.http11:response_closed.complete
DEBUG:openai._base_client:Encountered httpx.TimeoutException
Traceback (most recent call last):
File "/usr/local/lib/python3.12/site-packages/httpx/_transports/default.py", line 101, in map_httpcore_exceptions
yield
File "/usr/local/lib/python3.12/site-packages/httpx/_transports/default.py", line 250, in handle_request
resp = self._pool.handle_request(req)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/httpcore/_sync/connection_pool.py", line 256, in handle_request
raise exc from None
File "/usr/local/lib/python3.12/site-packages/httpcore/_sync/connection_pool.py", line 236, in handle_request
response = connection.handle_request(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/httpcore/_sync/connection.py", line 103, in handle_request
return self._connection.handle_request(request)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/httpcore/_sync/http11.py", line 136, in handle_request
raise exc
File "/usr/local/lib/python3.12/site-packages/httpcore/_sync/http11.py", line 106, in handle_request
) = self._receive_response_headers(**kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/httpcore/_sync/http11.py", line 177, in _receive_response_headers
event = self._receive_event(timeout=timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/httpcore/_sync/http11.py", line 217, in _receive_event
data = self._network_stream.read(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/httpcore/_backends/sync.py", line 126, in read
with map_exceptions(exc_map):
^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/contextlib.py", line 158, in __exit__
self.gen.throw(value)
File "/usr/local/lib/python3.12/site-packages/httpcore/_exceptions.py", line 14, in map_exceptions
raise to_exc(exc) from exc
httpcore.ReadTimeout: timed out
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.12/site-packages/openai/_base_client.py", line 969, in request
response = self._client.send(
^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/httpx/_client.py", line 914, in send
response = self._send_handling_auth(
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/httpx/_client.py", line 942, in _send_handling_auth
response = self._send_handling_redirects(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/httpx/_client.py", line 979, in _send_handling_redirects
response = self._send_single_request(request)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/httpx/_client.py", line 1014, in _send_single_request
response = transport.handle_request(request)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/httpx/_transports/default.py", line 249, in handle_request
with map_httpcore_exceptions():
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/contextlib.py", line 158, in __exit__
self.gen.throw(value)
File "/usr/local/lib/python3.12/site-packages/httpx/_transports/default.py", line 118, in map_httpcore_exceptions
raise mapped_exc(message) from exc
httpx.ReadTimeout: timed out
DEBUG:openai._base_client:2 retries left
INFO:openai._base_client:Retrying request to /chat/completions in 0.475531 seconds
DEBUG:openai._base_client:Request options: {'method': 'post', 'url': '/chat/completions', 'timeout': 1.0, 'files': None, 'idempotency_key': 'stainless-python-retry-583e0c32-bf69-4644-8654-9b148cf38734', 'json_data': {'messages': [{'role': 'system', 'content': '\n As a genius expert, your task is to understand the content and provide\n the parsed objects in json that match the following json_schema:\n\n\n {\n "properties": {\n "name": {\n "title": "Name",\n "type": "string"\n },\n "age": {\n "title": "Age",\n "type": "integer"\n }\n },\n "required": [\n "name",\n "age"\n ],\n "title": "Character",\n "type": "object"\n}\n\n Make sure to return an instance of the JSON, not the schema itself\n'}, {'role': 'user', 'content': 'Tell me about Harry Potter'}], 'model': 'mistral-small:24b-9k', 'response_format': {'type': 'json_object'}}}
DEBUG:openai._base_client:Sending HTTP Request: POST http://10.10.10.115:11434/v1/chat/completions
DEBUG:httpcore.connection:connect_tcp.started host='10.10.10.115' port=11434 local_address=None timeout=1.0 socket_options=None
DEBUG:httpcore.connection:connect_tcp.complete return_value=<httpcore._backends.sync.SyncStream object at 0x7be5adf56030>
DEBUG:httpcore.http11:send_request_headers.started request=<Request [b'POST']>
DEBUG:httpcore.http11:send_request_headers.complete
DEBUG:httpcore.http11:send_request_body.started request=<Request [b'POST']>
DEBUG:httpcore.http11:send_request_body.complete
DEBUG:httpcore.http11:receive_response_headers.started request=<Request [b'POST']>
DEBUG:httpcore.http11:receive_response_headers.failed exception=ReadTimeout(TimeoutError('timed out'))
DEBUG:httpcore.http11:response_closed.started
DEBUG:httpcore.http11:response_closed.complete
DEBUG:openai._base_client:Encountered httpx.TimeoutException
Traceback (most recent call last):
File "/usr/local/lib/python3.12/site-packages/httpx/_transports/default.py", line 101, in map_httpcore_exceptions
yield
File "/usr/local/lib/python3.12/site-packages/httpx/_transports/default.py", line 250, in handle_request
resp = self._pool.handle_request(req)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/httpcore/_sync/connection_pool.py", line 256, in handle_request
raise exc from None
File "/usr/local/lib/python3.12/site-packages/httpcore/_sync/connection_pool.py", line 236, in handle_request
response = connection.handle_request(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/httpcore/_sync/connection.py", line 103, in handle_request
return self._connection.handle_request(request)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/httpcore/_sync/http11.py", line 136, in handle_request
raise exc
File "/usr/local/lib/python3.12/site-packages/httpcore/_sync/http11.py", line 106, in handle_request
) = self._receive_response_headers(**kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/httpcore/_sync/http11.py", line 177, in _receive_response_headers
event = self._receive_event(timeout=timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/httpcore/_sync/http11.py", line 217, in _receive_event
data = self._network_stream.read(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/httpcore/_backends/sync.py", line 126, in read
with map_exceptions(exc_map):
^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/contextlib.py", line 158, in __exit__
self.gen.throw(value)
File "/usr/local/lib/python3.12/site-packages/httpcore/_exceptions.py", line 14, in map_exceptions
raise to_exc(exc) from exc
httpcore.ReadTimeout: timed out
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.12/site-packages/openai/_base_client.py", line 969, in request
response = self._client.send(
^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/httpx/_client.py", line 914, in send
response = self._send_handling_auth(
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/httpx/_client.py", line 942, in _send_handling_auth
response = self._send_handling_redirects(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/httpx/_client.py", line 979, in _send_handling_redirects
response = self._send_single_request(request)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/httpx/_client.py", line 1014, in _send_single_request
response = transport.handle_request(request)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/httpx/_transports/default.py", line 249, in handle_request
with map_httpcore_exceptions():
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/contextlib.py", line 158, in __exit__
self.gen.throw(value)
File "/usr/local/lib/python3.12/site-packages/httpx/_transports/default.py", line 118, in map_httpcore_exceptions
raise mapped_exc(message) from exc
httpx.ReadTimeout: timed out
DEBUG:openai._base_client:1 retry left
INFO:openai._base_client:Retrying request to /chat/completions in 0.785212 seconds
DEBUG:openai._base_client:Request options: {'method': 'post', 'url': '/chat/completions', 'timeout': 1.0, 'files': None, 'idempotency_key': 'stainless-python-retry-583e0c32-bf69-4644-8654-9b148cf38734', 'json_data': {'messages': [{'role': 'system', 'content': '\n As a genius expert, your task is to understand the content and provide\n the parsed objects in json that match the following json_schema:\n\n\n {\n "properties": {\n "name": {\n "title": "Name",\n "type": "string"\n },\n "age": {\n "title": "Age",\n "type": "integer"\n }\n },\n "required": [\n "name",\n "age"\n ],\n "title": "Character",\n "type": "object"\n}\n\n Make sure to return an instance of the JSON, not the schema itself\n'}, {'role': 'user', 'content': 'Tell me about Harry Potter'}], 'model': 'mistral-small:24b-9k', 'response_format': {'type': 'json_object'}}}
DEBUG:openai._base_client:Sending HTTP Request: POST http://10.10.10.115:11434/v1/chat/completions
DEBUG:httpcore.connection:connect_tcp.started host='10.10.10.115' port=11434 local_address=None timeout=1.0 socket_options=None
DEBUG:httpcore.connection:connect_tcp.complete return_value=<httpcore._backends.sync.SyncStream object at 0x7be5adf56b10>
DEBUG:httpcore.http11:send_request_headers.started request=<Request [b'POST']>
DEBUG:httpcore.http11:send_request_headers.complete
DEBUG:httpcore.http11:send_request_body.started request=<Request [b'POST']>
DEBUG:httpcore.http11:send_request_body.complete
DEBUG:httpcore.http11:receive_response_headers.started request=<Request [b'POST']>
DEBUG:httpcore.http11:receive_response_headers.complete return_value=(b'HTTP/1.1', 500, b'Internal Server Error', [(b'Content-Type', b'application/json'), (b'Date', b'Wed, 18 Jun 2025 17:32:37 GMT'), (b'Content-Length', b'119')])
INFO:httpx:HTTP Request: POST http://10.10.10.115:11434/v1/chat/completions "HTTP/1.1 500 Internal Server Error"
DEBUG:httpcore.http11:receive_response_body.started request=<Request [b'POST']>
DEBUG:httpcore.http11:receive_response_body.complete
DEBUG:httpcore.http11:response_closed.started
DEBUG:httpcore.http11:response_closed.complete
DEBUG:openai._base_client:HTTP Response: POST http://10.10.10.115:11434/v1/chat/completions "500 Internal Server Error" Headers({'content-type': 'application/json', 'date': 'Wed, 18 Jun 2025 17:32:37 GMT', 'content-length': '119'})
DEBUG:openai._base_client:request_id: None
DEBUG:openai._base_client:Encountered httpx.HTTPStatusError
Traceback (most recent call last):
File "/usr/local/lib/python3.12/site-packages/openai/_base_client.py", line 1014, in request
response.raise_for_status()
File "/usr/local/lib/python3.12/site-packages/httpx/_models.py", line 829, in raise_for_status
raise HTTPStatusError(message, request=request, response=self)
httpx.HTTPStatusError: Server error '500 Internal Server Error' for url 'http://10.10.10.115:11434/v1/chat/completions'
For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/500
DEBUG:openai._base_client:Re-raising status error
DEBUG:instructor:Retry error: RetryError[<Future at 0x7be5adf57aa0 state=finished raised InternalServerError>]
Traceback (most recent call last):
File "/usr/local/lib/python3.12/site-packages/instructor/retry.py", line 184, in retry_sync
response = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/openai/_utils/_utils.py", line 287, in wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/openai/resources/chat/completions/completions.py", line 925, in create
return self._post(
^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/openai/_base_client.py", line 1239, in post
return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/openai/_base_client.py", line 1034, in request
raise self._make_status_error_from_response(err.response) from None
openai.InternalServerError: Error code: 500 - {'error': {'message': 'unexpected server status: llm server loading model', 'type': 'api_error', 'param': None, 'code': None}}
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.12/site-packages/instructor/retry.py", line 179, in retry_sync
for attempt in max_retries:
^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/tenacity/__init__.py", line 445, in __iter__
do = self.iter(retry_state=retry_state)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/tenacity/__init__.py", line 378, in iter
result = action(retry_state)
^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/tenacity/__init__.py", line 421, in exc_check
raise retry_exc from fut.exception()
tenacity.RetryError: RetryError[<Future at 0x7be5adf57aa0 state=finished raised InternalServerError>]
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.12/runpy.py", line 198, in _run_module_as_main
return _run_code(code, main_globals, None,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/runpy.py", line 88, in _run_code
exec(code, run_globals)
File "/home/ai/.cursor-server/extensions/ms-python.debugpy-2025.8.0-linux-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher/../../debugpy/__main__.py", line 71, in <module>
cli.main()
File "/home/ai/.cursor-server/extensions/ms-python.debugpy-2025.8.0-linux-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 501, in main
run()
File "/home/ai/.cursor-server/extensions/ms-python.debugpy-2025.8.0-linux-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 351, in run_file
runpy.run_path(target, run_name="__main__")
File "/home/ai/.cursor-server/extensions/ms-python.debugpy-2025.8.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 310, in run_path
return _run_module_code(code, init_globals, run_name, pkg_name=pkg_name, script_name=fname)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ai/.cursor-server/extensions/ms-python.debugpy-2025.8.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 127, in _run_module_code
_run_code(code, mod_globals, init_globals, mod_name, mod_spec, pkg_name, script_name)
File "/home/ai/.cursor-server/extensions/ms-python.debugpy-2025.8.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 118, in _run_code
exec(code, run_globals)
File "/home/ai/dev/dev/model_validator/lithon/production_pipeline/retry.py", line 24, in <module>
resp = client.chat.completions.create(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/instructor/client.py", line 366, in create
return self.create_fn(
^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/instructor/patch.py", line 195, in new_create_sync
response = retry_sync(
^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/instructor/retry.py", line 210, in retry_sync
raise InstructorRetryException(
instructor.exceptions.InstructorRetryException: Error code: 500 - {'error': {'message': 'unexpected server status: llm server loading model', 'type': 'api_error', 'param': None, 'code': None}}
## Expected behavior

The request fails once the global timeout (1.0 s) elapses, instead of running through all remaining retry attempts.