
Conversation

@giladfrid009
Contributor

  • Do not add extra load to the server while performing the health check by sending inference requests; instead, check that the actual model is loaded on the server via openai_client.models.retrieve (see the sketch after this list)
  • Added a configurable timeout for the health check via the env variable "ART_SERVER_MONITOR_TIMEOUT"
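
A minimal sketch of the idea, assuming an `AsyncOpenAI` client; the function name `check_model_loaded` and the default timeout are illustrative, not the PR's actual code:

```python
import asyncio
import os

from openai import AsyncOpenAI

# Assumed default; the value actually used by LocalBackend may differ.
DEFAULT_MONITOR_TIMEOUT = 10.0


async def check_model_loaded(openai_client: AsyncOpenAI, model_name: str) -> None:
    """Verify the model is registered on the server without sending an
    inference request, so the health check adds no generation load.

    openai_client.models.retrieve raises openai.NotFoundError if the
    model is not loaded; asyncio.wait_for raises TimeoutError if the
    server does not respond in time.
    """
    timeout = float(
        os.environ.get("ART_SERVER_MONITOR_TIMEOUT", DEFAULT_MONITOR_TIMEOUT)
    )
    await asyncio.wait_for(openai_client.models.retrieve(model_name), timeout=timeout)
```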

@giladfrid009 changed the title from "local backend. monitor OpenAI server() improvement" to "LocalBackend._monitor_openai_server improvement" on Sep 2, 2025
@giladfrid009
Contributor Author

I added the ART_SERVER_MONITOR_TIMEOUT option because I ran into several unexplained timeouts from this function and ended up commenting it out entirely in my own source. I think ART_SERVER_MONITOR_TIMEOUT might be a good middle ground (although it does introduce an additional env variable).

Possible modification: ART_SERVER_MONITOR_TIMEOUT=-1 (or any negative value) disables the timeout altogether.
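
A sketch of that parsing rule, using a hypothetical helper (the merged PR may not implement this):

```python
import os


def monitor_timeout(default: float = 10.0) -> float | None:
    """Read ART_SERVER_MONITOR_TIMEOUT from the environment.

    A negative value disables the timeout by returning None, which
    asyncio.wait_for interprets as "wait indefinitely".
    """
    raw = os.environ.get("ART_SERVER_MONITOR_TIMEOUT")
    if raw is None:
        return default
    value = float(raw)
    return None if value < 0 else value
```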

@giladfrid009
Contributor Author

@bradhilton requesting review :)

@bradhilton
Collaborator


LGTM

@bradhilton merged commit 78bfef1 into OpenPipe:main on Sep 28, 2025
1 check passed