Implement health check backend API and storage functionality #11678
- Introduced methods for saving health check results to the database, including validation and cleaning of data.
- Added new health check endpoints to retrieve health check history and latest health statuses for models.
- Updated model prices and context window configuration for new Azure transcription models.

- Introduced tests for PrismaClient health check methods, including saving results and retrieving health check history.
- Added tests for the `_save_health_check_to_db` function to ensure proper handling of healthy and unhealthy endpoints.
- Implemented mock objects to simulate database interactions and validate method behaviors.
)

# Optionally save health check result to database (non-blocking)
if prisma_client is not None and target_model is not None:
    await _save_health_check_to_db(
use asyncio.create_task so we don't need to wait for writing to the DB
litellm/proxy/utils.py
Outdated
if details is not None and isinstance(details, dict):
    try:
        # Serialize and deserialize to ensure valid JSON and remove unsupported values
        serialized = json.dumps(details, default=str)
use `safe_dumps` in safe_json_dumps.py
litellm/proxy/utils.py
Outdated
try:
    # Serialize and deserialize to ensure valid JSON and remove unsupported values
    serialized = json.dumps(details, default=str)
    clean_details = json.loads(serialized)
use safe_json_loads.py
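For context, the round trip under review can be sketched with the standard library alone. This is only an illustration of the serialize-then-deserialize cleaning step from the diff; the reviewer asks for the project's safe JSON helpers instead, and their signatures are not shown in this thread, so they are not used here. The function name `clean_details` is an assumption.

```python
import json
from datetime import datetime
from typing import Any, Optional


def clean_details(details: Any) -> Optional[dict]:
    """Return a JSON-safe copy of `details`, or None if it cannot be cleaned."""
    if not isinstance(details, dict):
        return None
    try:
        # default=str converts values json cannot encode (e.g. datetime)
        # into their string form; loads() then yields plain JSON types.
        return json.loads(json.dumps(details, default=str))
    except (TypeError, ValueError):
        # TypeError: unsupported dict keys; ValueError: circular references.
        return None


cleaned = clean_details({"checked": datetime(2025, 1, 1), "retries": 1})
```

Note that `default=str` only applies to values, not to dictionary keys, which is why the `TypeError` branch is still needed.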
- Updated health endpoint to use `get_deployment` for retrieving model names based on model IDs, enhancing error handling for missing models.
- Changed health check result saving to the database to be non-blocking by using `asyncio.create_task`.
- Cleaned up code for better readability and maintainability.
…nd error handling
- Removed unused imports and simplified exception handling in `_get_redoc_url` and `_get_docs_url` functions to manage circular imports.
- Cleaned up logging statements for consistency and clarity.
- Streamlined error message formatting in `handle_exception_on_proxy` function.

… improved clarity and robustness
- Added type hints for `_end_user_list_transactions` to specify it as a dictionary mapping end user IDs to spend amounts.
- Updated default values for optional fields in `SpendLogsPayload` to ensure they are initialized properly, enhancing error handling.
- Refactored `_premium_user_check` function to improve model validation logic and error handling.

- Updated the `disable_spend_updates` method to return `False` if the environment variable `DISABLE_SPEND_UPDATES` is not set or is `None`, improving robustness in configuration handling.
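The behavior described in the last commit above can be sketched as follows. Only the "unset means `False`" rule comes from the commit message; treating the string `"true"` (case-insensitive) as enabled is an assumption, and the `env` parameter is added here purely to make the function testable.

```python
import os
from typing import Mapping, Optional


def disable_spend_updates(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True only when DISABLE_SPEND_UPDATES is explicitly enabled."""
    env = os.environ if env is None else env
    raw = env.get("DISABLE_SPEND_UPDATES")
    if raw is None:
        # Variable not set: spend updates stay enabled.
        return False
    # Assumption: only the literal "true" (any casing) disables updates.
    return raw.strip().lower() == "true"
```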
- Enhanced the `join_paths` function to better manage leading and trailing slashes, ensuring correct path concatenation.
- Added logic to handle cases where either `base_path` or `route` is empty, improving robustness and usability.
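A rough sketch of the slash handling this commit describes. The parameter names `base_path` and `route` come from the commit message; the exact return values for empty inputs are assumptions, not the PR's confirmed behavior.

```python
def join_paths(base_path: str, route: str) -> str:
    """Join two URL path segments with exactly one slash between them."""
    base = base_path.rstrip("/")
    tail = route.lstrip("/")
    if not base and not tail:
        return "/"  # assumption: two empty inputs resolve to the root path
    if not base:
        return "/" + tail
    if not tail:
        return base
    return f"{base}/{tail}"
```

For example, `join_paths("/api/", "/health")` and `join_paths("/api", "health")` both produce `/api/health` under these assumptions.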
- Introduced a new method `_save_health_check_to_db` for saving health check results to the database, utilizing safe JSON functions for data integrity.
- Refactored existing health check methods to streamline the process and improve error logging.
- Updated email sending logic to ensure secure connections and better error handling.
- Improved spend update logic with batch processing and retry mechanisms for database operations.
- Added utility functions for projected spend calculations and enhanced validation for team configurations.

- Introduced `save_health_check_result` method to save health check results with detailed logging and validation.
- Added `get_health_check_history` method for retrieving health check records with optional filtering.
- Implemented `get_all_latest_health_checks` method to fetch the latest health checks for each model.
- Enhanced error handling and logging for all new methods to improve reliability and traceability.
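To illustrate the two read paths named in the commit above, here is an in-memory sketch. The real methods query the database through Prisma; the record fields `model_name` and `checked_at`, and the `limit` parameter, are hypothetical.

```python
from datetime import datetime
from typing import Dict, List, Optional


def get_health_check_history(
    records: List[dict], model_name: Optional[str] = None, limit: int = 100
) -> List[dict]:
    """Newest-first history, optionally filtered by model name."""
    rows = [r for r in records if model_name is None or r["model_name"] == model_name]
    rows.sort(key=lambda r: r["checked_at"], reverse=True)
    return rows[:limit]


def get_all_latest_health_checks(records: List[dict]) -> Dict[str, dict]:
    """Keep only the most recent record for each model."""
    latest: Dict[str, dict] = {}
    for record in records:
        current = latest.get(record["model_name"])
        if current is None or record["checked_at"] > current["checked_at"]:
            latest[record["model_name"]] = record
    return latest


records = [
    {"model_name": "a", "status": "healthy", "checked_at": datetime(2025, 1, 1)},
    {"model_name": "a", "status": "unhealthy", "checked_at": datetime(2025, 1, 2)},
    {"model_name": "b", "status": "healthy", "checked_at": datetime(2025, 1, 1)},
]
```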
- Updated the `_save_health_check_to_db` function to call `save_health_check_result` with explicitly typed arguments instead of a dictionary spread, enhancing code clarity and type safety.
- Removed unused method bindings in the mock Prisma client tests to streamline the test setup.

…reamline code and improve maintainability.

…ck result saving
- Added `_validate_response_time` method to ensure response time values are valid and handle exceptions gracefully.
- Introduced `_clean_details` method to validate and clean details JSON, improving data integrity.
- Refactored `save_health_check_result` to utilize these new methods for optional fields, enhancing code clarity and maintainability.
- Updated tests to bind new methods to the mock Prisma client for comprehensive testing.
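One way a response-time validator along these lines could look. The commit message only says that invalid values are handled gracefully; the specific rules below (reject booleans, NaN, infinities, and negatives, and return `None` instead of raising) are assumptions made for this sketch.

```python
import math
from typing import Any, Optional


def validate_response_time(value: Any) -> Optional[float]:
    """Return the response time as a float, or None if it is not usable."""
    if value is None or isinstance(value, bool):
        return None
    try:
        milliseconds = float(value)
    except (TypeError, ValueError):
        # Non-numeric input such as "fast" or a list.
        return None
    if math.isnan(milliseconds) or math.isinf(milliseconds) or milliseconds < 0:
        return None
    return milliseconds
```

Returning `None` rather than raising keeps a malformed optional field from blocking the rest of the record from being saved.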
- Introduced `_convert_health_check_to_dict` to standardize health check record conversion to dictionary format for JSON responses.
- Added `_check_prisma_client` helper function to streamline database availability checks and improve error handling.
- Refactored health check endpoints to utilize the new utility functions, enhancing code clarity and maintainability.
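The two helpers can be approximated like this. The record shape (`HealthCheckRecord` and its fields) and the `DatabaseUnavailableError` exception are hypothetical; the actual endpoints presumably raise an HTTP error through the web framework, which is left out here to keep the sketch dependency-free.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Optional


class DatabaseUnavailableError(RuntimeError):
    """Hypothetical error raised when no database client is configured."""


@dataclass
class HealthCheckRecord:  # hypothetical record shape
    model_name: str
    status: str
    response_time_ms: Optional[float]
    checked_at: datetime


def check_prisma_client(client: Any) -> Any:
    """Fail fast with one consistent error when the database is not connected."""
    if client is None:
        raise DatabaseUnavailableError("Database not connected")
    return client


def convert_health_check_to_dict(record: HealthCheckRecord) -> dict:
    """Convert a record into JSON-serializable primitives."""
    return {
        "model_name": record.model_name,
        "status": record.status,
        "response_time_ms": record.response_time_ms,
        "checked_at": record.checked_at.isoformat(),  # datetime -> ISO 8601 string
    }


record = HealthCheckRecord("gpt-4o", "healthy", 12.5, datetime(2025, 1, 1, tzinfo=timezone.utc))
payload = json.dumps(convert_health_check_to_dict(record))
```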
- Simplified the mock PrismaClient setup by consolidating method bindings.
- Updated health check result saving tests to use parameterized scenarios for better coverage.
- Added tests for health check history retrieval and graceful handling when no database client is provided.
- Removed redundant mock functions to streamline the test suite.

- Added `_perform_health_check_and_save` to encapsulate health check execution and optional database saving.
- Refactored health endpoint logic to utilize the new helper function, improving code clarity and reducing redundancy.
- Enhanced error handling and streamlined the process of saving health check results to the database.
LGTM !
…#11678)
* feat: Add health check functionality and endpoints
  - Introduced methods for saving health check results to the database, including validation and cleaning of data.
  - Added new health check endpoints to retrieve health check history and latest health statuses for models.
  - Updated model prices and context window configuration for new Azure transcription models.
* test: Add unit tests for health check functionality
  - Introduced tests for PrismaClient health check methods, including saving results and retrieving health check history.
  - Added tests for the _save_health_check_to_db function to ensure proper handling of healthy and unhealthy endpoints.
  - Implemented mock objects to simulate database interactions and validate method behaviors.
* Refactor health endpoint model ID handling and improve logging
  - Updated health endpoint to use `get_deployment` for retrieving model names based on model IDs, enhancing error handling for missing models.
  - Changed health check result saving to the database to be non-blocking by using `asyncio.create_task`.
  - Cleaned up code for better readability and maintainability.
* Refactor utility functions in proxy module for improved readability and error handling
  - Removed unused imports and simplified exception handling in `_get_redoc_url` and `_get_docs_url` functions to manage circular imports.
  - Cleaned up logging statements for consistency and clarity.
  - Streamlined error message formatting in `handle_exception_on_proxy` function.
* Enhance type hinting and default values in ProxyUpdateSpend class for improved clarity and robustness
  - Added type hints for `_end_user_list_transactions` to specify it as a dictionary mapping end user IDs to spend amounts.
  - Updated default values for optional fields in `SpendLogsPayload` to ensure they are initialized properly, enhancing error handling.
  - Refactored `_premium_user_check` function to improve model validation logic and error handling.
* Fix disable_spend_updates method to handle None return value gracefully
  - Updated the disable_spend_updates method to return False if the environment variable DISABLE_SPEND_UPDATES is not set or is None, improving robustness in configuration handling.
* Refactor join_paths function in utils.py for improved path handling
  - Enhanced the join_paths function to better manage leading and trailing slashes, ensuring correct path concatenation.
  - Added logic to handle cases where either base_path or route is empty, improving robustness and usability.
* Enhance health check functionality and improve error handling
  - Introduced a new method `_save_health_check_to_db` for saving health check results to the database, utilizing safe JSON functions for data integrity.
  - Refactored existing health check methods to streamline the process and improve error logging.
  - Updated email sending logic to ensure secure connections and better error handling.
  - Improved spend update logic with batch processing and retry mechanisms for database operations.
  - Added utility functions for projected spend calculations and enhanced validation for team configurations.
* Add health check methods for database interaction
  - Introduced `save_health_check_result` method to save health check results with detailed logging and validation.
  - Added `get_health_check_history` method for retrieving health check records with optional filtering.
  - Implemented `get_all_latest_health_checks` method to fetch the latest health checks for each model.
  - Enhanced error handling and logging for all new methods to improve reliability and traceability.
* Refactor health check result saving to use typed arguments
  - Updated the `_save_health_check_to_db` function to call `save_health_check_result` with explicitly typed arguments instead of a dictionary spread, enhancing code clarity and type safety.
  - Removed unused method bindings in the mock Prisma client tests to streamline the test setup.
* Remove unused `_save_health_check_to_db` function from utils.py to streamline code and improve maintainability.
* Implement response time validation and details cleaning in health check result saving
  - Added `_validate_response_time` method to ensure response time values are valid and handle exceptions gracefully.
  - Introduced `_clean_details` method to validate and clean details JSON, improving data integrity.
  - Refactored `save_health_check_result` to utilize these new methods for optional fields, enhancing code clarity and maintainability.
  - Updated tests to bind new methods to the mock Prisma client for comprehensive testing.
* Add health check utility functions and refactor existing endpoints
  - Introduced `_convert_health_check_to_dict` to standardize health check record conversion to dictionary format for JSON responses.
  - Added `_check_prisma_client` helper function to streamline database availability checks and improve error handling.
  - Refactored health check endpoints to utilize the new utility functions, enhancing code clarity and maintainability.
* Refactor health check tests for improved clarity and coverage
  - Simplified the mock PrismaClient setup by consolidating method bindings.
  - Updated health check result saving tests to use parameterized scenarios for better coverage.
  - Added tests for health check history retrieval and graceful handling when no database client is provided.
  - Removed redundant mock functions to streamline the test suite.
* Implement helper function for health check and database saving
  - Added `_perform_health_check_and_save` to encapsulate health check execution and optional database saving.
  - Refactored health endpoint logic to utilize the new helper function, improving code clarity and reducing redundancy.
  - Enhanced error handling and streamlined the process of saving health check results to the database.
…erriAI#11678)" This reverts commit 5f34cee.
Title
Implement health check backend API and storage functionality
Relevant issues
Pre-Submission checklist
Please complete all items before asking a LiteLLM maintainer to review your PR
- `tests/litellm/` directory. Adding at least 1 test is a hard requirement - see details
- `make test-unit`
Type
🆕 New Feature
Changes
Backend API Enhancements
- `litellm/proxy/health_endpoints/_health_endpoints.py` with comprehensive health check functionality
- `litellm/proxy/utils.py`
Key Features
Screenshot of passing tests:

Dependencies: