[Fix] Replace backoff with tenacity and enhance database connection handling #167

Merged
JonnyTran merged 5 commits into develop from fix/db-connection-doctor
Nov 23, 2025
@JonnyTran
Member

Summary

This PR improves database connection reliability and error handling by replacing the backoff library with tenacity and implementing enhanced retry mechanisms for database operations.

Changes

Dependency Updates

  • Replaced backoff with tenacity in pyproject.toml for more robust retry logic

Database Reliability Improvements

  • New db_retry_policy decorator: Replaced the custom retry decorator with a more robust retry policy built on tenacity
  • Enhanced error detection: Improved connection error filtering to only retry on actual connection issues, not SQL syntax errors
  • Better logging: Added connection pool status logging for debugging connection issues
  • Exponential backoff: Implemented proper exponential backoff with configurable wait times (0.1s to 2s max)

Database Health Checks

  • Workspace doctor enhancement: Added comprehensive database connection health checks to the workspace doctor endpoint
  • Autofix capability: Implemented automatic connection pool reset when database becomes unresponsive
  • PostgreSQL monitoring: Added detailed connection pool statistics and stale transaction detection
  • Safety timeout: Added 3-second timeout for health checks to prevent hanging
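
As an illustration of the PostgreSQL monitoring described above, the sketch below classifies session rows into active, stale idle, and stale transaction counts. It is not the PR's code: `SessionRow`, `summarize_sessions`, and both thresholds are hypothetical, and the rows are assumed to have been read from the `pg_stat_activity` view (its `state` column, plus the time since `state_change`).

```python
from dataclasses import dataclass
from typing import Iterable

# Hypothetical thresholds in seconds; the PR does not state the values it uses.
STALE_IDLE_AFTER_S = 300.0
STALE_TRANSACTION_AFTER_S = 60.0


@dataclass(frozen=True)
class SessionRow:
    """One session as it might be read from pg_stat_activity."""

    state: str  # e.g. "active", "idle", "idle in transaction"
    seconds_in_state: float  # now() - state_change, in seconds


def summarize_sessions(rows: Iterable[SessionRow]) -> dict[str, int]:
    """Count active sessions and sessions that have sat idle for too long."""
    summary = {"active": 0, "stale_idle": 0, "stale_transaction": 0}
    for row in rows:
        if row.state == "active":
            summary["active"] += 1
        elif row.state == "idle" and row.seconds_in_state > STALE_IDLE_AFTER_S:
            summary["stale_idle"] += 1
        elif (
            # Also matches "idle in transaction (aborted)".
            row.state.startswith("idle in transaction")
            and row.seconds_in_state > STALE_TRANSACTION_AFTER_S
        ):
            summary["stale_transaction"] += 1
    return summary
```

Keeping the classification separate from the query makes it testable without a running PostgreSQL instance.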

Code Simplification

  • Unified retry logic: Consolidated retry decorators across multiple modules in the codebase
  • Cleaner error handling: Removed nested retry wrappers in favor of declarative decorators

Technical Details

Retry Policy Configuration

  • Retry condition: Only retries on connection-related errors (OperationalError, DBAPIError, DisconnectionError, etc.)
  • Wait strategy: Exponential backoff with 0.1s initial delay, max 2s delay
  • Max attempts: 3 in total (the initial call plus up to 2 retries)
  • Logging: Warns on retry attempts and logs connection pool status

Database Health Check Features

  • Liveness check: Simple query to verify connection
  • PostgreSQL deep inspection: Monitors active connections and detects stale transactions
  • Pool reset: Automatically disposes connection pool on timeout/connection errors
  • Cross-database support: Handles PostgreSQL, SQLite, and other database types
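
The liveness check, safety timeout, and pool reset can be sketched together using only the standard library. Everything named here is an assumption for illustration: `check_database`, `sqlite_probe`, and the `reset_pool` callback are hypothetical, and in the real code the reset would presumably dispose the application's connection pool.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout
from typing import Callable

HEALTH_CHECK_TIMEOUT_S = 3.0  # the 3-second safety timeout from the PR description


def sqlite_probe() -> None:
    """Simple liveness query against an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("SELECT 1").fetchone()
    finally:
        conn.close()


def check_database(
    run_probe: Callable[[], object],
    reset_pool: Callable[[], None],
    timeout: float = HEALTH_CHECK_TIMEOUT_S,
) -> dict:
    """Run the probe in a worker thread; reset the pool on timeout or connection error."""
    executor = ThreadPoolExecutor(max_workers=1)
    future = executor.submit(run_probe)
    try:
        future.result(timeout=timeout)
        return {"status": "ok", "autofix": None}
    except FutureTimeout:
        reset_pool()  # autofix: drop connections that may be hung
        return {"status": "timeout", "autofix": "pool_reset"}
    except (ConnectionError, sqlite3.OperationalError) as exc:
        reset_pool()
        return {"status": "error", "detail": str(exc), "autofix": "pool_reset"}
    finally:
        # Do not block the caller on a probe that is still hanging.
        executor.shutdown(wait=False)
```

Running the probe in a worker thread lets the caller return after the timeout even if the query itself never completes; the worker thread is left to finish on its own.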

Related Issues

Testing

  • All existing tests pass with the new retry mechanism
  • Database health checks tested with various connection failure scenarios
  • Retry logic validated with simulated network interruptions

Migration Notes

  • No breaking changes to existing APIs
  • Improved error messages and logging for better debugging
  • Automatic pool reset may cause temporary connection recreation (expected behavior)

- Implemented detailed monitoring for PostgreSQL and SQLite database connections within the workspace doctor command.
- Added checks for active connections, stale idle connections, and stale transaction connections for PostgreSQL.
- Included a simple connectivity test for SQLite.
- Enhanced error handling and reporting for database connection issues.
- Removed the redundant worker flag from the development mode in the server start script.
- Adjusted the maximum overflow for PostgreSQL database connections from 10 to 5.
- Changed the PostgreSQL connection pool recycle timeout from 300 to 240 seconds for improved resource management.
…d connection error handling

- Updated database operation retry logic across multiple modules to use the new db_retry_policy decorator.
- Simplified retry logic in user and dataset operations by removing redundant inner functions.
- Enhanced logging of connection pool status during retry attempts for better debugging.
… handling

- Removed the backoff library and replaced its usage with tenacity for retrying connection attempts in various modules.
- Updated the retry decorators to enhance error handling for database and search engine connections.
- Cleaned up the pdm.lock and pyproject.toml files by removing the backoff dependency.
@JonnyTran JonnyTran requested a review from a team as a code owner November 23, 2025 20:54
@JonnyTran JonnyTran merged commit 79e0032 into develop Nov 23, 2025
1 check passed
@JonnyTran JonnyTran deleted the fix/db-connection-doctor branch November 23, 2025 20:59