Release/25.11.2#62
Merged
Merged
Conversation
Root cause: The query cache was holding full result sets in memory for up to 60 seconds (TTL), with up to 100 queries cached (10MB each). This meant up to 1GB of query results could accumulate in memory. Even when GC ran, it collected 0 objects because the cache legitimately held references to all the data. Memory would grow continuously as new queries were cached, and only slowly release as TTL expired. Evidence from production: - GC logs showing "collected 0 objects" - nothing to collect! - Memory growth pattern matching cache accumulation - Restart dropped memory, then climbed back up as cache filled Fixes: **api/query_cache.py**: 1. Reduced cache limits (50% reduction): - TTL: 60s → 30s (faster expiration) - Max entries: 100 → 50 (less accumulation) - Max result size: 10MB → 5MB (smaller items) - Max total cache: 200MB → 100MB (total cap reduced) 2. Aggressive cleanup thread: - Cleanup interval: 30s → 10s (3x more frequent) - Explicit `del cached_data` before removing from cache - Log GC results to monitor effectiveness 3. Explicit data deletion on eviction: - `del oldest_data` when evicting for size limits - `del oldest_data` when evicting for count limits - Ensures references are broken before dict removal This three-layer defense ensures cached data doesn't accumulate: 1. Lower limits = less data cached overall 2. Faster expiration = shorter retention time 3. Explicit deletion = immediate reference breaking for GC Expected impact: Memory usage should stabilize much faster and at lower levels, with cache cleanup logs showing actual objects collected. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
Root cause: Python's memory allocator (pymalloc) keeps freed memory in internal arenas for future allocations rather than returning it to the OS. This causes RSS (Resident Set Size) to continuously grow even when objects are properly freed and GC runs successfully. This explains why GC was collecting 0 objects but memory kept growing: 1. Query executes, allocates 10MB for results 2. Results sent to client, references deleted 3. GC runs, frees Python objects 4. Memory stays in Python's arena (not returned to OS!) 5. Next query allocates new 10MB (can't reuse arena for large objects) 6. RSS grows by 10MB with each query cycle Evidence: - GC logs showing "collected 0 objects" - GC working correctly! - Memory growing steadily despite no object accumulation - Pattern matches Python arena fragmentation Solution: Call malloc_trim(0) after GC malloc_trim() is a glibc function that releases freed memory from the heap back to the operating system. After gc.collect() frees Python objects, malloc_trim() ensures that memory is actually returned to the OS rather than kept in Python's memory pool. Changes: **api/main.py**: - After gc.collect(), call malloc_trim(0) on Linux - Logs "returned memory to OS" for monitoring - Gracefully handles non-Linux platforms **api/query_cache.py**: - After cache cleanup GC, call malloc_trim(0) - Ensures cache cleanup actually frees OS memory - Critical for periodic cleanup thread This is the final piece of the memory leak puzzle: 1. Fixed query result handling (del rows) 2. Fixed cache accumulation (lower limits, faster expiration) 3. Fixed memory arena fragmentation (malloc_trim) Expected impact: RSS should stabilize and even decrease after queries complete, rather than continuously growing. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
Root cause: DuckDB cursor objects were never being explicitly closed, causing internal buffers to accumulate in memory. The pattern was: ```python rows = conn.execute(sql).fetchall() ``` This creates a cursor object from `conn.execute(sql)`, fetches all rows, but never closes the cursor. Even after `fetchall()` and `del rows`, the cursor object holds references to DuckDB's internal result buffers. These buffers accumulate across queries, causing steady memory growth that GC cannot collect (they're managed by DuckDB's C++ layer, not Python). Solution: Explicitly close cursor after fetching results ```python cursor = conn.execute(sql) rows = cursor.fetchall() # ... process data ... cursor.close() # Release DuckDB internal buffers del cursor ``` This ensures DuckDB's C++ result buffers are freed immediately after the query data is extracted, rather than accumulating until the connection is closed/reset. Expected impact: Memory should stabilize immediately. Each query will fully clean up after itself, preventing the steady growth pattern. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
This script tests three hypotheses: 1. DuckDB cursor.close() - does it matter? 2. FastAPI response objects - are they collectible? 3. Tracemalloc - where are allocations happening? Run in production container: docker exec -it <container> python3 debug_memory.py 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
Test results show GC IS working (collecting objects) but RSS doesn't decrease because Python's memory allocator doesn't return freed memory to the OS. This confirms malloc_trim() is the correct solution. On Linux, malloc_trim() will force memory to be returned to OS after GC. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
When a query fails and closes the connection, the pool cleanup was trying to execute SELECT 1 on the closed connection, causing "Connection already closed!" errors on subsequent queries. Now if a connection is invalid during cleanup, we: 1. Close it properly 2. Create a fresh connection 3. Return the fresh connection to the pool This ensures the pool always contains valid connections and prevents cascading connection errors. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
No description provided.