CHANGELOG.md (33 additions, 0 deletions)
@@ -5,6 +5,39 @@
All notable changes to Memori will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [2.3.1] - 2025-10-03

### ⚡ **Performance & Stability Improvements**

**Patch Release**: Performance optimizations for remote databases and more robust memory processing, delivered through caching, connection pooling, and managed background tasks.

#### 🚀 **Performance Enhancements**
- **Context Caching**: Added an in-process context cache (500 items, 10min TTL) to cut redundant database queries
- **Search Result Caching**: Added a TTL cache for search queries (2000 items, 10min TTL, 10MB per entry)
- **Thread Pool Executor**: Replaced ad-hoc threading with a managed thread pool (max 3 workers) for background memory processing
- **SQLite Optimizations**: Enabled WAL mode, raised the cache to 64MB, and enabled memory-mapped I/O (256MB) for 2-3x better write performance; see the sketch after this list
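
A minimal sketch of the caching and SQLite settings above, assuming the `cachetools` package for the TTL cache; the function names and the `synchronous=NORMAL` pragma are illustrative, not Memori's actual code:

```python
# Illustrative only: cache sizes and PRAGMAs mirror the changelog entries above,
# but the surrounding names (get_context, tune_sqlite) are hypothetical.
import sqlite3

from cachetools import TTLCache  # assumed dependency for this sketch

# Context cache: 500 entries, evicted after 10 minutes.
context_cache: TTLCache = TTLCache(maxsize=500, ttl=600)

def get_context(session_id: str, loader) -> dict:
    """Return cached context for a session, hitting the database only on a miss."""
    if session_id not in context_cache:
        context_cache[session_id] = loader(session_id)
    return context_cache[session_id]

def tune_sqlite(conn: sqlite3.Connection) -> None:
    """Apply the per-connection PRAGMA settings listed above."""
    conn.execute("PRAGMA journal_mode=WAL")     # write-ahead log: readers don't block writers
    conn.execute("PRAGMA cache_size=-65536")    # negative means KiB, so a 64MB page cache
    conn.execute("PRAGMA mmap_size=268435456")  # 256MB of memory-mapped I/O
    conn.execute("PRAGMA synchronous=NORMAL")   # assumption: the usual companion to WAL
```

WAL plus a larger page cache is the likely source of the 2-3x write figure: commits append to the log instead of rewriting database pages in place.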

#### 🔧 **Database Connection Improvements**
- **PostgreSQL Pooling**: Tuned the connection pool (size: 10, max_overflow: 20) with a faster recycle time (10min) for remote databases; see the sketch after this list
- **Pooled Connection Support**: Added detection and configuration for pooled PostgreSQL providers such as Neon and Supabase
- **Connection Timeouts**: Added explicit timeouts (10s connect, 30s query) so remote calls fail fast instead of hanging
- **Generic SQL Pooling**: Applied conservative pool defaults (size: 5, max_overflow: 10, recycle: 30min) to all other SQL databases
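
A hedged sketch of these pool settings with SQLAlchemy; the keyword arguments are standard SQLAlchemy/psycopg2 options, but how Memori actually constructs its engine (and whether it uses `pool_pre_ping`) is an assumption:

```python
from sqlalchemy import create_engine

# Placeholder URL; pool numbers mirror the PostgreSQL bullet above.
engine = create_engine(
    "postgresql+psycopg2://user:password@host:5432/memori",
    pool_size=10,        # persistent connections kept open
    max_overflow=20,     # extra connections permitted under burst load
    pool_recycle=600,    # recycle after 10min so remote/pooled hosts don't drop us
    pool_pre_ping=True,  # assumption: pre-ping detects stale pooled connections (Neon, Supabase)
    connect_args={
        "connect_timeout": 10,                    # 10s connect timeout
        "options": "-c statement_timeout=30000",  # 30s per-query timeout (libpq option)
    },
)
```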

#### 🛡️ **Stability & Safety**
- **Graceful Shutdown**: Implemented orderly thread pool shutdown with a 5s timeout and fallback termination; see the sketch after this list
- **Cache Cleanup**: Added cache shutdown handlers to prevent resource leaks
- **Logging Order Fix**: Logging is now initialized before any other operation runs
- **Error Handling**: Cache operations now degrade gracefully instead of raising on failure
- **Third-Party Logger Control**: Verbose third-party loggers (SQLAlchemy, OpenAI) are always silenced, avoiding a 30-70% performance overhead
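
A sketch of the shutdown and logger behavior, under assumed names: `concurrent.futures.wait` provides the 5s budget and `cancel_futures` (Python 3.9+) the fallback termination. The exact logger names beyond `sqlalchemy` and `openai` are assumptions.

```python
import concurrent.futures
import logging

executor = concurrent.futures.ThreadPoolExecutor(max_workers=3)
pending: set[concurrent.futures.Future] = set()  # hypothetical registry of in-flight tasks

def shutdown_background_tasks(timeout: float = 5.0) -> None:
    """Wait up to `timeout` seconds for in-flight work, then cancel the rest."""
    _done, not_done = concurrent.futures.wait(pending, timeout=timeout)
    for future in not_done:
        future.cancel()  # fallback termination for tasks that overran the budget
    executor.shutdown(wait=False, cancel_futures=True)

# Silence chatty third-party loggers up front; per-record DEBUG formatting can
# plausibly account for the 30-70% overhead quoted above.
for name in ("sqlalchemy.engine", "openai", "httpx"):
    logging.getLogger(name).setLevel(logging.WARNING)
```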

#### 📊 **Performance Impact**
- **Remote DB Latency**: Reduced network round trips by 60-80% through caching
- **SQLite Throughput**: 2-3x write performance improvement with WAL mode
- **Memory Processing**: More predictable background task execution with thread pool
- **Resource Management**: Better cleanup and resource utilization

---

## [2.3.0] - 2025-09-29

### 🚀 **Major Performance Improvements**
memori/__init__.py (1 addition, 1 deletion)
@@ -5,7 +5,7 @@
 management, and modular architecture for production AI systems.
 """

-__version__ = "2.3.0"
+__version__ = "2.3.1"
 __author__ = "Harshal More"
 __email__ = "harshalmore2468@gmail.com"

memori/agents/retrieval_agent.py (35 additions, 31 deletions)
@@ -15,6 +15,7 @@
 if TYPE_CHECKING:
     from ..core.providers import ProviderConfig

+from ..integrations.openai_integration import suppress_auto_recording
 from ..utils.pydantic_models import MemorySearchQuery

@@ -140,18 +141,19 @@ def plan_search(self, query: str, context: str | None = None) -> MemorySearchQuery:
         if self._supports_structured_outputs:
             try:
                 # Call OpenAI Structured Outputs
-                completion = self.client.beta.chat.completions.parse(
-                    model=self.model,
-                    messages=[
-                        {"role": "system", "content": self.SYSTEM_PROMPT},
-                        {
-                            "role": "user",
-                            "content": prompt,
-                        },
-                    ],
-                    response_format=MemorySearchQuery,
-                    temperature=0.1,
-                )
+                with suppress_auto_recording():
+                    completion = self.client.beta.chat.completions.parse(
+                        model=self.model,
+                        messages=[
+                            {"role": "system", "content": self.SYSTEM_PROMPT},
+                            {
+                                "role": "user",
+                                "content": prompt,
+                            },
+                        ],
+                        response_format=MemorySearchQuery,
+                        temperature=0.1,
+                    )

                 # Handle potential refusal
                 if completion.choices[0].message.refusal:
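
The `suppress_auto_recording` helper imported above is not shown in this diff. A plausible sketch of such a guard, assuming a contextvar-based flag that Memori's recording hooks would check (the real implementation in `memori/integrations/openai_integration.py` may differ):

```python
import contextvars
from contextlib import contextmanager

# Hypothetical context-local flag; recording hooks would consult it before
# persisting a conversation, so the agent's internal LLM calls are skipped.
_suppress_recording = contextvars.ContextVar("memori_suppress_recording", default=False)

@contextmanager
def suppress_auto_recording():
    """Disable automatic conversation recording for calls made inside the block."""
    token = _suppress_recording.set(True)
    try:
        yield
    finally:
        _suppress_recording.reset(token)
```

Each hunk in this file wraps one of the retrieval agent's own completions in this guard, so planning calls are not ingested as user conversations.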
@@ -599,13 +601,14 @@ class TestModel(BaseModel):
                 test_field: str

             # Try to make a structured output call
-            test_response = self.client.beta.chat.completions.parse(
-                model=self.model,
-                messages=[{"role": "user", "content": "Say hello"}],
-                response_format=TestModel,
-                max_tokens=10,
-                temperature=0,
-            )
+            with suppress_auto_recording():
+                test_response = self.client.beta.chat.completions.parse(
+                    model=self.model,
+                    messages=[{"role": "user", "content": "Say hello"}],
+                    response_format=TestModel,
+                    max_tokens=10,
+                    temperature=0,
+                )

             if (
                 test_response
@@ -647,18 +650,19 @@ def _plan_search_with_fallback_parsing(self, query: str) -> MemorySearchQuery:
         json_system_prompt += "\n\nRespond ONLY with the JSON object, no additional text or formatting."

         # Call regular chat completions
-        completion = self.client.chat.completions.create(
-            model=self.model,
-            messages=[
-                {"role": "system", "content": json_system_prompt},
-                {
-                    "role": "user",
-                    "content": prompt,
-                },
-            ],
-            temperature=0.1,
-            max_tokens=1000,  # Ensure enough tokens for full response
-        )
+        with suppress_auto_recording():
+            completion = self.client.chat.completions.create(
+                model=self.model,
+                messages=[
+                    {"role": "system", "content": json_system_prompt},
+                    {
+                        "role": "user",
+                        "content": prompt,
+                    },
+                ],
+                temperature=0.1,
+                max_tokens=1000,  # Ensure enough tokens for full response
+            )

         # Extract and parse JSON response
         response_text = completion.choices[0].message.content