release: v0.19.1 — Advanced Agent Patterns + Quality Infrastructure#34
Merged
johnnichev merged 10 commits intomainfrom Mar 30, 2026
Merged
release: v0.19.1 — Advanced Agent Patterns + Quality Infrastructure#34johnnichev merged 10 commits intomainfrom
johnnichev merged 10 commits intomainfrom
Conversation
Critical: - astream(): guard response_msg.content with `or ""` (pitfall #7) - FileCheckpointStore.save(): atomic write via temp file + os.replace() - llm_evaluators: fence all user-controlled content in judge prompts to prevent injection - evals/regression.py, snapshot.py: atomic baseline/snapshot writes; sanitise suite_name path - tools/base.py: shared module-level executor instead of per-call ThreadPoolExecutor - rag/stores/memory.py: threading.Lock() on InMemoryVectorStore add/search/delete/clear High: - fallback.py: threading.Lock() on circuit breaker _failures/_circuit_open_until dicts - memory.py: ConversationMemory.branch() deep-copies tool_calls to prevent shared mutation - evals/report.py: fix p95/p99 off-by-one (use math.ceil, not int) - tools/registry.py: warn on silent tool name collision instead of silently overwriting
- H5: policy.py — deny empty tool_name immediately instead of falling
through pattern matching where fnmatch("","*") would match allow/deny/*
- H6: pii.py — extend SSN regex to detect space-separated format
(123 45 6789 was not detected; only dash-separated and 9-digit bare)
- H9: decorators.py — _unwrap_type now handles Python 3.10+ X | None
syntax (types.UnionType); previously str | None annotations raised
ToolValidationError on Python 3.10/3.11/3.12/3.13
- CLAUDE.md — add pitfalls #19 (eval judge prompt injection fencing),
#20 (ThreadPoolExecutor singleton), #21 (types.UnionType in tools)
SQLiteVectorStore (C1, C2, H1, L2):
- Lock now wraps the entire DB operation (connect → commit → close) in
add_documents(), search(), delete(), clear(), _init_db(); previously
only sqlite3.connect() was inside with self._lock, leaving cursors and
commits unprotected under concurrent access
- IDs switched from sha256+batch-index to uuid4 — the old scheme produced
identical IDs for the same document order across batches (silent overwrite)
and different IDs for different orderings (phantom duplicates)
- Removed dead _connect() method whose comment falsely implied callers
held the lock
loaders.py (H2, M3):
- pypdf page.extract_text() can return None for image/encrypted pages;
changed text.strip() → (text or "").strip()
- recursive=False now strips ** from caller-supplied glob patterns that
contain ** instead of silently recursing anyway
chroma.py (H3):
- chromadb.Client() removed in chromadb ≥ 0.4; changed to EphemeralClient()
- Updated test mock from Client to EphemeralClient
hybrid.py (H4, M6):
- Removed dead doc_scores[key] = 0.0 line before get(key, 0.0) + ...
(the get default made the explicit assignment unreachable)
- Added ValueError for top_k < 1
bm25.py (M1):
- _score_document divided by _avg_doc_len which is 0.0 when only empty-text
documents have been indexed; guarded with max(avg_doc_len, 1e-8)
stores/memory.py (M2):
- Capacity check (TOCTOU) moved inside with self._lock so the count read
and the subsequent add are atomic
chunking.py (M4, M5, L1):
- RecursiveTextSplitter overlap now built from complete segments (walk
backward through current_chunk) instead of raw character slice, so
multi-char separators like "\n\n" are never split mid-sequence
- ContextualChunker escapes </document> and </chunk> in user content
before interpolating into the LLM prompt template
- TextSplitter raises ValueError if length_function("a") != 1 so
token-counting functions are caught at construction time
- Added .hypothesis/ to .gitignore to exclude Hypothesis test artifacts. - Updated CHANGELOG.md to document the addition of new advanced agent patterns (PlanAndExecuteAgent, ReflectiveAgent, DebateAgent, TeamLeadAgent) and expanded evaluators in the eval framework, bringing the total to 50. - Incremented version to 0.19.1 in pyproject.toml and __init__.py. - Added new example scripts demonstrating the usage of the new agent patterns.
- Updated CHANGELOG.md to reflect the addition of new advanced agent patterns (PlanAndExecuteAgent, ReflectiveAgent, DebateAgent, TeamLeadAgent) and expanded evaluators in the eval framework, increasing the total to 50. - Added new entry in mkdocs.yml for the Patterns module. - Revised README.md to include new features and examples, and updated evaluator count. - Incremented test count in documentation to 2664 and examples to 73. - Updated ROADMAP.md to mark v0.19.1 as complete and highlight advanced agent patterns.
- Removed the provider parameter from the PlanAndExecuteAgent instantiation. - Updated evaluator count to 11 new evaluators, including the addition of ForbiddenWordsEvaluator. - Marked several features as complete, indicating progress in the development roadmap. - Added a section detailing new quality infrastructure initiatives, including Ralph loop, Bandit in CI, and various testing enhancements.
…AG bug fixes - Added new quality infrastructure initiatives including the Ralph loop, Bandit integration in CI, property-based tests, thread-safety smoke suite, and production simulations. - Documented 8 edge-case bug fixes in the RAG subsystem, improving stability and error handling across various components. - Updated test and example counts to reflect recent additions.
8-pass convergence system run across agent, providers, tools, rag, memory, evals, security. All modules achieved clean pass on pass 8. Key fixes: - ThreadPoolExecutor singletons for parallel dispatch + timeout enforcement - Async observer events on LLM cache hits in arun()/astream() - Prompt injection fencing on all LLM evaluators + coherence judge - Non-atomic file writes → tmp+replace (html, junit, snapshot, history) - Path traversal in BaselineStore, SnapshotStore, HistoryStore, policy - BM25.search() atomic snapshot under lock (concurrent access safety) - TextSplitter infinite loop guard; RecursiveTextSplitter empty chunk filter - Optional[List[T]] type unwrapping for tool parameters - Tool None optional params no longer raise ToolValidationError - Naive datetime normalization in 6+ knowledge/session locations - FallbackProvider mid-stream corruption, _is_retriable word-boundary regex - Gemini timeout parameter applied to all 4 methods - KnowledgeGraph confidence=null silent triple discard fixed - Policy from_dict type coercion and validation at construction time - GuardrailsPipeline trace step false-negative on WARN/REWRITE actions - compress_keep_recent=0 current message drop fixed - 254 new regression tests Test count: 2664 → 2918
… models - _openai_compat: retry without temperature if API rejects it (gpt-4o-mini and newer models no longer accept temperature=0.0) - test_regression: replace deprecated asyncio.get_event_loop() with asyncio.run() - test_evals_e2e: update snapshot key assertion (greeting -> greeting_0) after pass-5 snapshot key uniqueness fix All 3094 e2e tests pass.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Checklist