v2.0.4 Cleanup Sprint + TECH-DEBT-151 Chunking Fixes#28
Merged
Hidden-History merged 9 commits intomainfrom Feb 10, 2026
Merged
Conversation
PLAN-003 v2.0.4 Cleanup Sprint Phase 1: Security (Agent 1): - BUG-063: Replace hardcoded bcrypt hash in web.yml - TECH-DEBT-078: Safe placeholders in .env.example - TECH-DEBT-093: Prometheus basic auth with valid bcrypt Dashboards (Agent 2): - BUG-060: Fix metric prefix ai_memory_ → aimemory_ in 10 dashboards - BUG-061: Switch rate[5m] → increase[1h] for counter panels - TECH-DEBT-081: Auto-resolved (dashboards now show data) Classifier Metrics (Agent 3): - TECH-DEBT-140: Add project label to all 9 classifier metrics README Accuracy (Agent 4): - Fix CLAUDE.md → CONTRIBUTING.md reference - Consolidate duplicate Quick Start sections - Fix send_message_streaming → send_message_buffered - Update model IDs to claude-sonnet-4-5-20250929 - Clarify Python 3.11+ required for AsyncSDKWrapper - Update hook architecture diagram Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
PLAN-003 v2.0.4 Cleanup Sprint Phase 2: Hook Scripts (Agent 5): - TECH-DEBT-141: Add 3 missing types to VALID_HOOK_TYPES - TECH-DEBT-142: Convert all hooks from local metrics to push-based - TECH-DEBT-073: All hooks now push duration with hook_type label - TECH-DEBT-074: Verified trigger type labels across all scripts - TECH-DEBT-075: Verified capture metrics include collection label - BUG-062: NFR metrics now pushed to Pushgateway Monitoring API (Agent 6): - TECH-DEBT-072: Push collection_size to Pushgateway with per-project breakdown SessionStart (Agent 7): - BUG-020: File-based dedup lock prevents double execution on compact - TECH-DEBT-091: Remove 3 logging truncation patterns (content[:50], context[:200]) - TECH-DEBT-073: Verified track_hook_duration already wraps main() Quick Wins (Agent 8): - TECH-DEBT-085: Rename BMAD → AI Memory in 8 docs files (~35 refs) - TECH-DEBT-077: Verified /save-memory has logging, documented gaps Also updates CHANGELOG.md for v2.0.4. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Cross-cutting code review (Agent 9) found 4 hook scripts importing detect_project from memory.storage instead of memory.project. This caused silent project detection failure (ImportError caught by try/except, falling back to "unknown" project label). Pre-existing bug exposed by Phase 3 verification. Files fixed: - post_tool_capture.py (2 occurrences) - error_pattern_capture.py (2 occurrences) - user_prompt_capture.py (1 occurrence) - agent_response_capture.py (1 occurrence) Also updates CHANGELOG.md with Phase 3 findings. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
…ECH-DEBT-151) BREAKING CHANGE: Removes undocumented _enforce_content_limit() function that caused 97% data loss on guidelines (23KB → 600 chars). Changes: - Remove _enforce_content_limit() from storage.py - Add src/memory/chunking/truncation.py with 4 smart functions: - count_tokens() - Accurate token counting with tiktoken - smart_end() - Sentence-boundary truncation (keep beginning) - first_last() - Keep beginning + end with smart truncation - structured_truncate() - Preserve error context structure - Update 3 hook scripts to use smart_end() for whole-content messages - Add tiktoken>=0.5.0 dependency for accurate token counting - Add 28 unit tests + 2 integration tests (all passing) - Fix circular import in chunking/__init__.py (lazy import pattern) Per Chunking-Strategy-V2.md: - User messages: whole, max 2000 tokens, smart_end truncation - Agent responses: whole, max 3000 tokens, smart_end truncation - Guidelines: section-aware semantic, 512 tokens per chunk Research: 50+ sources, 2026 best practices (BP-001, BP-002) Impact: +20-40% retrieval precision vs hard truncation Test Results: - 28/28 unit tests passing (test_truncation.py) - 2 integration tests created (storage + hook scripts) - Circular import resolved and verified Fixes: #151 Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Research Documentation: - docs/RAG-CHUNKING-BEST-PRACTICES-2026.md - Comprehensive research from 50+ authoritative sources - 2026 best practices for RAG chunking strategies - Qdrant, Jina AI, NVIDIA, LangChain, academic papers - docs/conversation-memory-best-practices.md - Implementation guide for conversation memory - Practical patterns for Claude Code context injection - Token budget management strategies Templates & Reports: - templates/conventions/rag-chunking-embedding-patterns.json - Convention template for chunking strategies - Ready to load into conventions collection - docker/grafana/dashboards/DASHBOARD-AUDIT-REPORT.md - Grafana dashboard audit findings - Hook metrics dashboard verification Supporting TECH-DEBT-151 implementation with research evidence. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Complete zero-truncation compliance across all storage paths: - Phase 3: Replace smart_end with ProseChunker topical chunking in all 3 store_async hook scripts (512 tokens, 15% overlap, batch embedding) - Phase 4: Add content_type parameter to IntelligentChunker.chunk() for explicit routing (USER_MESSAGE/AGENT_RESPONSE/GUIDELINE thresholds) - Phase 5: All stored points include chunking_metadata payload with chunk_type, chunk_index, total_chunks, chunk_size_tokens, overlap_tokens Bug fixes: - Fix 4 NameError bugs in trigger scripts (wrong variable names) - Fix wrong hook_name in best_practices_retrieval metrics - Fix wrong BMAD_ env var prefix → AI_MEMORY_ in best_practices_retrieval - Raise MAX_CONTENT_LENGTH 10K→100K in user_prompt_capture (let chunker handle) - Remove [:2000] hard truncation fallback in error_store_async - Fix Grafana dashboard hook dropdown values Tests: 1221 passed, 0 failures. Lint: all checks passed. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix deprecated typing.Dict → dict in truncation.py - Remove unused os import in test_storage_chunking.py - Remove unnecessary "r" mode args in open() calls - Collapse nested if into single condition (SIM102) - Apply black formatting to 11 files for CI compliance Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- CHANGELOG: add TECH-DEBT-151 all 5 phases, trigger NameError fixes, Added/Changed sections - INSTALL: add docker/.env configuration guide for both automated and manual install - INSTALL: fix stale bmad_ metric prefix and macOS-only open command - README: version badge 2.0.0 → 2.0.4 - docker/.env.example: rename BMAD_LOG_LEVEL → AI_MEMORY_LOG_LEVEL (TECH-DEBT-085) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Owner
Author
v2.0.4 Verification ReportDate: 2026-02-09 Results Summary
Phase DetailsPhase 1: Infrastructure Health (HARD GATE)
Phase 2: Automated Test Suite
Phase 3: v2.0.4 Code Changes
Phase 4: Live Functional Tests
Phase 5: E2E Integration
VerdictALL 19 PASS — Safe to merge. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
This PR completes the v2.0.4 cleanup sprint (PLAN-003: 3 phases, 12 agents, 42 files) and implements TECH-DEBT-151 zero-truncation chunking compliance (all 5 phases, 15 files), plus documentation updates.
What Changed
v2.0.4 Cleanup Sprint (3 commits)
TECH-DEBT-151: Zero-Truncation Chunking Compliance (2 commits)
_enforce_content_limit()fromstorage.py— was causing up to 97% data loss on guidelinessrc/memory/chunking/truncation.pywithsmart_end(sentence boundary finder) andfirst_last(head+tail extraction) utilitiesuser_prompt_store_async.py: >2000 tokens → multiple chunks (512 tokens, 15% overlap)agent_response_store_async.py: >3000 tokens → multiple chunks (512 tokens, 15% overlap)error_store_async.py: Removed[:2000]hard truncation fallbackIntelligentChunker.chunk()acceptscontent_type: ContentType | Noneparameter — routes by content typechunking_metadata(chunk_type, chunk_index, total_chunks, original_size_tokens)Trigger Script NameError Fixes (1 commit)
MAX_CONTENT_LENGTHincreased from 10,000 to 100,000Documentation Updates (1 commit)
Other Changes
storage.py: Routes all types through IntelligentChunker with MemoryType → ContentType mappingBreaking Changes
_enforce_content_limit()function fromstorage.py(internal only, not a public API)Testing
Verification Checklist
Related Issues
Fixes #20 (TECH-DEBT-140 - Classifier metrics missing project label)
Fixes #21 (TECH-DEBT-141 - VALID_HOOK_TYPES missing values)
Fixes #22 (BUG-060 - Grafana wrong metric prefix)
Fixes #23 (BUG-061 - rate() returns NaN on sparse data)
Fixes #24 (BUG-062 - NFR metrics not pushed to Pushgateway)
Fixes #25 (TECH-DEBT-142 - Hooks using local metrics)
Fixes #26 (BUG-063 - Prometheus placeholder bcrypt hash)
Implements TECH-DEBT-151 (Chunking & Storage Spec Violations — all 5 phases)