Epic: Daemon Temporal Watch Integration
Epic Completion Status: ⏳ PENDING
Story Completion: 0/3 stories complete
COMPLETED STORIES:
- None
IN PROGRESS:
- None
PENDING:
- 01_Feat_TemporalQueryDaemonSupport/01_Story_EnableTemporalQueriesDaemonMode.md ⏳
- 02_Feat_WatchModeAutoDetection/01_Story_WatchModeAutoUpdatesAllIndexes.md ⏳
- 03_Feat_BranchSwitchCatchUp/01_Story_EfficientUnindexedCommitDetection.md ⏳
Overall Progress: 0% complete
Main Intent
Enable full daemon mode support for temporal git history queries and integrate temporal indexing into watch mode with automatic git commit detection, providing users with the same fast, cached query experience for temporal searches as they have for HEAD collection queries.
This epic extends CIDX's daemon capabilities to handle temporal (git history) queries with identical caching infrastructure and integrates temporal indexing into watch mode with automatic commit detection via git refs inotify monitoring.
Conversation Context
User Requirements:
- Temporal Query Daemon Support (MANDATORY):
  - User: "for indexing, I want the same reporting experience on the CLI as it happens with regular indexing or standalone operation"
  - User: "make a plan to enable querying for temporal indexes in daemon mode"
  - User decision: "1. Mandatory"
  - Current blocker: cli.py:4708-4710 prevents daemon delegation when time_range is set
- Watch Mode Auto-Detection:
  - User: "review watch behavior, we need to add a parameter to auto update FTS and Temporal indexes, incrementally"
  - User decision: "3. C, auto-detect based on all indexes we have. keep everything updated. easier to the user"
  - Requirement: cidx watch automatically detects ALL existing indexes (semantic, FTS, temporal)
- Git Commit Detection via Inotify:
  - User: "can you detect a commit by a change in some file in .git folder with the same inode technique we have now?"
  - Evidence: .git/refs/heads/<branch> changes inode on every commit
  - User decision: "5. No hooks. Explore inode git file change detection and if that won't work, use polling"
- Temporal Watch - Current Branch Only:
  - User decision: "4. current branch. You need to ensure that temporal index can 'catch up' if the user changes branch"
  - Requirement: Incremental indexing when branch switches
- Identical HNSW mmap Caching:
  - User: "daemon uses mmap for semantic hnsw indexes. I want the same exact approach for temporal indexes"
  - Requirement: Temporal collection uses IDENTICAL HNSWIndexManager.load_index()
- JSON-Based Metadata:
  - Context: User previously removed SQLite completely
  - Current: temporal_progress.json tracks completed commits
  - Requirement: NO database queries, all JSON-based
System Architecture
Current Daemon Architecture (HEAD Collection)
CLI Query Request
↓
cli.py checks daemon_config.enabled
↓
Daemon RPC Call: exposed_query(query, filters...)
↓
CIDXDaemonService.exposed_query()
↓
Cache Check: cache_entry.hnsw_index exists?
├─ YES: Use cached mmap HNSW index (5ms query)
└─ NO: Load from disk via HNSWIndexManager.load_index()
↓
Cache in CacheEntry (hnsw_index, id_mapping)
↓
Query cached index
New Temporal Architecture (This Epic)
CLI Temporal Query Request (--time-range)
↓
cli.py checks daemon_config.enabled AND time_range
↓
[STORY 1.3] Remove blocking at cli.py:4710
↓
Daemon RPC Call: exposed_query_temporal(query, time_range, filters...)
↓
[STORY 1.2] CIDXDaemonService.exposed_query_temporal()
↓
[STORY 1.1] Cache Check: cache_entry.temporal_hnsw_index exists?
├─ YES: Use cached mmap HNSW index for temporal collection
└─ NO: Load from disk via HNSWIndexManager.load_index()
↓
Cache in CacheEntry (temporal_hnsw_index, temporal_id_mapping)
↓
Query cached temporal index with time-range filtering
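The cache check in the flow above can be sketched as follows. This is a minimal illustration under stated assumptions, not the daemon's actual code: the CacheEntry here is a simplified stand-in that holds only the temporal fields, and load_from_disk is a hypothetical callable standing in for HNSWIndexManager.load_index(), whose real signature is not shown in this epic.

```python
import threading
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Optional, Tuple


@dataclass
class CacheEntry:
    """Simplified stand-in for the daemon's CacheEntry (temporal fields only)."""

    temporal_hnsw_index: Optional[Any] = None
    temporal_id_mapping: Optional[Dict[str, Any]] = None
    # Assumed to mirror the existing thread-safe cache_lock pattern
    cache_lock: threading.Lock = field(default_factory=threading.Lock)


def get_temporal_index(
    entry: CacheEntry,
    load_from_disk: Callable[[], Tuple[Any, Dict[str, Any]]],
) -> Tuple[Any, Optional[Dict[str, Any]]]:
    """Return the cached temporal index, loading it once on a cache miss."""
    with entry.cache_lock:
        if entry.temporal_hnsw_index is None:
            # Cache miss: load from disk (hypothetical loader) and keep the result
            index, id_mapping = load_from_disk()
            entry.temporal_hnsw_index = index
            entry.temporal_id_mapping = id_mapping
        # Cache hit (or freshly populated): reuse the in-memory objects
        return entry.temporal_hnsw_index, entry.temporal_id_mapping
```

Holding the lock across the load keeps two concurrent queries from loading the same index twice, at the cost of blocking the second query until the first load finishes.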
Watch Mode Architecture (Enhanced)
cidx watch (no flags)
↓
[STORY 2.1] Auto-detect existing indexes:
- .code-indexer/index/code-indexer-HEAD/ → Semantic watch
- .code-indexer/index/tantivy-fts/ → FTS watch
- .code-indexer/index/code-indexer-temporal/ → Temporal watch
↓
Start multi-index watch handlers:
├─ SemanticWatchHandler (existing)
├─ FTSWatchHandler (existing, Story 02_Story_RealTimeFTSMaintenance.md)
└─ [STORY 2.2, 2.3] TemporalWatchHandler (NEW)
↓
Watch .git/refs/heads/<current_branch> via inotify
↓
On inode change (commit detected):
↓
[STORY 2.3] Run incremental temporal indexing:
- Load temporal_progress.json
- Get new commits since last indexed
- Index only new commits
- Update temporal_progress.json
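The four incremental-indexing steps above might look like the sketch below. The JSON layout (a single "completed_commits" list) is an assumption made for illustration, since this epic does not show the real schema of temporal_progress.json. index_commit is a hypothetical callback standing in for the TemporalIndexer call.

```python
import json
from pathlib import Path
from typing import Callable, Iterable, List, Set


def load_completed(progress_file: Path) -> Set[str]:
    """Read the set of already-indexed commit hashes (assumed JSON layout)."""
    if not progress_file.exists():
        return set()
    data = json.loads(progress_file.read_text())
    return set(data.get("completed_commits", []))


def index_new_commits(
    progress_file: Path,
    branch_commits: Iterable[str],
    index_commit: Callable[[str], None],
) -> List[str]:
    """Index only commits missing from the progress file, then persist it."""
    completed = load_completed(progress_file)
    newly_indexed: List[str] = []
    for sha in branch_commits:
        if sha in completed:
            continue  # already indexed, skip
        index_commit(sha)  # hypothetical hook into the temporal indexer
        completed.add(sha)
        newly_indexed.append(sha)
    # Write to a temp file and rename so a crash cannot leave a partial file
    tmp = progress_file.with_name(progress_file.name + ".tmp")
    tmp.write_text(json.dumps({"completed_commits": sorted(completed)}))
    tmp.replace(progress_file)
    return newly_indexed
```

Running the function a second time with the same commit list indexes nothing, which is the property the watch handler relies on when several ref changes arrive close together.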
Branch Switch Detection (Story 3.1)
Watch Mode Active
↓
Detect branch switch:
- .git/HEAD file change (ref: refs/heads/new-branch)
↓
[STORY 3.1] Load temporal_progress.json
↓
Get all commits in new branch: git rev-list new-branch
↓
Build in-memory set of completed commits (O(1) lookup)
↓
Filter out indexed commits → unindexed commits list
↓
[STORY 3.2] Index unindexed commits incrementally
↓
Update temporal_progress.json
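A minimal sketch of the detection and filtering steps above, assuming the commit list has already been obtained (for example from the output of git rev-list) and the completed set has been loaded from temporal_progress.json. The function names are hypothetical.

```python
from pathlib import Path
from typing import Iterable, List, Optional, Set

_BRANCH_PREFIX = "ref: refs/heads/"


def current_branch(git_dir: Path) -> Optional[str]:
    """Return the checked-out branch name, or None on a detached HEAD."""
    head = (git_dir / "HEAD").read_text().strip()
    if head.startswith(_BRANCH_PREFIX):
        return head[len(_BRANCH_PREFIX):]
    return None  # HEAD holds a raw commit hash instead of a branch ref


def unindexed_commits(branch_commits: Iterable[str], completed: Set[str]) -> List[str]:
    """Keep only commits not yet indexed, preserving the input order."""
    # Set membership is O(1) on average, so the pass is linear in branch size
    return [sha for sha in branch_commits if sha not in completed]
```

Comparing the value returned by current_branch() before and after a .git/HEAD change is enough to tell a branch switch from other writes to that file; a None result corresponds to the detached-HEAD case listed under Risks and Mitigations.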
Technology Stack
Existing Components (Reuse):
- HNSWIndexManager.load_index() - mmap HNSW loading (identical for temporal)
- CacheEntry class - Cache structure (extend with temporal fields)
- BackgroundIndexRebuilder - Atomic HNSW updates (Story 0 pattern)
- RichLiveProgressManager - Progress display for indexing
- FilesystemVectorStore - Vector storage backend
- TemporalIndexer - Temporal indexing logic
- TemporalSearchService - Temporal search with time-range filtering
- temporal_progress.json - JSON metadata tracking
New Components (This Epic):
- exposed_query_temporal() RPC method in CIDXDaemonService
- TemporalWatchHandler - Git refs inotify monitoring + incremental indexing
- Cache fields: temporal_hnsw_index, temporal_id_mapping in CacheEntry
- Branch switch detection in watch mode
Features and Implementation Order
Feature 1: Temporal Query Daemon Support
Objective: Enable temporal queries in daemon mode with identical mmap caching to HEAD collection
Stories:
- Enable Temporal Queries in Daemon Mode with mmap Cache - Complete vertical slice: extend CacheEntry with temporal HNSW cache using identical mmap mechanism, implement exposed_query_temporal() RPC method with time-range filtering, and wire CLI delegation to enable daemon-based temporal queries with sub-5ms cached performance
Value Delivered: Users get sub-5ms temporal queries via daemon cache (same experience as HEAD queries)
Feature 2: Watch Mode Auto-Detection and Git Monitoring
Objective: Automatically detect and watch all existing indexes, including git commit detection via inotify
Stories:
- Watch Mode Auto-Updates All Indexes Including Temporal with Git Commit Detection - Complete vertical slice: auto-detect all existing indexes (semantic, FTS, temporal), implement git commit detection via .git/refs/heads/<branch> inotify monitoring with polling fallback, and trigger incremental temporal indexing with progress reporting when commits are detected
Value Delivered: Zero-configuration watch mode keeps all indexes current automatically
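As an illustration of the auto-detection step, the sketch below checks for the three index directories named in the Watch Mode Architecture section. The function name and the short index labels are hypothetical; only the directory names come from this epic.

```python
from pathlib import Path
from typing import Dict, List

# Directory names as listed in the Watch Mode Architecture section
INDEX_DIRS: Dict[str, str] = {
    "semantic": "code-indexer-HEAD",
    "fts": "tantivy-fts",
    "temporal": "code-indexer-temporal",
}


def detect_indexes(project_root: Path) -> List[str]:
    """Return the labels of all indexes that exist on disk."""
    index_root = project_root / ".code-indexer" / "index"
    return [
        label
        for label, dirname in INDEX_DIRS.items()
        if (index_root / dirname).is_dir()
    ]
```

A watch command could start one handler per returned label, so a project with no temporal index never starts a temporal handler.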
Feature 3: Branch Switch Temporal Catch-Up
Objective: Efficiently detect and index unindexed commits when user switches branches
Stories:
- Efficient Unindexed Commit Detection - Use in-memory set from temporal_progress.json for O(1) commit existence checks and efficient incremental indexing
Value Delivered: Temporal index stays current across branch switches without re-indexing completed commits
Component Connections
CacheEntry Extension (Story 1.1)
class CacheEntry:
    # Existing HEAD collection cache
    hnsw_index: Optional[Any] = None
    id_mapping: Optional[Dict[str, Any]] = None
    # NEW: Temporal collection cache (IDENTICAL pattern)
    temporal_hnsw_index: Optional[Any] = None
    temporal_id_mapping: Optional[Dict[str, Any]] = None
    temporal_index_version: Optional[str] = None

Daemon Service Extension (Story 1.2)
class CIDXDaemonService:
    def exposed_query_temporal(
        self,
        query_text: str,
        time_range: str,  # "2024-01-01..2024-12-31" or "last-30-days"
        limit: int = 10,
        languages: Optional[List[str]] = None,
        # ... other filters
    ) -> Dict[str, Any]:
        """Temporal query via daemon with mmap cache."""

Watch Handler Extension (Story 2.2, 2.3)
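class TemporalWatchHandler(FileSystemEventHandler):
    def __init__(self, project_root: Path):
        self.git_refs_file = project_root / ".git/refs/heads" / self._get_current_branch()
        self.temporal_indexer = TemporalIndexer(...)
        self.progressive_metadata = TemporalProgressiveMetadata(...)

    def on_modified(self, event):
        if event.src_path == str(self.git_refs_file):
            # Commit detected - run incremental indexing
            self._index_new_commits()

    def on_moved(self, event):
        # Git typically replaces a ref by renaming <branch>.lock over it,
        # which surfaces as a move event whose destination is the ref file
        if event.dest_path == str(self.git_refs_file):
            self._index_new_commits()

Integration with Existing Systems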
Daemon Service (src/code_indexer/daemon/service.py):
- Lines 261-309: Existing temporal indexing support (daemon can INDEX temporal)
- NEW: exposed_query_temporal() method for querying
- NEW: Temporal cache management in cache_entry
CLI (src/code_indexer/cli.py):
- Line 4710: Current blocking for temporal + daemon (REMOVE in Story 1.3)
- NEW: Delegate temporal queries to daemon when enabled
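Before delegating, the CLI or the daemon has to interpret the time_range string. The sketch below handles the two formats shown in the exposed_query_temporal() signature ("2024-01-01..2024-12-31" and "last-30-days"). It is an assumption about how parsing could work, not the existing CIDX parser; the function name and the today parameter are hypothetical.

```python
import re
from datetime import date, timedelta
from typing import Optional, Tuple


def parse_time_range(time_range: str, today: Optional[date] = None) -> Tuple[date, date]:
    """Convert a time_range string into an inclusive (start, end) date pair."""
    today = today or date.today()

    # Relative form: "last-N-days"
    match = re.fullmatch(r"last-(\d+)-days", time_range)
    if match:
        return today - timedelta(days=int(match.group(1))), today

    # Absolute form: "YYYY-MM-DD..YYYY-MM-DD"
    start_text, sep, end_text = time_range.partition("..")
    if not sep:
        raise ValueError(f"Unrecognized time range: {time_range!r}")
    start = date.fromisoformat(start_text)
    end = date.fromisoformat(end_text)
    if start > end:
        raise ValueError(f"Start date is after end date: {time_range!r}")
    return start, end
```

Passing today explicitly keeps the relative form deterministic in tests; production callers would omit it.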
Cache (src/code_indexer/daemon/cache.py):
- Extend CacheEntry with temporal_hnsw_index, temporal_id_mapping
- Reuse identical mmap loading via HNSWIndexManager.load_index()
Watch Mode:
- Extend existing GitAwareWatchHandler pattern
- Follow FTS watch implementation (02_Story_RealTimeFTSMaintenance.md)
Testing Strategy
Unit Tests:
- CacheEntry temporal field management
- exposed_query_temporal() RPC method
- TemporalWatchHandler commit detection
- Branch switch detection logic
- In-memory set commit filtering (O(1) lookup verification)
Integration Tests:
- End-to-end temporal query via daemon
- Cache hit/miss scenarios for temporal collection
- Watch mode auto-detection of all indexes
- Git commit detection via inotify
- Incremental temporal indexing on new commits
- Branch switch catch-up workflow
E2E Manual Tests (per story):
- Query temporal via daemon, verify cache performance
- Start watch mode, make commits, verify temporal index updates
- Switch branches, verify catch-up indexing
- Compare daemon vs standalone temporal query results
Performance Tests:
- Temporal query latency: <5ms (cached), <1s (uncached)
- Commit detection latency: <100ms after commit
- Branch switch catch-up: Only unindexed commits processed
- Memory usage: Temporal cache similar to HEAD cache
Definition of Done
Per Story:
- All acceptance criteria satisfied
- Unit tests pass (>85% coverage for new code)
- Integration tests pass
- E2E manual testing completed by Claude Code
- Code review approval from code-reviewer agent
- No regressions in fast-automation.sh
Per Feature:
- All stories completed
- Feature integration tests pass
- Documentation updated (README, --help text)
Epic Complete:
- All 3 stories delivered
- Temporal queries work identically in daemon and standalone modes
- Watch mode auto-detects and maintains all indexes
- Git commits detected via inotify without hooks
- Branch switches trigger efficient catch-up indexing
- Performance targets met (daemon cache <5ms, incremental indexing <1s)
- Zero breaking changes to existing daemon/watch behavior
Risks and Mitigations
| Risk | Impact | Mitigation |
|---|---|---|
| Inotify doesn't work on all filesystems | High | Fallback to polling (5s interval) if inotify fails |
| mmap file descriptor leaks with temporal cache | High | Proper cleanup in CacheEntry.invalidate() |
| Branch switch with thousands of commits | Medium | Efficient filtering via in-memory set, progress reporting |
| Concurrent watch updates during daemon queries | Medium | Reuse existing thread-safe cache_lock pattern |
| Git refs file missing on detached HEAD | Low | Detect detached HEAD, disable temporal watch with warning |
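The polling fallback in the first row could be sketched as below. It compares a small stat-based signature of the ref file between polls, so a commit that replaces the file (new inode) is noticed without inotify. The class and function names are hypothetical, and the 5s interval from the table is left to the caller.

```python
import os
from pathlib import Path
from typing import Optional, Tuple

Signature = Optional[Tuple[int, int, int]]


def ref_signature(ref_file: Path) -> Signature:
    """Return (inode, mtime_ns, size) for the ref file, or None if it is missing."""
    try:
        st = os.stat(ref_file)
    except FileNotFoundError:
        return None
    return (st.st_ino, st.st_mtime_ns, st.st_size)


class RefPoller:
    """Detect ref changes by comparing stat signatures between polls."""

    def __init__(self, ref_file: Path) -> None:
        self.ref_file = ref_file
        self._last = ref_signature(ref_file)

    def changed(self) -> bool:
        """Return True once for each observed change since the previous call."""
        current = ref_signature(self.ref_file)
        if current != self._last:
            self._last = current
            return True
        return False
```

A watch loop would call changed() on the chosen interval and trigger incremental temporal indexing whenever it returns True. A missing file is reported as a change too, so the caller can re-check whether the branch ref still exists before indexing.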
Success Criteria
- Daemon Temporal Queries: cidx query "auth" --time-range "last-7-days" uses daemon cache (5ms response)
- Zero Configuration Watch: cidx watch detects and updates all indexes (semantic, FTS, temporal)
- Commit Detection: Git commit triggers temporal indexing within 100ms (no hooks)
- Branch Switch Catch-Up: Only unindexed commits processed (verified via logs)
- Cache Parity: Temporal cache behavior identical to HEAD cache (verified via tests)
- No Breaking Changes: All existing tests pass, no regressions
References
Conversation Evidence:
- Temporal query daemon requirement: "make a plan to enable querying for temporal indexes in daemon mode" → "1. Mandatory"
- Auto-detection decision: "auto-detect based on all indexes we have" → "3. C"
- Inotify decision: "can you detect a commit by a change in some file in .git folder" → "5. No hooks"
- Current branch only: "4. current branch. You need to ensure that temporal index can 'catch up'"
- Identical caching: "daemon uses mmap for semantic hnsw indexes. I want the same exact approach for temporal"
Code References:
- Daemon blocking: src/code_indexer/cli.py:4710 (prevent temporal + daemon)
- Temporal indexing in daemon: src/code_indexer/daemon/service.py:261-309
- Cache structure: src/code_indexer/daemon/cache.py:18-122
- FTS watch pattern: plans/Completed/full-text-search/01_Feat_FTSIndexInfrastructure/02_Story_RealTimeFTSMaintenance.md
- Temporal indexer: src/code_indexer/services/temporal/temporal_indexer.py
- Temporal search: src/code_indexer/services/temporal/temporal_search_service.py
Standards:
- Epic structure: ~/.claude/standards/epic-writing-standards.md
- Testing quality: ~/.claude/standards/testing-quality-standards.md
- Agent delegation: ~/.claude/standards/agent-delegation-mandate.md