Implements drift detection based on git SHA comparison rather than time-based staleness. Tracks HEAD commit and merge-base changes, and calculates update strategies based on drift magnitude.
Key features:
- DriftStatus struct with SHA tracking and drift percentage
- UpdateStrategy enum: Fresh, Incremental, Rebase, FullRebuild
- DriftDetector with check_drift() for all layer types
- Merge-base detection for rebase scenarios
- Threshold-based strategy selection (<10 files, <30%, ≥30%)
Includes 13 tests covering TDD requirements:
- Fresh index detection
- Incremental updates for small changes
- Large drift detection with FullRebuild strategy
- Merge-base change detection
- Uncommitted working changes
Part of Phase 2.5 layered index architecture.
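To make the description concrete, here is a minimal sketch of what a SHA-based drift record might look like. The field names, the `new` constructor, and the zero-file guard are assumptions for illustration; the actual `DriftStatus` in src/drift.rs may be shaped differently.

```rust
/// Hypothetical drift record; field names are illustrative, not the PR's.
#[derive(Debug, Clone, PartialEq)]
pub struct DriftStatus {
    /// SHA the index was built from.
    pub indexed_sha: String,
    /// SHA currently checked out (HEAD).
    pub current_sha: String,
    /// Paths that differ between the two SHAs.
    pub changed_files: Vec<String>,
    /// Changed files as a percentage of all tracked files.
    pub drift_percentage: f64,
    /// Merge-base recorded for branch layers, if any.
    pub merge_base: Option<String>,
}

impl DriftStatus {
    /// Build a status from the two SHAs and the list of changed paths.
    pub fn new(
        indexed_sha: &str,
        current_sha: &str,
        changed_files: Vec<String>,
        total_repo_files: usize,
    ) -> Self {
        // Guard against an empty repository to avoid dividing by zero.
        let drift_percentage = if total_repo_files == 0 {
            0.0
        } else {
            changed_files.len() as f64 / total_repo_files as f64 * 100.0
        };
        DriftStatus {
            indexed_sha: indexed_sha.to_string(),
            current_sha: current_sha.to_string(),
            changed_files,
            drift_percentage,
            merge_base: None,
        }
    }

    /// Staleness depends only on the SHAs, never on elapsed time.
    pub fn is_stale(&self) -> bool {
        self.indexed_sha != self.current_sha
    }
}
```

In this sketch an index built a month ago at the current HEAD still reports as fresh, while one built a minute ago at a different SHA reports as stale.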
SEM-47 SHA-Based Drift Detection
Overview
Implement drift detection using git SHA comparison instead of timestamps.
Key Insight
Time-based staleness is meaningless:
Drift Magnitude → Strategy
| Full rebuild |
TDD Requirements
#[test] fn test_same_sha_reports_fresh() { }
#[test] fn test_different_sha_reports_stale() { }
#[test] fn test_strategy_incremental_under_10_files() { }
#[test] fn test_strategy_rebase_under_30_percent() { }
#[test] fn test_strategy_full_rebuild_over_30_percent() { }
#[test] fn test_merge_base_change_detected() { }
#[test] fn test_working_layer_checks_uncommitted() { }
Deliverables
Acceptance Criteria
Pull request overview
This PR implements SHA-based drift detection for the layered index system, replacing time-based staleness checks. The implementation correctly recognizes that drift should be measured by content changes (SHA differences) rather than time elapsed.
Key changes:
- Adds DriftStatus struct to track indexed vs current SHA, changed files, drift percentage, and merge-base information for branch layers
- Introduces UpdateStrategy enum with threshold-based strategy selection (Fresh, Incremental, Rebase, FullRebuild) based on the magnitude of changes
- Implements layer-specific drift detection for Base, Branch, Working, and AI layers
- Includes comprehensive test coverage (13 tests) covering all TDD requirements and edge cases
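For the merge-base part, a rough sketch of how a branch layer could notice a rebase follows. The function names, the `repo_dir` parameter, and the rule for a missing merge-base are assumptions and not taken from src/drift.rs; the only external piece is the standard `git merge-base <ref> <ref>` command, run through `std::process::Command`.

```rust
use std::process::Command;

/// Ask git for the merge-base of two refs via `git merge-base`.
/// Returns None if git cannot be run, exits non-zero, or prints nothing.
pub fn current_merge_base(repo_dir: &str, base_ref: &str, branch_ref: &str) -> Option<String> {
    let output = Command::new("git")
        .args(["-C", repo_dir, "merge-base", base_ref, branch_ref])
        .output()
        .ok()?;
    if !output.status.success() {
        return None;
    }
    let sha = String::from_utf8(output.stdout).ok()?.trim().to_string();
    if sha.is_empty() {
        None
    } else {
        Some(sha)
    }
}

/// Compare the merge-base stored at index time with the current one.
/// Assumption: a merge-base on only one side counts as a change, and
/// having none on either side counts as no change.
pub fn merge_base_changed(indexed: Option<&str>, current: Option<&str>) -> bool {
    match (indexed, current) {
        (Some(old), Some(new)) => old != new,
        (None, None) => false,
        _ => true,
    }
}
```

Keeping the comparison separate from the git call means the decision logic can be unit-tested without a repository on disk.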
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/drift.rs | New module implementing SHA-based drift detection with DriftStatus, UpdateStrategy, DriftDetector, and helper functions. Includes comprehensive tests covering all layer types and update strategies. |
| src/lib.rs | Adds re-exports for drift detection types (DriftDetector, DriftStatus, UpdateStrategy, count_tracked_files) and removes unused SemanticSummary import from benchmark module. |
| src/benchmark.rs | Auto-formatter changes breaking long format! strings across multiple lines and removing unused SemanticSummary import. No functional changes. |
// Calculate thresholds
let thirty_percent = (total_repo_files as f64 * 0.30).ceil() as usize;
Edge case handling could be clearer: When is_stale is true but changed_count == 0, returning Fresh seems contradictory. This can theoretically occur if git reports a SHA change but no file changes (e.g., commits with only metadata changes). Consider adding a comment explaining this edge case:
// Edge case: SHA changed but no files changed (e.g., empty commit, metadata-only change)
// In this case, no actual update is needed despite the SHA difference
if changed_count == 0 {
    UpdateStrategy::Fresh
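For context, the whole selection could be sketched as below, using the thresholds from the PR description (<10 files, <30%, ≥30%) and the `thirty_percent` calculation from the diff. The function name `select_strategy`, its signature, and the exact order of the checks are assumptions; only the threshold arithmetic and the `changed_count == 0` branch come from the reviewed code.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum UpdateStrategy {
    Fresh,
    Incremental,
    Rebase,
    FullRebuild,
}

/// Hypothetical selection function; the real one lives in src/drift.rs.
pub fn select_strategy(
    is_stale: bool,
    changed_count: usize,
    total_repo_files: usize,
) -> UpdateStrategy {
    // Calculate thresholds (same arithmetic as the line under review).
    let thirty_percent = (total_repo_files as f64 * 0.30).ceil() as usize;

    if !is_stale {
        // Indexed SHA matches the current SHA: nothing to do.
        UpdateStrategy::Fresh
    } else if changed_count == 0 {
        // Edge case: SHA changed but no files changed (e.g. empty commit).
        // No actual update is needed despite the SHA difference.
        UpdateStrategy::Fresh
    } else if changed_count < 10 {
        UpdateStrategy::Incremental
    } else if changed_count < thirty_percent {
        UpdateStrategy::Rebase
    } else {
        UpdateStrategy::FullRebuild
    }
}
```

Under this ordering the file-count check runs before the percentage check, so a small repository with fewer than 10 changed files gets Incremental even if those files are more than 30% of the repo. Whether that matches the PR's behavior is not visible from the diff excerpt.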
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…mment
- Fix compression ratio to use total-based calculation instead of per-file average
- Rename avg_compression/avg_token_savings to total_compression/total_token_savings
- Add comment explaining edge case when SHA changes but no files changed
Implements SHA-based drift detection for the layered index system, replacing time-based staleness checks
Test plan
Closes SEM-47