0xJacky (Owner) commented Nov 29, 2025

Fixes #1455 by using per-file persisted metadata for incremental indexing decisions to prevent unnecessary re-indexing caused by aggregated log sizes.

The previous incremental indexing logic could incorrectly trigger re-indexing due to log.LastSize being an aggregated value across rotated log files (e.g., access.log and access.log.1). This led to false positives where the system believed a file had shrunk or grown, causing infinite rebuilds. This PR injects the actual PersistenceManager to use accurate per-file metadata for LastModified and LastSize, ensuring indexing only occurs when truly needed. A fallback mechanism is also included for cases where persistence data isn't available, clamping the LastSize to prevent similar issues.


Note

Switch incremental indexing to per-file persisted metadata with a safe fallback, add rotation-aware doc count updates, and include unit tests.

  • Incremental indexing logic:
    • Use persistence (logIndexProvider) to decide indexing via per-file metadata (GetLogIndex/NeedsIndexing).
    • Fallback path clamps aggregated LastSize, handles never-indexed files, and refines change detection.
    • Pass persistence from LogFileManager into needsIncrementalIndexing and job loop.
  • Indexing execution (queueIncrementalIndexing):
    • Detect log rotation by size decrease; reset document count on rotation, otherwise accumulate.
    • Persist indexing metadata (SaveIndexMetadata) and manage statuses via persistence-backed setFileIndexStatus with simple queue ordering.
  • Tests:
    • Add internal/cron/incremental_indexing_test.go with stub provider covering unchanged file skip and growth detection.
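The rotation-aware document count update mentioned in the summary could look something like the sketch below. The function name `updateDocCount` and its parameters are assumptions made for illustration; the actual bookkeeping in `queueIncrementalIndexing` is not shown in this conversation.

```go
package main

import "fmt"

// updateDocCount is a hypothetical helper. It treats a size decrease as a
// rotation and restarts the count; otherwise it adds the newly indexed
// documents to the previous total.
func updateDocCount(prevCount uint64, lastSize, curSize int64, newDocs uint64) uint64 {
	if curSize < lastSize {
		// File shrank: assume rotation, so old documents no longer
		// belong to this file.
		return newDocs
	}
	// File is the same or larger: the new documents are appended.
	return prevCount + newDocs
}

func main() {
	// File grew from 1000 to 1500 bytes and produced 20 new documents.
	fmt.Println(updateDocCount(100, 1000, 1500, 20)) // 120

	// File was rotated: size dropped from 1500 to 300 bytes.
	fmt.Println(updateDocCount(120, 1500, 300, 7)) // 7
}
```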

Written by Cursor Bugbot for commit 8e9fa7d. This will update automatically on new commits.

Co-authored-by: jacky-943572677 <jacky-943572677@qq.com>
cursor bot commented Nov 29, 2025

Cursor Agent can help with this pull request. Just @cursor in comments and I'll start working on changes in this branch.

0xJacky (Owner, Author) commented Nov 29, 2025

cursor review

cursor bot left a comment

✅ Bugbot reviewed your changes and found no bugs!


@0xJacky 0xJacky marked this pull request as ready for review November 29, 2025 03:37
@0xJacky 0xJacky merged commit d4fa5a5 into dev Nov 29, 2025
46 checks passed
log.Path, log.LastSize, fileSize)
if fileSize < lastSize {
logger.Debugf("File %s needs full re-indexing (fallback path) due to size decrease: old_size=%d, new_size=%d",
log.Path, lastSize, fileSize)

Bug: Rotation detection unreachable due to size clamping

The rotation detection code in the fallback path is unreachable. The clamping logic at lines 133-135 sets lastSize = fileSize whenever lastSize > fileSize or lastSize == 0. After clamping, lastSize is always <= fileSize, so the condition fileSize < lastSize at line 148 can never be true. This means log file rotation (where a file genuinely shrinks) won't be detected when the fallback path is used. While the primary persistence path handles rotation correctly via NeedsIndexing, the fallback path's rotation check is dead code.
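A small reproduction makes the dead-code claim concrete. The sketch below only mirrors the two conditions quoted in the comment (the clamp and the later `fileSize < lastSize` check); the function names `clamp` and `rotationDetectedAfterClamp` are hypothetical and the rest of the fallback path is omitted.

```go
package main

import "fmt"

// clamp mirrors the condition described in the review comment: lastSize is
// replaced by fileSize when it is larger than fileSize or zero.
func clamp(lastSize, fileSize int64) int64 {
	if lastSize > fileSize || lastSize == 0 {
		return fileSize
	}
	return lastSize
}

// rotationDetectedAfterClamp applies the clamp first and then runs the
// shrink check, in the order the comment says the fallback path uses.
func rotationDetectedAfterClamp(lastSize, fileSize int64) bool {
	lastSize = clamp(lastSize, fileSize)
	return fileSize < lastSize
}

func main() {
	cases := [][2]int64{
		{5000, 1000}, // aggregated or genuinely shrunk: clamped to 1000
		{1000, 1000}, // unchanged
		{500, 1000},  // grew
		{0, 1000},    // never indexed
	}
	for _, c := range cases {
		fmt.Println(c[0], c[1], rotationDetectedAfterClamp(c[0], c[1]))
	}
	// Every line ends in "false": the shrink branch cannot be taken.
}
```

Simply moving the shrink check ahead of the clamp would make the branch reachable, but in the fallback path `lastSize` may still be the aggregated value, so that ordering would likely reintroduce the false positives this PR set out to remove. Any fix has to distinguish the two cases some other way.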

Development

Successfully merging this pull request may close these issues.

Infinite indexing of log, 100% CPU and memory usage

3 participants