Skip zero-byte files in scanner.py during directory scans #6
Open
Description
User Story
As a software developer using FolderScanner,
I want the scanner to skip reading zero-byte files
so that large directory scans consume fewer system resources.
Background
The current implementation of scan_directory in scanner.py reads every file that is not excluded by the ignore patterns, including empty (0-byte) files. This leads to unnecessary I/O operations, particularly during large-scale scans of directories containing temporary or placeholder files. For example, the loop:
    for file in files:
        file_path = os.path.join(root, file)
        if spec.match_file(file_path):
            continue
        # ... file read occurs regardless of size
wastes CPU cycles and I/O bandwidth opening/reading files with no actionable data.
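To make the proposal concrete, here is a minimal sketch of how a size check could slot into that loop. It is not the actual scanner.py code: the is_ignored callable is a hypothetical stand-in for spec.match_file, the (path, contents) tuple stands in for whatever scan_directory really yields, and the "Error reading" message format is assumed.

```python
import os
from typing import Callable, Iterator, Tuple


def scan_directory(
    base_dir: str,
    is_ignored: Callable[[str], bool] = lambda path: False,  # hypothetical stand-in for spec.match_file
) -> Iterator[Tuple[str, str]]:
    """Yield (path, contents) for each non-ignored, non-empty file under base_dir."""
    for root, _dirs, files in os.walk(base_dir):
        for file in files:
            file_path = os.path.join(root, file)
            if is_ignored(file_path):
                continue
            try:
                # Proposed change: skip zero-byte files before opening them.
                if os.path.getsize(file_path) == 0:
                    continue
                with open(file_path, "r", encoding="utf-8") as handle:
                    yield file_path, handle.read()
            except (OSError, UnicodeDecodeError) as exc:
                # Assumed message format; empty files never reach this branch.
                print(f"Error reading {file_path}: {exc}")
```

os.path.getsize raises OSError for paths it cannot stat (for example a broken symlink), so the sketch keeps the size check inside the same try block as the read rather than letting that exception escape the scan.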
Acceptance Criteria
- Modify scanner.py to check file size before reading:
- Add if os.path.getsize(file_path) == 0: continue after the ignore-pattern check.
- Ensure skipped files:
- Do not trigger "Error reading..." messages.
- Are excluded from the yielded file_chunk results.
- Validation steps:
- Create test directories with mixed empty/non-empty files.
- Verify zero-byte files are never opened (add debug logging if needed).
- Confirm scan duration improves in environments with many empty files.
- Update unit tests to validate this optimization.
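The "never opened" validation step could be checked roughly as follows. This is a sketch under assumptions: scan_directory here is a compact stand-in rather than the real function in scanner.py, the helper name check_empty_files_are_never_opened is hypothetical, and the check spies on the built-in open with unittest.mock instead of relying on debug logging.

```python
import os
import tempfile
from unittest import mock


def scan_directory(base_dir):
    """Stand-in scanner: yields (path, text) and skips zero-byte files."""
    for root, _dirs, files in os.walk(base_dir):
        for name in files:
            file_path = os.path.join(root, name)
            if os.path.getsize(file_path) == 0:
                continue
            with open(file_path, "r", encoding="utf-8") as handle:
                yield file_path, handle.read()


def check_empty_files_are_never_opened():
    """Scan a mixed directory; return (opened, yielded, empty_path, full_path)."""
    with tempfile.TemporaryDirectory() as tmp:
        empty = os.path.join(tmp, "empty.txt")
        full = os.path.join(tmp, "full.txt")
        # Create the fixtures before patching so these opens are not recorded.
        with open(empty, "w", encoding="utf-8"):
            pass
        with open(full, "w", encoding="utf-8") as handle:
            handle.write("data")

        # Wrap the real open so calls pass through but are recorded.
        with mock.patch("builtins.open", wraps=open) as spy:
            yielded = [path for path, _text in scan_directory(tmp)]
        opened = [call.args[0] for call in spy.call_args_list]
    return opened, yielded, empty, full
```

A unit test can then assert that the empty file's path appears in neither list, which covers both the "never opened" and the "excluded from results" criteria in one pass.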