@Stonebanks-js
Contributor

Description:

This pull request introduces two significant enhancements to the duplicate_finder.py script:

  1. File Type Filtering: Users can now specify a file type filter to limit duplicate detection to certain file types, improving efficiency and precision.

  2. Report Generation: The script now generates a comprehensive report of detected duplicates and saves it to a duplicates_report.txt file. The report includes details of all duplicate files found, making it easier to review and manage duplicates.

Changes:

  • Added the ability to filter files based on type.

  • Integrated functionality to save duplicate file information to a text file (duplicates_report.txt).

  • Updated find_duplicates() function to support file type filtering.

  • Improved user interaction with prompts for file type input and report saving.
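The PR description does not include the code itself, so the following is only a minimal sketch of how an extension filter could be wired into find_duplicates(). The `file_type` parameter, the `file_hash` helper, and the choice of SHA-256 content hashing are assumptions for illustration, not the script's confirmed implementation.

```python
import hashlib
import os
from collections import defaultdict


def file_hash(path, chunk_size=65536):
    """Hypothetical helper: SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(directory, file_type=None):
    """Group files under `directory` by content hash.

    `file_type` (assumed parameter name) is an optional extension such as
    ".jpg" or "jpg". When given, other extensions are skipped before hashing.
    """
    wanted = None
    if file_type:
        wanted = file_type.lower()
        if not wanted.startswith("."):
            wanted = "." + wanted

    by_hash = defaultdict(list)
    for root, _dirs, names in os.walk(directory):
        for name in names:
            # Compare extensions case-insensitively so ".JPG" matches ".jpg".
            if wanted and os.path.splitext(name)[1].lower() != wanted:
                continue
            path = os.path.join(root, name)
            by_hash[file_hash(path)].append(path)

    # Keep only hashes shared by two or more files.
    return {h: sorted(paths) for h, paths in by_hash.items() if len(paths) > 1}
```

Filtering before hashing is what would make the filter save time in this sketch: files with other extensions are never opened.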

Tests:

  • Tested on directories containing images, documents, and other files.
  • Verified that the filtering works correctly by scanning only for .jpg and .png files.
  • Confirmed that the report is generated with correct paths for all detected duplicates.

How to test:

  1. Run the script and specify a directory to scan for duplicates.
  2. Provide a file type extension when prompted to filter files (e.g., .jpg).
  3. Select either "delete" or "move" as an action for managing duplicates.
  4. Check the generated duplicates_report.txt for detailed information on found duplicates.
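For reviewers who want a sense of what the report might contain, here is a rough sketch of a report writer. The function name `save_report`, the layout of the text file, and the shape of the `duplicates` mapping (hash to list of paths) are assumptions; the actual format written by the script may differ.

```python
def save_report(duplicates, report_path="duplicates_report.txt"):
    """Write one block per duplicate group and return the number of groups.

    `duplicates` is assumed to map a content hash to a list of file paths.
    """
    with open(report_path, "w", encoding="utf-8") as report:
        for index, (digest, paths) in enumerate(sorted(duplicates.items()), start=1):
            report.write(f"Group {index} (hash {digest}):\n")
            for path in paths:
                report.write(f"  {path}\n")
            # Blank line between groups keeps the report easy to scan.
            report.write("\n")
    return len(duplicates)
```

Under these assumptions, each group in duplicates_report.txt would list the full paths of the files that share the same content.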

Additional Notes:

  • Restricting the scan to a single file type reduces the number of files that need to be compared in large directories, and the saved report makes the results easier to review.

  • Future improvements could involve adding support for additional report formats (e.g., CSV or JSON).
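As an illustration of the JSON idea mentioned above, a report writer could look roughly like this. Nothing here is part of this PR: the function name `save_report_json`, the output file name, and the JSON layout are all hypothetical.

```python
import json


def save_report_json(duplicates, report_path="duplicates_report.json"):
    """Hypothetical JSON variant of the report; returns the groups written.

    `duplicates` is assumed to map a content hash to a list of file paths.
    """
    groups = [
        {"hash": digest, "files": list(paths)}
        for digest, paths in sorted(duplicates.items())
    ]
    with open(report_path, "w", encoding="utf-8") as report:
        json.dump({"groups": groups}, report, indent=2)
    return groups
```

A structured format like this would let other tools consume the results without parsing the text report.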

@Stonebanks-js
Contributor Author

@DhanushNehru Please take a look at this and assign the hacktoberfest labels to it.
