- The program takes the path of the input folder from which you want to remove duplicates. When it finishes successfully, that same folder contains only the unique files.
- It has two Python files:
- To use the sequential implementation, run pgm_main.py.
- To use parallel processing, run parallel_processing_pgm_main.py.
- USER changes:
- Update the input_files_path variable with the path of the input folder.
- If you are using multiprocessing, you can also change the number of processes in the file; the default value is 4.
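For reference, the sequential approach can be sketched as follows: hash each file's contents and delete any file whose hash has already been seen. This is a minimal illustration of content-hash deduplication, not the exact code in pgm_main.py; the function names and the choice of SHA-256 are assumptions.

```python
import hashlib
import os

def file_hash(path, chunk_size=8192):
    """Return the SHA-256 hex digest of a file's contents, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def remove_duplicates(input_files_path):
    """Delete every file in the folder whose content hash was already seen."""
    seen = set()
    for name in sorted(os.listdir(input_files_path)):
        path = os.path.join(input_files_path, name)
        if not os.path.isfile(path):
            continue  # skip subdirectories
        digest = file_hash(path)
        if digest in seen:
            os.remove(path)  # duplicate content: remove in place
        else:
            seen.add(digest)
```

Hashing the contents (rather than comparing files pairwise) makes each file a single pass, so the whole folder is deduplicated in one linear scan.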
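The parallel variant can likewise be sketched with a multiprocessing.Pool of the configured size (default 4): hash the files concurrently, then delete duplicates in the main process. Again, this is an illustrative sketch under assumed names, not the exact contents of parallel_processing_pgm_main.py.

```python
import hashlib
import os
from multiprocessing import Pool

def hash_file(path):
    """Return (path, SHA-256 digest) so results can be matched to files."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return path, h.hexdigest()

def remove_duplicates_parallel(input_files_path, num_processes=4):
    """Hash files in a process pool, then delete repeated contents."""
    paths = sorted(
        os.path.join(input_files_path, name)
        for name in os.listdir(input_files_path)
        if os.path.isfile(os.path.join(input_files_path, name))
    )
    with Pool(processes=num_processes) as pool:
        results = pool.map(hash_file, paths)  # map preserves input order
    seen = set()
    for path, digest in results:
        if digest in seen:
            os.remove(path)  # a later file with the same content is the duplicate
        else:
            seen.add(digest)
```

Only the hashing (the I/O- and CPU-heavy part) is parallelized; the deletions stay in the main process so the seen-hash set needs no locking.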
# codeholickk/Removing-Duplicate-Docs-Using-Hashing-in-Python