Hi Kalebu, the script is very useful. I have several thousand files, some of which are duplicates. However, the script exits with an error when it encounters a file whose name is not valid UTF-8.
I am running this on an Ubuntu MATE 18.04.5 LTS (Bionic Beaver) computer.
I renamed the script to remove-duplicate-files.py3, and I am calling it like so:
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
S=$(date) ; python3 ./remove-duplicate-files.py3 ; E=$(date) ; echo -e "start = $S ..... \n end = $E"
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This is the output I get (I have re-run it, so the previous duplicates have already been cleaned):
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
**************** DUPLYTHON ****************************
---------------- WELCOME ----------------------------
---------------- WELCOME ----------------------------
Cleaning .................
Traceback (most recent call last):
  File "./remove-duplicate-files.py3", line 69, in <module>
    App.main()
  File "./remove-duplicate-files.py3", line 65, in main
    self.welcome();self.clean();self.cleaning_summary()
  File "./remove-duplicate-files.py3", line 53, in clean
    print(raw_string, '.. cleaned ')
UnicodeEncodeError: 'utf-8' codec can't encode characters in position 0-5: surrogates not allowed
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Can you suggest a change to the script so that it does not fail on a filename containing bytes that are not valid UTF-8?
And can it be made to print the name of the file it exited on?
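To illustrate the kind of change I mean, here is a minimal sketch, not tested against the script itself. It assumes the filenames come from something like os.listdir(), where Python represents undecodable bytes as lone surrogates (the "surrogateescape" error handler), which is what print() then refuses to encode. The function names printable_name and report_cleaned are made up for this example and are not part of the script.

```python
def printable_name(name):
    """Return a copy of `name` that can always be printed as UTF-8.

    Hypothetical helper, not part of the original script. Assumes any lone
    surrogates in `name` came from the 'surrogateescape' handler, as they
    do for names returned by os.listdir() or os.walk().
    """
    # Turn the escaped surrogates back into the original raw bytes.
    raw = name.encode('utf-8', errors='surrogateescape')
    # Decode again, showing each invalid byte as a visible \xNN escape.
    return raw.decode('utf-8', errors='backslashreplace')


def report_cleaned(name):
    """Print the '.. cleaned' message without raising UnicodeEncodeError."""
    print(printable_name(name), '.. cleaned ')
```

With this, a name such as 'abc\udcff.txt' would be shown as abc\xff.txt instead of stopping the run, so the offending file can still be identified in the output.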
It would also be useful if the script could live in a different directory from the one I want to clean, and ask me for the name of the directory to clean.
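Something along these lines is what I have in mind for choosing the directory. Again this is only a sketch under my own assumptions: choose_directory is a hypothetical name, and I do not know how the script currently decides which directory to work in.

```python
import os


def choose_directory(argv):
    """Return the absolute path of the directory to clean.

    Hypothetical helper, not part of the original script. Uses the first
    command-line argument when one is given, otherwise asks the user.
    """
    if len(argv) > 1:
        target = argv[1]
    else:
        target = input('Directory to clean: ')
    # Expand '~' and resolve relative paths against the current directory.
    target = os.path.abspath(os.path.expanduser(target.strip()))
    if not os.path.isdir(target):
        raise NotADirectoryError(target)
    return target
```

The script could then call choose_directory(sys.argv) (after importing sys) at startup and scan that path, so that it works no matter where the script file itself is stored.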
Thanks,
bradw2002