Dies on certain images #52
I think I'm getting the same error here. I did read somewhere just delete the last picture it was working on, but it still crashes.
Even if it just moved the image to a special folder for now. It would be so
much better than just dying after hours of running.
…On Thu, Jan 28, 2021 at 2:56 AM Mateusz Soszyński ***@***.***> wrote:
Ugh... piexif is super buggy 😕
Tho I don't know if I can replace it... I need to just wrap it in
try-catch...
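The try/catch wrapper being discussed could be sketched like this (an illustration only, not the script's actual code; `safe_load_exif` and the loader argument are hypothetical names):

```python
def safe_load_exif(path, loader):
    """Wrap a fragile exif reader (e.g. piexif) so one corrupt image
    doesn't kill a run that has been going for hours."""
    try:
        return loader(path)
    except Exception as exc:  # the reader can raise several error types
        print(f"Can't read exif for {path}: {exc}")
        return None

def broken_loader(path):
    # Stand-in for a reader that chokes on a corrupt exif segment
    raise ValueError("corrupt exif segment")
```

With this shape, a failing image produces a warning and `None` instead of a crash, and the caller can fall back to other date sources.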
I know... nevertheless, it's weird - I thought all exif operations were ... I will improve this when I have some time 👍
This is a little off-topic, but I'll be adopting PhotoPrism, and I see a setting for not creating ExifTool JSON files, which implies it does produce such files. Does that mean I could import a Takeout directly into PhotoPrism? Regardless, I was going to wait until this issue is fixed and then add the pictures into PhotoPrism; I was just curious how it would work with existing ExifTool files, or if it would ignore them.
Ah, I see they have their own page on it. I also looked to see if they had any issues open for the new Google Takeout years format, but I don't see anything. I'm going to give their import process a shot.
Google's JSONs are not from ExifTool :/
Can you link it here? I'm curious what they have
Let us know how well it works 👍
As always, try it out and let me know if it works 👍
Nope. Still fails on the images it doesn't like:
Um, did you even update the script?
You should at least get the "oh-oh, script crashed" message I introduced in #56. For me, the output for your problematic image looks like this:

Heeeere we go!
=====================
Fixing files metadata and creation dates...
=====================
test/IMG_4661(3).jpg
Can't read file's exif!
No exif for test/IMG_4661(3).jpg
Couldn't find json for file
Last chance, coping folder meta as date...
Couldn't pull datetime from album meta
ERROR! There was literally no option to set date!!!
TODO: We should do something about this - move it to some separate folder, or write it down in another .txt file...
=====================
Coping all files to one folder...
(If you want, you can get them organized in folders based on year and month. Run with --divide-to-dates to do this)
=====================
=====================
Removing duplicates...
=====================
DONE! FREEEEEDOOOOM!!!
Final statistics:
Files copied to target folder: 1
Removed duplicates: 0
Files for which we couldn't find json: 1
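The fallback order the log walks through (exif, then json, then album/folder meta) can be sketched as below (a simplified illustration; `pick_date` is a hypothetical name, not the script's actual function):

```python
def pick_date(exif_date, json_date, folder_date):
    """Return the first available date source, mirroring the
    fallback order shown in the log; None means that source failed."""
    for date in (exif_date, json_date, folder_date):
        if date is not None:
            return date
    # "There was literally no option to set date" - candidate for
    # moving to a separate folder instead of erroring out
    return None
```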
I'm referring to this help topic. And ah, ok, I just assumed regarding ExifTool. Anyway, I did what the help topic suggested, and it seems it does import the data, but it's still a mess of course (maybe that's because of the new structure). I'm thinking of trying this out again with your patch. Just curious, though: what happens to files that can't find JSON? I'm assuming they get left in the original directory? Would I be left with a folder full of the original Takeout, and a folder that has most of the pictures but not the ones that failed?
OK, I somehow had two versions installed (in /usr/local and ~/.local). I
uninstalled everything, re-installed and it got through my test broken
images. I will try on the full set again today. Thanks!
The script tries to find any other way to determine the creation date - from exif or the folder name - and if there is absolutely no way, it just copies the file as-is. Although I want to change this behavior later so it copies such files to a separate folder.
Okay, so I was able to run this successfully this time, but I noticed the output folder is 2GB smaller: 93 gigabytes vs 95 gigabytes. I don't think every file is in the output folder. What do you think of my results?
HUH. This is either very lucky, or very weird... especially for 95GB 🤔
Do you maybe have Linux/Mac? You can easily do:

cd your/takeout/folder
du -ch **/*.json
# This will print out the total weight of all json files

For my 4.3GB sample it was 31MB... Please try to find out how to count it on Windoza, if you have the misfortune to have it.
I'm gonna be honest - I don't know, and have no good way to test, if this script works flawlessly and copies everything... all the workarounds around duplicates etc. made it complicated... but it should... I just have an idea - can I replace the final
Out of curiosity - can you tell me (just
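A cross-platform way to get the same total as the `du -ch **/*.json` command above, for anyone stuck on Windows (a sketch; `json_total_bytes` and the folder path are placeholders):

```python
import os

def json_total_bytes(folder):
    """Sum the sizes of all .json files under folder, recursively,
    like `du -ch **/*.json` but portable."""
    total = 0
    for root, _dirs, files in os.walk(folder):
        for name in files:
            if name.lower().endswith(".json"):
                total += os.path.getsize(os.path.join(root, name))
    return total

# Example: print(json_total_bytes("your/takeout/folder") / 1024**2, "MB")
```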
Total weight is at 123M. Failed PNGs at 910, failed JPGs at 447. Let me know if you want me to test anything else. |
Failed files should be moved too, hmmm I think this is our
Try to find if you have any photos/videos that are not from this list:

photo_formats = ['.jpg', '.jpeg', '.png', '.webp', '.bmp', '.tif', '.tiff', '.svg', '.heic']
video_formats = ['.mp4', '.gif', '.mov', '.webm', '.avi', '.wmv', '.rm', '.mpg', '.mpe', '.mpeg', '.m4v']
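A case-insensitive membership check against the two lists above would also catch the capitalized extensions (.HEIC, .JPG, etc.) that show up in real Takeouts (a sketch; `is_supported` is a hypothetical helper, not the script's actual code):

```python
import os

photo_formats = ['.jpg', '.jpeg', '.png', '.webp', '.bmp', '.tif',
                 '.tiff', '.svg', '.heic']
video_formats = ['.mp4', '.gif', '.mov', '.webm', '.avi', '.wmv',
                 '.rm', '.mpg', '.mpe', '.mpeg', '.m4v']

def is_supported(filename):
    # lower() makes .HEIC / .JPG / .MOV match the lowercase lists
    ext = os.path.splitext(filename)[1].lower()
    return ext in photo_formats or ext in video_formats
```

Note that .mkv is in neither list, which is consistent with the MKV files turning out to be the missing 2GB later in the thread.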
I sorted through using extensions, and the only slight deviations I see are some files with capitalized extensions mixed in --> .HEIC .JPG .MOV .MP4. This is from the output folder.
So even in Google Photos, I have a lot of pictures that lost their metadata. I had switched Google accounts at some point and uploaded all my pictures without the JSON; I don't even know if Google's upload tools take them into account. Anyway, a quarter of my library is under one day in Google Photos. Now, as soon as I download or extract these pictures, they end up having a created date of today, but the filename is right. Is it possible to add an option to treat files whose creation date is the current day as wrong, and to get the date from the filename instead? This is an example - IMG_20161223_183024 1.jpg - this file has a date of June 10th, 2020 in Google Photos, but that was, I believe, the day I uploaded it to the second Google account. When I download it, the date becomes today, and when I extract it from the Takeout archive it also becomes today.
Well, not quite a quarter of my pictures - 920 to be exact - but that's still a lot I need to fix somehow. A lot of them are saved Snapchats, though, which have random filenames; besides those, I have a lot that have the date and time in the filename.
Huh... that is doable... Maybe I will do this in a separate branch, just for you, because it could mess up the script (and its performance) very much, and 99% of people won't use it. Then, you will just manually
Ok, it's up to you. I was looking at using the --divide-to-dates parameter to see what all pops up in today's folder, so I can see all the problem files I've accumulated from re-uploads to Google Photos. I looked around and saw examples of using exiftool to do it, but I haven't tested the commands yet, because I'm moving the archives to another system that has more storage, so I don't have to keep getting low disk space warnings.
That's not to say this will find the 2GB of data not in the output folder, but I could always use the tool on everything separate from the folder containing all the missing metadata and see if that still happens. |
That's a good idea! Then you can do:

import os
from datetime import datetime

for f in os.listdir():
    if f.startswith('IMG_'):
        # IMG_20161223_183024.jpg -> characters 4-12 are YYYYMMDD
        date = datetime.strptime(f[4:12], '%Y%m%d')
        ts = date.timestamp()
        os.utime(f, (ts, ts))  # set access and modification times

# This is just a reference script. I can finish it if you don't know how to do it yourself 👍
Perhaps #57 fixed your problem? Try searching for more weird files
...but inside the input folder
Okay, still missing 2GB. I'm on my Mac now, though, so I used HoudahSpot to do a more advanced search, and these are the extensions my takeout has: m4v, gif, heif, jpeg, mkv, mts, mp4, png, mov, bmp. I think it's the MKV files! They weren't even supposed to be in Google Photos; they accidentally got uploaded lol.
Yay! So my script isn't fundamentally broken (maybe) 🎉! Updated it. Try to
Sorry, I was waiting for my weekend. I ran the script, and my input folder is at 101.82 gigabytes while my output folder is coming to 101.8 gigabytes, so it looks like those MKVs are transferring! Although I'm deleting them because they shouldn't even be in my library, at least the script now accounts for MKV. I'm working on going through the 1,641 files with the wrong date, and luckily I am getting somewhere: some are junk that I can delete, and for the most important ones I can fix the dates by the filename. Anyway, thanks for your help!
After running for 20+ hours, the script dies on a specific image, even though the image parses and displays fine. I have reproduced it with a directory containing just that image. This is running release 2.0 on Ubuntu Linux 20.10.
The log looks like this -
Here is a link to the file