JSON naming too long? #8

Closed
comfreak89 opened this issue Nov 26, 2020 · 14 comments · Fixed by #27
Labels
bug Something isn't working

Comments

@comfreak89

I have several JSON files whose names are probably too long. For example:

oot@linux:/mnt/data2/Takeout/Google Fotos/2005-02-05# ls -l
total 15111
-rwxrwxrwx 1 root root    440 Nov 26 03:03  Metadaten.json
-rwxrwxrwx 1 root root    752 Jan 12  2020 'Urlaub in Knaufspesch in der Schneifel (38).JP.json'
-rwxrwxrwx 1 root root 341685 Feb  5  2005 'Urlaub in Knaufspesch in der Schneifel (38).JPG'
-rwxrwxrwx 1 root root    752 Jan 12  2020 'Urlaub in Knaufspesch in der Schneifel (39).JP.json'
-rwxrwxrwx 1 root root 330766 Feb  5  2005 'Urlaub in Knaufspesch in der Schneifel (39).JPG'
-rwxrwxrwx 1 root root    752 Jan 12  2020 'Urlaub in Knaufspesch in der Schneifel (40).JP.json'
-rwxrwxrwx 1 root root 315658 Feb  5  2005 'Urlaub in Knaufspesch in der Schneifel (40).JPG'
-rwxrwxrwx 1 root root    752 Jul  3 07:34 'Urlaub in Knaufspesch in der Schneifel (41).JP.json'
-rwxrwxrwx 1 root root 423738 Feb  5  2005 'Urlaub in Knaufspesch in der Schneifel (41).JPG'

Is this why your script does not find the JSON files?

@TheLastGimbus
Owner

TheLastGimbus commented Nov 29, 2020

Ummm, I'm not sure what you mean by this, but no - finding the corresponding JSON fails because the names don't match (e.g. photo_name.jpg should have photo_name.jpg.json - and sometimes it doesn't 😕) - not because the JSON name is too long...

@jmigual
Contributor

jmigual commented Dec 1, 2020

Hi, I would like to reopen this issue. I have checked that Google Photos JSON file names are limited to 51 characters, including the .json ending, so the export truncates the file name if it is longer than 46 characters. We should at least try to open a JSON file with the name truncated to 46 characters and check whether its title field matches the file name that we are checking. I can write a PR for that.

@TheLastGimbus
Owner

Ooohh... this is interesting... yet another way for Google to make it harder 🔥

Could you send examples of that happening here (does it happen at exactly 51 chars, all the time)? This is so absurd that I could even try to upload new photos myself with a special super long name, take another takeout and see if it does that 😆

If so, yeah, PR would be very welcome 🙃 - you just need to modify def find_json_for_file - add second search for stripped name if the first fails
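
A rough sketch of what such a second search could look like. This is not the project's actual code: the signature of `find_json_for_file` here is hypothetical, the 51-character limit is only what was observed in this thread, and the `title` key is assumed to hold the original file name.

```python
import json
from pathlib import Path
from typing import Optional

JSON_SUFFIX = ".json"


def find_json_for_file(media_path: Path, max_json_len: int = 51) -> Optional[Path]:
    """Hypothetical lookup: exact sidecar name first, truncated name second."""
    # First search: photo_name.jpg -> photo_name.jpg.json
    exact = media_path.with_name(media_path.name + JSON_SUFFIX)
    if exact.is_file():
        return exact

    # Second search only makes sense if the name would have been truncated.
    stem_len = max_json_len - len(JSON_SUFFIX)  # 46 with the observed limit
    if len(media_path.name) <= stem_len:
        return None

    truncated = media_path.with_name(media_path.name[:stem_len] + JSON_SUFFIX)
    if not truncated.is_file():
        return None

    # Accept the truncated candidate only if its title matches the media file.
    try:
        with truncated.open(encoding="utf-8") as fh:
            data = json.load(fh)
    except (OSError, ValueError):
        return None
    if isinstance(data, dict) and data.get("title") == media_path.name:
        return truncated
    return None
```

With the listing from the first comment, `Urlaub in Knaufspesch in der Schneifel (38).JPG` (47 characters) would fall through to the second search and find `Urlaub in Knaufspesch in der Schneifel (38).JP.json`, provided the title inside matches.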

@TheLastGimbus TheLastGimbus reopened this Dec 1, 2020
@TheLastGimbus TheLastGimbus pinned this issue Dec 1, 2020
@jmigual
Contributor

jmigual commented Dec 1, 2020

Okay, I did more research and it seems to be a total mess. So, out of a total of 50874 json files in my dataset, 2382 json files had a length of exactly 51 characters.

But there's more. I checked the pictures related to these files and they also have a length of 51 characters meaning that the filename of the picture also got truncated. See this example:

  • Json file name: 5733_1106073493870_1287760089_30252204_5769011.json (51 characters)
  • JPG file name: 5733_1106073493870_1287760089_30252204_5769011_.jpg (51 characters)
  • Title in JSON file: 5733_1106073493870_1287760089_30252204_5769011_n[1].jpg (55 characters)

So they truncate everything > 51 because why not? (WTF google?)

To make matters worse, I found 4 pictures and their JSONs with a file name length of 54 characters. So it seems that they truncate, but there are some weird exceptions 🤷‍♂️.

So, checking the title field will only help for pictures where the file name is between 46 and 51 characters. Which is not much, but it's something.
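
For anyone who wants to run the same check on their own takeout, here is a minimal sketch of how file name lengths could be tallied. The function name is made up for illustration, and this is not necessarily how the numbers above were produced.

```python
from collections import Counter
from pathlib import Path


def name_length_histogram(root: Path, suffix: str = ".json") -> Counter:
    """Count files under root by the length of their file name (not full path)."""
    return Counter(
        len(path.name)
        for path in root.rglob("*")
        if path.is_file() and path.name.lower().endswith(suffix)
    )


if __name__ == "__main__":
    # Assumption: the takeout was extracted into ./Takeout
    histogram = name_length_histogram(Path("Takeout"))
    for length, count in sorted(histogram.items()):
        print(f"{length:3d} chars: {count} files")
```

A spike at one length with almost nothing above it would point to a truncation limit like the one described above.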

@bitsondatadev
Contributor

bitsondatadev commented Dec 4, 2020

Hm... I wonder what the likelihood of collisions is when just matching the first 50 characters, though. I would imagine it should ideally follow some convention that would make it unique. @jmigual would you mind verifying whether there are any images that match multiple JSON files if you use 50 characters?
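
One way such a collision check could be done, as a sketch only. `prefix_collisions` is a hypothetical helper; it groups names by their first N characters and reports any prefix shared by more than one name.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def prefix_collisions(names: Iterable[str], prefix_len: int = 50) -> Dict[str, List[str]]:
    """Return every prefix of prefix_len characters shared by two or more names."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for name in names:
        groups[name[:prefix_len]].append(name)
    # Keep only the ambiguous prefixes.
    return {prefix: sorted(group) for prefix, group in groups.items() if len(group) > 1}
```

Running this over the JSON file names of a single album folder would show whether matching on the first 50 characters could ever pick the wrong file.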

@jmigual
Contributor

jmigual commented Dec 4, 2020

Sorry, I don't think I follow. Do you mean checking if multiple images can match the title field in the JSON file?

@jmigual jmigual mentioned this issue Dec 4, 2020
@TheLastGimbus
Owner

Idea: we could just load every json in folder and check for the name inside the tags - not by json's file name. It would theoretically be less efficient, but would be much more successful
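
A minimal sketch of that idea, assuming each JSON has a top-level `title` key holding the original file name (as in the example earlier in this thread). The function name is hypothetical and this is not the code that ended up in the project.

```python
import json
from pathlib import Path
from typing import Dict


def index_jsons_by_title(folder: Path) -> Dict[str, Path]:
    """Map the title stored inside each JSON in folder to that JSON's path."""
    index: Dict[str, Path] = {}
    for json_path in sorted(folder.glob("*.json")):
        try:
            with json_path.open(encoding="utf-8") as fh:
                data = json.load(fh)
        except (OSError, ValueError):
            continue  # unreadable or malformed JSON: skip it
        title = data.get("title") if isinstance(data, dict) else None
        if isinstance(title, str):
            # If two JSONs share a title, keep the first one found.
            index.setdefault(title, json_path)
    return index
```

Each folder is read once and every lookup after that is a dictionary access, so the extra cost is one pass over the JSON files. Note that when the picture's own file name was truncated too, the title will be longer than the name on disk, so an exact match on the title would still miss those files.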

@bitsondatadev
Contributor

> Sorry, I don't think I follow. Do you mean checking if multiple images can match the title field in the JSON file?

This is not what I meant but I think that's a better idea and what @TheLastGimbus is alluding to. If the json file has the full name then that's what we should use.

@OscarVanL

Has anyone reported this to google, because this is surely not desired behaviour?

@TheLastGimbus
Owner

TheLastGimbus commented Dec 7, 2020

Google doesn't care. You can try to write an email to them, but you will probably get some typical corporate waffle response:

"Sorry Sir, our Takeout service doesn't plan on any changes in its behavior"

You know, this whole repo is because of "not desired behaviour"...

@OscarVanL

That's a shame...

@TheLastGimbus
Owner

Okay, so I wrote a post on Google Support:

https://support.google.com/accounts/thread/88318924

But this service seems to be "community support" - not something actual Google employees look at... But we will see 👍 You guys can give it a +1 to make it more visible...

@bitsondatadev
Contributor

Upvoted

@TheLastGimbus
Owner

This should be solved in new beta - try it out and report if there are any issues - if not, I'll release it as official:

pip install -U google-photos-takeout-helper==2.1.0b1
