Question: report shows way less images uploaded than scanned #390

Closed
joselsegura opened this issue Jul 18, 2024 · 8 comments

@joselsegura

Hi immich-go team!

I discovered you a few days ago and have been using your software to upload my Google Photos backup to my Immich instance.
I downloaded 3 Google Takeout tgz archives from the Takeout service and uncompressed them, as the README and instructions say, in order to run immich-go over them.

I put all 3 directories together inside the same directory and ran immich-go -server https://XXXXXXXXX -key ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ upload -create-albums -google-photos ..

After a while, the upload was finished and the report was shown:

Input analysis:
---------------
scanned image file                      :   48682
scanned video file                      :    1835
scanned sidecar file                    :   48483
discarded file                          :       0
unsupported file                        :     105
file duplicated in the input            :    4168
associated metadata file                :   25771
missing associated metadata file        :   24746

Uploading:
----------
uploaded                                :   20749
upload error                            :       2
file not selected                       :       4
server's asset upgraded with the input  :       0
server has same asset                   :     844
server has a better asset               :       2

As you can see, it scanned more than 50k photos+videos, but the report says that "only" ~21k were uploaded to my server. Did I do something wrong in my run? Should I repeat the execution and expect a different result, or is this expected?

@simulot
Owner

simulot commented Jul 18, 2024

> tgz

Direct support of the TGZ format requires reading and then decompressing the archive twice. It is better to process the result of the decompression.

> scanned more than 50k photos+video but only ~21k were uploaded

First, the Google Takeout is full of duplicates, which explains part of the difference.

But immich-go is also confused by iPhone file names that create duplicates. IMG_1234.HEIC appears several times in the archive, and is sometimes confused with IMG_1234.JPG and IMG_1234_MP4.
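
To make the name clash concrete, here is a small sketch that groups paths by their file name without extension and reports the names shared by more than one file. It is only an illustration and not immich-go's actual matching logic; `group_by_stem` and the paths in `listing` are hypothetical.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_by_stem(paths):
    """Return {STEM: [paths]} for every stem shared by more than one path."""
    groups = defaultdict(list)
    for raw in paths:
        # The stem is the file name without its last extension,
        # upper-cased so that img_1234.jpg and IMG_1234.HEIC compare equal.
        groups[PurePosixPath(raw).stem.upper()].append(raw)
    return {stem: names for stem, names in groups.items() if len(names) > 1}


# Hypothetical listing, made up for the example.
listing = [
    "Takeout/Google Photos/Photos from 2023/IMG_1234.HEIC",
    "Takeout/Google Photos/Photos from 2023/IMG_1234.JPG",
    "Takeout/Google Photos/Trip/IMG_1234.HEIC",
    "Takeout/Google Photos/Photos from 2023/IMG_5678.JPG",
]

collisions = group_by_stem(listing)
for stem, names in collisions.items():
    print(stem, len(names))
```

Files that share a stem are not necessarily duplicates (a HEIC and a JPG may be different shots, or two renditions of the same one), which is why matching on names alone is ambiguous.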

I'm working on the issue.

You can help by providing logs and debug files. For more privacy, you can send them via Discord (@simulot).

immich-go -server https://XXXXXXXXX -key ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ upload -debug-counters -create-albums -google-photos 

Stay tuned

@joselsegura
Author

Hi!

I didn't use the direct tgz support; I decompressed the files and ran immich-go over the decompressed directory.

I ran some numbers using fdupes to find the duplicates, and it makes sense. I don't have an iPhone, so I don't have the HEIC files problem at all.
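
For anyone who wants to repeat that kind of duplicate count without fdupes, a minimal sketch of a similar idea (group files by size, then by SHA-256 of their content) could look like the following. It is not how fdupes or immich-go are implemented; `file_digest` and `find_duplicates` are hypothetical helper names.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large videos are not loaded into memory at once."""
    h = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root: Path) -> list[list[Path]]:
    """Return groups of files under root that have identical content."""
    by_size = defaultdict(list)
    for p in root.rglob("*"):
        if p.is_file():
            by_size[p.stat().st_size].append(p)

    by_hash = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have a content duplicate
        for p in paths:
            by_hash[file_digest(p)].append(p)

    return [sorted(group) for group in by_hash.values() if len(group) > 1]
```

Calling `find_duplicates(Path("."))` from the directory holding the extracted takeout returns one list per set of identical files; summing `len(group) - 1` over the groups gives the number of redundant copies.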

@simulot
Owner

simulot commented Jul 21, 2024

I'm rewriting the google photo import.
I can work on your logs if you agree.

@joselsegura
Author

Sure, I don't have any problem sharing my logs with you. They are huge as I have a ton of pics there...

@simulot
Owner

simulot commented Jul 22, 2024

You can also share just the list of your files, not their content. Run the following command:

for f in *.zip; do echo "$f: "; unzip -l "$f"; done > list.lst

@cocoands

@simulot Let me know if you need another list of file names. I had a very similar experience on v 0.20.1.

Input analysis:
---------------
scanned image file                      :    2832
scanned video file                      :      89
scanned sidecar file                    :     769
discarded file                          :       0
unsupported file                        :       0
file duplicated in the input            :       0
associated metadata file                :       7
missing associated metadata file        :    2914

Uploading:
----------
uploaded                                :       4
upload error                            :       0
file not selected                       :       0
server's asset upgraded with the input  :       0
server has same asset                   :       3
server has a better asset               :       0

@simulot
Owner

simulot commented Jul 29, 2024

The report shows 2914 photos not associated with a JSON file.

This is unusual. Those files are ignored.
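
As a rough cross-check, the counters of both reports in this thread can be tallied. This is plain arithmetic on the posted numbers. Reading "associated" plus "missing" as the total of scanned media, and the upload counters as outcomes for files that had a JSON, is an interpretation and not something stated by the tool in this thread. The `Report` class and its field names are made up for the illustration.

```python
from dataclasses import dataclass


@dataclass
class Report:
    images: int
    videos: int
    duplicates: int     # "file duplicated in the input"
    with_json: int      # "associated metadata file"
    missing_json: int   # "missing associated metadata file"
    outcomes: dict      # counters from the "Uploading" section

    @property
    def scanned_media(self) -> int:
        return self.images + self.videos

    @property
    def handled(self) -> int:
        return sum(self.outcomes.values())


first = Report(48682, 1835, 4168, 25771, 24746, {
    "uploaded": 20749, "upload error": 2, "file not selected": 4,
    "server's asset upgraded": 0, "server has same asset": 844,
    "server has a better asset": 2,
})
second = Report(2832, 89, 0, 7, 2914, {
    "uploaded": 4, "upload error": 0, "file not selected": 0,
    "server's asset upgraded": 0, "server has same asset": 3,
    "server has a better asset": 0,
})

for name, r in (("first", first), ("second", second)):
    print(name,
          "scanned:", r.scanned_media,
          "with+missing JSON:", r.with_json + r.missing_json,
          "with JSON minus duplicates:", r.with_json - r.duplicates,
          "upload outcomes:", r.handled)
```

Under that reading, both reports balance at the scan level (50517 and 2921 media files respectively), and in the second report the 7 files with a JSON are exactly the 4 uploaded plus the 3 already on the server. In the first report, 25771 files with a JSON minus 4168 input duplicates leaves 21603 against 21601 upload outcomes; the thread does not explain the remaining 2.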

There are 2 main causes:

  1. You have processed only one part of the takeout. Use takeout-*.zip.
  2. These JSON files are missing from the takeout. Ask for another takeout.

The next version of immich-go will give this advice.
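
To see which of the two causes applies before re-running the upload, one option is to list the entries of all takeout parts together and look for media files without a JSON next to them. The sketch below assumes the sidecar is named after the media file with `.json` appended (for example `IMG_1234.jpg.json`) and sits in the same folder; real takeouts have naming edge cases that this does not handle. `MEDIA_EXTENSIONS`, `list_archive_names` and `media_without_sidecar` are hypothetical names, and this is not immich-go's own logic.

```python
import zipfile
from pathlib import Path, PurePosixPath

# Assumed set of media extensions, for illustration only.
MEDIA_EXTENSIONS = {".jpg", ".jpeg", ".png", ".heic", ".gif", ".mp4", ".mov"}


def list_archive_names(archives):
    """Collect the entry names of several ZIP archives without extracting them."""
    names = set()
    for archive in archives:
        with zipfile.ZipFile(archive) as zf:
            names.update(zf.namelist())
    return names


def media_without_sidecar(names):
    """Return media entries that have no '<name>.json' entry in the same folder."""
    return sorted(
        n for n in names
        if PurePosixPath(n).suffix.lower() in MEDIA_EXTENSIONS
        and n + ".json" not in names
    )
```

For example, `media_without_sidecar(list_archive_names(sorted(Path(".").glob("takeout-*.zip"))))` checks all parts at once. If a file is reported as missing its JSON when only one part is listed but not when all parts are listed, the first cause applies; if it is still reported with every part present, the JSON is likely absent from the takeout itself.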

@cocoands

You are absolutely right. Complete user error. I needed to pay closer attention to the Google Photos advice in the README. takeout-*.zip. Thanks for the quick response and thank you for building this tool!

@simulot simulot closed this as completed Jul 29, 2024
simulot added a commit that referenced this issue Jul 29, 2024
simulot added a commit that referenced this issue Jul 29, 2024
* errors when uploading are disturbing the the % of the progression
Fixes #376

* add test for Question: report shows way less images uploaded than scanned #390

* add tests for #390

* add fakefs to test with takeout lists

* wip: duplicate count

* wip: counters

* WIP takeout by directory

* fix name matchers for duplicate in year and live photo

* wip fixe: missing image from the same directory but different type

* edit release.md

* fix .MP.jpg

* fix: Problem with images with same name #402

* fix Wrong creation date results in false album assignment #392

* fix Problem with images with same name #402 , #390,  #376,  #401

* fix tests

* Merge branch 'main' into append-zip-list-fake-file-system

* linter fix

* linter fix
simulot added a commit that referenced this issue Jul 30, 2024
* edit readme.md

* readme.md

* edit docs/releases.md

* errors when uploading are disturbing the the % of the progression
Fixes #376

* add test for Question: report shows way less images uploaded than scanned #390

* add tests for #390

* add fakefs to test with takeout lists

* wip: duplicate count

* wip: counters

* WIP takeout by directory

* fix name matchers for duplicate in year and live photo

* wip fixe: missing image from the same directory but different type

* edit release.md

* fix .MP.jpg

* fix: Problem with images with same name #402

* fix Wrong creation date results in false album assignment #392

* fix Problem with images with same name #402 , #390,  #376,  #401

* fix tests

* chore(deps): bump github.com/telemachus/humane from 0.5.1 to 0.6.0 (#407)

* chore(deps): bump github.com/navidys/tvxwidgets from 0.6.0 to 0.7.0 (#408)

* Merge branch 'main' into append-zip-list-fake-file-system

* linter fix

* linter fix

* fix counters when force upload when missing JSON

* improved progression for no-ui

* edit readme.md

* edit release
3 participants