Step 9: Counts down all the groups without displaying any photos #20

Closed
starbuck93 opened this issue Apr 12, 2024 · 20 comments

Comments

@starbuck93

starbuck93 commented Apr 12, 2024

clicked enter while not in the text box and it submitted... hold on let me type something up

Hey agross, I've been trying to get this to work for a couple of weeks now (since the API changed) and I've had a small success today but also I'm getting some errors in step 9. I have over 3000 duplicates (thanks Google Takeout).
My issue is that when I paste dupes.json into the text box and click OK, the Groups number very quickly runs down to zero. This is an example of an error message in the network tab in the browser console.

changing out the UUID for a bunch of zeros

GET http:// LAN IP Address:8081/api/asset/00000000-0000-0000-0000-562a40c9bd70

{
    "message": "Not found or no asset.read access",
    "error": "Bad Request",
    "statusCode": 400
}

Then the next line says something like

{
    "id": "00000000-0000-0000-0000-562a40c9bd70",
    "name": "",
    "birthDate": null,
    "thumbnailPath": "upload/thumbs/00000000-0000-0000-0000-e150d52cb601/00000000-0000-0000-0000-562a40c9bd70.jpeg",
    "isHidden": false
}

I've also seen some no person.read access errors, too.

Somehow, I was able to see 1 duplicate and make a decision on which one to keep, but only 1 out of the over 3000.

Thanks!

@agross
Owner

agross commented Apr 12, 2024

Please have a look at your browser's developer tools (and the Console section there).

The behavior you see should only happen if, for example, a group of duplicate assets was pasted during setup but some of those assets have been removed in the meantime, leaving a group with just a single asset.

@starbuck93
Author

That sounds like I should probably re-run the dupes.json process? (still finishing up my issue, but I suspect you're correct)

@agross
Owner

agross commented Apr 12, 2024

You could try rerunning dupes. Only you know whether assets were deleted since the last run! If none were, then this could be a bug, a breaking change in Immich's API, or an issue with your setup.

@starbuck93
Author

I don't think I manually deleted any, but I suspect the cause is the change from the old file structure to the new one (splitting generated content into a separate folder). I'm re-running dupes.db/json right now, and I'll update later.

@agross
Owner

agross commented Apr 12, 2024

I don't know whether you are a developer, but what you see is a side-effect of this code (also the no person.read access messages in the log):

for (const assetId of props.assetIds) {
  try {
    const meta = await fetchMetadata(assetId)
    const albums = await fetchAlbumInfo(assetId)
    // Ignore assets in the Immich Trash.
    if (meta.isTrashed) {
      continue
    }
    loadedAssets.value.set(assetId, {
      meta: meta,
      albums: albums
    })
  } catch (err: any) {
    if (await isPerson(assetId)) {
      ignore()
    }
    let message: string = err.message
    assetLoadErrors.value.set(assetId, message)
  }
}
// If there are not enough assets left in the group because trashed assets have
// been ignored above, ignore the whole group.
if (loadedAssets.value.size < 2) {
  ignore()
}

The idea is this per duplicate group (set of duplicate assets):

  1. For each asset, attempt to load its metadata and album info - this will fail if the asset does not exist any more or it received a new ID (by splitting, no idea?)
  2. If the metadata indicates that the asset is trashed, kick it out of the group
  3. Otherwise add the asset ID to the list of successfully loaded assets
  4. If loading the metadata or album info fails, check whether the asset was a person's headshot image (for Immich's "People" feature) and, if so, ignore the whole group (i.e. "count down")
  5. After all asset IDs of the group have been processed, only consider it further if the number of successfully loaded assets is 2 or more - otherwise ignore the whole group (i.e. "count down")

@starbuck93
Author

Thank you for that, that's a good explanation.

I created a new dupes.json, around 3,500 groups, and I was able to do 1 comparison between 5 photos and choose the highest quality photo, then it immediately starts counting down, probably 100 per second before I stop it. I'm not sure what happened, I'm getting a lot of asset.read and person.read access errors. I'll have some time tomorrow to dig into this again.

@agross
Owner

agross commented Apr 12, 2024

Just a guess: Do you have multiple user accounts in Immich, and is your API key for a different user than the one with the 3.5k dupes?

@agross
Owner

agross commented Apr 12, 2024

Please docker pull ghcr.io/agross/immich-duplicates-browser:latest. It will no longer ignore a group when fewer than 2 assets are left because of load errors. At least it'll stop counting down for you and you will be able to inspect the error messages, copy the image ID (GUID), and perhaps have a look at the database.
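
To make the changed behavior concrete, here is a minimal sketch of how such a decision could be written. It is not the code from the duplicate browser: the `LoadResult` type and the `decideGroup` function are hypothetical names made up for illustration, and the rule "any load error keeps the group visible" is an assumption based on the description above.

```typescript
// Hypothetical outcome of trying to load one asset of a duplicate group.
type LoadResult =
  | { kind: "loaded"; isTrashed: boolean }
  | { kind: "error"; message: string };

type GroupDecision = "show" | "ignore";

// Decide whether a group is shown or skipped ("counted down").
function decideGroup(results: LoadResult[]): GroupDecision {
  let usable = 0;
  let errors = 0;
  for (const result of results) {
    if (result.kind === "error") {
      errors++;
    } else if (!result.isTrashed) {
      usable++;
    }
  }
  // Assumption: groups with load errors stay visible so the messages can be inspected.
  if (errors > 0) {
    return "show";
  }
  // Without errors, fewer than two usable assets leaves nothing to compare.
  return usable >= 2 ? "show" : "ignore";
}
```

With this rule, a group that shrinks only because of trashed assets is still skipped, while a group that shrinks because of failed API calls remains on screen together with its error messages.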

@starbuck93
Author

You're correct, I do have more than one user and while I was pretty sure I had the user IDs correct, I went ahead and created an API key for my wife's account and pasted it in. It did actually pull up a duplicate group for me and when I clicked Keep Best, it had a 400 error:

{
    "message": "Not found or no asset.delete access",
    "error": "Bad Request",
    "statusCode": 400
}

So I'm pretty sure I had the right API key.

Let me pull the latest image and run it real quick.

@starbuck93
Author

starbuck93 commented Apr 13, 2024

The latest image ran with only ~12,000 errors in the dev console this time! /s Most of the errors were the asset.read and person.read errors ("Not found or no asset.read access"). But I was able to make decisions on about 20 groups of dupes, which I suppose is progress. I did see several new error messages in the window like this:
[Screenshot: SCR-20240413-gwtc]

@agross
Owner

agross commented Apr 13, 2024

Ah, wonderful. The "id must be a UUID" error is very likely due to a change in recent Immich versions where the thumbnail file name is not just <asset ID>.jpeg but now <asset ID>-preview.jpeg. I also have several thumbnails with the suffix and most without the suffix.
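
For illustration, deriving an asset ID from either file name form could look roughly like the sketch below. This is not the grouper's actual code; `assetIdFromThumbnail` and `UUID_PATTERN` are hypothetical names, and the only thing assumed about the file names is what is described above (an optional `-preview` suffix before the extension).

```typescript
// Matches a canonical 8-4-4-4-12 hexadecimal UUID, case-insensitive.
const UUID_PATTERN = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Returns the asset ID for a thumbnail file name, or null if none can be derived.
function assetIdFromThumbnail(fileName: string): string | null {
  // Keep only the last path segment.
  const parts = fileName.split("/");
  const base = parts[parts.length - 1];
  // Strip the extension first, then an optional "-preview" suffix.
  const stem = base.replace(/\.[^.]+$/, "").replace(/-preview$/, "");
  return UUID_PATTERN.test(stem) ? stem : null;
}
```

Passing the unstripped stem (with `-preview` still attached) to an API that validates IDs would plausibly produce exactly the kind of "id must be a UUID" message mentioned above.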

Please docker pull ghcr.io/agross/immich-duplicates-grouper:latest and rerun the grouping, which generates the JSON file. It would be great if you could check whether the JSON file contains the term preview (grep preview dupes.json) before pasting it into the duplicate browser.

@starbuck93
Author

starbuck93 commented Apr 13, 2024

The JSON file does not contain the term preview, is that OK? (while the old JSON did contain 46 previews)

@agross
Owner

agross commented Apr 13, 2024

Yes, "preview" should not be included in the file. The file contains UUIDs only.
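
A slightly stricter check than grepping for "preview" is to look for any string in the file that is not a UUID. The sketch below makes no assumption about how dupes.json is nested; it simply walks whatever `JSON.parse` returns and checks string values (object keys are not checked). `findNonUuidStrings` and `UUID_RE` are hypothetical names used only for this example.

```typescript
// Matches a canonical 8-4-4-4-12 hexadecimal UUID, case-insensitive.
const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Recursively collects every string value in a parsed JSON document that is not a UUID.
function findNonUuidStrings(value: unknown, found: string[] = []): string[] {
  if (typeof value === "string") {
    if (!UUID_RE.test(value)) {
      found.push(value);
    }
  } else if (Array.isArray(value)) {
    for (const item of value) {
      findNonUuidStrings(item, found);
    }
  } else if (value !== null && typeof value === "object") {
    const record = value as Record<string, unknown>;
    for (const key of Object.keys(record)) {
      findNonUuidStrings(record[key], found);
    }
  }
  return found;
}
```

Calling `findNonUuidStrings(JSON.parse(text))` on the file contents should return an empty array if the file really contains UUIDs only.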

You should also run the duplicate detection not for all thumbnails but only the ones belonging to the user account you created the API key for (by specifying the thumbnail subdirectory). This should get rid of any asset load errors that are caused by the API key not matching the user account that owns the asset.

@starbuck93
Author

I've confirmed my user ID below matches a photo I recently uploaded from my account, this is the dupes.db process I ran:

docker container run \
  --rm \
  --volume /mnt/user/immich/pictures/thumbs/:/thumbs/ \
  --volume "$PWD:/output/" \
  ghcr.io/agross/immich-duplicates-findimagedupes \
  --prune \
  --fingerprints /output/dupes.db \
  --recurse \
  --no-compare \
  --exclude '\.webp$' \
  /thumbs/bef3720d-9670-4516-b9e8-e150d52cb601/

@starbuck93
Author

starbuck93 commented Apr 13, 2024

I did a dump of my db so I can just search for some of these UUIDs that are failing in the console. The UUIDs that are failing on both the /asset and the /person API don't seem to exist in my database.... So I'm confused about that.

A lot of UUIDs will return a "400 Bad Request" on /asset and return 304 on the /person API. I guess they are just thumbnails of people?
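
As a rough way to sort the failing IDs, the two status codes can be combined as in the sketch below. It assumes that a 304 (Not Modified) counts as "exists", since a 304 only means the browser's cached copy is still valid; `classifyId` and `exists` are names made up for this example, not part of Immich or the duplicate browser.

```typescript
type IdKind = "asset" | "person" | "unknown";

// Treats 2xx and 304 (Not Modified, i.e. a still-valid cached response) as "the resource exists".
function exists(status: number): boolean {
  return (status >= 200 && status < 300) || status === 304;
}

// Classifies an ID from the status codes of the /asset and /person requests.
function classifyId(assetStatus: number, personStatus: number): IdKind {
  if (exists(assetStatus)) {
    return "asset";
  }
  if (exists(personStatus)) {
    return "person";
  }
  return "unknown";
}
```

Under that assumption, an ID with 400 on /asset and 304 on /person would be classified as a person thumbnail, and an ID that fails on both endpoints stays "unknown".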

@agross
Owner

agross commented Apr 14, 2024

> I did a dump of my db so I can just search for some of these UUIDs that are failing in the console. The UUIDs that are failing on both the /asset and the /person API don't seem to exist in my database.... So I'm confused about that.

Hm, I'm not sure why you would have thumbnails for assets that do not exist. You could have a look at the respective thumbnail JPEGs (potentially appending -preview to the file name). You can also try to clean your thumbnail directory and regenerate the thumbnails using Immich.

@starbuck93
Author

That's a good idea. I may just figure out the best way to regenerate thumbs. I do have two thumbs directories, from before the migration I mentioned a few posts above. I ran the migration in Immich through the Admin Jobs tab, but there are still a ton of files scattered around. Immich seems to work just fine, though.

root@Tower:/mnt/user/immich/pictures# tree -L 2
.
├── user-2-9a99f2ff960f
│   ├── 2011
│   ├── ... (a lot more dirs)
│   ├── 2023
│   ├── encoded-video
│   ├── original
│   ├── profile
│   └── thumb
├── my-user-e150d52cb601
│   ├── 1970
│   ├── ... (way more dirs)
│   ├── 2023
│   ├── encoded-video
│   ├── original
│   ├── profile
│   └── thumb (contains 91695 files and dirs)
├── user-3-414a25736e08
│   ├── 2023
│   ├── original
│   └── thumb
├── encoded-video
│   ├── user-2-9a99f2ff960f
│   └── my-user-e150d52cb601
├── library
│   ├── user-2-9a99f2ff960f
│   ├── my-user-e150d52cb601
│   └── user-3-414a25736e08
├── profile
│   └── my-user-e150d52cb601
├── thumbs
│   ├── user-2-9a99f2ff960f
│   └── my-user-e150d52cb601 (contains 13272 files and dirs)
└── upload
    ├── user-2-9a99f2ff960f
    └── my-user-e150d52cb601

66 directories

@agross
Owner

agross commented Apr 14, 2024

This is how it looks on my machine. Quite different!

$ tree -d -L 3 --prune
.
├── encoded-video
│   ├── 0fc60725-0009-440b-a9c1-1587d8d6cbcc
│   │   ├── 00
...
│   │   └── ff
│   └── ad2055bf-3ce4-4bc0-9884-87707fa0ee04
│       ├── 00
...
│       └── ff
├── library
│   ├── agross
│   │   ├── 2007
│   │   ├── 2007-06-10 Geocaching Suprise
...
│       └── 2023
├── profile
│   └── ad2055bf-3ce4-4bc0-9884-87707fa0ee04
├── thumbs
│   ├── 0fc60725-0009-440b-a9c1-1587d8d6cbcc
│   │   ├── 00
...
│       └── ff
└── upload
    ├── 0fc60725-0009-440b-a9c1-1587d8d6cbcc
    │   ├── 00
        └── ff

1691 directories

@starbuck93
Author

Yup... I'm going to attempt to fix this, then revisit my dupe detection!

@agross
Owner

agross commented Jun 20, 2024

Closing this as no feedback was received.

agross closed this as not planned (won't fix, can't repro, duplicate, stale) on Jun 20, 2024