Jobs fail with corrupt input files #29003
Replies: 5 comments
Wow, the formatting from the template is really unhelpful. Either way, firing the job again the next morning I see about the same number of files in the queue, which leads me to think it's aborting the whole job somewhere and starting from scratch. I suppose there's some unhandled imagick exception that crashes us? There does seem to be disk I/O after the crash, however. I'll try moving to local storage and confirm the behavior soon.
Found more info in the logs: I believe that's what dequeues and fails the job. Edit: it's directly preceded by a Redis timeout: could it be that we're failing because excessive iowait stalls the job queue instead? Somehow I never ran into that before with a ~100 GiB library of 30k+ images mounted the same way over SMB/GbE.
Migrated over; it's definitely restarting a couple of times (Initializing Immich v2.7.5): after what looks like two restarts I see 1 failed in the job queue and that's it.
After going through the library and removing all the corrupt files, I see no more errors in the log when triggering the job. Perhaps it would be worth flagging any files that trigger this and putting them in a bin of sorts, similar to Utilities > Duplicates? That would make selecting just the corrupt ones way easier.
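For anyone who wants to pre-screen a library by hand before importing, here is a minimal Python sketch of the "flag the suspects" idea. It is not part of Immich; `find_suspect_jpegs` and `looks_truncated_jpeg` are hypothetical names. It only checks that a JPEG starts and ends with the standard start/end-of-image markers, so it catches truncated or empty files but not bitrot in the middle of a file (color bands and the like), which would need a full decode with an image library. Files with extra data appended after the image may also be flagged wrongly.

```python
import os
from pathlib import Path

JPEG_SOI = b"\xff\xd8"  # start-of-image marker
JPEG_EOI = b"\xff\xd9"  # end-of-image marker


def looks_truncated_jpeg(path: Path) -> bool:
    """Crude heuristic: True if the file lacks the JPEG start or end marker."""
    if path.stat().st_size < 4:
        return True
    with path.open("rb") as fh:
        head = fh.read(2)
        fh.seek(-2, os.SEEK_END)  # read only the last two bytes, cheap over SMB
        tail = fh.read(2)
    return head != JPEG_SOI or tail != JPEG_EOI


def find_suspect_jpegs(library: Path) -> list[Path]:
    """Return .jpg/.jpeg files under library that fail the marker check."""
    suspects = []
    for path in sorted(library.rglob("*")):
        if not path.is_file() or path.suffix.lower() not in {".jpg", ".jpeg"}:
            continue
        if looks_truncated_jpeg(path):
            suspects.append(path)
    return suspects
```

The function only reports paths and leaves the files in place, so the list can be reviewed before anything is moved or deleted.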
Thanks for the writeup. At this time I will move to discussion as this seems to be a combo of slow backend storage, plus an unsupported deployment (podman vs Docker, and LXC vs VM), and corrupt input files. If we identify a specific bug in the Immich code on a supported deployment we can open a new Issue.
I have searched the existing issues, both open and closed, to make sure this is not a duplicate report.
The bug
Hi,
I recently imported a big batch of photos dumped from a decade-old HDD up in the attic, some showing clear bitrot (color bands running through the image, some files failing to open altogether). Immich fails to process anything and the thumbnail job crashes outright. If it helps, the full-size upload directory is mounted over SMB, so it's also under I/O pressure (timing out?). I'd expect the job to skip affected files and process the next image in the queue; instead, it looks like it fails within minutes of starting and processes nothing. Starting thumbnail generation with ~800-900 new pictures on my J5005 server, I get an instant stall on I/O (due to said SMB); after letting it run overnight I wake up to 1 failed job, and it looks like nothing got processed.
Logs reveal it's complaining about input files:
Shortly after, logs -f exits, indicating we're probably crashing (hard to tell given the kernel stalls on iowait). I can move to the NAS and retest with local storage if necessary, as well as provide the affected files.
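To make the expected behavior concrete, here is a rough Python sketch of "skip the bad file and carry on". It does not reflect how Immich's job queue is actually implemented; `process_batch`, `BatchResult`, and `fake_thumbnail` are hypothetical names used only for illustration. The point is simply that a per-file failure is recorded and the loop moves on instead of taking down the whole batch.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable


@dataclass
class BatchResult:
    processed: list[str] = field(default_factory=list)
    failed: dict[str, str] = field(default_factory=dict)  # path -> error message


def process_batch(paths: Iterable[str], handler: Callable[[str], None]) -> BatchResult:
    """Run handler on every path; a failure on one file never aborts the batch."""
    result = BatchResult()
    for path in paths:
        try:
            handler(path)
        except Exception as exc:  # deliberately broad: isolate any per-file error
            result.failed[path] = f"{type(exc).__name__}: {exc}"
            continue
        result.processed.append(path)
    return result


def fake_thumbnail(path: str) -> None:
    # Hypothetical stand-in for a real thumbnail generator.
    if "corrupt" in path:
        raise ValueError("bad input file")


if __name__ == "__main__":
    outcome = process_batch(["a.jpg", "corrupt.jpg", "b.jpg"], fake_thumbnail)
    print(outcome.processed)
    print(outcome.failed)
```

Note that this only isolates exceptions raised inside the handler. It would not help if the process itself is being killed or restarted, for example because of timeouts caused by stalled I/O.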
The OS that Immich Server is running on
Alpine + Podman in Proxmox LXC, doesn't matter, it's containerized anyway
Version of Immich Server
2.7.5
Version of Immich Mobile App
n/a
Platform with the issue
Device make and model
No response
Your docker-compose.yml content
Your .env content
Reproduction steps
as described above
Relevant log output
Additional information
No response