Multi-document upload gets stuck when filename(s) contain umlauts #2842

### Environment
Docspell: v0.42.0
Joex: Docker image **ghcr.io/docspell/joex:latest** (i.e. the Debian-based image with fixed Tesseract)

### Issue
Today I uploaded a multi-document ZIP archive into Docspell using the manual document upload feature, but document processing got stuck. 

The ZIP archive contained the following files:

![grafik](https://github.com/user-attachments/assets/a0b346ad-c702-448e-9031-64a2a74189ca)

I uploaded the archive using the following manual upload settings:

![grafik](https://github.com/user-attachments/assets/23a97114-15d9-4c2e-9148-6782d90899b9)

Processing of the archive crashes on the file whose name contains umlauts (ü). The error message I got from the job queue was `Malformed input or input contains unmappable characters: /tmp/docspell-zip-9930113460477770389/123456_2024_Anpassung der Ausführungsfristen bei Echtzeitüberweisungen_vom_2024.11.01_20241101101038.pdf`.

![grafik](https://github.com/user-attachments/assets/08301fa3-5476-47dc-8fe2-5efa628db430)

(Note: In the screenshots I only redacted a private number at the beginning of the PDF filename; you can replace it with a random six-digit number.)

I'm not sure whether this issue is also present in earlier versions of Docspell, because this was the first time my bank sent me a document with umlauts in its filename.
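
To narrow down the `Malformed input or input contains unmappable characters` error: one possible cause (an assumption on my side, not verified against Joex) is that the process runs with a platform encoding that cannot represent `ü`. The Python sketch below is not Docspell code, and the helper name `unencodable_chars` is hypothetical. It only illustrates which characters of a filename fail to map into a given encoding.

```python
def unencodable_chars(name: str, encoding: str) -> list[str]:
    """Return the characters of `name` that cannot be encoded in `encoding`.

    Hypothetical diagnostic helper; it does not reproduce what Joex does,
    it only shows which characters an encoding cannot map.
    """
    bad = []
    for ch in name:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            bad.append(ch)
    return bad


# Shortened variant of the filename from the error message above.
name = "Anpassung der Ausführungsfristen bei Echtzeitüberweisungen.pdf"
print(unencodable_chars(name, "ascii"))
print(unencodable_chars(name, "utf-8"))
```

If the encoding in effect for the process were ASCII-only, both `ü` characters would be unmappable, while UTF-8 can encode the whole name.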

### Workaround

Uploading the plain PDF document on its own (not inside a ZIP archive) seems to work fine.

### Testdata

Here is a test archive containing only one PDF whose filename has umlauts, zipped with Windows File Explorer. However, this time Docspell reports the error `invalid CEN header (bad entry name or comment)`. Zipping the file with 7-Zip produces the same error message, so the processing of ZIP entries with non-ASCII (UTF-8) filenames seems to be broken somewhere.

![grafik](https://github.com/user-attachments/assets/7793f248-fb5c-4dbe-af46-562b349abca9)

[testdata.zip](https://github.com/user-attachments/files/17598492/testdata.zip)
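
For anyone who wants to inspect how the entry names in such an archive are stored, here is a minimal Python sketch (not part of Docspell; the function name `describe_entry_names` is hypothetical). It reports, per entry, whether the ZIP "UTF-8 name" flag (general-purpose bit 11) is set and whether the raw name bytes are valid UTF-8. It assumes Python's default behaviour of decoding unflagged names as CP437, which lets the raw bytes be recovered by re-encoding.

```python
import io
import zipfile

UTF8_FLAG = 0x800  # general-purpose bit 11: entry name is stored as UTF-8


def describe_entry_names(zip_bytes: bytes) -> list[tuple[str, bool, bool]]:
    """Return (name, utf8_flag_set, raw_name_is_valid_utf8) for each entry."""
    results = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for info in zf.infolist():
            flagged = bool(info.flag_bits & UTF8_FLAG)
            # zipfile decodes flagged names as UTF-8 and all others as CP437,
            # so re-encoding with the same codec recovers the stored bytes.
            raw = info.orig_filename.encode("utf-8" if flagged else "cp437")
            try:
                raw.decode("utf-8")
                valid_utf8 = True
            except UnicodeDecodeError:
                valid_utf8 = False
            results.append((info.filename, flagged, valid_utf8))
    return results
```

It can be run against the attached archive with `describe_entry_names(open("testdata.zip", "rb").read())`. An entry whose flag is not set and whose raw name is not valid UTF-8 would point to a legacy code page being used by the archiver.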

