Replies: 6 comments 1 reply
-
|
Did you set any of the related OCR settings? (If so, how) |
-
|
Paperless_ocr_language=eng
I think that's all I have set...
On Fri, 25 Jul 2025, 1:39 pm, shamoon wrote:
Did you set any of the related OCR settings? (If so, how)
https://docs.paperless-ngx.com/configuration/#ocr
|
-
|
So I added jpn to Paperless_ocr_language, and also set:
PAPERLESS_OCR_LANGUAGES: jpn (this is in my docker compose file, under the env
vars for paperless-ngx).
I deleted the broken file, emptied the trash, and reimported it, but the error is
still there :(
|
-
|
Thank you for your reply.
Ah, I didn't realize I had to recreate the container; I only restarted it. Let
me try that.
On Fri, 25 Jul 2025, 9:26 pm, shamoon wrote:
> So you added jpn to the OCR > language setting under "Configuration" in
> the web app? And for PAPERLESS_OCR_LANGUAGES, did you recreate the container
> after doing that (not just restart it)? Please check / provide the startup
> docker logs.
|
-
|
This discussion has been automatically closed due to inactivity. Please see our contributing guidelines for more details. |
-
|
This discussion has been automatically locked since there has not been any recent activity after it was closed. Please open a new discussion for related concerns. See our contributing guidelines for more details. |
-
Hi everyone,
I have a Paperless-ngx issue. I have a PDF document that contains Japanese characters. It displays fine in Firefox.
When I upload it to Paperless-ngx and view it there, the text is garbled. When I download it again and open it, Firefox also shows the text as garbled.
It looks like the Paperless-ngx import is corrupting my files. Can anyone advise what I can do? Note that ASCII-only files are fine.
For example:
標準賞与額決定通知書 --> •W•€•Ü—^ŠzŒˆ’Ł’˚’m•‘
Errors seen in the Paperless-ngx logs:
webserver-1 | [2025-07-25 05:19:43,486] [INFO] [celery.worker.strategy] Task documents.tasks.consume_file[1a04d6ef-abe0-4d9b-8e74-a580c4b86bc6] received
webserver-1 | [2025-07-25 05:19:43,538] [INFO] [paperless.tasks] ConsumerPreflightPlugin completed with no message
webserver-1 | [2025-07-25 05:19:43,542] [INFO] [paperless.tasks] WorkflowTriggerPlugin completed with:
webserver-1 | [2025-07-25 05:19:43,542] [INFO] [paperless.consumer] Consuming file.pdf
webserver-1 | [2025-07-25 05:19:43,558] [INFO] [paperless.parsing.tesseract] pdftotext exited 0
webserver-1 | [2025-07-25 05:19:43,823] [INFO] [ocrmypdf._pipeline] skipping all processing on this page
webserver-1 | [2025-07-25 05:19:43,824] [INFO] [ocrmypdf._pipelines.ocr] Postprocessing...
webserver-1 | [2025-07-25 05:19:43,890] [ERROR] [ocrmypdf._exec.ghostscript] GPL Ghostscript 10.03.1 (2024-05-02)
webserver-1 | Copyright (C) 2024 Artifex Software, Inc. All rights reserved.
webserver-1 | This software is supplied under the GNU AGPLv3 and comes with NO WARRANTY:
webserver-1 | see the file COPYING for details.
webserver-1 | Processing pages 1 through 1.
webserver-1 | Page 1
webserver-1 | Loading font F1 (or substitute) from /usr/share/ghostscript/10.03.1/Resource/Font/NimbusSans-Regular
webserver-1 | Loading font F1 (or substitute) from /usr/share/ghostscript/10.03.1/Resource/Font/NimbusSans-Regular
webserver-1 | Loading font F1 (or substitute) from /usr/share/ghostscript/10.03.1/Resource/Font/NimbusSans-Regular
webserver-1 | Loading font F1 (or substitute) from /usr/share/ghostscript/10.03.1/Resource/Font/NimbusSans-Regular
webserver-1 | Loading font F1 (or substitute) from /usr/share/ghostscript/10.03.1/Resource/Font/NimbusSans-Regular
webserver-1 |
webserver-1 | The following errors were encountered at least once while processing this file:
webserver-1 | error reading a stream
webserver-1 |
webserver-1 |
webserver-1 | [2025-07-25 05:19:43,891] [ERROR] [ocrmypdf._exec.ghostscript] This file had errors that were repaired or ignored.
webserver-1 |
webserver-1 | [2025-07-25 05:19:43,891] [ERROR] [ocrmypdf._exec.ghostscript] The file was produced by:
webserver-1 |
webserver-1 | [2025-07-25 05:19:43,891] [ERROR] [ocrmypdf._exec.ghostscript] >>>> BrainSellers.com biz-Stream Version 5.1.0 Standard edition <<<<
webserver-1 |
webserver-1 | [2025-07-25 05:19:43,891] [ERROR] [ocrmypdf._exec.ghostscript] Please notify the author of the software that produced this
webserver-1 |
webserver-1 | [2025-07-25 05:19:43,891] [ERROR] [ocrmypdf._exec.ghostscript] file that it does not conform to Adobe's published PDF
webserver-1 |
webserver-1 | [2025-07-25 05:19:43,891] [ERROR] [ocrmypdf._exec.ghostscript] specification.
Version of paperless ngx: v2.17.1
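For anyone trying to reason about the garbled sample above, here is a small sketch of one possible explanation. It is only an assumption and is not confirmed anywhere in this thread: that the text is stored as Shift_JIS double-byte codes and ends up being read one byte at a time as a Latin encoding (Windows-1252 is used here as a stand-in). The variable names are made up for illustration, and nothing here calls Paperless-ngx, OCRmyPDF, or Ghostscript.

```python
# Hypothetical illustration only; not taken from Paperless-ngx or the thread.
original = "標準賞与額決定通知書"

# Assumption: the text is stored as Shift_JIS double-byte codes.
sjis_bytes = original.encode("shift_jis")

# Read the same bytes one at a time as Windows-1252, a Latin encoding.
# Bytes with no Windows-1252 mapping become U+FFFD here; a PDF viewer
# may draw a different placeholder glyph for them instead.
misread = sjis_bytes.decode("cp1252", errors="replace")

print(sjis_bytes.hex(" "))  # the raw byte values, two per character
print(misread)              # the same bytes, misread as Latin text
```

If the printed result resembles the garbled sample, that would be a hint (not proof) that the bytes are intact and only the font or encoding mapping was lost during processing.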