[BUG] PDF gets scrambled while being imported #4016

Closed
abc2006 opened this issue Aug 18, 2023 · 2 comments
Labels
dependencies Pull requests that update a dependency file

Comments

abc2006 commented Aug 18, 2023

Description

Today I imported my ecoDMS files into Paperless. No issues so far, but at least one document (I haven't had time to check all of them) gets scrambled when I import it into Paperless. I am attaching both PDFs, and I am able to reproduce the problem.

alurad_original-.pdf
alurad_seltsam.pdf

Steps to reproduce

  1. Take the PDF.
  2. Drag it onto the "Dokumente hier ablegen oder" ("Drop documents here or") upload area.
  3. It is imported without any error.
  4. On checking the imported document, its content has changed.

Webserver logs

[2023-08-18 13:07:44,691] [INFO] [celery.worker.strategy] Task documents.tasks.consume_file[2ddc938a-68a5-4a01-8755-438e626b3153] received
[2023-08-18 13:07:44,733] [INFO] [paperless.consumer] Consuming alurad_original-.pdf
[2023-08-18 13:07:45,065] [INFO] [ocrmypdf._pipeline] skipping all processing on this page
[2023-08-18 13:07:45,068] [INFO] [ocrmypdf._sync] Postprocessing...
[2023-08-18 13:07:45,261] [ERROR] [ocrmypdf._exec.ghostscript] GPL Ghostscript 10.0.0 (2022-09-21)
Copyright (C) 2022 Artifex Software, Inc.  All rights reserved.
This software is supplied under the GNU AGPLv3 and comes with NO WARRANTY:
see the file COPYING for details.
GPL Ghostscript 10.00.0: Text string detected in DOCINFO cannot be represented in XMP for PDF/A1, discarding DOCINFO
GPL Ghostscript 10.00.0: Text string detected in DOCINFO cannot be represented in XMP for PDF/A1, discarding DOCINFO
GPL Ghostscript 10.00.0: Text string detected in DOCINFO cannot be represented in XMP for PDF/A1, discarding DOCINFO
GPL Ghostscript 10.00.0: Text string detected in DOCINFO cannot be represented in XMP for PDF/A1, discarding DOCINFO
Processing pages 1 through 1.
Page 1
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F1 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F2 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F2 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F2 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F2 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F2 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F2 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F2 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F2 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F2 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F2 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F2 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F2 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F2 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F2 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F2 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F2 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F2 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F2 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F2 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F2 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F2 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F2 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F2 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F2 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F2 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F2 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F2 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F2 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F2 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F2 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F2 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F2 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F2 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F2 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F2 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F2 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F2 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F2 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F2 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular
Loading font F0 (or substitute) from /usr/share/ghostscript/10.00.0/Resource/Font/NimbusSans-Regular

The following errors were encountered at least once while processing this file:
        object lacks a required Subtype


[2023-08-18 13:07:45,261] [ERROR] [ocrmypdf._exec.ghostscript]  This file had errors that were repaired or ignored.

[2023-08-18 13:07:45,261] [ERROR] [ocrmypdf._exec.ghostscript]  The file was produced by:

[2023-08-18 13:07:45,261] [ERROR] [ocrmypdf._exec.ghostscript]  >>>> ��VARIO 8 PDF Export <<<<

[2023-08-18 13:07:45,261] [ERROR] [ocrmypdf._exec.ghostscript]  Please notify the author of the software that produced this

[2023-08-18 13:07:45,261] [ERROR] [ocrmypdf._exec.ghostscript]  file that it does not conform to Adobe's published PDF

[2023-08-18 13:07:45,261] [ERROR] [ocrmypdf._exec.ghostscript]  specification.


[2023-08-18 13:07:45,284] [WARNING] [py.warnings] /usr/local/lib/python3.9/site-packages/pikepdf/models/metadata.py:411: UserWarning: The metadata field /CreationDate could not be copied to XMP
  warn(msg)

[2023-08-18 13:07:45,400] [INFO] [ocrmypdf._pipeline] Image optimization ratio: 1.09 savings: 8.2%
[2023-08-18 13:07:45,400] [INFO] [ocrmypdf._pipeline] Total file size ratio: 1.03 savings: 2.5%
[2023-08-18 13:07:45,403] [INFO] [ocrmypdf._sync] Output file is a PDF/A-2B (as expected)
[2023-08-18 13:07:47,195] [INFO] [paperless.consumer] Document 2023-08-18 alurad_original- consumption finished
[2023-08-18 13:07:47,203] [INFO] [celery.app.trace] Task documents.tasks.consume_file[2ddc938a-68a5-4a01-8755-438e626b3153] succeeded in 2.508644920992083s: 'Success. New document id 539 created'
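
The many `Loading font ... (or substitute)` lines in the log above appear to show Ghostscript falling back to NimbusSans-Regular for the fonts F0, F1 and F2, which may be related to the changed content. A minimal Python sketch for tallying those lines from a saved copy of the log follows; the function name `count_font_loads` is hypothetical and not part of Paperless-ngx or OCRmyPDF.

```python
import re
from collections import Counter

# Matches Ghostscript log lines such as:
#   Loading font F0 (or substitute) from /usr/share/.../NimbusSans-Regular
FONT_LINE = re.compile(r"^Loading font (\S+) \(or substitute\) from (\S+)$")


def count_font_loads(log_text: str) -> Counter:
    """Count (font name, loaded-from path) pairs found in a Ghostscript log."""
    counts: Counter = Counter()
    for line in log_text.splitlines():
        match = FONT_LINE.match(line.strip())
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts
```

Calling `count_font_loads(open("webserver.log").read())` (the file name is an assumption) returns one counter entry per font and fallback path, which is easier to scan than the raw log.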

Browser logs

Docker image: 
ghcr.io/paperless-ngx/paperless-ngx:latest

Paperless-ngx version

1.17.1

Host OS

Welcome to Ubuntu 22.04.3 LTS (GNU/Linux 5.15.0-79-generic x86_64)

Installation method

Docker - official image

Browser

No response

Configuration changes

port 8001

Other

No response

abc2006 added the bug (Bug report or a Bug-fix) and unconfirmed labels on Aug 18, 2023
stumpylog (Member) commented

Issues with particular PDF files are common and not something we can solve. As the logs from ocrmypdf note, this particular PDF has errors which it attempted to repair, but it appears it could not.

You can experiment with the OCR settings to see whether some combination works, try pre-cleaning the file with qpdf, or print it to a new PDF instead.
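
As one possible way to try the qpdf pre-cleaning suggestion, the Python sketch below shells out to the `qpdf` command-line tool, which rewrites a PDF when given an input and an output path. It assumes `qpdf` is installed and on the `PATH`; the helper names and any output file name are hypothetical, and there is no guarantee that this repairs the file from this report.

```python
import subprocess
from pathlib import Path
from typing import List


def build_qpdf_command(src: Path, dst: Path) -> List[str]:
    """Build the plain 'qpdf INPUT OUTPUT' invocation that rewrites a PDF."""
    return ["qpdf", str(src), str(dst)]


def preclean_pdf(src: Path, dst: Path) -> bool:
    """Rewrite src to dst with qpdf; return True if qpdf reported warnings."""
    result = subprocess.run(
        build_qpdf_command(src, dst),
        capture_output=True,
        text=True,
    )
    # qpdf exit status: 0 = clean, 3 = output written with warnings, 2 = errors.
    if result.returncode not in (0, 3):
        raise RuntimeError(f"qpdf failed: {result.stderr.strip()}")
    return result.returncode == 3
```

For example, `preclean_pdf(Path("alurad_original-.pdf"), Path("alurad_cleaned.pdf"))` would write a rewritten copy that can then be dropped into Paperless in place of the original.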

shamoon closed this as not planned (won't fix, can't repro, duplicate, stale) on Aug 18, 2023
shamoon added the dependencies label (Pull requests that update a dependency file) and removed the bug and unconfirmed labels on Aug 18, 2023
github-actions (Contributor) commented

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new discussion or issue for related concerns.

github-actions bot locked this issue as resolved and limited conversation to collaborators on Sep 18, 2023