PdfReadError: Multiple definitions in dictionary at byte 0x30b for key /Type #244

milspect18 opened this issue Jan 14, 2016 · 13 comments

@milspect18

When attempting to merge multiple PDF documents, the merger.write("foo.pdf") line chokes with the error shown in the title... I was attempting to merge a few files with the following:


import PyPDF2 as PDF
import glob

allPdfFiles = glob.glob("*.pdf")
merger = PDF.PdfFileMerger()

for filename in allPdfFiles:
    merger.append(PDF.PdfFileReader(filename, "rb"))

merger.write("merged_full.pdf")

The output is as follows:


Traceback (most recent call last):
  File "C:\Users\kprice\Google Drive\AC201\RawChapters\CH3\pdfMerge.py", line 10, in <module>
    merger.write("merged_full.pdf")
  File "C:\Python27\lib\site-packages\PyPDF2\merger.py", line 230, in write
    self.output.write(fileobj)
  File "C:\Python27\lib\site-packages\PyPDF2\pdf.py", line 467, in write
    self._sweepIndirectReferences(externalReferenceMap, self._root)
  File "C:\Python27\lib\site-packages\PyPDF2\pdf.py", line 556, in _sweepIndirectReferences
    self._sweepIndirectReferences(externMap, realdata)
  File "C:\Python27\lib\site-packages\PyPDF2\pdf.py", line 532, in _sweepIndirectReferences
    value = self._sweepIndirectReferences(externMap, value)
  File "C:\Python27\lib\site-packages\PyPDF2\pdf.py", line 556, in _sweepIndirectReferences
    self._sweepIndirectReferences(externMap, realdata)
  File "C:\Python27\lib\site-packages\PyPDF2\pdf.py", line 532, in _sweepIndirectReferences
    value = self._sweepIndirectReferences(externMap, value)
  File "C:\Python27\lib\site-packages\PyPDF2\pdf.py", line 541, in _sweepIndirectReferences
    value = self._sweepIndirectReferences(externMap, data[i])
  File "C:\Python27\lib\site-packages\PyPDF2\pdf.py", line 556, in _sweepIndirectReferences
    self._sweepIndirectReferences(externMap, realdata)
  File "C:\Python27\lib\site-packages\PyPDF2\pdf.py", line 532, in _sweepIndirectReferences
    value = self._sweepIndirectReferences(externMap, value)
  File "C:\Python27\lib\site-packages\PyPDF2\pdf.py", line 532, in _sweepIndirectReferences
    value = self._sweepIndirectReferences(externMap, value)
  File "C:\Python27\lib\site-packages\PyPDF2\pdf.py", line 532, in _sweepIndirectReferences
    value = self._sweepIndirectReferences(externMap, value)
  File "C:\Python27\lib\site-packages\PyPDF2\pdf.py", line 561, in _sweepIndirectReferences
    newobj = data.pdf.getObject(data)
  File "C:\Python27\lib\site-packages\PyPDF2\pdf.py", line 1582, in getObject
    retval = readObject(self.stream, self)
  File "C:\Python27\lib\site-packages\PyPDF2\generic.py", line 66, in readObject
    return DictionaryObject.readFromStream(stream, pdf)
  File "C:\Python27\lib\site-packages\PyPDF2\generic.py", line 581, in readFromStream
    % (utils.hexStr(stream.tell()), key))
PdfReadError: Multiple definitions in dictionary at byte 0x30b for key /Type
@lambdalisue

+1. Same here, but

pdf = PdfFileReader(fobj, strict=False)

solves this problem.

https://github.com/mstamy2/PyPDF2/blob/0900101f836345723f8ab4086bf77da32de8fc38/PyPDF2/pdf.py#L1048

@pokey

pokey commented Apr 7, 2016

For me, @lambdalisue's solution didn't work. I instead had to do:

merger = PdfFileMerger(strict=False)
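
For reference, the workaround applied to the original script might look roughly like the sketch below. It assumes a PyPDF2 release that still provides PdfFileMerger (the 1.x/2.x API used in this thread). The helper names collect_inputs and merge_pdfs are made up for illustration. The sketch also sorts the inputs and skips the output file, so that a second run does not merge merged_full.pdf into itself.

```python
import glob
import os


def collect_inputs(directory, output_name):
    """Return the PDF paths in `directory`, sorted, excluding the output file."""
    paths = sorted(glob.glob(os.path.join(directory, "*.pdf")))
    return [p for p in paths if os.path.basename(p) != output_name]


def merge_pdfs(directory, output_name="merged_full.pdf"):
    """Merge every PDF in `directory` into `output_name` (hypothetical helper)."""
    # Imported lazily so collect_inputs() works without PyPDF2 installed.
    # Assumption: PyPDF2 1.x/2.x, where PdfFileMerger still exists.
    from PyPDF2 import PdfFileMerger

    # strict=False downgrades "Multiple definitions in dictionary" to a warning.
    merger = PdfFileMerger(strict=False)
    try:
        for path in collect_inputs(directory, output_name):
            merger.append(path)  # append() accepts a path or a file object
        merger.write(os.path.join(directory, output_name))
    finally:
        merger.close()
```

Calling merge_pdfs(".") would then correspond to the script in the first post. Passing strict=False to the merger rather than to each reader matches what worked in the comment above.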

@jwhendy

jwhendy commented Sep 29, 2016

Could someone clarify whether this is a "bug" or not? I ran into this when using stapler, but don't know how to see what's going on. For example:

  • perhaps multiple /Type definitions don't play well with pypdf2
  • multiple /Type definitions aren't expected/shouldn't be in a "proper" pdf file and that's why it complains
  • multiple /Type definitions are valid for pdfs but pypdf2 doesn't currently handle them in its algorithm. Eventually it could handle them, but presently it chokes a bit with a strict=True setting. This ends up not being fatal and thus we can bypass it with strict=False

I'm just trying to understand what to do about the errors, whether there are unintended side effects, and whether this will be "fixed" at some point. For what it's worth, before figuring out I could use strict=False, I used pdfunite just fine; I don't understand why that works while this had trouble with my input.

@yw5aj

yw5aj commented Oct 6, 2016

+1 for having this issue; it was fixed by merger = PdfFileMerger(strict=False). It still emits the warning PdfReadWarning: Multiple definitions in dictionary at byte 0x2d72 for key /Type [generic.py:589], though, instead of the error PdfReadError: Multiple definitions in dictionary at byte 0x2d72 for key /Type.
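
If the remaining warnings are just noise for your use case, one option is to filter them with the standard warnings module. The sketch below matches on the message text rather than on PyPDF2's warning class, so it does not depend on a particular PyPDF2 version. The name suppress_duplicate_key_warnings is made up for illustration. Whether silencing the warning is safe depends on your files.

```python
import contextlib
import warnings

# Start of the message shown in this thread; the filter treats it as a regex
# that must match at the beginning of the warning text.
DUPLICATE_KEY_PATTERN = r"Multiple definitions in dictionary"


@contextlib.contextmanager
def suppress_duplicate_key_warnings():
    """Temporarily ignore duplicate-key warnings; other warnings still pass."""
    with warnings.catch_warnings():
        warnings.filterwarnings("ignore", message=DUPLICATE_KEY_PATTERN)
        yield
```

You would wrap the merge in `with suppress_duplicate_key_warnings():` so the filter is removed again once the block exits.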

@radzhome

radzhome commented Dec 5, 2016

+1

@Paul424

Paul424 commented May 6, 2017

Thanks for the tip; it worked!

@akkana

akkana commented May 28, 2017

+1. I used the provided example in Sample_Code/basic_merging.py, and then pared it down to two PDF files which were each just called with append(fileobj), and it still didn't work. The workaround of PdfFileMerger(strict=False) solves the problem (modulo a bunch of warning messages). It would help to have the basic_merging example mention strict=False since it looks like that's needed for many (maybe all?) real-world PDF files.

@starqueue

I tripped over this. If it makes no difference to the outcome, it would be good if this fix were the default.

@christianalcantara

Thanks!

@charalamm

charalamm commented Nov 10, 2020

Hello,
Could someone explain what things inside a pdf could cause this error/warning?
Thank you :)

@tripleee

PDF borrows its object syntax from PostScript, and a dictionary is written as << /Key value ... >>. The error (or warning) means that a single dictionary in one of the input files defines the same key (here /Type) more than once, which a well-formed PDF is not supposed to do. It tends to show up while merging because, as the traceback above shows, PyPDF2 only reads these objects when write() walks the document; the duplicate is already present in the source file rather than created by combining documents. There isn't really anything you can do (especially on the Python side) to repair this in a document which somebody else created, other than reading it with strict=False.
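
To make the strict versus non-strict behaviour concrete, here is a toy sketch of a parser for a flat dictionary. It is not PyPDF2's code. The grammar is drastically simplified (whitespace-separated tokens, no nesting), DuplicateKeyError is a made-up stand-in for PdfReadError, and keeping the first value in lenient mode is an assumption about what a tolerant reader might do.

```python
import warnings


class DuplicateKeyError(Exception):
    """Hypothetical stand-in for PyPDF2's PdfReadError."""


def parse_flat_dict(text, strict=True):
    """Parse a flat '<< /Key value ... >>' dictionary into a Python dict.

    Toy grammar: whitespace-separated tokens, alternating keys and
    single-token values; no nesting, strings, or indirect references.
    """
    tokens = text.split()
    if len(tokens) < 2 or tokens[0] != "<<" or tokens[-1] != ">>":
        raise ValueError("expected a dictionary delimited by << and >>")
    body = tokens[1:-1]
    if len(body) % 2 != 0:
        raise ValueError("every key needs exactly one value")

    result = {}
    for key, value in zip(body[0::2], body[1::2]):
        if not key.startswith("/"):
            raise ValueError("dictionary keys must be names such as /Type")
        if key in result:
            message = "Multiple definitions in dictionary for key %s" % key
            if strict:
                raise DuplicateKeyError(message)
            # Lenient mode (assumption): warn and keep the first value seen.
            warnings.warn(message)
            continue
        result[key] = value
    return result


# A dictionary that defines /Type twice, as in the error from this thread.
broken = "<< /Type /Page /Type /Pages /Rotate 0 >>"
```

With strict=True, parse_flat_dict(broken) raises; with strict=False, it warns and returns a dictionary with a single /Type entry. In this toy model, strict=False hides the problem instead of fixing the file.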

OmniaGit pushed a commit to OmniaGit/odooplm that referenced this issue Mar 18, 2022
@Ontheroad123

Even though I added the 'strict' parameter, it still prints "Multiple definitions in dictionary at byte 0x51d16 for key /MediaBox", and some of the text comes out garbled. How can I solve this?

@pubpub-zz
Collaborator

a) The strict parameter should be set.
b) Without the PDF and the code, no analysis can be done.

You should open a new issue providing those details.
