Filled form fields are gone when merging - in some PDF viewer, sometimes #506

wmoskal · 2019-07-05T19:51:48Z

I am using a fillable PDF with a number of fields as a template, and there is an unknown number of individual PDFs. When I try to merge the PDFs with the entered form data, I get a really weird result where the merged PDF shows different content depending on where it is viewed. If I view the PDF in Okular (Debian Linux PDF viewer), it shows no form fields and basically just a flattened PDF with no data. If I view the PDF on Windows in Chrome, it displays the proper field data for the first page, but all the subsequent pages just contain duplicates of the form data found on the first page. If I view the PDF on Windows in Microsoft Edge, it displays the correct result for all pages.

The source code of the merged PDF seems to contain all the correct data, regardless of where it is viewed. I am not sure if this is an issue with PyPDF2, with the PDF itself, or with the browsers/viewers. Any help would be greatly appreciated.

Redjumpman · 2020-10-07T18:33:22Z

Hey wmoskal. I just spent the last few days pulling my hair out with the exact same problem. I saw this issue in hopes that there was a solution, but was devastated to see you never got a reply.

HOWEVER. I was able to determine the solution. It's over a year late for you, but maybe the next poor soul that comes across this will be saved some heartache.

from PyPDF2 import PdfFileWriter
from PyPDF2.generic import (BooleanObject, IndirectObject, NameObject,
                            NumberObject, createStringObject)


def set_need_appearances_writer(writer: PdfFileWriter):
    # Set /NeedAppearances so viewers regenerate the field appearances.
    try:
        catalog = writer._root_object
        # Create an /AcroForm entry if the writer does not have one yet.
        if "/AcroForm" not in catalog:
            writer._root_object.update({
                NameObject("/AcroForm"): IndirectObject(len(writer._objects), 0, writer)
            })

        need_appearances = NameObject("/NeedAppearances")
        writer._root_object["/AcroForm"][need_appearances] = BooleanObject(True)
        return writer

    except Exception as e:
        print('set_need_appearances_writer() catch : ', repr(e))
        return writer


# Assumed to be defined elsewhere (not shown in this snippet):
#   myfile - a PdfFileReader opened on the template PDF
#   writer - a PdfFileWriter
#   data   - the rows to fill in; fields - a dict of field name -> value
for idx, row in enumerate(data, 1):  # In my case, data was pulled from SQLAlchemy
    first_page = myfile.getPage(0)  # First page in first pdf
    writer.addPage(first_page)
    set_need_appearances_writer(writer)
    writer.updatePageFormFieldValues(first_page, fields=fields)

    for annot_ref in first_page['/Annots']:
        writer_annot = annot_ref.getObject()
        for field in fields:
            if writer_annot.get('/T') == field:
                writer_annot.update({
                    # Change the field name so it is unique per document
                    NameObject("/T"): createStringObject(writer_annot.get('/T') + f'#{idx}')
                })
                writer_annot.update({
                    NameObject("/Ff"): NumberObject(1)  # make field Read Only
                })

    with open("path", "wb") as new:
        writer.write(new)

This takes a PDF that I use as a template, fills out the form, then saves the PDF. You have to rename the fields because once merged, all the PDFs in the final version would otherwise have the same field names, and we want to avoid that. I then "flatten" the fields by setting the /Ff field flag to 1, which makes them read-only (the fields are still there, just not editable). Once you have your PDFs filled out and saved, THEN you can merge them normally. Also, set_need_appearances_writer is necessary to make the fields visible.
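A note on the read-only step: /Ff is a bit field, and the snippet above overwrites it with 1, which also clears any other flags the field had (for example the multiline flag on a text field). Below is a minimal sketch of setting only the ReadOnly bit while keeping the rest. The names READ_ONLY, MULTILINE and with_read_only are made up for illustration and are not part of PyPDF2; the bit values follow the field-flag tables in the PDF specification.

```python
# Field flag values from the PDF spec (bit positions there are 1-based).
READ_ONLY = 1 << 0   # bit position 1: field cannot be edited
MULTILINE = 1 << 12  # bit position 13: text field may span several lines


def with_read_only(flags: int) -> int:
    """Return the flags with the ReadOnly bit set and all other bits kept."""
    return flags | READ_ONLY


# A field with no flags becomes plain read-only.
plain = with_read_only(0)

# A multiline field stays multiline and also becomes read-only.
multiline_locked = with_read_only(MULTILINE)

# Setting the bit twice changes nothing.
twice = with_read_only(with_read_only(MULTILINE))
```

In the loop above, this would presumably amount to something like NumberObject(int(writer_annot.get('/Ff', 0)) | 1) instead of NumberObject(1), though I have not checked that against every kind of form field.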

paulzuradzki · 2022-03-07T20:18:39Z

@Redjumpman - I am the "next poor soul". Thank you very much for sharing.

Before encountering your recipe, I had come across the set_appearances() snippet and the method of setting the form field flag to 1 for read-only. I still experienced merge errors because the documents shared the same field names. Your post saved me a lot of trouble (after much research) on how to update those field names to be unique per document. I am curious how pdftk gets around this. When using its flatten command-line option, the resulting PDF can be merged. Anyway, it is great to know we can do this sort of data-driven form-filling and merging of templates in PyPDF2.
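To illustrate why the shared field names matter, here is a toy model in plain Python. It is not how PyPDF2 or any viewer is implemented: it simply treats a merged form as one mapping from field name to value, on the assumption that fields with the same name end up treated as a single field. The function merge_field_values and the sample data are hypothetical.

```python
def merge_field_values(documents, rename=False):
    """Combine per-document field dicts into one name -> value mapping.

    With rename=True each name gets a '#<idx>' suffix, mirroring the
    renaming trick from the recipe above.
    """
    merged = {}
    for idx, fields in enumerate(documents, 1):
        for name, value in fields.items():
            key = f"{name}#{idx}" if rename else name
            merged[key] = value  # a repeated key overwrites the earlier value
    return merged


docs = [{"name": "Alice"}, {"name": "Bob"}]

# Same field name in both documents: only one value survives.
collided = merge_field_values(docs)

# Unique names per document: both values are kept.
renamed = merge_field_values(docs, rename=True)
```

In this toy model the last document wins; which value a real viewer ends up showing for a duplicated name may differ, which would be consistent with the viewer-dependent results reported at the top of this issue.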

MartinThoma · 2022-04-07T16:27:49Z

Do you have a PDF that shows this issue?

MartinThoma · 2022-04-23T08:25:34Z

This might have the same root cause as #355

MartinThoma · 2022-08-06T11:49:33Z

I'm closing this as a duplicate of #355

nantaphop-kkp · 2024-01-09T03:04:08Z

@Redjumpman I'm a poor soul that you saved today 😭

paulzuradzki mentioned this issue Mar 7, 2022

ENH: Flatten PDF forms #232 (Open)

MartinThoma added the "PdfMerger" label (The PdfMerger component is affected) Apr 7, 2022

MartinThoma changed the title ~~Issues with merging fillable pdfs~~ Filled form fields are gone when merging - in some PDF viewer, sometimes Apr 23, 2022

MartinThoma added the "workflow-forms" label (From a users perspective, forms is the affected feature/workflow) Apr 23, 2022

MartinThoma added the "is-bug" label (From a users perspective, this is a bug - a violation of the expected behavior with a compliant PDF) Jun 26, 2022

MartinThoma closed this as completed Aug 6, 2022
