Filled form fields are gone when merging - in some PDF viewer, sometimes #506
Comments
Hey wmoskal. I just spent the last few days pulling my hair out over the exact same problem. I found this issue hoping there was a solution, and was devastated to see you never got a reply. However, I was able to work out a solution. It's over a year late for you, but maybe the next poor soul who comes across this will be saved some heartache.

    from PyPDF2 import PdfFileWriter
    from PyPDF2.generic import (BooleanObject, IndirectObject, NameObject,
                                NumberObject, createStringObject)


    def set_need_appearances_writer(writer: PdfFileWriter):
        # Set /NeedAppearances so viewers rebuild the field appearances.
        try:
            catalog = writer._root_object
            if "/AcroForm" not in catalog:
                writer._root_object.update({
                    NameObject("/AcroForm"): IndirectObject(len(writer._objects), 0, writer)
                })
            need_appearances = NameObject("/NeedAppearances")
            writer._root_object["/AcroForm"][need_appearances] = BooleanObject(True)
            return writer
        except Exception as e:
            print('set_need_appearances_writer() catch : ', repr(e))
            return writer


    # Assumed setup, not shown in the original post: for each row, myfile is a
    # PdfFileReader on the template, writer is a fresh PdfFileWriter, and
    # fields maps field names to that row's values.
    for idx, row in enumerate(data, 1):  # In my case, data was pulled from SQLAlchemy
        first_page = myfile.getPage(0)  # First page in first pdf
        writer.addPage(first_page)
        set_need_appearances_writer(writer)
        writer.updatePageFormFieldValues(first_page, fields=fields)
        for j in range(0, len(first_page['/Annots'])):
            writer_annot = first_page['/Annots'][j].getObject()
            for field in fields:
                if writer_annot.get('/T') == field:
                    writer_annot.update({
                        NameObject("/T"): createStringObject(writer_annot.get('/T') + f'#{idx}')  # Change the field name
                    })
                    writer_annot.update({
                        NameObject("/Ff"): NumberObject(1)  # make field Read Only
                    })
        with open("path", "wb") as new:
            writer.write(new)

This takes a PDF that I use as a template, fills out the form, then saves the PDF. You have to rename the fields because, once merged, all the PDFs in the final version would otherwise have the same field names, and we want to avoid that. I then "flatten" the fields by setting the field flags entry (/Ff) to 1, which makes them read-only. Once your PDFs are filled out and saved, then you can merge them normally. Also the
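The rename-and-lock step can be illustrated without PyPDF2, using a plain dict as a stand-in for an annotation dictionary. This is only a sketch under assumptions: `lock_and_rename` and `READ_ONLY` are hypothetical names, and the dict keys simply mirror the `/T` and `/Ff` entries from the snippet. ReadOnly is bit position 1 of `/Ff`, so OR-ing it in keeps any other flags the field already has, whereas assigning a bare 1 replaces them.

```python
READ_ONLY = 1 << 0  # bit position 1 of /Ff: the ReadOnly field flag


def lock_and_rename(annot: dict, idx: int) -> dict:
    """Return a copy of annot with a per-document name and ReadOnly set.

    Hypothetical helper: annot is a plain dict standing in for a PDF
    annotation dictionary, idx is the number of the document being filled.
    """
    updated = dict(annot)
    updated["/T"] = f"{annot['/T']}#{idx}"             # unique name per document
    updated["/Ff"] = annot.get("/Ff", 0) | READ_ONLY   # keep existing flags
    return updated


MULTILINE = 1 << 12  # bit position 13: Multiline flag for text fields
field = {"/T": "name", "/Ff": MULTILINE}
locked = lock_and_rename(field, 3)
print(locked)
```

With real PyPDF2 objects the same values would be wrapped in `createStringObject` and `NumberObject`, as in the snippet above.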
@Redjumpman - I am the "next poor soul". Thank you very much for sharing. Before finding your recipe, I had come across the set_appearances() snippet and the method of setting the form field flag to 1 for read-only. I still experienced merge errors because the documents shared the same field names. Your post saved me a lot of trouble (after much research) in working out how to make those field names unique per document. I am curious how pdftk gets around this. When using the
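The clash described here, several documents sharing the same field names, can be checked for before merging. A minimal sketch, assuming the field names of each document have already been collected into plain lists; `find_shared_field_names` and the file names are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def find_shared_field_names(docs: Dict[str, List[str]]) -> List[str]:
    """Return the field names that occur in more than one document."""
    # Count each name once per document, then keep those seen in 2+ documents.
    counts = Counter(name for names in docs.values() for name in set(names))
    return sorted(name for name, seen in counts.items() if seen > 1)


docs = {
    "filled_1.pdf": ["name", "date"],
    "filled_2.pdf": ["name", "total"],
}
shared = find_shared_field_names(docs)
print(shared)
```

An empty result means every field name is already unique across the documents to be merged.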
Do you have a PDF that shows this issue?
This might have the same root cause as #355
I'm closing this as a duplicate of #355
@Redjumpman I'm a poor soul that you saved today 😭
I am using a fillable PDF with a number of fields as a template, and there is an unknown number of individual PDFs. When I attempt to merge the PDFs with the entered form data, I get a really weird result: the merged PDF shows different content depending on where it is viewed. If I view the PDF in Okular (a PDF viewer on Debian Linux), it shows no form fields, basically just a flattened PDF with no data. If I view the PDF on Windows in Chrome, it displays the proper field data for the first page, but all subsequent pages just contain duplicates of the form data from the first page. If I view the PDF on Windows in Microsoft Edge, it displays the correct result for all pages.
The source of the merged PDF seems to contain all the correct data, regardless of where it is viewed. I am not sure whether this is an issue with PyPDF2, with the PDF itself, or with the browsers/viewers. Any help would be greatly appreciated.