## I used to spend a lot of time filling forms
#### Not anymore.
With the help of scripts generously shared online by [Andrew Krcatovich]( https://akdux.com/python/2020/10/31/python-fill-pdf-files.htmlFilling-a-PDF ) and [Vivsvaan Sharma]( https://medium.com/@vivsvaan/filling-editable-pdf-in-python-76712c3ce99 ), I was able to write a script that autofills PDFs from a template PDF and a CSV file of client information.

Initially, I set my sights too low. In a previous Medium article, I simply manipulated the table with SQL so that I could link the attributes to the appropriate fields in JotForm. While it worked, I had to create a new JotForm document every time I wanted to fill in a new set of forms.

With this, I can take a new form and generate over a thousand filled documents in under a minute.

To do this I wrote a script that:
1. Creates a dataframe from the CSV file.
2. Creates a nested dictionary whose outer keys are the client_df row indexes.
   a. Each inner dictionary maps a column name to that client's value.
3. Opens a template PDF (make sure the path points to the PDF used as the template).
4. Iterates through the index to select each inner dictionary.
5. For every column name that is identical to a field key (a fillable text annotation) in the template, fills that field with the column's value.

Because a dictionary is looked up by key rather than by position, keying the outer dictionary on the client_df index allowed me to iterate over the length of the dataframe and create a document for each client.
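
As a small illustration of steps 1 and 2, here is roughly what `to_dict('index')` returns for a two-row table. The column names and client values are made up for this sketch, and it assumes the dataframe keeps its default 0-based index.

```python
import io

import pandas as pd

# Hypothetical stand-in for the client csv file; the columns are invented for illustration.
csv_text = "Name,City\nAda Lovelace,London\nAlan Turing,Wilmslow\n"

client_df = pd.read_csv(io.StringIO(csv_text))

# Outer keys are the row index, inner keys are the column names.
client_dict = client_df.to_dict('index')

for i in range(len(client_df)):
    individual_dict = client_dict[i]
    print(i, individual_dict)
```

Each `individual_dict` is the lookup table that later gets matched against the field keys in the template.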

In [None]:
import pdfrw #allows me to interact with template pdf
import pandas as pd #to create dataframe and dictionary from client csv file

In [4]:
#Annotations (Annots) are objects in a PDF file that allow us to interact with the pdf.
ANNOT_KEY = '/Annots'
ANNOT_FIELD_KEY = '/T'
ANNOT_VAL_KEY = '/V'
ANNOT_RECT_KEY = '/Rect'
SUBTYPE_KEY = '/Subtype'
WIDGET_SUBTYPE_KEY = '/Widget'
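
To make these keys a bit more concrete, the sketch below uses plain Python dictionaries as stand-ins for the annotation objects that pdfrw returns. The field names and the `widget_field_names` helper are hypothetical. The only detail carried over from the script is that a field key arrives wrapped in parentheses, which is why the first and last characters are stripped.

```python
# Keys as they appear inside a PDF annotation dictionary.
ANNOT_FIELD_KEY = '/T'
SUBTYPE_KEY = '/Subtype'
WIDGET_SUBTYPE_KEY = '/Widget'

# Plain-dict stand-ins for the annotations on one page (invented for illustration).
fake_annotations = [
    {'/Subtype': '/Widget', '/T': '(Name)'},
    {'/Subtype': '/Widget', '/T': '(City)'},
    {'/Subtype': '/Link'},  # not a form field, so it is skipped
]


def widget_field_names(annotations):
    """Return the field keys of widget annotations, without the parentheses."""
    names = []
    for annotation in annotations:
        is_widget = annotation.get(SUBTYPE_KEY) == WIDGET_SUBTYPE_KEY
        if is_widget and annotation.get(ANNOT_FIELD_KEY):
            names.append(annotation[ANNOT_FIELD_KEY][1:-1])
    return names


print(widget_field_names(fake_annotations))
```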

In [None]:
def autofill_bulk_pdf(input_pdf, client_csv):
    client_df = pd.read_csv(client_csv)
    client_dict = client_df.to_dict('index')
    template_pdf = pdfrw.PdfReader(input_pdf)
    for i in range(len(client_df)):
        
        #selecting inner dictionary
        individual_dict = client_dict[i]
        
        #each page in the pdf needs to be looped through
        for page in template_pdf.pages:
            #a page without form fields may have no /Annots entry, so fall back to an empty list
            annotations = page[ANNOT_KEY] or []
            for annotation in annotations:
                
                #make sure the subtype is a widget, which allows us to fill in information
                if annotation[SUBTYPE_KEY] == WIDGET_SUBTYPE_KEY:
                    if annotation[ANNOT_FIELD_KEY]:
                        
                        #Remove the parentheses from the field key to match individual_dict keys
                        key = annotation[ANNOT_FIELD_KEY][1:-1]
                        if key in individual_dict.keys():
                            
                            #fill in information
                            annotation.update(
                                pdfrw.PdfDict(V='{}'.format(individual_dict[key])))
                                
                            annotation.update(pdfrw.PdfDict(AP=''))
                            
                        #PdfReader will not pull up an Annot if it is listed more than once
                        #in the PDF. This also applies to 'exclusive or' check boxes. This is
                        #why I had to apply the second instance of Name to a different
                        #key ('Name_Again').
                       
                        if key == 'Name_Again':
                            annotation.update(
                                pdfrw.PdfDict(V='{}'.format(individual_dict['Name'])))
                                
                            annotation.update(pdfrw.PdfDict(AP=''))
                            
        #according to a couple of online resources, the values entered may not always populate when
        #the pdf is opened. To avoid that, this line of code is used.
        template_pdf.Root.AcroForm.update(pdfrw.PdfDict(NeedAppearances=pdfrw.PdfObject('true')))
        
        #saves new, filled in pdf as 'client name input_pdf.pdf' with reference to the template
        pdfrw.PdfWriter().write('{} {}'.format(individual_dict['Name'], input_pdf), template_pdf)

In [None]:
autofill_bulk_pdf("Review_Form.pdf", "Client_List.csv")
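
One thing to watch when calling the function: the output name is built from the client name and `input_pdf`, so a template path that includes a folder would end up inside the file name. A minimal sketch of one way around that, using a hypothetical `output_name` helper that is not part of the script:

```python
from pathlib import Path


def output_name(client_name, input_pdf):
    """Build 'client name template.pdf' from the template's file name only."""
    return '{} {}'.format(client_name, Path(input_pdf).name)


print(output_name('Ada Lovelace', 'templates/Review_Form.pdf'))
```

With a bare file name such as "Review_Form.pdf", the result is the same as what the script already produces.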

Huge thanks again to [Andrew Krcatovich]( https://akdux.com/python/2020/10/31/python-fill-pdf-files.htmlFilling-a-PDF ) and [Vivsvaan Sharma]( https://medium.com/@vivsvaan/filling-editable-pdf-in-python-76712c3ce99 ) for providing thorough explanations of how pdfrw works, how the annotation keys work in a PDF document, and the bulk of the code. This was exactly what I needed.

There is still some work that needs to be done for certain types of widgets as mentioned above. I also need to really double down on standardizing field naming.
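
One possible way to approach the field-naming problem is to compare names in a normalized form and to report which template fields have no matching column. The helper names and the normalization rule below (lowercase, with spaces and underscores removed) are assumptions for illustration, not something the script currently does.

```python
def normalize(name):
    """Lowercase a name and drop spaces and underscores so near-matches line up."""
    return name.lower().replace(' ', '').replace('_', '')


def match_fields(field_names, columns):
    """Map each template field to a csv column, or None when nothing matches."""
    by_normalized = {normalize(column): column for column in columns}
    return {field: by_normalized.get(normalize(field)) for field in field_names}


# Invented field and column names.
fields = ['Name', 'Phone_Number', 'Owner Name']
columns = ['name', 'Phone Number', 'City']

matches = match_fields(fields, columns)
print(matches)

# Template fields that would be left blank.
print([field for field, column in matches.items() if column is None])
```

Running a check like this before filling a batch would show which fields still need a column or a special case.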

As for filling or updating a specific client form, I removed the iteration and created a list from the df so I could use the .index() method to retrieve the number that referred to the indexed row. I'm sure there is a better way to return an index from a value in a df, but this was my first instinct and it works.
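
One possible alternative is to filter the dataframe with a boolean mask and read the index off the result. The sketch below assumes a `Name` column as in the script; the sample rows and the `find_row_index` helper are hypothetical.

```python
import io

import pandas as pd

# Invented sample data standing in for the client csv file.
client_df = pd.read_csv(
    io.StringIO("Name,City\nAda Lovelace,London\nAlan Turing,Wilmslow\n")
)


def find_row_index(df, name):
    """Return the index label of the first row whose Name equals `name`."""
    matches = df.index[df['Name'] == name]
    if len(matches) == 0:
        raise ValueError('no client named {!r}'.format(name))
    return matches[0]


i = find_row_index(client_df, 'Alan Turing')
print(i, client_df.loc[i].to_dict())
```

Compared with the list approach, this avoids building a separate list and gives a place to handle a missing or misspelled name explicitly.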


In [None]:
def autofill_individual_pdf(input_pdf_path, client_csv, name):
    client_df = pd.read_csv(client_csv)
    client_dict = client_df.to_dict('index')
    template_pdf = pdfrw.PdfReader(input_pdf_path)
    name_list = client_df['Name'].values.tolist()
    i = name_list.index(name)
    individual_dict = client_dict[i]
    for page in template_pdf.pages:
        #a page without form fields may have no /Annots entry, so fall back to an empty list
        annotations = page[ANNOT_KEY] or []
        for annotation in annotations:
            if annotation[SUBTYPE_KEY] == WIDGET_SUBTYPE_KEY:
                if annotation[ANNOT_FIELD_KEY]:
                    key = annotation[ANNOT_FIELD_KEY][1:-1]
                    if key in individual_dict.keys():
                        annotation.update(
                            pdfrw.PdfDict(V='{}'.format(individual_dict[key])))       
                        annotation.update(pdfrw.PdfDict(AP=''))
                    if key == 'Owner Name':
                        annotation.update(
                            pdfrw.PdfDict(V='{}'.format(individual_dict['Name'])))        
                        annotation.update(pdfrw.PdfDict(AP=''))
    template_pdf.Root.AcroForm.update(pdfrw.PdfDict(NeedAppearances=pdfrw.PdfObject('true')))
    pdfrw.PdfWriter().write('{} Review.pdf'.format(individual_dict['Name']), template_pdf)

That's all for now. If you have any suggestions, I'd love to hear them.