# Automation Plan


- Working from the LaTeX template, identify the portions of the document that need to be replicated
- Using the current template, create a text frame for each portion
- Save the replicable portions to file
- Modify the template to include the text files, using standardized filenames and TDS numbers
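
The last two steps depend on a predictable naming scheme. Below is a minimal sketch, assuming the `<TDS>_<part>` pattern used by files such as `073_context_map.png` carries over to the saved text files. The helper names `part_filename` and `input_line`, the `.tex` extension, and the `TEXT` output directory are all assumptions made for illustration.

```python
from pathlib import PurePosixPath


def part_filename(tds, part, ext="tex"):
    """Build a standardized filename such as '073_analysis.tex'."""
    # Assumption: TDS numbers are zero-padded to three characters, as in '073'.
    return f"{str(tds).zfill(3)}_{part}.{ext}"


def input_line(tds, part, directory="TEXT"):
    """Return the LaTeX \\input line a template could use for one part."""
    path = PurePosixPath(directory) / part_filename(tds, part)
    # \input accepts the path without the .tex extension.
    return "\\input{" + str(path.with_suffix("")) + "}"


print(input_line("073", "analysis"))
```

Keeping the TDS number in both the filename and the `\input` line means the template can be re-pointed at another consolidation by changing one value.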

In [1]:
import pandas as pd
import numpy as np
from PIL import Image
import os
import sys
import docx2txt
import re

In [2]:
os.chdir('/Users/kyleslugg/Documents/NYCHA/Production')

## Creating Context Map Parts

In [None]:
cons_name = "SUMNER CONSOLIDATED"
cons_tds = '073'

In [5]:
image = Image.open(f'REPORT_TEMPLATE/{cons_tds}_context_map.png')

In [8]:
width, height = image.size

In [9]:
# Integer division keeps the crop coordinates in whole pixels
bb1 = (0, 0, width // 2, height)
bb2 = (width // 2, 0, width, height)

In [10]:
img_1 = image.crop(bb1)
img_2 = image.crop(bb2)

In [15]:
img_1.save(f'REPORT_TEMPLATE/{cons_tds}_context_1.png', format="PNG")
img_2.save(f'REPORT_TEMPLATE/{cons_tds}_context_2.png', format="PNG")
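
The bounding-box arithmetic above can be wrapped in a small helper. This sketch computes both boxes with integer division so the two halves tile the image without overlap, even when the width is odd. `half_boxes` is a hypothetical name, not part of Pillow, and the image size in the example is invented.

```python
def half_boxes(width, height):
    """Return (left, upper, right, lower) boxes for the left and right halves."""
    mid = width // 2  # integer midpoint; the right half gets the extra pixel if width is odd
    left_box = (0, 0, mid, height)
    right_box = (mid, 0, width, height)
    return left_box, right_box


# Example with an assumed 1001 x 600 pixel image size.
left_box, right_box = half_boxes(1001, 600)
print(left_box, right_box)
```

Each returned tuple can be passed directly to `Image.crop`.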

## Testing Out Text Imports

In [26]:
analysis_text = docx2txt.process(f'TEXT/{cons_tds}_Analysis.docx')

In [27]:
analysis_text

'Smith Houses Analysis:\n\n\n\nInspection and Collection Requirements\xa0\n\nSmith Consolidation\xa0is\xa0in compliance with inspection and collection requirement of Paragraph 45 of the HUD\xa0Agreement, according to a Compliance Interview conducted on October 24, 2019. The Supervisor Caretaker, Frederick Brown, reported that staff patrols the grounds for cleaning and maintenance and has sufficient manpower to correct most observed deficiencies. NYCHA caretakers pick up trash inside the buildings one to two times a day, including weekends. They conduction ground inspections more than four times a day, including weekends. They pick up litter from the ground one to two times a day. Daily trash collection begins between 8:00 AM – 10:00 AM and ends before 4:00 PM. Mr. Brown stated that caretakers are usually able to complete all of their tasks in a day.\xa0\xa0\n\n\xa0\n\nRemoval or Storage Requirement\xa0\n\nSmith Consolidation is in compliance with the storage and removal of Paragraph 45

In [28]:
substitutions = {'“':"``",
                '”': "''",
                '’':"'",
                ' ':' ',
                '–':'--',
                ' ':' ',
                '\xa0':' '}

header = re.compile(r'([\w\s]*:)')
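
To see what the `header` pattern captures, the sketch below applies the same regular expression to a short sample string modeled on the title line of the analysis text. The sample text is invented for illustration. Because the character class also matches newlines and digits, later colons (for example in clock times) produce additional matches, so only the first hit should be treated as the title.

```python
import re

header = re.compile(r'([\w\s]*:)')

sample = (
    "Smith Houses Analysis:\n\n"
    "Inspection and Collection Requirements\n\n"
    "Trash collection begins at 8:00 AM."
)

# search() returns None when no colon-terminated run exists, so guard the lookup.
match = header.search(sample)
first_header = match.group(1) if match else None
print(repr(first_header))

# findall() lists every match, including the one that ends at the colon in '8:00'.
print(header.findall(sample))
```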

In [29]:
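def clean_analysis(analysis_text):
    # Normalize typographic characters to their LaTeX-friendly equivalents
    for key, value in substitutions.items():
        analysis_text = analysis_text.replace(key, value)
    # Drop the leading title line (e.g. 'Smith Houses Analysis:'), if one is found
    match = header.search(analysis_text)
    if match:
        analysis_text = analysis_text.replace(match.group(1) + '\n', '')
    return analysis_text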

def format_analysis_block(clean_analysis_text):
    # Raw strings keep '\e' from being read as an (invalid) escape sequence.
    # The 'and' variant of the removal heading is normalized to the 'or' wording.
    latex_block = (
        clean_analysis_text
        .replace('Inspection and Collection Requirements',
                 r'\emph{Inspection and Collection Requirements}')
        .replace('Removal or Storage Requirement',
                 r'\emph{Removal or Storage Requirement}')
        .replace('Removal and Storage Requirement',
                 r'\emph{Removal or Storage Requirement}')
    )
    return latex_block
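
The chained `replace` calls could also be written as one table-driven substitution. This is an alternative sketch, not the notebook's method. It assumes the three heading variants listed are the only ones that occur, and `HEADINGS`, `heading_pattern`, and `emphasize_headings` are hypothetical names.

```python
import re

# Heading variants seen in the source text, mapped to the canonical heading.
HEADINGS = {
    'Inspection and Collection Requirements': 'Inspection and Collection Requirements',
    'Removal or Storage Requirement': 'Removal or Storage Requirement',
    'Removal and Storage Requirement': 'Removal or Storage Requirement',
}

heading_pattern = re.compile('|'.join(re.escape(h) for h in HEADINGS))


def emphasize_headings(text):
    """Wrap each known heading in \\emph{...}, normalizing variant wording."""
    # A function replacement avoids backslash processing in the replacement text.
    return heading_pattern.sub(
        lambda m: r'\emph{' + HEADINGS[m.group(0)] + '}', text
    )


sample = "Removal and Storage Requirement\n\nSmith Consolidation is in compliance."
print(emphasize_headings(sample))
```

A single pass over the text means a new heading variant only needs one more dictionary entry.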
