## PDF extraction ##

#### FEMA LOMA files for Long Island ####

We will be reading hundreds of PDFs and extracting useful data into a useable dataframe. For this we try Jonathan Soma's Natural-PDF and Jeremy Singer-Vine's PDFPlumber. Both were installed in the backend on the appropriate conda environment.

In [1]:
#import the basics

import pandas as pd
import numpy as np
import requests

In [18]:
pd.set_option('display.max_colwidth', None)

### Trying Natural-PDF ###

In [2]:
from natural_pdf import PDF

# Open a PDF
pdf = PDF("https://github.com/jsoma/natural-pdf/raw/refs/heads/main/pdfs/01-practice.pdf")


In [3]:
# Get the first page
page = pdf.pages[0]

# Extract all text
text = page.extract_text()
print(text)

# Find something specific
title = page.find('text:bold')
print(f"Found title: {title.text}")

Jungle Health and Safety Inspection Service
INS-UP70N51NCL41R
Site: Durham’s Meatpacking Chicago, Ill.
Date: February 3, 1905
Violation Count: 7
Summary: Worst of any, however, were the fertilizer men, and those who served in the cooking rooms.
These people could not be shown to the visitor - for the odor of a fertilizer man would scare any ordinary
visitor at a hundred yards, and as for the other men, who worked in tank rooms full of steam, and in
some of which there were open vats near the level of the floor, their peculiar trouble was that they fell
into the vats; and when they were fished out, there was never enough of them left to be worth
exhibiting - sometimes they would be overlooked for days, till all but the bones of them had gone out
to the world as Durham’s Pure Leaf Lard!
Violations
Statute Description Level Repeat?
4.12.7 Unsanitary Working Conditions. Critical
5.8.3 Inadequate Protective Equipment. Serious
6.3.9 Ineffective Injury Prevention. Serious
7.1.5 Failure to Pro

### Now trying PDFPlumber ###

In [11]:
import pdfplumber
import re
import os

In [35]:

OUTCSV = "extracted_loma_data_with_outcome_v2.csv"
pdf_path = "Suffolk_LOMA_2013_2025"

def normalize_token(t):
    return re.sub(r"[^\w]", "", t).upper()

# Header variants to match (normalized tokens)
HEADER_STRS = [
    "WHAT IS NOT REMOVED FROM THE SFHA",
    "WHAT IS REMOVED FROM THE SFHA",
    "WHAT IS NOT REMOVED FROM SFHA",
    "WHAT IS REMOVED FROM SFHA",
]
HEADER_VARIANTS = [[normalize_token(tok) for tok in s.split()] for s in HEADER_STRS]

def extract_text(pdf_path):
    chunks = []
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            txt = page.extract_text(layout=True) or page.extract_text() or ""
            chunks.append(txt)
    return "\n".join(chunks)

def extract_words_from_pages(pdf_path, max_pages=3):
    words_by_page = []
    with pdfplumber.open(pdf_path) as pdf:
        for i in range(min(max_pages, len(pdf.pages))):
            page = pdf.pages[i]
            words = page.extract_words(use_text_flow=True)
            words_by_page.append(words or [])
    return words_by_page

def find_header_in_words(words, header_variants):
    tokens = [normalize_token(w["text"]) for w in words]
    for variant in header_variants:
        vlen = len(variant)
        for i in range(len(tokens) - vlen + 1):
            if tokens[i:i+vlen] == variant:
                return i, i+vlen-1
    return None

def extract_cell_below(words, start_idx, end_idx, max_lines=4):
    x0 = min(w["x0"] for w in words[start_idx:end_idx+1])
    x1 = max(w["x1"] for w in words[start_idx:end_idx+1])
    bottom = max(w.get("bottom", w.get("top",0)) for w in words[start_idx:end_idx+1])
    col_center_min = x0 - 4
    col_center_max = x1 + 4
    candidates = [w for w in words if (w.get("top",0) > bottom + 1) and ((w["x0"]+w["x1"])/2 >= col_center_min) and ((w["x0"]+w["x1"])/2 <= col_center_max)]
    if not candidates:
        candidates = [w for w in words if (w.get("top",0) > bottom + 1) and (w["x0"] >= x0 - 6 and w["x0"] <= x1 + 6)]
    if not candidates:
        return ""
    candidates.sort(key=lambda w: (round(w["top"],1), w["x0"]))
    lines = []
    curr_top = None
    curr_line = []
    for w in candidates:
        top = round(w["top"],1)
        if curr_top is None or abs(top - curr_top) > 3.0:
            if curr_line:
                lines.append(" ".join([t["text"] for t in curr_line]).strip())
            curr_line = [w]
            curr_top = top
        else:
            curr_line.append(w)
    if curr_line:
        lines.append(" ".join([t["text"] for t in curr_line]).strip())
    cleaned = []
    for ln in lines:
        ln2 = re.sub(r"\s{2,}", " ", ln).strip()
        if re.fullmatch(r"[A-Z0-9 ,\-'()/.%]+", ln2) and len(ln2) > 100:
            break
        cleaned.append(ln2)
        if len(cleaned) >= max_lines:
            break
    return " ".join(cleaned).strip()

def get_outcome_from_pdf(pdf_path):
    REMOVED = "REMOVED FROM THE SFHA"
    NOT_REMOVED = "NOT REMOVED FROM THE SFHA"
    words_pages = extract_words_from_pages(pdf_path, max_pages=3)
    header_text = None
    cell_text = ""
    found = False
    for words in words_pages:
        if not words:
            continue
        res = find_header_in_words(words, HEADER_VARIANTS)
        if res:
            sidx, eidx = res
            header_text = " ".join(w["text"] for w in words[sidx:eidx+1]).strip().upper()
            cell_text = extract_cell_below(words, sidx, eidx, max_lines=4)
            found = True
            break
    if not found:
        fulltxt = extract_text(pdf_path)
        m = re.search(r"(WHAT\s+IS(?:\s+NOT)?\s+REMOVED\s+FROM\s+(?:THE\s+)?SFHA)", fulltxt, flags=re.I)
        if m:
            header_text = m.group(1).strip().upper()
            rest = fulltxt[m.end():].strip()
            lines = [ln.strip() for ln in rest.splitlines() if ln.strip()]
            cell_text = " ".join(lines[:4]) if lines else ""
    normalized = None
    if header_text:
        if "NOT" in header_text:
            normalized = NOT_REMOVED
        else:
            normalized = REMOVED
    if not normalized and cell_text:
        up = cell_text.upper()
        if re.search(r"\bPORTION\s+OF\s+PROPERTY\b", up) or re.search(r"\bPORTION\b", up) or re.search(r"\bPROPERTY\b", up):
            normalized = REMOVED
        if re.search(r"\bNOT\s+REMOVED\b", up) or re.search(r"\bNONE\b", up):
            normalized = NOT_REMOVED
    return normalized, header_text, cell_text

# Other field extractors
def get_community_from_text(full_text):
    m = re.search(r"((?:TOWN|CITY|VILLAGE) OF [A-Z0-9 .'\-]+,?)(?:[^\n]*?)\n\s*([A-Z][A-Z ]+ COUNTY,\s*NEW YORK)", full_text, flags=re.I)
    if m:
        town = re.sub(r"\s+", " ", m.group(1)).strip().upper().rstrip(",")
        county = re.sub(r"\s+", " ", m.group(2)).strip().upper()
        return f"{town}, {county}"
    m = re.search(r"((?:TOWN|CITY|VILLAGE) OF [A-Z0-9 .'\-]+,\s*[A-Z][A-Z ]+ COUNTY,\s*NEW YORK)", full_text, flags=re.I)
    if m:
        return re.sub(r"\s+", " ", m.group(1)).strip().upper()
    m = re.search(r"FLOODING SOURCE:[^\n]*\n([A-Z ,'\-]+)(?:\n([A-Z ,'\-]+))?", full_text)
    if m:
        parts = [p for p in (m.group(1), m.group(2)) if p]
        cand = " ".join(parts)
        if "COUNTY" in cand:
            return re.sub(r"\s+", " ", cand).strip().upper()
    m = re.search(r"\bCOMMUNITY\s*\n([A-Z ,'\-]+(?:\n[A-Z ,'\-]+)?)", full_text)
    if m:
        return re.sub(r"\s+", " ", m.group(1)).strip().upper()
    return None

def get_street_from_text(full_text):
    suffix = r"(Road|Rd\.|Street|St\.|Avenue|Ave\.|Drive|Dr\.|Lane|Ln\.|Boulevard|Blvd\.|Court|Ct\.|Place|Pl\.)"
    m = re.search(rf"\b(\d{{1,6}}\s+[A-Za-z0-9.\- ]+?{suffix})", full_text)
    return m.group(1).strip() if m else None

def get_date_from_text(full_text):
    m = re.search(r"Page\s*\d+\s*of\s*\d+\s*([A-Za-z]+\s\d{1,2},\s\d{4})", full_text)
    if m: return m.group(1).strip()
    m = re.search(r"\b([A-Za-z]+\s\d{1,2},\s\d{4})\b", full_text)
    return m.group(1).strip() if m else None

def get_flooding_source_from_text(full_text):
    m = re.search(r"FLOODING SOURCE:\s*([A-Za-z0-9 &\-\.\']+?)(?:\s{2,}|\s+APPROXIMATE\s+LATITUDE|\n|$)", full_text, flags=re.I)
    if m:
        return re.sub(r"\s+", " ", m.group(1)).strip().upper()
    return None

def get_latlong_from_text(full_text):
    m = re.search(r"APPROXIMATE LATITUDE\s*&\s*LONGITUDE OF PROPERTY:\s*([0-9.\-,\s]+)", full_text, flags=re.I)
    return re.sub(r"\s+", " ", m.group(1)).strip() if m else None

def get_determination_para(full_text):
    m = re.search(r"(This document provides the Federal Emergency Management Agency.*?Administration)", full_text, flags=re.S|re.I)
    if m:
        p = re.sub(r"\s*\n\s*", " ", m.group(1)).strip()
        p = re.sub(r"\s{2,}", " ", p)
        return p
    return None

# Run through folder and collect records
records = []
files = [f for f in sorted(os.listdir(pdf_path)) if f.lower().endswith(".pdf")]
for fn in files:
    path = os.path.join(pdf_path, fn)
    try:
        full_text = extract_text(path)
        normalized, header, cell = get_outcome_from_pdf(path)
        rec = {
            "filename": fn,
            "community": get_community_from_text(full_text),
            "street": get_street_from_text(full_text),
            "outcome": normalized,                    # REMOVED / NOT REMOVED or None
            "outcome_raw_header": header,             # raw header text (if found)
            "outcome_raw_cell": cell,                 # following cell text under header
            "date": get_date_from_text(full_text),
            "flooding_source": get_flooding_source_from_text(full_text),
            "latitude_longitude": get_latlong_from_text(full_text),
            "determination_paragraph": get_determination_para(full_text)
        }
    except Exception as e:
        rec = {"filename": fn, "error": str(e)}
    records.append(rec)

df3 = pd.DataFrame(records)
df3.head()


Unnamed: 0,filename,community,street,outcome,outcome_raw_header,outcome_raw_cell,date,flooding_source,latitude_longitude,determination_paragraph
0,13-02-0720A-365337.pdf,COMMUNITY NO,55 Mowbray Avenue,REMOVED FROM THE SFHA,WHAT IS REMOVED FROM THE SFHA,Structure area that would be section on Attachment Agency's determination,"May 21, 2013",,"40.721, -73.239","This document provides the Federal Emergency Management Agency's determination regarding a request for a Letter of Map Amendment for the property described above. Using the information submitted and the effective National Flood Insurance Program (NFIP) map, we have determined that the structure(s) on the property(ies) is/are not located in the SFHA, an area inundated by the flood having a 1-percent chance of being equaled or exceeded in any given year (base flood). This document amends the effective NFIP map to remove the subject property from the SFHA located on the effective NFIP map; therefore, the Federal mandatory flood insurance requirement does not apply. However, the lender has the option to continue the flood insurance requirement to protect its financial risk on the loan. A Preferred Risk Policy (PRP) is available for buildings located outside the SFHA. Information about the PRP and how one can apply is enclosed. This determination is based on the flood data presently available. The enclosed documents provide additional information regarding this determination. If you have any questions about this document, please contact the FEMA Map Assistance Center toll free at (877) 336-2627 (877-FEMA MAP) or by letter addressed to the Federal Emergency Management Agency, LOMC Clearinghouse, 847 South Pickett Street, Alexandria, VA 22304-4605. Luis Rodriguez, P.E., Chief Engineering Management Branch Federal Insurance and Mitigation Administration"
1,13-02-0917A-365342.pdf,"TOWN OF SOUTHAMPTON, SUFFOLK COUNTY, NEW YORK",193 Old Mill Road,REMOVED FROM THE SFHA,WHAT IS REMOVED FROM THE SFHA,Portion of Property area that would be section on Attachment,"May 09, 2013",MILL POND,"40.911, -72.361","This document provides the Federal Emergency Management Agency's determination regarding a request for a Letter of Map Amendment for the property described above. Using the information submitted and the effective National Flood Insurance Program (NFIP) map, we have determined that the described portion(s) of the property(ies) is/are not located in the SFHA, an area inundated by the flood having a 1-percent chance of being equaled or exceeded in any given year (base flood). This document amends the effective NFIP map to remove the subject property from the SFHA located on the effective NFIP map; therefore, the Federal mandatory flood insurance requirement does not apply. However, the lender has the option to continue the flood insurance requirement to protect its financial risk on the loan. A Preferred Risk Policy (PRP) is available for buildings located outside the SFHA. Information about the PRP and how one can apply is enclosed. This determination is based on the flood data presently available. The enclosed documents provide additional information regarding this determination. If you have any questions about this document, please contact the FEMA Map Assistance Center toll free at (877) 336-2627 (877-FEMA MAP) or by letter addressed to the Federal Emergency Management Agency, LOMC Clearinghouse, 847 South Pickett Street, Alexandria, VA 22304-4605. Luis Rodriguez, P.E., Chief Engineering Management Branch Federal Insurance and Mitigation Administration"
2,13-02-1024A-365342.pdf,"TOWN OF SOUTHAMPTON, SUFFOLK COUNTY, NEW YORK",178 Bay Lane,REMOVED FROM THE SFHA,WHAT IS REMOVED FROM THE SFHA,Structure (Residence) area that would be section on Attachment,"June 04, 2013",,"40.909, -72.328","This document provides the Federal Emergency Management Agency's determination regarding a request for a Letter of Map Amendment for the property described above. Using the information submitted and the effective National Flood Insurance Program (NFIP) map, we have determined that the structure(s) on the property(ies) is/are not located in the SFHA, an area inundated by the flood having a 1-percent chance of being equaled or exceeded in any given year (base flood). This document amends the effective NFIP map to remove the subject property from the SFHA located on the effective NFIP map; therefore, the Federal mandatory flood insurance requirement does not apply. However, the lender has the option to continue the flood insurance requirement to protect its financial risk on the loan. A Preferred Risk Policy (PRP) is available for buildings located outside the SFHA. Information about the PRP and how one can apply is enclosed. This determination is based on the flood data presently available. The enclosed documents provide additional information regarding this determination. If you have any questions about this document, please contact the FEMA Map Assistance Center toll free at (877) 336-2627 (877-FEMA MAP) or by letter addressed to the Federal Emergency Management Agency, LOMC Clearinghouse, 847 South Pickett Street, Alexandria, VA 22304-4605. Luis Rodriguez, P.E., Chief Engineering Management Branch Federal Insurance and Mitigation Administration"
3,13-02-1215A-360790.pdf,COMMUNITY NO,847 South Pickett Street,NOT REMOVED FROM THE SFHA,WHAT IS NOT REMOVED FROM THE SFHA,Structure (Residence) area that would be Agency's determination,"July 18, 2013",,"40.669, -73.376","This document provides the Federal Emergency Management Agency's determination regarding a request for a Letter of Map Amendment for the property described above. Using the information submitted and the effective National Flood Insurance Program (NFIP) map, we have determined that the structure(s) on the property(ies) is/are located in the SFHA, an area inundated by the flood having a 1-percent chance of being equaled or exceeded in any given year (base flood). Therefore, flood insurance is required for the property described above. The lowest adjacent grade elevation to a structure must be at or above the Base Flood Elevation for a structure to be outside of the SFHA. This determination is based on the flood data presently available. The enclosed documents provide additional information regarding this determination and information regarding your options for obtaining a Letter of Map Amendment. If you have any questions about this document, please contact the FEMA Map Assistance Center toll free at (877) 336-2627 (877-FEMA MAP) or by letter addressed to the Federal Emergency Management Agency, LOMC Clearinghouse, 847 South Pickett Street, Alexandria, VA 22304-4605. Luis Rodriguez, P.E., Chief Engineering Management Branch Federal Insurance and Mitigation Administration"
4,13-02-1577A-360813.pdf,COMMUNITY NO,847 South Pickett Street,REMOVED FROM THE SFHA,WHAT IS REMOVED FROM THE SFHA,Structure (Residence) area that would be section on Attachment,"August 29, 2013",LONG ISLAND SOUND,"41.140, -72.347","This document provides the Federal Emergency Management Agency's determination regarding a request for a Letter of Map Amendment for the property described above. Using the information submitted and the effective National Flood Insurance Program (NFIP) map, we have determined that the structure(s) on the property(ies) is/are not located in the SFHA, an area inundated by the flood having a 1-percent chance of being equaled or exceeded in any given year (base flood). This document amends the effective NFIP map to remove the subject property from the SFHA located on the effective NFIP map; therefore, the Federal mandatory flood insurance requirement does not apply. However, the lender has the option to continue the flood insurance requirement to protect its financial risk on the loan. A Preferred Risk Policy (PRP) is available for buildings located outside the SFHA. Information about the PRP and how one can apply is enclosed. This determination is based on the flood data presently available. The enclosed documents provide additional information regarding this determination. If you have any questions about this document, please contact the FEMA Map Assistance Center toll free at (877) 336-2627 (877-FEMA MAP) or by letter addressed to the Federal Emergency Management Agency, LOMC Clearinghouse, 847 South Pickett Street, Alexandria, VA 22304-4605. Luis Rodriguez, P.E., Chief Engineering Management Branch Federal Insurance and Mitigation Administration"


In [36]:
df3.outcome.value_counts()

outcome
REMOVED FROM THE SFHA        101
NOT REMOVED FROM THE SFHA     44
Name: count, dtype: int64

### Same for Nassau now. ###

In [38]:
pdf_path_nas = 'Nassau_LOMA_2013_2025'

# Run through folder and collect records
records_nas = []
files = [f for f in sorted(os.listdir(pdf_path_nas)) if f.lower().endswith(".pdf")]
for fn in files:
    path = os.path.join(pdf_path_nas, fn)
    try:
        full_text = extract_text(path)
        normalized, header, cell = get_outcome_from_pdf(path)
        rec = {
            "filename": fn,
            "community": get_community_from_text(full_text),
            "street": get_street_from_text(full_text),
            "outcome": normalized,                    # REMOVED / NOT REMOVED or None
            "outcome_raw_header": header,             # raw header text (if found)
            "outcome_raw_cell": cell,                 # following cell text under header
            "date": get_date_from_text(full_text),
            "flooding_source": get_flooding_source_from_text(full_text),
            "latitude_longitude": get_latlong_from_text(full_text),
            "determination_paragraph": get_determination_para(full_text)
        }
    except Exception as e:
        rec = {"filename": fn, "error": str(e)}
    records_nas.append(rec)

df4 = pd.DataFrame(records_nas)
df4.head()


Unnamed: 0,filename,community,street,outcome,outcome_raw_header,outcome_raw_cell,date,flooding_source,latitude_longitude,determination_paragraph
0,13-02-0514A-360495.pdf,"VILLAGE OF VALLEY STREAM, NASSAU COUNTY, NEW YORK",10 -- -- 150 Cornwell Avenue,NOT REMOVED FROM THE SFHA,WHAT IS NOT REMOVED FROM THE SFHA,Structure (Residence) area that would be Agency's determination,"May 09, 2013",MOTTS CREEK,"40.666, -73.688","This document provides the Federal Emergency Management Agency's determination regarding a request for a Letter of Map Amendment for the property described above. Using the information submitted and the effective National Flood Insurance Program (NFIP) map, we have determined that the structure(s) on the property(ies) is/are located in the SFHA, an area inundated by the flood having a 1-percent chance of being equaled or exceeded in any given year (base flood). Therefore, flood insurance is required for the property described above. The lowest adjacent grade elevation to a structure must be at or above the Base Flood Elevation for a structure to be outside of the SFHA. This determination is based on the flood data presently available. The enclosed documents provide additional information regarding this determination and information regarding your options for obtaining a Letter of Map Amendment. If you have any questions about this document, please contact the FEMA Map Assistance Center toll free at (877) 336-2627 (877-FEMA MAP) or by letter addressed to the Federal Emergency Management Agency, LOMC Clearinghouse, 847 South Pickett Street, Alexandria, VA 22304-4605. Luis Rodriguez, P.E., Chief Engineering Management Branch Federal Insurance and Mitigation Administration"
1,13-02-0782A-360467.pdf,,2067 Whalen Avenue,NOT REMOVED FROM THE SFHA,WHAT IS NOT REMOVED FROM THE SFHA,Structure (Residence) area that would be Agency's determination,"May 16, 2013",,"40.650, -73.540","This document provides the Federal Emergency Management Agency's determination regarding a request for a Letter of Map Amendment for the property described above. Using the information submitted and the effective National Flood Insurance Program (NFIP) map, we have determined that the structure(s) on the property(ies) is/are located in the SFHA, an area inundated by the flood having a 1-percent chance of being equaled or exceeded in any given year (base flood). Therefore, flood insurance is required for the property described above. The lowest adjacent grade elevation to a structure must be at or above the Base Flood Elevation for a structure to be outside of the SFHA. This determination is based on the flood data presently available. The enclosed documents provide additional information regarding this determination and information regarding your options for obtaining a Letter of Map Amendment. If you have any questions about this document, please contact the FEMA Map Assistance Center toll free at (877) 336-2627 (877-FEMA MAP) or by letter addressed to the Federal Emergency Management Agency, LOMC Clearinghouse, 847 South Pickett Street, Alexandria, VA 22304-4605. Luis Rodriguez, P.E., Chief Engineering Management Branch Federal Insurance and Mitigation Administration"
2,13-02-1150A-360467.pdf,N,406 Barnard Avenue,REMOVED FROM THE SFHA,WHAT IS REMOVED FROM THE SFHA,Structure that would be on Attachment 1 determination,"May 10, 2013",,"40.632, -73.726","This document provides the Federal Emergency Management Agency's determination regarding a request for a Letter of Map Amendment for the property described above. Using the information submitted and the effective National Flood Insurance Program (NFIP) map, we have determined that the structure(s) on the property(ies) is/are not located in the SFHA, an area inundated by the flood having a 1-percent chance of being equaled or exceeded in any given year (base flood). This document amends the effective NFIP map to remove the subject property from the SFHA located on the effective NFIP map; therefore, the Federal mandatory flood insurance requirement does not apply. However, the lender has the option to continue the flood insurance requirement to protect its financial risk on the loan. A Preferred Risk Policy (PRP) is available for buildings located outside the SFHA. Information about the PRP and how one can apply is enclosed. This determination is based on the flood data presently available. The enclosed documents provide additional information regarding this determination. If you have any questions about this document, please contact the FEMA Map Assistance Center toll free at (877) 336-2627 (877-FEMA MAP) or by letter addressed to the Federal Emergency Management Agency, Attn: RAMPP eLOMA Coordinator, 8401 Arlington Blvd, Fairfax, VA 22031-2666, Fax: 800-684-6860. Luis Rodriguez, P.E., Chief Engineering Management Branch eLOMA Federal Insurance and Mitigation Administration"
3,13-02-1304A-360467.pdf,C,3 Oakwood at East 3264 Judith Drive,REMOVED FROM THE SFHA,WHAT IS REMOVED FROM THE SFHA,Structure that would be on Attachment 1 determination,"June 04, 2013",,"40.642, -73.523","This document provides the Federal Emergency Management Agency's determination regarding a request for a Letter of Map Amendment for the property described above. Using the information submitted and the effective National Flood Insurance Program (NFIP) map, we have determined that the structure(s) on the property(ies) is/are not located in the SFHA, an area inundated by the flood having a 1-percent chance of being equaled or exceeded in any given year (base flood). This document amends the effective NFIP map to remove the subject property from the SFHA located on the effective NFIP map; therefore, the Federal mandatory flood insurance requirement does not apply. However, the lender has the option to continue the flood insurance requirement to protect its financial risk on the loan. A Preferred Risk Policy (PRP) is available for buildings located outside the SFHA. Information about the PRP and how one can apply is enclosed. This determination is based on the flood data presently available. The enclosed documents provide additional information regarding this determination. If you have any questions about this document, please contact the FEMA Map Assistance Center toll free at (877) 336-2627 (877-FEMA MAP) or by letter addressed to the Federal Emergency Management Agency, Attn: RAMPP eLOMA Coordinator, 8401 Arlington Blvd, Fairfax, VA 22031-2666, Fax: 800-684-6860. Luis Rodriguez, P.E., Chief Engineering Management Branch eLOMA Federal Insurance and Mitigation Administration"
4,13-02-1305A-360467.pdf,C,3 Merrick Bayview 2093 Blanche Lane,REMOVED FROM THE SFHA,WHAT IS REMOVED FROM THE SFHA,Structure that would be on Attachment 1 determination,"June 04, 2013",,"40.646, -73.536","This document provides the Federal Emergency Management Agency's determination regarding a request for a Letter of Map Amendment for the property described above. Using the information submitted and the effective National Flood Insurance Program (NFIP) map, we have determined that the structure(s) on the property(ies) is/are not located in the SFHA, an area inundated by the flood having a 1-percent chance of being equaled or exceeded in any given year (base flood). This document amends the effective NFIP map to remove the subject property from the SFHA located on the effective NFIP map; therefore, the Federal mandatory flood insurance requirement does not apply. However, the lender has the option to continue the flood insurance requirement to protect its financial risk on the loan. A Preferred Risk Policy (PRP) is available for buildings located outside the SFHA. Information about the PRP and how one can apply is enclosed. This determination is based on the flood data presently available. The enclosed documents provide additional information regarding this determination. If you have any questions about this document, please contact the FEMA Map Assistance Center toll free at (877) 336-2627 (877-FEMA MAP) or by letter addressed to the Federal Emergency Management Agency, Attn: RAMPP eLOMA Coordinator, 8401 Arlington Blvd, Fairfax, VA 22031-2666, Fax: 800-684-6860. Luis Rodriguez, P.E., Chief Engineering Management Branch eLOMA Federal Insurance and Mitigation Administration"


In [39]:
df4.outcome.value_counts()

outcome
REMOVED FROM THE SFHA        367
NOT REMOVED FROM THE SFHA     63
Name: count, dtype: int64

In [44]:
li_df = pd.concat([df3, df4])
li_df

Unnamed: 0,filename,community,street,outcome,outcome_raw_header,outcome_raw_cell,date,flooding_source,latitude_longitude,determination_paragraph
0,13-02-0720A-365337.pdf,COMMUNITY NO,55 Mowbray Avenue,REMOVED FROM THE SFHA,WHAT IS REMOVED FROM THE SFHA,Structure area that would be section on Attachment Agency's determination,"May 21, 2013",,"40.721, -73.239","This document provides the Federal Emergency Management Agency's determination regarding a request for a Letter of Map Amendment for the property described above. Using the information submitted and the effective National Flood Insurance Program (NFIP) map, we have determined that the structure(s) on the property(ies) is/are not located in the SFHA, an area inundated by the flood having a 1-percent chance of being equaled or exceeded in any given year (base flood). This document amends the effective NFIP map to remove the subject property from the SFHA located on the effective NFIP map; therefore, the Federal mandatory flood insurance requirement does not apply. However, the lender has the option to continue the flood insurance requirement to protect its financial risk on the loan. A Preferred Risk Policy (PRP) is available for buildings located outside the SFHA. Information about the PRP and how one can apply is enclosed. This determination is based on the flood data presently available. The enclosed documents provide additional information regarding this determination. If you have any questions about this document, please contact the FEMA Map Assistance Center toll free at (877) 336-2627 (877-FEMA MAP) or by letter addressed to the Federal Emergency Management Agency, LOMC Clearinghouse, 847 South Pickett Street, Alexandria, VA 22304-4605. Luis Rodriguez, P.E., Chief Engineering Management Branch Federal Insurance and Mitigation Administration"
1,13-02-0917A-365342.pdf,"TOWN OF SOUTHAMPTON, SUFFOLK COUNTY, NEW YORK",193 Old Mill Road,REMOVED FROM THE SFHA,WHAT IS REMOVED FROM THE SFHA,Portion of Property area that would be section on Attachment,"May 09, 2013",MILL POND,"40.911, -72.361","This document provides the Federal Emergency Management Agency's determination regarding a request for a Letter of Map Amendment for the property described above. Using the information submitted and the effective National Flood Insurance Program (NFIP) map, we have determined that the described portion(s) of the property(ies) is/are not located in the SFHA, an area inundated by the flood having a 1-percent chance of being equaled or exceeded in any given year (base flood). This document amends the effective NFIP map to remove the subject property from the SFHA located on the effective NFIP map; therefore, the Federal mandatory flood insurance requirement does not apply. However, the lender has the option to continue the flood insurance requirement to protect its financial risk on the loan. A Preferred Risk Policy (PRP) is available for buildings located outside the SFHA. Information about the PRP and how one can apply is enclosed. This determination is based on the flood data presently available. The enclosed documents provide additional information regarding this determination. If you have any questions about this document, please contact the FEMA Map Assistance Center toll free at (877) 336-2627 (877-FEMA MAP) or by letter addressed to the Federal Emergency Management Agency, LOMC Clearinghouse, 847 South Pickett Street, Alexandria, VA 22304-4605. Luis Rodriguez, P.E., Chief Engineering Management Branch Federal Insurance and Mitigation Administration"
2,13-02-1024A-365342.pdf,"TOWN OF SOUTHAMPTON, SUFFOLK COUNTY, NEW YORK",178 Bay Lane,REMOVED FROM THE SFHA,WHAT IS REMOVED FROM THE SFHA,Structure (Residence) area that would be section on Attachment,"June 04, 2013",,"40.909, -72.328","This document provides the Federal Emergency Management Agency's determination regarding a request for a Letter of Map Amendment for the property described above. Using the information submitted and the effective National Flood Insurance Program (NFIP) map, we have determined that the structure(s) on the property(ies) is/are not located in the SFHA, an area inundated by the flood having a 1-percent chance of being equaled or exceeded in any given year (base flood). This document amends the effective NFIP map to remove the subject property from the SFHA located on the effective NFIP map; therefore, the Federal mandatory flood insurance requirement does not apply. However, the lender has the option to continue the flood insurance requirement to protect its financial risk on the loan. A Preferred Risk Policy (PRP) is available for buildings located outside the SFHA. Information about the PRP and how one can apply is enclosed. This determination is based on the flood data presently available. The enclosed documents provide additional information regarding this determination. If you have any questions about this document, please contact the FEMA Map Assistance Center toll free at (877) 336-2627 (877-FEMA MAP) or by letter addressed to the Federal Emergency Management Agency, LOMC Clearinghouse, 847 South Pickett Street, Alexandria, VA 22304-4605. Luis Rodriguez, P.E., Chief Engineering Management Branch Federal Insurance and Mitigation Administration"
3,13-02-1215A-360790.pdf,COMMUNITY NO,847 South Pickett Street,NOT REMOVED FROM THE SFHA,WHAT IS NOT REMOVED FROM THE SFHA,Structure (Residence) area that would be Agency's determination,"July 18, 2013",,"40.669, -73.376","This document provides the Federal Emergency Management Agency's determination regarding a request for a Letter of Map Amendment for the property described above. Using the information submitted and the effective National Flood Insurance Program (NFIP) map, we have determined that the structure(s) on the property(ies) is/are located in the SFHA, an area inundated by the flood having a 1-percent chance of being equaled or exceeded in any given year (base flood). Therefore, flood insurance is required for the property described above. The lowest adjacent grade elevation to a structure must be at or above the Base Flood Elevation for a structure to be outside of the SFHA. This determination is based on the flood data presently available. The enclosed documents provide additional information regarding this determination and information regarding your options for obtaining a Letter of Map Amendment. If you have any questions about this document, please contact the FEMA Map Assistance Center toll free at (877) 336-2627 (877-FEMA MAP) or by letter addressed to the Federal Emergency Management Agency, LOMC Clearinghouse, 847 South Pickett Street, Alexandria, VA 22304-4605. Luis Rodriguez, P.E., Chief Engineering Management Branch Federal Insurance and Mitigation Administration"
4,13-02-1577A-360813.pdf,COMMUNITY NO,847 South Pickett Street,REMOVED FROM THE SFHA,WHAT IS REMOVED FROM THE SFHA,Structure (Residence) area that would be section on Attachment,"August 29, 2013",LONG ISLAND SOUND,"41.140, -72.347","This document provides the Federal Emergency Management Agency's determination regarding a request for a Letter of Map Amendment for the property described above. Using the information submitted and the effective National Flood Insurance Program (NFIP) map, we have determined that the structure(s) on the property(ies) is/are not located in the SFHA, an area inundated by the flood having a 1-percent chance of being equaled or exceeded in any given year (base flood). This document amends the effective NFIP map to remove the subject property from the SFHA located on the effective NFIP map; therefore, the Federal mandatory flood insurance requirement does not apply. However, the lender has the option to continue the flood insurance requirement to protect its financial risk on the loan. A Preferred Risk Policy (PRP) is available for buildings located outside the SFHA. Information about the PRP and how one can apply is enclosed. This determination is based on the flood data presently available. The enclosed documents provide additional information regarding this determination. If you have any questions about this document, please contact the FEMA Map Assistance Center toll free at (877) 336-2627 (877-FEMA MAP) or by letter addressed to the Federal Emergency Management Agency, LOMC Clearinghouse, 847 South Pickett Street, Alexandria, VA 22304-4605. Luis Rodriguez, P.E., Chief Engineering Management Branch Federal Insurance and Mitigation Administration"
...,...,...,...,...,...,...,...,...,...,...
478,25-02-0454A-360479.pdf,,33 Manorhaven Boulevard,,,,"June 26, 2025",,,
479,25-02-0501A-360467.pdf,"TOWN OF HEMPSTEAD, NASSAU COUNTY, NEW YORK",5 North 841 Talbot Avenue,REMOVED FROM THE SFHA,WHAT IS REMOVED FROM THE SFHA,Structure that would be inundated appropriate section on Agency's determination,"July 25, 2025",,"40.648748, -73.725168",
480,25-02-0515A-360467.pdf,"TOWN OF HEMPSTEAD, NASSAU COUNTY, NEW YORK",3 Hewlett Knolls 978 Dartmouth Lane,REMOVED FROM THE SFHA,WHAT IS REMOVED FROM THE SFHA,Property that would be inundated appropriate section on Agency's determination,"August 05, 2025",,"40.642515, -73.714045",
481,25-02-0523A-360467.pdf,"TOWN OF HEMPSTEAD, NASSAU COUNTY, NEW YORK",5 Map of North 865 Oliver Avenue,REMOVED FROM THE SFHA,WHAT IS REMOVED FROM THE SFHA,Structure that would be inundated appropriate section on Agency's determination,"August 06, 2025",,"40.649619, -73.724829",


In [46]:
li_df[['latitude','longitude']]=li_df['latitude_longitude'].str.split(",").apply(pd.Series).astype(float).rename(columns={0: "lat", 1: "lng"})
li_df.head()

Unnamed: 0,filename,community,street,outcome,outcome_raw_header,outcome_raw_cell,date,flooding_source,latitude_longitude,determination_paragraph,latitude,longitude
0,13-02-0720A-365337.pdf,COMMUNITY NO,55 Mowbray Avenue,REMOVED FROM THE SFHA,WHAT IS REMOVED FROM THE SFHA,Structure area that would be section on Attachment Agency's determination,"May 21, 2013",,"40.721, -73.239","This document provides the Federal Emergency Management Agency's determination regarding a request for a Letter of Map Amendment for the property described above. Using the information submitted and the effective National Flood Insurance Program (NFIP) map, we have determined that the structure(s) on the property(ies) is/are not located in the SFHA, an area inundated by the flood having a 1-percent chance of being equaled or exceeded in any given year (base flood). This document amends the effective NFIP map to remove the subject property from the SFHA located on the effective NFIP map; therefore, the Federal mandatory flood insurance requirement does not apply. However, the lender has the option to continue the flood insurance requirement to protect its financial risk on the loan. A Preferred Risk Policy (PRP) is available for buildings located outside the SFHA. Information about the PRP and how one can apply is enclosed. This determination is based on the flood data presently available. The enclosed documents provide additional information regarding this determination. If you have any questions about this document, please contact the FEMA Map Assistance Center toll free at (877) 336-2627 (877-FEMA MAP) or by letter addressed to the Federal Emergency Management Agency, LOMC Clearinghouse, 847 South Pickett Street, Alexandria, VA 22304-4605. Luis Rodriguez, P.E., Chief Engineering Management Branch Federal Insurance and Mitigation Administration",40.721,-73.239
1,13-02-0917A-365342.pdf,"TOWN OF SOUTHAMPTON, SUFFOLK COUNTY, NEW YORK",193 Old Mill Road,REMOVED FROM THE SFHA,WHAT IS REMOVED FROM THE SFHA,Portion of Property area that would be section on Attachment,"May 09, 2013",MILL POND,"40.911, -72.361","This document provides the Federal Emergency Management Agency's determination regarding a request for a Letter of Map Amendment for the property described above. Using the information submitted and the effective National Flood Insurance Program (NFIP) map, we have determined that the described portion(s) of the property(ies) is/are not located in the SFHA, an area inundated by the flood having a 1-percent chance of being equaled or exceeded in any given year (base flood). This document amends the effective NFIP map to remove the subject property from the SFHA located on the effective NFIP map; therefore, the Federal mandatory flood insurance requirement does not apply. However, the lender has the option to continue the flood insurance requirement to protect its financial risk on the loan. A Preferred Risk Policy (PRP) is available for buildings located outside the SFHA. Information about the PRP and how one can apply is enclosed. This determination is based on the flood data presently available. The enclosed documents provide additional information regarding this determination. If you have any questions about this document, please contact the FEMA Map Assistance Center toll free at (877) 336-2627 (877-FEMA MAP) or by letter addressed to the Federal Emergency Management Agency, LOMC Clearinghouse, 847 South Pickett Street, Alexandria, VA 22304-4605. Luis Rodriguez, P.E., Chief Engineering Management Branch Federal Insurance and Mitigation Administration",40.911,-72.361
2,13-02-1024A-365342.pdf,"TOWN OF SOUTHAMPTON, SUFFOLK COUNTY, NEW YORK",178 Bay Lane,REMOVED FROM THE SFHA,WHAT IS REMOVED FROM THE SFHA,Structure (Residence) area that would be section on Attachment,"June 04, 2013",,"40.909, -72.328","This document provides the Federal Emergency Management Agency's determination regarding a request for a Letter of Map Amendment for the property described above. Using the information submitted and the effective National Flood Insurance Program (NFIP) map, we have determined that the structure(s) on the property(ies) is/are not located in the SFHA, an area inundated by the flood having a 1-percent chance of being equaled or exceeded in any given year (base flood). This document amends the effective NFIP map to remove the subject property from the SFHA located on the effective NFIP map; therefore, the Federal mandatory flood insurance requirement does not apply. However, the lender has the option to continue the flood insurance requirement to protect its financial risk on the loan. A Preferred Risk Policy (PRP) is available for buildings located outside the SFHA. Information about the PRP and how one can apply is enclosed. This determination is based on the flood data presently available. The enclosed documents provide additional information regarding this determination. If you have any questions about this document, please contact the FEMA Map Assistance Center toll free at (877) 336-2627 (877-FEMA MAP) or by letter addressed to the Federal Emergency Management Agency, LOMC Clearinghouse, 847 South Pickett Street, Alexandria, VA 22304-4605. Luis Rodriguez, P.E., Chief Engineering Management Branch Federal Insurance and Mitigation Administration",40.909,-72.328
3,13-02-1215A-360790.pdf,COMMUNITY NO,847 South Pickett Street,NOT REMOVED FROM THE SFHA,WHAT IS NOT REMOVED FROM THE SFHA,Structure (Residence) area that would be Agency's determination,"July 18, 2013",,"40.669, -73.376","This document provides the Federal Emergency Management Agency's determination regarding a request for a Letter of Map Amendment for the property described above. Using the information submitted and the effective National Flood Insurance Program (NFIP) map, we have determined that the structure(s) on the property(ies) is/are located in the SFHA, an area inundated by the flood having a 1-percent chance of being equaled or exceeded in any given year (base flood). Therefore, flood insurance is required for the property described above. The lowest adjacent grade elevation to a structure must be at or above the Base Flood Elevation for a structure to be outside of the SFHA. This determination is based on the flood data presently available. The enclosed documents provide additional information regarding this determination and information regarding your options for obtaining a Letter of Map Amendment. If you have any questions about this document, please contact the FEMA Map Assistance Center toll free at (877) 336-2627 (877-FEMA MAP) or by letter addressed to the Federal Emergency Management Agency, LOMC Clearinghouse, 847 South Pickett Street, Alexandria, VA 22304-4605. Luis Rodriguez, P.E., Chief Engineering Management Branch Federal Insurance and Mitigation Administration",40.669,-73.376
4,13-02-1577A-360813.pdf,COMMUNITY NO,847 South Pickett Street,REMOVED FROM THE SFHA,WHAT IS REMOVED FROM THE SFHA,Structure (Residence) area that would be section on Attachment,"August 29, 2013",LONG ISLAND SOUND,"41.140, -72.347","This document provides the Federal Emergency Management Agency's determination regarding a request for a Letter of Map Amendment for the property described above. Using the information submitted and the effective National Flood Insurance Program (NFIP) map, we have determined that the structure(s) on the property(ies) is/are not located in the SFHA, an area inundated by the flood having a 1-percent chance of being equaled or exceeded in any given year (base flood). This document amends the effective NFIP map to remove the subject property from the SFHA located on the effective NFIP map; therefore, the Federal mandatory flood insurance requirement does not apply. However, the lender has the option to continue the flood insurance requirement to protect its financial risk on the loan. A Preferred Risk Policy (PRP) is available for buildings located outside the SFHA. Information about the PRP and how one can apply is enclosed. This determination is based on the flood data presently available. The enclosed documents provide additional information regarding this determination. If you have any questions about this document, please contact the FEMA Map Assistance Center toll free at (877) 336-2627 (877-FEMA MAP) or by letter addressed to the Federal Emergency Management Agency, LOMC Clearinghouse, 847 South Pickett Street, Alexandria, VA 22304-4605. Luis Rodriguez, P.E., Chief Engineering Management Branch Federal Insurance and Mitigation Administration",41.14,-72.347


In [47]:
li_df.to_csv('li_loma.csv')