# Introduction

This notebook processes PDF files from FEMA with OpenAI's language models, through LangChain, to create a bot that answers questions about preparing for various disasters.

# Setup
## Environment
This notebook runs on Python 3.9 with the package versions listed in `environment.yml`. A miniconda environment is supplied, which you can set up as follows:

1. Install [miniconda](https://docs.conda.io/en/latest/miniconda.html) using the installer that matches your OS version. Once it is installed, you may have to restart your terminal (close it and open it again).
2. Open a terminal in this directory.
3. `conda env create -f environment.yml`
4. `conda activate stay_safe_bot`

## Data

PDF files were downloaded from FEMA, as noted in the table below, and saved in the folder `./data`.


Note: I intentionally did not download the data automatically, both because this is only an exploratory analysis and to comply with the FEMA website's terms and conditions.

# Analysis

In [5]:
import os
from langchain.document_loaders import PyPDFDirectoryLoader
from langchain.embeddings import OpenAIEmbeddings 
from langchain.vectorstores import Chroma 
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory
from langchain.llms import OpenAI

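# The OpenAI API key is read from the OPENAI_API_KEY environment variable.
# Set it in your shell before starting Jupyter; never hardcode it in the notebook.
if "OPENAI_API_KEY" not in os.environ:
    raise RuntimeError("Set the OPENAI_API_KEY environment variable before running this notebook.")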

pdf_folder_path = './data'
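
A common way to supply the API key without writing it into a notebook cell is to read it from the environment and fail early with a clear message when it is missing. The following is a minimal sketch; `get_openai_api_key` is a hypothetical helper name introduced here for illustration and is not part of LangChain or the OpenAI client.

```python
import os


def get_openai_api_key(env=None):
    """Return the OpenAI API key from the environment, or raise a clear error.

    `env` defaults to os.environ; passing a plain dict makes the helper easy to test.
    This helper is illustrative only and is not part of any library.
    """
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it in your shell before starting Jupyter."
        )
    return key
```

Calling `get_openai_api_key()` at the top of the notebook surfaces a missing key immediately, before any documents are loaded or embedded.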

## Read PDF Documents

In [6]:

print(os.listdir(pdf_folder_path))
loader = PyPDFDirectoryLoader(pdf_folder_path)
docs = loader.load()

['fema_scenario_10_power_outage_answer_key_01102020.pdf', 'LanchainPDFs-20230814T133731Z-001.zip', 'fema_scenario_7-shelter_in_place_TTX_answer_key_01102020.pdf', 'ready_12-ways-to-prepare_postcard.pdf', 'fema_safeguard-critical-documents-and-valuables.pdf', 'ready_document-and-insure-your-property.pdf', 'fema_scenario_7_shelter_in_place_01102020.pdf', 'fema_scenario_1-active_shooter-01102020.pdf', '.DS_Store', 'fema_protect-your-property_wildfire.pdf', 'fema_proteja-su-propiedad-inundaciones_2023.pdf', 'fema_proteja-su-propiedad-marejada-ciclonica_2023.pdf', 'fema_scenario_4-hurricane-01102020.pdf', 'fema_scenario_10_power_outage_01102020.pdf', 'fema_scenario_4_hurricane_flood_TTX_answer_key-01102020.pdf', 'fema_scenario_11_winter_storm_01102020.pdf', 'fema_protect-your-property_severe-wind.pdf', 'fema_proteja-su-propiedad-terremotos_2023.pdf', 'fema_protect-your-property-storm-surge.pdf', 'fema_scenario_8_earthquake_answer_key_01102020.pdf', 'fema_scenario_2-tornado_TTX_answer_key-01
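
The listing above includes entries that are not PDFs, such as a `.zip` archive and `.DS_Store`. As far as I can tell, `PyPDFDirectoryLoader` selects files with a glob pattern that matches only non-hidden `.pdf` files by default, so those entries should not end up in `docs`. To see only the files that are candidates for loading, a small filter can help. This is a sketch using the standard library; `list_pdf_files` is a hypothetical helper, not a LangChain function.

```python
from pathlib import Path


def list_pdf_files(folder):
    """Return the sorted names of non-hidden PDF files directly inside `folder`."""
    return sorted(
        p.name
        for p in Path(folder).iterdir()
        # Keep regular files with a .pdf extension (any case), skipping dotfiles.
        if p.is_file() and p.suffix.lower() == ".pdf" and not p.name.startswith(".")
    )
```

For example, `print(list_pdf_files(pdf_folder_path))` would show the PDF names without the archive or the macOS metadata file.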

## Index PDF Documents (calculate LLM embeddings)

In [7]:
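# Embed each loaded document with OpenAI embeddings and store the vectors in a Chroma index.
embeddings = OpenAIEmbeddings()
vectordb = Chroma.from_documents(docs, embedding=embeddings,
                                 persist_directory=".")
vectordb.persist()  # write the index to disk so it can be reused
# Keep the running conversation so follow-up questions can refer to earlier turns.
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
# Combine the LLM, a retriever over the index, and the memory into one question-answering chain.
pdf_qa = ConversationalRetrievalChain.from_llm(OpenAI(temperature=0.8), vectordb.as_retriever(), memory=memory)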

That's it! With just a few lines of code we have indexed content and are all set to test an LLM chat interface.
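
To give some intuition for what the retriever does with the embeddings, here is a toy sketch of similarity search: each document is a vector, the question is embedded into the same space, and the closest vectors are returned as context. The three-dimensional vectors and page names are made up for illustration, and cosine similarity is an assumption chosen for simplicity; the distance metric and index structure that Chroma actually uses may differ.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=2):
    """Return the names of the k documents most similar to the query vector."""
    scored = [(cosine_similarity(query_vec, vec), name) for name, vec in doc_vecs.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]


# Made-up embeddings; real OpenAI embeddings have many more dimensions.
toy_index = {
    "flood page": [0.9, 0.1, 0.0],
    "wildfire page": [0.1, 0.9, 0.0],
    "earthquake page": [0.0, 0.2, 0.9],
}
print(top_k([1.0, 0.2, 0.0], toy_index, k=2))  # ['flood page', 'wildfire page']
```

The chain then passes the text of the retrieved documents to the LLM together with the question.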

## Ask some test questions

In [10]:
def query_content(query):
    result = pdf_qa({"question": query})
    print(f"Question: \n{query}")
    print(f"\nAnswer:\n{result['answer']}")

query_content("How do I prepare my home for floods?")

Question: 
How do I prepare my home for floods?

Answer:
 You can create an emergency plan for your family and practice it regularly, get flood insurance, document your belongings, store valuables and important documents above the Base Flood Elevation (BFE), elevate appliances and utilities above the BFE, use flood-resistant materials, and know your property and neighborhood. Additionally, you should determine the BFE for your home, which is used in floodplain management regulations in your community, and contact your local floodplain manager for help.
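
Because the chain keeps conversation memory, it can be convenient to send several questions in sequence and collect the answers for review. The sketch below assumes only that the chain is callable with a `{"question": ...}` dict and returns a dict with an `"answer"` key, as `pdf_qa` is used above. `ask_all` and `fake_chain` are hypothetical names introduced here; `fake_chain` is a stand-in so the example runs without an API key.

```python
def ask_all(chain, questions):
    """Send each question to `chain` in order and return (question, answer) pairs."""
    transcript = []
    for question in questions:
        result = chain({"question": question})
        # Strip surrounding whitespace; answers may start with a leading space.
        transcript.append((question, result["answer"].strip()))
    return transcript


def fake_chain(inputs):
    """Stand-in for pdf_qa that echoes the question instead of calling an LLM."""
    return {"answer": f" (stub answer to: {inputs['question']})"}


for question, answer in ask_all(fake_chain, ["How do I prepare my home for floods?"]):
    print(f"Question:\n{question}\n\nAnswer:\n{answer}")
```

In the notebook, `ask_all(pdf_qa, [...])` would run the same loop against the real chain, with each call incurring API usage.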
