## Use of LangChain and Azure OpenAI
### Notebook 3 - Retrieval-Augmented Generation

In [1]:
# Importing required packages
from langchain_openai import AzureChatOpenAI, AzureOpenAIEmbeddings
from langchain_community.document_loaders import PyPDFLoader
from langchain_community.vectorstores import FAISS
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate
from langchain.chains import create_retrieval_chain
import os

**Note**: When connecting LangChain to an Azure OpenAI endpoint, make sure that no local `OPENAI_API_BASE` environment variable is set. Otherwise, creating the client fails with a value error and the following message: "*As of openai>=1.0.0, Azure endpoints should be specified via the `azure_endpoint` param not `openai_api_base` (or alias `base_url`)*".
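
The check described in the note can be done from inside the notebook before any client is created. The following is a minimal sketch, not part of the original notebook: the helper name `drop_conflicting_openai_base` is hypothetical, and it only affects the current Python process, not your shell profile or `.env` file.

```python
import os

def drop_conflicting_openai_base(env=os.environ):
    """Remove OPENAI_API_BASE from the given environment mapping, if present.

    Returns the removed value, or None when the variable was not set.
    """
    return env.pop("OPENAI_API_BASE", None)

removed = drop_conflicting_openai_base()
if removed is not None:
    print("Removed OPENAI_API_BASE from this process's environment")
```

If the variable comes back after a kernel restart, it is probably defined in a shell profile or `.env` file and should be removed there.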

In [2]:
# Extracting environment variables
AOAI_API_BASE = os.getenv("AZURE_OPENAI_API_BASE")
AOAI_API_KEY = os.getenv("AZURE_OPENAI_API_KEY")
AOAI_API_VERSION = os.getenv("AZURE_OPENAI_API_VERSION")
AOAI_DEPLOYMENT1 = os.getenv("AZURE_OPENAI_API_DEPLOY")
AOAI_DEPLOYMENT2 = os.getenv("AZURE_OPENAI_API_DEPLOY_EMBED")
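
Because `os.getenv` returns `None` for an unset variable, a missing setting would only surface later, when a client is created or called. A small sketch of an early check, assuming the same five variable names used above; the helper `find_missing` is hypothetical and not part of the original notebook.

```python
import os

# Variable names taken from the cell above.
REQUIRED_VARS = [
    "AZURE_OPENAI_API_BASE",
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_API_VERSION",
    "AZURE_OPENAI_API_DEPLOY",
    "AZURE_OPENAI_API_DEPLOY_EMBED",
]

def find_missing(names, env=os.environ):
    """Return the names that are unset or set to an empty string."""
    return [name for name in names if not env.get(name)]

missing = find_missing(REQUIRED_VARS)
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
```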

In [3]:
# Creating an instance of Azure OpenAI GPT-4
llm = AzureChatOpenAI(
    api_key = AOAI_API_KEY,
    api_version = AOAI_API_VERSION,
    azure_endpoint = AOAI_API_BASE,
    azure_deployment = AOAI_DEPLOYMENT1,
)

In [4]:
# Creating an instance of Azure OpenAI Embeddings
embeddings = AzureOpenAIEmbeddings(
    api_key = AOAI_API_KEY,
    api_version = AOAI_API_VERSION,
    azure_endpoint = AOAI_API_BASE,
    azure_deployment = AOAI_DEPLOYMENT2,
)

In [5]:
# Loading the PDF and splitting it into a list of documents
loader = PyPDFLoader("data/NorthwindHealthPlus_BenefitsDetails.pdf")
pages = loader.load_and_split()

In [6]:
# Checking the number of loaded pages
print(f"Number of pages in the new list: {len(pages)}")
print(f"Content of 1st page: {pages[0]}")

Number of pages in the new list: 109
Content of 1st page: page_content='Contoso Electronics  \nNorthwind Health Plus Plan' metadata={'source': 'data/NorthwindHealthPlus_BenefitsDetails.pdf', 'page': 0}


You may need to install the FAISS package first with `pip install faiss-cpu`.

In [7]:
# Creating vector store with FAISS
vector = FAISS.from_documents(pages, embeddings)
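
To make the role of the vector store more concrete, here is a toy sketch of nearest-neighbour lookup in plain Python. It is not how FAISS is implemented: the three-dimensional vectors and the texts are made up for illustration (real embeddings have hundreds or thousands of dimensions), and the use of Euclidean distance is an assumption about the default metric.

```python
import math

def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def top_k(query_vec, indexed, k=2):
    """Return the k texts whose vectors are closest to query_vec."""
    ranked = sorted(indexed, key=lambda item: l2_distance(query_vec, item[1]))
    return [text for text, _ in ranked[:k]]

# Made-up (text, vector) pairs standing in for embedded document chunks.
indexed = [
    ("Deductible details", [0.9, 0.1, 0.0]),
    ("Vision coverage", [0.1, 0.9, 0.0]),
    ("Dental coverage", [0.0, 0.8, 0.3]),
]

query = [0.0, 0.9, 0.1]  # made-up embedding of a user question
print(top_k(query, indexed))  # ['Vision coverage', 'Dental coverage']
```

In the notebook, `FAISS.from_documents` performs the embedding step by calling the `embeddings` model for every document, so that cell issues requests to the Azure OpenAI embeddings deployment.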

In [8]:
# Defining document chain
prompt = ChatPromptTemplate.from_template("""
Answer the following question based only on the provided context:

<context>
{context}
</context>

Question: {input}
"""
)

document_chain = create_stuff_documents_chain(llm, prompt)
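
The "stuff" strategy simply places the text of all retrieved documents into the `{context}` slot of the prompt. The sketch below imitates that step with plain string formatting; the `stuff_prompt` helper and the blank-line separator are assumptions made for illustration, not the library's actual code.

```python
# Same wording as the notebook's prompt template.
PROMPT_TEMPLATE = """Answer the following question based only on the provided context:

<context>
{context}
</context>

Question: {input}
"""

def stuff_prompt(chunks, question, separator="\n\n"):
    """Join all chunks into one context string and fill the template."""
    context = separator.join(chunks)
    return PROMPT_TEMPLATE.format(context=context, input=question)

example = stuff_prompt(
    ["First retrieved chunk.", "Second retrieved chunk."],
    "What is Northwind Health Plus?",
)
print(example)
```

Because every retrieved chunk ends up in a single prompt, this approach works best when the retriever returns only a handful of short documents that fit within the model's context window.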

In [9]:
# Defining retrieval chain
retriever = vector.as_retriever()
retrieval_chain = create_retrieval_chain(retriever, document_chain)

In [10]:
# Testing RAG chain
response = retrieval_chain.invoke({"input": "What is Northwind Health Plus?"})
print(response["answer"])

Northwind Health Plus is a comprehensive group health plan sponsored by Contoso and administered by Northwind Health. It provides participants with a wide range of health benefits and services, including coverage for medical, vision, and dental services, as well as prescription drug coverage, mental health and substance abuse services, and preventive care services. The plan allows members to choose from a variety of in-network providers, and it also covers emergency services. The plan is designed to supplement existing health insurance coverage, and it includes cost-sharing arrangements like co-pays, deductibles, and out-of-pocket maximums. Members are responsible for a portion of the premium, which is deducted from their paycheck, and may also incur additional costs when they receive health care services.
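
Besides `"answer"`, the response dictionary is expected to carry the retrieved documents, which makes it possible to see which pages the answer was grounded in. The sketch below assumes those documents sit under a `"context"` key (check `response.keys()` to confirm) and uses a hand-made stand-in response with placeholder values, so it runs without calling the model; `FakeDocument` and `cited_pages` are hypothetical names.

```python
from dataclasses import dataclass, field

@dataclass
class FakeDocument:
    """Stand-in with the same two attributes as a loaded page document."""
    page_content: str
    metadata: dict = field(default_factory=dict)

def cited_pages(response):
    """Return the sorted, de-duplicated page numbers of the retrieved documents."""
    return sorted(
        {doc.metadata["page"] for doc in response["context"] if "page" in doc.metadata}
    )

# Hand-made response with placeholder content, shaped like the assumed chain output.
sample_response = {
    "input": "What is Northwind Health Plus?",
    "context": [
        FakeDocument("placeholder text", {"source": "example.pdf", "page": 3}),
        FakeDocument("placeholder text", {"source": "example.pdf", "page": 1}),
        FakeDocument("placeholder text", {"source": "example.pdf", "page": 3}),
    ],
    "answer": "placeholder answer",
}

print(cited_pages(sample_response))  # [1, 3]
```

With the real chain, calling `cited_pages(response)` should list the zero-based page indices stored in each document's metadata by the PDF loader.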
