As I explore the book *Azure OpenAI Essentials: A Practical Guide to Unlocking Generative AI-Powered Innovation with Azure OpenAI*, I am using this Jupyter Notebook to document my learning process and demonstrate how these concepts can be applied to real-world healthcare scenarios.

Suppose I am tasked with creating an enterprise document question-answer solution for a healthcare insurance provider. The solution should leverage all healthcare documents and medical history provided by the insurer to enable members to interact with a chatbot. Members can ask the chatbot questions related to their insurance policies, such as coverage details, claim status, or benefits, and receive accurate and context-aware responses. This solution will utilize Azure OpenAI for generating embeddings and answering prompts, along with Azure Cognitive Search for retrieving the most relevant documents.

**Note**: This task is solely for my personal practice and learning purposes. It is not related to my current job, nor is it intended as advice or a production-ready solution for others to use.

### Objective
The objective of this notebook is to build an enterprise document question-answer solution for a healthcare insurance provider. This solution will:
1. Leverage Azure OpenAI to generate embeddings and answer user queries.
2. Use Azure Cognitive Search to retrieve the most relevant healthcare documents based on user queries.
3. Provide accurate and context-aware responses to users' questions about their insurance policies.

### Workflow
This notebook will follow these steps:
1. **Import Required Packages**: Load all necessary libraries and dependencies.
2. **Set Up Environment Variables**: Configure Azure OpenAI and Azure Cognitive Search credentials.
3. **Initialize Services**: Connect to Azure OpenAI and Azure Cognitive Search.
4. **Define Helper Functions**: Implement functions for document retrieval, embedding generation, and answering queries.
5. **Run the Solution**: Demonstrate the end-to-end process by answering sample user queries.

### Prerequisites
Before running this notebook, ensure the following:
1. **Azure OpenAI Deployment**:
   - You have access to an Azure OpenAI deployment. Please note that access to Azure OpenAI models requires you to be either an enterprise customer or partner with a company email address.
   - Deployed models include GPT-4 or GPT-3.5, configured to handle user queries effectively.

2. **Prompt Engineering**:
   - The system prompt for the Azure OpenAI model using GPT-4 has been designed to act as an AI assistant that helps users find information related to their health insurance policies. This includes:
     - Providing details about coverage, deductibles, and benefits.
     - Answering questions about claim statuses and other policy-related queries.
   - Example system prompt and sample exchange I set up for this task:
     ```
     {"role": "system", "content": "You are an AI assistant that specializes in helping users find information about their health insurance policies. You provide accurate, concise, and context-aware answers based on the user's query and the provided document context."}
     ```
     ```
     {"role": "user", "content": "Will my health insurance cover hearing aids for my son?"}
     ```
     ```
     {"role": "assistant", "content": "Yes, your health insurance will cover hearing aids for your son. However, you may need to pay a $100 copay as part of your plan's coverage policy."}
     ```

3. **Azure Cognitive Search Index**:
   - The Azure Cognitive Search index is populated with healthcare documents, insurance policies, and other relevant data.
   - The index includes an embedding field for vector search to enable semantic retrieval of the most relevant documents.
   - The index I created for this task is named `azureblob-healthcare-index`.

4. **Environment Variables**:
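   - Ensure the following environment variables are set in a `.env` file (these are the names the code in Step 2 reads):
      - `OPENAI_API_BASE`
      - `OPENAI_API_VERSION`
      - `OPENAI_API_KEY`
      - `OPENAI_DEPLOYMENT_NAME`
      - `OPENAI_EMBEDDING_DEPLOYMENT_NAME`
   - The Azure Cognitive Search endpoint, API key, and index name are set directly in the notebook with sample values in Step 2.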


In [None]:
# ASCII diagram: How User asks a question and gets an answer from Azure OpenAI
print("""
+------------------+       Question            +------------------+
|                  | ------------------------> |                  |
|      User        |                           |   Azure OpenAI   |
|                  |                           |   Embeddings     |
+------------------+                           +------------------+
          ^                                           |
          |                                           v
          |                                 +----------------------+
          |                                 |   Vector Database    |
          |                                 | (Azure OpenAI Embeds)| <------ Holds the index (`azureblob-healthcare-index`) of healthcare docs
          |                                 +----------------------+
          |                                           |--------------------- Top k 
          |                                           v
          |                                 +----------------------+
          |                                 |  Azure OpenAI Answer |
          |                                 |       Prompt         |
          |                                 +----------------------+
          |                                           |
          |                                           v
          |                Answer              +------------------+
          -----------------------------------  |                  |
                                               |   Azure OpenAI   |
                                               |   Answering      |
                                               +------------------+
""")


### Step 1: Import Packages

With the notebook's purpose, workflow, and prerequisites outlined above, the first step is to import the required packages.

**Note**: I have also used the code samples in the book's official GitHub repository to aid my learning and implementation: https://github.com/PacktPublishing/Azure-OpenAI-Essentials.

In [None]:
# Import packages required for this healthcare-document task
# Note: these import paths follow the older `langchain` 0.0.x layout and the pre-1.0 `openai` SDK
import os
from dotenv import load_dotenv
import openai
from langchain.llms import AzureOpenAI
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import AzureSearch
from langchain.chains import RetrievalQA
from langchain.document_loaders import DirectoryLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

### Step 2: Load Environment Variables and Initialize Azure

In this step, we will load the environment variables required to configure Azure OpenAI and Azure Cognitive Search services. These variables include API keys, endpoints, and deployment names, which are essential for securely connecting to the Azure services.

**Why is this important?**
For this healthcare task, environment variables ensure that sensitive information, such as API keys and endpoints, is not hardcoded into the script. This approach enhances security and allows for easier configuration changes without modifying the code. Properly loading these variables is critical for accessing the healthcare documents and generating accurate responses using Azure OpenAI and Azure Cognitive Search.
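
As a minimal sketch of failing fast on incomplete configuration, the check below reports which required variables are unset or still hold a `your-...` placeholder. The helper name `find_missing_settings` and the placeholder rule are my own assumptions for illustration; they are not part of the book's code or of `python-dotenv`.

```python
import os

# Names read by the configuration cell in this step; adjust to match your own .env file.
REQUIRED_SETTINGS = [
    "OPENAI_API_BASE",
    "OPENAI_API_KEY",
    "OPENAI_DEPLOYMENT_NAME",
]


def find_missing_settings(names, env=None):
    """Return the names that are unset, empty, or still contain a 'your-' placeholder."""
    env = os.environ if env is None else env
    missing = []
    for name in names:
        value = (env.get(name) or "").strip()
        if not value or "your-" in value:
            missing.append(name)
    return missing


# Example with an explicit dictionary instead of the real environment
sample_env = {
    "OPENAI_API_BASE": "https://example.openai.azure.com",
    "OPENAI_API_KEY": "your-azure-openai-api-key",
}
print(find_missing_settings(REQUIRED_SETTINGS, sample_env))
```

Running a check like this right after `load_dotenv()` would surface a forgotten variable before any request is sent to Azure.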

**Initialize Azure OpenAI**:

Azure OpenAI is used to generate embeddings for semantic search and to answer user queries. Initializing Azure OpenAI ensures that the script can connect to the correct deployment and use the appropriate model for generating embeddings and responses.

In [None]:
# Load environment variables from the .env file
load_dotenv()

# Azure OpenAI environment variables
OPENAI_API_TYPE = "azure"  # Specify that you are using Azure OpenAI
OPENAI_API_BASE = os.getenv("OPENAI_API_BASE", "https://your-azure-openai-endpoint")  # Replace with your endpoint
OPENAI_API_VERSION = os.getenv("OPENAI_API_VERSION", "2023-03-15-preview")  # Default API version
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY", "your-azure-openai-api-key")  # Replace with your API key

# Deployment-specific variables for Azure OpenAI
OPENAI_DEPLOYMENT_NAME = os.getenv("OPENAI_DEPLOYMENT_NAME", "your-deployment-name")  # Replace with your deployment name (e.g., gpt-4 or gpt-35-turbo)
OPENAI_EMBEDDING_DEPLOYMENT_NAME = os.getenv("OPENAI_EMBEDDING_DEPLOYMENT_NAME", "text-embedding-ada-002")  # Replace with your embedding deployment name

# Initialize Azure OpenAI
openai.api_type = OPENAI_API_TYPE
openai.api_base = OPENAI_API_BASE
openai.api_version = OPENAI_API_VERSION
openai.api_key = OPENAI_API_KEY

# Print a message to confirm initialization (for learning purposes)
print("Azure OpenAI environment variables loaded and initialized successfully.")

**Initialize Azure Cognitive Search**:

Azure Cognitive Search is used to retrieve the most relevant documents based on the user's query. Initializing this service allows the script to connect to the search index, perform vector-based searches, and retrieve documents that will be used as context for answering user queries. The values provided in the code are **samples only** and should be replaced with actual endpoint, API key, and index name when connecting to a real Azure Cognitive Search service.

These steps are critical for securely and effectively connecting to the Azure services required for this healthcare document question-answer solution. 

In [None]:
# Initialize Azure Cognitive Search with SAMPLE values for learning purposes
VECTOR_STORE_ADDRESS = "https://my-cognitive-search-service.search.windows.net"
VECTOR_STORE_PASSWORD = "1234567890abcdef1234567890abcdef"
AZURE_COGNITIVE_SEARCH_INDEX_NAME = "azureblob-healthcare-index"
AZURE_COGNITIVE_SERVICE_NAME = "my-cognitive-search-service"
AZURE_COGNITIVE_SEARCH_API = "2021-04-30-Preview"

# Print the configuration for confirmation (for learning purposes)
print("Azure Cognitive Search initialized with the following configuration:")
print(f"Endpoint: {VECTOR_STORE_ADDRESS}")
print(f"Index Name: {AZURE_COGNITIVE_SEARCH_INDEX_NAME}")
print(f"Service Name: {AZURE_COGNITIVE_SERVICE_NAME}")
print(f"API Version: {AZURE_COGNITIVE_SEARCH_API}")

### Step 3: Test Connectivity

**Why Test Connectivity with Azure OpenAI?**

Testing connectivity with Azure OpenAI ensures that the environment variables, API keys, and deployment configurations are correctly set up and functional. This step is crucial to verify that the application can successfully communicate with the Azure OpenAI service before proceeding with more complex operations, such as generating embeddings or answering queries. By identifying and resolving connectivity issues early, we can save time and avoid potential errors during the implementation of the solution.

In [None]:
# Test connectivity to Azure OpenAI with a simple prompt
llm = AzureOpenAI(deployment_name=OPENAI_DEPLOYMENT_NAME, temperature=0)  # Initialize the LLM for the configured deployment
print(llm("Hello, tell me about yourself."))  # Send a sample prompt and print the response

**Example Output**:
Based on the system setup, the response may look like this:

**Response**: I am an AI assistant designed to help users with questions related to their healthcare and insurance policies. I can provide information about coverage, deductibles, claims, and other policy-related details. How can I assist you today?

**OPTIONAL**: The code sample below is an optional step to test the end-to-end workflow.

This step demonstrates how to simulate the process of retrieving relevant documents (e.g., health insurance policies) and generating an answer using Azure OpenAI. It assumes that:
1. Relevant documents are retrieved from Azure Cognitive Search (or simulated for learning purposes).
2. The documents are combined into a context to guide Azure OpenAI in answering the user's query.
3. Azure OpenAI generates a response based on the provided context and query.

**Note**: This step assumes that the connectivity test with `llm` in Step 3 has already succeeded. Ensure that the `llm` object is properly initialized before running this step.

This is a useful step to validate the integration of document retrieval and answer generation in a real-world scenario.

In [None]:
# OPTIONAL: Test the end-to-end workflow with a sample query

# Define a sample query
query = "What is my current deductible for health insurance?"

# Step 1: Simulate the documents that Azure Cognitive Search would retrieve
retrieved_docs = [
    "Your current deductible is $500 for individual coverage.",
    "For family coverage, the deductible is $1,000.",
    "Please refer to your policy document for more details."
]  # Simulated results for learning purposes

# Step 2: Combine retrieved documents into a context for the prompt
context = "\n".join(retrieved_docs)
prompt = f"Context: {context}\n\nQuestion: {query}\nAnswer:"

# Step 3: Use Azure OpenAI to generate an answer
try:
    response = llm(prompt)  # Assumes `llm` was initialized in the Step 3 connectivity test
    print("Generated Answer:")
    print(response)
except Exception as e:
    print("Error generating answer:", e)

**Example Output**:
Based on the provided context, the response may look like this:

**Answer**: Your current deductible is **$500** for individual coverage. For family coverage, the deductible is **$1,000**. If you need more details, please refer to your policy document or contact your insurance provider directly.
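
The context-building step in the cell above can also be factored into a small helper. The sketch below is one possible shape, not the book's method: `build_prompt`, its `max_context_chars` budget, and the instruction wording are hypothetical choices of mine. It numbers each snippet and stops adding snippets once a character budget is reached.

```python
def build_prompt(query, docs, max_context_chars=2000):
    """Join retrieved snippets into a numbered context block and append the question."""
    lines = []
    used = 0
    for i, doc in enumerate(docs, start=1):
        line = f"[{i}] {doc.strip()}"
        # Stop adding snippets once the character budget would be exceeded
        if used + len(line) > max_context_chars:
            break
        lines.append(line)
        used += len(line)
    context = "\n".join(lines)
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )


sample_docs = [
    "Your current deductible is $500 for individual coverage.",
    "For family coverage, the deductible is $1,000.",
]
print(build_prompt("What is my current deductible?", sample_docs))
```

Numbering the snippets makes it easier to trace which retrieved text a generated answer relied on.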

### Step 4: Load Healthcare Documents

In this step, we will load healthcare-related documents, such as insurance policies, claims history, and medical records, from PDF files. These documents will be processed and added to the vector search index to enable semantic search and retrieval. 

**Why is this important?**
Loading and indexing these documents is a critical step in building the question-answer solution. By converting the content of these documents into embeddings and storing them in the vector search index, we enable the system to retrieve the most relevant documents based on user queries. This ensures that the AI assistant can provide accurate and context-aware responses.

**Steps**:
1. Read the content of PDF files.
2. Generate embeddings for the document content.
3. Add the documents and their embeddings to the vector search index.

In [None]:
# Define the path to the directory containing healthcare documents
directory_path = "path/to/healthcare/documents"  # Replace with the path to your healthcare documents

# Start with an empty list so later cells can safely check `if documents:`
documents = []

# Check if the directory exists
if not os.path.isdir(directory_path):
    print(f"Error: The directory '{directory_path}' does not exist.")
else:
    # Load the healthcare documents from the directory
    loader = DirectoryLoader(directory_path, glob="*.pdf")  # Load only PDF files (you can adjust for specific file types)
    documents = loader.load()

    # Print the number of documents loaded
    print(f"Loaded {len(documents)} healthcare documents.")

    # OPTIONALLY: Print the content of the first document for verification
    if documents:
        print("\nPreview of the first healthcare document:")
        print(documents[0].page_content[:500])  # Print the first 500 characters of the first document's text

### Step 5: Split Healthcare Documents into Chunks

After loading the healthcare documents, the next step is to split them into smaller chunks. This is important because:

1. Generating embeddings for smaller chunks of text ensures that the embeddings capture the semantic meaning of specific sections of the document, rather than the entire document.
2. Splitting healthcare documents into chunks allows the vector search to retrieve the most relevant sections of a document, rather than the entire document, improving the accuracy of the results.
3. Large healthcare documents may exceed the token limits of the embedding model or the OpenAI model. Splitting them into smaller chunks ensures compatibility with these limits.

The following code demonstrates how to split the loaded documents into chunks.

In [None]:
# Define the text splitter
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,  # Maximum number of characters in each chunk
    chunk_overlap=200  # Overlap between chunks to maintain context
)

# Split the loaded healthcare documents into chunks
document_chunks = []  # Default so later cells can safely check `if document_chunks:`
if documents:
    document_chunks = text_splitter.split_documents(documents)
    print(f"Split the documents into {len(document_chunks)} chunks.")

    # Optionally, preview the first chunk
    if document_chunks:
        print("\nPreview of the first chunk:")
        print(document_chunks[0])
else:
    print("No documents loaded to split.")

The cell above splits the loaded healthcare documents into smaller chunks using a text splitter. Each chunk is limited to 1,000 characters with a 200-character overlap to maintain context between chunks. Splitting healthcare documents keeps each embedding focused on a specific section of a document, which improves the accuracy of vector-based searches.

These values (chunk size and overlap) can be adjusted depending on the size and structure of the documents to better suit specific use cases.
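
To make the effect of chunk size and overlap concrete, here is a simplified fixed-window sketch in plain Python. It is an illustration only: `split_with_overlap` is a hypothetical helper of mine, and `RecursiveCharacterTextSplitter` behaves differently because it prefers to break on separators such as paragraph and line breaks rather than at fixed offsets.

```python
def split_with_overlap(text, chunk_size, chunk_overlap):
    """Cut text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # How far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text
        if start + chunk_size >= len(text):
            break
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
for chunk in split_with_overlap(sample, chunk_size=10, chunk_overlap=4):
    print(chunk)
```

Each printed chunk starts with the last four characters of the previous one, which is the same idea as the 200-character overlap used above: text near a boundary appears in both neighboring chunks.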

### Step 6: Initialize Vector Store and Add Healthcare Document Chunks

In this step, we initialize the vector store using Azure Cognitive Search and add the healthcare document chunks to it. This process involves:
1. Initializing the embedding model to generate vector representations of the document chunks.
2. Configuring the vector store to use Azure Cognitive Search as the backend.
3. Adding the document chunks to the vector store for efficient vector-based searches.

This step ensures that the document chunks are stored in a searchable format, enabling semantic search capabilities for user queries.

In [None]:
# Define the embedding model
deployment_model = OPENAI_EMBEDDING_DEPLOYMENT_NAME  # Embedding deployment name loaded in Step 2
embeddings = OpenAIEmbeddings(
    deployment=deployment_model,
    chunk_size=1,  # Send one text per embedding request
    openai_api_key=OPENAI_API_KEY,
    openai_api_base=OPENAI_API_BASE,
    openai_api_type=OPENAI_API_TYPE,
    openai_api_version=OPENAI_API_VERSION,
)

# Initialize the vector store
vector_store = AzureSearch(
    azure_search_endpoint=VECTOR_STORE_ADDRESS,
    azure_search_key=VECTOR_STORE_PASSWORD,
    index_name=AZURE_COGNITIVE_SEARCH_INDEX_NAME,  # "azureblob-healthcare-index", defined in Step 2
    embedding_function=embeddings.embed_query,  # Called with a single text at a time, for both chunks and queries
)

# Add document chunks to the vector store
if document_chunks:
    list_of_docs = vector_store.add_documents(documents=document_chunks)
    print(f"Successfully added {len(document_chunks)} document chunks to the vector store.")
else:
    print("No document chunks available to add to the vector store.")

**Summary**: From Loading Documents to Adding Chunks to Vector Search

1. Loading Healthcare Documents:
   - Healthcare documents are loaded from a specified directory using `DirectoryLoader`, which reads files in supported formats (e.g., PDFs).

2. Splitting Documents into Chunks:
   - The loaded documents are split into smaller, manageable chunks using `RecursiveCharacterTextSplitter`. Each chunk is limited to 1,000 characters with a 200-character overlap to maintain context and improve search accuracy.

3. Initializing the Embedding Model:
   - An embedding model, such as `text-embedding-ada-002`, is initialized to generate vector representations of the document chunks. These embeddings capture the semantic meaning of the text.

4. Adding Chunks to the Vector Store:
   - The document chunks are added to a vector store, such as Azure Cognitive Search, which enables efficient semantic search. This allows the system to retrieve the most relevant sections of the documents based on user queries.

### Step 7: Perform a Similarity Search

In this step, we perform a similarity search to retrieve the healthcare document chunks most relevant to a user's query. The query is converted into an embedding, which is then compared with the embeddings of the document chunks stored in the vector store. This matters for three reasons:

1. Similarity search allows the system to quickly identify and retrieve the most relevant sections of healthcare documents, saving time and improving user experience.

2. By retrieving the top 3 most similar documents, the system can provide accurate and context-aware answers to user queries.

3. Focusing on the most relevant documents ensures that the system generates responses based on the most pertinent information, reducing the likelihood of irrelevant or incorrect answers.
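
The comparison itself can be sketched in plain Python. This brute-force version uses cosine similarity over made-up 3-dimensional vectors; the function names and toy data are assumptions for illustration, and a real vector store relies on its own indexed search rather than a loop like this.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k_similar(query_vector, indexed_chunks, k=3):
    """Rank (text, vector) pairs by cosine similarity to the query vector."""
    scored = [(cosine_similarity(query_vector, vec), text) for text, vec in indexed_chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy vectors; real embeddings have many more dimensions
toy_index = [
    ("Hearing aids are covered with a copay.", [0.9, 0.1, 0.0]),
    ("Dental cleanings are covered twice a year.", [0.1, 0.9, 0.0]),
    ("Dependents under 18 are eligible for hearing aids.", [0.8, 0.2, 0.1]),
]
query_vector = [1.0, 0.0, 0.0]  # Stand-in for the embedded user query

for score, text in top_k_similar(query_vector, toy_index, k=2):
    print(f"{score:.3f}  {text}")
```

The two hearing-aid snippets rank above the dental one because their vectors point in nearly the same direction as the query vector.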

In [None]:
# Perform a similarity search to retrieve the top 3 most relevant healthcare document chunks
query = "Will my health insurance cover hearing aids for my child?"  # Example user query

# Perform the similarity search
try:
    # Retrieve the top 3 most similar document chunks
    top_k = 3
    search_type = "similarity"  # Type of search to run (e.g., "similarity" for pure vector search, or "hybrid")
    similar_documents = vector_store.similarity_search(query, k=top_k, search_type=search_type)

    # Print the results
    print(f"Top {top_k} most similar documents for the query: '{query}'\n")
    for i, doc in enumerate(similar_documents, start=1):
        print(f"Document {i}:")
        print(doc.page_content)  # `page_content` holds the text of the chunk
        print("-" * 50)
except Exception as e:
    print(f"An error occurred during the similarity search: {e}")

**Example Output**:
A similarity search returns document chunks rather than a generated answer. For the query above, the retrieved chunks may look like this:

1. **Document 1**: "Hearing aids are covered under your health insurance plan with a $100 copay for each device."
2. **Document 2**: "Your health insurance policy includes coverage for hearing aids for dependents under the age of 18."
3. **Document 3**: "Please refer to Section 5 of your policy document for details on hearing aid coverage and associated costs."

These chunks support an answer such as: "Yes, your health insurance will cover hearing aids for your child. However, you may need to pay a $100 copay as part of your plan's coverage policy." Generating that answer is the job of the QA chain in Step 8.

### Step 8: Create a Question-Answering (QA) System Using RetrievalQA

In this step, we initialize a question-answering (QA) system using the `RetrievalQA` chain from `langchain`. This chain combines the power of Azure OpenAI for generating answers with the vector store for retrieving relevant healthcare documents. 

The QA system works as follows:
1. A user query is passed to the chain.
2. The chain retrieves the most relevant documents from the vector store using similarity search.
3. Azure OpenAI generates a context-aware answer based on the retrieved documents.

This step enables the system to provide accurate and contextually relevant answers to user queries about healthcare and insurance policies.
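
The three-step flow above can be sketched without any Azure resources by passing in stand-in functions. Everything here is hypothetical scaffolding (`answer_query`, `fake_retrieve`, `fake_generate`), not LangChain's implementation; the returned keys `result` and `source_documents` simply mirror the ones read in Step 9, and the sources are plain strings rather than document objects.

```python
def answer_query(query, retrieve, generate, k=3):
    """Retrieve up to k snippets, build one prompt from them, and ask the model."""
    docs = retrieve(query)[:k]
    context = "\n".join(docs)
    prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    return {"query": query, "result": generate(prompt), "source_documents": docs}


# Stand-ins so the flow can be run locally
def fake_retrieve(query):
    return ["Hearing aids are covered with a $100 copay."]


def fake_generate(prompt):
    return f"(model output for a prompt of {len(prompt)} characters)"


output = answer_query("Are hearing aids covered?", fake_retrieve, fake_generate)
print(output["result"])
print(output["source_documents"])
```

Keeping retrieval and generation as separate callables makes each half easy to test on its own before wiring in the real services.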

In [None]:
# Initialize the RetrievalQA chain
chain = RetrievalQA.from_chain_type(
    llm=AzureOpenAI(
        deployment_name=OPENAI_DEPLOYMENT_NAME,  # Replace with your Azure OpenAI deployment name
        model=OPENAI_DEPLOYMENT_NAME,  # Use the same model for the LLM
        temperature=0  # OPTIONAL: Set temperature to 0 for deterministic answers
    ),
    chain_type="stuff",  # Concatenates all retrieved documents into a single string
    retriever=vector_store.as_retriever(),  # Use the vector store as the retriever
    return_source_documents=True  # Include source documents in the output
)

#### Explanation of `chain_type` and `temperature`

The `chain_type` parameter determines how retrieved documents are passed to the LLM. Common options include `stuff` (concatenates all retrieved documents into a single prompt), `map_reduce` (processes each document separately and then combines the results), and `refine` (builds an answer from the first document and iteratively updates it with each additional one). `stuff` is the simplest option and uses a single LLM call, so it works well when the retrieved chunks fit within the model's context window. `map_reduce` and `refine` make more calls but can handle more retrieved text than a single prompt can hold.
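
To illustrate the difference in call patterns, here is a rough sketch of `stuff` versus `map_reduce` using a stub model that only counts calls. The prompts are simplified placeholders of my own and are not LangChain's actual templates.

```python
def stuff_answer(question, docs, llm):
    """'stuff': one model call over all documents concatenated together."""
    context = "\n\n".join(docs)
    return llm(f"Context:\n{context}\n\nQuestion: {question}\nAnswer:")


def map_reduce_answer(question, docs, llm):
    """'map_reduce': one call per document, then one call to combine the partial answers."""
    partial_answers = [
        llm(f"Context:\n{doc}\n\nQuestion: {question}\nAnswer:") for doc in docs
    ]
    combined = "\n".join(partial_answers)
    return llm(f"Partial answers:\n{combined}\n\nQuestion: {question}\nFinal answer:")


calls = []


def counting_llm(prompt):
    """Stub model that records each prompt and returns a short placeholder."""
    calls.append(prompt)
    return f"answer-{len(calls)}"


docs = ["Policy section A.", "Policy section B.", "Policy section C."]
stuff_answer("Is X covered?", docs, counting_llm)
print("stuff calls:", len(calls))
calls.clear()
map_reduce_answer("Is X covered?", docs, counting_llm)
print("map_reduce calls:", len(calls))
```

With three documents, the `stuff` pattern makes one call and the `map_reduce` pattern makes four, which is the cost trade-off to weigh against prompt size.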

The `temperature` parameter controls the randomness of the model's output. At `0`, the model gives the most consistent, near-deterministic answers; higher values produce more varied and creative responses (the OpenAI API accepts values up to `2`). For this healthcare use case, `temperature=0` is recommended so that the same question yields consistent answers grounded in the retrieved documents. Setting `temperature` in this cell is optional; if it is omitted, the library's default value is used. For more details, refer to the [LangChain Documentation](https://langchain.readthedocs.io/).
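
As a rough illustration of why lower temperatures give more consistent output, the sketch below applies the standard temperature-scaled softmax to made-up token scores. The scores and function name are assumptions for illustration; how the service handles `temperature=0` internally is not shown here, since the formula requires a positive value.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities after dividing them by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive for this formula")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # Subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


toy_logits = [2.0, 1.0, 0.5]  # Made-up scores for three candidate tokens
for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(toy_logits, t)
    print(t, [round(p, 3) for p in probs])
```

At the lowest temperature nearly all probability mass sits on the top-scoring token, while at higher temperatures the alternatives become more likely to be sampled.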

### Step 9: Test the QA System with a Query

In this step, we test the QA system by providing a sample query, such as "Will my health insurance cover hearing aids for my child?". The system retrieves relevant healthcare documents and insurance policies from the vector store and generates an accurate, context-aware answer based on the retrieved information.

In [None]:
query = "Will my health insurance cover hearing aids for my child?"

try:
    result = chain({"query": query})  # Use chain() to get both the answer and source documents

    # Print the generated answer
    print("Answer:")
    print(result['result'])  # The generated answer

    # Print the source documents
    print("\nSource Documents:")
    if 'source_documents' in result and result['source_documents']:
        for i, doc in enumerate(result['source_documents'], start=1):
            print(f"Document {i}: {doc.page_content}")
    else:
        print("No source documents were retrieved.")
except Exception as e:
    print(f"An error occurred while processing the query: {e}")

### Summary of the Healthcare Document Question-Answering System

This end-to-end solution demonstrates how to build a healthcare document question-answering system using Azure OpenAI and Azure Cognitive Search. The process begins by loading healthcare documents with `DirectoryLoader` and splitting them into smaller chunks using `RecursiveCharacterTextSplitter` to ensure efficient embedding generation and semantic accuracy. These chunks are embedded using an Azure OpenAI embedding model and stored in a vector store powered by Azure Cognitive Search, enabling semantic search capabilities. A similarity search retrieves the most relevant documents for a user query, and a `RetrievalQA` chain is created using Azure OpenAI to generate accurate, context-aware answers based on the retrieved documents. This solution effectively integrates document retrieval, embeddings, and language models to address real-world healthcare use cases.

**Note**: This project is part of my learning process and is not intended for production use.