# Strategic Leadership Copilot

## Plan

Build a copilot for my Data and AI Leadership course at McMaster that lets students query course information, including text content and case studies.

Requirements
- Chatbot functionality
- RAG on course content
- Students can send their documents to chat with
- For prompt engineering, create a system that allows me to assess their prompts

## Draft Architecture

- Frontend: Static Web App - Azure
- Backend: App Service - Azure:
  - Makes calls to GPT
  - Persists state
- Document Storage & Upload: Blob for document upload and RAG on course content.
- Chat memory: [Need to find cheapest option] Cosmos DB used in other architectures, here I 
- Key Vault: Authentication & secrets [production only; in dev we use whatever is convenient]
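
One way to keep the dev/production split for secrets explicit is a small settings loader. The sketch below is an assumption, not the actual backend: `Settings`, `load_settings`, and the `CHAT_DEPLOYMENT` variable are hypothetical names. It only reads environment variables, which works locally and also in App Service when Key Vault secrets are surfaced as app settings.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    api_key: str
    endpoint: str
    deployment: str


def load_settings(env=None) -> Settings:
    """Read Azure OpenAI settings from environment variables, failing loudly if one is missing."""
    env = os.environ if env is None else env
    required = ["AZURE_OPENAI_API_KEY", "AZURE_OPENAI_ENDPOINT"]
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing settings: {', '.join(missing)}")
    return Settings(
        api_key=env["AZURE_OPENAI_API_KEY"],
        endpoint=env["AZURE_OPENAI_ENDPOINT"],
        # CHAT_DEPLOYMENT is a hypothetical variable name; the default matches the model used in this notebook
        deployment=env.get("CHAT_DEPLOYMENT", "gpt-4o-mini"),
    )
```

Passing a plain dict as `env` makes the loader easy to exercise in tests without touching the real environment.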

### Feature List - Frontend
- Chat interface
- Upload button
- Markdown rendering

In [1]:
from openai import AzureOpenAI
from flask import Flask, request, jsonify
import json

user_message = "Hey there, how are you?"
try:

    # https://learn.microsoft.com/en-us/azure/ai-services/openai/reference#rest-api-versioning
    client = AzureOpenAI(
        # api_key is omitted on purpose: the client reads it from the AZURE_OPENAI_API_KEY
        # environment variable, which keeps the secret out of the notebook
        api_version="2024-08-01-preview",
        # https://learn.microsoft.com/en-us/azure/cognitive-services/openai/how-to/create-resource?pivots=web-portal#create-a-resource
        azure_endpoint="https://open-ai-service-class.openai.azure.com/",
    )


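    completion = client.chat.completions.create(
        model="gpt-4o-mini",  # on Azure, this is the deployment name
        messages=[
            {
                "role": "user",
                "content": user_message,
            },
        ],
    )
    # Chat completions return the reply in message.content (a chat choice has no .text attribute)
    print(completion.choices[0].message.content.strip())
except Exception as e:
    # Include the exception so failures are diagnosable
    print("Sorry, I encountered an error:", e)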


In [None]:
# https://learn.microsoft.com/en-us/azure/ai-services/openai/reference#rest-api-versioning
client = AzureOpenAI(
    # api_key is omitted on purpose: the client reads it from the AZURE_OPENAI_API_KEY
    # environment variable, which keeps the secret out of the notebook
    api_version="2024-08-01-preview",
    # https://learn.microsoft.com/en-us/azure/cognitive-services/openai/how-to/create-resource?pivots=web-portal#create-a-resource
    azure_endpoint="https://open-ai-service-class.openai.azure.com/",
)


completion = client.chat.completions.create(
    model="gpt-4o-mini",  # on Azure, this is the deployment name
    messages=[
        {
            "role": "user",
            "content": user_message,
        },
    ],
)
print(completion.choices[0].message.content)

In [7]:
import requests

start_url = "https://daiol-chatbot-c7c6bhf0cghgdtdj.canadacentral-01.azurewebsites.net/api/start"

response = requests.get(start_url)
if response.status_code == 200:
    session_token = response.json().get("token")
else:
    print(f"Failed to start a session. Status code: {response.status_code}")
    exit()

chat_url = "https://daiol-chatbot-c7c6bhf0cghgdtdj.canadacentral-01.azurewebsites.net/api/chat"

headers = {
    "Authorization": session_token
}

payload = {
    "message": "Explain to me personality theory in a few words"
}


# Make a POST request to the API
try:
    response = requests.post(chat_url, headers=headers, json=payload)

    if response.status_code == 200:
        print("Response from chatbot:", response.json()["response"])
    else:
        print(f"Failed to get a valid response. Status code: {response.status_code}")
        print("Response content:", response.text)
        
except Exception as e:
    print("An error occurred:", e)

Response from chatbot: Personality theory explores the consistent patterns in thoughts, feelings, and behaviors that characterize individuals. Key frameworks include:

1. **Trait Theory**: Focuses on specific traits (e.g., the Big Five: openness, conscientiousness, extraversion, agreeableness, neuroticism) that describe personalities.
  
2. **Psychoanalytic Theory**: Emphasizes unconscious motives and childhood experiences (e.g., Freud's id, ego, superego).
  
3. **


## Appendix A - RAG on user-uploaded documents

It's absolutely feasible to have users upload documents and then perform RAG on those documents, but there are practical considerations that often guide whether this is the best approach. Here are some reasons why RAG might not be used dynamically on user-uploaded documents, along with when it can be feasible:

### Practical Considerations for RAG on User-Uploaded Documents

1. **Document Indexing and Processing Time**:
   - RAG works best when documents are pre-processed, indexed, and stored in a format that allows fast retrieval. If a user uploads a new document every time, the system would need to index it on-the-fly, which could introduce delays.
   - The indexing process involves generating embeddings (vectors) for each chunk, which can be computationally expensive, especially for long documents. Doing this dynamically for every upload can affect the response time.

2. **Scalability and Performance**:
   - RAG systems are generally optimized for larger, static datasets where multiple queries can be handled efficiently after the data is indexed once. For dynamic, ad-hoc uploads, the system would have to repeat the indexing process for each document, which might not scale well.
   - Performance can degrade if the system needs to repeatedly perform retrieval operations on freshly uploaded, non-indexed data.

3. **Security and Data Privacy**:
   - Storing documents for indexing and retrieval might raise concerns about data privacy and security. Ensuring the safety of sensitive data can be more challenging when handling dynamic uploads.
   - When documents are static and stored securely, it’s easier to enforce robust privacy policies.

4. **Token Limitations**:
   - If RAG is not used and a large document is uploaded, the entire document might exceed the token limits of language models like GPT. Chunking and retrieval help mitigate this by only sending relevant parts to the model.
   - RAG helps with large static data, but dynamic processing would still need a way to manage how much text can be passed to the model at once.
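
The chunking step mentioned in these considerations can be sketched as follows. This is a minimal illustration under stated assumptions: `chunk_text` is a hypothetical helper, it splits on characters rather than tokens, and the default sizes are arbitrary placeholders rather than tuned values.

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into overlapping character windows so context is not cut at chunk edges."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text
        if start + chunk_size >= len(text):
            break
    return chunks
```

Each chunk would then be embedded and stored. The overlap trades some extra storage for a lower chance that a relevant sentence is split across two chunks.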

### When Dynamic RAG Makes Sense

1. **Short Documents or Limited Use Cases**:
   - If the uploaded documents are relatively short (or can be chunked into manageable pieces quickly), then on-the-fly RAG could work well. 
   - For example, if users are uploading customer support tickets or small reports, you could dynamically index these documents for RAG without too much delay.

2. **Building a Temporary Index**:
   - You can design a system to temporarily index the uploaded document, perform RAG to answer user queries, and then discard the index after a session ends. This would allow dynamic interaction without long-term storage.
   - This might be useful for applications where users need quick, document-specific answers but don’t require persistent storage.

3. **Hybrid Approach**:
   - For frequently uploaded, similar types of documents, you can use a hybrid approach where some static elements are pre-indexed, and dynamic parts are added temporarily when uploaded.
   - For example, a legal document search tool might have a database of standard contracts (pre-indexed) and allow users to add a new contract temporarily for more specific searches.
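
The temporary-index idea can be sketched in a few lines. Everything here is hypothetical: `SessionIndex` is an illustrative name, and word-count vectors with cosine similarity stand in for real embeddings purely so the example runs without an external service.

```python
import math
import re
from collections import Counter


def _vectorize(text: str) -> Counter:
    """Toy stand-in for an embedding: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class SessionIndex:
    """In-memory index that lives only as long as one chat session."""

    def __init__(self):
        self._entries = []  # list of (chunk, vector) pairs

    def add(self, chunks):
        for chunk in chunks:
            self._entries.append((chunk, _vectorize(chunk)))

    def search(self, query, k=3):
        """Return the k chunks most similar to the query."""
        q = _vectorize(query)
        scored = sorted(self._entries, key=lambda e: _cosine(q, e[1]), reverse=True)
        return [chunk for chunk, _ in scored[:k]]

    def clear(self):
        """Drop everything when the session ends, so nothing is stored long term."""
        self._entries.clear()
```

Calling `clear()` when the session token expires gives the "index, answer, discard" behaviour described above without any persistent storage.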

### Why RAG Isn't Used Dynamically in My Current Setup

In my case, the process doesn’t involve dynamically storing and indexing user-uploaded documents for several reasons:
- **Real-Time Response Needs**: The system aims to provide quick, immediate answers without the overhead of dynamically creating indices, which can introduce delays.
- **Privacy Concerns**: Storing user documents temporarily or permanently for indexing could raise privacy issues. Analyzing them directly without indexing avoids these complications.
- **Simplicity and Efficiency**: The current design focuses on efficiently understanding and responding to the provided content without building and querying a temporary index each time.

### Feasibility for Your Application

If you’re considering using RAG dynamically:
- **Optimize for Speed**: Use fast indexing methods and ensure the retriever is efficient. You may need a powerful backend that can handle embeddings and retrieval quickly.
- **Limit Document Size**: Consider placing limits on document size or chunking them so the system can work within reasonable time constraints.
- **Ensure Data Security**: Build strong encryption and privacy protocols, especially if you’re working with sensitive user data.

In summary, dynamic RAG is possible and can be effective for some use cases, but it requires careful engineering to handle indexing, retrieval speed, and data privacy. For large, frequently queried datasets, static RAG is generally more efficient and practical.

```
Prompt:

Okay, so if I'm understanding this correctly, let me explain what I'm thinking and you offer any suggestions if I'm incorrect: I'm building a course in Data, AI and Organizational Leadership and want to add a strategic leadership chatbot to it. Would it make sense to perform RAG on the course content (textbook, data, etc.) to allow the course information to be quickly and accurately retrieved, but allow students to upload docs and simply send them to you through the API? Or is there a better way?
```

Your approach makes a lot of sense! Here's a breakdown of how it could work and why it’s effective, along with some suggestions:

### 1. **Pre-Index Course Content Using RAG**
   - **Index Core Course Materials**: You can use RAG to pre-process and index all your core course content (textbooks, lecture notes, data sets, etc.). This will allow the chatbot to quickly retrieve relevant information when students ask questions. By doing this, you're essentially creating a knowledge base that the chatbot can query to provide accurate and context-specific answers.
   - **Benefits**: 
     - **Fast and Efficient**: Because the content is already indexed, retrieval will be much faster compared to processing new data dynamically each time.
     - **Scalable**: As your course grows or gets updated, you can simply re-index the new content.

### 2. **Allow Students to Upload Documents Dynamically**
   - **Handle Student Uploads Separately**: When students upload their documents (e.g., assignments, personal notes, external reading materials), you can treat this as a dynamic, on-the-fly process. Instead of indexing these documents permanently, you can send the content to the OpenAI API to analyze or answer specific queries. This way, each upload is treated independently.
   - **Benefits**:
     - **Privacy**: Since these uploads are processed dynamically, you don’t have to store or index them, which minimizes data security concerns.
     - **Flexibility**: Students can get help analyzing their unique documents without affecting the core RAG system.

### 3. **How It Would Work Together**
   - **Core Course Queries**: When students ask questions related to the course, the chatbot uses RAG to pull information from pre-indexed materials. For example, if a student asks, "What is the role of AI in strategic leadership?", the system retrieves relevant content from the course and generates an answer.
   - **Dynamic Document Queries**: If a student uploads a case study or their notes and asks a question, you would extract the text from the uploaded file and send it directly to the API for processing. This could include summarizing the document, answering questions, or analyzing specific parts.
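
The two query paths can be combined in one message builder. This is a sketch under assumptions: `build_messages` is a hypothetical function, `retrieve` is whatever callable returns course chunks for a question, and the character cap is a crude placeholder for a real token limit.

```python
def build_messages(question, retrieve, uploaded_text=None, max_chars=12000):
    """Return chat messages for either the course-RAG path or the upload path."""
    if uploaded_text is not None:
        # Upload path: pass the document text straight through, capped to guard against oversized files
        context = uploaded_text[:max_chars]
        source = "the student's uploaded document"
    else:
        # Course path: fetch pre-indexed chunks relevant to the question
        context = "\n\n".join(retrieve(question))
        source = "the course materials"
    system = f"Answer using only {source} provided below.\n\n{context}"
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]
```

The returned list can be passed as the `messages` argument of a chat completion call, so the rest of the backend does not need to know which path was taken.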

### 4. **Suggestions for Implementation**
   - **Hybrid Query System**: Build a system where the chatbot first checks if a question can be answered using the core course content (via RAG). If not, it can then offer to let the student upload a document for a more personalized answer.
   - **Chunking and Context Management**: For larger student documents, you might want to use chunking to ensure that each piece of text sent to the API stays within token limits. Maintain context between chunks if the query requires it.
   - **User Experience**: Make the process smooth for students by clearly guiding them on how to ask questions or upload documents. For example, the chatbot could say, "I can answer questions based on our course materials. If you’d like to analyze something external, please upload your document."
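
For the chunking and context management suggestion, one possible pattern is to walk through the chunks in order and carry running notes forward. The sketch below is illustrative only: `answer_over_chunks` is a hypothetical name, and `complete` is assumed to be any callable that sends a prompt to the model and returns its text.

```python
def answer_over_chunks(question, chunks, complete):
    """Process chunks in order, carrying running notes forward so context survives across chunks."""
    notes = ""
    for i, chunk in enumerate(chunks, start=1):
        prompt = (
            f"Question: {question}\n"
            f"Notes so far: {notes or '(none)'}\n"
            f"Document part {i} of {len(chunks)}:\n{chunk}\n"
            "Update the notes with anything relevant to the question."
        )
        # Each call sees only one chunk plus the notes, which keeps the request within token limits
        notes = complete(prompt)
    return complete(f"Question: {question}\nNotes: {notes}\nWrite the final answer.")
```

This costs one model call per chunk plus one for the final answer, so it suits occasional long uploads better than every query.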

### **Example Workflow**
1. **Student Query 1**: "Can you explain the concept of machine learning from the course?"
   - The chatbot retrieves pre-indexed material using RAG and provides a concise explanation.
   
2. **Student Query 2**: "Here’s a report I’m working on. Can you help summarize the main points?" (uploads a file)
   - The chatbot extracts the content, sends it to the OpenAI API for summarization, and provides a response.

### **Why This Works Well**
- **Speed and Efficiency**: Pre-indexing allows for fast retrieval, while dynamic processing gives flexibility for unique requests.
- **Scalability**: You can handle many students at once because most common queries are handled by the pre-indexed data.
- **Better User Experience**: Students get quick answers to common course-related questions and can still receive personalized help when needed.

By combining the strengths of RAG for core content and dynamic processing for uploads, you can create a robust and responsive strategic leadership chatbot.

## Appendix B - Chat memory example

The demo code below shows how the chatbot maintains conversation history. I have since updated the actual implementation.

```
from openai import OpenAI

client = OpenAI(api_key="<OPENAI_API_TOKEN>")

messages = [{"role": "system", "content": "You are a helpful math tutor."}]
user_msgs = ["Explain what pi is.", "Summarize this in two bullet points."]

for q in user_msgs:
    print("User: ", q)
    
    # Create a dictionary for the user message from q and append to messages
    user_dict = {"role": "user", "content": q}
    messages.append(user_dict)
    
    # Create the API request
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages = messages,
        max_tokens=100
    )
    
    # Convert the assistant's message to a dict and append to messages
    assistant_dict = {"role": "assistant", "content": response.choices[0].message.content}
    messages.append(assistant_dict)
    print("Assistant: ", response.choices[0].message.content, "\n")
```
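
Because the loop above appends every turn, the `messages` list grows without bound. A possible way to cap it is sketched here; `trim_history` is a hypothetical helper and the turn limit is an arbitrary assumption, not part of the updated implementation.

```python
def trim_history(messages, max_turns=5):
    """Keep system messages plus the most recent max_turns user/assistant pairs."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    keep = 2 * max_turns  # one user message and one assistant message per turn
    return system + (rest[-keep:] if keep else [])
```

Calling it right after the assistant reply is appended keeps user and assistant messages paired, for example `messages = trim_history(messages, max_turns=5)` at the end of each loop iteration.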