#### **3. Response Generation Task:**
* This notebook implements a response generation task using **Retrieval-Augmented Generation (RAG)**. The pipeline runs on a locally installed LLM and text embedding model served through Ollama: **llama3.1:8b** as the LLM and **mxbai-embed-large** as the text embedding model.

* **Step 1:** To build the RAG-based response generation system, I used a fraction (0.5%) of the `filtered_enron_emails.csv` dataset as the knowledge base. I chose only 0.5% of the data due to memory and computation constraints.

* **Step 2:** I kept only email bodies of at least 500 characters (863 documents) and chunked them into segments of up to 500 characters with an overlap of 50 characters, which produced 7,600 chunks.

* **Step 3:** These chunks were converted into vector embeddings using the `mxbai-embed-large` model (which outputs 1024-dimensional vectors) and stored in a FAISS index (`IndexFlatL2`) that is built in memory and then saved to disk.

* **Step 4:** For a given query, the most similar chunks were retrieved from the FAISS index and passed, together with the query, in a prompt to the LLM `llama3.1:8b`, which generated the response. All steps are demonstrated in this notebook.

In [None]:
import pandas as pd

In [None]:
# read the data
df = pd.read_csv('../data/filtered_enron_emails.csv')
sampled_df = df.sample(frac=0.005, random_state=47).reset_index(drop=True)

In [3]:
# Filter emails where body length >= 500 characters
filtered_docs = sampled_df['body'].dropna()
filtered_docs = filtered_docs[filtered_docs.str.len() >= 500].tolist()

# Now filtered_docs contains each email body (≥ 500 chars) as a single document
print(f"Total documents with len >= 500: {len(filtered_docs)}")

Total documents with len >= 500: 863


In [None]:
# Create chunks using RecursiveCharacterTextSplitter
from langchain.text_splitter import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,
    chunk_overlap=50,  # slight overlap preserves context
    separators=["\n\n", "\n", ".", " ", ""],  # smart fallback if no newlines
)

docs = splitter.create_documents(filtered_docs)
chunks = [doc.page_content for doc in docs]

In [5]:
len(chunks)

7600
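
To make the roles of `chunk_size` and `chunk_overlap` concrete, here is a simplified fixed-window sketch in plain Python. It is not what `RecursiveCharacterTextSplitter` does internally: the real splitter prefers to cut at the listed separators, so its chunks are usually shorter than 500 characters and its boundaries differ. The function name `sliding_window_chunks` is hypothetical and only illustrates how consecutive chunks share their edges.

```python
def sliding_window_chunks(text, chunk_size=500, chunk_overlap=50):
    """Split text into fixed-size character windows that overlap by chunk_overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # each window starts this far after the previous one
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


# A synthetic 1200-character document
sample = "".join(str(i % 10) for i in range(1200))
windows = sliding_window_chunks(sample)
print([len(w) for w in windows])
```

The last 50 characters of one window are the first 50 characters of the next, which is the context that the overlap is meant to preserve.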

In [None]:
# Here, I'm generating embeddings for each text chunk concurrently
# with a thread pool. The work is I/O-bound (HTTP calls to the local
# Ollama server), so threads keep several embedding requests in flight
# at the same time.

from concurrent.futures import ThreadPoolExecutor, as_completed
from tqdm import tqdm
import requests

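def get_ollama_embedding(text, model="mxbai-embed-large"):
    # Request an embedding for `text` from the local Ollama server
    response = requests.post(
        "http://localhost:11434/api/embeddings",
        json={"model": model, "prompt": text}
    )
    response.raise_for_status()  # surface HTTP errors instead of a KeyError below
    return response.json()["embedding"]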

def embed_chunks_parallel(chunks, max_workers=10):
    embeddings = [None] * len(chunks)
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = {
            executor.submit(get_ollama_embedding, chunk): idx
            for idx, chunk in enumerate(chunks)
        }
        for future in tqdm(as_completed(futures), total=len(chunks), desc="Embedding chunks"):
            idx = futures[future]
            try:
                embeddings[idx] = future.result()
            except Exception as e:
                print(f"❌ Chunk {idx} failed: {e}")
    return embeddings
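
One consequence of the error handling above is that a failed chunk leaves `None` in the returned list, which would break the later conversion to a `float32` array. A minimal sketch of one way to handle this, assuming failed chunks can simply be dropped (the helper name `drop_failed_embeddings` is hypothetical and not part of the original notebook):

```python
def drop_failed_embeddings(chunks, embeddings):
    """Keep only the (chunk, embedding) pairs whose embedding is not None."""
    kept = [(c, e) for c, e in zip(chunks, embeddings) if e is not None]
    failed = [i for i, e in enumerate(embeddings) if e is None]
    kept_chunks = [c for c, _ in kept]
    kept_embeddings = [e for _, e in kept]
    return kept_chunks, kept_embeddings, failed


# Toy data: the second chunk failed to embed
demo_chunks = ["a", "b", "c"]
demo_embeddings = [[0.1, 0.2], None, [0.3, 0.4]]
kept_chunks, kept_embeddings, failed = drop_failed_embeddings(demo_chunks, demo_embeddings)
print(kept_chunks, failed)
```

Filtering both lists together keeps chunk `i` aligned with embedding `i`, which the index-to-document mapping depends on. Retrying the indices in `failed` would be an alternative to dropping them.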

In [7]:
from langchain.embeddings.base import Embeddings

class OllamaEmbeddingFunction(Embeddings):
    def embed_documents(self, texts):
        embeddings = []
        for text in tqdm(texts, desc="Embedding via Ollama"):
            response = requests.post(
                "http://localhost:11434/api/embeddings",
                json={"model": "mxbai-embed-large", "prompt": text}
            )
            embeddings.append(response.json()["embedding"])
        return embeddings

    def embed_query(self, text):
        response = requests.post(
            "http://localhost:11434/api/embeddings",
            json={"model": "mxbai-embed-large", "prompt": text}
        )
        return response.json()["embedding"]

In [8]:
from langchain.vectorstores import FAISS
from langchain.docstore.document import Document

# Step 1: Create documents
documents = [Document(page_content=chunk) for chunk in chunks]

# Step 2: Embed in parallel
# Adjust workers for your CPU/GPU
chunk_embeddings = embed_chunks_parallel(chunks, max_workers=10)

Embedding chunks: 100%|██████████| 7600/7600 [32:10<00:00,  3.94it/s]  


In [None]:
# Initialize an instance of OllamaEmbeddingFunction
embedding_func = OllamaEmbeddingFunction()

In [None]:
from langchain.vectorstores.faiss import FAISS
from langchain.docstore.in_memory import InMemoryDocstore
from langchain.docstore.document import Document
import numpy as np
import faiss 

# Step 1: Convert to float32 numpy array
embedding_vectors = np.array(chunk_embeddings).astype("float32")

# Step 2: Create FAISS index
dimension = embedding_vectors.shape[1]  # 1024 dimensions for mxbai-embed-large
faiss_index = faiss.IndexFlatL2(dimension)
faiss_index.add(embedding_vectors)

# Step 3: Wrap documents
docstore = InMemoryDocstore(dict(enumerate(documents)))
index_to_docstore_id = {i: i for i in range(len(documents))}

# Step 4: Create the vectorstore
vectorstore = FAISS(
    index=faiss_index,
    docstore=docstore,
    index_to_docstore_id=index_to_docstore_id,
    embedding_function=embedding_func
)

# Step 5: Save the index
vectorstore.save_local("../faiss_index")

In [22]:
embedding_vectors.shape

(7600, 1024)
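
`IndexFlatL2` performs an exhaustive search: it compares the query against every stored vector and ranks them by squared Euclidean distance. The pure-Python sketch below shows that idea on toy 2-dimensional vectors. It is an illustration only, not how FAISS is implemented, and `flat_l2_search` is a hypothetical name.

```python
def flat_l2_search(vectors, query, k=3):
    """Return the k (squared_distance, index) pairs closest to the query."""
    scored = []
    for idx, vec in enumerate(vectors):
        # Squared L2 distance between the stored vector and the query
        dist = sum((a - b) ** 2 for a, b in zip(vec, query))
        scored.append((dist, idx))
    scored.sort()  # smallest distance first
    return scored[:k]


toy_vectors = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
nearest = flat_l2_search(toy_vectors, [0.9, 0.1], k=2)
print(nearest)
```

With 7,600 vectors of 1,024 dimensions, this kind of brute-force scan is still cheap, which is why a flat index is a reasonable choice at this scale.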

In [None]:
# Load the previously saved FAISS index from disk so we can use it for searching (inference).
from langchain.vectorstores import FAISS 

faiss_index = FAISS.load_local(
    folder_path="../faiss_index",
    embeddings=embedding_func,
    allow_dangerous_deserialization=True  # safe only if file is trusted
)


In [None]:
# Function to retrieve similar docs based on a query
def retrieve_similar_docs(query, faiss_index, top_k=3):
    query_embedding = get_ollama_embedding(query)
    results = faiss_index.similarity_search_by_vector(query_embedding, k=top_k)
    return results  


##### **Inference:**

In [28]:
query = "Generate an email requesting project status update from a colleague"
similar_docs = retrieve_similar_docs(query, faiss_index)

context = "\n---\n".join([doc.page_content for doc in similar_docs])

In [32]:
context

"peter this email will confirm the site visit by your company at the three plants\n---\n. thus, it is very important that i hear from you. thank you much kim  forwarded by kim nguyenewcenron on 06242002 1006 am  jeff duff 06212002 0622 pm to kim nguyenewcenronenron cc ronald brzezinskiewcenronenron, kevin cousineauewcenronenron, joe chapmanewcenronenron, clemens wstedeveloptwtdetwtde, markus altenschultedeveloptwtdetwtde subject re autodownload tool kim, first, hollis will be out of the office until further notice. i'll be coordinating his tasks for the time being\n---\n. please confirm to me that hal is fully dedicated to this project now and will continue to be until it is completed. i would actually like his to sit with our group at least until the project is complete. i would also like a breakdown of what the other it people listed in the elaboration document will be responsible for, and discuss the 65 day estimation with you. thanks shona"

In [41]:
def generate_answer(query, context):
    prompt = f"""
    You are a helpful assistant. Use the context below to answer or generate the requested email.

    Context:
    {context}

    Instruction:
    {query}

    Response:
    """

    response = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "llama3.1:8b", "prompt": prompt, "stream": False}
    )
    
    return response.json()["response"]
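
Because the f-string above is indented inside the function, every template line in the prompt starts with four spaces. A possible alternative, sketched under my own assumptions, is to assemble the prompt from a list of lines and cap the context length. The helper `build_prompt` and the `max_context_chars` limit are hypothetical and not part of the original notebook.

```python
def build_prompt(query, context_chunks, max_context_chars=3000):
    """Assemble the prompt without stray indentation and with a bounded context."""
    # Join retrieved chunks with the same separator used in the notebook
    context = "\n---\n".join(context_chunks)[:max_context_chars]
    lines = [
        "You are a helpful assistant. Use the context below to answer "
        "or generate the requested email.",
        "",
        "Context:",
        context,
        "",
        "Instruction:",
        query,
        "",
        "Response:",
    ]
    return "\n".join(lines)


demo_prompt = build_prompt("Write a short reminder email.", ["chunk one", "chunk two"])
print(demo_prompt)
```

The resulting string can be sent as the `prompt` field of the same `/api/generate` request. Keeping prompt construction in its own function also makes it easy to inspect exactly what the model receives.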

In [42]:
query = "Generate an email requesting project status update from a colleague"
similar_docs = retrieve_similar_docs(query, faiss_index)

context = "\n---\n".join([doc.page_content for doc in similar_docs])

response = generate_answer(query, context)

In [44]:
print(response)

Here is an email to Peter confirming the site visit and requesting a project status update:

Subject: Confirmation of Site Visit and Project Update Request

Dear Peter,

I hope this email finds you well. I wanted to confirm that your company's site visit at our three plants has been scheduled, as previously discussed.

However, I also wanted to touch base with you regarding the ongoing project. As per Kim's earlier request, could you please confirm to me that Hal is fully dedicated to this project and will continue to be involved until its completion? Additionally, would it be possible for him to sit with our group at least until the project is finished?

Furthermore, I would appreciate an update on the roles and responsibilities of the other IT personnel listed in the elaboration document. Could you also provide me with a breakdown of what each team member will be responsible for during the project?

Lastly, I'd like to discuss the 65-day estimation for this project. Could we schedule

In [45]:
from IPython.display import Markdown, display

In [46]:
display(Markdown(response))

Here is an email to Peter confirming the site visit and requesting a project status update:

Subject: Confirmation of Site Visit and Project Update Request

Dear Peter,

I hope this email finds you well. I wanted to confirm that your company's site visit at our three plants has been scheduled, as previously discussed.

However, I also wanted to touch base with you regarding the ongoing project. As per Kim's earlier request, could you please confirm to me that Hal is fully dedicated to this project and will continue to be involved until its completion? Additionally, would it be possible for him to sit with our group at least until the project is finished?

Furthermore, I would appreciate an update on the roles and responsibilities of the other IT personnel listed in the elaboration document. Could you also provide me with a breakdown of what each team member will be responsible for during the project?

Lastly, I'd like to discuss the 65-day estimation for this project. Could we schedule a meeting or call at your earliest convenience to review the project timeline and any potential roadblocks?

Looking forward to hearing back from you.

Best regards,

[Your Name]

In [47]:
query = "Generate an email to remind a team member about an upcoming task deadline."
similar_docs = retrieve_similar_docs(query, faiss_index)

context = "\n---\n".join([doc.page_content for doc in similar_docs])

response = generate_answer(query, context)

In [48]:

print(response)

Here's a generated email:

Subject: Reminder: Update Your Address Book and Complete Timesheet by Monday, 4pm

Dear [Team Member],

I wanted to follow up on the memo from Constance Charles regarding the rollout of SAP. As mentioned in the memo, it's essential that we update our address book and complete our timesheets as soon as possible.

Could you please make sure to update your address book by providing us with any changes (yes/no) and let us know if you have access to a shared calendar (if yes, which one)? This will help us ensure a smooth transition during the migration process.

Additionally, please don't forget to complete your timesheet for the current period. You can input your time online at [http://ehronline.enron.com](http://ehronline.enron.com), but I'll also continue to email you reminders regarding this task.

Please confirm by responding to this email that you've updated your address book and completed your timesheet by Monday, 4pm. If there are any issues or concerns, f

In [49]:
print(context)

responsible for updating your address book no if yes, who do you have access to a shared calendar no if yes, which shared calendar do you have any distribution groups that messaging maintains for you for mass mailings no if yes, please list here please list all notes databases applications that you currently use in our efforts to plan the exact datetime of your migration, we also will need to know what are your normal work hours from 815 to 600 will you be out of the office in the near future
---
fyi. i sent this monday 4pm if you hear any rumblings
---
.  with the rollout of sap, you have the ability to go online  httpehronline.enron.com and input your time. i will continue to email for timesheets regardless if you go online. this memo is to inform you to complete your timesheet. quick reminder you may receive your present or previous paychecks  eb 3539a if delivery is not setup to your locationmail stop. thank you for your cooperation constance charles human resources associate  anal

In [50]:
display(Markdown(response))

Here's a generated email:

Subject: Reminder: Update Your Address Book and Complete Timesheet by Monday, 4pm

Dear [Team Member],

I wanted to follow up on the memo from Constance Charles regarding the rollout of SAP. As mentioned in the memo, it's essential that we update our address book and complete our timesheets as soon as possible.

Could you please make sure to update your address book by providing us with any changes (yes/no) and let us know if you have access to a shared calendar (if yes, which one)? This will help us ensure a smooth transition during the migration process.

Additionally, please don't forget to complete your timesheet for the current period. You can input your time online at [http://ehronline.enron.com](http://ehronline.enron.com), but I'll also continue to email you reminders regarding this task.

Please confirm by responding to this email that you've updated your address book and completed your timesheet by Monday, 4pm. If there are any issues or concerns, feel free to reach out to me directly.

Best regards,

[Your Name]

In [55]:
query = "Write a mail to urgent meet-up for the entire tech team for some project update."
similar_docs = retrieve_similar_docs(query, faiss_index)

context = "\n---\n".join([doc.page_content for doc in similar_docs])

response = generate_answer(query, context)

In [56]:
print(context)

.bdc6c63.271797aaboundary xmailer windows aol sub 123 i have attached a complete team list, indicating who will be at the site visit. please give me a call if you have any questions. susan flanagan  team list  ext.doc
---
gentlemenwe had agreed recently that we would get together with steve to discuss our crisis management and crisis communication efforts. i am starting to get my ducks in a row and thought we should move this along. i have asked stacy to set up this meeting when everyone can make it. i talked briefly with steve today about what he expects from each of us, and have written the attached in an effort to delineate the major items
---
. the effort of network design and construction currently under way is unprecedented in terms of its scope and complexity and it is important for technical people, who often have highly specialized technical skills, to understand the broad picture. i would appreciate if you could join us for friday afternoon april 28 and saturday april 29. i u

In [57]:
print(response)

Here is an email to the entire tech team with an urgent meet-up request:

Subject: Urgent: Project Update Meet-Up this Friday and Saturday

Dear Team,

I hope this message finds you well. As we approach a critical phase of our ongoing network design and construction project, it's essential that we gather together to discuss its progress and address any challenges that may have arisen.

To ensure everyone is on the same page and aware of the broad picture, I'm requesting your presence at an urgent meet-up this Friday afternoon (April 28) and Saturday (April 29). Your input and insights are crucial in shaping the direction of our project.

Please make every effort to join us on both days. If you have any conflicts or concerns, kindly let me know as soon as possible so we can work out an alternative arrangement.

The meet-up will provide a platform for Steve to share his expectations from each team member and for us to discuss our crisis management and crisis communication efforts.

Date:

In [58]:
display(Markdown(response))

Here is an email to the entire tech team with an urgent meet-up request:

Subject: Urgent: Project Update Meet-Up this Friday and Saturday

Dear Team,

I hope this message finds you well. As we approach a critical phase of our ongoing network design and construction project, it's essential that we gather together to discuss its progress and address any challenges that may have arisen.

To ensure everyone is on the same page and aware of the broad picture, I'm requesting your presence at an urgent meet-up this Friday afternoon (April 28) and Saturday (April 29). Your input and insights are crucial in shaping the direction of our project.

Please make every effort to join us on both days. If you have any conflicts or concerns, kindly let me know as soon as possible so we can work out an alternative arrangement.

The meet-up will provide a platform for Steve to share his expectations from each team member and for us to discuss our crisis management and crisis communication efforts.

Date: Friday, April 28 (afternoon) and Saturday, April 29
Time: To be confirmed soon

Looking forward to seeing you all there. If you have any questions or need further information, please don't hesitate to reach out to me directly.

Best regards,

Susan Flanagan

#### **Observations:**

* The responses generated by the LLM are contextually aligned with the user queries and stay relevant to the content retrieved from the FAISS index.

* The generated answers carry over key details from the retrieved chunks, such as timing, email subjects, and important individuals.

* However, the quality of the responses is limited in sophistication, likely due to the small size of the underlying knowledge base.
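
The observations above are qualitative. As a rough way to put a number on "contextual alignment", one could measure how many of the response's words also occur in the retrieved context. The sketch below is a crude lexical proxy that I am assuming for illustration, not an evaluation used in this notebook; `context_overlap` is a hypothetical helper.

```python
import re


def context_overlap(response, context):
    """Fraction of distinct response words (4+ letters) that also occur in the context."""
    def words(text):
        return set(re.findall(r"[a-z]{4,}", text.lower()))

    response_words = words(response)
    if not response_words:
        return 0.0
    return len(response_words & words(context)) / len(response_words)


demo_context = "please confirm that hal is fully dedicated to this project"
demo_response = "Could you confirm that Hal is dedicated to the project?"
print(round(context_overlap(demo_response, demo_context), 2))
```

A high score only shows that the response reuses vocabulary from the context. It does not show that details from different emails were combined correctly.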

#### **Final Thoughts and Future Work:**

* The current implementation relies on an in-memory FAISS index (persisted to local disk) and locally installed LLM and text embedding models, which work well for small-scale testing and development.

* To make the solution scalable, it can be extended with a NoSQL or dedicated vector database for storing the embeddings, and with high-end GPU machines to speed up embedding generation.

* By integrating more advanced LLMs and embedding models, we can enable real-time response generation suitable for production-level applications.

* APIs can be developed and deployed to serve various business use cases. As a proof of concept, a local API has already been built in this setup.