<a href="https://colab.research.google.com/github/penumsa/BigDL/blob/main/West_2024_Introduction_to_RAG.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Requirements

To run this notebook yourself, you'll need:

*   an OpenAI API key, stored as `OPENAI_API_KEY`
*   a Cohere API key, stored as `COHERE_API_KEY`

Set both keys in Colab secrets (see the slides for more details).



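### Install Required Libraries
In this step, we install the libraries needed for our RAG implementation:
- **langchain**: For building the retrieval and generation pipeline.
- **langchain-community**: For the FAISS vector store integration.
- **langchain-openai**: For the OpenAI embedding and chat model wrappers.
- **openai**: For using OpenAI's language models.
- **faiss-cpu**: For indexing the document embeddings.
- **PyMuPDF**: For reading PDF files.
- **tiktoken**: For tokenizing text for OpenAI models.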

In [None]:
!pip install --quiet --upgrade langchain langchain-community langchain-openai openai faiss-cpu PyMuPDF tiktoken


# Loading Stage

### Set Up OpenAI API Key
To use OpenAI's models, we need to set up our API key. The cell below reads the key stored under `OPENAI_API_KEY` in Colab secrets and exports it as an environment variable, so the key never appears in the notebook itself.


In [None]:
from google.colab import userdata
import os
import openai

# Get the API key from Colab secrets
api_key = userdata.get('OPENAI_API_KEY')
os.environ["OPENAI_API_KEY"] = api_key
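
Outside Colab, `google.colab.userdata` cannot be imported. A minimal sketch of a fallback, assuming the key may already be exported as an environment variable; the helper name `get_secret` is hypothetical and not part of the original notebook:

```python
import os


def get_secret(name):
    """Return a secret from the environment, falling back to Colab secrets."""
    # Prefer an environment variable that is already set
    value = os.environ.get(name)
    if value:
        return value
    try:
        # This import only succeeds inside Google Colab
        from google.colab import userdata
    except ImportError:
        raise RuntimeError(f"{name} is not set and Colab secrets are unavailable")
    return userdata.get(name)
```

With this helper, the assignment above could be written as `os.environ["OPENAI_API_KEY"] = get_secret("OPENAI_API_KEY")`, and the same cell would then also run on a local machine.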

# Data Loading

### Load Data from PDF Files
In this step, we load text from every PDF file in a directory using PyMuPDF. Upload your PDF files to the Colab environment (by default `/content/`), or pass a different directory to the function.
We read the text from each page of each PDF, concatenate it, and print the first 500 characters.


In [None]:
import glob
import os

import fitz  # PyMuPDF

def load_all_pdfs_in_directory(directory="/content/"):
    """Load text from all PDF files in a specified directory."""
    page_texts = []
    # Sort the paths so the files are always read in the same order;
    # os.path.join also works when the directory has no trailing slash
    pdf_paths = sorted(glob.glob(os.path.join(directory, "*.pdf")))

    for file_path in pdf_paths:
        with fitz.open(file_path) as pdf:
            for page in pdf:
                page_texts.append(page.get_text())

    return "".join(page_texts)

# Load all PDFs in /content/ directory
pdf_data = load_all_pdfs_in_directory()

# Print the first 500 characters of the loaded data
print(pdf_data[:500])

Large Language Model Agent in Financial Trading: A Survey
Han Ding∗
hd2412@columbia.edu
Columbia University
New York, NY, USA
Yinheng Li∗
yl4039@columbia.edu
Columbia University
New York, NY, USA
Junhao Wang∗
jw3668@columbia.edu
Columbia University
New York, NY, USA
Hang Chen∗
hc2798@nyu.edu
New York University
New York, NY, USA
ABSTRACT
Trading is a highly competitive task that requires a combination of
strategy, knowledge, and psychological fortitude. With the recent
success of large language 


# Indexing

### Text Splitting for Chunking
Here, we split the loaded PDF data into smaller chunks for efficient retrieval. Each chunk is at most 1000 characters long, and consecutive chunks overlap by up to 100 characters. The overlap preserves context that would otherwise be cut off at a chunk boundary.


In [None]:
# Split pdf_data into chunks of size 1000 with an overlap of 100 using the LangChain text splitter

from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=100
)
documents = text_splitter.split_text(pdf_data)
print(len(documents))

191


In [None]:
documents[0]

'Large Language Model Agent in Financial Trading: A Survey\nHan Ding∗\nhd2412@columbia.edu\nColumbia University\nNew York, NY, USA\nYinheng Li∗\nyl4039@columbia.edu\nColumbia University\nNew York, NY, USA\nJunhao Wang∗\njw3668@columbia.edu\nColumbia University\nNew York, NY, USA\nHang Chen∗\nhc2798@nyu.edu\nNew York University\nNew York, NY, USA\nABSTRACT\nTrading is a highly competitive task that requires a combination of\nstrategy, knowledge, and psychological fortitude. With the recent\nsuccess of large language models(LLMs), it is appealing to apply the\nemerging intelligence of LLM agents in this competitive arena and\nunderstanding if they can outperform professional traders. In this\nsurvey, we provide a comprehensive review of the current research\non using LLMs as agents in financial trading. We summarize the\ncommon architecture used in the agent, the data inputs, and the\nperformance of LLM trading agents in backtesting as well as the\nchallenges presented in these research.

In [None]:
documents[1]

'challenges presented in these research. This survey aims to provide\ninsights into the current state of LLM-based financial trading agents\nand outline future research directions in this field.\nCCS CONCEPTS\n• Computing methodologies →Natural language processing;\nInformation extraction; Intelligent agents.\nKEYWORDS\nLarge Language Models, Agent, Asset Management, Quantitative\nTrading\n1\nINTRODUCTION\nRecent advances in large language models (LLMs) have revolution-\nized research in natural language processing and demonstrated\nsignificant potential in powering autonomous agents [46]. LLM\nagents have been applied across various domains, such as health-\ncare [32] and education [59]. In addition, the finance sector has seen\nlots of exploration of LLM applications [23, 26]. There has been\na emerging trend of developing LLM powered agents for trading\nin financial markets. Professional traders are required to process\namount of information from various sources and quickly make'
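
The effect of `chunk_size` and `chunk_overlap` can be sketched with a plain sliding window. This is a simplified illustration, not how `RecursiveCharacterTextSplitter` works internally: the real splitter prefers to break at separators such as paragraph and line breaks, so its chunks are often shorter than the maximum. The function name `chunk_with_overlap` is hypothetical.

```python
def chunk_with_overlap(text, chunk_size=1000, chunk_overlap=100):
    """Split text into fixed-size windows that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


# Synthetic text of 2500 characters (made up for illustration)
sample_text = "".join(chr(97 + i % 26) for i in range(2500))
sample_chunks = chunk_with_overlap(sample_text)
print([len(c) for c in sample_chunks])
```

The last 100 characters of one chunk reappear as the first 100 characters of the next, which mirrors how `documents[1]` above starts with the sentence that ends `documents[0]`.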

### Creating Embeddings Using OpenAI
In this step, we initialize OpenAI embeddings and use them to convert each chunk of text into vector representations. These embeddings allow us to perform similarity searches later.


In [None]:
from langchain_openai import OpenAIEmbeddings

# Initialize OpenAI embeddings
embeddings = OpenAIEmbeddings(model="text-embedding-ada-002")

# Embed the document chunks
embedded_documents = embeddings.embed_documents(documents)


#### Why Are Embeddings Important?

Embeddings convert text into numerical vectors, allowing computers to process and understand language effectively. They’re essential for:

- **Retrieval-Augmented Generation (RAG)**: Finding relevant documents to answer questions.
- **Similarity Search**: Identifying similar content by comparing vector similarities.
- **Classification & Clustering**: Grouping content based on semantic similarity.

#### Key Embedding Techniques

1. **Word2Vec**: Maps individual words to vectors based on their context in the text but lacks deeper, phrase-level meaning.

2. **GloVe**: Uses word co-occurrence to capture word relationships, useful for word analogies but limited to single words.

3. **FastText**: Considers sub-word information, handling rare words well, suitable for multilingual tasks; however, it is still context-independent.

4. **BERT**: Transformer-based and context-aware, ideal for sentence-level tasks like question answering, but requires high computational power.

5. **Sentence-BERT**: Captures sentence-level meaning, effective for comparing phrases or sentences in tasks like semantic search.

6. **OpenAI Embeddings (e.g., `text-embedding-ada-002`)**: Context-aware embeddings served through the OpenAI API (1536 dimensions for this model), commonly used in RAG and search systems. Requires an OpenAI API key, and embedding large datasets can be costly.
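
To make "comparing vector similarities" concrete, here is a minimal sketch of cosine similarity in plain Python. The three-dimensional vectors are made up for illustration; real embeddings from `text-embedding-ada-002` have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy "embeddings" (hypothetical values, not produced by any model)
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
invoice = [0.0, 0.2, 0.9]

print("cat vs kitten: ", cosine_similarity(cat, kitten))
print("cat vs invoice:", cosine_similarity(cat, invoice))
```

Texts with related meaning should end up with vectors that point in similar directions, so the first score is higher than the second.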


In [None]:
# Show Embedding result
print("Embedding for the first document:")
for i, value in enumerate(embedded_documents[0]):
    print(f"Component {i + 1}: {value}")

Embedding for the first document:
Component 1: -0.011556433513760567
Component 2: 0.0014334290754050016
Component 3: 0.026645520702004433
Component 4: -0.03546346351504326
Component 5: -0.007640390656888485
Component 6: 0.013979998417198658
Component 7: -0.019169438630342484
Component 8: 0.041049983352422714
Component 9: -0.010810194537043571
Component 10: -0.06260190904140472
Component 11: 0.0064936475828289986
Component 12: 0.037435177713632584
Component 13: -0.010645885020494461
Component 14: 0.013562378473579884
Component 15: 0.0028908199165016413
Component 16: -0.0018108274089172482
Component 17: 0.023427793756127357
Component 18: -0.02171623520553112
Component 19: 0.0002875415957532823
Component 20: -0.013528146781027317
Component 21: -0.001966579118743539
Component 22: 0.011665972881019115
Component 23: -0.011932975612580776
Component 24: -0.011843974702060223
Component 25: 5.4208219808060676e-05
Component 26: 0.008242858573794365
Component 27: 0.03762686997652054
Component 28: 

# Storing

Here, we index the document embeddings using FAISS, which allows for quick retrieval based on similarity. The FAISS index is saved for later use.


In [None]:
from langchain_community.vectorstores import FAISS

# Create the FAISS index from the text chunks
# (from_texts embeds the chunks itself using the embeddings object)
faiss_index = FAISS.from_texts(texts=documents, embedding=embeddings)

# Save the FAISS index for future use
faiss_index.save_local("faiss_index")

In [None]:
print(f"Number of documents indexed: {len(documents)}")
print("Sample vector from first document:", faiss_index.index.reconstruct(0))

Number of documents indexed: 191
Sample vector from first document: [-0.01155643  0.00143343  0.02664552 ... -0.00979695  0.00748977
 -0.02616628]


### Building the RAG Pipeline (RetrievalQA)
In this step, we create a Retrieval-Augmented Generation (RAG) pipeline. We initialize an OpenAI model and set up a retriever using the FAISS index. This allows us to query the PDF data. Here, we run a sample query to get an answer based on the content of the PDF.

### Loading the FAISS VectorStore

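The following line loads a FAISS vector store from a local file. `allow_dangerous_deserialization=True` is required because part of the saved store is a pickle file; enable it only for index files you created yourself.

```python
vectorstore = FAISS.load_local("/content/faiss_index", embeddings, allow_dangerous_deserialization=True)
```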


In [None]:
# Load the FAISS VectorStore
vectorstore = FAISS.load_local("/content/faiss_index", embeddings, allow_dangerous_deserialization=True)

# Display some basic information about the vectorstore
print("VectorStore loaded successfully!")
print(f"Number of vectors in the VectorStore: {vectorstore.index.ntotal}")
print(f"Dimensionality of the vectors: {vectorstore.index.d}")

VectorStore loaded successfully!
Number of vectors in the VectorStore: 191
Dimensionality of the vectors: 1536
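
The numbers above give a quick estimate of how much memory the raw vectors need. This is a back-of-the-envelope sketch that assumes the index stores each component as a 4-byte float and ignores the chunk text saved alongside it; the function name is hypothetical.

```python
def flat_index_bytes(num_vectors, dim, bytes_per_value=4):
    """Approximate size of the raw vectors, assuming 4-byte floats."""
    return num_vectors * dim * bytes_per_value


size = flat_index_bytes(191, 1536)
print(f"{size} bytes, about {size / 1024**2:.2f} MiB")
```

For 191 vectors of 1536 dimensions this comes to 1,173,504 bytes, a little over 1 MiB, which is why a flat index is perfectly adequate at this scale.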


### Creating the Retriever from the FAISS Index

This line creates a retriever that allows us to search the FAISS index for similar documents based on a query. The `search_type="similarity"` option specifies that we want to retrieve the top `k` (in this case, 6) most similar documents to the input query.

```python
# Create the retriever from the FAISS index
retriever = vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": 6})
```


Similarity search compares the embedding of the query with the embeddings of the documents stored in the vector store. The query and every document are represented as high-dimensional vectors, and the retriever scores each pair with a metric such as cosine similarity or Euclidean distance. The 6 documents with the smallest distances (or highest similarity scores) are returned, so the results are the chunks closest in meaning to the query.
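
The top-`k` search described above can be sketched as a brute-force scan. This is an illustration only: it uses Euclidean distance (assumed here to match the default of the FAISS wrapper), two-dimensional toy vectors, and hypothetical function names.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k(query_vector, doc_vectors, k=6):
    """Return (index, distance) pairs for the k closest document vectors."""
    scored = [
        (i, euclidean_distance(query_vector, vector))
        for i, vector in enumerate(doc_vectors)
    ]
    scored.sort(key=lambda pair: pair[1])  # smallest distance first
    return scored[:k]


# Toy document vectors (made up for illustration)
doc_vectors = [[0.0, 1.0], [1.0, 0.0], [0.9, 0.1], [0.5, 0.5]]
nearest = top_k([1.0, 0.0], doc_vectors, k=2)
print(nearest)
```

A real retriever performs the same ranking, but the query vector comes from the embedding model and the scan is handled by the FAISS index.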

In [None]:
# Helper function for printing docs
def pretty_print_docs(docs):
    print(
        f"\n{'-' * 100}\n".join(
            [f"Document {i+1}:\n\n" + d.page_content for i, d in enumerate(docs)]
        )
    )

In [None]:
! pip install --quiet langchain-cohere


In [None]:
vectorstore = FAISS.load_local("/content/faiss_index", embeddings, allow_dangerous_deserialization=True)

# Create base retriever using similarity search
# (k=10 here, replacing the k=6 retriever described above)
retriever = vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": 10})

In [None]:
query = "What are these documents about?"
docs = retriever.invoke(query)
pretty_print_docs(docs)

Document 1:

cial traders. We found all LLM powered agents reviewed in this
paper use textual financial data as input. Based on the terminology
commonly used in the financial industry, we categorize textual data
into two types: Fundamental Data and Alternative Data.
3.2.1
Fundamental Data. Fundamental data encompasses infor-
mation that represents the primary characteristics and financial
metrics for assessing the stability and health of an asset. Funda-
mental data used in LLM trading agents includes financial reports
and analyst reports.
Financial Reports. Financial reports, such as Form 10-Q and Form
10-K filings, are critical for understanding a company’s performance.
These documents provide LLM agents with insights into corporate
financial status, performance, and future expectations. They are
extensively utilized by financial trading agents like FinMem [53],
TradingGPT [27], and FinAgent [57]. These works incorporate fi-
nancial reports to enrich the agents’ memory and make infor

### ReRanker


#### Cohere Reranker

- **Cohere Reranker Initialization**: The `CohereRerank` model (`rerank-english-v3.0`) is set to select the top 3 documents most relevant to the user query.
- **Contextual Compression Retriever**: A `ContextualCompressionRetriever` is created, combining the Cohere reranker with the existing retriever to refine retrieval results.
- **Query and Reranking Process**: When a query like "What are topics covered?" is sent, the base retriever fetches its candidates and the Cohere reranker reorders them, keeping only the most relevant documents.

The reranker prioritizes highly relevant results, which can improve response accuracy in RAG by filtering out less useful documents and focusing on the best matches.

In [None]:
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_cohere import CohereRerank
from langchain.retrievers.contextual_compression import ContextualCompressionRetriever
import cohere

cohere_api_key = userdata.get('COHERE_API_KEY')
cohere_client = cohere.Client(cohere_api_key)

# Initialize Cohere reranker. select the top 3 documents most relevant to the user query.
cohere_reranker = CohereRerank(client=cohere_client, model="rerank-english-v3.0", top_n=3)

# Create Contextual Compression Retriever with Cohere reranker
compression_retriever = ContextualCompressionRetriever(
    base_compressor=cohere_reranker,
    base_retriever=retriever
)

# Test the reranking retriever with a query
# (invoke replaces the deprecated get_relevant_documents)
query = "What are topics covered?"
reranked_docs = compression_retriever.invoke(query)

# Display results
pretty_print_docs(reranked_docs)


Document 1:

Askell, Peter Welinder, Paul Francis Christiano,
Jan Leike, and Ryan J. Lowe. 2022. Training lan-
guage models to follow instructions with human
feedback. ArXiv.
Soujanya Poria, E. Cambria, Devamanyu Haz-
arika, Navonil Majumder, Amir Zadeh, and Louis-
Philippe Morency. 2017. Context-dependent sen-
timent analysis in user-generated videos. In An-
nual Meeting of the Association for Computa-
tional Linguistics.
Yu Qin and Yi Yang. 2019.
What you say and
how you say it matters: Predicting stock volatility
using verbal and vocal cues. In Annual Meeting
of the Association for Computational Linguistics.
Ramit Sawhney, Mihir Goyal, Prakhar Goel, Puneet
Mathur, and Rajiv Ratn Shah. 2021. Multimodal
multi-speaker merger & acquisition financial mod-
eling: A new task, dataset, and neural baselines.
In Annual Meeting of the Association for Compu-
tational Linguistics.
Ramit Sawhney, Puneet Mathur, Ayush Mangal,
Piyush Khanna, Rajiv Ratn Shah, and Roger
------------------------------
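
The retrieve-then-rerank pattern can be sketched without any API calls. In this illustration a simple word-overlap score stands in for the Cohere model, which scores relevance very differently; both function names and the candidate strings are hypothetical.

```python
def overlap_score(query, doc):
    """Fraction of query words that also appear in the document (toy scorer)."""
    query_terms = set(query.lower().split())
    doc_terms = set(doc.lower().split())
    if not query_terms:
        return 0.0
    return len(query_terms & doc_terms) / len(query_terms)


def retrieve_then_rerank(query, candidates, top_n=3, score_fn=overlap_score):
    """Reorder first-stage candidates by score_fn and keep the best top_n."""
    ranked = sorted(candidates, key=lambda doc: score_fn(query, doc), reverse=True)
    return ranked[:top_n]


# Stand-ins for the chunks returned by the base retriever
candidates = [
    "references and acknowledgements",
    "topics covered include trading agents and data sources",
    "large language models in finance",
    "topics for future research",
]
top = retrieve_then_rerank("what topics are covered", candidates, top_n=2)
print(top)
```

The design point is the two stages: a cheap vector search casts a wide net (`k=10` above), and a more careful scorer narrows it down (`top_n=3` above) before anything reaches the language model.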

### Interacting & Data Query
Use this part to chat with the data. One way to keep the model from repeating itself is to pass its past responses back to it. The chat loop below collects each response in `previous_responses`, but does not yet feed them back into the prompt.

In [None]:
#Import necessary libraries
from langchain.chains import RetrievalQA
from langchain_openai import ChatOpenAI

# Initialize the OpenAI chat model
llm = ChatOpenAI(model="gpt-4-0613", temperature=0.5, max_tokens=300)

# Build the RetrievalQA chain
qa_chain = RetrievalQA.from_chain_type(llm, retriever=compression_retriever)
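
Conceptually, a retrieval QA chain of this kind fetches the relevant chunks and places their text into a single prompt for the model. The sketch below shows that assembly step, plus one way earlier answers could be passed back to discourage repetition. The function `build_prompt` and its wording are hypothetical; this is not the template LangChain uses internally.

```python
def build_prompt(question, retrieved_chunks, previous_answers=()):
    """Combine retrieved text, optional history, and the question into one prompt."""
    context = "\n\n".join(retrieved_chunks)
    parts = [
        "Use the following context to answer the question.",
        f"Context:\n{context}",
    ]
    if previous_answers:
        # Listing earlier answers lets the model see what it already said
        history = "\n".join(f"- {answer}" for answer in previous_answers)
        parts.append(f"Earlier answers (do not repeat them):\n{history}")
    parts.append(f"Question: {question}")
    return "\n\n".join(parts)


example_prompt = build_prompt(
    "What data do the agents use?",
    ["Chunk about fundamental data.", "Chunk about alternative data."],
    previous_answers=["The survey covers LLM trading agents."],
)
print(example_prompt)
```

Keeping the question separate from the instructions also means the question alone can be used for retrieval, while the formatting instructions only affect generation.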

In [None]:
user_query = "What are topics covered?"

In [None]:
prompt = (
    "You are an expert assistant with a strong grasp of the subject matter. "
    "Please answer the following question succinctly, highlighting the key points. "
    f"Format your response as follows:\n\n"
    f"[Your answer here]\n"
    f"Key Points:\n"
    f"- Point 1: [Key insight 1]\n"
    f"- Point 2: [Key insight 2]\n"
    f"- Point 3: [Key insight 3]\n\n"
    f"Ensure your response is relevant and avoid unnecessary elaboration. "
    f"Answer the following question: '{user_query}'"
)

# Get the response using the prompt
response = qa_chain.invoke(prompt)

# Display the formatted response
print(response['result'])

The topics covered include the limitations of large language models (LLMs), ethical concerns related to their use, and their performance in the medical field.

Key Points:
- Point 1: The limitations of LLMs are discussed, including their 'black box' issues, which refer to the unclear methods they use to generate answers from input queries and data. Efforts to overcome these limitations include explainable AI research and development.
- Point 2: Ethical concerns related to the use of LLMs are highlighted. These include the risk of dangerous or offensive responses, privacy and security breaches, and lack of accountability for model outputs. Solutions suggested include finetuning the models, establishing governance systems, and creating a reporting system for users.
- Point 3: The performance of AI in the medical field is also discussed. While AI has shown potential, such as in the case of ChatGPT which has been preferred over doctors in terms of response quality and empathy, it also has 

In [None]:
from IPython.display import display, HTML
display(HTML("<style>pre { white-space: pre-wrap; }</style>"))  # wrap long lines in the output


# Create a function to handle user queries
def chat_with_assistant():
    print("Chat with the assistant! Type 'exit' to end the chat.")

    previous_responses = []  # Keep track of previous responses

    while True:
        # Get user input
        user_query = input("You: ")

        # Check if the user wants to exit
        if user_query.lower() == 'exit':
            print("Ending chat.")
            break

        # Create a structured prompt for the model
        prompt = (
            f"You are a knowledgeable assistant with expert-level insight into the content. "
            f"Based on the retrieved document information, please provide a concise and focused answer to the question: "
            f"'{user_query}'. Highlight key insights or examples, ensuring no repetition of prior responses."
        )

        # Get the response using the prompt
        response = qa_chain.invoke(prompt)

        # Format and enhance the response
        enhanced_response = (
            f"{response['result']}\n\n"
        )

        # Print the assistant's response
        print(enhanced_response)

        # Optional: Store the response to help refine future interactions
        previous_responses.append(response)

# Start the chat
chat_with_assistant()

Chat with the assistant! Type 'exit' to end the chat.
You: what are the titles of the documents
The document does not provide specific titles for multiple documents. However, it does mention two studies: 

1. "Practice With the Applied Knowledge Test: Observational Study Demonstrating Opportunities and Limitations in Primary Care. JMIR Medical Education 9, e46599 (2023)."
2. "Comparing Physician and Artificial Intelligence Chatbot Responses to Patient Questions Posted to a Public Social Media Forum. JAMA Internal Medicine (2023) doi:10.1001/jamainternmed.2023.1838."

It also refers to several preprints and articles, but their titles are not explicitly stated.


You: who are the authors of Modal-adaptive Knowledge-enhanced Graph-based Financial Prediction from Monetary Policy Conference Calls with LLM
The document does not provide specific names of the authors for the paper titled 'Modal-adaptive Knowledge-enhanced Graph-based Financial Prediction from Monetary Policy Conference Calls w

KeyboardInterrupt: Interrupted by user

### Evaluation
In this step, we evaluate the responses generated by the model. For each test question, we compare the generated answer with a hand-written expected answer and compute the Jaccard similarity between their word sets (after removing English stop words). The average score gives a rough indication of how well the model understood and answered the queries based on the PDF content.


In [None]:
from sklearn.metrics import jaccard_score
from sklearn.feature_extraction.text import CountVectorizer
import numpy as np

# Define sample questions and expected answers for evaluation
evaluation_data = [
    {"question": "What is the main topic of the document?", "expected_answer": "The central theme of the document revolves around the applications and challenges of Large Language Models (LLMs) within the healthcare sector. It elaborates on how LLMs can enhance operational efficiency by automating tasks such as synthesizing clinical data, generating tailored content for diverse stakeholders, and improving communication. Additionally, the document addresses significant challenges associated with LLMs, such as their inherent opacity in decision-making processes, the risk of generating harmful or biased outputs, and concerns regarding data privacy and ethical implications. The document also suggests strategies to mitigate these challenges, including refining model training, implementing robust regulatory frameworks, and developing feedback mechanisms to ensure accountability."},
    {"question": "How does machine learning help in healthcare?", "expected_answer": "Machine learning plays a vital role in healthcare by analyzing vast datasets to uncover trends and make informed predictions. It can be trained through expert guidance to enhance its performance and reliability. Additionally, it can be tailored for specific applications with validation from healthcare professionals to ensure high accuracy and safety in its outputs. However, there are ongoing challenges, such as the generation of inaccurate information and the potential for errors, which necessitate diligent oversight, particularly in critical healthcare scenarios. As machine learning technologies advance and achieve levels of precision comparable to human specialists, indicators of uncertainty and the involvement of clinicians will remain crucial. Looking ahead, machine learning has the potential to operate in semi-autonomous capacities within the healthcare field."},
    # Add more questions and expected answers as needed
]

# Convert text to Jaccard similarity score
def jaccard_similarity(generated, expected):
    # Convert sentences to tokenized sets
    vectorizer = CountVectorizer(binary=True, stop_words="english")
    tokens = vectorizer.fit_transform([generated, expected])
    tokens = tokens.toarray()
    return jaccard_score(tokens[0], tokens[1], average='binary')

# Evaluation function
def evaluate_assistant(qa_chain, evaluation_data):
    scores = []

    for item in evaluation_data:
        question = item["question"]
        expected_answer = item["expected_answer"]

        # Get the response from the assistant
        prompt = (
            "You are an expert assistant with a strong grasp of the subject matter. "
            "Please answer the following question succinctly, highlighting the key points: "
            f"'{question}'. Ensure that your response is relevant and avoid unnecessary elaboration."
        )

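        # The f-string above already contains the question, so no .format() call is needed
        # (calling .format() would fail on questions that contain braces)
        response = qa_chain.invoke(prompt)["result"]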

        # Calculate Jaccard similarity between expected and generated answers
        score = jaccard_similarity(response, expected_answer)
        scores.append(score)

        # Display question, responses, and score
        print(f"Question: {question}")
        print(f"Generated Answer: {response}")
        print(f"Expected Answer: {expected_answer}")
        print(f"Similarity Score: {score}\n")

    # Average similarity score
    average_score = np.mean(scores)
    print(f"Average Similarity Score across questions: {average_score:.2f}")

# Run the evaluation
evaluate_assistant(qa_chain, evaluation_data)
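
For intuition about the score, Jaccard similarity is the size of the intersection of two word sets divided by the size of their union. The sketch below is a simplified version that splits on whitespace and keeps stop words, so its numbers will differ from the `CountVectorizer`-based function above; the name `jaccard_words` and the example sentences are hypothetical.

```python
def jaccard_words(text_a, text_b):
    """Jaccard similarity of the lowercase word sets of two texts."""
    set_a = set(text_a.lower().split())
    set_b = set(text_b.lower().split())
    union = set_a | set_b
    if not union:
        return 0.0
    return len(set_a & set_b) / len(union)


# 2 shared words out of 6 distinct words in total
score = jaccard_words("llm agents trade stocks", "llm agents read reports")
print(f"{score:.2f}")
```

Because the metric only counts shared words, a correct answer that paraphrases the expected text can still score low, so the average is best read as a rough signal rather than an accuracy figure.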