# Evaluating RAG

The extent to which you can **evaluate** your system is the extent to which you can **improve** your system. Before going to production, it is therefore in your best interest to establish a framework for quickly and effectively assessing the quality of your RAG application. In this notebook, we will use the RAGAS framework, as proposed in [this paper](https://arxiv.org/pdf/2309.15217), to evaluate the RAG application developed in the previous examples.

There is no substitute for reading the paper, but the main metrics we will work with are summarized below. Note: many more metrics can be used depending on the use case, but these are the main ones covered in the paper, so we will start there.

# Quality metric breakdown

The 3 quality metrics in the RAGAS framework are: **faithfulness**, **answer relevance**, and **context relevance**. Let's take a moment to define each and understand how we can arrive at their values.

## Faithfulness

An answer to a question can be said to be "faithful" if the **claims** that are made in the answer **can be inferred** from the **context**.

The process for quantifying this score is as follows:

1. Use the following prompt with an LLM to generate shorter, more focused statements from the question and answer.

    > Given a question and answer, create one
    > or more statements from each sentence
    > in the given answer.
    > question: [question]
    > answer: [answer]

2. For each generated statement, verify if it can be inferred from the context with the following prompt.

    > Consider the given context and following
    > statements, then determine whether they
    > are supported by the information present
    > in the context. Provide a brief explanation
    > for each statement before arriving
    > at the verdict (Yes/No). Provide a final
    > verdict for each statement in order at the
    > end in the given format. Do not deviate
    > from the specified format.
    > statement: [statement 1]
    > ...
    > statement: [statement n]

3. Calculate the final score: Faithfulness = (number of supported statements) / (total number of statements).
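
The ratio in step 3 can be sketched in a few lines of Python. This is a minimal illustration, not the RAGAS implementation: `faithfulness_score` is a hypothetical helper, and it assumes the LLM verdicts from step 2 have already been parsed into a list of "Yes"/"No" strings.

```python
def faithfulness_score(verdicts):
    """Fraction of statements judged as supported by the context.

    `verdicts` holds one "Yes"/"No" string per statement (assumed to be
    already parsed out of the LLM response from step 2).
    """
    if not verdicts:
        raise ValueError("at least one statement is required")
    supported = sum(1 for v in verdicts if v.strip().lower() == "yes")
    return supported / len(verdicts)


# Hypothetical verdicts for an answer that was split into four statements.
example_verdicts = ["Yes", "Yes", "No", "Yes"]
print(faithfulness_score(example_verdicts))  # 0.75
```

A score of 1.0 means every statement in the answer could be inferred from the retrieved context.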

## Answer Relevance

Intuitively, an answer is relevant if it directly addresses the question.

The process for quantifying this score is:

1. Use an LLM to generate n "hypothetical" questions for a given answer with the following prompt:

    > Generate a question for the given answer.
    > answer: [answer]

2. Embed the generated "hypothetical" questions as vectors.
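3. Calculate the cosine similarity between the original question and each of the n hypothetical questions, then average those similarities.

Expressed computationally: `Answer Relevance = sum(cos_sim(q, q_i) for q_i in generated_questions) / n`, where `q` is the embedding of the original question and `n` is the number of generated questions.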
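
As a rough sketch of the averaging step, the snippet below computes the score with plain Python lists standing in for embeddings. The function names and the toy 2-dimensional vectors are assumptions for illustration; a real run would use vectors from an embedding model.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def answer_relevance(question_vec, generated_question_vecs):
    """Mean cosine similarity between the original question and each
    hypothetical question generated from the answer."""
    if not generated_question_vecs:
        raise ValueError("at least one generated question is required")
    sims = [cosine_similarity(question_vec, g) for g in generated_question_vecs]
    return sum(sims) / len(sims)


# Toy vectors: one generated question matches the original exactly,
# the other is orthogonal to it.
q = [1.0, 0.0]
generated = [[1.0, 0.0], [0.0, 1.0]]
print(answer_relevance(q, generated))  # 0.5
```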

## Context Relevance

"The context is considered relevant to the extent that it exclusively contains information that is needed to answer the question."

The process:

1. Use the following LLM prompt to extract a subset of sentences necessary to answer the question. The context is defined as the formatted search result from the vector database.

    > Please extract relevant sentences from
    > the provided context that can potentially
    > help answer the following `{question}`. If no
    > relevant sentences are found, or if you
    > believe the question cannot be answered
    > from the given context, return the phrase
    > "Insufficient Information". While extracting
    > candidate sentences you’re not allowed to
    > make any changes to sentences
    > from given `{context}`.

2. Compute the context relevance score = (number of extracted sentences) / (total number of sentences in context)
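
The ratio in step 2 can be sketched as follows. This is only an illustration under stated assumptions: the sentence splitter is a naive regex (real sentence segmentation is harder), `context_relevance` is a hypothetical helper, and an empty list of extracted sentences stands in for the "Insufficient Information" response.

```python
import re


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def context_relevance(extracted_sentences, context):
    """Share of context sentences that the LLM extracted as relevant.

    Pass an empty list when the LLM answered "Insufficient Information".
    """
    total = len(split_sentences(context))
    if total == 0:
        return 0.0
    return len(extracted_sentences) / total


# Hypothetical context of four sentences, one of which is relevant.
context = (
    "Apple reported record revenue. The call took place in November. "
    "Analysts asked several questions. The weather was mild."
)
extracted = ["Apple reported record revenue."]
print(context_relevance(extracted, context))  # 0.25
```

A low score therefore signals that most of the retrieved text was not needed to answer the question.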

# Let's start coding!

## Step 1: load data

If you just finished the other examples this may already be done for you.


In [3]:
import os
from redisvl.index import SearchIndex
from redisvl.schema import IndexSchema
from redis import Redis

# init Redis connection
# Replace values below with your own if using Redis Cloud instance
REDIS_HOST = os.getenv("REDIS_HOST", "localhost")
REDIS_PORT = os.getenv("REDIS_PORT", "6379")
REDIS_PASSWORD = os.getenv("REDIS_PASSWORD", "")

# If SSL/TLS is enabled on the endpoint, use rediss:// as the URL prefix
REDIS_URL = f"redis://{REDIS_HOST}:{REDIS_PORT}"
os.environ["REDIS_URL"] = REDIS_URL

index_name = 'langchain'
prefix = 'chunk'
schema = IndexSchema.from_yaml('sec_index.yaml')
client = Redis.from_url(REDIS_URL)
# create an index from schema and the client
index = SearchIndex(schema, client)
index.create(overwrite=True, drop=True)

16:19:41 redisvl.index.index INFO   Index already exists, overwriting.
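
The cell above reads `REDIS_PASSWORD` but does not include it in the URL. The sketch below shows one way to assemble a URL that accounts for a password and TLS, assuming the standard `redis://` and `rediss://` URL schemes. `build_redis_url` is a hypothetical helper and not part of the notebook or of any library.

```python
from urllib.parse import quote


def build_redis_url(host, port, password="", use_ssl=False):
    """Assemble a Redis connection URL.

    `rediss://` (double s) selects a TLS connection; the password, if any,
    is percent-encoded so special characters do not break the URL.
    """
    scheme = "rediss" if use_ssl else "redis"
    auth = f":{quote(password, safe='')}@" if password else ""
    return f"{scheme}://{auth}{host}:{port}"


print(build_redis_url("localhost", "6379"))  # redis://localhost:6379
```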


In [2]:
# check index was created properly
!rvl index info -i langchain

13:24:03 [RedisVL] INFO   Using Redis address from environment variable, REDIS_URL


Index Information:
╭──────────────┬────────────────┬────────────┬─────────────────┬────────────╮
│ Index Name   │ Storage Type   │ Prefixes   │ Index Options   │   Indexing │
├──────────────┼────────────────┼────────────┼─────────────────┼────────────┤
│ langchain    │ HASH           │ ['chunk']  │ []              │          0 │
╰──────────────┴────────────────┴────────────┴─────────────────┴────────────╯
Index Fields:
╭────────────────┬────────────────┬─────────┬────────────────┬────────────────┬────────────────┬────────────────┬────────────────┬────────────────┬─────────────────┬────────────────╮
│ Name           │ Attribute      │ Type    │ Field Option   │ Option Value   │ Field Option   │ Option Value   │ Field Option   │   Option Value │ Field Option    │ Option Value   │
├────────────────┼────────────────┼─────────┼────────────────┼────────────────┼────────────────┼─

# Load the data

In [4]:
# configure env
import json
import os
import warnings
warnings.filterwarnings("ignore")
dir_path = os.getcwd()
parent_directory = os.path.dirname(dir_path)
os.environ["TOKENIZERS_PARALLELISM"] = "false"
os.environ["ROOT_DIR"] = parent_directory
# print(dir_path)
# print(parent_directory)

# set the cache location for locally downloaded sentence transformer models
os.environ["TRANSFORMERS_CACHE"] = f"{parent_directory}/models"

In [4]:
from langchain.embeddings.sentence_transformer import SentenceTransformerEmbeddings 
from ingestion import get_sec_data
from ingestion import redis_bulk_upload

embeddings = SentenceTransformerEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2", cache_folder=os.getenv("TRANSFORMERS_CACHE", f"{parent_directory}/models"))
sec_data = get_sec_data()
chunks = redis_bulk_upload(sec_data, index, embeddings, tickers=['AAPL', 'AMZN'])

/Users/robert.shelton/Documents/boa/financial-vss/multi_doc_RAG
/Users/robert.shelton/Documents/boa/financial-vss
 ✅ Loaded doc info for  110 tickers...
✅ Loaded 108 10K chunks for ticker=AAPL from AAPL-2021-10K.pdf
✅ Loaded 94 10K chunks for ticker=AAPL from AAPL-2023-10K.pdf
✅ Loaded 103 10K chunks for ticker=AAPL from AAPL-2022-10K.pdf
✅ Loaded 27 earning_call chunks for ticker=AAPL from 2018-May-01-AAPL.txt
✅ Loaded 31 earning_call chunks for ticker=AAPL from 2019-Oct-30-AAPL.txt
✅ Loaded 30 earning_call chunks for ticker=AAPL from 2016-Jan-26-AAPL.txt
✅ Loaded 31 earning_call chunks for ticker=AAPL from 2020-Jul-30-AAPL.txt
✅ Loaded 30 earning_call chunks for ticker=AAPL from 2017-Aug-01-AAPL.txt
✅ Loaded 29 earning_call chunks for ticker=AAPL from 2020-Jan-28-AAPL.txt
✅ Loaded 34 earning_call chunks for ticker=AAPL from 2016-Apr-26-AAPL.txt
✅ Loaded 29 earning_call chunks for ticker=AAPL from 2017-Jan-31-AAPL.txt
✅ Loaded 28 earning_call chunks for ticker=AAPL from 2019-Apr-30-AA

# Populate index and create vector store

In [9]:
from langchain_community.vectorstores import Redis as LangChainRedis
from utils import create_langchain_schemas_from_redis_schema

index_name = 'langchain'

vec_schema , main_schema = create_langchain_schemas_from_redis_schema('sec_index.yaml')

rds = LangChainRedis.from_existing_index( embedding=embeddings, 
                                          index_name= index_name, 
                                          schema = main_schema)

In [None]:
rds.similarity_search("What was apples revenue last year?")

## Step 2: set up RAG

In [27]:
from langchain_community.llms import Ollama
llm = Ollama(model="llama3")

In [7]:
# Optional: use OpenAI instead of Ollama to get unblocked quickly
# (this may change later)
# import openai
# import os
# import getpass
# from langchain.llms import OpenAI


# CHAT_MODEL = "gpt-3.5-turbo-0125"

# if "OPENAI_API_KEY" not in os.environ:
#     os.environ["OPENAI_API_KEY"] = getpass.getpass("OPENAI_API_KEY")


# llm = OpenAI(openai_api_key=os.getenv("OPENAI_API_KEY"))

In [13]:
def get_prompt():
    """Create the QA chain."""
    from langchain.prompts import PromptTemplate

    # Define our prompt
    prompt_template = """Use the following pieces of context from financial 10k filings data to answer the user question at the end. Only use the result from tools and evidence provided to you. If you don't know the answer, say that you don't know, don't try to make up an answer. Provide the source of the document that you used to get the answer.

    This should be in the following format:

    Question: [question here]
    Answer: [answer here]
    Source: [source document here]

    Begin!

    Context:
    ---------
    {context}
    ---------
    Question: {question}
    Answer:"""

    prompt = PromptTemplate(
        template=prompt_template,
        input_variables=["context", "question"]
    )
    return prompt

In [36]:
from langchain.chains import RetrievalQA

def get_search_kwargs(filters, distance_threshold):
    return {"distance_threshold":distance_threshold,"filter":filters}
    

qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=rds.as_retriever(search_type="similarity_distance_threshold",
                               search_kwargs={"distance_threshold":0.8, 'include_metadata': True}),
    return_source_documents=True,
    chain_type_kwargs={"prompt": get_prompt()},
    verbose=True
)

## Test it out

In [37]:
query = "What was Apple's revenue last year compared to this year??"
res=qa(query)
res



> Entering new RetrievalQA chain...

> Finished chain.


{'query': "What was Apple's revenue last year compared to this year??",
 'result': "Question: What was Apple's revenue last year compared to this year??\nAnswer: In fiscal year '18, our revenue grew by $36.4 billion.\nSource: https://www.sec.gov/Archives/edgar/data/99662/000119312018061311/d551511d10k.htm (Apple Inc.'s 10-K filing for 2018)",
 'source_documents': [Document(page_content="revenue and earnings in Apple's history. In fiscal year '18, our revenue grew by $36.4 billion. That's the equivalent of a Fortune 100 company in a single year. And we're capping all that off with our best September quarter ever. Revenue was $62.9 billion, ahead of our expectations. That's an increase of 20% over last year and our highest growth rate in 3 years. We also generated record Q4", metadata={'id': 'chunk:2018-Nov-01-AAPL.txt-c7764e65-9866-4559-a1b7-8baa72e6b733', 'chunk_id': '2018-Nov-01-AAPL.txt-c7764e65-9866-4559-a1b7-8baa72e6b733', 'source_doc': '2018-Nov-01-AAPL.txt', 'doc_type': 'earning_

In [37]:
res

{'query': "What was Apple's revenue last year compared to this year??",
 'result': "Question: What was Apple's revenue last year compared to this year??\nAnswer: According to the text, in fiscal year '18, Apple's revenue grew by $36.4 billion. This means that their revenue increased from the previous year.\n\nSource: Apple's 10-K filing for FY2018 (not provided, but based on the text)",
 'source_documents': [Document(page_content="Thank you, Nancy. Good afternoon, everyone, and thanks for joining us. I just got back from Brooklyn, where we marked our fourth major launch at the end of the year. In addition to being a great time, it put an exclamation point at the end of a remarkable fiscal 2018. This year, we shipped our 2 billionth iOS device, celebrated the 10th anniversary of the App Store and achieved the strongest revenue and earnings in Apple's history. In fiscal year '18, our revenue grew by $36.4 billion. That's the equivalent of a Fortune 100 company in a single year. And we're

# Now let's generate a test data set

We will use the convenient `TestsetGenerator` class from the ragas package to help us quickly get started evaluating our apps. The generator takes our documents as input, uses a generator LLM to create plausible questions from slices of context, and uses a critic LLM to extract the ground truth those questions will be measured against.

Note: this methodology has been shown to be effective when creating one's own human-labeled test set is not practical.

In [32]:
flattened_chunks = [item for sublist in chunks for item in sublist]
assert len(flattened_chunks) > len(chunks)
flattened_chunks_sample = flattened_chunks[:100] + flattened_chunks[-100:]

In [34]:
flattened_chunks_sample[0].page_content

'UNITED STATES SECURITIES AND EXCHANGE COMMISSION Washington, D.C. 20549\n\nFORM 10-K\n\n(Mark One)\n\n☒ ANNUAL REPORT PURSUANT TO SECTION 13 OR 15(d) OF THE SECURITIES EXCHANGE ACT OF 1934 For the fiscal year ended September 25, 2021 or ☐ TRANSITION REPORT PURSUANT TO SECTION 13 OR 15(d) OF THE SECURITIES EXCHANGE ACT OF 1934\n\nFor the transition period from to .\n\nCommission File Number: 001-36743\n\nApple Inc.\n\n(Exact name of Registrant as specified in its charter)\n\nCalifornia (State or other jurisdiction of incorporation or organization)\n\n94-2404110 (I.R.S. Employer Identification No.)\n\nOne Apple Park Way Cupertino, California (Address of principal executive offices)\n\n95014 (Zip Code)\n\n(408) 996-1010 (Registrant’s telephone number, including area code)\n\nSecurities registered pursuant to Section 12(b) of the Act:\n\nTitle of each class Common Stock, $0.00001 par value per share\n\n1.000% Notes due 2022 1.375% Notes due 2024 0.000% Notes due 2025 0.875% Notes due 2025

In [38]:
# we will use the testset generator from RAGAS to begin our evaluation

from ragas.testset.generator import TestsetGenerator
from ragas.testset.evolutions import simple, reasoning, multi_context
from langchain_openai import OpenAIEmbeddings


# generator and critic use a local Ollama model; embeddings use OpenAI
generator_llm = Ollama(model="llama3")
critic_llm = Ollama(model="llama3")
embeddings = OpenAIEmbeddings()

generator = TestsetGenerator.from_langchain(
    generator_llm,
    critic_llm,
    embeddings,
)


testset_sample = generator.generate_with_langchain_docs(flattened_chunks_sample, test_size=10, distributions={simple: 0.5, reasoning: 0.25, multi_context: 0.25})



embedding nodes:   0%|          | 0/400 [00:00<?, ?it/s]

Filename and doc_id are the same for all nodes.


Generating:   0%|          | 0/10 [00:00<?, ?it/s]

In [48]:
testset_sample.to_pandas()

| | question | contexts | ground_truth | evolution_type | metadata | episode_done |
|---|---|---|---|---|---|---|
| 0 | What is the purpose of incorporating portions ... | [Portions of the Registrant’s definitive proxy... | The purpose of incorporating portions of the R... | simple | [{'source': '/Users/robert.shelton/Documents/b... | True |
| 1 | What factors should users review before making... | [THE INFORMATION CONTAINED IN EVENT TRANSCRIPT... | The applicable company's conference call itsel... | simple | [{'source': '/Users/robert.shelton/Documents/b... | True |
| 2 | How does competition in the market affect the ... | [The Company’s ability to compete successfully... | Competition in the market can affect the suppl... | simple | [{'source': '/Users/robert.shelton/Documents/b... | True |
| 3 | What is the purpose of the Investor Relations ... | [---------------------------------------------... | The purpose of the Investor Relations departme... | simple | [{'source': '/Users/robert.shelton/Documents/b... | True |
| 4 | What is Amazon's approach to investment levels... | [---------------------------------------------... | We continue to invest heavily on behalf of cus... | simple | [{'source': '/Users/robert.shelton/Documents/b... | True |
| 5 | How does the determination of the discount rat... | [The discount rate related to the Company’s le... | The discount rate for lease liabilities is gen... | reasoning | [{'source': '/Users/robert.shelton/Documents/b... | True |
| 6 | What intellectual property rights does the Com... | [The Company currently holds a broad collectio... | The Company holds a broad collection of intell... | reasoning | [{'source': '/Users/robert.shelton/Documents/b... | True |
| 7 | What are the risks and consequences of using a... | [The Company and its global supply chain are e... | Losses or unauthorized access to or releases o... | multi_context | [{'source': '/Users/robert.shelton/Documents/b... | True |
| 8 | What are the obligations and requirements for ... | [In addition to the risks generally relating t... | Financial data, such as payment card data, is ... | multi_context | [{'source': '/Users/robert.shelton/Documents/b... | True |
| 9 | How do trade restrictions and tariffs impact t... | [The Company has a large, global business, and... | Trade restrictions and tariffs can impact the ... | multi_context | [{'source': '/Users/robert.shelton/Documents/b... | True |


In [35]:
# define reusable helper function for evaluating our test set against different chains

from datasets import Dataset
from ragas.metrics import (
    answer_relevancy,
    faithfulness,
    context_relevancy,
)

from ragas import evaluate

def parse_contexts(source_docs):
    return [doc.page_content for doc in source_docs]

def create_evaluation_dataset(chain, testset_df):
    res_set = {
        "question": [],
        "answer": [],
        "contexts": [],
        "ground_truth": []
    }

    for test in testset_df.iterrows():
        query = test[1]["question"]
        result = chain(query)

        res_set["question"].append(query)
        res_set["answer"].append(result["result"])
        res_set["contexts"].append(parse_contexts(result["source_documents"]))
        res_set["ground_truth"].append(test[1]["ground_truth"])
    return Dataset.from_dict(res_set)

def evaluate_chain(chain, testset_df, test_name):
    eval_dataset = create_evaluation_dataset(chain, testset_df)

    eval_result = evaluate(
        eval_dataset,
        metrics=[
            faithfulness,
            answer_relevancy,
            context_relevancy
        ],
    )

    eval_df = eval_result.to_pandas()
    eval_df.to_csv(f"{test_name}.csv")
    return eval_df

In [43]:
basic_rag_eval = evaluate_chain(qa, testset_sample.to_pandas(), "basic_rag_eval")



> Entering new RetrievalQA chain...

> Finished chain.


> Entering new RetrievalQA chain...

> Finished chain.


> Entering new RetrievalQA chain...

> Finished chain.


> Entering new RetrievalQA chain...

> Finished chain.


> Entering new RetrievalQA chain...

> Finished chain.


> Entering new RetrievalQA chain...

> Finished chain.


> Entering new RetrievalQA chain...

> Finished chain.


> Entering new RetrievalQA chain...

> Finished chain.


> Entering new RetrievalQA chain...

> Finished chain.


> Entering new RetrievalQA chain...

> Finished chain.


Evaluating:   0%|          | 0/30 [00:00<?, ?it/s]

In [47]:
basic_rag_eval.describe()

| | faithfulness | answer_relevancy | context_relevancy |
|---|---|---|---|
| count | 10.0 | 10.0 | 10.0 |
| mean | 0.865 | 0.988599 | 0.045145 |
| std | 0.253914 | 0.036051 | 0.039301 |
| min | 0.25 | 0.885996 | 0.005236 |
| 25% | 0.85 | 0.999999 | 0.016296 |
| 50% | 1.0 | 1.0 | 0.034722 |
| 75% | 1.0 | 1.0 | 0.068509 |
| max | 1.0 | 1.0 | 0.130435 |


# Analysis 

From the above we can see that we did reasonably well on faithfulness (mean 0.865) and answer relevancy (mean 0.989), but our context relevancy is very low (mean 0.045). This means we are passing a lot of unnecessary context to our LLM. One way we could improve this is to use a Parent Document Retriever to help fine-tune our input data.
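
One simple way to make this kind of reading repeatable is to flag any metric whose mean falls below a cutoff. The sketch below uses the mean scores from the `describe()` output; the 0.5 threshold and the `weak_metrics` helper are arbitrary assumptions for illustration, not a RAGAS recommendation.

```python
# Mean scores copied from the describe() output of the basic RAG evaluation.
mean_scores = {
    "faithfulness": 0.865,
    "answer_relevancy": 0.988599,
    "context_relevancy": 0.045145,
}

# Arbitrary cutoff chosen for illustration only.
THRESHOLD = 0.5


def weak_metrics(scores, threshold):
    """Return the names of metrics whose score is below the threshold."""
    return sorted(name for name, value in scores.items() if value < threshold)


print(weak_metrics(mean_scores, THRESHOLD))  # ['context_relevancy']
```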

To test this, we will set up a similar RAG system but replace the retriever we used earlier.
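
Before wiring up the real retriever, it may help to see the idea in isolation: small child chunks are what get searched, but the larger parent chunk they came from is what gets returned. The sketch below is a toy model of that idea only. It uses fixed-width character splitting and substring matching in place of a text splitter and vector search, and every function name in it is hypothetical.

```python
def split_fixed(text, size):
    """Split text into consecutive pieces of at most `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def build_parent_child_index(documents, parent_size, child_size):
    """Return (parents, children): parents maps id -> parent text,
    children is a list of (child_text, parent_id) pairs."""
    parents = {}
    children = []
    for doc in documents:
        for parent_text in split_fixed(doc, parent_size):
            parent_id = len(parents)
            parents[parent_id] = parent_text
            for child_text in split_fixed(parent_text, child_size):
                children.append((child_text, parent_id))
    return parents, children


def retrieve_parents(query, parents, children):
    """Toy retrieval: match the query against child chunks (a stand-in
    for vector search), then return the de-duplicated parent chunks."""
    matched_ids = []
    for child_text, parent_id in children:
        if query.lower() in child_text.lower() and parent_id not in matched_ids:
            matched_ids.append(parent_id)
    return [parents[pid] for pid in matched_ids]


docs = ["alpha beta gamma delta epsilon zeta"]
parents, children = build_parent_child_index(docs, parent_size=18, child_size=6)
print(retrieve_parents("beta", parents, children))
```

The match happens on a 6-character child, yet the caller receives the full 18-character parent that contains it.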

In [1]:
from langchain.embeddings.huggingface import HuggingFaceEmbeddings
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")



In [51]:
from langchain_community.vectorstores import Redis as LangChainRedis
from utils import create_langchain_schemas_from_redis_schema

index_name = 'parent_doc_sec'
schema = IndexSchema.from_yaml('parent_doc_sec.yaml')
client = Redis.from_url(REDIS_URL)
# create an index from schema and the client
index = SearchIndex(schema, client)
index.create(overwrite=True, drop=True)

vec_schema , main_schema = create_langchain_schemas_from_redis_schema('parent_doc_sec.yaml')

rds = LangChainRedis.from_existing_index( embedding=embeddings, 
                                          index_name= index_name, 
                                          schema = main_schema)

11:23:14 redisvl.index.index INFO   Index already exists, overwriting.


In [6]:
from ingestion import redis_bulk_upload

/Users/robert.shelton/Documents/boa/financial-vss/multi_doc_RAG
/Users/robert.shelton/Documents/boa/financial-vss


In [52]:
main_schema

{'vector': [{'name': 'text_embedding',
   'algorithm': 'FLAT',
   'dims': 384,
   'distance_metric': 'COSINE',
   'datatype': 'FLOAT32'}],
 'text': [{'name': 'content'}],
 'tag': [{'name': 'chunk_id'},
  {'name': 'source_doc'},
  {'name': 'doc_type'},
  {'name': 'ticker'},
  {'name': 'company_name'},
  {'name': 'sector'},
  {'name': 'asset_class'},
  {'name': 'location'},
  {'name': 'exchange'},
  {'name': 'currency'}],
 'numeric': [{'name': 'market_value'},
  {'name': 'weight'},
  {'name': 'notional_value'},
  {'name': 'shares'},
  {'name': 'price'}],
 'content_vector_key': 'text_embedding'}

In [25]:
from ingestion import get_sec_data
from ingestion import redis_bulk_upload
from langchain_text_splitters import RecursiveCharacterTextSplitter

child_chunk_size=400
parent_chunk_size=2000

# This text splitter is used to create the parent documents
parent_splitter = RecursiveCharacterTextSplitter(chunk_size=parent_chunk_size)
# This text splitter is used to create the child documents
# It should create documents smaller than the parent
child_splitter = RecursiveCharacterTextSplitter(chunk_size=child_chunk_size)

sec_data = get_sec_data()
child_chunks = redis_bulk_upload(sec_data, index, embeddings, chunk_size=child_chunk_size, tickers=['AAPL', 'AMZN'])


 ✅ Loaded doc info for  110 tickers...
✅ Loaded 779 10K chunks for ticker=AAPL from AAPL-2021-10K.pdf
✅ Loaded 689 10K chunks for ticker=AAPL from AAPL-2023-10K.pdf
✅ Loaded 748 10K chunks for ticker=AAPL from AAPL-2022-10K.pdf
✅ Loaded 155 earning_call chunks for ticker=AAPL from 2018-May-01-AAPL.txt
✅ Loaded 193 earning_call chunks for ticker=AAPL from 2019-Oct-30-AAPL.txt
✅ Loaded 200 earning_call chunks for ticker=AAPL from 2016-Jan-26-AAPL.txt
✅ Loaded 196 earning_call chunks for ticker=AAPL from 2020-Jul-30-AAPL.txt
✅ Loaded 176 earning_call chunks for ticker=AAPL from 2017-Aug-01-AAPL.txt
✅ Loaded 179 earning_call chunks for ticker=AAPL from 2020-Jan-28-AAPL.txt
✅ Loaded 209 earning_call chunks for ticker=AAPL from 2016-Apr-26-AAPL.txt
✅ Loaded 177 earning_call chunks for ticker=AAPL from 2017-Jan-31-AAPL.txt
✅ Loaded 176 earning_call chunks for ticker=AAPL from 2019-Apr-30-AAPL.txt
✅ Loaded 155 earning_call chunks for ticker=AAPL from 2017-Nov-02-AAPL.txt
✅ Loaded 189 earning_c

In [54]:
flattened_chunks = [item for sublist in child_chunks for item in sublist]

In [60]:
from langchain.retrievers import ParentDocumentRetriever
from langchain.storage import InMemoryStore, LocalFileStore
from langchain.storage._lc_store import create_kv_docstore

# The storage layer for the parent documents
# store = InMemoryStore()
fs = LocalFileStore("./store_location")
store = create_kv_docstore(fs)

parent_doc_vector_store = LangChainRedis.from_documents(
    documents=flattened_chunks,
    embedding=embeddings,
    index_name=index_name,
    redis_url=REDIS_URL,
    index_schema=main_schema,
)

parent_doc_retriever = ParentDocumentRetriever(
    vectorstore=parent_doc_vector_store,
    docstore=store,
    child_splitter=child_splitter,
    parent_splitter=parent_splitter,
)

parent_doc_qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=parent_doc_vector_store.as_retriever(
        search_type="similarity_distance_threshold",
        search_kwargs={"distance_threshold":0.5}
    ),
    return_source_documents=True,
    chain_type_kwargs={"prompt": get_prompt()},
    verbose=True
)

TypeError: LocalFileStore.__init__() missing 1 required positional argument: 'root_path'

In [59]:
parent_doc_vector_store.as_retriever(
    search_type="similarity_distance_threshold", 
    search_kwargs={"distance_threshold":0.5}
).similarity_search("What was Apple's revenue last year compared to this year??")

AttributeError: 'RedisVectorStoreRetriever' object has no attribute 'similarity_search'

In [47]:
from langchain.retrievers import ParentDocumentRetriever

# The storage layer for the parent documents
store = InMemoryStore()

# construct the vector store class from texts and metadata
vector_store = LangChainRedis.from_documents(
    documents=flattened_chunks,
    embedding=embeddings,
    index_name=index_name,
    redis_url=REDIS_URL,
    index_schema=main_schema,
)

parent_doc_retriever = ParentDocumentRetriever(
    vectorstore=vector_store,
    docstore=store,
    child_splitter=child_splitter,
    parent_splitter=parent_splitter,
    return_source_documents=True,
)

parent_doc_retriever.add_documents(flattened_chunks)

`index_schema` does not match generated metadata schema.
If you meant to manually override the schema, please ignore this message.
index_schema: {'vector': [{'name': 'text_embedding', 'algorithm': 'FLAT', 'dims': 384, 'distance_metric': 'COSINE', 'datatype': 'FLOAT32'}], 'text': [{'name': 'content'}], 'tag': [{'name': 'chunk_id'}, {'name': 'source_doc'}, {'name': 'doc_type'}, {'name': 'ticker'}, {'name': 'company_name'}, {'name': 'sector'}, {'name': 'asset_class'}, {'name': 'location'}, {'name': 'exchange'}, {'name': 'currency'}], 'numeric': [{'name': 'market_value'}, {'name': 'weight'}, {'name': 'notional_value'}, {'name': 'shares'}, {'name': 'price'}], 'content_vector_key': 'text_embedding'}
generated_schema: {'text': [{'name': 'source'}], 'numeric': [], 'tag': []}



In [42]:
vector_store.similarity_search("What was apples revenue last year?")

[Document(page_content="revenue and earnings in Apple's history. In fiscal year '18, our revenue grew by $36.4 billion. That's the equivalent of a Fortune 100 company in a single year. And we're capping all that off with our best September quarter ever. Revenue was $62.9 billion, ahead of our expectations. That's an increase of 20% over last year and our highest growth rate in 3 years. We also generated record Q4", metadata={'id': 'chunk:2018-Nov-01-AAPL.txt-c7764e65-9866-4559-a1b7-8baa72e6b733', 'chunk_id': '2018-Nov-01-AAPL.txt-c7764e65-9866-4559-a1b7-8baa72e6b733', 'source_doc': '2018-Nov-01-AAPL.txt', 'doc_type': 'earning_call', 'ticker': 'AAPL', 'company_name': 'APPLE INC', 'sector': 'Information Technology', 'asset_class': 'Equity', 'location': 'United States', 'exchange': 'NASDAQ', 'currency': 'USD', 'market_value': '559365151.11', 'weight': '5.16', 'notional_value': '559365151.11', 'shares': '4305127', 'price': '129.93'}),
 Document(page_content="revenue and earnings in Apple's

In [49]:
query = "What was apples revenue last year?"
res = parent_doc_retriever.invoke(query, distance_threshold=0.8)
res

TypeError: 'ParentDocumentRetriever' object is not callable

In [28]:
from langchain.chains import RetrievalQA
from langchain_community.llms import Ollama

llm = Ollama(model="llama3")

def get_search_kwargs(filters, distance_threshold):
    return {"distance_threshold":distance_threshold,"filter":filters}

parent_doc_qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=parent_doc_retriever,
    return_source_documents=True,
    chain_type_kwargs={"prompt": get_prompt()},
    verbose=True
)

In [50]:
sample_question = "What is the purpose of incorporating portions of the Registrant's definitive proxy statement into the Annual Report on Form 10-K?"
parent_doc_qa("What was apples revenue last year?")



> Entering new RetrievalQA chain...

> Finished chain.


{'query': 'What was apples revenue last year?',
 'result': "I'm happy to help! However, I don't have any context provided about Apple's financial data. Could you please provide the relevant section from a 10K filing or another financial document that mentions Apple's revenue? That way, I can give an accurate answer.\n\nIf you could provide the necessary information, I'll be happy to help you with your question!",
 'source_documents': []}

In [29]:
import pandas as pd
testset_sample = pd.read_csv("basic_rag_eval.csv")

In [30]:
parent_doc_dataset = create_evaluation_dataset(parent_doc_qa, testset_sample)



> Entering new RetrievalQA chain...

> Finished chain.


> Entering new RetrievalQA chain...

> Finished chain.


> Entering new RetrievalQA chain...

> Finished chain.


> Entering new RetrievalQA chain...

> Finished chain.


> Entering new RetrievalQA chain...

> Finished chain.


> Entering new RetrievalQA chain...

> Finished chain.


> Entering new RetrievalQA chain...

> Finished chain.


> Entering new RetrievalQA chain...

> Finished chain.


> Entering new RetrievalQA chain...

> Finished chain.


> Entering new RetrievalQA chain...

> Finished chain.


In [31]:
parent_doc_dataset['contexts']

[[], [], [], [], [], [], [], [], [], []]
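
Every context list in the output above is empty, which means the retriever returned no documents for any question. A small guard like the one below can catch this before spending LLM calls on an evaluation. `empty_context_rows` is a hypothetical helper, and the `contexts` value here just mirrors the shape of the output shown above.

```python
def empty_context_rows(contexts):
    """Return the indices of questions for which no context was retrieved."""
    return [i for i, ctx in enumerate(contexts) if not ctx]


# Same shape as the output above: ten questions, no retrieved context.
contexts = [[] for _ in range(10)]

missing = empty_context_rows(contexts)
print(f"{len(missing)} of {len(contexts)} questions retrieved no context")
```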

In [None]:
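# evaluate the dataset built from the parent document chain above
eval_result = evaluate(
    parent_doc_dataset,
    metrics=[
        faithfulness,
        answer_relevancy,
        context_relevancy
    ],
)

eval_df = eval_result.to_pandas()
# output file name chosen here for this run (an assumption)
eval_df.to_csv("parent_doc_rag_eval.csv")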