<div align="center">
    <div><img src="../assets/redis_logo.svg" style="width: 130px"> </div>
    <div style="display: inline-block; text-align: center; margin-bottom: 10px;">
        <span style="font-size: 36px;"><b>Multi-document Single-index RAG with LangChain and Redis Hybrid Search</b></span>
        <br />
    </div>
    <br />
</div>

## Environment Setup

In [4]:
%pip install python-dotenv

Note: you may need to restart the kernel to use updated packages.


In [5]:
import sys
import os
import warnings
import dotenv
# load env vars from .env file
dotenv.load_dotenv()

warnings.filterwarnings('ignore')
dir_path = os.getcwd()
parent_directory = os.path.dirname(dir_path)
sys.path.insert(0, f'{parent_directory}/helpers')
os.environ["ROOT_DIR"] = parent_directory
REDIS_URL = os.getenv("REDIS_URL")

print(dir_path)
print(parent_directory)
print(sys.path)

/Users/rouzbeh.farahmand/PycharmProjects/boa-financial-rag-workshop/2_RAG_patterns_with_redis
/Users/rouzbeh.farahmand/PycharmProjects/boa-financial-rag-workshop
['/Users/rouzbeh.farahmand/PycharmProjects/boa-financial-rag-workshop/helpers', '/Users/rouzbeh.farahmand/PycharmProjects/boa-financial-rag-workshop/helpers', '/Applications/PyCharm.app/Contents/plugins/python/helpers-pro/jupyter_debug', '/Applications/PyCharm.app/Contents/plugins/python/helpers/pydev', '/Users/rouzbeh.farahmand/PycharmProjects/boa-financial-rag-workshop', '/Library/Frameworks/Python.framework/Versions/3.12/lib/python312.zip', '/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12', '/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/lib-dynload', '', '/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/site-packages']


In [6]:
import os
import warnings
import dotenv
# mute warnings
warnings.filterwarnings('ignore')
# load env vars from .env file
dotenv.load_dotenv()

dir_path = os.getcwd()
parent_directory = os.path.dirname(dir_path)
os.environ["ROOT_DIR"] = parent_directory
print(dir_path)
print(parent_directory)

/Users/rouzbeh.farahmand/PycharmProjects/boa-financial-rag-workshop/2_RAG_patterns_with_redis
/Users/rouzbeh.farahmand/PycharmProjects/boa-financial-rag-workshop


### Install Python Dependencies

In [7]:
%pip install -r $ROOT_DIR/requirements.txt

Note: you may need to restart the kernel to use updated packages.


In [8]:
from utils import *
from ingestion import *
from custom_ners import *

### Configure your Redis Stack


In [9]:
REDIS_URL = os.getenv("REDIS_URL")
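
`os.getenv("REDIS_URL")` returns `None` when the variable is set neither in the environment nor in the `.env` file. The sketch below shows one way to fall back to a local instance. It assumes a Redis Stack listening on the default port 6379; `resolve_redis_url` and the default URL are illustrative names, not part of the workshop helpers.

```python
import os


def resolve_redis_url(env=os.environ, default="redis://localhost:6379"):
    """Return REDIS_URL from the given mapping, or a local default if unset or blank."""
    # `default` is an assumption: a local Redis Stack on the standard port.
    url = (env.get("REDIS_URL") or "").strip()
    return url or default


print(resolve_redis_url({}))
print(resolve_redis_url({"REDIS_URL": "redis://example-host:6380"}))
```

Passing the mapping as a parameter keeps the helper easy to check without touching the real environment.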

### SentenceTransformerEmbeddings Models Cache folder
This demo uses `SentenceTransformerEmbeddings`, and here we specify its cache folder. If you have already downloaded the models to a local file system, point this setting at that folder; otherwise the library downloads the models into this folder on first use.

In particular, the following model is downloaded if it is not present in the cache folder:

models/models--sentence-transformers--all-MiniLM-L6-v2


In [10]:
# set the folder that holds the locally downloaded sentence transformer models
os.environ["TRANSFORMERS_CACHE"] = f"{parent_directory}/models"
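
To see ahead of time whether a download will be triggered, you can check for the cache sub-folder named above. This is a minimal sketch that assumes the `models--<org>--<name>` folder naming shown in this section; `cached_model_dir` and `is_model_cached` are hypothetical helpers, not library functions.

```python
from pathlib import Path


def cached_model_dir(cache_folder, model_id):
    """Map a model id such as 'org/name' to its 'models--org--name' cache sub-folder."""
    return Path(cache_folder) / ("models--" + model_id.replace("/", "--"))


def is_model_cached(cache_folder, model_id):
    """Return True if the cache sub-folder for model_id already exists."""
    return cached_model_dir(cache_folder, model_id).is_dir()


model_id = "sentence-transformers/all-MiniLM-L6-v2"
print(cached_model_dir("models", model_id))
print(is_model_cached("models", model_id))
```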

## RAG with LangChain

### Create Custom index based on your data using RedisVL

In [11]:
from redisvl.index import SearchIndex
from redisvl.schema import IndexSchema
from redis import Redis
index_name = 'langchain'
prefix = 'chunk'
schema = IndexSchema.from_yaml('sec_index.yaml')
client = Redis.from_url(REDIS_URL)
# create an index from schema and the client
index = SearchIndex(schema, client)
index.create(overwrite=True, drop=True)

16:15:30 redisvl.index.index INFO   Index already exists, overwriting.
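
The contents of `sec_index.yaml` are not shown in this notebook. As a rough illustration only, a schema of this kind could look like the dictionary below. The index name, prefix, and storage type come from the index info output; the `ticker`, `doc_type`, and `content` fields come from the filters used later. The vector field name, algorithm, and distance metric are assumptions. The 384 dimensions match the output size of `all-MiniLM-L6-v2`.

```python
# Hypothetical stand-in for sec_index.yaml; the real file may differ.
schema_dict = {
    "index": {"name": "langchain", "prefix": "chunk", "storage_type": "hash"},
    "fields": [
        {"name": "content", "type": "text"},
        {"name": "ticker", "type": "tag"},
        {"name": "doc_type", "type": "tag"},
        {
            "name": "content_vector",  # assumed field name
            "type": "vector",
            "attrs": {
                "dims": 384,  # embedding size of all-MiniLM-L6-v2
                "algorithm": "hnsw",  # assumed
                "distance_metric": "cosine",  # assumed
                "datatype": "float32",
            },
        },
    ],
}

# Quick overview of field names and types
field_types = {field["name"]: field["type"] for field in schema_dict["fields"]}
print(field_types)
```

If your RedisVL version provides `IndexSchema.from_dict`, a dictionary of this shape can be loaded in place of the YAML file.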


In [12]:
# get info about the index
!rvl index info -i langchain

16:15:31 [RedisVL] INFO   Using Redis address from environment variable, REDIS_URL


Index Information:
╭──────────────┬────────────────┬────────────┬─────────────────┬────────────╮
│ Index Name   │ Storage Type   │ Prefixes   │ Index Options   │   Indexing │
├──────────────┼────────────────┼────────────┼─────────────────┼────────────┤
│ langchain    │ HASH           │ ['chunk']  │ []              │          0 │
╰──────────────┴────────────────┴────────────┴─────────────────┴────────────╯
Index Fields:
╭────────────────┬────────────────┬─────────┬────────────────┬────────────────┬────────────────┬────────────────┬────────────────┬────────────────┬─────────────────┬────────────────╮
│ Name           │ Attribute      │ Type    │ Field Option   │ Option Value   │ Field Option   │ Option Value   │ Field Option   │   Option Value │ Field Option    │ Option Value   │
├────────────────┼────────────────┼─────────┼────────────────┼────────────────┼──────

### Ingestion and Indexing



In [13]:
from langchain.embeddings.sentence_transformer import SentenceTransformerEmbeddings 
embeddings = SentenceTransformerEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2", cache_folder=os.getenv("TRANSFORMERS_CACHE", f"{parent_directory}/models"))

In [14]:
sec_data = get_sec_data()

 ✅ Loaded doc info for  111 tickers...


In [15]:
redis_bulk_upload(sec_data, index, embeddings, tickers=['AAPL', 'AMZN'])

✅ Loaded 108 10K chunks for ticker=AAPL from AAPL-2021-10K.pdf
✅ Loaded 94 10K chunks for ticker=AAPL from AAPL-2023-10K.pdf
✅ Loaded 103 10K chunks for ticker=AAPL from AAPL-2022-10K.pdf
✅ Loaded 27 earning_call chunks for ticker=AAPL from 2018-May-01-AAPL.txt
✅ Loaded 31 earning_call chunks for ticker=AAPL from 2019-Oct-30-AAPL.txt
✅ Loaded 30 earning_call chunks for ticker=AAPL from 2016-Jan-26-AAPL.txt
✅ Loaded 31 earning_call chunks for ticker=AAPL from 2020-Jul-30-AAPL.txt
✅ Loaded 30 earning_call chunks for ticker=AAPL from 2017-Aug-01-AAPL.txt
✅ Loaded 29 earning_call chunks for ticker=AAPL from 2020-Jan-28-AAPL.txt
✅ Loaded 34 earning_call chunks for ticker=AAPL from 2016-Apr-26-AAPL.txt
✅ Loaded 29 earning_call chunks for ticker=AAPL from 2017-Jan-31-AAPL.txt
✅ Loaded 28 earning_call chunks for ticker=AAPL from 2019-Apr-30-AAPL.txt
✅ Loaded 26 earning_call chunks for ticker=AAPL from 2017-Nov-02-AAPL.txt
✅ Loaded 31 earning_call chunks for ticker=AAPL from 2016-Oct-25-AAPL.tx

## Vector Search with LangChain
**Important Note**: The LangChain Redis vector store does not support the JSON storage type yet; it only supports HASH for now. This update should be coming soon.

In [16]:
from langchain_community.vectorstores import Redis as LangChainRedis

index_name = 'langchain'

vec_schema, main_schema = create_langchain_schemas_from_redis_schema('sec_index.yaml')

rds = LangChainRedis.from_existing_index(embedding=embeddings,
                                         index_name=index_name,
                                         schema=main_schema)

### Query the database
Now we can use the LangChain vector store class to perform similarity search operations on Redis.

In [17]:
from langchain.vectorstores.redis import RedisText
from langchain.vectorstores.redis import RedisTag

In [18]:
f = RedisTag("ticker") == "AAPL"
rds.similarity_search(query="How many employees work at this company???", k=4, distance_threshold=0.8, filter=f)

[Document(page_content='The Company has historically experienced higher net sales in its first quarter compared to other quarters in its fiscal year due in part to seasonal holiday demand. Additionally, new product and service introductions can significantly impact net sales, cost of sales and operating expenses. The timing of product introductions can also impact the Company’s net sales to its indirect distribution channels as these channels are filled with new inventory following a product launch, and channel inventory of an older product often declines as the launch of a newer product approaches. Net sales can also be affected when consumers and distributors anticipate a product introduction.\n\nHuman Capital\n\nThe Company believes it has a talented, motivated, and dedicated team, and is committed to supporting the development of all of its team members and to continuously building on its strong culture. As of September 25, 2021, the Company had approximately 154,000 full-time equi

In [19]:
f = RedisTag("doc_type") == "10K"
rds.similarity_search(query="What did Tim Cook said in 2020 earning calls regarding NANDs?", k=4, distance_threshold=0.8, filter=f)

[Document(page_content='David A. Zapolsky. Mr. Zapolsky has served as Senior Vice President, General Counsel, and Secretary since May 2014, Vice President, General Counsel, and Secretary from September 2012 to May 2014, and as Vice President and Associate General Counsel for Litigation and Regulatory matters from April 2002 until September 2012.\n\nBoard of Directors Name\n\nAge\n\nPosition\n\nJeffrey P. Bezos\n\n58\n\nExecutive Chair\n\nAndrew R. Jassy\n\n54\n\nPresident and Chief Executive Officer\n\nKeith B. Alexander\n\n70\n\nCo-CEO, President, and Chair of IronNet Cybersecurity, Inc.\n\nEdith W. Cooper\n\n60\n\nFormer Executive Vice President, Goldman Sachs Group, Inc.\n\nJamie S. Gorelick\n\n71\n\nPartner, Wilmer Cutler Pickering Hale and Dorr LLP\n\nDaniel P. Huttenlocher\n\n63\n\nDean, MIT Schwarzman College of Computing\n\nJudith A. McGrath\n\n69\n\nFormer Chair and CEO, MTV Networks\n\nIndra K. Nooyi\n\n66\n\nFormer Chief Executive Officer, PepsiCo, Inc.\n\nJonathan J. Rubins

In [20]:
f = RedisTag("doc_type") == "earning_call"
rds.similarity_search(query="What did Tim Cook said in 2020 earning calls regarding NANDs?", k=4, distance_threshold=0.8, filter=f)

[Document(page_content="Thank you. Good afternoon, and thanks to everyone for joining us. Speaking first today is Apple's CEO, Tim Cook; and he'll be followed by CFO, Luca Maestri. After that, we'll open the call to questions from analysts. Please note that some of the information you'll hear during our discussion today will consist of forward-looking statements, including, without limitation, those regarding revenue, gross margin, operating expenses, other income and expense, taxes, capital allocation and future business outlook. Actual results or trends could differ materially from our forecast. For more information, please refer to the risk factors discussed in Apple's most recently filed periodic reports on Form 10-K and Form 10-Q and the Form 8-K filed with the SEC today, along with the associated press release. Apple assumes no obligation to update any forward-looking statements or information which speak as of their respective dates. I'd now like to turn the call over to Tim for

In [21]:
# vector search with combinations of metadata filtering
f = (RedisText("content") % "profit") | (RedisText("content") % "revenue")

rds.similarity_search_with_score(query="Apple company revenue", k=4, filter=f)


[(Document(page_content="Earlier this month, released macOS Catalina with all new entertainment apps, innovative Sidecar feature that uses iPad to expand Mac workspace and new accessibility tools that enable users to control their Mac entirely with their voice. 1. Catalina brings Apple Arcade experience to Mac. 1. Already seeing some third-party developers bring their iPad apps to Mac App Store with Mac Catalyst, including Twitter, Post-it and more. 4. Launching newly redesigned Mac Pro this fall, which Co. is manufacturing in Austin, Texas. 7. Others: 1. In FY19, crossed $100b in revenue in US for first time. 2. Introduce new services from Apple Card to Apple TV+ and generated over $46b in total Services revenue, setting new yearly Services records in all five geographic segments and driving Services business to size of Fortune 70 co. 3. Delivered new hardware in all device categories. 4. Wearables business showed explosive growth and generated more annual revenue than two-thirds of c
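
Filter expressions such as `RedisTag("ticker") == "AAPL"` are ultimately sent to Redis as query strings. The sketch below builds roughly equivalent strings by hand, assuming standard Redis query syntax: `{}` for tag values, `|` for union, and whitespace for intersection. The helper names are hypothetical, and no escaping of special characters is attempted.

```python
def tag_filter(field, value):
    """Exact match on a TAG field, e.g. @ticker:{AAPL}."""
    return f"@{field}:{{{value}}}"


def text_filter(field, term):
    """Full-text match on a TEXT field, e.g. @content:(revenue)."""
    return f"@{field}:({term})"


def any_of(*filters):
    """Union (OR) of several filters."""
    return "(" + " | ".join(filters) + ")"


def all_of(*filters):
    """Intersection (AND) of several filters; whitespace means AND."""
    return "(" + " ".join(filters) + ")"


keywords = any_of(text_filter("content", "profit"), text_filter("content", "revenue"))
print(keywords)
print(all_of(tag_filter("ticker", "AAPL"), keywords))
```

Strings of this form can be passed as the `filter` argument in the same way as the raw filter strings used later in this notebook.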

## RAG with Ollama running Llama 3 LLM

### Initialize a Llama LLM served via Ollama
To connect to a local LLM served via Ollama, use the LLM below. If you have a local OpenAI-compatible server running via vLLM, add your LLM here instead.

In [22]:
llm = get_llm()

### Setup prompt
`PromptTemplate` defines the exact text of the prompt that is fed to the LLM. This step is optional: the default prompts usually work well for OpenAI models but might fall short for other models.

In [23]:
def get_prompt():
    """Create the prompt template used by the QA chain."""
    from langchain.prompts import PromptTemplate

    # Define our prompt
    prompt_template = """Use the following pieces of context from financial 10k filings data to answer the user question at the end. Only use the result from tools and evidence provided to you. If you don't know the answer, say that you don't know, don't try to make up an answer. Provide the source of the document that you used to get the answer.

    This should be in the following format:

    Question: [question here]
    Answer: [answer here]
    Source: [source document here]

    Begin!

    Context:
    ---------
    {context}
    ---------
    Question: {question}
    Answer:"""

    prompt = PromptTemplate(
        template=prompt_template,
        input_variables=["context", "question"]
    )
    return prompt
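
To make the role of `{context}` and `{question}` concrete, here is a plain-Python sketch of how a "stuff"-style chain fills such a template: the retrieved chunks are joined into one context string and substituted together with the question. The shortened template, the `render_prompt` helper, and the placeholder chunks are illustrative only; the blank-line separator is an assumption about the chain's default.

```python
TEMPLATE = """Context:
---------
{context}
---------
Question: {question}
Answer:"""


def render_prompt(chunks, question, separator="\n\n"):
    """Join retrieved chunks into one context block and fill the template."""
    context = separator.join(chunks)
    return TEMPLATE.format(context=context, question=question)


rendered = render_prompt(
    ["<first retrieved chunk>", "<second retrieved chunk>"],
    "what was the deferred revenue of Apple in 2022?",
)
print(rendered)
```

Because every retrieved chunk is pasted into a single prompt, an irrelevant chunk directly pollutes what the LLM sees.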

### Putting it all together

This is where LangChain brings all the components together into a simple RAG application over the financial documents.

In [24]:
from langchain.chains import RetrievalQA

def get_search_kwargs(filters, distance_threshold):
    return {"distance_threshold":distance_threshold,"filter":filters}
    

qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=rds.as_retriever(search_type="similarity_distance_threshold",
                               search_kwargs={"distance_threshold":0.8, 'include_metadata': True}),
    return_source_documents=True,
    chain_type_kwargs={"prompt": get_prompt()},
    verbose=True
)
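
The `distance_threshold` setting keeps only chunks whose vector distance to the query is at or below the threshold. The sketch below illustrates the idea in plain Python. It assumes cosine distance (1 minus cosine similarity), which may not match the metric configured in `sec_index.yaml`; the tiny 2-dimensional vectors are toy data.

```python
import math


def cosine_distance(a, b):
    """Cosine distance: 0.0 for the same direction, 1.0 for orthogonal vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def within_threshold(query_vec, chunks, distance_threshold=0.8):
    """Return (distance, text) pairs at or below the threshold, closest first."""
    scored = [(cosine_distance(query_vec, vec), text) for text, vec in chunks]
    return sorted(pair for pair in scored if pair[0] <= distance_threshold)


toy_chunks = [
    ("same direction", [2.0, 0.0]),
    ("orthogonal", [0.0, 1.0]),
    ("close", [1.0, 1.0]),
]
hits = within_threshold([1.0, 0.0], toy_chunks)
print([text for _, text in hits])
```

A threshold of 0.8 is fairly permissive, so loosely related chunks can still pass; lowering it trades recall for precision.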

### Finally - let's ask questions!



In [25]:
query = "What was Apple's revenue last year compared to this year??"
res=qa(query)
res['result']



> Entering new RetrievalQA chain...

> Finished chain.


"Based on the provided financial data, we can see that:\n\n* Last year (previous quarter), Apple's revenue was approximately $5.5 billion, excluding the $548 million received from a patent infringement dispute.\n* This year (current quarter), Apple's revenue is almost $6.1 billion, including the $548 million received from the same patent infringement dispute.\n\nSo, Apple's revenue has increased by about 10% ($600 million) compared to last year."

In [None]:
query = "How many products does Nike offer? What is the industry that Nike is part of?"
res=qa(query)
res['result']



> Entering new RetrievalQA chain...


In [None]:
query = "what was the deferred revenue of Apple in 2022?"
res=qa(query)
res

Wrong answer: the retriever could not fetch the right chunk. The Apple 10K for 2022 states: "As of September 24, 2022 and September 25, 2021, the Company had total deferred revenue of $12.4 billion and $11.9 billion, respectively. As of September 24, 2022, the Company expects 64% of total deferred revenue to be realized in less than a year, 27% within one-to-two years, 7% within two-to-three years and 2% in greater than three years."

In [None]:
query = "what was revenue of Apple in 2022?"
res=qa(query)
res['result']

In [None]:
query = "How many employees work at Nike???"
res=qa(query)
res

Wrong answer: the index contains no Nike data, so retrieval returned unrelated context and the LLM hallucinated an answer from it.

### Adding query analysis and hybrid search in QA chain

In [None]:
# Plug in your own query analysis here, e.g. NER, topic detection, intent detection, semantic routing, etc.
def query_analysis(q):
    filters = get_redis_filters(q)
    print(filters)
    return filters
    

def ask_question(question,
                 filters = None,
                 filter_strategy = 'AND',
                 distance_threshold =0.8,
                 search_type="similarity_distance_threshold"):
    
    q_filters = query_analysis(question)
    print(f"inferred filters: {q_filters}")
    if filters is None:
        filters = q_filters
    elif q_filters:
        # combine inferred and caller-supplied filters only when both are present
        filters = f"( {q_filters} ) {filter_strategy} ( {filters} )"
    
    print(f"Final filters: {filters} to apply")
    if filters is not None:
        search_args = {"distance_threshold":distance_threshold, 
                   'include_metadata': True, 
                   'filter':filters}
    else:
        search_args = {"distance_threshold":distance_threshold, 
                   'include_metadata': True}
        
    fqa = RetrievalQA.from_chain_type(
        llm=llm,
        chain_type="stuff",
        retriever=rds.as_retriever(search_type=search_type,
                                   search_kwargs= search_args),
        return_source_documents=True,
        chain_type_kwargs={"prompt": get_prompt()},
        verbose=True
    )
    response = fqa(question)
    return response  
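
The implementation of `get_redis_filters` lives in the workshop helpers and is not shown here. As a rough idea of what a query-analysis step can do, the toy sketch below maps company mentions to a `ticker` tag filter and merges it with an optional extra filter. The lookup table, function names, and merge rule are all hypothetical; it relies on Redis query syntax treating whitespace as intersection.

```python
import re

# Hypothetical lookup table; a real NER step would be far more robust.
KNOWN_TICKERS = {"apple": "AAPL", "aapl": "AAPL", "amazon": "AMZN", "amzn": "AMZN"}


def infer_ticker_filter(question):
    """Return a tag filter for tickers mentioned in the question, or None."""
    words = re.findall(r"[a-z]+", question.lower())
    tickers = sorted({KNOWN_TICKERS[w] for w in words if w in KNOWN_TICKERS})
    if not tickers:
        return None
    return "@ticker:{" + " | ".join(tickers) + "}"


def combine_filters(inferred, extra):
    """Intersect whichever filters are present; return None when there are none."""
    parts = [f"({part})" for part in (inferred, extra) if part]
    return " ".join(parts) if parts else None


inferred = infer_ticker_filter("what is the revenue of aapl in 2022?")
print(combine_filters(inferred, "@doc_type:{10K}"))
print(infer_ticker_filter("How many employees work at Nike???"))
```

Returning `None` for an unknown company gives the caller a chance to decline to answer instead of retrieving unrelated chunks.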

In [None]:
ask_question("what is the revenue of aapl?")

In [None]:
ask_question("what is the revenue of aapl in 2022?", filters = "@doc_type:{10K}")

In [None]:
ask_question("what is the total deferred revenue of aapl in 2022?", filters = "@doc_type:{10K} AND @content:(deferred revenue)")

Redis Search retrieved the correct chunk, but the LLM's extraction and generation were wrong!

In [None]:
rds.similarity_search_with_score(query="what is the total deferred revenue of Apple in 2022?", k=5, filter='(@content:(deferred) | @content:(revenue))')

Correct Retrieval by Redis Search!

## Cleanup

Clean up the index and data.

In [None]:
#rds.drop_index(index_name=index_name, redis_url=REDIS_URL, delete_documents=True)