# RAG Process - Inference

This notebook shows how to use Retrieval-Augmented Generation (RAG) on the Domino platform to answer questions about information that OpenAI's models were not trained on and therefore cannot answer out of the box. LangChain handles access to both the model and the vector database. The Process_data notebook demonstrates how to preprocess a document and load it into the Pinecone vector store to enable this process.



### Load the needed libraries

In [1]:
import os
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.prompts import PromptTemplate
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI
from langchain_pinecone import PineconeVectorStore

from pinecone import Pinecone
from getpass import getpass
import warnings
warnings.filterwarnings('ignore')

### Load Environment variables

In [2]:
#os.environ['ANTHROPIC_API_KEY'] = getpass("Enter Anthropic key:")
#os.environ['OPENAI_API_KEY'] = getpass("Enter OpenAI API key:")
#os.environ['PINECONE_API_KEY'] = getpass("Enter Pinecone API key:")
#os.environ['PINECONE_API_ENV'] = getpass("Enter Pinecone Environment:")

OPENAI_API_KEY = os.getenv('OPENAI_API_KEY') 
PINECONE_API_KEY = os.getenv('PINECONE_API_KEY')
PINECONE_ENV = os.getenv('PINECONE_API_ENV')
os.environ['TOKENIZERS_PARALLELISM'] = 'false'
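
Because `os.getenv` returns `None` for an unset variable, a missing key would only surface later as an authentication error. A minimal sketch of an early check follows; the helper name `missing_env_vars` is hypothetical and not part of the original notebook.

```python
import os


def missing_env_vars(names, env=None):
    """Return the names that are unset or empty in the given environment mapping."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


# An explicit mapping is used here so the example does not depend on the real environment.
example_env = {"OPENAI_API_KEY": "sk-example", "PINECONE_API_KEY": ""}
missing = missing_env_vars(["OPENAI_API_KEY", "PINECONE_API_KEY"], env=example_env)
print(missing)  # ['PINECONE_API_KEY']
```

In the notebook itself, calling `missing_env_vars(['OPENAI_API_KEY', 'PINECONE_API_KEY'])` without the `env` argument would check the live environment.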

### Create embeddings to embed queries using OpenAI in LangChain 

In [5]:
# initialize embedding
model = 'text-embedding-ada-002'

# The OpenAIEmbeddings class is instantiated with two parameters: 
# 'model' and 'openai_api_key'. 'model' is the name of the model to be used 
# and 'openai_api_key' is the key for accessing the OpenAI API.
embeddings = OpenAIEmbeddings(
    model=model,
    openai_api_key=OPENAI_API_KEY
)
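
The embedding model turns each query into a numeric vector (1,536 dimensions for `text-embedding-ada-002`), and retrieval then ranks stored chunks by how close their vectors are to the query vector. As a rough illustration, the sketch below computes cosine similarity on made-up 3-dimensional vectors. The vectors are invented for the example, and the metric actually used depends on how the Pinecone index was configured.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real embeddings.
query_vec = [1.0, 0.0, 1.0]
close_vec = [0.9, 0.1, 1.1]
far_vec = [0.0, 1.0, 0.0]

print(cosine_similarity(query_vec, close_vec) > cosine_similarity(query_vec, far_vec))  # True
```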

### Initialize Pinecone vector store with Pinecone 3.0 client 

In [6]:
# Name of the metadata field that stores the original text of each embedded chunk.
text_field = "text"

# Defines the name of the Pinecone index to be used.
index_name = "mrag-fin-docs"
pc = Pinecone(api_key=PINECONE_API_KEY)  
index = pc.Index(index_name)

# Creates a PineconeVectorStore instance. It wraps the index opened above,
# the embeddings object used to embed queries, and the name of the text field.
vectorstore = PineconeVectorStore(
    index, embeddings, text_field
)
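
Conceptually, each record in the index holds a vector plus metadata, and `text_field` names the metadata key under which the chunk's text is stored. The sketch below is a toy in-memory stand-in for that lookup, not Pinecone's API: the record layout, the function name `top_k_texts`, and the dot-product scoring are all assumptions made for illustration.

```python
def top_k_texts(query_vec, records, k=2, text_key="text"):
    """Rank records by dot product with the query and return the text of the top k."""
    def score(record):
        return sum(q * v for q, v in zip(query_vec, record["values"]))

    ranked = sorted(records, key=score, reverse=True)
    return [record["metadata"][text_key] for record in ranked[:k]]


# Made-up records: a 2-dimensional vector plus the chunk text under the "text" key.
records = [
    {"id": "a", "values": [1.0, 0.0], "metadata": {"text": "Gross margin table"}},
    {"id": "b", "values": [0.0, 1.0], "metadata": {"text": "Legal proceedings"}},
    {"id": "c", "values": [0.8, 0.2], "metadata": {"text": "Net sales by segment"}},
]

print(top_k_texts([1.0, 0.0], records, k=2))
# ['Gross margin table', 'Net sales by segment']
```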

### Create the Prompt Template

In [7]:
prompt_template = """You are an AI assistant with expertise in financial analysis. You are given the following extracted parts and a question.
If you don't know the answer, just say "Hmm, I'm not sure." Don't try to make up an answer.
If the question is not about financial analysis, politely inform them that you are tuned to only answer questions pertaining to financial analysis.
Question: {question}
=========
{context}
=========
Answer in Markdown:
"""
PROMPT = PromptTemplate(template=prompt_template, input_variables=["question", "context"])
# Keyword arguments that hand the custom prompt to the underlying chain.
chain_type_kwargs = {"prompt": PROMPT}
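
To make the role of the two placeholders concrete, the sketch below fills a shortened copy of the template with plain `str.format`. This is a stand-in for illustration only; in the notebook the `PromptTemplate` object performs the substitution. The sample question is invented, and the context line is taken from the retrieved document shown at the end of this notebook.

```python
# Shortened copy of the template: only the part that contains the placeholders.
template = (
    "Question: {question}\n"
    "=========\n"
    "{context}\n"
    "=========\n"
    "Answer in Markdown:\n"
)

filled = template.format(
    question="What was total gross margin in 2023?",
    context="Total gross margin $ 169,148 $ 170,782 $ 152,836",
)
print(filled)
```

The retrieved text lands between the two separator lines, so the model sees the question first and the supporting passages second.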

### Instantiate the ChatOpenAI model

In [8]:
# Creates an instance of the ChatOpenAI class. A temperature of 0.0 keeps
# answers as deterministic as possible, which suits factual Q&A.
rag_llm = ChatOpenAI(
    openai_api_key=OPENAI_API_KEY,
    model_name='gpt-4',
    temperature=0.0
)

### Instantiate the LangChain RetrievalQA chain for answering questions from the embedded data in the vectorstore

In [9]:
qa_chain = RetrievalQA.from_chain_type(llm=rag_llm,
                                       chain_type="stuff",
                                       chain_type_kwargs=chain_type_kwargs,
                                       retriever=vectorstore.as_retriever(search_kwargs={"k": 5}),
                                       return_source_documents=True
                                      )
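
With `chain_type="stuff"`, the chain retrieves up to `k` chunks, places their combined text into the single `{context}` slot, and makes one call to the model. The sketch below imitates that flow with a stub retriever and a stub LLM. Every name in it is hypothetical, the blank-line separator between chunks is an assumption, and the sources are plain strings here, whereas the real chain returns document objects.

```python
def run_stuff_qa(question, retrieve, llm, prompt, k=5):
    """Retrieve up to k chunks, stuff them into one prompt, and call the model once."""
    docs = retrieve(question)[:k]
    context = "\n\n".join(docs)
    answer = llm(prompt.format(question=question, context=context))
    return {"query": question, "result": answer, "source_documents": docs}


def fake_retrieve(question):
    # Stub retriever: always returns the same two chunks.
    return ["Total gross margin percentage 44.1 %", "Services 70.8 %"]


def fake_llm(prompt_text):
    # Stub model: reports the prompt size instead of calling an API.
    return f"(stub answer based on {len(prompt_text)} prompt characters)"


prompt = "Question: {question}\n=========\n{context}\n=========\nAnswer in Markdown:\n"
output = run_stuff_qa("What was the gross margin percentage?", fake_retrieve, fake_llm, prompt)
print(output["result"])
print(len(output["source_documents"]))
```

Because all retrieved text goes into one prompt, `k` and the chunk size together determine how much of the model's context window each question consumes.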

### Ask a question about the documents and run the chain

In [11]:
user_question = input("Please ask your financial analysis question:")
# RetrievalQA expects the question under the "query" key.
result = qa_chain.invoke({"query": user_question})

Please ask your financial analysis question: What was the gross income amount and percentage as share of total revenues in FY23


### Retrieve the result

In [12]:
result['result']

'The gross income amount for FY23 was $169,148 million. The gross margin percentage as a share of total revenues for FY23 was 44.1%.'

### Display the source documents retrieved from the vector store and used for the answer

In [13]:
result['source_documents'][0].page_content

'Gross Margin\nProducts and Services gross margin and gross margin percentage for 2023, 2022 and 2021 were as follows (dollars in millions):\n2023 2022 2021\nGross margin:\nProducts $ 108,803 $ 114,728 $ 105,126 \nServices 60,345 56,054 47,710 \nTotal gross margin $ 169,148 $ 170,782 $ 152,836 \nGross margin percentage:\nProducts 36.5 % 36.3 % 35.3 %\nServices 70.8 % 71.7 % 69.7 %\nTotal gross margin percentage 44.1 % 43.3 % 41.8 %\nProducts Gross Margin\nProducts gross margin decreased during 2023 compared to 2022 due to the weakness in foreign currencies relative to the U.S. dollar and lower Products\nvolume, partially of fset by cost savings and a dif ferent Products mix.\nProducts gross margin percentage increased during 2023 compared to 2022 due to cost savings and a different Products mix, partially offset by the weakness in\nforeign currencies relative to the U.S. dollar and decreased leverage.\nServices Gross Margin'
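
The cell above shows only the first of up to five retrieved documents. A small sketch for previewing all of them follows. `Doc` is a stand-in class defined here so the example runs without LangChain, and `preview_sources` is a hypothetical helper name; the only assumption about the real objects is that they expose `page_content`, as used above.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Stand-in for a retrieved document with text and metadata."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def preview_sources(docs, max_chars=60):
    """Return one numbered, whitespace-normalized snippet per document."""
    lines = []
    for i, doc in enumerate(docs, start=1):
        snippet = " ".join(doc.page_content.split())[:max_chars]
        lines.append(f"[{i}] {snippet}")
    return lines


docs = [
    Doc("Gross Margin\nProducts and Services gross margin"),
    Doc("Services Gross Margin"),
]
for line in preview_sources(docs):
    print(line)
```

In the notebook, `preview_sources(result['source_documents'])` would give a quick overview of which passages backed the answer.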