# RAG LLMOps - Basic Demo
* Notebook by Adam Lang
* Date: 12/23/2024

# Overview
* This notebook is a basic demo of retrieval-augmented generation (RAG) in an LLMOps setting, built with LangChain.


# Install Dependencies
* First, install the dependencies listed in `requirements.txt`.
* Then load the environment variables needed for OpenAI API access.

In [1]:
! pip install -r requirements.txt



In [2]:
import dotenv

dotenv.load_dotenv('.env')

True
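
The `True` above only indicates that `load_dotenv` set at least one variable, not that the API key itself is present. A minimal sketch of an explicit check follows; the helper name `missing_env_vars` is hypothetical, and `OPENAI_API_KEY` is assumed to be the variable the OpenAI client reads.

```python
import os


def missing_env_vars(required, env=None):
    """Return the names in `required` that are unset or empty in `env`."""
    # Default to the real process environment; a dict can be passed for testing.
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


# Assumption: the OpenAI integrations look for OPENAI_API_KEY.
missing = missing_env_vars(["OPENAI_API_KEY"])
if missing:
    print("Missing environment variables:", ", ".join(missing))
```

Running this right after `load_dotenv` surfaces a missing key before any embedding or chat call fails.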

# Basic RAG implementation
* We can load a document from Wikipedia via LangChain and perform RAG on it.
* LangChain gives us a basic document loader for Wikipedia.

In [3]:
from langchain_community.document_loaders import WikipediaLoader


## load documents
docs = WikipediaLoader(query="Artificial intelligence", load_max_docs=2, doc_content_chars_max=10000).load()
docs

[Document(metadata={'title': 'Artificial intelligence', 'summary': 'Artificial intelligence (AI), in its broadest sense, is intelligence exhibited by machines, particularly computer systems. It is a field of research in computer science that develops and studies methods and software that enable machines to perceive their environment and use learning and intelligence to take actions that maximize their chances of achieving defined goals. Such machines may be called AIs.\nHigh-profile applications of AI include advanced web search engines (e.g., Google Search); recommendation systems (used by YouTube, Amazon, and Netflix); virtual assistants (e.g., Google Assistant, Siri, and Alexa); autonomous vehicles (e.g., Waymo); generative and creative tools (e.g., ChatGPT and AI art); and superhuman play and analysis in strategy games (e.g., chess and Go). However, many AI applications are not perceived as AI: "A lot of cutting edge AI has filtered into general applications, often without being ca

## Text Splitting and Chunking
* Split the loaded documents into chunks of at most 1000 characters, with a 100-character overlap between neighbouring chunks.

In [4]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
documents = text_splitter.split_documents(docs)
for doc in documents[:5]:
    print(doc.page_content, "\n")

Artificial intelligence (AI), in its broadest sense, is intelligence exhibited by machines, particularly computer systems. It is a field of research in computer science that develops and studies methods and software that enable machines to perceive their environment and use learning and intelligence to take actions that maximize their chances of achieving defined goals. Such machines may be called AIs. 

High-profile applications of AI include advanced web search engines (e.g., Google Search); recommendation systems (used by YouTube, Amazon, and Netflix); virtual assistants (e.g., Google Assistant, Siri, and Alexa); autonomous vehicles (e.g., Waymo); generative and creative tools (e.g., ChatGPT and AI art); and superhuman play and analysis in strategy games (e.g., chess and Go). However, many AI applications are not perceived as AI: "A lot of cutting edge AI has filtered into general applications, often without being called AI because once something becomes useful enough and common eno
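
To make `chunk_size` and `chunk_overlap` concrete, here is a simplified sketch of fixed-width splitting with overlap. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter prefers to break on separators such as paragraphs and newlines); the function `split_with_overlap` is hypothetical and only illustrates how the two parameters interact.

```python
def split_with_overlap(text, chunk_size, chunk_overlap):
    """Cut `text` into windows of `chunk_size` characters that share
    `chunk_overlap` characters with the previous window."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


chunks = split_with_overlap("abcdefghijklmnopqrstuvwxyz", chunk_size=10, chunk_overlap=3)
print(chunks)
# → ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

Each chunk starts with the last 3 characters of the previous one, so a sentence cut at a chunk boundary still appears whole in at least one chunk when it is shorter than the overlap.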

## Create Embeddings & Vector Store in Chroma Vector Database

In [5]:
from langchain_chroma import Chroma
from langchain_openai import OpenAIEmbeddings

vectorstore = Chroma.from_documents(
    documents,
    embedding=OpenAIEmbeddings(),
)

## Similarity Search in Vector Database

In [6]:
## each result is a (Document, score) tuple; only the chunk text is printed here
for similar_doc, score in vectorstore.similarity_search_with_score("What is AI ?", k=3):
    print(similar_doc.page_content, "\n")

Artificial intelligence (AI), in its broadest sense, is intelligence exhibited by machines, particularly computer systems. It is a field of research in computer science that develops and studies methods and software that enable machines to perceive their environment and use learning and intelligence to take actions that maximize their chances of achieving defined goals. Such machines may be called AIs. 

Artificial intelligence was founded as an academic discipline in 1956, and the field went through multiple cycles of optimism, followed by periods of disappointment and loss of funding, known as AI winters. Funding and interest vastly increased after 2012 when deep learning outperformed previous AI techniques. This growth accelerated further after 2017 with the transformer architecture, and by the early 2020s hundreds of billions of dollars were being invested in AI (known as the "AI boom"). The widespread use of AI in the 21st century exposed several unintended consequences and harms 

Summary:
* The search returned the 3 chunks most similar to the query "What is AI ?".
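
The idea behind a top-k similarity search can be sketched in plain Python. This is an illustration under assumptions, not Chroma's implementation: the 3-dimensional vectors are made-up toy embeddings, cosine similarity is used as the metric, and `top_k` is a hypothetical helper. Note that the score returned by `similarity_search_with_score` in the Chroma integration is a distance (lower means more similar), whereas the sketch ranks by similarity (higher means more similar).

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, indexed, k):
    """Return the k (text, similarity) pairs most similar to the query."""
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in indexed]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy index: (chunk text, made-up embedding). Real embeddings have far more dimensions.
toy_index = [
    ("AI is intelligence exhibited by machines", [0.9, 0.1, 0.0]),
    ("AI was founded as a discipline in 1956", [0.6, 0.6, 0.1]),
    ("Chess is a strategy game", [0.0, 0.2, 0.9]),
]
query = [1.0, 0.0, 0.0]  # stand-in for the embedded question

for text, score in top_k(query, toy_index, k=2):
    print(f"{score:.3f}  {text}")
```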

In [7]:
## build a retriever that returns only the single most similar chunk (k=1)
retriever = vectorstore.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 1},
)

retriever.batch(["what is AI?", "who invented AI ?"])

[[Document(metadata={'source': 'https://en.wikipedia.org/wiki/Artificial_intelligence', 'summary': 'Artificial intelligence (AI), in its broadest sense, is intelligence exhibited by machines, particularly computer systems. It is a field of research in computer science that develops and studies methods and software that enable machines to perceive their environment and use learning and intelligence to take actions that maximize their chances of achieving defined goals. Such machines may be called AIs.\nHigh-profile applications of AI include advanced web search engines (e.g., Google Search); recommendation systems (used by YouTube, Amazon, and Netflix); virtual assistants (e.g., Google Assistant, Siri, and Alexa); autonomous vehicles (e.g., Waymo); generative and creative tools (e.g., ChatGPT and AI art); and superhuman play and analysis in strategy games (e.g., chess and Go). However, many AI applications are not perceived as AI: "A lot of cutting edge AI has filtered into general appl

# Creating a Question & Answer Basic Application
* Here we use an LLM to answer questions, with the chunks returned by the retriever supplied as context.

In [8]:
from langchain_openai import ChatOpenAI


## init the LLM 
llm = ChatOpenAI(model="gpt-4o-mini")

In [9]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser


## prompt message we are sending to the LLM
message = """
Answer this question using the provided context only.

{question}

Context:
{context}
"""

## wrap the message in a ChatPromptTemplate
prompt = ChatPromptTemplate.from_messages([("human", message)])


## init output parser
parser = StrOutputParser()


## create RAG Chain
rag_chain = {"context": retriever, "question": RunnablePassthrough()} | prompt | llm | parser

In [10]:
## invoke RAG chain
rag_chain.invoke("what is AI ?")

'Artificial intelligence (AI) is intelligence exhibited by machines, particularly computer systems. It is a field of research in computer science that focuses on developing methods and software that allow machines to perceive their environment and use learning and intelligence to take actions that maximize their chances of achieving defined goals.'
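
In the chain above, the retriever's output (a list of `Document` objects) is placed straight into `{context}`, so the prompt likely receives their full string representation, metadata included. A common pattern is to join only the `page_content` of each document first. The sketch below shows that step in isolation; `FakeDocument` is a hypothetical stand-in so the example runs without LangChain, and `format_docs` is a conventional but assumed helper name.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDocument:
    """Stand-in for a LangChain Document; only the attributes used here."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_docs(docs):
    """Join the text of the retrieved chunks, dropping their metadata."""
    return "\n\n".join(doc.page_content for doc in docs)


# Same wording as the notebook's prompt message.
PROMPT_TEMPLATE = """
Answer this question using the provided context only.

{question}

Context:
{context}
"""

retrieved = [
    FakeDocument("AI is intelligence exhibited by machines.", {"title": "Artificial intelligence"}),
    FakeDocument("AI was founded as an academic discipline in 1956.", {"title": "Artificial intelligence"}),
]

prompt_text = PROMPT_TEMPLATE.format(question="what is AI ?", context=format_docs(retrieved))
print(prompt_text)
```

With real documents, the same function could be wired into the chain by using `retriever | format_docs` as the value for the `context` key, which keeps the prompt shorter and free of metadata.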

# LangChain Globals
* https://api.python.langchain.com/en/latest/core/globals.html
* The `set_verbose` and `set_debug` globals turn on detailed logging, so the inputs and outputs of each stage of the RAG chain are printed as it runs.

In [11]:
from langchain_core.globals import set_verbose, set_debug
set_verbose(True)
set_debug(True)
rag_chain.invoke("what is AI ?")

[32;1m[1;3m[chain/start][0m [1m[chain:RunnableSequence] Entering Chain run with input:
[0m{
  "input": "what is AI ?"
}
[32;1m[1;3m[chain/start][0m [1m[chain:RunnableSequence > chain:RunnableParallel<context,question>] Entering Chain run with input:
[0m{
  "input": "what is AI ?"
}
[32;1m[1;3m[chain/start][0m [1m[chain:RunnableSequence > chain:RunnableParallel<context,question> > chain:RunnablePassthrough] Entering Chain run with input:
[0m{
  "input": "what is AI ?"
}
[36;1m[1;3m[chain/end][0m [1m[chain:RunnableSequence > chain:RunnableParallel<context,question> > chain:RunnablePassthrough] [0ms] Exiting Chain run with output:
[0m{
  "output": "what is AI ?"
}
[36;1m[1;3m[chain/end][0m [1m[chain:RunnableSequence > chain:RunnableParallel<context,question>] [366ms] Exiting Chain run with output:
[0m[outputs]
[32;1m[1;3m[chain/start][0m [1m[chain:RunnableSequence > prompt:ChatPromptTemplate] Entering Prompt run with input:
[0m[inputs]
[36;1m[1;3m[chain/end

'Artificial intelligence (AI) is intelligence exhibited by machines, particularly computer systems. It is a field of research in computer science focused on developing methods and software that allow machines to perceive their environment and utilize learning and intelligence to take actions aimed at maximizing their chances of achieving defined goals. Such machines are referred to as AIs.'