# Simple RAG Implementation Using LangChain

<img align="left" width="450" alt="RagChart" src="https://miro.medium.com/v2/resize:fit:1400/format:webp/1*EF7wzF_mTbT-CK0OwEGzyQ.png" style="margin: 0px 15px 0px 0px;"/>

Retrieval-Augmented Generation (RAG) is a technique for providing LLMs with additional information from an external knowledge source. This allows them to generate more accurate, context-aware answers and reduces hallucinations.

In [1]:
import requests

import weaviate
from langchain.vectorstores import Weaviate
from langchain.chat_models import ChatOpenAI
from weaviate.embedded import EmbeddedOptions
from langchain.prompts import ChatPromptTemplate
from langchain.document_loaders import TextLoader
from langchain.embeddings import OpenAIEmbeddings
from langchain.schema.runnable import RunnablePassthrough
from langchain.text_splitter import CharacterTextSplitter
from langchain.schema.output_parser import StrOutputParser



The data comes from President Biden's 2022 State of the Union address. The raw text document is available in [LangChain's GitHub](https://raw.githubusercontent.com/langchain-ai/langchain/master/docs/docs/modules/state_of_the_union.txt) repository.

In [2]:
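import os

url = "https://raw.githubusercontent.com/langchain-ai/langchain/master/docs/docs/modules/state_of_the_union.txt"

# Download the speech once; skip the request if the file is already on disk
if not os.path.exists("state_of_the_union.txt"):
    res = requests.get(url)
    res.raise_for_status()
    with open("state_of_the_union.txt", "w") as f:
        f.write(res.text)

# Load the text file as a list of LangChain documents
loader = TextLoader('./state_of_the_union.txt')
documents = loader.load()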

Next, chunk the document, because in its original state it is too long to fit into the LLM's context window.

In [3]:
text_splitter = CharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = text_splitter.split_documents(documents)
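To make the roles of `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of fixed-size chunking with overlap in plain Python. It is a simplified sliding window, not LangChain's actual algorithm (`CharacterTextSplitter` first splits on a separator and then merges the pieces), and the helper name `chunk_text` is hypothetical.

```python
def chunk_text(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Simplified illustration only; this is not how CharacterTextSplitter works internally.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    pieces = []
    for start in range(0, len(text), step):
        pieces.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return pieces


sample = "abcdefghijklmnopqrstuvwxyz"
toy_chunks = chunk_text(sample, chunk_size=10, chunk_overlap=3)
print(toy_chunks)
# ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

Each chunk repeats the last three characters of the previous one, so a sentence that straddles a boundary still appears intact in at least one chunk when the overlap is large enough.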

Embed and store the chunks to enable semantic search across them.

In [4]:
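# Start an embedded Weaviate instance that runs locally alongside the notebook
client = weaviate.Client(
    embedded_options=EmbeddedOptions()
)

# Embed each chunk with OpenAI embeddings and store the vectors in Weaviate
vectorstore = Weaviate.from_documents(
    client=client,
    documents=chunks,
    embedding=OpenAIEmbeddings(),
    by_text=False
)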


In [5]:
retriever = vectorstore.as_retriever()
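Conceptually, a retriever embeds the query and returns the stored chunks whose vectors are closest to it. The sketch below shows that idea with cosine similarity in plain Python. The three-dimensional vectors and the texts are made up for illustration (real embeddings come from a model and have many more dimensions), and `retrieve` is a hypothetical helper, not part of LangChain or Weaviate.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve(query_vector, store, k=2):
    """Return the k texts whose vectors are most similar to query_vector."""
    ranked = sorted(
        store,
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


# Hypothetical (text, vector) pairs standing in for embedded chunks
toy_store = [
    ("chunk about the Supreme Court", [0.9, 0.1, 0.0]),
    ("chunk about fuel prices", [0.0, 0.2, 0.9]),
    ("chunk about a judicial nominee", [0.8, 0.3, 0.1]),
]
toy_query_vector = [1.0, 0.2, 0.0]  # pretend embedding of the user's question

top_chunks = retrieve(toy_query_vector, toy_store, k=2)
print(top_chunks)
# ['chunk about the Supreme Court', 'chunk about a judicial nominee']
```

A vector database performs the same ranking with an approximate nearest-neighbor index, so it does not have to compare the query against every stored vector.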

Augment the prompt with the additional context.

In [10]:
template = """You are an assistant for question-answering tasks. 
Use the following pieces of retrieved context to answer the question. 
If you don't know the answer, just say that you don't know. 
Use three sentences maximum and keep the answer concise.
Question: {question} 
Context: {context} 
Answer:
"""
prompt = ChatPromptTemplate.from_template(template)

print(prompt)

input_variables=['context', 'question'] messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'question'], template="You are an assistant for question-answering tasks. \nUse the following pieces of retrieved context to answer the question. \nIf you don't know the answer, just say that you don't know. \nUse three sentences maximum and keep the answer concise.\nQuestion: {question} \nContext: {context} \nAnswer:\n"))]
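What "augmenting the prompt" amounts to can be shown with ordinary string formatting: the placeholders `{question}` and `{context}` are replaced with the user's question and the retrieved text. This is a rough sketch using Python's `str.format` rather than `ChatPromptTemplate`. The shortened template, the placeholder chunk strings, and the choice to join chunks with blank lines are all assumptions made for illustration.

```python
# Shortened stand-in for the notebook's template (same two placeholders)
toy_template = (
    "Use the following pieces of retrieved context to answer the question.\n"
    "Question: {question}\n"
    "Context: {context}\n"
    "Answer:"
)

# Placeholder strings standing in for chunks returned by a retriever
retrieved_chunks = ["First retrieved chunk.", "Second retrieved chunk."]

filled_prompt = toy_template.format(
    question="What did the president say about Justice Breyer",
    context="\n\n".join(retrieved_chunks),  # illustrative way to combine chunks
)
print(filled_prompt)
```

The model only ever sees the filled-in string, so whatever ends up in `{context}` is the entire external knowledge it can draw on for that question.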


Let's generate!

In [7]:
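llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)

# Build the chain: the retriever supplies the context, the raw query passes
# through as the question, then the prompt, the model, and the parser run in order
rag_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()  # convert the chat message into a plain string
)

query = "What did the president say about Justice Breyer"
rag_chain.invoke(query)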



"The president honored Justice Breyer for his service and dedication to the country. He mentioned nominating Judge Ketanji Brown Jackson to continue Justice Breyer's legacy of excellence. The president did not provide specific comments about Justice Breyer's work or character."
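The `|` syntax used to build `rag_chain` pipes the output of one step into the next. The toy class below sketches how such composition can be built with Python's `__or__` operator. It is not LangChain's implementation: the `Step` class and every `fake_*` component are hypothetical stand-ins that return canned values and make no retrieval or API calls.

```python
class Step:
    """Wrap a function so that steps can be chained with the | operator."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # Composing two steps yields a new step that runs them in sequence
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value):
        return self.func(value)


# Hypothetical stand-ins for the retriever, prompt, model, and output parser
fake_retrieve = Step(lambda question: {"context": "stub context", "question": question})
fake_prompt = Step(lambda d: f"Question: {d['question']}\nContext: {d['context']}\nAnswer:")
fake_llm = Step(lambda prompt_text: {"content": "stub answer"})
fake_parser = Step(lambda message: message["content"])

toy_chain = fake_retrieve | fake_prompt | fake_llm | fake_parser
toy_result = toy_chain.invoke("What did the president say about Justice Breyer")
print(toy_result)
# stub answer
```

Because each step only depends on the output of the previous one, any component (for example the model or the retriever) can be swapped without touching the rest of the chain.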