# Exploring Simple Local RAG

This is a brief exploration, using LangChain, LangSmith, and AWS Bedrock, of how to build a basic RAG (retrieval-augmented generation) pipeline on top of a locally running ChromaDB instance loaded from a small directory of Markdown documents.

LangSmith is not required for the implementation, but it's useful for tracing what is going on inside the application.

### Prerequisites


In [2]:
%pip install -r requirements.txt --quiet 

Note: you may need to restart the kernel to use updated packages.


### Authenticate to AWS and LangSmith and Load Environment Variables
Follow the instructions in the `README.md`.

In [2]:
# Load Environment Variables
from dotenv import load_dotenv, find_dotenv
import os

load_dotenv(find_dotenv())  # load the environment variables from .env

True
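
Before moving on, it can help to confirm that the variables you expect were actually loaded. The check below is a minimal sketch: `missing_env_vars` is a helper written for this illustration, and the names in `REQUIRED_VARS` are assumptions about what your `.env` contains (only `AWS_REGION` appears in this notebook, where it has a default), so adjust the list to match the `README.md`.

```python
import os

# Assumed variable names for illustration; edit to match your own .env file.
REQUIRED_VARS = ["AWS_REGION", "LANGCHAIN_API_KEY"]

def missing_env_vars(names, env=None):
    """Return the names that are unset or empty in the given environment mapping."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]

missing = missing_env_vars(REQUIRED_VARS)
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
else:
    print("All expected environment variables are set.")
```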

### AWS Bedrock
Create a Bedrock runtime client, which the Bedrock classes in LangChain will use.

In [3]:
import os
import boto3

# Get the region from the environment, defaulting to us-east-1
region = os.environ.get("AWS_REGION", "us-east-1")

# Create a Bedrock client
bedrock_client = boto3.client(service_name='bedrock-runtime', region_name=region)

### LangSmith

In [4]:
# Initialize the LangSmith client
from langsmith import Client

client = Client()

### Load Documents 

In the real world, loading data and vectorizing it for use in a vector store is usually done in a separate ETL (extract, transform, and load) process.

In [5]:
from langchain_community.document_loaders import DirectoryLoader

loader = DirectoryLoader('./docs', glob="**/*.md", show_progress=True)
data = loader.load()


  0%|          | 0/10 [00:00<?, ?it/s]

100%|██████████| 10/10 [00:00<00:00, 15.01it/s]


With the docs loaded, we then need to prepare the data by splitting the text into smaller chunks using the "recursive split" method, which splits the data into chunks based on a character count and on separator characters such as line breaks.
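
To make the idea concrete, here is a simplified sketch of recursive splitting in plain Python. It is not LangChain's implementation: `recursive_split` is a name made up for this illustration, the separator order is an assumption, and chunk overlap is left out entirely. It tries the coarsest separator first and only falls back to finer ones for pieces that are still too long.

```python
def recursive_split(text, chunk_size, separators=("\n\n", "\n", " ", "")):
    """Split text into chunks of at most chunk_size characters (no overlap)."""
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators or separators[0] == "":
        # Last resort: hard cut every chunk_size characters.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    sep, finer = separators[0], separators[1:]
    chunks, current = [], ""
    for piece in text.split(sep):
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            current = candidate  # keep merging small pieces into one chunk
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(piece) <= chunk_size:
            current = piece
        else:
            # The piece alone is too long: retry it with the finer separators.
            chunks.extend(recursive_split(piece, chunk_size, finer))
    if current:
        chunks.append(current)
    return chunks

sample = "aaaa bbbb\n\ncccc dddd eeee"
print(recursive_split(sample, chunk_size=10))
```

The first paragraph fits in one chunk, while the second is too long and gets split again on spaces.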

In [6]:
from langchain.text_splitter import RecursiveCharacterTextSplitter
text_splitter = RecursiveCharacterTextSplitter()
# text_splitter = RecursiveCharacterTextSplitter(chunk_size = 50, chunk_overlap=0)
all_splits = text_splitter.split_documents(data)

print(f"Total number of splits: {len(all_splits)}")

# Iterate over the first 10 splits and print each one to check that the chunking works as expected
for split in all_splits[:10]:
    print(split)


Total number of splits: 10
page_content="Isabel Anderson\n\nI'm a results-focused professional with a commitment to delivering excellence, who likes to learn about artificial intelligence. I love coding and continuously improving my skills.\nTechnologies: Kotlin, Android Development, Jetpack" metadata={'source': 'docs/isabel.md'}
page_content="Eve Davis\n\nI'm a user-experience driven designer with a strong coding background, who likes to build personal projects. I love coding and continuously improving my skills.\nTechnologies: Ruby, Rails, Sinatra" metadata={'source': 'docs/eve.md'}
page_content="Frank Wilson\n\nI'm a data-driven analyst with extensive experience in software development, who likes to stay up-to-date with industry trends. I love coding and continuously improving my skills.\nTechnologies: PHP, Laravel, Symfony" metadata={'source': 'docs/frank.md'}
page_content="Dave Brown\n\nI'm a full-stack developer with a knack for learning new technologies, who likes to read tech b

### Create the embeddings from the documents with Bedrock
Creating vector embeddings from documents means converting text into numerical vectors so that it can be processed and compared by machine learning models.

The same embedding model is used both to embed the document chunks stored in the vector store and to embed the user's question.

LangChain has built-in support for many embedding models; for the Bedrock integration, see https://python.langchain.com/docs/integrations/text_embedding/bedrock
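
The sketch below illustrates why the question and the documents must go through the same embedding function: only then do their vectors live in the same space and become comparable. Everything here is a toy stand-in, not Bedrock: `toy_embed` simply counts words from a made-up vocabulary, and the document strings are invented for the example.

```python
import math
from collections import Counter

def toy_embed(text, vocabulary):
    """Toy stand-in for an embedding model: count each vocabulary word in the text."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in vocabulary]

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

vocabulary = ["ruby", "rails", "swift", "ios", "developer"]
documents = {
    "eve": "ruby rails developer",
    "henry": "swift ios developer",
}
question = "who is a ruby developer"

# The question is embedded with the same function as the documents.
question_vector = toy_embed(question, vocabulary)
scores = {
    name: cosine_similarity(question_vector, toy_embed(text, vocabulary))
    for name, text in documents.items()
}
best_match = max(scores, key=scores.get)
print(best_match, scores)
```

A real embedding model captures meaning rather than exact word matches, but the comparison step works the same way.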



In [7]:
# Initialize the BedrockEmbeddings Model
from langchain_community.embeddings import BedrockEmbeddings

embedding_model_id = "amazon.titan-embed-text-v1"

embedding_model = BedrockEmbeddings(
    client=bedrock_client,
    model_id=embedding_model_id
)

#### Initializing ChromaDB and generating the embeddings
This ChromaDB instance exists only in memory, so the embeddings are lost when the kernel stops.

In [8]:
# Create Embeddings
from langchain_community.vectorstores import Chroma

vectorstore = Chroma.from_documents(
    documents=all_splits, embedding=embedding_model)



### Load your local ChromaDB vector store as a Retriever
In a real application, this is where we would connect to an existing datastore and expose it as a retriever.

In [9]:
# Create a retriever that uses maximal marginal relevance (MMR) and returns the top 2 chunks
retriever = vectorstore.as_retriever(search_type="mmr", search_kwargs={"k": 2})

# Test the retriever with a sample query
# retriever.get_relevant_documents("which person has the most experience with JavaScript and has worked with data?")
retriever.get_relevant_documents("I'm looking for a full-stack developer that has worked for atleast 5 years?")

Number of requested results 20 is greater than number of elements in index 10, updating n_results = 10


[Document(page_content="Dave Brown\n\nI'm a full-stack developer with a knack for learning new technologies, who likes to read tech blogs. I love coding and continuously improving my skills. I have worked for 5 years so I'm pretty senior.\nTechnologies: C#, .NET Core, Entity Framework", metadata={'source': 'docs/dave.md'}),
 Document(page_content="Henry Taylor\n\nI'm a versatile software engineer who have worked for over 5 years and enjoys collaborative projects, who likes to practice competitive programming. I love coding and continuously improving my skills.\nTechnologies: Swift, iOS Development, Xcode", metadata={'source': 'docs/henry.md'})]
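
The retriever above uses `search_type="mmr"` (maximal marginal relevance), which trades off relevance to the query against redundancy among the chunks already picked. The warning in the output most likely refers to the larger candidate pool that MMR fetches before choosing the final `k` results. The following is a minimal sketch of the selection step on hand-written 2D vectors; `mmr_select` and its `lambda_mult` weighting are written for this illustration and are not the library's code.

```python
import math

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def mmr_select(query_vec, doc_vecs, k=2, lambda_mult=0.5):
    """Greedily pick up to k document indices, balancing relevance and diversity."""
    selected = []
    candidates = list(range(len(doc_vecs)))
    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(query_vec, doc_vecs[i])
            # Similarity to the closest already-selected document (0 if none yet).
            redundancy = max((cosine(doc_vecs[i], doc_vecs[j]) for j in selected), default=0.0)
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected

query = [1.0, 0.0]
docs = [
    [0.98, 0.20],   # most relevant
    [0.97, 0.25],   # nearly a duplicate of the first
    [0.80, -0.60],  # less relevant, but different
]
print(mmr_select(query, docs, k=2))                   # balanced: skips the near-duplicate
print(mmr_select(query, docs, k=2, lambda_mult=1.0))  # relevance only
```

With `lambda_mult=1.0` the redundancy term disappears and the selection reduces to a plain similarity ranking.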

# RAG

In [33]:
# Pulling in a forked community prompt
from langchain import hub
prompt = hub.pull("mrkmod/rag-prompt")

from langchain_community.llms import Bedrock
from langchain.chains import RetrievalQA

# experiment with different models if you like
# llm_model_id = "anthropic.claude-instant-v1"
llm_model_id = "anthropic.claude-v2"

llm = Bedrock(
    client=bedrock_client, model_id=llm_model_id
)

qa_chain = RetrievalQA.from_chain_type(
    llm,
    retriever=retriever,
    chain_type_kwargs={"prompt": prompt},
    return_source_documents=True
)
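
Conceptually, a retrieval QA chain of this kind does three things: it retrieves chunks for the question, places them into the prompt as context, and sends the filled-in prompt to the model. The sketch below mimics that flow in plain Python under stated assumptions: `fake_retrieve` and `fake_generate` are stand-ins for the retriever and the LLM, `TEMPLATE` is a hypothetical prompt (the real one is pulled from the hub), and none of these names are LangChain APIs.

```python
# Hypothetical prompt template with the two placeholders a RAG prompt typically needs.
TEMPLATE = "Answer using only this context:\n{context}\n\nQuestion: {question}\nAnswer:"

def build_prompt(question, documents, template=TEMPLATE):
    """Join the retrieved chunks and fill them into the prompt template."""
    context = "\n\n".join(documents)
    return template.format(context=context, question=question)

def answer(question, retrieve, generate, template=TEMPLATE):
    """Retrieve chunks, build the prompt, call the model, and keep the sources."""
    documents = retrieve(question)
    prompt = build_prompt(question, documents, template)
    return {"query": question, "result": generate(prompt), "source_documents": documents}

def fake_retrieve(question):
    # Stand-in for the retriever: always returns the same two invented chunks.
    return ["Dave: full-stack developer, 5 years.", "Henry: software engineer, over 5 years."]

def fake_generate(prompt):
    # Stand-in for the LLM call: reports the prompt size instead of generating text.
    return f"(model output for a prompt of {len(prompt)} characters)"

response = answer("Who is a full-stack developer?", fake_retrieve, fake_generate)
print(response["result"])
print(response["source_documents"])
```

Returning the source documents alongside the answer, as `return_source_documents=True` does in the cell above, makes it possible to check which chunks the answer was based on.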

In [None]:
# Run the Retrieval Chain
# print(prompt.messages[0].prompt.template)
question = "I'm looking for a full-stack developer that has worked for atleast 5 years, which person is the best candidate?"

result = qa_chain.invoke({"query": question})

# The chain returns a dict; the answer is stored under the "result" key
print(result["result"])
