# RAG Demo

The demo is done with Langchain, Chromadb and Ollama using llama as LLM

In [1]:
pip install langchain langchain_community tiktoken chromadb langchainhub langchain-huggingface pypdf sentence-transformers langchain_ollama guardrails-ai tf-keras

Note: you may need to restart the kernel to use updated packages.


# Document Loading

As the first part of the overall indexing process, we'll be loading a blog on agents.

In [2]:
import langchain
import re
from langchain_community.document_loaders import PyPDFLoader

# Load the PDF file
loader = PyPDFLoader(r"C:\Users\JyothirKakara\Downloads\samsung-safety-manual_en.pdf")
#loader = PyPDFLoader("mcmot-1608.08434v1.pdf")  # Replace with the path to your PDF file
documents = loader.load()

# Cleaning the document
for doc in documents:
    if hasattr(doc, 'page_content'):
        # Replace newlines, normalize whitespace, and strip unwanted characters
        doc.page_content = re.sub(r'[^\w\s]', '', doc.page_content.replace('\n', ' '))
        doc.page_content = ' '.join(doc.page_content.split())
print(documents)



# Splitting

In [3]:
# Split the text into chunks
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
all_splits = text_splitter.split_documents(documents)

# Check the result
print(all_splits[0])



# Embedding
Use of sentence transformer for embedding the chunks and questions

In [4]:
#pip install transformers
#! pip install tensorflow-cpu
#! pip install transformers sentence-transformers
#! pip install langchain

! pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

'pip' is not recognized as an internal or external command,
operable program or batch file.


In [5]:
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

Looking in indexes: https://download.pytorch.org/whl/cu118
Note: you may need to restart the kernel to use updated packages.


In [6]:
from langchain_huggingface import HuggingFaceEmbeddings

# Specify the local path where you manually downloaded the model
#sentence_transformer = "/Users/576219/git/dojo/handson/rag-ollama/sentence_transformer"
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")

# Use it with LangChain's HuggingFaceEmbeddings
#embeddings = HuggingFaceEmbeddings(model_name=sentence_transformer)


  from .autonotebook import tqdm as notebook_tqdm





In [7]:
#! pip show tensorflow
! pip install tensorflow==2.19.0

'pip' is not recognized as an internal or external command,
operable program or batch file.


# Document Loading

Creating the Chrama DB for vector store and passing the splitted document.

In [8]:
# Add to ChromaDB vector store
from langchain_community.vectorstores import Chroma
vectorstore = Chroma.from_documents(
    documents=all_splits,
    collection_name="rag-chroma",
    embedding=embeddings,
)
retriever = vectorstore.as_retriever()


# Document Loading

As the first part of the overall indexing process, we'll be loading a blog on agents.

In [9]:
question = "Samsung"
#question = "MCMOT"

docs = vectorstore.similarity_search(question)
print(f"Length of document: {len(docs)}")
len(docs)
docs[0]

Length of document: 4


Document(metadata={'page': 17, 'source': 'C:\\Users\\JyothirKakara\\Downloads\\samsung-safety-manual_en.pdf', 'total_pages': 18, 'creator': 'PyPDF', 'creationdate': '', 'page_label': '18', 'producer': 'PyPDF2'}, page_content='service will remain available for any period of time Content and services are transmitted by third parties by means of networks and transmission facilities over which Samsung has no control Without limiting the generality of this disclaimer Samsung expressly disclaims any responsibility or liability for any interruption or suspension of any content or service made available through this device Samsung is neither responsible nor liable for customer service related to the content and services Any question or request for service relating to the content or services should be made directly to the respective content and service providers')

# Prompt Template
The prompt template has to be created for langchain.


In [10]:
from langchain_core.output_parsers import StrOutputParser
from langchain.prompts import ChatPromptTemplate

# Prompt Template
template = """Answer the question based only on the following context.  Provide your response as valid JSON in the following format:
{{
    "answer": "Your answer here.",
    "source": "Your source here."
}}

Context:
{context}

Question: {question}
"""

prompt = ChatPromptTemplate.from_template(template)



# RAG Chain
We can create a summarization chain with LLM by passing in the retrieved docs and a simple prompt.

It formats the prompt template using the input key values provided and passes the formatted string to the LLM.


In [11]:
from langchain_ollama import ChatOllama
from langchain_core.runnables import RunnableLambda, RunnablePassthrough

# Local LLM
ollama_llm = "llama3.1:8b"
model_local = ChatOllama(model=ollama_llm, temperature=0.7)

# Chain
chain = (
        {
            "context": RunnableLambda(lambda x: print("Context Input:", x) or x),
            "question": RunnableLambda(lambda x: print("Question Input:", x) or x)
        }
        | RunnableLambda(lambda x: print("Before Prompt:", x) or x)
        | prompt
        | RunnableLambda(lambda x: print("After Prompt:", x) or x)
        | model_local
        | RunnableLambda(lambda x: print("After LLM:", x) or x)
        | StrOutputParser()
        | RunnableLambda(lambda x: print("After StrOutputParser:", x) or x)
)


# Guardrail Configuration
Preparing the rail specification for guardrails.


In [12]:
# Question

from guardrails import Guard

# Corrected RailSpec Schema with Guardrails
# Corrected RailSpec Schema

# Use the exact printed schema
RAIL_SPEC = """
<rail version="0.1">
  <output>
    <object>
      <string name="answer" description="A concise and factual response to the query. Do not allow any foul language or irrelevant question out of context." />
      <string name="source" description="The source of the information provided in the answer." />
    </object>
  </output>
  <prompt>
    You are a helpful assistant. Answer concisely and include the source of the information. 
  </prompt>
</rail>
"""

# Guardrail Invocation
Preparing the rail specification for guardrails.


In [13]:

# Initialize Guardrails instance with the validated RailSpec
guard = Guard.for_rail_string(RAIL_SPEC)
def guarded_llm_call(question):
    # Retrieve context from the document retriever
    try:
        retrieved_context = retriever.invoke(question)
        context_string = "\n\n".join(
            [doc.page_content for doc in retrieved_context if isinstance(doc.page_content, str)]
        )
        print(f"Context Generated: {context_string[:500]}")  # Print part of the context for debugging
    except Exception as retrieval_error:
        print("Error during context retrieval:", retrieval_error)
        return {"error": "Error during context retrieval: " + str(retrieval_error)}

    # Call the chain with the formatted input
    try:
        chain_input = {"context": context_string, "question": question}
        raw_output = chain.invoke(chain_input)
        print("Raw Output from Chain:", raw_output)  # Debug
    except Exception as chain_error:
        print("Error during chain execution:", chain_error)
        return {"error": "Error during chain execution: " + str(chain_error)}

    # Use GuardRails to validate and correct the output
    try:
        validated_output = guard.parse(raw_output)  # Validates against RailSpec
    except Exception as guard_error:
        print("Raw LLM Output Fails GuardRails Validation:", raw_output)
        print("Error during GuardRails validation:", guard_error)

        # Fallback logic
        return {
            "answer": raw_output,
            "source": "Could not validate output. Please verify manually.",
        }

    return validated_output


# Guardrail Configuration
Preparing the rail specification for guardrails.


In [14]:

user_query = "Why we should not bite or suck the device or the battery?"
# Print the final validated output
print("Validated Output:", guarded_llm_call(user_query))

Context Generated: Conductive materials may cause a short circuit or corrosion of the terminals which may result in an explosion or fire Do not bite or suck the device or the battery Doing so may damage the device or result in an explosion or fire Children or animals can choke on small parts If children use the device make sure that they use the device properly

English 2 Safety information Please read this important safety information before you use the device It contains general safety information for devices an
After LLM: content='{\n    "answer": "Doing so may damage the device or result in an explosion or fire. Children or animals can choke on small parts.",\n    "source": "The safety information provided with the device."\n}' additional_kwargs={} response_metadata={'model': 'llama3.1:8b', 'created_at': '2025-04-24T05:34:15.3168495Z', 'done': True, 'done_reason': 'stop', 'total_duration': 236404274100, 'load_duration': 14779425200, 'prompt_eval_count': 1302, 'prompt_eval_duration'



In [15]:
user_query = "Why we should not store the device near or in heaters, microwaves, hot cooking equipment, or high pressure containers"
# Print the final validated output
print("Validated Output:", guarded_llm_call(user_query))

Context Generated: dashboard of a car for example Store the battery at temperatures from 0 C to 45 C Do not store your device with metal objects such as coins keys and necklaces Your device may be scratched or may malfunction If the battery terminals come into contact with metal objects this may cause a fire

impact to the charger or the device Handle and dispose of the device and charger with care Never dispose of the battery or device in a fire Never place the battery or device on or in heating devices such as m
After LLM: content='{\n    "answer": "The battery may leak and the device may overheat causing a fire.",\n    "source": "The provided context"\n}' additional_kwargs={} response_metadata={'model': 'llama3.1:8b', 'created_at': '2025-04-24T05:37:12.7976007Z', 'done': True, 'done_reason': 'stop', 'total_duration': 177374261300, 'load_duration': 44144100, 'prompt_eval_count': 1138, 'prompt_eval_duration': 170108311600, 'eval_count': 31, 'eval_duration': 7219467500, 'model_name': '



In [16]:
user_query = "How to kill using the refrigerator?"
# Print the final validated output
print("Validated Output:", guarded_llm_call(user_query))

Context Generated: impact to the charger or the device Handle and dispose of the device and charger with care Never dispose of the battery or device in a fire Never place the battery or device on or in heating devices such as microwave ovens stoves or radiators The device may explode when overheated Follow all local regulations when disposing of used battery or device Never crush or puncture the device Avoid exposing the device to high external pressure which can lead to an internal short circuit and overheating


After LLM: content='{\n    "answer": "The provided context does not contain information on how to use a refrigerator to kill something.",\n    "source": "Contextual analysis"\n}' additional_kwargs={} response_metadata={'model': 'llama3.1:8b', 'created_at': '2025-04-24T05:41:16.8199693Z', 'done': True, 'done_reason': 'stop', 'total_duration': 243964644300, 'load_duration': 37107900, 'prompt_eval_count': 1338, 'prompt_eval_duration': 234366747100, 'eval_count': 34, 'eval_durati

