#### Install dependencies

In [1]:
%pip install langchain_community
%pip install langchain_experimental
%pip install langchain-openai
%pip install langchainhub
%pip install chromadb
%pip install langchain
%pip install langchain-ollama
%pip install beautifulsoup4

Note: you may need to restart the kernel to use updated packages.


In [2]:
# Set the USER_AGENT environment variable to avoid warnings.
# This has no material impact on your code; it only prevents warnings related to new LangChain features from appearing.
import os
os.environ['USER_AGENT'] = 'RAGUserAgent'

In [3]:
import openai
import bs4
import chromadb

from langchain_community.document_loaders import WebBaseLoader
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_ollama import OllamaEmbeddings, ChatOllama
from langchain import hub
from langchain_core.documents.base import Document
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_community.vectorstores import Chroma
from langchain_experimental.text_splitter import SemanticChunker
from langchain_core.vectorstores.base import VectorStoreRetriever

from modules import utils

from typing import List

In [4]:
envs = utils.load_env_file("./../secrets/env")
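The `utils.load_env_file` helper comes from a local `modules` package that is not shown in this notebook. A minimal sketch of what such a helper might do, assuming the file holds plain `KEY=VALUE` lines (the function name `parse_env_text` and the sample values are hypothetical):

```python
from typing import Dict


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines into a dict, skipping blank lines and # comments."""
    envs: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may contain "=" themselves.
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes from the value.
        envs[key.strip()] = value.strip().strip("'\"")
    return envs


# Hypothetical file contents; only OLLAMA_HOST is actually read by this notebook.
sample = """
# local services
OLLAMA_HOST=http://localhost:11434
OPENAI_API_KEY="sk-placeholder"
"""
envs_example = parse_env_text(sample)
print(envs_example["OLLAMA_HOST"])  # http://localhost:11434
```

The real helper may behave differently, for example by also exporting the values into `os.environ`.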

# Indexing
## Web loading and crawling

In [5]:
# Load Documents
loader = WebBaseLoader(
    web_paths=("https://kbourne.github.io/chapter1.html",), 
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(
            class_=("post-content", "post-title", "post-header")
        )
    ),
)
docs: List[Document] = loader.load()
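The `parse_only` filter above keeps only the elements whose CSS class is in the given tuple, so navigation and footer text never reach the loader. The sketch below approximates that idea with the standard library's `HTMLParser` rather than BeautifulSoup; the class `ClassFilterParser` and the sample HTML are illustrative assumptions, not the loader's implementation.

```python
from html.parser import HTMLParser
from typing import List, Set

# Void elements have no closing tag, so they must not change the nesting depth.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class ClassFilterParser(HTMLParser):
    """Collect text found inside elements that carry one of the wanted classes."""

    def __init__(self, wanted: Set[str]) -> None:
        super().__init__()
        self.wanted = wanted
        self.depth = 0  # greater than 0 while inside a wanted element
        self.chunks: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = set((dict(attrs).get("class") or "").split())
        # Enter a wanted element, or go one level deeper inside one.
        if self.depth > 0 or classes & self.wanted:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0 and data.strip():
            self.chunks.append(data.strip())


html = """
<html><body>
<nav class="site-nav">Home | About</nav>
<h1 class="post-title">Intro to RAG</h1>
<div class="post-content"><p>RAG combines retrieval with generation.</p></div>
<footer>Copyright</footer>
</body></html>
"""
parser = ClassFilterParser({"post-content", "post-title", "post-header"})
parser.feed(html)
print(parser.chunks)  # ['Intro to RAG', 'RAG combines retrieval with generation.']
```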

In [6]:
print(f"The number of docs loaded is {len(docs)}")

The number of docs loaded is 1


In [7]:
for doc in docs:
    print(doc.page_content)



      Introduction to Retrieval Augmented Generation (RAG)
    
Date: March 10, 2024  |  Estimated Reading Time: 15 min  |  Author: Keith Bourne

  In the rapidly evolving field of artificial intelligence, Retrieval-Augmented Generation (RAG) is emerging as a significant addition to the Generative AI toolkit. RAG harnesses the strengths of Large Language Models (LLMs) and integrates them with internal data, offering a method to enhance organizational operations significantly. This book delves into the essential aspects of RAG, examining its role in augmenting the capabilities of LLMs and leveraging internal corporate data for strategic advantage.
As it progresses, the book outlines the potential of RAG in business, suggesting how it can make AI applications smarter, more responsive, and aligned with organizational objectives. RAG is positioned as a key facilitator of customized, efficient, and insightful AI solutions, bridging the gap between Generative AI's potential and specific bu

## Splitting

In [8]:
embedding_model = "llama3.1:8b"
base_url = envs["OLLAMA_HOST"]  # for example, "http://localhost:11434"

# Split
text_splitter: SemanticChunker = SemanticChunker(OllamaEmbeddings(base_url=base_url, model=embedding_model))
splits: List[Document] = text_splitter.split_documents(docs)

In [9]:
print(f"The number of splits is {len(splits)}")

The number of splits is 1
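Semantic chunking groups adjacent sentences and starts a new chunk where the embedding distance between neighbours jumps. The sketch below shows only that core idea, under loud assumptions: a bag-of-words `toy_embed` stands in for the Ollama embedding model, and a fixed `threshold` replaces whatever breakpoint rule `SemanticChunker` applies internally. It is not the library's implementation.

```python
import math
import re
from collections import Counter
from typing import List


def toy_embed(sentence: str) -> Counter:
    """Bag-of-words stand-in for a real embedding model (assumption)."""
    return Counter(re.findall(r"[a-z]+", sentence.lower()))


def cosine_distance(a: Counter, b: Counter) -> float:
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return 1.0 - (dot / norm if norm else 0.0)


def semantic_chunks(text: str, threshold: float) -> List[str]:
    """Start a new chunk wherever adjacent sentences are farther apart than threshold."""
    sentences = [s for s in re.split(r"(?<=[.?!])\s+", text.strip()) if s]
    if not sentences:
        return []
    vectors = [toy_embed(s) for s in sentences]
    chunks: List[str] = []
    current = [sentences[0]]
    for prev_vec, next_vec, sentence in zip(vectors, vectors[1:], sentences[1:]):
        if cosine_distance(prev_vec, next_vec) > threshold:
            chunks.append(" ".join(current))
            current = []
        current.append(sentence)
    chunks.append(" ".join(current))
    return chunks


text = (
    "RAG retrieves documents. RAG retrieves relevant documents quickly. "
    "Bananas are yellow fruit. Yellow bananas are sweet fruit."
)
for chunk in semantic_chunks(text, threshold=0.5):
    print(chunk)
```

With this toy text the only large distance is at the topic change, so two chunks come out. If no adjacent pair exceeds the threshold, the whole text stays in a single chunk.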


In [10]:
for split in splits:
    print(split.page_content, end=f"\n\n\n{'='*300}\n\n\n")



      Introduction to Retrieval Augmented Generation (RAG)
    
Date: March 10, 2024  |  Estimated Reading Time: 15 min  |  Author: Keith Bourne

  In the rapidly evolving field of artificial intelligence, Retrieval-Augmented Generation (RAG) is emerging as a significant addition to the Generative AI toolkit. RAG harnesses the strengths of Large Language Models (LLMs) and integrates them with internal data, offering a method to enhance organizational operations significantly. This book delves into the essential aspects of RAG, examining its role in augmenting the capabilities of LLMs and leveraging internal corporate data for strategic advantage. As it progresses, the book outlines the potential of RAG in business, suggesting how it can make AI applications smarter, more responsive, and aligned with organizational objectives. RAG is positioned as a key facilitator of customized, efficient, and insightful AI solutions, bridging the gap between Generative AI's potential and specific bu

## Embedding

In [11]:
# Embed
vectorstore: Chroma = Chroma.from_documents(documents=splits, 
                                    embedding=OllamaEmbeddings(base_url=base_url, model=embedding_model))

retriever: VectorStoreRetriever = vectorstore.as_retriever()

In [12]:
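# Note: vectorstore.get() returns a dictionary, so len() here counts its keys,
# not the stored documents. The next cell counts the documents themselves.
print(f"Length of retriever is {len(vectorstore.get())}")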

Length of retriever is 7


In [13]:
len(vectorstore.get()["documents"])

1
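The two cells above report different numbers because `len()` on a dictionary counts its keys. A small illustration with a simplified stand-in dictionary (the real result carries more keys than shown here; this shape is an assumption for demonstration only):

```python
# Simplified stand-in for the dictionary returned by vectorstore.get().
result = {
    "ids": ["doc-0"],
    "documents": ["Introduction to Retrieval Augmented Generation (RAG) ..."],
    "metadatas": [{"source": "https://kbourne.github.io/chapter1.html"}],
}

print(len(result))               # 3: the number of keys in the dictionary
print(len(result["documents"]))  # 1: the number of stored documents
```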

## Retrieval

In [14]:
query = "How does RAG compare with fine-tuning?"
relevant_docs = retriever.invoke(query)

Number of requested results 4 is greater than number of elements in index 1, updating n_results = 1
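The message above appears because the retriever asks for 4 results while the index holds a single chunk. The sketch below shows the general pattern of similarity search with a capped `k`; the function names, the two-dimensional vectors, and the printed message are illustrative assumptions and do not reproduce Chroma's internals.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: Sequence[float],
          index: List[Tuple[str, Sequence[float]]],
          k: int = 4) -> List[str]:
    """Return the k most similar texts, capping k at the index size."""
    if k > len(index):
        print(f"Requested {k} results but the index holds {len(index)}; using k={len(index)}")
        k = len(index)
    ranked = sorted(index, key=lambda item: cosine_similarity(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


# Hypothetical index of (text, embedding) pairs.
index = [
    ("about rag", [1.0, 0.0]),
    ("about cooking", [0.0, 1.0]),
    ("rag and llms", [0.8, 0.6]),
]
print(top_k([1.0, 0.1], index, k=2))  # ['about rag', 'rag and llms']
print(top_k([1.0, 0.1], index[:1]))   # k is capped to 1: ['about rag']
```

With only one chunk in the index, every query returns that same chunk, whatever the question.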


In [15]:
for doc in relevant_docs:
    print(doc.page_content, end=f"\n\n\n{'='*300}\n\n\n")



      Introduction to Retrieval Augmented Generation (RAG)
    
Date: March 10, 2024  |  Estimated Reading Time: 15 min  |  Author: Keith Bourne

  In the rapidly evolving field of artificial intelligence, Retrieval-Augmented Generation (RAG) is emerging as a significant addition to the Generative AI toolkit. RAG harnesses the strengths of Large Language Models (LLMs) and integrates them with internal data, offering a method to enhance organizational operations significantly. This book delves into the essential aspects of RAG, examining its role in augmenting the capabilities of LLMs and leveraging internal corporate data for strategic advantage. As it progresses, the book outlines the potential of RAG in business, suggesting how it can make AI applications smarter, more responsive, and aligned with organizational objectives. RAG is positioned as a key facilitator of customized, efficient, and insightful AI solutions, bridging the gap between Generative AI's potential and specific bu

## Prepare the prompt template

In [16]:
# Prompt: ignore the LangSmith warning; you will not need LangSmith for this coding exercise
prompt = hub.pull("jclemens24/rag-prompt")



In [17]:
prompt

ChatPromptTemplate(input_variables=['context', 'question'], input_types={}, partial_variables={}, metadata={'lc_hub_owner': 'jclemens24', 'lc_hub_repo': 'rag-prompt', 'lc_hub_commit_hash': '1a1f3ccb9a5a92363310e3b130843dfb2540239366ebe712ddd94982acc06734'}, messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'question'], input_types={}, partial_variables={}, template="You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know.\nQuestion: {question} \nContext: {context} \nAnswer:"), additional_kwargs={})])
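The template has two placeholders, `question` and `context`. To see what the model will receive, the template string shown in the output above can be filled with plain `str.format`; the question and context values below are made up for illustration, and this bypasses `ChatPromptTemplate` entirely.

```python
# Template text copied from the prompt output above.
template = (
    "You are an assistant for question-answering tasks. Use the following pieces of "
    "retrieved context to answer the question. If you don't know the answer, just say "
    "that you don't know.\nQuestion: {question} \nContext: {context} \nAnswer:"
)

# Hypothetical values standing in for the user question and the retrieved text.
rendered = template.format(
    question="What are the advantages of using RAG?",
    context="RAG integrates LLMs with internal data.",
)
print(rendered)
```

Printing the rendered prompt is a quick way to check that both the question and the retrieved context arrive where the template expects them.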

In [18]:
# Post-processing
def format_docs(docs: List[Document]) -> str:
    return "\n\n".join(doc.page_content for doc in docs)
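A quick check of what `format_docs` produces, using a small stand-in class in place of LangChain's `Document` (the `FakeDocument` name is hypothetical; only the `page_content` attribute matters here):

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FakeDocument:
    """Stand-in for a LangChain Document; only page_content is used."""
    page_content: str


def format_docs(docs: List[FakeDocument]) -> str:
    # Join the text of each retrieved chunk with a blank line between chunks.
    return "\n\n".join(doc.page_content for doc in docs)


joined = format_docs([FakeDocument("first chunk"), FakeDocument("second chunk")])
print(repr(joined))  # 'first chunk\n\nsecond chunk'
```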

## Define the LLM model

In [19]:
llm = ChatOllama(base_url=base_url, 
                 model=embedding_model,  # same Ollama model name as used for the embeddings above
                 temperature=0)

## Using LCEL to set up the LangChain chain

In [20]:
# Chain it all together with LangChain
rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
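In the chain above, `|` feeds each step's output into the next, and the leading dictionary sends the same input to both of its branches. The toy below mimics that behaviour in plain Python to make the data flow visible; `Step`, `parallel`, and every `toy_` name are illustrative assumptions, not LangChain classes.

```python
from typing import Any, Callable, Dict


class Step:
    """Wrap a function so steps can be composed left to right with `|`."""

    def __init__(self, func: Callable[[Any], Any]) -> None:
        self.func = func

    def __or__(self, other: "Step") -> "Step":
        # Run this step first, then pass its output to the next step.
        return Step(lambda value: other.func(self.func(value)))

    def invoke(self, value: Any) -> Any:
        return self.func(value)


def parallel(branches: Dict[str, Step]) -> Step:
    """Send the same input to every branch and collect the results in a dict."""
    return Step(lambda value: {name: step.invoke(value) for name, step in branches.items()})


# Toy stand-ins for the retriever, formatter, prompt, and model.
toy_retriever = Step(lambda question: ["RAG adds retrieved context to prompts."])
toy_format_docs = Step(lambda docs: "\n\n".join(docs))
toy_passthrough = Step(lambda value: value)
toy_prompt = Step(lambda d: f"Question: {d['question']}\nContext: {d['context']}\nAnswer:")
toy_llm = Step(lambda text: text.upper())  # a real model call would go here

toy_chain = (
    parallel({"context": toy_retriever | toy_format_docs, "question": toy_passthrough})
    | toy_prompt
    | toy_llm
)
print(toy_chain.invoke("What is RAG?"))
```

The question string reaches the prompt twice: once unchanged through the pass-through branch, and once transformed into context by the retriever branch.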

## Submitting a question

In [21]:
rag_chain.invoke("What are the advantages of using RAG?")

Number of requested results 4 is greater than number of elements in index 1, updating n_results = 1


"It appears that there is no question to answer. The text provided seems to be an excerpt from a book or article discussing the concepts of Retrieval-Augmented Generation (RAG) and its comparison with other AI techniques such as fine-tuning large language models.\n\nHowever, I can summarize the main points discussed in the text:\n\n1. RAG is a technique that allows for dynamic integration of external knowledge without modifying the model's weights.\n2. It is superior for retrieving factual information that is not present in the LLM's training data or is private.\n3. Fine-tuning is more suitable for teaching the model specialized tasks or adapting it to a specific domain.\n4. The context window size of large language models can be a limitation, and expanding it can lead to issues such as loss of details.\n\nIf you have any specific questions related to this topic, I'll do my best to provide an answer!"

## Get the relevant docs

In [22]:
query = "What are the advantages of using RAG?"
relevant_docs = retriever.invoke(query)

Number of requested results 4 is greater than number of elements in index 1, updating n_results = 1


In [23]:
for doc in relevant_docs:
    print(doc.page_content, end=f"\n\n\n{'='*300}\n\n\n")



      Introduction to Retrieval Augmented Generation (RAG)
    
Date: March 10, 2024  |  Estimated Reading Time: 15 min  |  Author: Keith Bourne

  In the rapidly evolving field of artificial intelligence, Retrieval-Augmented Generation (RAG) is emerging as a significant addition to the Generative AI toolkit. RAG harnesses the strengths of Large Language Models (LLMs) and integrates them with internal data, offering a method to enhance organizational operations significantly. This book delves into the essential aspects of RAG, examining its role in augmenting the capabilities of LLMs and leveraging internal corporate data for strategic advantage. As it progresses, the book outlines the potential of RAG in business, suggesting how it can make AI applications smarter, more responsive, and aligned with organizational objectives. RAG is positioned as a key facilitator of customized, efficient, and insightful AI solutions, bridging the gap between Generative AI's potential and specific bu