In [1]:
%pip install --upgrade pip

# Uninstall conflicting packages
%pip uninstall -y langchain-core langchain-openai langchain-experimental langchain-community langchain chromadb beautifulsoup4

# Install compatible versions of langchain-core and langchain-openai
%pip install langchain-core==0.3.6
%pip install langchain-openai==0.2.1
%pip install langchain-experimental==0.3.2
%pip install langchain-community==0.3.1
%pip install langchain==0.3.1

# Install remaining packages
%pip install chromadb==0.5.11
%pip install beautifulsoup4==4.12.3

# Restart the kernel after installation

Note: you may need to restart the kernel to use updated packages.
Collecting langchain-core==0.3.6
  Downloading langchain_core-0.3.6-py3-none-any.whl.metadata (6.3 kB)
Collecting PyYAML>=5.3 (from langchain-core==0.3.6)
  Downloading PyYAML-6.0.2-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (2.1 kB)
Collecting jsonpatch<2.0,>=1.33 (from langchain-core==0.3.6)
  Downloading jsonpatch-1.33-py2.py3-none-any.whl.metadata (3.0 kB)
Collecting langsmith<0.2.0,>=0.1.125 (from langchain-core==0.3.6)
  Downloading langsmith-0.1.147-py3-none-any.whl.metadata (14 kB)
Collecting packaging<25,>=23.2 (from langchain-core==0.3.6)
  Downloading packaging-24.2-py3-none-any.whl.metadata (3.2 kB)
Collecting pydantic<3.0.0,>=2.5.2 (from langchain-core==0.3.6)
  Downloading pydantic-2.11.3-py3-none-any.whl.metadata (65 kB)
Collecting tenacity!=8.4.0,<9.0.0,>=8.1.0 (from langchain-core==0.3.6)
  Downloading tenacity

In [6]:
%pip install pysqlite3-binary --upgrade

Note: you may need to restart the kernel to use updated packages.


In [2]:
# These three lines swap the stdlib sqlite3 module for the pysqlite3 package, which bundles a newer SQLite for Chroma
__import__('pysqlite3')
import sys, os
sys.modules['sqlite3'] = sys.modules.pop('pysqlite3')

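
The swap above matters because Chroma refuses to run on old SQLite builds. A quick sanity check could look like the sketch below. It assumes the minimum is SQLite 3.35.0, so verify that figure against the Chroma version you install. `sqlite_is_new_enough` is a hypothetical helper, not part of any library.

```python
import sqlite3

# Assumption: Chroma needs SQLite 3.35.0 or newer; adjust if your version differs.
MIN_SQLITE = (3, 35, 0)

def sqlite_is_new_enough(minimum=MIN_SQLITE):
    """Return True when the sqlite3 module in use links against SQLite >= minimum."""
    return sqlite3.sqlite_version_info >= minimum

print("SQLite version:", sqlite3.sqlite_version)
print("New enough for Chroma:", sqlite_is_new_enough())
```

Run it after the swap cell: if the second line prints `False`, the pysqlite3 replacement probably did not take effect and the kernel may need a restart.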

In [3]:
# Set the USER_AGENT environment variable before importing the loader.
# It does not change how the code behaves; it only stops WebBaseLoader from warning that USER_AGENT is unset.
import os
os.environ['USER_AGENT'] = 'RAGUserAgent'

In [4]:
from langchain_community.document_loaders import WebBaseLoader
import bs4
import openai
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain import hub
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
import chromadb
from langchain_community.vectorstores import Chroma
from langchain_experimental.text_splitter import SemanticChunker
from langchain_text_splitters import RecursiveCharacterTextSplitter

In [5]:
# OpenAI Setup

import os

# The key is read from the OPENAI_API_KEY environment variable (for example, loaded from a .env file)
# os.environ['OPENAI_API_KEY'] = ''

openai.api_key = os.environ['OPENAI_API_KEY']
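
Reading `os.environ['OPENAI_API_KEY']` directly raises a bare `KeyError` when the variable is missing. A minimal sketch of a friendlier check follows. `require_env` is a hypothetical helper, not part of the OpenAI or LangChain libraries.

```python
import os

def require_env(name):
    """Return the value of environment variable `name`, or raise a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set. Export it or load it from your .env file first."
        )
    return value

# Possible use in this notebook (assumes the key has already been exported):
# openai.api_key = require_env("OPENAI_API_KEY")
```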


In [19]:
#### INDEXING ####

In [6]:
# Load Documents
loader = WebBaseLoader(
    web_paths=("https://kbourne.github.io/chapter1.html",), 
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(
            class_=("post-content", "post-title", "post-header")
        )
    ),
)
docs = loader.load()

In [7]:
print(len(docs))

1


In [8]:
# Split
text_splitter = SemanticChunker(OpenAIEmbeddings())
#text_splitter = RecursiveCharacterTextSplitter()
splits = text_splitter.split_documents(docs)

In [9]:
# Embed
vectorstore = Chroma.from_documents(documents=splits, 
                                    embedding=OpenAIEmbeddings())

retriever = vectorstore.as_retriever()

In [12]:
#### RETRIEVAL and GENERATION ####

In [10]:
# Prompt: pull a RAG prompt from the LangChain hub. Ignore any LangSmith warning; LangSmith is not needed for this exercise.
prompt = hub.pull("jclemens24/rag-prompt")



In [11]:
# Post-processing
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)
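
To see what `format_docs` hands to the prompt, it can be run on stand-in objects without calling the retriever. This is only a sketch: `FakeDoc` is a hypothetical substitute for a LangChain `Document` and carries just the `page_content` attribute the function reads.

```python
from dataclasses import dataclass

@dataclass
class FakeDoc:
    """Hypothetical stand-in for a LangChain Document; only page_content is used."""
    page_content: str

def format_docs(docs):
    # Same logic as the notebook cell: join the chunk texts with a blank line between them
    return "\n\n".join(doc.page_content for doc in docs)

sample = [FakeDoc("RAG retrieves context."), FakeDoc("The LLM answers with it.")]
print(format_docs(sample))
```

The result is a single string, which is what the `{context}` slot of the prompt expects.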

In [12]:
# LLM
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

In [13]:
# Chain it all together with LangChain
rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
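
The data flow of the chain can be sketched in plain Python. This is not how LangChain implements the `|` operator; every function below (`fake_retriever`, `fake_prompt`, `fake_llm`, `run_chain`) is a hypothetical stand-in that only mirrors the order in which values move through the real chain.

```python
def fake_retriever(question):
    """Hypothetical stand-in for `retriever`: returns chunk texts for a question."""
    return ["RAG adds retrieved text to the prompt.", "Fine-tuning changes model weights."]

def format_chunks(chunks):
    # Plays the role of format_docs: one string with blank lines between chunks
    return "\n\n".join(chunks)

def fake_prompt(inputs):
    """Hypothetical stand-in for the hub prompt: fills context and question slots."""
    return f"Context:\n{inputs['context']}\n\nQuestion: {inputs['question']}"

def fake_llm(prompt_text):
    """Hypothetical stand-in for the chat model: reports the prompt size instead of answering."""
    return f"(model reply to a {len(prompt_text)}-character prompt)"

def run_chain(question):
    # Step 1: the dict step feeds the same question to both branches.
    # "context" goes through retrieval and formatting; "question" passes through unchanged.
    inputs = {"context": format_chunks(fake_retriever(question)), "question": question}
    # Step 2: the prompt turns the dict into text. Step 3: the model turns text into a reply.
    # In the real chain, StrOutputParser then extracts the reply string from the model message.
    return fake_llm(fake_prompt(inputs))

print(run_chain("What are the advantages of using RAG?"))
```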

In [14]:
# Question - run the chain
rag_chain.invoke("What are the advantages of using RAG?")

"The advantages of using Retrieval-Augmented Generation (RAG) include:\n\n1. **Improved Accuracy and Relevance**: RAG enhances the accuracy and relevance of responses generated by large language models (LLMs) by incorporating specific, real-time information from databases or datasets.\n\n2. **Customization and Flexibility**: RAG allows for tailored responses based on a company's specific needs by integrating internal databases, creating personalized experiences and outputs that meet unique business requirements.\n\n3. **Expanding Model Knowledge Beyond Training Data**: RAG enables models to access and utilize information that was not included in their initial training sets, effectively broadening the model's knowledge base without the need for retraining.\n\nThese advantages make RAG a powerful tool for organizations looking to leverage their internal data and improve the effectiveness of AI applications."

In [15]:
# retriever.invoke replaces the deprecated get_relevant_documents
query = "How does RAG compare with fine-tuning?"
relevant_docs = retriever.invoke(query)
relevant_docs


[Document(metadata={'source': 'https://kbourne.github.io/chapter1.html'}, page_content='Can you imagine what you could do with all of the benefits mentioned above, but combined with all of the data within your company, about everything your company has ever done, about your customers and all of their interactions, or about all of your products and services combined with a knowledge of what a specific customer’s needs are? You do not have to imagine it, that is what RAG does! Even smaller companies are not able to access much of their internal data resources very effectively. Larger companies are swimming in petabytes of data that is not readily accessible or is not being fully utilized. Prior to RAG, most of the services you saw that connected customers or employees with the data resources of the company were really just scratching the surface of what is possible compared to if they could access ALL of the data in the company. With the advent of RAG and generative AI in general, corpor

In [16]:
# Off-topic query: the retriever still returns its nearest chunks, even though none of them answers the question
query = "How many hours school bus drivers work during the week in Washington state?"
relevant_docs = retriever.invoke(query)
relevant_docs

[Document(metadata={'source': 'https://kbourne.github.io/chapter1.html'}, page_content='Once you have introduced the new knowledge, it will always have it! It is also how the model was originally created, by training with data, right? That sounds right in theory, but in practice, fine-tuning has been more reliable in teaching a model specialized tasks (like teaching a model how to converse in a certain way), and less reliable for factual recall. The reason is complicated, but in general, a model’s knowledge of facts is like a human’s long-term memory. If you memorize a long passage from a speech or book and then try to recall it a few months later, you will likely still understand the context of the information, but you may forget specific details. Whereas, adding knowledge through the input of the model is like our short-term memory, where the facts, details, and even the order of wording is all very fresh and available for recall. It is this latter scenario that lends itself better i
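
The off-topic query above still gets chunks back because a plain similarity search always returns its nearest neighbors. One common remedy is to drop results below a similarity cutoff. The sketch below illustrates the idea in pure Python with invented two-dimensional vectors; `cosine_similarity` and `filter_by_similarity` are hypothetical helpers, and the 0.5 threshold is an arbitrary assumption. LangChain vector store retrievers offer a `similarity_score_threshold` search type for the same purpose.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def filter_by_similarity(query_vec, chunks, threshold=0.5):
    """Keep the texts whose vectors score at least `threshold` against the query, best first.

    `chunks` is a list of (text, vector) pairs.
    """
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in chunks]
    return [text for score, text in sorted(scored, reverse=True) if score >= threshold]

# Toy data: the vectors are made up for illustration, not real embeddings.
chunks = [("on topic", [1.0, 0.0]), ("off topic", [0.0, 1.0])]
print(filter_by_similarity([1.0, 0.1], chunks))
```

With a cutoff in place, a query that matches nothing returns an empty list, which the application can treat as "no relevant context found" instead of passing unrelated text to the LLM.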