In [1]:
# Data Ingestion
from langchain_community.document_loaders import TextLoader
loader = TextLoader("speech.txt")
text_documents = loader.load()
text_documents

[Document(page_content='We Shall Fight on the Beaches, 1940\n\n\nFrom the moment that the French defences at Sedan and on the Meuse were broken at the end of the second week of May, only a rapid retreat to Amiens and the south could have saved the British and French Armies who had entered Belgium at the appeal of the Belgian King; but this strategic fact was not immediately realised. The French High Command hoped they would be able to close the gap, and the Armies of the north were under their orders. Moreover, a retirement of this kind would have involved almost certainly the destruction of the fine Belgian Army of over 20 divisions and the abandonment of the whole of Belgium. Therefore, when the force and scope of the German penetration were realised and when a new French Generalissimo, General Weygand, assumed command in place of General Gamelin, an effort was made by the French and British Armies in Belgium to keep on holding the right hand of the Belgians and to give their own rig

In [25]:
import os
from dotenv import load_dotenv
from langchain_openai import AzureOpenAI
from langchain_openai import AzureOpenAIEmbeddings
load_dotenv()

model = AzureOpenAI(
    deployment_name=os.getenv("AZURE_OPENAI_DEPLOYMENT"),
    api_version=os.getenv("AZURE_OPENAI_VERSION"),
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
    api_key=os.getenv("AZURE_OPENAI_API_KEY"),    
)

# Initialize embeddings with Azure configuration
embeddings = AzureOpenAIEmbeddings(
    openai_api_key=os.getenv("AZURE_OPENAI_API_KEY"),
    azure_deployment=os.getenv("AZURE_OPENAI_EMBEDDING_DEPLOYMENT"),
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
    api_version=os.getenv("AZURE_OPENAI_VERSION"),
    # model="text-embedding-ada-002",
)
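
`os.getenv` returns `None` for an unset variable, so a missing entry in `.env` only surfaces later as a client error. A minimal sketch of an up-front check follows; the helper name `missing_settings` and the example values are hypothetical, and the variable names are the ones read in the cell above.

```python
import os

# Variable names taken from the cell above.
REQUIRED_VARS = [
    "AZURE_OPENAI_DEPLOYMENT",
    "AZURE_OPENAI_EMBEDDING_DEPLOYMENT",
    "AZURE_OPENAI_VERSION",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_API_KEY",
]

def missing_settings(env=os.environ, required=REQUIRED_VARS):
    """Return the required names that are unset or empty in env (hypothetical helper)."""
    return [name for name in required if not env.get(name)]

# Placeholder values for illustration only, not real credentials.
example_env = {
    "AZURE_OPENAI_ENDPOINT": "https://example.invalid",
    "AZURE_OPENAI_API_KEY": "placeholder",
}
print(missing_settings(example_env))
```

Calling `missing_settings()` with no arguments right after `load_dotenv()` checks the real environment before any client is constructed.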

In [18]:
from langchain_community.document_loaders import WebBaseLoader
import bs4

# Load the HTML page, keeping only the elements whose CSS class matches one of the strings below

loader = WebBaseLoader(web_paths=("https://www.rungalileo.io/blog/optimizing-llm-performance-rag-vs-finetune-vs-both",),
                       bs_kwargs=dict(parse_only=bs4.SoupStrainer(
                           class_=("my-4",
                                   "mt-10 text-4xl font-normal leading-tight font-sora", 
                                   "mt-10 text-xl font-normal leading-tight font-sora")
                           )
                        )
                    )
html_documents = loader.load()
html_documents

[Document(page_content="Welcome to the latest instalment in our LLM blog series! One of the most significant debates across generative AI revolves around the choice between Fine-tuning, Retrieval Augmented Generation (RAG) or a combination of both. In this blog post, we will explore both techniques, highlighting their strengths, weaknesses, and the factors that can help you make an informed choice for your LLM project. By the end of this blog, you will have a clear understanding of harnessing the full potential of these approaches to drive the success of your AI.This is the most basic RAG system, and you can refer to Enteprise architecture if you want to understand how to build one.Fine-Tuning vs. Retrieval Augmented Generation: A False DichotomyBefore diving into the comparison, it's crucial to understand that Fine-tuning and Retrieval Augmented Generation are not opposing techniques. Instead, they can be used in conjunction to leverage the strengths of each approach. Let’s explore th

In [20]:
# PDF Document reader
from langchain_community.document_loaders import PyPDFLoader
loader = PyPDFLoader("tree_of_thought.pdf")
pdf_documents = loader.load()
pdf_documents

[Document(page_content='Tree of Thoughts: Deliberate Problem Solving\nwith Large Language Models\nShunyu Yao\nPrinceton UniversityDian Yu\nGoogle DeepMindJeffrey Zhao\nGoogle DeepMindIzhak Shafran\nGoogle DeepMind\nThomas L. Griffiths\nPrinceton UniversityYuan Cao\nGoogle DeepMindKarthik Narasimhan\nPrinceton University\nAbstract\nLanguage models are increasingly being deployed for general problem solving\nacross a wide range of tasks, but are still confined to token-level, left-to-right\ndecision-making processes during inference. This means they can fall short in\ntasks that require exploration, strategic lookahead, or where initial decisions play\na pivotal role. To surmount these challenges, we introduce a new framework for\nlanguage model inference, “Tree of Thoughts” (ToT), which generalizes over the\npopular “Chain of Thought” approach to prompting language models, and enables\nexploration over coherent units of text (“thoughts”) that serve as intermediate steps\ntoward problem 

In [23]:
from langchain.text_splitter import RecursiveCharacterTextSplitter
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
documents = text_splitter.split_documents(pdf_documents)
documents[:5]

[Document(page_content='Tree of Thoughts: Deliberate Problem Solving\nwith Large Language Models\nShunyu Yao\nPrinceton UniversityDian Yu\nGoogle DeepMindJeffrey Zhao\nGoogle DeepMindIzhak Shafran\nGoogle DeepMind\nThomas L. Griffiths\nPrinceton UniversityYuan Cao\nGoogle DeepMindKarthik Narasimhan\nPrinceton University\nAbstract\nLanguage models are increasingly being deployed for general problem solving\nacross a wide range of tasks, but are still confined to token-level, left-to-right\ndecision-making processes during inference. This means they can fall short in\ntasks that require exploration, strategic lookahead, or where initial decisions play\na pivotal role. To surmount these challenges, we introduce a new framework for\nlanguage model inference, “Tree of Thoughts” (ToT), which generalizes over the\npopular “Chain of Thought” approach to prompting language models, and enables\nexploration over coherent units of text (“thoughts”) that serve as intermediate steps', metadata={'sou
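
To see how `chunk_size` and `chunk_overlap` relate, here is a simplified fixed-window sketch. It is not the algorithm `RecursiveCharacterTextSplitter` uses (that splitter prefers to break on separators such as paragraph and line breaks); the function `sliding_window_chunks` is a hypothetical illustration of the two parameters only.

```python
def sliding_window_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into windows of chunk_size characters sharing chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks

sample = "abcdefghijklmnopqrstuvwxyz"
chunks = sliding_window_chunks(sample, chunk_size=10, chunk_overlap=4)
print(chunks)
# ['abcdefghij', 'ghijklmnop', 'mnopqrstuv', 'stuvwxyz']
```

Each chunk repeats the last four characters of the previous one, which is the role the 200-character overlap plays in the cell above: text that straddles a boundary still appears whole in at least one chunk.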

In [26]:
# Chroma Vector Embeddings and Vector Store
from langchain_community.vectorstores import Chroma
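# Note: this indexes the first 30 unsplit pages; pass documents[:30] to index the chunks from the splitter instead
db = Chroma.from_documents(pdf_documents[:30], embedding=embeddings)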

In [30]:
# Query the Chroma vector store
query = "How do the authors make search tractable"
db.similarity_search(query)

[Document(page_content='µŗƓŤòˤÊˤçĵĎòŗòĮƜˤŔÊŝŝÊĈòˤĵƙˤʁˤŝĎĵŗƜˤŔÊŗÊĈŗÊŔĎŝʛˤ\x9aĎòˤòĮíˤŝòĮŤòĮçòˤĵƙˤòÊçĎˤŔÊŗÊĈŗÊŔĎˤĭũŝƜˤæòʝˤɾʛˤGƜˤƓŝĮ˙ƛˤíđƙƙƓçũĦƜˤƘĵˤíĵˤÊˤĎÊĮíŝŤÊĮíˤđƙˤƆĵũˤĠũŝƜˤŝŤÊĮíˤĵĮˤƆĵũŗˤĎÊĮíŝʛˤɿʛˤGƜˤçÊũĈĎƜˤĎđĭˤĵƙƙˤĈũÊŗíˤƛĎÊƜˤŝŔÊçòˤŝĭòĦĦòíˤĵƙˤŝòÊŗòíˤŝŤòÊģʛˤʀʛˤµĎòĮˤŝĎòˤíƓíĮ˒ƛˤĦđģòˤÊˤĈũƆˤƀĎĵˤƀÊŝˤƛŗƆđĮĈˤƘĵˤŔƓçģˤĎòŗˤũŔʜˤŝĎòˤŝŤÊŗŤòíˤũŝđĮĈˤŝƓĈĮˤĦÊĮĈũÊĈòʛˤʁʛˤ)ÊçĎˤŔòŗŝĵĮˤƀĎĵˤģĮĵƀŝˤƆĵũˤĎÊŝˤÊˤíđƙćòŗòĮƜˤŔòŗçòŔƜƓĵĮˤĵƙˤƀĎĵˤƆĵũˤÊŗòʛˤˤɾʛˤGĮƜŗĵíũçòˤÊĮíˤòƅŔĦÊđĮˤƛĎòˤƘòçĎĮƓŖũòˤĵƙˤíĵđĮĈˤÊˤĎÊĮíŝŤÊĮíˤɿʛˤ\x92ƀƓŤçĎˤƘĵˤÊˤŝŤĵŗƆˤÊæĵũƜˤÊĮˤÊŝƜŗĵĮÊũƜ˙ŝˤƚđŗŝƜˤƛđĭòˤđĮˤŝŔÊçòˤʀʛˤ#òŝçŗđæòˤÊˤŝƓƜũÊƜƓĵĮˤƀĎòŗòˤÊˤƀĵĭÊĮˤũŝòŝˤŝƓĈĮˤĦÊĮĈũÊĈòˤƘĵˤÊſĵƓíˤũĮƀÊĮŤòíˤÊƜŤòĮƜƓĵĮˤʁʛˤ\x9aĎòˤƚđĮÊĦˤŔÊŗÊĈŗÊŔĎˤòƅŔĦÊđĮŝˤĎĵƀˤòſòŗƆĵĮòˤĎÊŝˤíđƙćòŗòĮƜˤŔòŗçòŔƜƓĵĮŝˤĵƙˤĵƜĎòŗŝɾʛˤGĮƜŗĵíũçƜƓĵĮˤƘĵˤÊĮˤũĮũŝũÊĦˤŝòĦƙˊĎòĦŔˤæĵĵģʜˤĭòĮƜƓĵĮđĮĈˤÊˤĎÊĮíŝŤÊĮíˤÊŝˤÊˤĭòŤÊŔĎĵŗˤƗĵŗˤòĭæŗÊçđĮĈˤçĎÊĦĦòĮĈòŝʛˤɿʛˤ#ƓŝçũŝŝˤƛĎòˤũĮòƅŔòçŤòíˤƛĎđĮĈŝˤĦòÊŗĮòíˤƚŗĵĭˤÊŝƜŗĵĮÊũŤŝʜˤđĮçĦũíđĮĈˤƛĎòˤŝĭòĦĦˤĵƙˤŝŔÊçòʛˤʀʛˤ#òŝçŗđæòˤÊˤƀĵĭÊĮ˙ŝˤçĦòſòŗˤƘÊçƜƓçˤƗĵŗˤÊſĵƓíđĮĈˤũĮƀÊĮŤòíˤÊƜŤòĮƜƓĵĮˤÊƜˤÊˤæÊŗʛˤʁʛˤ\x1dĵĮŤ
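
Conceptually, a similarity search embeds the query and returns the stored texts whose vectors lie closest to it. The sketch below uses cosine similarity over hand-written toy vectors; the vectors, texts, and function names are assumptions for illustration, and the actual distance metric depends on the vector store and how it is configured.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, indexed, k=2):
    """Return the k texts whose vectors are most similar to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in indexed]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]

# Toy (text, vector) pairs standing in for embedded documents.
toy_index = [
    ("search heuristics", [0.9, 0.1, 0.0]),
    ("author affiliations", [0.0, 0.2, 0.9]),
    ("pruning the tree", [0.8, 0.3, 0.1]),
]
toy_query = [1.0, 0.2, 0.0]
print(top_k(toy_query, toy_index, k=2))
# ['search heuristics', 'pruning the tree']
```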

In [31]:
# FAISS Vector Embeddings and Vector Store

from langchain_community.vectorstores import FAISS 
db_1 = FAISS.from_documents(pdf_documents[:30], embedding=embeddings)

In [32]:
# Query the FAISS vector store
query = "How do the authors make search tractable"
db_1.similarity_search(query)

[Document(page_content='µŗƓŤòˤÊˤçĵĎòŗòĮƜˤŔÊŝŝÊĈòˤĵƙˤʁˤŝĎĵŗƜˤŔÊŗÊĈŗÊŔĎŝʛˤ\x9aĎòˤòĮíˤŝòĮŤòĮçòˤĵƙˤòÊçĎˤŔÊŗÊĈŗÊŔĎˤĭũŝƜˤæòʝˤɾʛˤGƜˤƓŝĮ˙ƛˤíđƙƙƓçũĦƜˤƘĵˤíĵˤÊˤĎÊĮíŝŤÊĮíˤđƙˤƆĵũˤĠũŝƜˤŝŤÊĮíˤĵĮˤƆĵũŗˤĎÊĮíŝʛˤɿʛˤGƜˤçÊũĈĎƜˤĎđĭˤĵƙƙˤĈũÊŗíˤƛĎÊƜˤŝŔÊçòˤŝĭòĦĦòíˤĵƙˤŝòÊŗòíˤŝŤòÊģʛˤʀʛˤµĎòĮˤŝĎòˤíƓíĮ˒ƛˤĦđģòˤÊˤĈũƆˤƀĎĵˤƀÊŝˤƛŗƆđĮĈˤƘĵˤŔƓçģˤĎòŗˤũŔʜˤŝĎòˤŝŤÊŗŤòíˤũŝđĮĈˤŝƓĈĮˤĦÊĮĈũÊĈòʛˤʁʛˤ)ÊçĎˤŔòŗŝĵĮˤƀĎĵˤģĮĵƀŝˤƆĵũˤĎÊŝˤÊˤíđƙćòŗòĮƜˤŔòŗçòŔƜƓĵĮˤĵƙˤƀĎĵˤƆĵũˤÊŗòʛˤˤɾʛˤGĮƜŗĵíũçòˤÊĮíˤòƅŔĦÊđĮˤƛĎòˤƘòçĎĮƓŖũòˤĵƙˤíĵđĮĈˤÊˤĎÊĮíŝŤÊĮíˤɿʛˤ\x92ƀƓŤçĎˤƘĵˤÊˤŝŤĵŗƆˤÊæĵũƜˤÊĮˤÊŝƜŗĵĮÊũƜ˙ŝˤƚđŗŝƜˤƛđĭòˤđĮˤŝŔÊçòˤʀʛˤ#òŝçŗđæòˤÊˤŝƓƜũÊƜƓĵĮˤƀĎòŗòˤÊˤƀĵĭÊĮˤũŝòŝˤŝƓĈĮˤĦÊĮĈũÊĈòˤƘĵˤÊſĵƓíˤũĮƀÊĮŤòíˤÊƜŤòĮƜƓĵĮˤʁʛˤ\x9aĎòˤƚđĮÊĦˤŔÊŗÊĈŗÊŔĎˤòƅŔĦÊđĮŝˤĎĵƀˤòſòŗƆĵĮòˤĎÊŝˤíđƙćòŗòĮƜˤŔòŗçòŔƜƓĵĮŝˤĵƙˤĵƜĎòŗŝɾʛˤGĮƜŗĵíũçƜƓĵĮˤƘĵˤÊĮˤũĮũŝũÊĦˤŝòĦƙˊĎòĦŔˤæĵĵģʜˤĭòĮƜƓĵĮđĮĈˤÊˤĎÊĮíŝŤÊĮíˤÊŝˤÊˤĭòŤÊŔĎĵŗˤƗĵŗˤòĭæŗÊçđĮĈˤçĎÊĦĦòĮĈòŝʛˤɿʛˤ#ƓŝçũŝŝˤƛĎòˤũĮòƅŔòçŤòíˤƛĎđĮĈŝˤĦòÊŗĮòíˤƚŗĵĭˤÊŝƜŗĵĮÊũŤŝʜˤđĮçĦũíđĮĈˤƛĎòˤŝĭòĦĦˤĵƙˤŝŔÊçòʛˤʀʛˤ#òŝçŗđæòˤÊˤƀĵĭÊĮ˙ŝˤçĦòſòŗˤƘÊçƜƓçˤƗĵŗˤÊſĵƓíđĮĈˤũĮƀÊĮŤòíˤÊƜŤòĮƜƓĵĮˤÊƜˤÊˤæÊŗʛˤʁʛˤ\x1dĵĮŤ

### Now onto Video 5-LangChain Series-Advanced RAG Q&A Chatbot with Chain and Retrievers Using LangChain

In [33]:
from langchain_community.llms import Ollama
# Load Ollama Llama2 model
llm = Ollama(model="llama2:latest")
llm

Ollama(model='llama2:latest')

In [35]:
# Design the ChatPromptTemplate

from langchain_core.prompts import ChatPromptTemplate
prompt = ChatPromptTemplate.from_template("""
Answer the following question based only on the provided context.
Think step by step before providing a detailed answer.
I will tip you $1000 if the user finds the answer helpful.
<context>
{context}
</context>
Question: {input}""")

In [36]:
# Chain introduction
# Create Stuff Document Chain

from langchain.chains.combine_documents import create_stuff_documents_chain

document_chain = create_stuff_documents_chain(llm, prompt)
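
The "stuff" strategy places every supplied document into the prompt's `{context}` slot in a single pass. Below is a plain-Python sketch of that idea, not LangChain's implementation; the function `stuff_documents`, the shortened template, and the blank-line separator are assumptions for illustration.

```python
# Shortened version of the prompt above, kept as a plain string.
PROMPT_TEMPLATE = """Answer the following question based only on the provided context.
<context>
{context}
</context>
Question: {input}"""

def stuff_documents(page_contents, question, separator="\n\n"):
    """Join all document texts and drop them into the prompt in one go."""
    context = separator.join(page_contents)
    return PROMPT_TEMPLATE.format(context=context, input=question)

filled = stuff_documents(["First chunk.", "Second chunk."], "What do the chunks say?")
print(filled)
```

Because everything goes into one prompt, the combined length of the retrieved documents has to fit within the model's context window.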

In [37]:
"""
Retrievers: A retriever is an interface that returns documents given an unstructured query. It is more general than a vector store.
A retriever does not need to be able to store documents, only to return (or retrieve) them. Vector stores can be used as the backbone
of a retriever, but there are other types of retrievers as well.
"""

retriever = db.as_retriever()
retriever

VectorStoreRetriever(tags=['Chroma', 'AzureOpenAIEmbeddings'], vectorstore=<langchain_community.vectorstores.chroma.Chroma object at 0x000001CE73B599D0>)
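
To illustrate that a retriever need not be backed by a vector store, here is a hypothetical keyword-overlap retriever. The class name and the `invoke(query, k)` signature are assumptions made for this sketch, not LangChain's retriever base class.

```python
class KeywordRetriever:
    """Hypothetical retriever that ranks texts by words shared with the query."""

    def __init__(self, texts):
        self.texts = list(texts)

    def invoke(self, query, k=2):
        query_words = set(query.lower().split())
        scored = []
        for text in self.texts:
            overlap = len(query_words & set(text.lower().split()))
            if overlap:  # skip texts that share no words with the query
                scored.append((overlap, text))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:k]]

keyword_retriever = KeywordRetriever([
    "tree search with pruning",
    "authors and affiliations",
    "breadth first search over thoughts",
])
print(keyword_retriever.invoke("how is search pruning done"))
# ['tree search with pruning', 'breadth first search over thoughts']
```

No embeddings are involved here; the only contract is "query in, relevant texts out".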

In [38]:
"""
Retrieval chain: This chain takes in a user inquiry, which is then passed to the retriever to fetch relevant documents. 
Those documents (and original inputs) are then passed to an LLM to generate a response.
"""

from langchain.chains import create_retrieval_chain
retrieval_chain = create_retrieval_chain(retriever, document_chain)
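
The data flow described above can be sketched with plain functions: retrieve with the user's input, hand the documents and the input to the combining step, and return everything together. The stand-in functions are hypothetical, and the `"input"`, `"context"`, and `"answer"` keys mirror the dictionary this notebook reads from; treat the exact shape as an assumption about the real chain.

```python
def make_retrieval_chain(retrieve, combine_docs):
    """Compose a retriever function and a document-combining function."""
    def chain(inputs):
        docs = retrieve(inputs["input"])  # step 1: fetch relevant documents
        answer = combine_docs({"input": inputs["input"], "context": docs})  # step 2: generate
        return {"input": inputs["input"], "context": docs, "answer": answer}
    return chain

def fake_retrieve(query):
    # Stand-in for a real retriever; always returns the same two snippets.
    return ["snippet one", "snippet two"]

def fake_combine(payload):
    # Stand-in for the LLM call; reports what it was given.
    return f"{len(payload['context'])} documents used for: {payload['input']}"

toy_chain = make_retrieval_chain(fake_retrieve, fake_combine)
result = toy_chain({"input": "How is search made tractable?"})
print(result["answer"])
# 2 documents used for: How is search made tractable?
```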

In [41]:
response = retrieval_chain.invoke({"input": "Chain of thought is lacking in some areas."})

In [42]:
response['answer']

"Answer: Yes, the chain of thought can be limited in certain areas. Here are some potential limitations and ways to address them:\n\n1. Limited contextual understanding: LMs may struggle to understand the context of a problem or task, leading to suboptimal solutions. To address this, researchers can incorporate external knowledge sources, such as databases or knowledge graphs, to improve the LM's understanding of the context.\n2. Lack of common sense: LMs may not have the same level of common sense or real-world experience as humans, which can lead to unexpected or illogical solutions. To address this, researchers can incorporate more diverse and extensive training data to improve the LM's ability to understand the nuances of human reasoning.\n3. Limited creativity: LMs may struggle to come up with novel or innovative solutions due to their reliance on statistical patterns in the training data. To address this, researchers can incorporate more diverse and unexpected training data to im