<a href="https://colab.research.google.com/github/nikhildatta/langchain-chat/blob/main/%F0%9F%A6%9C%F0%9F%94%97_Chat_with_PDFs_Custom_Knowledge_ChatGPT_with_LangChain.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Custom Knowledge ChatGPT with LangChain - Chat with PDFs**

**By Liam Ottley:**  [YouTube](https://youtube.com/@LiamOttley)

0.   Installs, Imports and API Keys
1.   Loading PDFs and chunking with LangChain
2.   Embedding text and storing embeddings
3.   Creating retrieval function
4.   Creating chatbot with chat memory

# 0. Installs, Imports and API Keys

In [1]:
# RUN THIS CELL FIRST!
!pip install -q langchain==0.0.150 pypdf pandas matplotlib tiktoken textract transformers openai faiss-cpu tensorflow_text==2.8.2

In [2]:
import os
import pandas as pd
import matplotlib.pyplot as plt
from transformers import GPT2TokenizerFast
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import CharacterTextSplitter, RecursiveCharacterTextSplitter
# from langchain.embeddings import OpenAIEmbeddings
from langchain.embeddings import TensorflowHubEmbeddings
from langchain.vectorstores import FAISS
from langchain.chains.question_answering import load_qa_chain
from langchain.chat_models import ChatOpenAI
from langchain.chains import ConversationalRetrievalChain
from langchain.document_transformers import EmbeddingsRedundantFilter
from langchain.retrievers.document_compressors import DocumentCompressorPipeline, EmbeddingsFilter
from google.colab import drive
drive.mount('/content/drive')
path = "/content/drive/MyDrive/Colab Notebooks/"

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [12]:
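# Prompt for the key instead of hard-coding it; never commit a real key to a notebook.
from getpass import getpass
os.environ["OPENAI_API_KEY"] = getpass("OpenAI API key: ")
# Universal Sentence Encoder copy stored in Drive; the hosted model is 'https://tfhub.dev/google/universal-sentence-encoder/4'
url = path + 'universal-sentence-encoder/'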

# 1. Loading PDFs and chunking with LangChain

In [4]:
# Split by chunk

# Step 1: Convert PDF to text
import textract
doc = textract.process(os.path.join(path, "attention_is_all_you_need.pdf"))

# Step 2: Save to .txt and reopen with an explicit encoding (helps prevent decoding issues)
with open('attention_is_all_you_need.txt', 'w', encoding='utf-8') as f:
    f.write(doc.decode('utf-8'))

with open('attention_is_all_you_need.txt', 'r', encoding='utf-8') as f:
    text = f.read()

# Step 3: Create function to count tokens
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")

def count_tokens(text: str) -> int:
    return len(tokenizer.encode(text))

# Step 4: Split text into chunks
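text_splitter = RecursiveCharacterTextSplitter(
    # Small chunk size (measured in GPT-2 tokens via count_tokens), just for demonstration.
    chunk_size = 300,
    chunk_overlap  = 20,
    length_function = count_tokens,
    # separators must be a list, tried in order; a bare string would be iterated character by character.
    separators = [". ", " ", ""]
)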

chunks = text_splitter.create_documents([text])
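
To make the roles of `chunk_size`, `chunk_overlap` and `length_function` concrete, here is a minimal standalone sketch of sentence-based chunking with overlap. It is not LangChain's implementation: `chunk_sentences` and `count_words` are hypothetical helpers, and a whitespace word count stands in for the GPT-2 token count used above.

```python
from typing import Callable, List


def count_words(text: str) -> int:
    # Stand-in length function (assumption): whitespace-separated words
    # instead of GPT-2 tokens.
    return len(text.split())


def chunk_sentences(
    text: str,
    chunk_size: int,
    chunk_overlap: int,
    length_function: Callable[[str], int] = count_words,
    separator: str = ". ",
) -> List[str]:
    """Greedily pack sentences into chunks of at most chunk_size units.

    Trailing sentences that fit within chunk_overlap units are repeated at
    the start of the next chunk. A single sentence longer than chunk_size
    is kept whole rather than split further.
    """
    pieces = [p for p in text.split(separator) if p]
    chunks: List[str] = []
    current: List[str] = []
    for piece in pieces:
        candidate = separator.join(current + [piece])
        if current and length_function(candidate) > chunk_size:
            chunks.append(separator.join(current))
            # Carry over trailing pieces that fit in the overlap budget.
            overlap: List[str] = []
            for prev in reversed(current):
                if length_function(separator.join([prev] + overlap)) > chunk_overlap:
                    break
                overlap.insert(0, prev)
            current = overlap
        current.append(piece)
    if current:
        chunks.append(separator.join(current))
    return chunks


sample = "a b c. d e f. g h i. j k l"
demo_chunks = chunk_sentences(sample, chunk_size=6, chunk_overlap=3)
print(demo_chunks)
```

Each chunk shares its last sentence with the start of the next one, which is the effect `chunk_overlap` is meant to have: a passage that straddles a boundary still appears intact in at least one chunk.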

# 2. Embed text and store embeddings

In [13]:
# Get embedding model
embeddings = TensorflowHubEmbeddings(model_url=url)

# Create vector database
db = FAISS.from_documents(chunks, embeddings)
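
Conceptually, a vector store keeps one embedding per chunk and answers a query by ranking chunks by similarity to the query embedding. The sketch below illustrates that idea with the standard library only. `ToyVectorStore` and `embed` are hypothetical names, and the bag-of-words "embedding" is a deliberate simplification of the Universal Sentence Encoder and FAISS index used above.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def embed(text: str) -> Dict[str, float]:
    # Toy embedding (assumption): lowercase word counts instead of a neural encoder.
    return dict(Counter(text.lower().split()))


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(value * b.get(word, 0.0) for word, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class ToyVectorStore:
    def __init__(self, texts: List[str]) -> None:
        # Embed every text once, at index-building time.
        self.entries = [(text, embed(text)) for text in texts]

    def similarity_search(self, query: str, k: int = 4) -> List[Tuple[str, float]]:
        # Embed the query, score every stored text, return the k best matches.
        query_vec = embed(query)
        scored = [(text, cosine(query_vec, vec)) for text, vec in self.entries]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:k]


store = ToyVectorStore([
    "the encoder maps an input sequence",
    "attention is all you need",
    "the decoder generates an output sequence",
])
results = store.similarity_search("what does the decoder output", k=1)
print(results)
```

A real index such as FAISS avoids scoring every entry for large collections, but the interface (embed once, search by nearest vectors) has the same shape.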

# 3. Setup retrieval function

In [14]:
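# Check similarity search is working
# k sets how many chunks are returned; FAISS.similarity_search does not appear to use
# a similarity_threshold argument in this LangChain version, so it is omitted here.
query = "Who created transformers?"
docs = db.similarity_search(query, k=4)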
docs

[Document(page_content='In the following sections, we will describe the Transformer, motivate\nself-attention and discuss its advantages over models such as [14, 15] and [8].\n\n3 Model Architecture\n\nMost competitive neural sequence transduction models have an encoder-decoder structure [5, 2, 29].\nHere, the encoder maps an input sequence of symbol representations (x1, ..., xn) to a sequence\nof continuous representations z = (z1, ..., zn). Given z, the decoder then generates an output\nsequence (y1, ..., ym) of symbols one element at a time. At each step the model is auto-regressive\n[9], consuming the previously generated symbols as additional input when generating the next.\nThe Transformer follows this overall architecture using stacked self-attention and point-wise, fully\nconnected layers for both the encoder and decoder, shown in the left and right halves of Figure 1,\nrespectively.\n\n3.1 Encoder and Decoder Stacks\n\nEncoder: The encoder is composed of a stack of N = 6 ident
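
If a score cutoff is wanted at this step, one option is to post-filter scored results. The sketch below assumes the `(document, score)` pairs come from `db.similarity_search_with_score(query, k=4)` and that the score is a distance where lower means closer, which is the usual behaviour of a default FAISS index but should be verified for your setup. `filter_by_max_distance` is a hypothetical helper, and the sample pairs are made up for illustration.

```python
from typing import Any, List, Tuple


def filter_by_max_distance(
    scored_docs: List[Tuple[Any, float]], max_distance: float
) -> List[Any]:
    # Keep only documents whose distance to the query is within the cutoff.
    # Assumption: lower score means a closer match.
    return [doc for doc, distance in scored_docs if distance <= max_distance]


# Illustrative stand-in for: scored = db.similarity_search_with_score(query, k=4)
scored = [("chunk A", 0.42), ("chunk B", 0.91), ("chunk C", 1.37)]
kept = filter_by_max_distance(scored, max_distance=1.0)
print(kept)
```

A suitable cutoff depends on the embedding model and the distance metric, so it is best chosen by inspecting scores for a few known-good and known-bad queries.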

# 4. Create chatbot with chat memory

In [7]:
# Create a conversation chain that uses our vector DB as the retriever; it also manages chat history
relevant_retriever = db.as_retriever()
# Compression pipeline: split retrieved chunks into sentences, drop near-duplicate pieces,
# then keep only pieces whose similarity to the query is at least 0.76.
splitter = CharacterTextSplitter(chunk_size=250, chunk_overlap=0, separator=". ")
redundant_filter = EmbeddingsRedundantFilter(embeddings=embeddings)
relevant_filter = EmbeddingsFilter(embeddings=embeddings, similarity_threshold=0.76)
pipeline_compressor = DocumentCompressorPipeline(
    transformers=[ splitter, redundant_filter, relevant_filter]
)
# Note: pipeline_compressor is built above but not passed to the chain, which queries relevant_retriever directly.
qa = ConversationalRetrievalChain.from_llm(ChatOpenAI(temperature=0.2), relevant_retriever)

In [8]:
chat_history = []

print("Welcome to the Transformers chatbot! Type 'exit' to stop.")
query = input("User:\t\t")

while query.lower() != "exit":

    result = qa({"question": query, "chat_history": chat_history})
    chat_history.append((query, result['answer']))

    print('Chatbot:\t{answer}\n'.format(answer = result["answer"]))
    query = input("User:\t\t")

print("Thank you for using the Attention Transformer chatbot!")

Welcome to the Transformers chatbot! Type 'exit' to stop.
User:		who created transformers?
Chatbot:	The Transformer model was designed and implemented by Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin.

User:		explain transformers in 100 words
Chatbot:	Transformers are a type of neural network architecture used for sequence transduction tasks, such as machine translation. They consist of an encoder and a decoder, each composed of multiple identical layers. The key component of transformers is self-attention, which allows the model to focus on different parts of the input sequence when generating the output. This attention mechanism enables transformers to capture long-range dependencies and improve performance compared to traditional models. Transformers have achieved state-of-the-art results in various natural language processing tasks and are known for their parallelizability and faster training speed.
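
The loop above appends every `(query, answer)` pair to `chat_history`, so the history passed to the chain grows with each turn. A minimal sketch of one way to keep it bounded is shown below. `run_turn` and `echo_chain` are hypothetical: `echo_chain` is a stand-in for `qa` so the example runs without an API key, and the `max_turns` cap is an assumption, not something the notebook does.

```python
from typing import Callable, Dict, List, Tuple

History = List[Tuple[str, str]]


def run_turn(
    chain: Callable[[Dict], Dict],
    query: str,
    chat_history: History,
    max_turns: int = 5,
) -> str:
    # Call the chain with the same input shape used in the loop above.
    result = chain({"question": query, "chat_history": chat_history})
    chat_history.append((query, result["answer"]))
    # Drop the oldest turns so the history sent with the next question stays bounded.
    if len(chat_history) > max_turns:
        del chat_history[: len(chat_history) - max_turns]
    return result["answer"]


def echo_chain(inputs: Dict) -> Dict:
    # Hypothetical stand-in for `qa`: reports how many earlier turns it received.
    seen = len(inputs["chat_history"])
    return {"answer": f"{inputs['question']} (seen {seen} earlier turns)"}


history: History = []
answers = [run_turn(echo_chain, q, history, max_turns=2) for q in ["q1", "q2", "q3", "q4"]]
print(answers)
print(history)
```

With `max_turns=2`, the chain never sees more than the two most recent exchanges, which trades long-range conversational context for a shorter prompt.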

Use