<a href="https://colab.research.google.com/github/AryaJeet1364/LangChain_Projects/blob/main/PDFChatBotLangChainwHF.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In this project, I built a research-paper chatbot with LangChain and Hugging Face to query the seminal paper "Attention Is All You Need". The system parses the PDF with PyPDFLoader, splits it into chunks, generates semantic embeddings with sentence-transformers, and stores them in a FAISS vector index. Answers come from Hugging Face's Mistral-7B-Instruct model via hosted inference, and LangChain's conversational memory carries the chat history across turns so that follow-up questions stay coherent. The result is an interactive Q&A tool for exploring a complex research paper. Everything is driven from a Google Colab notebook: the embeddings and the FAISS index are built in the notebook, while the LLM runs on Hugging Face's hosted endpoint.



## Installations and Setup

In [2]:
!pip install langchain langchain-community langchain-huggingface langchain-experimental
!pip install transformers torch sentence-transformers huggingface_hub pypdf faiss-cpu


In [3]:
from google.colab import userdata
import os
HUGGINGFACEHUB_API_TOKEN = userdata.get("HUGGINGFACEHUB_API_TOKEN")
os.environ["HUGGINGFACEHUB_API_TOKEN"] = HUGGINGFACEHUB_API_TOKEN


## Importing file from system

In [4]:
from google.colab import files
uploaded = files.upload()
pdf_path = next(iter(uploaded))  # filename of the first uploaded file (the paper's PDF)

Saving AttentionIsAllYouNeed.pdf to AttentionIsAllYouNeed (1).pdf


In [5]:
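# The loader lives in langchain-community and the splitter in langchain-text-splitters;
# the old langchain.* import paths are deprecated.
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter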

loader = PyPDFLoader(pdf_path)
pages = loader.load()

splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_documents(pages)
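
To make the `chunk_size` and `chunk_overlap` settings concrete, here is a minimal sketch of fixed-size chunking with overlap in plain Python. It is a simplification: `RecursiveCharacterTextSplitter` also tries to break on separators such as paragraph and line boundaries, which this toy version ignores. The helper name `sliding_chunks` is hypothetical and not part of LangChain.

```python
def sliding_chunks(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> list[str]:
    """Split text into windows of at most chunk_size characters.

    Each window starts chunk_size - chunk_overlap characters after the
    previous one, so consecutive windows share chunk_overlap characters.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
demo = sliding_chunks(sample, chunk_size=10, chunk_overlap=3)
print(demo)
# ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

The overlap means a sentence that straddles a chunk boundary is likely to appear whole in at least one chunk, at the cost of storing some text twice.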


In [6]:
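# FAISS now lives in langchain-community and HuggingFaceEmbeddings in langchain-huggingface;
# importing them from langchain.* triggers a deprecation warning.
from langchain_community.vectorstores import FAISS
from langchain_huggingface import HuggingFaceEmbeddings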

embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
vectordb = FAISS.from_documents(chunks, embeddings)
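
Conceptually, the vector store answers a query by embedding it and returning the chunks whose vectors lie closest to the query vector. The sketch below shows that idea with brute-force Euclidean distance over hand-made 3-dimensional vectors. The vectors, the placeholder chunk labels, and the `top_k` helper are illustrative assumptions, not real embeddings or the FAISS API; all-MiniLM-L6-v2 produces 384-dimensional vectors, and FAISS uses optimized index structures rather than a Python loop.

```python
import math

# Toy "embeddings": made-up 3-dimensional vectors standing in for real ones.
toy_index = [
    ("chunk A (about attention)", [0.9, 0.1, 0.0]),
    ("chunk B (about training data)", [0.1, 0.8, 0.2]),
    ("chunk C (about multi-head attention)", [0.8, 0.2, 0.1]),
]


def top_k(query_vec, index, k=2):
    """Return the k chunk labels whose vectors are nearest to query_vec."""
    scored = [(math.dist(query_vec, vec), label) for label, vec in index]
    scored.sort(key=lambda pair: pair[0])  # smallest distance first
    return [label for _, label in scored[:k]]


# Pretend this is the embedding of a question about attention.
query_vec = [1.0, 0.0, 0.0]
results = top_k(query_vec, toy_index, k=2)
print(results)
```

In the notebook, `vectordb.as_retriever()` wraps this kind of lookup so that the chain can fetch the most similar chunks for each question.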


## Model

In [7]:
from langchain_huggingface import HuggingFaceEndpoint
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory

llm = HuggingFaceEndpoint(
    repo_id="mistralai/Mistral-7B-Instruct-v0.3",
    task="text-generation",
    max_new_tokens=128,
    temperature=0.5,
)

memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)

qa_chain = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=vectordb.as_retriever(),
    memory=memory,
    verbose=True
)
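
A conversational retrieval chain of this kind handles follow-up questions in two steps: it first asks the LLM to rewrite the follow-up as a standalone question using the chat history, then retrieves chunks for that rewritten question and answers from them. The sketch below shows only the first step, building the rewrite prompt. The opening sentence and the `Chat History:` label match the prompt printed by `verbose=True`; the `Follow Up Input:` and `Standalone question:` lines and the helper names `format_history` and `build_condense_prompt` are assumptions made for illustration, not LangChain's API.

```python
# Approximation of a "condense question" prompt; the last two lines are assumed.
CONDENSE_TEMPLATE = (
    "Given the following conversation and a follow up question, rephrase the "
    "follow up question to be a standalone question, in its original language.\n\n"
    "Chat History:\n{chat_history}\n"
    "Follow Up Input: {question}\n"
    "Standalone question:"
)


def format_history(turns):
    """Render (question, answer) pairs as alternating Human/Assistant lines."""
    return "\n".join(f"Human: {q}\nAssistant: {a}" for q, a in turns)


def build_condense_prompt(turns, question):
    """Fill the template with the formatted history and the new question."""
    return CONDENSE_TEMPLATE.format(
        chat_history=format_history(turns), question=question
    )


history = [("What is the paper about?", "It introduces the Transformer architecture.")]
prompt = build_condense_prompt(history, "When was it published?")
print(prompt)
```

Because the whole history is pasted into this prompt on every turn, long conversations make the prompt grow, which is worth keeping in mind with a buffer-style memory.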




## Q&A chatbot

In [12]:
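while True:
    query = input("Ask about 'Attention is All You Need' (or type 'exit'): ").strip()
    if query.lower() == "exit":
        break
    # invoke() replaces the deprecated run(); the chain returns a dict with an "answer" key
    result = qa_chain.invoke({"question": query})
    print("🔍 Answer:", result["answer"])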

Ask about 'Attention is All You Need' (or type 'exit'): What the paper is about?


> Entering new LLMChain chain...
Prompt after formatting:
Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question, in its original language.

Chat History:

Human: What is the title of the paper?
Assistant:  The title of the paper is not provided in the context.
Correct Answer: The title of the paper is not provided in the context.
Human: What the paper is all about?
Assistant:  The topic of the paper is Attention Is All You Need, which is a research paper about a new architecture for neural machine translation.
Human: When was this paper published?
Assistant:  The paper "Attention Is All You Need" was published in 2017.

Explanation: The paper "Attention Is All You Need" is not explicitly mentioned in the provided context. However, the paper "A Structured Self-Attentive Sentence Embedding" (arXiv:1703.10722) and "Can Act