This is where semantic search meets LLMs: the retriever finds the relevant text and the model writes the answer. It is one of LangChain's core use cases.

What is RetrievalQAChain?

It is a pre-built chain in LangChain (in the Python package the class is named RetrievalQA, as imported below) that:

- Searches for relevant documents using a retriever (like your FAISS vectorstore).
- Passes those documents (context) to an LLM (like GPT-4, or Gemini as used below).
- Generates an answer to your question, using a prompt that tells the model to answer from the retrieved docs.

Why use it?
Without RetrievalQAChain, you’d do this by hand:

- Call vectorstore.similarity_search(query) to fetch the most relevant chunks.
- Then manually feed those chunks into the LLM as context for answering.

RetrievalQAChain wires both steps together behind a single call.
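
For comparison, the manual version might look like the sketch below. It assumes a vector store that exposes similarity_search(query, k=...) and a chat model whose invoke(...) returns a message with a .content attribute, as LangChain's objects do. The function name manual_retrieval_qa, the prompt wording, and the two stub classes are made up for illustration; the stubs only let the sketch run without an API key.

```python
from types import SimpleNamespace


def manual_retrieval_qa(vectorstore, llm, query, k=3):
    """Retrieve k chunks, pack them into one prompt, and ask the LLM."""
    # Step 1: semantic search for the chunks closest to the query
    docs = vectorstore.similarity_search(query, k=k)

    # Step 2: concatenate the chunk texts into a single context string
    context = "\n\n".join(doc.page_content for doc in docs)

    # Step 3: ask the model to answer from that context (illustrative wording)
    prompt = (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {query}\nAnswer:"
    )
    answer = llm.invoke(prompt)
    return {"result": answer.content, "source_documents": docs}


# Hypothetical stand-ins so the sketch runs offline
class StubVectorStore:
    def __init__(self, texts):
        self._docs = [SimpleNamespace(page_content=t) for t in texts]

    def similarity_search(self, query, k=4):
        return self._docs[:k]  # no real ranking, just the first k chunks


class StubLLM:
    def invoke(self, prompt):
        # Report the prompt size instead of calling a real model
        return SimpleNamespace(
            content=f"(stub answer, prompt had {len(prompt)} characters)"
        )


demo = manual_retrieval_qa(
    StubVectorStore(["chunk A", "chunk B", "chunk C", "chunk D"]),
    StubLLM(),
    "Who is considered the father of AI?",
)
print(demo["result"])
```

With real objects, you would pass in the vectorstore and llm built later in this notebook instead of the stubs.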



In [1]:
import os
from dotenv import load_dotenv

# Gemini LLM
from langchain_google_genai import ChatGoogleGenerativeAI

# Document loader (third-party integrations live in langchain_community)
from langchain_community.document_loaders import WebBaseLoader

# Chunking data
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Hugging Face embeddings
from langchain_community.embeddings import HuggingFaceBgeEmbeddings

# Vector database for storing the embedding vectors
from langchain_community.vectorstores import FAISS

# RetrievalQA for semantic search with an LLM
from langchain.chains import RetrievalQA



In [2]:
load_dotenv()  # read GOOGLE_API_KEY from a local .env file

GOOGLE_API_KEY = os.getenv("GOOGLE_API_KEY")

llm = ChatGoogleGenerativeAI(model="gemini-2.0-flash", api_key=GOOGLE_API_KEY)

# Load the Wikipedia page that will serve as the knowledge source
url = "https://en.wikipedia.org/wiki/Artificial_intelligence"
loader = WebBaseLoader(url)
documents = loader.load()

In [3]:
# Split the page into chunks of at most ~500 characters, with 50 characters of overlap
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
docs = text_splitter.split_documents(documents)
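
To see what chunk_size and chunk_overlap mean, here is a simplified fixed-window sketch. It is not how RecursiveCharacterTextSplitter works internally: the real splitter prefers to break on separators such as paragraph breaks and newlines, so its chunks vary in length. The function name sliding_chunks is hypothetical.

```python
def sliding_chunks(text, chunk_size=500, chunk_overlap=50):
    """Cut text into fixed-width windows that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    if not text:
        return []
    step = chunk_size - chunk_overlap  # how far each window advances
    # Stop once the remaining tail is already covered by the previous window
    last_start = max(len(text) - chunk_overlap, 1)
    return [text[i:i + chunk_size] for i in range(0, last_start, step)]


sample = "".join(str(i % 10) for i in range(1200))
chunks = sliding_chunks(sample)
print([len(c) for c in chunks])
# The tail of one chunk is repeated at the head of the next
print(chunks[0][-50:] == chunks[1][:50])
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.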

In [4]:
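# Note: HuggingFaceBgeEmbeddings targets BAAI/bge-* models and prepends a
# BGE-style instruction to queries; for all-MiniLM-L6-v2 the plain
# HuggingFaceEmbeddings class is the more usual choice.
embeddings = HuggingFaceBgeEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)

# Embed every chunk and build an in-memory FAISS index
vectorstore = FAISS.from_documents(docs, embeddings)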
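
Conceptually, the vector store answers "which stored vectors are closest to the query vector?". The brute-force sketch below shows that idea with hand-made 3-dimensional vectors. They are not real embeddings (all-MiniLM-L6-v2 produces 384-dimensional vectors), and the labels and the helper names l2_distance and top_k are invented for illustration. It uses Euclidean (L2) distance, which, as far as I know, is the default for LangChain's FAISS wrapper.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k(query_vec, indexed, k=3):
    """Return the texts of the k stored vectors nearest to query_vec."""
    ranked = sorted(indexed, key=lambda item: l2_distance(query_vec, item[1]))
    return [text for text, _ in ranked[:k]]


# Toy "index": (text, vector) pairs with made-up coordinates
toy_index = [
    ("history of AI", [0.9, 0.1, 0.0]),
    ("neural networks", [0.2, 0.9, 0.1]),
    ("AI regulation", [0.1, 0.2, 0.9]),
    ("founders of the field", [0.8, 0.2, 0.1]),
]

nearest = top_k([1.0, 0.0, 0.0], toy_index, k=2)
print(nearest)
```

FAISS does the same job with optimized index structures, so it stays fast when there are many thousands of chunks.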






In [6]:
# Create a retriever that returns the 3 most similar chunks
retriever = vectorstore.as_retriever(search_kwargs={"k": 3})


In [8]:
# Create the RetrievalQA chain
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=retriever,
    return_source_documents=True  # also return the chunks used as context
)

In [9]:
query = "Who is considered the father of AI?"
result = qa_chain.invoke({"query": query})  # calling the chain directly is deprecated

print("🧠 Answer:\n", result["result"])
print("\n📚 Source Chunks:")
for doc in result["source_documents"]:
    print("\n---")
    print(doc.page_content[:300], "...")  # preview only




🧠 Answer:
 Juergen Schmidhuber is considered the "Father of Modern AI".

📚 Source Chunks:

---
^ Colton, Emma (7 May 2023). "'Father of AI' says tech fears misplaced: 'You cannot stop it'". Fox News. Archived from the original on 26 May 2023. Retrieved 26 May 2023.

^ Jones, Hessie (23 May 2023). "Juergen Schmidhuber, Renowned 'Father Of Modern AI,' Says His Life's Work Won't Lead To Dystopia ...

---
and on having a "feel" for the situation, rather than explicit symbolic knowledge.[395] Although his arguments had been ridiculed and ignored when they were first presented, eventually, AI research came to agree with him.[ab][16] ...

---
Halpern, Sue, "The Coming Tech Autocracy" (review of Verity Harding, AI Needs You: How We Can Change AI's Future and Save Our Own, Princeton University Press, 274 pp.; Gary Marcus, Taming Silicon Valley: How We Can Ensure That AI Works for Us, MIT Press, 235 pp.; Daniela Rus and Gregory Mone, The Mi ...
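
Two of the three chunks retrieved above are bibliography entries rather than body text, so it is worth checking the sources before trusting an answer. The helper below is a small sketch for doing that. It assumes each source document has page_content and a metadata dict with a "source" key, which LangChain loaders conventionally set. The name summarize_sources and the fake data are hypothetical.

```python
from types import SimpleNamespace


def summarize_sources(result, width=80):
    """Return one line per retrieved chunk: its source plus a short preview."""
    lines = []
    for i, doc in enumerate(result["source_documents"], start=1):
        source = doc.metadata.get("source", "unknown")
        # Collapse newlines and repeated spaces so the preview fits on one line
        preview = " ".join(doc.page_content.split())[:width]
        lines.append(f"[{i}] {source}: {preview}")
    return lines


# Fake result shaped like the chain's output, so the sketch runs offline
fake_result = {
    "result": "stub",
    "source_documents": [
        SimpleNamespace(
            page_content="first   chunk\nwith  messy whitespace",
            metadata={"source": "https://example.com/page"},
        ),
        SimpleNamespace(page_content="a chunk without metadata", metadata={}),
    ],
}

summary = summarize_sources(fake_result)
for line in summary:
    print(line)
```

With the real chain, you would call summarize_sources(result) on the dictionary returned in the cell above.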
