# Flash RAG


## ✅ What is Flash RAG?
Flash RAG is a minimal Retrieval-Augmented Generation pipeline:

- No agents.
- Just fetch context from documents.
- Send it to Gemini with the query.
- Fast and effective for many use cases.
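
The flow described above can be sketched without any framework. This is only an illustration under stated assumptions: the helper names (`tokenize`, `retrieve`, `build_prompt`, `flash_rag`, `echo_llm`) are hypothetical, retrieval uses plain word overlap rather than embeddings, and the model call is replaced by a stand-in function so the example runs offline.

```python
import re

def tokenize(text):
    """Lowercase a string and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query, documents, k=1):
    """Return the k documents sharing the most words with the query (toy retriever)."""
    query_words = tokenize(query)
    ranked = sorted(
        documents,
        key=lambda doc: len(query_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(query, context_docs):
    """Place the retrieved context and the question into a single prompt."""
    context = "\n".join(context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

def flash_rag(query, documents, llm):
    """Retrieve context, then make one model call. No agents, no loops."""
    return llm(build_prompt(query, retrieve(query, documents)))

documents = [
    "LangChain is a framework for developing applications powered by language models.",
    "Gemini is Google's large language model.",
]

def echo_llm(prompt):
    """Hypothetical stand-in for the model: returns the prompt it was given."""
    return prompt

answer = flash_rag("What is LangChain used for?", documents, echo_llm)
print(answer)
```

Passing the model in as a plain callable keeps the sketch independent of any particular LLM client; in the notebook, that role is played by Gemini.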

## 🧩 Use Case
Build a fast, document-aware Q&A system using LangChain’s RAG pipeline and Gemini.

## 🧩 Prerequisites

In [1]:
!pip install langchain langchain-community langchain-google-genai chromadb




## Step 1: Import required libraries

In [2]:
# Step 1: Import required libraries
import os
from langchain_google_genai import ChatGoogleGenerativeAI, GoogleGenerativeAIEmbeddings
# Chroma and TextLoader now live in the langchain-community package
from langchain_community.vectorstores import Chroma
from langchain_community.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.chains import RetrievalQA


## Step 2: Set up Gemini LLM

In [3]:
# Step 2: Set up Gemini LLM
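# Read the API key at runtime instead of hardcoding it in the notebook
import getpass

if "GOOGLE_API_KEY" not in os.environ:
    os.environ["GOOGLE_API_KEY"] = getpass.getpass("Enter your Google API key: ")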

llm = ChatGoogleGenerativeAI(
    model="gemini-1.5-flash-latest",
    temperature=0.3,
)


## Step 3: Load and chunk documents

In [4]:
# Step 3: Create a sample document (if needed)
with open("flash_rag_doc.txt", "w") as f:
    f.write("""
    LangChain is a framework for developing applications powered by language models.
    It provides integrations with external data, tools, and APIs.

    Gemini is Google's large language model that offers advanced reasoning and coding capabilities.

    Retrieval-Augmented Generation (RAG) is a method of enriching LLM outputs by retrieving supporting information from an external source.
    """)


In [6]:
# Step 3: Load and chunk the document
loader = TextLoader("flash_rag_doc.txt")
docs = loader.load()

# Split into smaller chunks for better retrieval
splitter = RecursiveCharacterTextSplitter(chunk_size=300, chunk_overlap=50)
chunks = splitter.split_documents(docs)
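
To make `chunk_size` and `chunk_overlap` concrete, here is a simplified sketch of overlapping chunks. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter first tries to break on separators such as paragraphs, lines, and spaces); the hypothetical `split_fixed` helper below just slides a fixed-width window over the text.

```python
def split_fixed(text, chunk_size=300, chunk_overlap=50):
    """Split text into fixed-width windows that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks

# Small sizes so the overlap is easy to see
demo_chunks = split_fixed("abcdefghijklmnopqrstuvwxyz", chunk_size=10, chunk_overlap=3)
print(demo_chunks)
# ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

Each chunk starts with the last three characters of the previous one, so a sentence that straddles a boundary still appears intact in at least one chunk.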


## Step 4: Create vector store using Gemini embeddings

In [7]:
# Step 4: Create a Chroma vector store with Gemini embeddings
embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")
vectorstore = Chroma.from_documents(chunks, embedding=embeddings)

# Convert vector store to a retriever
retriever = vectorstore.as_retriever()
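
Conceptually, the retriever embeds the query and returns the stored chunks whose vectors are closest to it. The sketch below illustrates that idea with hand-made three-dimensional vectors and cosine similarity. The vectors, the `top_k` helper, and the choice of cosine similarity are assumptions for illustration only; the real embeddings come from Gemini, and the actual index and distance metric are whatever Chroma is configured to use.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vector, indexed_chunks, k=2):
    """Return the texts of the k chunks most similar to the query vector."""
    scored = [
        (cosine_similarity(query_vector, vector), text)
        for text, vector in indexed_chunks
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]

# Toy "vector store": (chunk text, made-up embedding)
indexed_chunks = [
    ("LangChain chunk", [0.9, 0.1, 0.0]),
    ("Gemini chunk", [0.1, 0.9, 0.0]),
    ("RAG chunk", [0.2, 0.2, 0.9]),
]

query_vector = [1.0, 0.0, 0.0]  # pretend embedding of a LangChain question
results = top_k(query_vector, indexed_chunks, k=2)
print(results)
# ['LangChain chunk', 'RAG chunk']
```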


## Step 5: Create the RetrievalQA chain (Flash RAG)

In [8]:

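# Step 5: Build the RetrievalQA chain
# The default chain type ("stuff") puts all retrieved chunks into a single prompt
qa_chain = RetrievalQA.from_chain_type(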
    llm=llm,
    retriever=retriever,
    return_source_documents=False  # Optional: True if you want source docs
)


## Step 6: Ask a question (Flash Retrieval + Generation)

In [9]:
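# Ask a question: retrieve relevant chunks, then generate an answer
query = "What is LangChain used for?"
# invoke() replaces the deprecated run(); the answer is under the "result" key
response = qa_chain.invoke({"query": query})["result"]
print("Answer:\n", response)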



Answer:
 LangChain is a framework for developing applications powered by language models.  It provides integrations with external data, tools, and APIs.
