# Simple RAG Application

This notebook builds a Retrieval-Augmented Generation (RAG) pipeline that answers questions about a custom text document on deep learning.
The text is split into chunks, embedded, and stored in a Typesense vector database; for each query, the most semantically similar chunks are retrieved. A local Hugging Face language model (Gemma 2 2B instruct) then generates an answer grounded in those chunks, and a conversational chain adds support for follow-up questions.

## Step 1: Install libraries

What this does


* langchain → core framework (chains, memory, prompts)
* langchain_community → integrations (Typesense, loaders, etc.)
* typesense → client to talk to Typesense Cloud
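
After the install cell has run, it can help to confirm which versions pip actually resolved, since pip may backtrack to an older release to satisfy dependencies. The following is a minimal sketch using only the standard library; the helper name `installed_versions` is made up for this example.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


versions = installed_versions(["langchain", "langchain-community", "typesense"])
for name, ver in versions.items():
    print(f"{name}: {ver or 'not installed'}")
```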


In [1]:
!pip install langchain langchain_community typesense

Collecting langchain_community
  Downloading langchain_community-0.4.1-py3-none-any.whl.metadata (3.0 kB)
Collecting typesense
  Downloading typesense-1.3.0-py3-none-any.whl.metadata (1.9 kB)
Collecting SQLAlchemy<3,>=1.4 (from langchain)
  Downloading sqlalchemy-2.0.45-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl.metadata (9.5 kB)
INFO: pip is looking at multiple versions of langchain-community to determine which version is compatible with other requirements. This could take a while.
Collecting langchain_community
  Downloading langchain_community-0.4-py3-none-any.whl.metadata (3.0 kB)
  Downloading langchain_community-0.3.31-py3-none-any.whl.metadata (3.0 kB)
Downloading langchain_community-0.3.31-py3-none-any.whl (2.5 MB)
Downloading typesense-1.3.0-py3-none-any.whl (84 kB)

## Step 2: Import everything

Conceptually

You are importing all building blocks of a RAG system:

| Part                         | Role                                        |
| ---------------------------- | ------------------------------------------- |
| TextLoader                   | load raw text                               |
| CharacterTextSplitter        | split text into chunks                      |
| HuggingFaceEmbeddings        | convert text → vectors                      |
| Typesense                    | vector database                             |
| HuggingFacePipeline          | wrap the local LLM for LangChain            |
| RetrievalQA                  | basic (single-turn) RAG                     |
| ConversationalRetrievalChain | chat RAG with follow-up questions           |
| ConversationBufferMemory     | conversation state (chat history)           |


In [2]:
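# Data loading and chunking
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import CharacterTextSplitter

# Third-party integrations live in langchain_community;
# the old langchain.embeddings / langchain.vectorstores / langchain.llms paths are deprecated
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Typesense
from langchain_community.llms import HuggingFacePipeline

# Local model loading
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

# Chains and memory
from langchain.chains import RetrievalQA, ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory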


## Step 3: Load your data

What happens


* Reads the whole file as UTF-8 text
* Wraps it in a single LangChain `Document` (`data` is a one-element list)
* Stores the file path under `metadata["source"]`


In [3]:
loader = TextLoader("/kaggle/input/deep-learning/Deep Learning.txt", encoding="utf-8")
data = loader.load()
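
To make the shape of `data` concrete, here is a standard-library sketch of what a text loader produces. `SimpleDocument` and `load_text` are hypothetical stand-ins written for this example, not LangChain classes; they only mirror the `page_content` and `metadata` fields.

```python
import tempfile
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class SimpleDocument:
    """Hypothetical stand-in for a LangChain Document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def load_text(path, encoding="utf-8"):
    """Read one text file and return it as a one-element list of documents."""
    text = Path(path).read_text(encoding=encoding)
    return [SimpleDocument(page_content=text, metadata={"source": str(path)})]


# Demo on a throwaway file so the sketch runs anywhere
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"
    sample.write_text("Deep learning uses multi-layer neural networks.", encoding="utf-8")
    loaded = load_text(sample)

print(len(loaded))
print(loaded[0].metadata)
```

In the notebook itself, `data[0].page_content[:200]` and `data[0].metadata` are the equivalent things to inspect.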

## Step 4: Split text into chunks

Why this is critical

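Embedding models and LLMs:

* accept only a limited number of input tokens, so a long document must be split before it is embedded or placed in a prompt
* retrieve better when each chunk covers one coherent topic

`chunk_size` and `chunk_overlap` are measured in characters. `CharacterTextSplitter` splits on blank lines by default and merges the pieces up to `chunk_size`; the 200-character overlap keeps text near a chunk boundary retrievable from either side.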


In [4]:
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
docs = text_splitter.split_documents(data)
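
The effect of `chunk_size` and `chunk_overlap` is easiest to see on a tiny input. The sketch below is a simplified fixed-window chunker written for illustration; it is not how `CharacterTextSplitter` works internally (that class splits on a separator first), and `chunk_text` is a name invented here.

```python
def chunk_text(text, chunk_size=1000, chunk_overlap=200):
    """Cut text into windows of chunk_size characters that overlap by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "".join(str(i % 10) for i in range(25))  # "0123456789012345678901234"
chunks = chunk_text(sample, chunk_size=10, chunk_overlap=4)
for chunk in chunks:
    print(chunk)
```

Each chunk repeats the last four characters of the previous one, which is the same idea as the 200-character overlap used in this step.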

## Step 5: Create embeddings

What happens

Each chunk → one 384-dimensional vector

`all-MiniLM-L6-v2` is:

* fast
* CPU-friendly (the weights are roughly 91 MB)
* good enough for retrieval over a small corpus like this one


In [5]:
embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)


## Step 6: Store vectors in Typesense

What this does internally

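1. Connects to Typesense with the given client parameters
2. Embeds all chunks with the embedding model from Step 5
3. Creates the `deeplearning` collection if it does not already exist, then uploads vectors + text + metadata
4. Returns a vector store object (`docsearch`) used for retrieval in the following steps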


In [6]:
from kaggle_secrets import UserSecretsClient
user_secrets = UserSecretsClient()
typesense_api_key = user_secrets.get_secret("typesense_api_key")
TYPESENSE_HOST = user_secrets.get_secret("TYPESENSE_HOST")
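
`kaggle_secrets` only exists inside Kaggle. To run the notebook elsewhere, one option is a small helper that falls back to environment variables. This is a sketch under the assumption that the environment variables carry the same names as the Kaggle secret labels; `get_secret` is a hypothetical helper, not part of any library.

```python
import os


def get_secret(name):
    """Return a secret from Kaggle Secrets when available, else from the environment."""
    try:
        # Importable only inside a Kaggle notebook
        from kaggle_secrets import UserSecretsClient
    except ImportError:
        value = os.environ.get(name)
        if value is None:
            raise KeyError(f"Set the {name} environment variable")
        return value
    return UserSecretsClient().get_secret(name)


# Usage (assumed names, matching the secret labels above):
# typesense_api_key = get_secret("typesense_api_key")
# TYPESENSE_HOST = get_secret("TYPESENSE_HOST")
```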

In [7]:
docsearch = Typesense.from_documents(
    docs,
    embeddings,
    typesense_client_params={
        "host": TYPESENSE_HOST,
        "port": "443",
        "protocol": "https",
        "typesense_api_key": typesense_api_key,
        "typesense_collection_name": "deeplearning",
    },
)



## Step 7: Test retrieval

Why this step matters

This verifies:

* embeddings work
* Typesense indexing worked
* retrieval returns relevant chunks

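If retrieval returns irrelevant chunks, the LLM has nothing useful to ground its answer in, so fix this step before building the chains.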

In [8]:
query = "what are applications of deep learning"
results = docsearch.similarity_search(query)
print(results[0].page_content)  # best-matching chunk

4. Applications of Deep Learning
(i) Computer Vision
Deep learning enables machines to interpret visual data. Applications include self-driving cars, surveillance systems, facial recognition, and medical diagnostics such as tumor detection.
(ii) Natural Language Processing
From chatbots and translation systems to voice assistants, deep learning allows machines to understand, generate, and respond to human language with increasing accuracy.
(iii) Healthcare and Science
Deep learning assists in drug discovery, disease prediction, genome analysis, and personalized treatment planning, significantly accelerating scientific research.

5. Advantages and Challenges
Advantages
* Automatically extracts features from raw data
* Performs well with large and complex datasets
* Achieves state-of-the-art accuracy in many domains
Challenges
* Requires massive computational power and data
* Models can be difficult to interpret (black-box nature)
*  Training is time-consuming and resource-intensive
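
Conceptually, similarity search embeds the query and ranks the stored chunk vectors by how close they are to it. The sketch below shows that ranking with brute-force cosine similarity over made-up 3-dimensional vectors. It is an illustration only: Typesense performs the search server-side over the real 384-dimensional vectors, and the texts, vectors, and function names here are invented for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, indexed, k=2):
    """Return the k (score, text) pairs most similar to the query vector."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in indexed]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy index: (chunk text, made-up embedding)
toy_index = [
    ("Applications of deep learning", [0.9, 0.1, 0.0]),
    ("History of neural networks", [0.1, 0.9, 0.0]),
    ("Training requires large compute", [0.0, 0.2, 0.9]),
]
query_vec = [1.0, 0.0, 0.0]  # pretend embedding of the query

hits = top_k(query_vec, toy_index, k=2)
for score, text in hits:
    print(f"{score:.3f}  {text}")
```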


## Step 8: Load local LLM (Gemma)

In [9]:
model_path = "/kaggle/input/gemma-2/transformers/gemma-2-2b-it/2"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)


## Step 9: Create Hugging Face pipeline

In [10]:
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=512,
    temperature=0.2,
    do_sample=True,
    return_full_text=False
)

Device set to use cpu
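
With `do_sample=True`, the next token is drawn from a probability distribution, and `temperature` rescales the model's logits before the softmax. A low value such as 0.2 concentrates probability on the most likely token, which keeps answers close to deterministic. The sketch below demonstrates the effect on three made-up logits; it illustrates the math only and is not the `transformers` sampling code.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Convert logits to probabilities after dividing them by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
default_probs = softmax_with_temperature(logits, temperature=1.0)
sharp_probs = softmax_with_temperature(logits, temperature=0.2)

print([round(p, 3) for p in default_probs])
print([round(p, 3) for p in sharp_probs])
```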


## Step 10: Wrap pipeline for LangChain

In [11]:
llm = HuggingFacePipeline(pipeline=pipe)



## Step 11: Basic RAG (RetrievalQA)

In [12]:
qa = RetrievalQA.from_chain_type(
    llm=llm,
    retriever = docsearch.as_retriever(),
    chain_type="stuff",
)
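
`chain_type="stuff"` means every retrieved chunk is placed ("stuffed") into a single prompt together with the question, and the LLM is called once. The sketch below shows the idea in plain Python. The prompt wording is illustrative and should not be read as LangChain's exact template; `build_stuff_prompt` is a name invented for this example.

```python
def build_stuff_prompt(question, chunks):
    """Join all retrieved chunks into one context block and append the question."""
    context = "\n\n".join(chunks)
    return (
        "Use the following pieces of context to answer the question at the end. "
        "If you don't know the answer, say that you don't know.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Helpful Answer:"
    )


retrieved = [
    "Deep learning enables machines to interpret visual data.",
    "Deep learning assists in drug discovery and disease prediction.",
]
prompt = build_stuff_prompt("what are applications of deep learning", retrieved)
print(prompt)
```

Because all chunks share one prompt, this approach only works while the combined chunks fit in the model's context window.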

## Step 12: Ask a question

In [13]:
ans = qa.invoke(query)

In [14]:
print(ans['result'])

 
Deep learning has a wide range of applications, including:

* **Computer Vision:** Self-driving cars, surveillance systems, facial recognition, and medical diagnostics.
* **Natural Language Processing:** Chatbots, translation systems, voice assistants, and sentiment analysis.
* **Healthcare and Science:** Drug discovery, disease prediction, genome analysis, and personalized treatment planning. 



## Step 13: Add memory

In [15]:
memory = ConversationBufferMemory(
    memory_key="chat_history",
    return_messages=True
)



## Step 14: Conversational RAG

In [16]:
QA = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever = docsearch.as_retriever(),
    memory=memory,
)
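
A follow-up such as "Explain in detail" is useless as a search query on its own. When chat history is present, `ConversationalRetrievalChain` first asks the LLM to rewrite the follow-up into a standalone question, then retrieves with that rewritten question. The sketch below builds such a rewrite prompt in plain Python. The wording and the helper names are assumptions made for illustration, not the library's exact template.

```python
def format_history(turns):
    """Render (human, assistant) turns as alternating labelled lines."""
    lines = []
    for human, assistant in turns:
        lines.append(f"Human: {human}")
        lines.append(f"Assistant: {assistant}")
    return "\n".join(lines)


def build_condense_prompt(turns, follow_up):
    """Ask the LLM to turn a follow-up into a question that stands on its own."""
    return (
        "Given the following conversation and a follow up question, "
        "rephrase the follow up question to be a standalone question.\n\n"
        f"Chat History:\n{format_history(turns)}\n"
        f"Follow Up Input: {follow_up}\n"
        "Standalone question:"
    )


history = [
    (
        "what are applications of deep learning",
        "Deep learning has a wide range of applications.",
    )
]
condense_prompt = build_condense_prompt(history, "Explain in detail")
print(condense_prompt)
```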

## Step 15: Ask & follow up

In [17]:
ans = QA.invoke(query)

In [18]:
print(ans['answer'])

 
Deep learning has a wide range of applications, including:

* **Computer Vision:** Self-driving cars, surveillance systems, facial recognition, and medical diagnostics.
* **Natural Language Processing:** Chatbots, translation systems, voice assistants, and text analysis.
* **Healthcare and Science:** Drug discovery, disease prediction, genome analysis, and personalized treatment planning. 



In [19]:
followup = QA.invoke("Explain in detail")

In [20]:
print(followup['answer'])



Deep learning is revolutionizing healthcare and science by accelerating research and improving diagnoses. Here's how:

**Healthcare:**

* **Drug Discovery:** Deep learning analyzes vast datasets of molecules and biological information to identify potential drug candidates, significantly reducing the time and cost of drug development.
* **Disease Prediction:** By analyzing patient data like medical history, genetic information, and lifestyle choices, deep learning models can predict the risk of developing diseases like cancer, heart disease, and Alzheimer's. This allows for early intervention and personalized treatment plans.
* **Medical Imaging Analysis:** Deep learning algorithms can analyze medical images like X-rays, CT scans, and MRIs to detect tumors, fractures, and other abnormalities with greater accuracy and speed than traditional methods.
* **Personalized Treatment Planning:** By analyzing a patient's medical history, genetic information, and lifestyle choices, deep learning