003_Production level RAG Workshop: Part 1

What is RAG?

Definition

RAG stands for:

Retrieval-Augmented Generation

It combines:

Retrieval
- Fetching relevant information from an external knowledge source
Generation
- Using an LLM to generate a response

Instead of relying only on the LLM’s pretrained knowledge, RAG supplements the LLM with retrieved context from external documents.

External tools

Lovable
Supabase

Why RAG Exists

Problem: Context Window Limitations

Consider a 1200-page nutrition textbook.

A naive solution:

   Question + Entire PDF
            ↓
           LLM

Problems:

Too many tokens
High cost
Context window overflow
Hallucinations
Slow responses

Example discussed:

PDF size ≈ 400K tokens

GPT context window ≈ 128K tokens

The entire document is roughly three times larger than the context window, so it cannot be sent to the model in a single request.
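
The arithmetic behind this example can be checked with a short sketch. The two token counts are the approximate figures quoted above; the function names and the optional reserved_tokens parameter (room kept free for the question and the answer) are assumptions made for illustration.

```python
import math


def fits_in_context(doc_tokens: int, context_window: int, reserved_tokens: int = 0) -> bool:
    """Return True if the document fits in what is left of the context window."""
    return doc_tokens <= context_window - reserved_tokens


def min_requests(doc_tokens: int, context_window: int, reserved_tokens: int = 0) -> int:
    """Smallest number of separate requests needed to show the model every token once."""
    usable = context_window - reserved_tokens
    if usable <= 0:
        raise ValueError("reserved_tokens must be smaller than context_window")
    return math.ceil(doc_tokens / usable)


PDF_TOKENS = 400_000       # approximate size of the textbook from the example
CONTEXT_WINDOW = 128_000   # approximate context window from the example

print(fits_in_context(PDF_TOKENS, CONTEXT_WINDOW))  # False
print(min_requests(PDF_TOKENS, CONTEXT_WINDOW))     # 4
```

Even with nothing reserved for the question or the answer, the textbook would have to be split across at least four requests.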

Hallucination Problem

When the relevant information is missing from the prompt:

The LLM may answer from pretrained knowledge rather than the provided document.

This leads to:

Incorrect answers
Non-grounded responses
Hallucinations

RAG reduces this problem by placing only the relevant document sections in the prompt, so the model can ground its answer in the document.

Open-Book Exam Analogy

The workshop explains RAG using an open-book exam.

Without RAG

    A student answers questions using memory only.

Equivalent:

    User Question
          ↓
         LLM
          ↓
       Answer

With RAG

A student:

Searches the book
Finds relevant pages
Uses both retrieved information and existing knowledge

Equivalent:

    User Question
           ↓
     Retrieval
           ↓
    Relevant Context
           ↓
          LLM
           ↓
        Answer

This is the core intuition behind RAG.
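
The With RAG flow above can be sketched in a few lines. This is a toy illustration, not the workshop's implementation: the sample chunks are made up, retrieval is plain word overlap rather than embeddings, and fake_llm is a hypothetical stand-in for a real model call.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Rank chunks by how many words they share with the question (toy scoring)."""
    question_words = tokenize(question)
    scored = [(len(question_words & tokenize(chunk)), chunk) for chunk in chunks]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    # Drop chunks that share no words with the question at all.
    return [chunk for score, chunk in ranked[:top_k] if score > 0]


def build_prompt(question: str, context: list[str]) -> str:
    """Place the retrieved context ahead of the question."""
    context_block = "\n".join(f"- {chunk}" for chunk in context)
    return (
        "Answer using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context_block}\n\n"
        f"Question: {question}"
    )


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call; it only reports the prompt size."""
    return f"[model reply to a {len(prompt)}-character prompt]"


def answer(question: str, chunks: list[str]) -> str:
    """User Question -> Retrieval -> Relevant Context -> LLM -> Answer."""
    context = retrieve(question, chunks)
    return fake_llm(build_prompt(question, context))


# Made-up sample chunks standing in for pages of the textbook.
CHUNKS = [
    "Vitamin C supports the immune system and is found in citrus fruits.",
    "Protein is made of amino acids and helps repair muscle tissue.",
    "Iron carries oxygen in the blood and is found in red meat and spinach.",
]

question = "Which foods contain vitamin C?"
print(retrieve(question, CHUNKS))  # only the vitamin C chunk is returned
print(build_prompt(question, retrieve(question, CHUNKS)))
```

Only the matching chunk reaches the model, which is the "finds relevant pages" step of the analogy. A production system would usually replace the word-overlap scoring with embedding similarity.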

Evolution of RAG

RAG in 2021

Main objective:

Reduce hallucinations

Architecture:

    Documents
       ↓
    Retrieval
       ↓
    LLM
       ↓
    Answer

RAG Today

RAG is viewed as part of a larger discipline:

Context Engineering

Components include:

Retrieval
Prompt Engineering
Memory
State Management
Embeddings
Vector Databases
Long Context Windows

Modern RAG is therefore a subset of context engineering.

Context Engineering

Definition

    Context engineering is the practice of managing all information that enters an LLM's context.

Components:

Retrieval Context

     Information fetched from knowledge sources.

Conversation Memory

     Previous user interactions.

Application State

     Current workflow information.

Prompt Design

     Instructions guiding the model.

Storage Layer

Vector databases
Traditional databases
Hybrid storage systems

The workshop positions context engineering as the next evolution beyond prompt engineering.

Real-Life Example

Imagine a shopkeeper asks:

 "How much should I charge customer Amit?"

To answer, the AI needs:

  Customer name = Amit
  Products in cart
  Discount rules
  GST rules
  Wallet balance

Without this information:

  AI = Guessing

With this information:

  AI = Accurate

Providing all this information is Context Engineering.
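
One way to make the shopkeeper example concrete is to model the required information as a record and check what is still missing before the model is asked anything. This is a minimal sketch; the BillingContext class, its field names, and missing_fields are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class BillingContext:
    """The pieces of information listed above; None means 'not provided yet'."""
    customer_name: Optional[str] = None
    cart_items: Optional[list] = None
    discount_rules: Optional[str] = None
    gst_rules: Optional[str] = None
    wallet_balance: Optional[float] = None


def missing_fields(ctx: BillingContext) -> list[str]:
    """Return the names of all fields that have not been filled in."""
    return [f.name for f in fields(ctx) if getattr(ctx, f.name) is None]


# Only the customer's name is known: the model would have to guess the rest.
partial = BillingContext(customer_name="Amit")
print(missing_fields(partial))
# ['cart_items', 'discount_rules', 'gst_rules', 'wallet_balance']
```

An application could refuse to call the model, or fetch the missing data first, whenever this list is not empty.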

Today: Context Engineering

Now we provide much more than a prompt.

           Prompt
              +
           Company Documents
              +
           Customer Data
              +
           Previous Chats
              +
           Current Order
              +
           Database Information
             ↓
           LLM

This is Context Engineering.
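
The diagram above can be sketched as a function that joins each source into a labelled section of the final model input. This is an illustrative sketch only: assemble_context is a hypothetical helper, the sample values are invented, and placing the prompt last is a design assumption rather than something the workshop prescribes.

```python
def assemble_context(prompt: str, sources: dict[str, str]) -> str:
    """Join every non-empty source into a labelled section, then append the prompt."""
    sections = [
        f"[{label}]\n{text.strip()}"
        for label, text in sources.items()
        if text and text.strip()  # skip sources that have nothing to contribute
    ]
    sections.append(f"[Prompt]\n{prompt.strip()}")
    return "\n\n".join(sections)


# Hypothetical sample values; a real application would load these from its own systems.
sources = {
    "Company Documents": "Discounts are applied before GST.",
    "Customer Data": "Customer: Amit",
    "Previous Chats": "",  # empty, so it is left out of the final input
    "Current Order": "2 items in cart",
    "Database Information": "Wallet balance on file",
}

print(assemble_context("How much should I charge customer Amit?", sources))
```

Labelling each section keeps the sources distinguishable inside a single input and makes it straightforward to drop or trim one of them when the context window is tight.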
