Farru049/mini-rag

Mini RAG - Semantic Search Fundamentals

This project is a hands-on exploration of how modern AI retrieval systems work internally.

Instead of jumping straight into frameworks or full RAG pipelines, this project builds up the core foundation step by step:

Text → Embeddings → Similarity Search → Retrieval

The goal is not just to "use AI tools", but to understand what actually happens behind systems like:

  • ChatGPT Retrieval
  • RAG Pipelines
  • AI Search Engines
  • Vector Databases
  • AI Document Search Systems

What We Are Building

We are building a small semantic retrieval engine.

Traditional search systems work using exact keyword matching.

Example:

Query: "CEO of OpenAI"

A document matches only if it contains those exact words.

Semantic Search works differently.

It tries to understand the meaning of text.

Example:

"Who runs OpenAI?"

can still retrieve:

"The CEO of the company is Sam Altman."

even though the words are different.

This is the core idea behind modern AI retrieval systems.
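To make the contrast concrete, here is a minimal sketch of exact keyword matching. The helper names `tokenize` and `keyword_match` are illustrative and not part of this project. It shows the paraphrased query failing to match even though the sentence answers it.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into a set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def keyword_match(query, document):
    """Return True only if every query word appears in the document."""
    return tokenize(query) <= tokenize(document)

document = "The CEO of the company is Sam Altman."

# Query words that literally occur in the sentence: match.
exact_hit = keyword_match("CEO company", document)

# Same intent, different words: no match, because nothing overlaps.
paraphrase_hit = keyword_match("Who runs OpenAI?", document)

print(exact_hit)       # True
print(paraphrase_hit)  # False
```

Semantic search closes this gap by comparing meaning vectors instead of word overlap.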


Concepts Covered So Far

1. Embeddings

The model converts text into dense numerical vectors called embeddings.

Example:

"The CEO of OpenAI"
      ↓
[0.12, -0.44, 0.91, ...]

These vectors capture semantic meaning instead of exact words.

Texts with similar meaning produce vectors that are closer together in vector space.


2. Sentence Transformers

We use:

from sentence_transformers import SentenceTransformer
model = SentenceTransformer('all-MiniLM-L6-v2')

This is a pretrained embedding model optimized for semantic similarity tasks.

Its job is to transform text into embeddings.
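A minimal sketch of the embedding step, assuming the `sentence-transformers` package is installed. The wrapper names `load_model` and `embed_texts` are illustrative and may not match how this project organizes its code.

```python
def load_model(name="all-MiniLM-L6-v2"):
    """Load a pretrained sentence-transformers model (downloaded on first use)."""
    # Imported lazily so this file can be imported without the package installed.
    from sentence_transformers import SentenceTransformer
    return SentenceTransformer(name)

def embed_texts(model, texts):
    """Encode a list of strings into unit-length embedding vectors."""
    # normalize_embeddings=True scales every vector to length 1, so the dot
    # product of two embeddings equals their cosine similarity.
    return model.encode(texts, normalize_embeddings=True)
```

Typical usage would be `model = load_model()` followed by `vectors = embed_texts(model, ["The CEO of OpenAI"])`. With this model each sentence becomes a 384-dimensional vector, so `vectors.shape` would be `(1, 384)`.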


3. Semantic Search

Instead of:

  • keyword search
  • exact matching

we perform:

  • meaning-based retrieval

This allows related sentences to be retrieved even when the wording changes.


4. Cosine Similarity

After converting text into vectors, we compare them mathematically using cosine similarity.

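A higher cosine similarity score means:

  • the vectors point in more similar directions (the angle between them is smaller)
  • the meanings are more similar

Scores range from -1 to 1 and depend only on direction, not on vector length.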

Example:

0.92 → highly similar
0.15 → weak similarity
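The arithmetic behind the score is small enough to write out by hand. This is a dependency-free sketch for illustration. In practice a library routine such as scikit-learn's `cosine_similarity` does the same computation on whole matrices. It assumes neither vector is all zeros.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b (both must be non-zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Same direction, different length: score is approximately 1.0.
same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])

# Perpendicular vectors: score is 0.0.
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])

# Opposite directions: score is -1.0.
opposite = cosine_similarity([1.0, 0.0], [-1.0, 0.0])
```

Because only the angle matters, a long sentence and a short one can still score close to 1.0 if their embeddings point the same way.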

5. Top-K Retrieval

Instead of retrieving only one result, we retrieve the Top-K most relevant sentences.

Example:

top_k = 2

Real retrieval systems do the same: they select the Top-K passages and pass them to the LLM as context.
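Selecting the Top-K results is a sort over the similarity scores. In the sketch below the scores are made up for illustration, the third sentence is hypothetical, and `top_k_indices` is an illustrative helper name.

```python
def top_k_indices(scores, k):
    """Return the indices of the k highest scores, best first."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]

sentences = [
    "The CEO of the company is Sam Altman.",
    "The headquarters of the company is located in San Francisco.",
    "The company was founded as a research lab.",  # hypothetical sentence
]

# Made-up similarity scores for the query "Where is the company headquartered?"
scores = [0.21, 0.75, 0.34]

top_k = 2
best = top_k_indices(scores, top_k)
for i in best:
    print(f"{scores[i]:.2f}  {sentences[i]}")
```

Here `best` is `[1, 2]`: the headquarters sentence ranks first, followed by the next closest sentence.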


6. Multi-Query Retrieval

The system now supports multiple queries.

For each query:

  1. Generate query embedding
  2. Compare against stored sentence embeddings
  3. Rank by similarity
  4. Retrieve Top-K matches
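The four steps above can be sketched as one loop. This is not necessarily how the project's script is written. The function name `retrieve` and the `embed` parameter are assumptions made for illustration. `embed` is any callable that maps a list of strings to unit-length vectors, for example a wrapper around `model.encode(texts, normalize_embeddings=True)`. The demo uses hand-written 2-D vectors so it runs without downloading a model.

```python
def dot(a, b):
    """Dot product; equals cosine similarity when both vectors have length 1."""
    return sum(x * y for x, y in zip(a, b))

def retrieve(queries, sentences, embed, top_k=2):
    """For each query, return its top_k (sentence, score) pairs, best first."""
    sentence_vecs = embed(sentences)  # embed the stored sentences once
    results = {}
    for query, query_vec in zip(queries, embed(queries)):  # 1. query embedding
        scores = [dot(query_vec, vec) for vec in sentence_vecs]  # 2. compare
        ranked = sorted(zip(sentences, scores),
                        key=lambda pair: pair[1], reverse=True)  # 3. rank
        results[query] = ranked[:top_k]  # 4. keep the Top-K matches
    return results

# Toy stand-in for an embedding model: hand-picked unit vectors (hypothetical).
toy_vectors = {
    "ceo query": [1.0, 0.0],
    "hq query": [0.0, 1.0],
    "sentence about the CEO": [0.8, 0.6],
    "sentence about the headquarters": [0.6, 0.8],
}

def toy_embed(texts):
    return [toy_vectors[t] for t in texts]

results = retrieve(
    ["ceo query", "hq query"],
    ["sentence about the CEO", "sentence about the headquarters"],
    toy_embed,
    top_k=1,
)
```

Embedding the stored sentences once, outside the query loop, avoids recomputing them for every query.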

Real World Insight

One important observation:

Semantic similarity ≠ factual understanding

Example:

A query about the CEO may sometimes retrieve:

  • company-related information
  • organization-related information

instead of the exact factual sentence.

Why?

Because embeddings capture semantic closeness, not strict factual reasoning.

This is one of the major challenges in real-world AI retrieval systems.


Why Modern RAG Systems Are More Advanced

Production systems improve retrieval using:

  • Better embedding models
  • Re-ranking models
  • Hybrid search
  • Metadata filtering
  • Vector databases

This project focuses on understanding the foundation first.


Current Retrieval Flow

User Query
    ↓
Embedding Model
    ↓
Query Vector
    ↓
Cosine Similarity Search
    ↓
Top-K Retrieval
    ↓
Relevant Sentences

This is already the core retrieval backbone behind:

  • RAG systems
  • AI search
  • semantic document retrieval
  • vector database search

Technologies Used

  • Python
  • sentence-transformers
  • scikit-learn

Installation

Install dependencies:

pip install sentence-transformers scikit-learn

Model Used

all-MiniLM-L6-v2

A lightweight and fast sentence-transformer model for semantic similarity tasks.


Example Queries

multiple_queries = [
    "Can you tell me about the CEO of OpenAI?",
    "Where is the company headquartered?",
    "Which company's headquarters are in San Francisco?",
    "The main goal of OpenAI?"
]

Example Output

Query: Where is the company headquartered?

Relevant Sentence:
The headquarters of the company is located in San Francisco.

Similarity Score: 0.7475

What Comes Next

This project will gradually evolve into a complete mini-RAG pipeline.

Next concepts:

  • Chunking
  • Vector Databases
  • FAISS / ChromaDB
  • Storing embeddings
  • Retrieval from documents
  • Context injection into LLMs
  • Full RAG pipeline

Main Learning Goal

The focus of this project is not just building features.

The focus is understanding:

  • how retrieval actually works
  • why embeddings matter
  • how semantic search differs from traditional search
  • why vector databases exist
  • how modern RAG systems are built internally
