In [7]:
!pip install sentence-transformers
!mkdir -p third_party
!git clone https://github.com/EndeeLabs/endee.git third_party/endee

fatal: destination path 'third_party/endee' already exists and is not an empty directory.


In [8]:
!ls third_party/endee

CMakeLists.txt	 docker-compose.yml  install.sh  README.md  third_party
CONTRIBUTING.md  infra		     LICENSE	 src


In [9]:
!mkdir -p src

In [10]:
%%writefile src/vector_store.py

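import numpy as np


class EndeeVectorStore:
    """
    In-memory stand-in for the Endee vector database layer.
    Keeps vectors in Python lists and ranks them by cosine similarity.
    """

    def __init__(self):
        self.vectors = []
        self.metadata = []

    def add_vectors(self, vectors, metadata):
        # Vectors and metadata are matched by index, so lengths must agree.
        if len(vectors) != len(metadata):
            raise ValueError("vectors and metadata must have the same length")
        self.vectors.extend(vectors)
        self.metadata.extend(metadata)

    def search(self, query_vector, top_k=3):
        # The query norm is the same for every comparison, so compute it once.
        query_norm = np.linalg.norm(query_vector)

        scores = []
        for vec, meta in zip(self.vectors, self.metadata):
            denom = query_norm * np.linalg.norm(vec)
            # A zero-norm vector scores 0.0 instead of triggering a division by zero.
            score = float(np.dot(query_vector, vec) / denom) if denom else 0.0
            scores.append((score, meta))

        # Highest cosine similarity first.
        scores.sort(reverse=True, key=lambda x: x[0])
        return scores[:top_k]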

Overwriting src/vector_store.py
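
The loop in `search` computes one cosine similarity per stored vector. As a rough sketch of an alternative, the same ranking can be done with a single matrix-vector product. The function name `cosine_top_k` and its `(index, score)` return shape are illustrative assumptions, not part of the project, and the sketch assumes at least one stored vector.

```python
import numpy as np


def cosine_top_k(query_vector, vectors, top_k=3):
    """Return (index, score) pairs for the top_k rows most similar to the query."""
    matrix = np.asarray(vectors, dtype=np.float64)
    query = np.asarray(query_vector, dtype=np.float64)

    # Product of each row norm with the query norm (the cosine denominator).
    norms = np.linalg.norm(matrix, axis=1) * np.linalg.norm(query)

    # One matrix-vector product yields every dot product at once.
    dots = matrix @ query

    # Rows with a zero denominator keep a score of 0.0.
    scores = np.divide(dots, norms, out=np.zeros_like(dots), where=norms > 0)

    # argsort is ascending, so reverse it and keep the first top_k indices.
    order = np.argsort(scores)[::-1][:top_k]
    return [(int(i), float(scores[i])) for i in order]


vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
results = cosine_top_k([1.0, 0.0], vectors, top_k=2)
print(results)
```

Returning indices rather than metadata keeps the function independent of how metadata is stored; a caller could look up `self.metadata[i]` afterwards.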


In [11]:
%%writefile src/query.py

from sentence_transformers import SentenceTransformer

def query_documents(store, question, top_k=3):
    """
    Retrieves relevant document chunks using Endee-backed
    vector similarity search.
    """
    model = SentenceTransformer("all-MiniLM-L6-v2")
    query_embedding = model.encode(question)

    results = store.search(query_embedding, top_k=top_k)
    context = " ".join([r[1]["text"] for r in results])
    return context

Overwriting src/query.py


In [12]:
%%writefile src/ingest.py

from sentence_transformers import SentenceTransformer
from vector_store import EndeeVectorStore

def ingest_documents(texts):
    """
    Converts documents into embeddings and stores them
    in the Endee vector database abstraction.
    """
    model = SentenceTransformer("all-MiniLM-L6-v2")
    store = EndeeVectorStore()

    embeddings = model.encode(texts)
    metadata = [{"text": t} for t in texts]

    store.add_vectors(embeddings, metadata)
    return store


Writing src/ingest.py
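
Both `ingest_documents` and `query_documents` construct `SentenceTransformer("all-MiniLM-L6-v2")` on every call, so a single run of the app loads the model twice. One possible way to avoid that is a small cached loader shared by both modules. The helper name `get_model` is hypothetical and not part of the project; this is a sketch, not the project's implementation.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def get_model(name="all-MiniLM-L6-v2"):
    """Load the embedding model once per process and reuse it afterwards."""
    # Imported lazily so that importing this module stays cheap.
    from sentence_transformers import SentenceTransformer

    return SentenceTransformer(name)
```

With this in place, `ingest.py` and `query.py` could call `get_model().encode(...)` instead of building their own model. `lru_cache` keys on the arguments as passed, so the helper should be called the same way everywhere (for example, always with no arguments) to hit the same cache entry.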


In [13]:
%%writefile src/app.py

from ingest import ingest_documents
from query import query_documents

if __name__ == "__main__":
    documents = [
        "Endee is a high-performance vector database engine written in C++.",
        "Retrieval Augmented Generation combines vector search with text generation.",
        "Vector databases enable semantic and contextual search."
    ]

    store = ingest_documents(documents)

    question = input("Ask a question: ")
    context = query_documents(store, question)

    print("\nRetrieved Context:")
    print(context)


Writing src/app.py


In [14]:
!ls src

app.py	ingest.py  query.py  vector_store.py


In [15]:
!python src/app.py

2026-02-01 12:55:15.314726: I external/local_xla/xla/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.
2026-02-01 12:55:15.320131: I external/local_xla/xla/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.
2026-02-01 12:55:15.333879: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:467] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1769950515.358280    1990 cuda_dnn.cc:8579] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1769950515.365561    1990 cuda_blas.cc:1407] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
W0000 00:00:1769950515.384072    1990 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linkin

In [16]:
%%writefile requirements.txt
sentence-transformers
numpy

Writing requirements.txt


In [17]:
%%writefile README.md
# Document Question Answering using RAG with Endee

## Project Overview
This project implements a Retrieval Augmented Generation (RAG) based document question answering system using Endee as the vector database. The system allows users to ask natural language questions over a set of documents and retrieves semantically relevant context using vector similarity search.

The focus of this project is to demonstrate how a vector database can be integrated into an AI/ML pipeline for semantic search and retrieval-based applications.

---

## Problem Statement
Traditional keyword-based search techniques often fail to capture the semantic meaning of text, especially when users ask natural language questions. This leads to poor retrieval results when working with large or unstructured document collections.

---

## Solution Approach
The solution converts documents into dense vector embeddings that capture semantic meaning. User queries are also embedded into vectors. Similarity search is then performed to retrieve the most relevant document chunks.
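
The similarity measure used in `src/vector_store.py` is cosine similarity. For a query embedding $q$ and a document embedding $d$ (symbols chosen here for illustration), it is:

$$
\operatorname{sim}(q, d) = \frac{q \cdot d}{\lVert q \rVert \, \lVert d \rVert}
$$

The value lies between $-1$ and $1$, and the documents with the highest scores are returned as the top-K results.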

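The retrieved context is the retrieval half of a Retrieval Augmented Generation (RAG) workflow: it can be passed to a text generator to produce more accurate, context-aware responses. This project prints the retrieved context and does not include a generation step.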

---

## Use of Endee
Endee is a high-performance vector database engine written in C++ that supports efficient vector storage and similarity search using HNSW-based indexing.

In this project:
- Python is used for embedding generation and RAG orchestration
- Endee is the intended vector database layer for storing and retrieving embeddings
- An abstraction layer (`EndeeVectorStore`) stands in for Endee in the system architecture; the current implementation keeps vectors in memory and ranks them by cosine similarity

Endee is included as an external dependency under the `third_party` directory.
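
One way to keep the rest of the pipeline independent of the storage backend is to code against an interface with the same two methods as `EndeeVectorStore`. The sketch below is illustrative only: `VectorStore`, `DotProductStore`, and `retrieve_texts` are hypothetical names, and the toy backend uses a plain dot product rather than any Endee API.

```python
from typing import Any, Dict, List, Protocol, Sequence, Tuple


class VectorStore(Protocol):
    """Hypothetical interface mirroring the two methods of EndeeVectorStore."""

    def add_vectors(self, vectors: Sequence[Any], metadata: Sequence[Dict]) -> None: ...

    def search(self, query_vector: Any, top_k: int = 3) -> List[Tuple[float, Dict]]: ...


class DotProductStore:
    """Toy backend used only to exercise the interface."""

    def __init__(self) -> None:
        self._rows: List[Tuple[Sequence[float], Dict]] = []

    def add_vectors(self, vectors, metadata) -> None:
        self._rows.extend(zip(vectors, metadata))

    def search(self, query_vector, top_k=3):
        scored = [
            (float(sum(q * v for q, v in zip(query_vector, vec))), meta)
            for vec, meta in self._rows
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:top_k]


def retrieve_texts(store: VectorStore, query_vector, top_k: int = 3) -> List[str]:
    # Works with any backend that provides add_vectors and search.
    return [meta["text"] for _, meta in store.search(query_vector, top_k=top_k)]


store = DotProductStore()
store.add_vectors([[1.0, 0.0], [0.0, 1.0]], [{"text": "first"}, {"text": "second"}])
texts = retrieve_texts(store, [0.2, 0.9], top_k=1)
print(texts)
```

Any class with matching `add_vectors` and `search` methods satisfies the protocol, so a backend that talks to a running Endee server could be swapped in without changing `retrieve_texts`.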

---

## System Architecture
Documents are converted into embeddings and stored in the Endee vector database.
User queries are embedded and matched against stored vectors using similarity search.
The most relevant document chunks are retrieved and returned as contextual output.

Architecture flow:
Documents → Embeddings → Endee Vector Database → Top-K Retrieval → Context Output
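
The flow above can be sketched end to end without downloading a model. In this minimal example the embedding is a toy word-count vector, which is an assumption made purely for illustration (the project itself uses `all-MiniLM-L6-v2`), and the helper names `embed`, `cosine`, and `retrieve_context` are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (a stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9+]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve_context(documents, question, top_k=2):
    # Documents -> embeddings (the "store" here is just a list).
    index = [(embed(doc), doc) for doc in documents]
    # Query -> embedding -> similarity ranking.
    query = embed(question)
    ranked = sorted(index, key=lambda item: cosine(query, item[0]), reverse=True)
    # Top-K retrieval -> context output.
    return " ".join(doc for _, doc in ranked[:top_k])


documents = [
    "Endee is a high-performance vector database engine written in C++.",
    "Retrieval Augmented Generation combines vector search with text generation.",
    "Vector databases enable semantic and contextual search.",
]
context = retrieve_context(documents, "What is Endee?", top_k=1)
print(context)
```

Word counts only match exact tokens, so this sketch misses paraphrases that a sentence embedding model would catch; it is meant to show the shape of the pipeline, not its retrieval quality.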

---

## Project Structure
- `src/`
  - `app.py`
  - `ingest.py`
  - `query.py`
  - `vector_store.py`
- `third_party/`
  - `endee/`
- `requirements.txt`
- `README.md`


---

## Setup Instructions

### Install Dependencies
```bash
pip install -r requirements.txt
```

### Run the Application
```bash
python src/app.py
```

---

## Example Usage

Question: What is Endee?

Retrieved Context:
Endee is a high-performance vector database engine written in C++. Vector databases enable semantic and contextual search. Retrieval Augmented Generation combines vector search with text generation.

---

## Conclusion

This project demonstrates how vector databases such as Endee can be used as a core component in modern AI applications. By combining semantic embeddings with efficient vector search, the system enables accurate and meaningful information retrieval for natural language queries.



Writing README.md
