A tiny semantic search app where you upload a `.txt` file in the browser, the server chunks and embeds it, and you query the document by meaning.
- Plain HTML + FastAPI: minimal setup, clean separation of UI and API.
- OpenAI embeddings (`text-embedding-3-small` by default): fast, cheap semantic vectors; `text-embedding-3-large` optional for higher quality.
- Paragraph-based chunking (≈200–1200 chars): a good context/precision trade-off without extra NLP dependencies; overlong blocks are hard-split.
- L2-normalized vectors + NumPy dot product: simplest, stable cosine similarity for top-K ranking.
- In-memory index + env-loaded key: no file URLs, private-by-default local demo, easy to run anywhere.
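The chunking choice above can be sketched in a few lines (a minimal illustration under the stated ≈200–1200 char thresholds; `chunk_paragraphs` and its exact merging rules are assumptions, not the app's actual code):

```python
def chunk_paragraphs(text: str, min_len: int = 200, max_len: int = 1200) -> list[str]:
    """Split on blank lines, merge small paragraphs, hard-split overlong blocks."""
    paras = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: list[str] = []
    buf = ""

    def flush():
        nonlocal buf
        if buf:
            chunks.append(buf)
            buf = ""

    for p in paras:
        if len(p) > max_len:          # hard-split blocks longer than max_len
            flush()
            while len(p) > max_len:
                chunks.append(p[:max_len])
                p = p[max_len:]
        if buf and len(buf) + 2 + len(p) > max_len:
            flush()                   # adding p would overflow the buffer
        buf = f"{buf}\n\n{p}" if buf else p
        if len(buf) >= min_len:       # big enough to stand alone
            flush()
    flush()
    return chunks
```

Merging small paragraphs keeps enough context per chunk for the embedding to be meaningful, while the hard cap keeps each chunk well inside the model's input limit.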
- Upload a local `.txt` file (no URLs) → server builds an embedding index
- Search with semantic similarity (cosine on normalized vectors)
- Choose a model: `text-embedding-3-small` (default) or `text-embedding-3-large`
- Simple, responsive HTML UI; fast, minimal backend; no persistence by default
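The similarity step above reduces to a dot product once vectors are L2-normalized. A NumPy sketch (illustrative only; `top_k` is an assumed name and the app's real signature may differ):

```python
import numpy as np

def top_k(query_vec: np.ndarray, index: np.ndarray, k: int = 5):
    """index: (n, d) matrix of L2-normalized chunk embeddings.
    For unit vectors, the dot product equals cosine similarity."""
    q = query_vec / np.linalg.norm(query_vec)   # normalize the query too
    scores = index @ q                          # (n,) cosine scores
    order = np.argsort(-scores)[:k]             # indices of the best matches
    return [(int(i), float(scores[i])) for i in order]
```

Normalizing once at index-build time means every search is a single matrix-vector product plus a partial sort, which is plenty fast for an in-memory demo.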
```text
your-folder/
├─ semantic_search.py   # FastAPI backend
├─ index.html           # Frontend (served at GET /)
└─ requirements.txt     # Python deps
```
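A plausible `requirements.txt` for this stack (the package list is an assumption inferred from the components described above; pin versions to taste):

```text
fastapi
uvicorn
openai
numpy
python-multipart   # FastAPI needs this for multipart file uploads
python-dotenv      # only if you load OPENAI_API_KEY from a .env file
```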
- Python 3.9+
- An OpenAI API key in the environment (`OPENAI_API_KEY`) or in a `.env` file
```bash
# 1) Create & activate a virtualenv
python3 -m venv .venv
source .venv/bin/activate   # Windows: .venv\Scripts\activate

# 2) Install deps
python -m pip install --upgrade pip
python -m pip install -r requirements.txt

# 3) Set your key (or create a .env with OPENAI_API_KEY=...)
export OPENAI_API_KEY=sk-...

# 4) Start the server (module form guarantees the correct venv)
python -m uvicorn semantic_search:app --reload --port 7860

# 5) Open the UI
# http://localhost:7860
```
- Open http://localhost:7860.
- Upload a `.txt` file and pick an embedding model.
- Click Build Index → the page receives and manages an `index_id` internally.
- Enter a query and a Top-K value (default 5), then click Search.
- Read the ranked chunks and similarity scores.
- `GET /` → serves `index.html`
- `POST /upload` (multipart)
  - fields: `file` (a `.txt`), `model` (optional, default `text-embedding-3-small`)
  - returns: `{ index_id, filename, model, num_chunks }`
- `POST /search` (JSON)
  - body: `{ index_id: str, query: str, k?: int }`
  - returns: `{ results: [{ rank, chunk_index, score, text }], k, filename }`
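Once the server is running, the endpoints can be exercised from a terminal (field names follow the API above; substitute the `index_id` returned by the upload call):

```bash
# Build an index from a local file (response includes index_id)
curl -s -F "file=@sample_terms.txt" -F "model=text-embedding-3-small" \
  http://localhost:7860/upload

# Search it
curl -s -X POST http://localhost:7860/search \
  -H "Content-Type: application/json" \
  -d '{"index_id": "YOUR_INDEX_ID", "query": "refund policy", "k": 3}'
```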
Create `sample_terms.txt` and paste:
```text
ACMECLOUD SERVICE AGREEMENT (EXCERPT)

Refunds
We provide a 14-day cooling-off period for first-time subscriptions...

SLA
We target 99.9% monthly uptime...

Data Retention and Deletion
After termination, workspace is read-only for 30 days...

Security and Compliance
SOC 2 Type II, ISO 27001, TLS 1.2+, AES-256 at rest...
```
Try queries like:
- “cooling-off refund policy”
- “uptime guarantee credits”
- “how long do you keep data after termination”
- Keywords: semantic search, embeddings, chunking, vectors