A multi-source synonym discovery tool with frequency-band filtering. Combines WordNet and fastText to surface candidates across the full Zipf range. Pick a corpus, pick a frequency band, get synonyms.
Live at synonymicon.xyz.
- Claude Opus 4.5
- Claude Opus 4.6
- Claude Opus 4.7
- Perplexity Computer
- Xiaomi MiMo-V2-Pro
- Xiaomi MiMo-V2.5-Pro
- MiniMax M2.7
- Python 3.12 + Flask (synchronous, single-process, no database)
- wordfreq for default frequency
- NLTK WordNet for primary synonyms
- fastText (`fasttext-wiki-news-subwords-300` via gensim) for secondary candidates
- Included frequency corpora: wordfreq, SUBTLEX-US, BNC, Google 1-grams, Wikipedia, Kaggle, OpenSubtitles, Project Gutenberg, Leipzig News 2025, Leipzig Web COM 2018, Leipzig Web UK 2018
- Definition fallback chain: Wiktionary REST API → Webster's 1913 (local) → WordNet gloss → `[undefined]`
- Vanilla single-page frontend (no build step, no framework)
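
The definition fallback chain can be pictured as "try each source in order, keep the first non-empty answer". The sketch below is illustrative only and is not taken from `app.py`: the function name `first_definition` and the three stub providers are hypothetical stand-ins for the Wiktionary, Webster's 1913, and WordNet lookups.

```python
from typing import Callable, Iterable, Optional

# A provider takes a word and returns a definition, or None when it has nothing.
Provider = Callable[[str], Optional[str]]


def first_definition(
    word: str,
    providers: Iterable[Provider],
    placeholder: str = "[undefined]",
) -> str:
    """Return the first non-empty definition, trying providers in order."""
    for provider in providers:
        try:
            result = provider(word)
        except Exception:
            # A failing source (e.g. a network timeout) should not break the chain.
            continue
        if result and result.strip():
            return result.strip()
    return placeholder


# Hypothetical stubs standing in for the real sources (assumptions, not real data).
def wiktionary_stub(word: str) -> Optional[str]:
    raise TimeoutError("simulated network failure")


def webster_stub(word: str) -> Optional[str]:
    return {"lucid": "Shining; bright; clear."}.get(word)


def wordnet_stub(word: str) -> Optional[str]:
    return {"limpid": "clear and bright"}.get(word)


CHAIN = [wiktionary_stub, webster_stub, wordnet_stub]
```

With these stubs, `first_definition("lucid", CHAIN)` falls through the failing first source to the second, and a word unknown to every source yields the `[undefined]` placeholder.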
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
python scripts/setup_nltk.py

The fastText model (~1 GB) downloads on first run via gensim and is cached under `~/gensim-data/`.
flask run --no-reload

Use `--no-reload` because fastText loads at module scope and the reloader would spawn two processes that both load it. Startup takes ~2.5–3 minutes.
Server on localhost:5000.
gunicorn -w 1 -t 120 -b 127.0.0.1:5000 app:app

- `-w 1` (one worker) is intentional; each worker loads ~1.5 GB of model + corpus data.
- `-t 120` keeps gunicorn from killing the worker during the long startup.
- Run behind a reverse proxy (nginx, Caddy) for TLS.
- Resident memory: ~1.5–2 GB (fastText ~1 GB, corpora ~200 MB, runtime).
- Cold start: ~2.5–3 minutes.
- Not compatible with serverless or sleep-on-idle hosting.
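
The gunicorn flags above can also be kept in a config file, which gunicorn reads as plain Python. This is a minimal sketch, assuming a file named `gunicorn.conf.py` in the project root (the repository does not ship one); the values simply mirror `-w 1 -t 120 -b 127.0.0.1:5000`.

```python
# gunicorn.conf.py (hypothetical file; mirrors the command-line flags)

# Bind to loopback only; the reverse proxy handles TLS and public traffic.
bind = "127.0.0.1:5000"

# One worker on purpose: each worker loads its own copy of the model and corpora.
workers = 1

# Worker timeout in seconds, matching -t 120.
timeout = 120
```

It would then be started with `gunicorn -c gunicorn.conf.py app:app`.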
GET /synonyms?word=<x>&tier=<t>&pos=<p>&corpus=<c>
Returns JSON: [{word, zipf, definition, band}, ...].
| Param | Values |
|---|---|
| `word` | required; up to 2 words for phrase queries |
| `tier` | `all`, `common`, `uncommon`, `rare`, `exotic`, `absurd` (or comma-separated) |
| `pos` | `all`, `noun`, `verb`, `adj`, `adv` (or comma-separated) |
| `corpus` | `wordfreq` (default), `subtlex`, `bnc`, `google_1grams`, `wikipedia`, `kaggle`, `opensubtitles`, `gutenberg`, `leipzig_news`, `leipzig_web_com`, `leipzig_web_uk` |
| `min`, `max` | optional Zipf floats (advanced mode; overrides `tier`) |
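
As a rough illustration of calling the endpoint, the sketch below builds a query URL from the parameters in the table and fetches the JSON with the standard library. The helper names `build_url` and `fetch_synonyms` are hypothetical, and the base URL assumes the local development server on `localhost:5000`.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:5000"  # assumption: local dev server


def build_url(word, tier="all", pos="all", corpus="wordfreq",
              zipf_min=None, zipf_max=None, base_url=BASE_URL):
    """Build a /synonyms URL; min/max are only sent when given."""
    params = {"word": word, "tier": tier, "pos": pos, "corpus": corpus}
    if zipf_min is not None:
        params["min"] = zipf_min
    if zipf_max is not None:
        params["max"] = zipf_max
    # urlencode percent-escapes commas and turns spaces into '+'.
    return f"{base_url}/synonyms?{urlencode(params)}"


def fetch_synonyms(word, **kwargs):
    """Call the endpoint and return the decoded JSON list."""
    with urlopen(build_url(word, **kwargs), timeout=30) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Requires a running server; each item carries word, zipf, definition, band.
    for item in fetch_synonyms("happy", tier="rare,exotic", pos="adj"):
        print(item["word"], item["zipf"], item["band"])
```

Passing `zipf_min` and `zipf_max` corresponds to the advanced mode, where the explicit Zipf range takes precedence over `tier`.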
app.py Flask app (all backend logic)
data/ Corpus files + Webster's 1913
static/index.html Single-page frontend (HTML + inline CSS + inline JS)
scripts/setup_nltk.py One-time NLTK data download
requirements.txt Pinned dependencies
CLAUDE.md Architecture and design rationale
MIT — see LICENSE.
Frequency corpora are credited in-app under the "corpora" link in the footer.