Improves Nepali retrieval quality and adds fully local LLM support via Ollama.
What's new
- Multilingual embeddings: switched to intfloat/multilingual-e5-small for better Nepali query understanding (was all-MiniLM-L6-v2)
- Balanced retrieval weights: BM25 0.5 / vector 0.5 (was 0.6 / 0.4)
- Nepali query preprocessing: NFC normalisation, question suffix stripping, and whitespace collapse for cleaner BM25 keyword overlap
- OllamaClient: fully offline answer generation via local Ollama (qwen2.5:7b recommended)
- download_corpus(): fetch the 5-document seed corpus from GitHub without cloning the repo (works in Colab)
- Cache invalidation: embedding cache clears automatically when the embedding model changes
- Live Colab demo: 4 validated use cases, Nepali + English, upload your own PDFs
- 18 tests passing
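To make the preprocessing and weighting items above concrete, here is a minimal sketch of how such a pipeline could look. It is not the package's actual code: the function names `preprocess_query` and `fuse_scores`, the suffix list, and the min-max normalisation step are all assumptions made for illustration. Only the three preprocessing steps and the 0.5 / 0.5 weights come from these notes.

```python
import re
import unicodedata

# Hypothetical suffix list for illustration; the package's real list is not
# given in these notes. Longer suffixes come first so they match first.
QUESTION_SUFFIXES = ("के हो", "हो")


def preprocess_query(query: str) -> str:
    """NFC-normalise, collapse whitespace, and strip a trailing question suffix."""
    text = unicodedata.normalize("NFC", query)
    text = re.sub(r"\s+", " ", text).strip()
    # Drop a trailing question mark or danda before looking for a suffix.
    text = text.rstrip("?।").strip()
    for suffix in QUESTION_SUFFIXES:
        # Require a preceding space so we never cut into the middle of a word.
        if text.endswith(" " + suffix):
            text = text[: -(len(suffix) + 1)].strip()
            break
    return text


def _min_max(scores: dict[str, float]) -> dict[str, float]:
    """Rescale scores to [0, 1] so BM25 and cosine scores are comparable (assumed step)."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def fuse_scores(
    bm25: dict[str, float],
    vector: dict[str, float],
    w_bm25: float = 0.5,
    w_vector: float = 0.5,
) -> list[tuple[str, float]]:
    """Combine per-document scores as a weighted sum, highest fused score first."""
    nb, nv = _min_max(bm25), _min_max(vector)
    fused = {
        doc: w_bm25 * nb.get(doc, 0.0) + w_vector * nv.get(doc, 0.0)
        for doc in set(nb) | set(nv)
    }
    # Sort by descending score, then by document id for a stable order.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    print(preprocess_query("  नागरिकता   के हो?  "))
    print(fuse_scores({"a": 10.0, "b": 5.0, "c": 0.0},
                      {"a": 0.2, "b": 0.9, "c": 0.1}))
```

Stripping the question suffix leaves only the content words for BM25 to match, and the equal weights let a strong vector match outrank a document that wins on keywords alone.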
Benchmark (5-document seed corpus via download_corpus())
Recall@1: 0.714 | Recall@3: 1.000 | Recall@5: 1.000
Keyword hit: 1.000 | Doc hit: 1.000 | Nepali recall@3: 1.000
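For readers unfamiliar with the metric, the sketch below shows one common way to compute Recall@k, assuming each query has exactly one expected document. The function name `recall_at_k` and the toy data are hypothetical. They are not the project's evaluation harness or its benchmark queries, and the number of queries behind the figures above is not stated in these notes.

```python
def recall_at_k(ranked_ids: list[list[str]], relevant_ids: list[str], k: int) -> float:
    """Fraction of queries whose expected document appears in the top k results."""
    if not relevant_ids:
        raise ValueError("need at least one query")
    hits = sum(
        1 for ranked, relevant in zip(ranked_ids, relevant_ids)
        if relevant in ranked[:k]
    )
    return hits / len(relevant_ids)


# Made-up example: 7 queries over 5 documents. The expected document is ranked
# first for five queries and within the top three for the other two.
expected = ["d1", "d2", "d3", "d4", "d5", "d1", "d2"]
results = [
    ["d1", "d2", "d3"],
    ["d2", "d1", "d4"],
    ["d3", "d5", "d1"],
    ["d4", "d2", "d3"],
    ["d5", "d4", "d1"],
    ["d3", "d1", "d2"],  # expected d1 is at rank 2
    ["d4", "d5", "d2"],  # expected d2 is at rank 3
]

if __name__ == "__main__":
    for k in (1, 3, 5):
        print(f"Recall@{k}: {recall_at_k(results, expected, k):.3f}")
```

A Recall@1 below 1.000 alongside a Recall@3 of 1.000 means the right document was always retrieved but not always ranked first.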
Install
pip install nepal-gov-agent==0.3.0
Try it in Colab
Built on
ragnav, ragfallback, agentensemble — all MIT licensed