Morphology API (LLM-backed)

FastAPI service that provides morphology endpoints (conjugations, lemmas, plurals, number spelling, etc.). The primary backend uses Simon Willison's llm library, so you can swap between Gemini, Ollama, OpenAI, or any other supported provider. A lightweight deterministic rule engine ships with the service for offline or forced-deterministic flows. We plan to extend HFST coverage in the future so that more requests are handled by finite-state transducers and fewer by the LLM.

Features

/v1/conjugations/{lang}/{form} – full verb paradigms with metadata
/v1/lemmas/{lang}/{form} – lemma candidates + POS
/v1/plurals/{lang}/{form}, /v1/singulars/{lang}/{form} – noun/adj inflection
/v1/numbers/{lang}/{digits} – number → words
/v1/definitions, /v1/terms – dictionary-style lookups (LLM-driven today)
/v1/help/* – catalogs mirroring ULAPI (partsofspeech, tenses, etc.)
Hybrid orchestration: LLM first, deterministic rule engine fallback on low confidence
English deterministic backend implemented with HFST finite-state transducers (falls back gracefully if HFST is unavailable)
force_deterministic=true shortcut for rule-only responses (returns HTTP 422 if unavailable)
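
The hybrid flow described in the last three items can be pictured roughly as follows. This is only a sketch: Result, DeterministicUnavailable, resolve, and the two backend callables are hypothetical names invented for illustration, and the comparison used at the threshold boundary is an assumption rather than the service's confirmed behaviour.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Result:
    """Hypothetical response shape: forms plus provenance and confidence."""
    forms: list[str]
    source: str  # "llm" or "rule"
    confidence: float


class DeterministicUnavailable(Exception):
    """Raised when rule-only output is requested but no rule backend can answer."""


def resolve(
    form: str,
    llm_backend: Callable[[str], Result],
    rule_backend: Callable[[str], Optional[Result]],
    threshold: float = 0.8,
    force_deterministic: bool = False,
) -> Result:
    """LLM first, rule engine on low confidence; rule-only when forced."""
    if force_deterministic:
        rule_result = rule_backend(form)
        if rule_result is None:
            # An API layer would translate this into an HTTP 422 response.
            raise DeterministicUnavailable(form)
        return rule_result

    llm_result = llm_backend(form)
    if llm_result.confidence >= threshold:  # boundary handling is an assumption
        return llm_result

    # Low confidence: prefer the rule engine when it has an answer,
    # otherwise keep the LLM result rather than returning nothing.
    rule_result = rule_backend(form)
    return rule_result if rule_result is not None else llm_result
```

Passing the backends in as callables keeps the decision logic testable without a live model or an HFST installation.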

Getting Started

pipx install uv  # one-time, if uv is not already available
uv venv
source .venv/bin/activate
uv pip install -e '.[dev]'

Configure your preferred LLM provider. For Gemini:

export GEMINI_API_KEY="AIza..."
uv run llm install llm-gemini  # installs the Gemini plugin once per environment

Prefer not to keep secrets in your shell history? Drop the key into a file and point the service at it:

mkdir -p .secrets
echo "AIza..." > .secrets/gemini.key
echo "MORPHOLOGY_GEMINI_API_KEY_FILE=.secrets/gemini.key" >> .env

The settings loader will read the key on startup and expose it to the llm plugin automatically.
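
For readers curious what such a lookup might involve, here is a minimal sketch. The function name load_gemini_key is hypothetical, and the precedence (a directly exported GEMINI_API_KEY wins over the key file) is an assumption; the service's actual loader may behave differently.

```python
from pathlib import Path
from typing import Mapping, Optional


def load_gemini_key(env: Mapping[str, str]) -> Optional[str]:
    """Return the Gemini key from the environment or from a key file."""
    direct = env.get("GEMINI_API_KEY")
    if direct:
        return direct  # assumed precedence: an explicit variable wins

    key_file = env.get("MORPHOLOGY_GEMINI_API_KEY_FILE")
    if key_file:
        path = Path(key_file)
        if path.is_file():
            # strip() removes the trailing newline that `echo` writes.
            return path.read_text(encoding="utf-8").strip() or None
    return None
```

Calling load_gemini_key(os.environ) at startup would cover both configuration styles shown above.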

HFST finite-state support

The deterministic English backend now prefers HFST transducers. The Python bindings are pulled in automatically through hfst when you install the project dependencies. On platforms without prebuilt HFST wheels, the service will fall back to the legacy heuristic rules while still exposing the same API surface. To force HFST usage, ensure the package installs cleanly in your environment:

uv pip install hfst

If the HFST bindings are present but unstable in your environment, disable them and fall back to the legacy heuristic rules by setting:

export MORPHOLOGY_DISABLE_HFST=1
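
The selection between HFST and the heuristic rules could look something like the sketch below. pick_english_backend is a hypothetical helper, and the set of values treated as "disabled" is an assumption (the README only documents 1).

```python
import importlib.util
import os
from typing import Mapping


def pick_english_backend(env: Mapping[str, str] = os.environ) -> str:
    """Return "hfst" when the bindings are usable, else "heuristic"."""
    flag = env.get("MORPHOLOGY_DISABLE_HFST", "").strip().lower()
    if flag in {"1", "true", "yes"}:  # accepted spellings are an assumption
        return "heuristic"

    # find_spec reports whether the package is importable without
    # loading the native extension itself.
    if importlib.util.find_spec("hfst") is None:
        return "heuristic"
    return "hfst"
```

Checking the flag before probing for the package means the opt-out still works when the bindings are installed but crash on import.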

A bundled German HFST transducer (from https://master.dl.sourceforge.net/project/hfst/resources/morphological-transducers/hfst-german.tar.gz?viasf=1) is included under src/morphology_service/data/hfst. Rule-based German conjugation/lemmatization/pluralization will use this transducer when source_preference=rule or force_deterministic=true.

To validate that your HFST installation is healthy, run the diagnostics script (uses the project venv):

uv run python scripts/hfst_diagnose.py

At runtime you can confirm the backend selection via /v1/help/languages, which now annotates each language with its deterministic provider.

Run the API:

uv run uvicorn morphology_service.main:app --reload

Tip: using uv run … ensures you pick up the project virtualenv (where HFST is installed) instead of a system Python without bindings.

Example request:

curl "http://localhost:8000/v1/conjugations/en/slept?expand_compound=true&variety=en-GB"
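
The same request can be issued from Python using only the standard library. The helper names conjugation_url and fetch_conjugations are illustrative, and the sketch makes no assumption about the response beyond it being JSON.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:8000/v1"


def conjugation_url(lang: str, form: str, **params: str) -> str:
    """Build a /v1/conjugations URL, percent-encoding the path segments."""
    url = f"{BASE_URL}/conjugations/{quote(lang, safe='')}/{quote(form, safe='')}"
    return f"{url}?{urlencode(params)}" if params else url


def fetch_conjugations(lang: str, form: str, **params: str):
    """GET the endpoint and decode the JSON body (requires a running server)."""
    with urlopen(conjugation_url(lang, form, **params), timeout=10) as resp:
        return json.load(resp)
```

With the server running, fetch_conjugations("en", "slept", expand_compound="true", variety="en-GB") sends the same query as the curl command above.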

Configuration

Settings are loaded via environment variables using the MORPHOLOGY_ prefix. Key options:

| Env var | Default | Description |
| --- | --- | --- |
| `MORPHOLOGY_LLM_MODEL_NAME` | `gemini-2.0-flash` | Model identifier passed to `llm.get_model` |
| `MORPHOLOGY_LLM_TEMPERATURE` | `0.0` | Temperature for generations |
| `MORPHOLOGY_HYBRID_CONFIDENCE_THRESHOLD` | `0.8` | Confidence cut-off before falling back to deterministic rules |
| `MORPHOLOGY_MAX_LLM_RETRIES` | `2` | Retries for transient LLM failures |

Optional .env files are supported.
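
To illustrate how prefixed variables map onto typed settings, here is a standard-library stand-in. It is not the service's actual loader: the Settings dataclass and load_settings function are hypothetical, and only the variable names and defaults are taken from the options listed above.

```python
import os
from dataclasses import dataclass
from typing import Any, Callable, Mapping


@dataclass(frozen=True)
class Settings:
    llm_model_name: str = "gemini-2.0-flash"
    llm_temperature: float = 0.0
    hybrid_confidence_threshold: float = 0.8
    max_llm_retries: int = 2


def load_settings(
    env: Mapping[str, str] = os.environ, prefix: str = "MORPHOLOGY_"
) -> Settings:
    """Read MORPHOLOGY_* variables, falling back to the documented defaults."""

    def read(name: str, cast: Callable[[str], Any], default: Any) -> Any:
        raw = env.get(prefix + name, "")
        # Unset and empty variables both fall back to the default.
        return cast(raw) if raw != "" else default

    d = Settings()
    return Settings(
        llm_model_name=read("LLM_MODEL_NAME", str, d.llm_model_name),
        llm_temperature=read("LLM_TEMPERATURE", float, d.llm_temperature),
        hybrid_confidence_threshold=read(
            "HYBRID_CONFIDENCE_THRESHOLD", float, d.hybrid_confidence_threshold
        ),
        max_llm_retries=read("MAX_LLM_RETRIES", int, d.max_llm_retries),
    )
```

A malformed value such as MORPHOLOGY_MAX_LLM_RETRIES=two raises ValueError in this sketch, which surfaces configuration mistakes at startup instead of at request time.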

Tests

uv run pytest

tests/ covers the hybrid orchestration and the rule engine heuristics. Add your own gold lists for target languages as you plug in richer rule datasets.

To exercise the English/German demo prompts against a running API instance, start the server locally and run:

RUN_DEMO_SMOKE=1 uv run pytest tests/test_demo_examples.py

You can point at a remote instance by setting MORPHOLOGY_BASE_URL (defaults to http://localhost:8000/v1).

Extending

Swap models by installing the relevant llm-* plugin and updating MORPHOLOGY_LLM_MODEL_NAME
Implement richer deterministic engines via the RuleEngine class (e.g., integrate Pattern, Apertium)
Add caching or persistence by wrapping calls in your storage layer of choice
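
As one possible reading of the caching suggestion, an in-memory cache can be layered over any lookup function with functools.lru_cache. The cached helper and the (lang, form) signature are assumptions made for this sketch; the service does not prescribe a particular interface.

```python
from functools import lru_cache
from typing import Callable


def cached(lookup: Callable[[str, str], tuple], maxsize: int = 4096):
    """Wrap a (lang, form) lookup in an in-memory LRU cache."""

    @lru_cache(maxsize=maxsize)
    def wrapper(lang: str, form: str) -> tuple:
        # Results are tuples so callers cannot mutate a cached value.
        return lookup(lang, form)

    return wrapper
```

The wrapper exposes cache_info() and cache_clear() from lru_cache, which helps when tuning maxsize. A persistent store would follow the same wrapping pattern with a different backing layer.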

Roadmap Ideas

JSON Schema validation + structured retries for LLM responses
More granular provenance (per conjugated form)
Batch endpoints (/v1/batch/*) for higher throughput
Sandboxed offline bundle with Ollama or llama.cpp backends
