Control AI workloads before they reach a model.
KORA is an AI Workload Control Layer. It helps route deterministic, reusable, retrieval-needed, tool-needed, and provider-needed work to an appropriate path before a model is invoked unnecessarily.
Most AI systems treat every task as a model task. KORA starts one step earlier: it inspects the workload, chooses a route, and makes provider-needed work explicit.
- Inspect workloads before deployment.
- Route deterministic work without provider calls.
- Reuse repeated work through cache paths.
- Separate retrieval-needed and tool-needed work.
- Mark provider-needed tasks explicitly.
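To make the cache-reuse idea concrete, here is a minimal sketch of how repeated work could skip a provider call. It is not KORA's implementation or API: `ReuseCache`, `handle`, and `fake_provider` are hypothetical names, and the normalization rule (lowercase, collapsed whitespace) is an assumption chosen for illustration.

```python
import hashlib


class ReuseCache:
    """Hypothetical in-memory cache keyed by a hash of the normalized request."""

    def __init__(self):
        self._store = {}
        self.provider_calls = 0  # counts how often the provider was actually invoked

    @staticmethod
    def _key(text):
        # Assumed normalization: lowercase and collapse runs of whitespace.
        normalized = " ".join(text.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def handle(self, text, provider):
        key = self._key(text)
        if key in self._store:
            # Repeated work: reuse the stored result, no provider call.
            return self._store[key]
        self.provider_calls += 1
        result = provider(text)
        self._store[key] = result
        return result


def fake_provider(text):
    # Offline stand-in for a model call; makes no network request.
    return "answer to: " + " ".join(text.lower().split())


cache = ReuseCache()
first = cache.handle("Reset my password", fake_provider)
second = cache.handle("  reset my   PASSWORD ", fake_provider)
```

Both requests normalize to the same key, so the second one is served from the cache and `cache.provider_calls` stays at 1.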
To use the latest features, install from source:
git clone https://github.com/Krako-Labs/KORA.git
cd KORA
python3 -m pip install -e .

Run the first-value paths:
python3 -m kora doctor examples/kora_doctor/customer_support_workload.json
python3 -m kora proxy-demo examples/openai_compatible_proxy/requests.json
python3 examples/cache_reuse/run.py

`pip install kora` does not install this project.
The planned future PyPI package name is getkora, with CLI command kora and Python import package kora.
getkora is not published yet. Use the source install path above for the latest KORA features.
| Example | Shows | Run | Details |
|---|---|---|---|
| KORA Doctor | Workload inspection | `python3 -m kora doctor examples/kora_doctor/customer_support_workload.json` | README |
| Deterministic Classification | Rule-routed classification | `python3 examples/deterministic_classification/run.py` | README |
| OpenAI-Compatible Proxy | OpenAI-style request routing | `python3 -m kora proxy-demo examples/openai_compatible_proxy/requests.json` | README |
| RAG Routing | Retrieval-aware control | `python3 examples/rag_routing/run.py` | README |
| Agent Workflow Optimization | Multi-step workflow routing | `python3 examples/agent_workflow_optimization/run.py` | README |
| Cache Reuse | Repeated-work reuse | `python3 examples/cache_reuse/run.py` | README |
See the full example catalog.
A workload enters KORA before it reaches a model.
KORA evaluates each unit of work and routes it to one of several paths:
- deterministic handling
- cache reuse
- retrieval-needed handling
- tool-needed handling
- provider-needed fallback
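The routing step above can be sketched as a small rule-based function. This is an illustrative sketch only, not KORA's code: the `Route` enum, the `WorkUnit` fields, and the precedence order (deterministic first, provider last) are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    DETERMINISTIC = "deterministic"
    CACHE = "cache_reuse"
    RETRIEVAL = "retrieval_needed"
    TOOL = "tool_needed"
    PROVIDER = "provider_needed"


@dataclass(frozen=True)
class WorkUnit:
    """Hypothetical unit of work with pre-computed inspection flags."""
    text: str
    has_rule: bool = False         # a deterministic rule can answer it
    seen_before: bool = False      # an identical unit was already handled
    needs_retrieval: bool = False  # requires document lookup
    needs_tool: bool = False       # requires a tool call


def choose_route(unit):
    # Assumed precedence: cheapest path first, provider only as fallback.
    if unit.has_rule:
        return Route.DETERMINISTIC
    if unit.seen_before:
        return Route.CACHE
    if unit.needs_retrieval:
        return Route.RETRIEVAL
    if unit.needs_tool:
        return Route.TOOL
    return Route.PROVIDER


units = [
    WorkUnit("classify: refund request", has_rule=True),
    WorkUnit("what is the return policy?", needs_retrieval=True),
    WorkUnit("write a friendly apology email"),
]
routes = [choose_route(u) for u in units]
```

Only the last unit falls through to `Route.PROVIDER`, which is what makes provider-needed work explicit rather than the default.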
The included examples are offline and make zero provider calls.
KORA currently demonstrates offline sample workloads and simulated provider/model invocation avoidance.
The repository does not claim:
- production cost reduction proof
- real API-cost reduction proof
- production readiness
- benchmark superiority
- full OpenAI API compatibility
- production RAG, agent, or cache correctness
- model replacement
See the claim registry and public language guide.
- Documentation index
- Vision: AI Workload Control Layer
- Example catalog
- Packaging: getkora strategy
- Reports and evidence
- Contributing
MIT License. See LICENSE.