Open Kinetics Predictor is a production web interface for predicting enzyme kinetic parameters (kcat and KM) from protein sequence and substrate SMILES. It consolidates several state‑of‑the‑art machine learning / deep learning models behind a unified, asynchronous job API so you can submit sequences and retrieve structured predictions.
Live service: https://predictor.openkinetics.org/
| Engine | Input needed | Output | Citation |
|---|---|---|---|
| KinForm-H | Protein sequence + substrate SMILES | kcat or Km | Alwer & Fleming, npj Syst Biol Appl 2026 (GitHub) |
| KinForm-L | Protein sequence + substrate SMILES | kcat | Alwer & Fleming, npj Syst Biol Appl 2026 (GitHub) |
| UniKP | Protein sequence + substrate SMILES | kcat or Km | Yu et al., Nat Commun 2023 (GitHub) |
| DLKcat | Protein sequence + substrate SMILES | kcat | Li et al., Nat Catal 2022 (GitHub) |
| TurNup | Protein sequence + substrates list + products list | kcat | Kroll et al., Nat Commun 2023 (GitHub) |
| EITLEM | Protein sequence + substrate SMILES | kcat or Km | Shen et al., Biotechnol Adv 2024 (GitHub) |
| CataPro | Protein sequence + substrate SMILES | kcat, Km, or kcat/Km | Wang et al., Nat Commun 2025 (GitHub) |
| CatPred | Protein sequence + substrate SMILES | kcat or Km | Boorla et al., Nat Commun 2025 (GitHub) |
Each model is loaded with its published weights/code from models/ and invoked through integration wrappers in api/prediction_engines/, so new engines can be added with minimal wiring.
See docs/CONTRIBUTING.md for a step-by-step guide.
- Batch submission of sequences and substrates.
- Long‑running inference handled asynchronously (Celery + Redis) with progress tracking.
- Sequence similarity distribution of input data vs mehtods' training data (Using mmseq2).
- Caching sequence embeddings.
- React 18 + Vite (fast dev + ESM build)
- Bootstrap / React‑Bootstrap for layout & components
- Axios for API calls; Chart.js for result visualisation
- Django 5.1 (REST-style endpoints under
api/) - Celery workers for queued prediction tasks (
api/tasks.py) - Redis as Celery broker
- SQLite
- PyTorch, scikit-learn, RDKit, pandas for model computation & cheminformatics
- User submits a job (sequence + substrate(s) [+ products/mutant context if required]) via the frontend.
- Backend validates input (
api/services/validation_service.py). - A Celery task is enqueued; Redis broker stores the task message.
- Worker loads the selected model wrapper (e.g.
prediction_engines/kinform.py) and executes inference. - Results & intermediate status are persisted; cached for repeated identical queries.
- Frontend polls job status endpoint to update progress and results.
OpenKineticsPredictor provides a REST API for programmatic access. Submit prediction jobs, poll their status, and download results — no web browser required.
Base URL: https://predictor.openkinetics.org/api/v1
Full interactive documentation is also available on the live site at
/api-docs.
| Method | Endpoint | Auth | Description |
|---|---|---|---|
GET |
/health/ |
No | Service health check |
GET |
/methods/ |
No | List available methods and required columns |
GET |
/quota/ |
Yes | Check remaining daily quota |
POST |
/validate/ |
Yes | Validate input data without submitting a job |
POST |
/submit/ |
Yes | Submit a prediction job |
GET |
/status/<jobId>/ |
Yes | Poll job status and progress |
GET |
/result/<jobId>/ |
Yes | Download results (CSV or ?format=json) |
import requests
import time
API_KEY = "ak_your_key_here"
BASE = "https://predictor.openkinetics.org/api/v1"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}
# 1. Submit a job
with open("input.csv", "rb") as f:
resp = requests.post(
f"{BASE}/submit/",
headers=HEADERS,
files={"file": f},
data={
"predictionType": "kcat",
"kcatMethod": "DLKcat",
"handleLongSequences": "truncate",
"useExperimental": "true",
},
)
resp.raise_for_status()
job = resp.json()
print(f"Job ID: {job['jobId']} | Quota remaining: {job['quota']['remaining']:,}")
# 2. Poll until complete
while True:
status = requests.get(f"{BASE}/status/{job['jobId']}/", headers=HEADERS).json()
print(f" {status['status']} ({status['elapsedSeconds']}s)")
if status["status"] == "Completed":
break
if status["status"] == "Failed":
raise RuntimeError(f"Job failed: {status.get('error')}")
time.sleep(5)
# 3. Download results
result = requests.get(f"{BASE}/result/{job['jobId']}/", headers=HEADERS)
with open("output.csv", "wb") as f:
f.write(result.content)
print("Saved to output.csv")API_KEY="ak_your_key_here"
BASE="https://predictor.openkinetics.org/api/v1"
# 1. Submit
JOB=$(curl -s -X POST "$BASE/submit/" \
-H "Authorization: Bearer $API_KEY" \
-F "file=@input.csv" \
-F "predictionType=kcat" \
-F "kcatMethod=DLKcat" \
-F "handleLongSequences=truncate")
JOB_ID=$(echo "$JOB" | python3 -c "import sys,json; print(json.load(sys.stdin)['jobId'])")
echo "Submitted: $JOB_ID"
# 2. Poll
while true; do
STATE=$(curl -s "$BASE/status/$JOB_ID/" \
-H "Authorization: Bearer $API_KEY" \
| python3 -c "import sys,json; print(json.load(sys.stdin)['status'])")
echo " $STATE"
[ "$STATE" = "Completed" ] && break
[ "$STATE" = "Failed" ] && { echo "Job failed"; exit 1; }
sleep 5
done
# 3. Download
curl -s "$BASE/result/$JOB_ID/" \
-H "Authorization: Bearer $API_KEY" \
-o output.csvFor small datasets (≤ 10,000 rows) you can send data directly as JSON:
requests.post(
f"{BASE}/submit/",
headers={**HEADERS, "Content-Type": "application/json"},
json={
"predictionType": "kcat",
"kcatMethod": "DLKcat",
"handleLongSequences": "truncate",
"useExperimental": True,
"data": [
{"Protein Sequence": "MKTLLIFAG...", "Substrate": "CC(=O)O"},
{"Protein Sequence": "MGSSHHHHH...", "Substrate": "C1CCCCC1"},
],
},
)Use /validate/ to check substrate SMILES/InChI strings, protein sequences,
and per-model length limits without consuming any quota or running predictions.
This is equivalent to the validation step available in the web interface.
import requests
API_KEY = "ak_your_key_here"
BASE = "https://predictor.openkinetics.org/api/v1"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}
# Basic validation (fast)
with open("input.csv", "rb") as f:
resp = requests.post(
f"{BASE}/validate/",
headers=HEADERS,
files={"file": f},
data={"runSimilarity": "false"},
)
result = resp.json()
print(f"Rows: {result['rowCount']}")
print(f"Invalid substrates: {len(result['invalidSubstrates'])}")
print(f"Invalid proteins: {len(result['invalidProteins'])}")
print(f"Length violations: {result['lengthViolations']}")Set runSimilarity=true to also run MMseqs2 sequence similarity analysis
against each method's training database. The request blocks synchronously
until the analysis is complete (can take several minutes for large inputs):
with open("input.csv", "rb") as f:
resp = requests.post(
f"{BASE}/validate/",
headers=HEADERS,
files={"file": f},
data={"runSimilarity": "true"},
timeout=600,
)
similarity = resp.json()["similarity"]
for method, data in similarity.items():
print(f"{method}: avg max identity = {data['average_max_similarity']:.1f}%")The similarity field in the response is a dict keyed by method name. Each entry
contains histogram_max, histogram_mean (10-bin arrays, 0–100% identity),
average_max_similarity, average_mean_similarity, count_max, and count_mean.
| Method | Predicts | Required columns | Max sequence length |
|---|---|---|---|
| DLKcat | kcat | Protein Sequence, Substrate |
No limit |
| TurNup | kcat | Protein Sequence, Substrates, Products |
1,024 residues |
| EITLEM | kcat or Km | Protein Sequence, Substrate |
1,024 residues |
| UniKP | kcat or Km | Protein Sequence, Substrate |
1,000 residues |
| CataPro | kcat, Km, or kcat/Km | Protein Sequence, Substrate |
1,000 residues |
| KinForm-H | kcat or Km | Protein Sequence, Substrate |
1,500 residues |
| KinForm-L | kcat only | Protein Sequence, Substrate |
1,500 residues |
| CatPred | kcat or Km | Protein Sequence, Substrate |
2,048 residues |
Substrates must be SMILES or InChI strings. For multi-substrate/product models, separate multiple
substrates/products with semicolons: CC(=O)O;C1CCCCC1.
- 20,000 predictions/day per API key (default; custom limits available).
- Counter resets at midnight UTC.
- Response headers:
X-RateLimit-Limit,X-RateLimit-Remaining,X-RateLimit-Reset. - HTTP
429is returned when the quota is exceeded.
All errors return JSON with a single error key:
{ "error": "A human-readable description of what went wrong." }| Status | Meaning |
|---|---|
| 400 | Invalid parameters, missing CSV columns, or bad data |
| 401 | Missing or invalid API key |
| 403 | Account suspended |
| 404 | Job not found |
| 409 | Results not ready yet |
| 429 | Quota exceeded |
| 500 | Internal server error |
Please cite the original publications when using predictions from a specific engine. Cite all underlying sources plus this platform.
For questions or collaboration: open an issue or reach out to the authors of the respective model.
This work was supported by the European Union’s Horizon Europe Framework Programme (#101080997), the Swiss State Secretariat for Education, Research and Innovation (#23.00232), and United Kingdom Research and Innovation (#10083717 and #10080153).