Open Kinetics Predictor

Open Kinetics Predictor is a production web interface for predicting enzyme kinetic parameters (kcat and Km) from a protein sequence and substrate SMILES. It consolidates several state‑of‑the‑art machine learning and deep learning models behind a unified, asynchronous job API, so you can submit sequences and retrieve structured predictions.

Live service: https://predictor.openkinetics.org/

Prediction Engines

Engine	Input needed	Output	Citation
KinForm-H	Protein sequence + substrate SMILES	kcat or Km	Alwer & Fleming, npj Syst Biol Appl 2026 (GitHub)
KinForm-L	Protein sequence + substrate SMILES	kcat	Alwer & Fleming, npj Syst Biol Appl 2026 (GitHub)
UniKP	Protein sequence + substrate SMILES	kcat or Km	Yu et al., Nat Commun 2023 (GitHub)
DLKcat	Protein sequence + substrate SMILES	kcat	Li et al., Nat Catal 2022 (GitHub)
TurNup	Protein sequence + substrates list + products list	kcat	Kroll et al., Nat Commun 2023 (GitHub)
EITLEM	Protein sequence + substrate SMILES	kcat or Km	Shen et al., Biotechnol Adv 2024 (GitHub)
CataPro	Protein sequence + substrate SMILES	kcat, Km, or kcat/Km	Wang et al., Nat Commun 2025 (GitHub)
CatPred	Protein sequence + substrate SMILES	kcat or Km	Boorla et al., Nat Commun 2025 (GitHub)

Each model is loaded with its published weights/code from models/ and invoked through integration wrappers in api/prediction_engines/, so new engines can be added with minimal wiring.

Adding a New Prediction Method

See docs/CONTRIBUTING.md for a step-by-step guide.

Features

Batch submission of sequences and substrates.
Long‑running inference handled asynchronously (Celery + Redis) with progress tracking.
Sequence similarity distribution of input data vs. each method's training data (using MMseqs2).
Caching sequence embeddings.

Stack

Frontend

React 18 + Vite (fast dev + ESM build)
Bootstrap / React‑Bootstrap for layout & components
Axios for API calls; Chart.js for result visualisation

Backend

Django 5.1 (REST-style endpoints under api/)
Celery workers for queued prediction tasks (api/tasks.py)
Redis as Celery broker
SQLite
PyTorch, scikit-learn, RDKit, pandas for model computation & cheminformatics

High-Level Flow

User submits a job (sequence + substrate(s) [+ products/mutant context if required]) via the frontend.
Backend validates input (api/services/validation_service.py).
A Celery task is enqueued; Redis broker stores the task message.
Worker loads the selected model wrapper (e.g. prediction_engines/kinform.py) and executes inference.
Results and intermediate status are persisted and cached for repeated identical queries.
Frontend polls job status endpoint to update progress and results.

API Access

Open Kinetics Predictor provides a REST API for programmatic access: submit prediction jobs, poll their status, and download results without a web browser.

Base URL: https://predictor.openkinetics.org/api/v1

Full interactive documentation is also available on the live site at /api-docs.

Endpoint Overview

Method	Endpoint	Auth	Description
`GET`	`/health/`	No	Service health check
`GET`	`/methods/`	No	List available methods and required columns
`GET`	`/quota/`	Yes	Check remaining daily quota
`POST`	`/validate/`	Yes	Validate input data without submitting a job
`POST`	`/submit/`	Yes	Submit a prediction job
`GET`	`/status/<jobId>/`	Yes	Poll job status and progress
`GET`	`/result/<jobId>/`	Yes	Download results (CSV or `?format=json`)

Quick Start — Python

import requests
import time

API_KEY = "ak_your_key_here"
BASE    = "https://predictor.openkinetics.org/api/v1"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}

# 1. Submit a job
with open("input.csv", "rb") as f:
    resp = requests.post(
        f"{BASE}/submit/",
        headers=HEADERS,
        files={"file": f},
        data={
            "predictionType":      "kcat",
            "kcatMethod":          "DLKcat",
            "handleLongSequences": "truncate",
            "useExperimental":     "true",
        },
    )
resp.raise_for_status()
job = resp.json()
print(f"Job ID: {job['jobId']}  |  Quota remaining: {job['quota']['remaining']:,}")

# 2. Poll until complete
while True:
    status = requests.get(f"{BASE}/status/{job['jobId']}/", headers=HEADERS).json()
    print(f"  {status['status']} ({status['elapsedSeconds']}s)")
    if status["status"] == "Completed":
        break
    if status["status"] == "Failed":
        raise RuntimeError(f"Job failed: {status.get('error')}")
    time.sleep(5)

# 3. Download results
result = requests.get(f"{BASE}/result/{job['jobId']}/", headers=HEADERS)
with open("output.csv", "wb") as f:
    f.write(result.content)
print("Saved to output.csv")
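
The polling loop above runs until the job reaches a terminal state. For unattended scripts it can help to bound the wait. The following is a minimal sketch, not part of the service's client code: `wait_for_job` and its parameters are hypothetical names, and it relies only on the `status` and `error` fields used in the Quick Start.

```python
import time
from typing import Callable, Dict


def wait_for_job(
    fetch_status: Callable[[], Dict],
    interval: float = 5.0,
    timeout: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict:
    """Call fetch_status() until the job completes, fails, or the deadline passes.

    fetch_status is any callable returning the decoded JSON of /status/<jobId>/.
    sleep and clock are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        state = status.get("status")
        if state == "Completed":
            return status
        if state == "Failed":
            raise RuntimeError(f"Job failed: {status.get('error')}")
        if clock() >= deadline:
            raise TimeoutError(f"Job still {state!r} after {timeout} s")
        sleep(interval)


# Example wiring with the names from the Quick Start (requires `requests`):
#   final = wait_for_job(
#       lambda: requests.get(f"{BASE}/status/{job['jobId']}/",
#                            headers=HEADERS, timeout=30).json()
#   )
```

Passing the fetch step as a callable keeps the loop independent of the HTTP library and makes the timeout behaviour easy to check in isolation.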

Quick Start — curl

API_KEY="ak_your_key_here"
BASE="https://predictor.openkinetics.org/api/v1"

# 1. Submit
JOB=$(curl -s -X POST "$BASE/submit/" \
  -H "Authorization: Bearer $API_KEY" \
  -F "file=@input.csv" \
  -F "predictionType=kcat" \
  -F "kcatMethod=DLKcat" \
  -F "handleLongSequences=truncate")

JOB_ID=$(echo "$JOB" | python3 -c "import sys,json; print(json.load(sys.stdin)['jobId'])")
echo "Submitted: $JOB_ID"

# 2. Poll
while true; do
  STATE=$(curl -s "$BASE/status/$JOB_ID/" \
    -H "Authorization: Bearer $API_KEY" \
    | python3 -c "import sys,json; print(json.load(sys.stdin)['status'])")
  echo "  $STATE"
  [ "$STATE" = "Completed" ] && break
  [ "$STATE" = "Failed" ]    && { echo "Job failed"; exit 1; }
  sleep 5
done

# 3. Download
curl -s "$BASE/result/$JOB_ID/" \
  -H "Authorization: Bearer $API_KEY" \
  -o output.csv

JSON Body Submission (no CSV file needed)

For small datasets (≤ 10,000 rows) you can send data directly as JSON:

# Reuses BASE and HEADERS from the Python Quick Start.
resp = requests.post(
    f"{BASE}/submit/",
    headers={**HEADERS, "Content-Type": "application/json"},
    json={
        "predictionType":      "kcat",
        "kcatMethod":          "DLKcat",
        "handleLongSequences": "truncate",
        "useExperimental":     True,
        "data": [
            {"Protein Sequence": "MKTLLIFAG...", "Substrate": "CC(=O)O"},
            {"Protein Sequence": "MGSSHHHHH...", "Substrate": "C1CCCCC1"},
        ],
    },
)
resp.raise_for_status()
job = resp.json()
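
Because JSON submission is limited to 10,000 rows, it can be useful to check the row count before sending the request. The sketch below builds the request body locally. `build_json_payload` is a hypothetical helper, and it covers kcat jobs only, since `kcatMethod` is the only method field shown in this README.

```python
from typing import Dict, List

MAX_JSON_ROWS = 10_000  # row limit stated above for JSON body submission


def build_json_payload(
    rows: List[Dict[str, str]],
    kcat_method: str = "DLKcat",
    handle_long_sequences: str = "truncate",
    use_experimental: bool = True,
) -> Dict:
    """Return a body for POST /submit/ (kcat jobs), checking the row limit first."""
    if not rows:
        raise ValueError("At least one data row is required")
    if len(rows) > MAX_JSON_ROWS:
        raise ValueError(
            f"{len(rows)} rows exceeds the {MAX_JSON_ROWS}-row JSON limit; "
            "upload a CSV file instead"
        )
    return {
        "predictionType": "kcat",
        "kcatMethod": kcat_method,
        "handleLongSequences": handle_long_sequences,
        "useExperimental": use_experimental,
        "data": rows,
    }
```

The returned dict can be passed as the `json=` argument of `requests.post`.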

Validating Input Before Submission

Use /validate/ to check substrate SMILES/InChI strings, protein sequences, and per-model length limits without consuming any quota or running predictions. This is equivalent to the validation step available in the web interface.

import requests

API_KEY = "ak_your_key_here"
BASE    = "https://predictor.openkinetics.org/api/v1"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}

# Basic validation (fast)
with open("input.csv", "rb") as f:
    resp = requests.post(
        f"{BASE}/validate/",
        headers=HEADERS,
        files={"file": f},
        data={"runSimilarity": "false"},
    )

result = resp.json()
print(f"Rows: {result['rowCount']}")
print(f"Invalid substrates: {len(result['invalidSubstrates'])}")
print(f"Invalid proteins:   {len(result['invalidProteins'])}")
print(f"Length violations:  {result['lengthViolations']}")

Set runSimilarity=true to also run MMseqs2 sequence similarity analysis against each method's training database. The request blocks synchronously until the analysis is complete (can take several minutes for large inputs):

with open("input.csv", "rb") as f:
    resp = requests.post(
        f"{BASE}/validate/",
        headers=HEADERS,
        files={"file": f},
        data={"runSimilarity": "true"},
        timeout=600,
    )

similarity = resp.json()["similarity"]
for method, data in similarity.items():
    print(f"{method}: avg max identity = {data['average_max_similarity']:.1f}%")

The similarity field in the response is a dict keyed by method name. Each entry contains histogram_max, histogram_mean (10-bin arrays, 0–100% identity), average_max_similarity, average_mean_similarity, count_max, and count_mean.
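
To make the histogram arrays easier to read, the bins can be labelled with identity ranges. This sketch assumes the 10 bins are equal-width (10% each) and ordered from low to high identity, which the README does not state explicitly. `label_bins` and `summarize_similarity` are hypothetical helper names.

```python
from typing import Dict, List, Sequence


def label_bins(histogram: Sequence[float]) -> Dict[str, float]:
    """Map a 10-bin histogram to labels such as '0-10%' (assumed equal-width bins)."""
    if len(histogram) != 10:
        raise ValueError(f"Expected 10 bins, got {len(histogram)}")
    return {f"{i * 10}-{(i + 1) * 10}%": value for i, value in enumerate(histogram)}


def summarize_similarity(similarity: Dict[str, dict]) -> List[str]:
    """Return one summary line per method from the 'similarity' response field."""
    lines = []
    for method, data in similarity.items():
        bins = label_bins(data["histogram_max"])
        top_bin = max(bins, key=bins.get)  # first bin wins on ties
        lines.append(
            f"{method}: avg max identity {data['average_max_similarity']:.1f}%, "
            f"most populated bin {top_bin}"
        )
    return lines
```

A large share of sequences in the low-identity bins suggests the inputs are far from a method's training data, so predictions from that method may deserve extra caution.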

CSV Format

Method	Predicts	Required columns	Max sequence length
DLKcat	kcat	`Protein Sequence`, `Substrate`	No limit
TurNup	kcat	`Protein Sequence`, `Substrates`, `Products`	1,024 residues
EITLEM	kcat or Km	`Protein Sequence`, `Substrate`	1,024 residues
UniKP	kcat or Km	`Protein Sequence`, `Substrate`	1,000 residues
CataPro	kcat, Km, or kcat/Km	`Protein Sequence`, `Substrate`	1,000 residues
KinForm-H	kcat or Km	`Protein Sequence`, `Substrate`	1,500 residues
KinForm-L	kcat only	`Protein Sequence`, `Substrate`	1,500 residues
CatPred	kcat or Km	`Protein Sequence`, `Substrate`	2,048 residues

Substrates must be SMILES or InChI strings. For multi-substrate/product models, separate multiple substrates/products with semicolons: CC(=O)O;C1CCCCC1.
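
As an illustration of the format, the sketch below writes a TurNup-style input file and flags sequences that exceed a method's limit from the table above. The helper names are hypothetical, and treating the limit as inclusive (a sequence of exactly 1,024 residues passes) is an assumption. Using the `csv` module matters here because InChI strings can contain commas, which must be quoted.

```python
import csv
from typing import Iterable, List, Optional, Sequence, TextIO, Tuple

# Maximum sequence lengths copied from the table above (None = no limit).
MAX_LENGTH: dict = {
    "DLKcat": None, "TurNup": 1024, "EITLEM": 1024, "UniKP": 1000,
    "CataPro": 1000, "KinForm-H": 1500, "KinForm-L": 1500, "CatPred": 2048,
}


def too_long(sequences: Sequence[str], method: str) -> List[int]:
    """Return the indices of sequences longer than the method's limit."""
    limit: Optional[int] = MAX_LENGTH[method]
    if limit is None:
        return []
    return [i for i, seq in enumerate(sequences) if len(seq) > limit]


def write_turnup_csv(
    rows: Iterable[Tuple[str, Sequence[str], Sequence[str]]], fh: TextIO
) -> None:
    """Write (sequence, substrates, products) rows, joining lists with ';'."""
    writer = csv.writer(fh)
    writer.writerow(["Protein Sequence", "Substrates", "Products"])
    for sequence, substrates, products in rows:
        writer.writerow([sequence, ";".join(substrates), ";".join(products)])
```

For example, `write_turnup_csv([("MKT...", ["CC(=O)O", "C1CCCCC1"], ["O"])], open("input.csv", "w", newline=""))` produces a file with one data row whose `Substrates` cell is `CC(=O)O;C1CCCCC1`.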

Rate Limits

20,000 predictions/day per API key (default; custom limits available).
Counter resets at midnight UTC.
Response headers: X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset.
HTTP 429 is returned when the quota is exceeded.
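
A client that receives HTTP 429 can wait for the daily reset rather than retrying immediately. The sketch below derives the wait from the stated midnight-UTC reset instead of parsing X-RateLimit-Reset, whose format is not documented here. It assumes X-RateLimit-Remaining carries a plain integer. Both function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Mapping, Optional


def seconds_until_reset(now: Optional[datetime] = None) -> float:
    """Seconds from `now` (timezone-aware) until the next midnight UTC."""
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    next_midnight = (now + timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return (next_midnight - now).total_seconds()


def remaining_quota(headers: Mapping[str, str]) -> Optional[int]:
    """Read X-RateLimit-Remaining, or return None if the header is absent."""
    value = headers.get("X-RateLimit-Remaining")
    return int(value) if value is not None else None
```

With `requests`, `remaining_quota(resp.headers)` works directly because the response headers object is a case-insensitive mapping.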

Error Format

All errors return JSON with a single error key:

{ "error": "A human-readable description of what went wrong." }

Status	Meaning
400	Invalid parameters, missing CSV columns, or bad data
401	Missing or invalid API key
403	Account suspended
404	Job not found
409	Results not ready yet
429	Quota exceeded
500	Internal server error
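
Since every error body has the same shape, a client can turn any failed response into a single message. The following sketch uses a hypothetical helper, `describe_error`, and falls back to the meanings from the table above when the body is not the expected JSON (for example, an HTML page returned by a proxy).

```python
import json

# Fallback descriptions copied from the status table above.
STATUS_MEANING = {
    400: "Invalid parameters, missing CSV columns, or bad data",
    401: "Missing or invalid API key",
    403: "Account suspended",
    404: "Job not found",
    409: "Results not ready yet",
    429: "Quota exceeded",
    500: "Internal server error",
}


def describe_error(status_code: int, body: str) -> str:
    """Return 'HTTP <code>: <message>' using the body's 'error' key if present."""
    try:
        message = json.loads(body).get("error")
    except (ValueError, AttributeError):
        # Body was not JSON, or was JSON but not an object.
        message = None
    if not message:
        message = STATUS_MEANING.get(status_code, "Unknown error")
    return f"HTTP {status_code}: {message}"
```

With `requests`, this can be called as `describe_error(resp.status_code, resp.text)` whenever `resp.ok` is false.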

Attribution

When using predictions from a specific engine, please cite its original publication (see the Citation column under Prediction Engines) in addition to this platform.

Contact

For questions or collaboration: open an issue or reach out to the authors of the respective model.

Funding

This work was supported by the European Union’s Horizon Europe Framework Programme (#101080997), the Swiss State Secretariat for Education, Research and Innovation (#23.00232), and United Kingdom Research and Innovation (#10083717 and #10080153).
