
miniVite Cross-Platform Prediction Server

Overview

Your machine trains a neural net on HPC job runtime data and exposes it as a prediction server. Remote users submit job configs and get back a predicted runtime + scheduling verdict — without needing TensorFlow or the training data.

Remote user A ─┐
Remote user B ─┼──► POST /predict ──► YOUR SERVER ──► model.predict()
Remote user C ─┘                           │
                                           └──► AI scheduler ──► SLURM/PBS
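For users who want to hit the endpoint directly rather than through `ai_scheduler.py`, a request might look like the sketch below. The field names mirror the `JobConfig` shown later in this README; the exact request schema is defined by `prediction_server.py`, so treat this payload as an assumption, not the definitive contract.

```python
# Hypothetical raw /predict call. Field names follow JobConfig from the
# "Use as a Python library" section; the server's actual schema may differ.
import json

payload = {
    "machine": "summit",
    "app": "miniVite",
    "ranks": 64,
    "nodes": 4,
    "threads_per_rank": 4,
    "graph_scale_M": 20,
    "avg_edges_per_vertex": 16,
    "base_runtime_s": 142.0,
}

# With the server running, the request itself would be:
#   import requests
#   r = requests.post("http://YOUR_SERVER_IP:8000/predict", json=payload)
#   print(r.json()["verdict"])

print(json.dumps(payload, indent=2))
```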

Your machine (server) — one-time setup

pip install -r requirements.txt

# 1. Train the model and save all artifacts
python train_and_save.py

# 2. Start the prediction server
uvicorn prediction_server:app --host 0.0.0.0 --port 8000

Server is now live at http://YOUR_SERVER_IP:8000

Check it's running:

curl http://localhost:8000/health
# → {"status":"ok","model":"miniVite_pass2"}

See valid machines/apps:

curl http://localhost:8000/schema

Remote user — client setup

# Only needs requests — no TensorFlow required
pip install requests

# Single job
python ai_scheduler.py \
  --server http://YOUR_SERVER_IP:8000 \
  --machine summit \
  --app miniVite \
  --ranks 64 \
  --nodes 4 \
  --threads 4 \
  --scale 20 \
  --avg-degree 16 \
  --base-time 142

Output:

────────────────────────────────────────────────────────
  Job           : (no id)
  Machine / App : summit / miniVite
  Ranks / Nodes : 64 / 4
  Graph scale   : 20M vertices
────────────────────────────────────────────────────────
  Predicted Δ   : 1.32×  (1.122–1.518)
  Est. wall time: 187.4s
  Inference     : 12.3ms
  Verdict       : SCHEDULE_NOW
  Reason        : Predicted cost is low — safe to run immediately.
────────────────────────────────────────────────────────

Batch submission

python ai_scheduler.py \
  --server http://YOUR_SERVER_IP:8000 \
  --batch example_jobs.json
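The batch file is a list of job configs. Its exact schema is defined by `ai_scheduler.py` and `example_jobs.json` in the repo; the sketch below is a plausible shape, assuming the same field names as `JobConfig`.

```json
[
  {
    "job_id": "job-001",
    "machine": "summit",
    "app": "miniVite",
    "ranks": 64,
    "nodes": 4,
    "threads_per_rank": 4,
    "graph_scale_M": 20,
    "avg_edges_per_vertex": 16,
    "base_runtime_s": 142.0
  }
]
```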

Dry run (predict only, do not submit)

python ai_scheduler.py --server http://YOUR_SERVER_IP:8000 \
  --machine frontier --ranks 256 --scale 80 --dry-run

Use as a Python library

from ai_scheduler import AIScheduler, JobConfig

scheduler = AIScheduler("http://YOUR_SERVER_IP:8000")

job = JobConfig(
    machine="summit", app="miniVite",
    ranks=64, nodes=4, threads_per_rank=4,
    graph_scale_M=20, avg_edges_per_vertex=16,
    base_runtime_s=142.0,
    job_id="my-job-001",
)

result = scheduler.submit(job)
print(result["verdict"])         # "SCHEDULE_NOW"
print(result["est_wall_time_s"]) # 187.4
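When submitting many jobs through the library, you will likely want to act on the verdict programmatically. A small sketch, assuming `SCHEDULE_NOW` is the go-ahead verdict (other verdict values such as `DEFER` are an assumption about the server's verdict set):

```python
# Sketch: gate submissions on the predicted verdict. The result keys
# ("verdict", "est_wall_time_s") come from the README above; the non-run-now
# verdict names are assumed.
def dispatch(results):
    """Split prediction results into run-now and deferred buckets."""
    run_now, deferred = [], []
    for r in results:
        (run_now if r["verdict"] == "SCHEDULE_NOW" else deferred).append(r)
    return run_now, deferred

run_now, deferred = dispatch([
    {"verdict": "SCHEDULE_NOW", "est_wall_time_s": 187.4},
    {"verdict": "DEFER", "est_wall_time_s": 900.0},
])
print(len(run_now), len(deferred))  # → 1 1
```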

Files

| File | Where it runs | Purpose |
|------|---------------|---------|
| train_and_save.py | your machine | trains pass-1 + pass-2 model, saves all artifacts |
| prediction_server.py | your machine | FastAPI server; loads model, serves /predict |
| ai_scheduler.py | remote user | client; calls /predict, submits to SLURM |
| example_jobs.json | remote user | sample batch input |
| model_output/ | your machine | miniVite_pass2.keras + preprocessor.pkl + feature_names.json |

Connecting to a real job scheduler

In ai_scheduler.py, replace the _queue_job method body with your actual SLURM/PBS call:

def _queue_job(self, config: JobConfig, priority: str):
    import subprocess
    cmd = [
        "sbatch",
        f"--job-name={config.app}",
        f"--nodes={config.nodes}",
        f"--ntasks={config.ranks}",
        f"--qos={priority}",
        "run_job.sh",
    ]
    subprocess.run(cmd, check=True)
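If you want the scheduler to track the job it just submitted, you can capture the id from sbatch's stdout. By default sbatch prints `Submitted batch job <id>`; the helper below assumes that default output format (a sketch, not part of `ai_scheduler.py`):

```python
# Sketch: extract the SLURM job id from sbatch's default stdout line.
import re
from typing import Optional

def parse_sbatch_output(stdout: str) -> Optional[str]:
    """Return the job id from 'Submitted batch job <id>', or None."""
    m = re.search(r"Submitted batch job (\d+)", stdout)
    return m.group(1) if m else None

print(parse_sbatch_output("Submitted batch job 123456"))  # → 123456
```

In `_queue_job`, you would pass `capture_output=True, text=True` to `subprocess.run` and feed `result.stdout` to this helper.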

About

This is MCP server code for cross-platform application runtime prediction.
