axsy-dev/axsy-inference-api

Axsy Inference API

FastAPI service exposing YOLO detection and CNN classification on images. Detection defaults to local axsy-yolo.pt; classification defaults to local axsy-classifier.pt.

Run locally

  1. Install dependencies:
pip install -r requirements.txt  # or: pip install fastapi uvicorn pillow ultralytics google-cloud-storage
  2. Start the API:
# If using GCS, set credentials before starting the server
export GOOGLE_APPLICATION_CREDENTIALS=/absolute/path/to/service-account.json

uvicorn server:get_app --factory --host 0.0.0.0 --port 3000
  3. Test detection (defaults to axsy-yolo.pt):
curl -sS -X POST http://localhost:3000/infer \
  -F "image=@/path/to/image.jpg"
  4. Test classification (defaults to axsy-classifier.pt):
curl -sS -X POST http://localhost:3000/classify \
  -F "image=@/path/to/image.jpg"

Headers

  • detector (optional): absolute path or gs:// path to YOLO model. If omitted, defaults to ./axsy-yolo.pt.
  • gcs_bucket (optional): bucket name when detector is a blob path (no scheme).
  • customer_id, model_id (optional): when both are provided and detector is a blob path, the bucket is inferred as customer_id, and the blob as model_id + '/' + detector.

Classification headers

  • classifier (optional): absolute path or gs:// path to classifier weights. If omitted, defaults to ./axsy-classifier.pt.
  • gcs_bucket, customer_id, model_id (optional): behave the same as for detection when resolving remote paths.

Classification response

{
  "classifier": "...",
  "result": {
    "input_size": 32,
    "top_index": 12,
    "top_prob": 0.93,
    "probs": [0.01, 0.00, ...]
  }
}
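In the response above, top_index and top_prob are redundant with probs; a client can recompute them, which is useful as a sanity check. A minimal sketch (the helper name is illustrative):

```python
def top_prediction(result: dict) -> tuple[int, float]:
    """Recompute top_index/top_prob from the probs list of a /classify result."""
    probs = result["probs"]
    top_index = max(range(len(probs)), key=probs.__getitem__)
    return top_index, probs[top_index]
```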

Detection response

{
  "detector": "...",
  "result": {
    "image": {"width": 1928, "height": 2472},
    "speed_ms": {"preprocess": 2.7, "inference": 454.0, "postprocess": 8.4},
    "num_detections": 74,
    "class_counts": {"shelf": 11, "product": 63},
    "detections": [
      {
        "class_id": 2,
        "class_name": "shelf",
        "confidence": 0.9554,
        "box": {
          "xyxy": [x1, y1, x2, y2],
          "center_xywh": [cx, cy, w, h],
          "center_xywh_norm": [cxn, cyn, wn, hn]
        }
      }
    ]
  }
}

Operational notes

  • The API is multi-user safe: model downloads are serialized per file, models are cached, and inference runs under a per-model lock in a threadpool.
  • Keep smart-vision-trainiing-sa.json and large model weights out of git.

Deploy to Google Cloud Run

Prerequisites:

  • Install and initialize gcloud (https://cloud.google.com/sdk/docs/install).
  • Enable APIs:
    gcloud services enable artifactregistry.googleapis.com run.googleapis.com cloudbuild.googleapis.com

Build and deploy using Cloud Build + Artifact Registry:

# From repo root
./cloudrun-deploy.sh <PROJECT_ID> <REGION> axsy-inference
# Example:
# ./cloudrun-deploy.sh my-gcp-project europe-west1 axsy-inference

Notes:

  • The Docker image listens on PORT (default 8080) and starts uvicorn server:get_app --factory.
  • To pass runtime env vars (e.g. workers/log level) edit cloudrun-deploy.sh or run:
    gcloud run deploy axsy-inference \
      --set-env-vars "UVICORN_WORKERS=1,UVICORN_LOG_LEVEL=info" \
      --project <PROJECT_ID> --region <REGION>
