FastAPI service exposing YOLO detection and CNN classification on images. Detection defaults to local axsy-yolo.pt; classification defaults to local axsy-classifier.pt.
- Install deps

      pip install -r requirements.txt  # or: pip install fastapi uvicorn pillow ultralytics google-cloud-storage

- Start API

      # If using GCS, set credentials before starting the server
      export GOOGLE_APPLICATION_CREDENTIALS=/absolute/path/to/service-account.json
      uvicorn server:get_app --factory --host 0.0.0.0 --port 3000

- Test detection (defaults to `axsy-yolo.pt`):

      curl -sS -X POST http://localhost:3000/infer \
        -F "image=@/path/to/image.jpg"

- Test classification (defaults to `axsy-classifier.pt`):

      curl -sS -X POST http://localhost:3000/classify \
        -F "image=@/path/to/image.jpg"

Detection parameters (`/infer`):
- `detector` (optional): absolute path or `gs://` path to YOLO model. If omitted, defaults to `./axsy-yolo.pt`.
- `gcs_bucket` (optional): bucket name when `detector` is a blob path (no scheme).
- `customer_id`, `model_id` (optional): when both are provided and `detector` is a blob path, the bucket is inferred as `customer_id`, and the blob as `model_id + '/' + detector`.

Classification parameters (`/classify`):
- `classifier` (optional): absolute path or `gs://` path to classifier weights. If omitted, defaults to `./axsy-classifier.pt`.
- `gcs_bucket`, `customer_id`, `model_id` behave the same as detection for resolving remote paths.
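
The resolution rules above can be sketched in Python. This is an illustration only, not the service's actual code: `ModelLocation` and `resolve_model_location` are hypothetical names, and two behaviors are assumptions not stated here: that `customer_id` + `model_id` take precedence over `gcs_bucket`, and that a blob path with no bucket information is rejected.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelLocation:
    """Where a set of weights lives: on local disk or in a GCS bucket."""
    kind: str                     # "local" or "gcs"
    path: str                     # local file path, or blob name inside the bucket
    bucket: Optional[str] = None  # set only when kind == "gcs"


def resolve_model_location(
    model: Optional[str],
    default_path: str,
    gcs_bucket: Optional[str] = None,
    customer_id: Optional[str] = None,
    model_id: Optional[str] = None,
) -> ModelLocation:
    # Parameter omitted: fall back to the local default weights.
    if not model:
        return ModelLocation("local", default_path)

    # Full gs:// URI: bucket and blob are both encoded in the string.
    if model.startswith("gs://"):
        bucket, _, blob = model[len("gs://"):].partition("/")
        if not bucket or not blob:
            raise ValueError(f"malformed GCS URI: {model!r}")
        return ModelLocation("gcs", blob, bucket)

    # Absolute path: a file on local disk.
    if os.path.isabs(model):
        return ModelLocation("local", model)

    # Anything else is treated as a blob path (no scheme).
    # Assumption: customer_id + model_id win over gcs_bucket when both are given.
    if customer_id and model_id:
        return ModelLocation("gcs", f"{model_id}/{model}", customer_id)
    if gcs_bucket:
        return ModelLocation("gcs", model, gcs_bucket)

    # Assumption: a blob path without any bucket information is an error.
    raise ValueError(f"cannot infer a bucket for blob path {model!r}")
```

For example, `resolve_model_location("best.pt", "./axsy-yolo.pt", customer_id="acme", model_id="m1")` yields bucket `acme` and blob `m1/best.pt` (the values `acme`, `m1`, and `best.pt` are made up for illustration).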
Example classification response:

    {
      "classifier": "...",
      "result": {
        "input_size": 32,
        "top_index": 12,
        "top_prob": 0.93,
        "probs": [0.01, 0.00, ...]
      }
    }

Example detection response:

    {
      "detector": "...",
      "result": {
        "image": {"width": 1928, "height": 2472},
        "speed_ms": {"preprocess": 2.7, "inference": 454.0, "postprocess": 8.4},
        "num_detections": 74,
        "class_counts": {"shelf": 11, "product": 63},
        "detections": [
          {
            "class_id": 2,
            "class_name": "shelf",
            "confidence": 0.9554,
            "box": {
              "xyxy": [x1, y1, x2, y2],
              "center_xywh": [cx, cy, w, h],
              "center_xywh_norm": [cxn, cyn, wn, hn]
            }
          }
        ]
      }
    }

- The API is multi-user safe: model downloads are serialized per file, models are cached, and inference runs under a per-model lock in a threadpool.
- Keep `smart-vision-trainiing-sa.json` and large model weights out of git.
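
The concurrency pattern described above (one load per file, a model cache, and a per-model inference lock inside a threadpool) could look roughly like the following. This is a minimal sketch under stated assumptions, not the server's implementation: `ModelRegistry`, `loader`, and `submit` are hypothetical names, and a plain `ThreadPoolExecutor` stands in for whatever threadpool the API actually uses.

```python
import threading
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable, Dict


class ModelRegistry:
    """Hypothetical helper: cached models, per-file load locks, per-model inference locks."""

    def __init__(self, loader: Callable[[str], Any], max_workers: int = 4) -> None:
        self._loader = loader                      # assumed to download (if needed) and load weights
        self._models: Dict[str, Any] = {}          # cache: path -> loaded model
        self._load_locks: Dict[str, threading.Lock] = {}
        self._infer_locks: Dict[str, threading.Lock] = {}
        self._guard = threading.Lock()             # protects the three dicts above
        self._pool = ThreadPoolExecutor(max_workers=max_workers)

    def _lock_for(self, table: Dict[str, threading.Lock], key: str) -> threading.Lock:
        # Create the per-key lock on first use; the guard makes this atomic.
        with self._guard:
            return table.setdefault(key, threading.Lock())

    def get(self, path: str) -> Any:
        # Only one thread per file may load; later callers find the cached model.
        with self._lock_for(self._load_locks, path):
            with self._guard:
                if path in self._models:
                    return self._models[path]
            model = self._loader(path)
            with self._guard:
                self._models[path] = model
            return model

    def submit(self, path: str, fn: Callable[..., Any], *args: Any) -> Future:
        # Run fn(model, *args) in the pool, one call at a time per model.
        def task() -> Any:
            model = self.get(path)
            with self._lock_for(self._infer_locks, path):
                return fn(model, *args)

        return self._pool.submit(task)
```

Requests for different models can still run in parallel in this sketch, because each model path gets its own lock; only calls against the same model are serialized.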
Prerequisites:
- Install and initialize gcloud (https://cloud.google.com/sdk/docs/install).
- Enable APIs:

      gcloud services enable artifactregistry.googleapis.com run.googleapis.com cloudbuild.googleapis.com

Build and deploy using Cloud Build + Artifact Registry:

    # From repo root
    ./cloudrun-deploy.sh <PROJECT_ID> <REGION> axsy-inference
    # Example:
    # ./cloudrun-deploy.sh my-gcp-project europe-west1 axsy-inference

Notes:
- The Docker image listens on `PORT` (default 8080) and starts `uvicorn server:get_app --factory`.
- To pass runtime env vars (e.g. workers/log level), edit `cloudrun-deploy.sh` or run:

      gcloud run deploy axsy-inference \
        --set-env-vars "UVICORN_WORKERS=1,UVICORN_LOG_LEVEL=info" \
        --project <PROJECT_ID> --region <REGION>
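
To make the role of these environment variables concrete, here is one way a container entrypoint might turn them into a uvicorn command line. This is a sketch, not the image's actual entrypoint: `build_uvicorn_command` is a hypothetical helper, and the fallback values for `UVICORN_WORKERS` and `UVICORN_LOG_LEVEL` are assumptions (only the `PORT` default of 8080 is stated above).

```python
import os
from typing import List, Mapping


def build_uvicorn_command(env: Mapping[str, str] = os.environ) -> List[str]:
    """Build the uvicorn argv from PORT, UVICORN_WORKERS and UVICORN_LOG_LEVEL."""
    port = int(env.get("PORT", "8080"))              # Cloud Run injects PORT at runtime
    workers = int(env.get("UVICORN_WORKERS", "1"))   # assumed fallback: 1 worker
    log_level = env.get("UVICORN_LOG_LEVEL", "info") # assumed fallback: info
    return [
        "uvicorn", "server:get_app", "--factory",
        "--host", "0.0.0.0",
        "--port", str(port),
        "--workers", str(workers),
        "--log-level", log_level,
    ]
```

An entrypoint could then hand the result to `subprocess.run(build_uvicorn_command(), check=True)`. Parsing `PORT` and `UVICORN_WORKERS` with `int()` makes a malformed value fail at startup rather than being passed through to uvicorn.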