17. Backend API Design for AI Models (Design)
Scenario: How should we expose our AI models (anomaly detection, prediction) via REST APIs?

Tasks: Design a REST API structure. Consider: What endpoints are needed? How should inputs (features) and outputs (predictions, explanations) be structured (JSON)? How should we handle asynchronous predictions for long-running models? How would you version these AI-specific APIs?

**1. Core Design Principles:**
* Resource-Oriented: The API is structured around resources (the predictions and anomaly detection jobs).
* Stateless: Each request contains all the information the server needs to fulfill it; no client context is stored on the server between requests.
* HTTP Methods: We will use standard HTTP verbs (GET, POST) and status codes (200 OK, 202 Accepted, 400 Bad Request, etc.) to indicate request outcomes.
* JSON for Data Exchange: All request and response bodies will use JSON for its readability and broad support.
* Clear Separation of Sync & Async: The API provides distinct patterns for quick, real-time predictions and for long-running, asynchronous jobs.

**2. API Versioning:**
Versioning lets us iterate on the API structure without breaking existing client implementations. We will use **URI path versioning**: /api/v1/..., /api/v2/...
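
As a rough illustration of URI path versioning, the sketch below splits the version prefix from the rest of the path. The `split_version` helper and the `SUPPORTED_VERSIONS` set are hypothetical names introduced here; a real service would normally leave this to its web framework's router.

```python
import re

# Hypothetical set of versions this deployment serves (assumption for the sketch).
SUPPORTED_VERSIONS = {"v1", "v2"}

_VERSIONED_PATH = re.compile(r"^/api/(v\d+)(/.*)?$")


def split_version(path: str) -> tuple:
    """Split '/api/v1/predict' into ('v1', '/predict').

    Raises ValueError when the path has no version prefix or the
    version is not one this deployment supports.
    """
    match = _VERSIONED_PATH.match(path)
    if match is None:
        raise ValueError(f"unversioned path: {path}")
    version, rest = match.group(1), match.group(2) or "/"
    if version not in SUPPORTED_VERSIONS:
        raise ValueError(f"unsupported API version: {version}")
    return version, rest
```

Keeping the version in the path means two versions can be served side by side while clients migrate at their own pace.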

**3. Authentication:**
Securing the API is essential. I suggest standard approaches such as JWTs or API keys: the client sends its token or unique key in the `Authorization` request header.
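
A minimal sketch of the API-key variant follows. The `ApiKey <key>` header scheme, the `authenticate` function, the demo key, and the idea of storing only SHA-256 digests are all assumptions made for illustration, not part of the design above.

```python
import hashlib
import hmac
from typing import Optional

# Hypothetical key store: SHA-256 digests of issued keys mapped to client names.
# Storing digests instead of raw keys is an assumption of this sketch.
_KEY_DIGESTS = {
    hashlib.sha256(b"demo-key-123").hexdigest(): "demo-client",
}


def authenticate(headers: dict) -> Optional[str]:
    """Return the client name for a valid 'Authorization: ApiKey <key>' header, else None."""
    value = headers.get("Authorization", "")
    scheme, _, key = value.partition(" ")
    if scheme != "ApiKey" or not key:
        return None
    digest = hashlib.sha256(key.encode()).hexdigest()
    for known_digest, client in _KEY_DIGESTS.items():
        # compare_digest runs in constant time, so timing does not reveal
        # how much of the digest matched.
        if hmac.compare_digest(digest, known_digest):
            return client
    return None
```

A request that returns `None` here would be answered with 401 Unauthorized before any model code runs.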

**4. Endpoint Design: Prediction Model:**
As requested, we need two endpoints for predictions: one for synchronous (real-time) requests and one for asynchronous (batch/long-running) jobs.

**4.1. Synchronous Prediction:**
For models that can return a prediction within a few milliseconds.
Endpoint: POST /api/v1/predict
Request Body (Example: Anomaly Detection):
```json
{
  "model_id": "anomaly_detection",
  "data": {
    "timestamp": "2025-01-01",
    "emissions": 100,
    "production_volume": 50,
    "emissions_roll_avg_7d": 105,
    "emission_intensity": 2.0,
    "day_of_week": 0,
    "month": 1
  }
}
```
Request Body (Example: LCA Emissions):
```json
{
  "model_id": "lca_emissions",
  "data": {
    "steel_kg": 1000,
    "transport_km": 500
  }
}
```
Success Response (200 OK):
```json
{
  "model_id": "anomaly_detection",
  "predictions": [0, 1, 0, …],            // binary flags for anomalies
  "scores":      [0.02, 0.85, 0.10, …],   // probabilities
  "meta": {
    "timestamp": "2025-06-08T12:34:56Z",
    "duration_ms": 45
  }
}
```
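
The request and response shapes above could be enforced with something like the following sketch. `validate_request`, `build_response`, and `REQUIRED_FEATURES` are hypothetical names; the feature lists are copied from the example bodies, and no real model is invoked.

```python
import time
from datetime import datetime, timezone

# Required feature names per model, taken from the request examples above.
REQUIRED_FEATURES = {
    "anomaly_detection": {
        "timestamp", "emissions", "production_volume",
        "emissions_roll_avg_7d", "emission_intensity", "day_of_week", "month",
    },
    "lca_emissions": {"steel_kg", "transport_km"},
}


def validate_request(body: dict) -> list:
    """Return a list of validation problems; an empty list means the body is valid."""
    model_id = body.get("model_id")
    if model_id not in REQUIRED_FEATURES:
        return [f"unknown model_id: {model_id!r}"]
    data = body.get("data")
    if not isinstance(data, dict):
        return ["'data' must be a JSON object"]
    missing = sorted(REQUIRED_FEATURES[model_id] - set(data))
    return [f"missing feature: {name}" for name in missing]


def build_response(model_id: str, predictions: list, scores: list, started: float) -> dict:
    """Wrap model output in the response envelope shown above.

    `started` is a time.perf_counter() reading taken when the request arrived.
    """
    return {
        "model_id": model_id,
        "predictions": predictions,
        "scores": scores,
        "meta": {
            "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
            "duration_ms": int((time.perf_counter() - started) * 1000),
        },
    }
```

Returning every validation problem at once, rather than stopping at the first, lets a client fix a malformed payload in a single round trip.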

**4.2. Asynchronous Prediction:**
This pattern is for models whose inference takes significant time (more than 2–3 s) or for computationally intensive routines such as SHAP explanations or Monte Carlo simulations.

**Step 1: Initiate Prediction Job**
- **Endpoint:** `POST /api/v1/prediction-jobs`
- **Request Body (Example: LCA Emissions with Monte Carlo):**
  ```json
  {
    "model_id": "lca_emissions",
    "data": {
      "steel_kg": 1000,
      "transport_km": 500
    },
    "parameters": {
      "n_simulations": 1000
    }
  }
  ```
- **Request Body (Example: Anomaly Detection Batch):**
  ```json
  {
    "model_id": "anomaly_detection",
    "data": [
      {
        "timestamp": "2025-01-01",
        "emissions": 100,
        "production_volume": 50,
        "emissions_roll_avg_7d": 105,
        "emission_intensity": 2.0,
        "day_of_week": 0,
        "month": 1
      },
      {
        "timestamp": "2025-01-02",
        "emissions": 110,
        "production_volume": 55,
        "emissions_roll_avg_7d": 108,
        "emission_intensity": 2.0,
        "day_of_week": 1,
        "month": 1
      }
    ]
  }
  ```
- **Response (202 Accepted):**
  ```json
  {
    "job_id": "abc123",
    "status_url": "/api/v1/prediction-jobs/abc123"
  }
  ```

**Step 2: Poll Job Status**
- **Endpoint:** `GET /api/v1/prediction-jobs/{job_id}`

- **Responses:**

Running (200 OK):
```json
{
  "job_id": "abc123",
  "status": "RUNNING",
  "progress_percent": 65,
  "updated_at": "2025-06-08T12:40:22Z"
}
```
Completed (200 OK):
```json
{
  "job_id": "abc123",
  "status": "COMPLETED",
  "result_uri": "s3://bucket/abc123/results.json",
  "completed_at": "2025-06-08T12:42:15Z"
}
```
Failed (200 OK):
```json
{
  "job_id": "abc123",
  "status": "FAILED",
  "error": "Input data format mismatch.",
  "completed_at": "2025-06-08T12:42:15Z"
}
```
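
From the client's perspective, polling could be sketched as below. `poll_job` and its parameters are hypothetical; the exponential backoff and the timeout values are assumptions rather than part of the design, and `fetch_status` stands in for whatever HTTP call retrieves the status body.

```python
import time
from typing import Callable

# Statuses after which the job will not change again.
TERMINAL_STATES = {"COMPLETED", "FAILED"}


def poll_job(fetch_status: Callable[[], dict], interval_s: float = 1.0,
             max_interval_s: float = 30.0, timeout_s: float = 600.0) -> dict:
    """Call fetch_status until the job reaches a terminal state or timeout_s elapses.

    fetch_status is any callable returning the parsed JSON body of
    GET /api/v1/prediction-jobs/{job_id}.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        body = fetch_status()
        if body.get("status") in TERMINAL_STATES:
            return body
        if time.monotonic() + interval_s > deadline:
            raise TimeoutError(f"job {body.get('job_id')} still {body.get('status')}")
        time.sleep(interval_s)
        # Double the wait after each attempt, capped at max_interval_s.
        interval_s = min(interval_s * 2, max_interval_s)
```

Because a failed job is still reported with 200 OK, the caller has to inspect `status` in the returned body rather than rely on the HTTP status code.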

**5. Endpoint Design: Anomaly Detection Model:**
This follows the same synchronous/asynchronous pattern as the prediction model, but with a different resource name and output structure.

**6. Error Handling:**
A standardized error response format will be used for all 4xx and 5xx status codes.
Examples:
* 400 Bad Request (for example, a missing or malformed input feature)
* 404 Not Found (for example, an unknown job_id)
* 500 Internal Server Error
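
One possible shape for such a standardized error body is sketched below. The envelope fields (`status`, `code`, `message`, `details`) and the `error_response` helper are assumptions chosen for illustration; the design above does not specify them.

```python
from http import HTTPStatus
from typing import Optional


def error_response(status: int, message: str, details: Optional[list] = None) -> tuple:
    """Build (HTTP status, JSON body) using one envelope for every 4xx/5xx reply."""
    body = {
        "error": {
            "status": status,
            # Symbolic name from the standard library, e.g. "BAD_REQUEST" for 400.
            "code": HTTPStatus(status).name,
            "message": message,
        }
    }
    if details:
        # Optional list of specifics, such as one entry per invalid field.
        body["error"]["details"] = details
    return status, body
```

For instance, `error_response(400, "Invalid request body", ["missing feature: transport_km"])` yields a body whose `code` is `BAD_REQUEST` and whose `details` list names the offending feature.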
