
STPA-for-AI extension: ML lifecycle hazards, data-driven UCAs, DeepSTPA patterns #105

@avrabe

Context

STPA is being rapidly adopted for AI/ML safety analysis. Recent research extends classic STPA to handle ML-specific failure modes:

  • DeepSTPA (arXiv 2302.10588): Extends control loops to cover the ML lifecycle (data collection → training → deployment → monitoring). Adds layer-wise analysis of ML model internals.
  • UniSTPA (arXiv 2505.15005): Extends along lifecycle and structural-depth dimensions for end-to-end autonomous driving.
  • MIT/SEI research: STPA for frontier AI — characterizing loss of control in AI-containing sociotechnical systems.
  • ISO/PAS 8800: Requires AI safety lifecycle analysis (aligns with STPA methodology).
  • SAE J3187: STPA is now a formal SAE standard.

Rivet's existing stpa.yaml (15+ artifact types) covers classic STPA. This issue extends it for AI/ML systems.

New artifact types for stpa.yaml or schemas/stpa-ai.yaml

ML lifecycle control loop extensions

| Artifact type | Description | Links |
| --- | --- | --- |
| `ml-controller` | An ML model acting as a controller (e.g., perception CNN, decision transformer) | `refines → controller` |
| `ml-process-model` | The ML model's learned representation of the world (implicit, unlike a traditional controller's explicit process model) | `models → ml-controller` |
| `training-data-source` | Training dataset with provenance, bias assessment, and distribution characteristics | `trains → ml-controller` |
| `data-hazard` | Hazard arising from data quality (bias, distribution shift, labeling errors, adversarial samples) | `leads-to-hazard → hazard` |
| `ml-uca` | Unsafe control action specific to ML failure modes; extends the 4 classic UCA types with ML-specific causes | `refines → uca` |
| `ml-loss-scenario` | Causal scenario specific to ML (training data bias → misclassification → UCA → hazard → loss) | `refines → loss-scenario` |
| `monitoring-trigger` | Post-deployment monitoring condition that indicates model degradation (distribution drift, accuracy decay) | `monitors → ml-controller` |
| `retraining-requirement` | Requirement triggered by monitoring (when to retrain, with what data, validation criteria) | `satisfies → monitoring-trigger` |
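To make the proposed types concrete, here is a hypothetical pair of artifact instances wired together with the new link types. All identifiers and field names are illustrative assumptions, not part of the current `stpa.yaml`:

```yaml
# Hypothetical instances (ids and field names are illustrative)
artifacts:
  - id: MLC-001
    type: ml-controller
    name: perception-cnn
    description: Camera-based object detection model in the control loop
    links:
      - type: refines
        target: CTRL-004        # the classic `controller` artifact it refines

  - id: TDS-001
    type: training-data-source
    name: highway-camera-dataset-v3
    description: 2M labeled frames; daytime-biased, bias assessment pending
    links:
      - type: trains
        target: MLC-001
```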

ML-specific UCA categories

Classic STPA defines 4 UCA types: (1) not providing, (2) providing causes hazard, (3) too early/late/wrong order, (4) stopped too soon/applied too long.

For ML controllers, additional categories:

  • Misclassification UCA: ML controller classifies input incorrectly → wrong control action
  • Confidence UCA: ML controller acts with inappropriate confidence (overconfident on OOD inputs)
  • Latency UCA: ML inference too slow for real-time control
  • Degradation UCA: Model performance degrades over time (data drift) → previously safe actions become unsafe
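An ML-specific UCA from these categories might be recorded like this. The instance below sketches a Confidence UCA; the field names and ids are assumptions about how the `ml-uca` type could look:

```yaml
# Hypothetical ml-uca instance (Confidence UCA; fields are illustrative)
- id: MLUCA-007
  type: ml-uca
  description: >
    Perception model commands lane-keep with high confidence on an
    out-of-distribution input (snow-covered lane markings)
  classic-uca-type: providing-causes-hazard   # the classic category it extends
  ml-failure-mode: overconfidence
  links:
    - type: refines
      target: UCA-012
    - type: leads-to-hazard
      target: HAZ-003
```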

New fields on existing types

```yaml
# Extend existing `uca` type with optional ML fields
- name: ml-failure-mode
  type: string
  required: false
  allowed-values:
    - misclassification
    - overconfidence
    - out-of-distribution
    - adversarial
    - data-drift
    - latency
    - mode-collapse
  description: ML-specific failure mode causing this UCA

- name: operational-design-domain
  type: text
  required: false
  description: ODD conditions under which this UCA can occur (ISO 21448 SOTIF alignment)
```
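A `uca` instance carrying these optional fields could then look like the following (identifiers and prose are illustrative):

```yaml
# Hypothetical uca instance using the proposed optional fields
- id: UCA-012
  type: uca
  description: Braking command not provided when an obstacle is present
  ml-failure-mode: out-of-distribution
  operational-design-domain: >
    Urban, daytime, dry road; OOD likelihood increases in heavy rain
    and at night (ISO 21448 SOTIF alignment)
```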

Traceability rules

  • Every ml-controller must have at least one training-data-source (error)
  • Every training-data-source should have a data-hazard assessment (warning)
  • Every ml-uca must link to a hazard (error)
  • Every monitoring-trigger must have a retraining-requirement response (warning)
  • Every ml-controller should have post-deployment monitoring (monitoring-trigger backlink, warning)
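These rules could be encoded declaratively alongside the schema. The syntax below is an assumption about how Rivet might express traceability checks, not its actual rule format:

```yaml
# Sketch of traceability rules (rule syntax assumed, not verified against rivet)
rules:
  - name: ml-controller-has-training-data
    applies-to: ml-controller
    requires: incoming link of type `trains` from a training-data-source
    severity: error

  - name: ml-uca-links-hazard
    applies-to: ml-uca
    requires: outgoing link of type `leads-to-hazard` to a hazard
    severity: error

  - name: monitoring-trigger-has-response
    applies-to: monitoring-trigger
    requires: incoming link of type `satisfies` from a retraining-requirement
    severity: warning
```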

Bridge to ISO/PAS 8800

```yaml
# stpa-ai-8800-bridge.yaml
link-types:
  - name: ai-lifecycle-phase
    description: Maps STPA-AI artifacts to ISO/PAS 8800 lifecycle phases
    source-types: [ml-controller, training-data-source, monitoring-trigger]
    target-types: [ai-assurance-argument]  # From safety-case schema (#103)
```

Integration with spar

Spar's EMV2 fault tree analysis already handles error propagation in AADL models. For ML components modeled in AADL (via WASI P3 threads or custom component categories):

  • EMV2 error states can map to ML failure modes (misclassification → error state)
  • Fault trees generated from AADL+EMV2 can reference STPA-AI UCAs
  • Architecture-level analysis (scheduling, latency) validates that ML inference meets timing requirements
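A bridge file in the same style as the 8800 bridge above could capture the first of these mappings. The file name, artifact types, and mapping fields below are hypothetical; `LateDelivery` refers to a timing error type in the EMV2 predefined error library, while `Misclassification` would be a custom error type declared in the AADL model:

```yaml
# stpa-ai-emv2-bridge.yaml (hypothetical; mirrors the 8800 bridge style)
link-types:
  - name: emv2-error-state
    description: Maps ML failure modes to EMV2 error types in the AADL model
    source-types: [ml-uca, ml-loss-scenario]
    target-types: [emv2-error-type-ref]   # assumed artifact wrapping an AADL+EMV2 reference

mappings:
  - ml-failure-mode: misclassification
    emv2-error-type: Misclassification   # custom error type in the model's library
  - ml-failure-mode: latency
    emv2-error-type: LateDelivery        # predefined EMV2 timing error type
```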
