# Unit Real-Time UX & Failsafe (Human-in-the-Loop)

In this notebook you will design the logic that transforms model outputs into safe, real-time actions for users.

You will implement:
- UX event mapping (prediction → action)
- failsafe logic (uncertainty, instability, invalid inputs)
- human-in-the-loop confirmation / override

This notebook assumes you already have a working Safe Inference API from Unit 2.2 (or you can simulate model outputs).

## Lab Step 1 - From Prediction to UX Event

In this step, you will transform AI model outputs into user-facing UX events.
The goal is to decide *what the system should do* based on predictions, confidence, and context.



In [1]:
from dataclasses import dataclass
from typing import Dict, Any

### 1.1 Defining a UX Event

A UX event represents what the system communicates or does after evaluating a prediction.

In [2]:
@dataclass
class UXEvent:
    event_type: str        # e.g. "NO_ACTION", "SHOW_ALERT", "REQUEST_CONFIRMATION"
    message: str           # user-facing message
    severity: str          # "low", "medium", "high"
    metadata: Dict[str, Any]
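
Because `event_type` and `severity` are plain strings, a typo such as `"hgh"` would pass silently. One way to guard against this, sketched here as an optional variant rather than a replacement for the class above, is to validate the fields in `__post_init__`. The name `ValidatedUXEvent` and the allowed-value sets are assumptions drawn from the comments in the cell above.

```python
from dataclasses import dataclass, field
from typing import Any, Dict

# Allowed values, taken from the comments on UXEvent (assumed to be exhaustive).
ALLOWED_EVENT_TYPES = {"NO_ACTION", "SHOW_ALERT", "REQUEST_CONFIRMATION"}
ALLOWED_SEVERITIES = {"low", "medium", "high"}


@dataclass(frozen=True)
class ValidatedUXEvent:
    """Hypothetical stricter variant of UXEvent that rejects unknown values."""
    event_type: str
    message: str
    severity: str
    metadata: Dict[str, Any] = field(default_factory=dict)

    def __post_init__(self) -> None:
        if self.event_type not in ALLOWED_EVENT_TYPES:
            raise ValueError(f"unknown event_type: {self.event_type!r}")
        if self.severity not in ALLOWED_SEVERITIES:
            raise ValueError(f"unknown severity: {self.severity!r}")


ok_event = ValidatedUXEvent("SHOW_ALERT", "Risky activity detected.", "high")

# A misspelled severity is rejected at construction time.
try:
    ValidatedUXEvent("SHOW_ALERT", "Risky activity detected.", "hgh")
    typo_rejected = False
except ValueError:
    typo_rejected = True

print("Typo rejected:", typo_rejected)
```

Failing at construction time keeps malformed events from reaching the UI layer, at the cost of a slightly heavier class.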


### 1.2 Mapping Predictions to UX Events

We now define simple rules that map predictions, confidence, and context to UX events.


In [3]:
def map_prediction_to_event(activity: str, confidence: float, restricted_area: bool) -> UXEvent:
    if confidence < 0.6:
        return UXEvent(
            event_type="NO_ACTION",
            message="Low confidence – no action taken.",
            severity="low",
            metadata={"activity": activity, "confidence": confidence}
        )

    if restricted_area and activity in ["RUNNING", "CLIMBING"]:
        return UXEvent(
            event_type="SHOW_ALERT",
            message="Risky activity detected in restricted area.",
            severity="high",
            metadata={
                "activity": activity,
                "confidence": confidence,
                "restricted_area": restricted_area
            }
        )

    return UXEvent(
        event_type="NO_ACTION",
        message="Normal activity.",
        severity="low",
        metadata={"activity": activity, "confidence": confidence}
    )

### 1.3 Simulating Different Scenarios

In [4]:
test_cases = [
    ("RUNNING", 0.92, True),
    ("WALKING", 0.85, True),
    ("RUNNING", 0.45, True),
    ("SITTING", 0.80, False),
]

for activity, confidence, restricted in test_cases:
    event = map_prediction_to_event(activity, confidence, restricted)
    print(event)


UXEvent(event_type='SHOW_ALERT', message='Risky activity detected in restricted area.', severity='high', metadata={'activity': 'RUNNING', 'confidence': 0.92, 'restricted_area': True})
UXEvent(event_type='NO_ACTION', message='Normal activity.', severity='low', metadata={'activity': 'WALKING', 'confidence': 0.85})
UXEvent(event_type='NO_ACTION', message='Low confidence – no action taken.', severity='low', metadata={'activity': 'RUNNING', 'confidence': 0.45})
UXEvent(event_type='NO_ACTION', message='Normal activity.', severity='low', metadata={'activity': 'SITTING', 'confidence': 0.8})


### Reflection

Observe how small changes in confidence or context change the resulting UX event.
Why is this behaviour important for user trust and safety?
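
To explore the reflection question, it can help to pull the hard-coded 0.6 threshold and the risky-activity list into parameters and sweep them. The sketch below is a simplified stand-in for `map_prediction_to_event` that returns only the event type; `event_type_for` and its parameter names are illustrative assumptions.

```python
from typing import FrozenSet


def event_type_for(activity: str,
                   confidence: float,
                   restricted_area: bool,
                   min_confidence: float = 0.6,
                   risky: FrozenSet[str] = frozenset({"RUNNING", "CLIMBING"})) -> str:
    """Simplified stand-in for map_prediction_to_event (event type only)."""
    if confidence < min_confidence:
        return "NO_ACTION"
    if restricted_area and activity in risky:
        return "SHOW_ALERT"
    return "NO_ACTION"


# Sweep confidence across the threshold for a fixed risky context.
sweep = {c: event_type_for("RUNNING", c, restricted_area=True)
         for c in (0.55, 0.59, 0.60, 0.65)}
for conf, event_type in sweep.items():
    print(f"confidence={conf:.2f} -> {event_type}")

# Same prediction, different context: no alert outside the restricted area.
outside = event_type_for("RUNNING", 0.92, restricted_area=False)
print("outside restricted area ->", outside)
```

A 0.01 change in confidence at the boundary flips the outcome from silence to a high-severity alert, which is one reason Step 2 adds stability checks.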

## Lab Step 2 – Implementing Failsafe Logic

In this step, you will extend the decision logic by adding failsafe mechanisms.
The goal is to handle uncertainty and instability in a safe and controlled way.


### 2.1 Detecting Prediction Instability

Rapid changes in predicted activities may indicate uncertainty rather than real changes.
We detect instability by analysing recent predictions over a short time window.



In [5]:
from typing import List

def is_unstable(predictions: List[str], max_changes: int = 2) -> bool:
    changes = sum(
        1 for i in range(1, len(predictions))
        if predictions[i] != predictions[i - 1]
    )
    return changes > max_changes
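
In a real-time setting predictions arrive one at a time, so the "short time window" is usually a rolling buffer. A minimal sketch of that idea, assuming a hypothetical `InstabilityMonitor` class and a window of four predictions, could look like this; it applies the same change-counting rule as `is_unstable`.

```python
from collections import deque
from typing import Deque


class InstabilityMonitor:
    """Hypothetical rolling-window wrapper around the change-counting rule."""

    def __init__(self, window_size: int = 4, max_changes: int = 2) -> None:
        self.max_changes = max_changes
        # deque(maxlen=...) drops the oldest prediction automatically when full.
        self.recent: Deque[str] = deque(maxlen=window_size)

    def update(self, prediction: str) -> bool:
        """Add one prediction and report whether the window is unstable."""
        self.recent.append(prediction)
        items = list(self.recent)
        changes = sum(1 for prev, cur in zip(items, items[1:]) if prev != cur)
        return changes > self.max_changes


monitor = InstabilityMonitor()
stream = ["WALKING", "WALKING", "RUNNING", "WALKING",
          "RUNNING", "RUNNING", "RUNNING", "RUNNING"]
flags = [monitor.update(p) for p in stream]
print(flags)
```

The monitor flags instability only while the alternating predictions fill the window, and clears again once the stream settles.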


### 2.2 Applying Failsafe Rules

If predictions are unstable, the system should avoid triggering high-severity alerts.
Instead, it may request confirmation or take no action.


In [6]:
def apply_failsafe(event: UXEvent, unstable: bool) -> UXEvent:
    if unstable and event.event_type == "SHOW_ALERT":
        return UXEvent(
            event_type="REQUEST_CONFIRMATION",
            message="Unstable detection – user confirmation required.",
            severity="medium",
            metadata=event.metadata
        )
    return event


### 2.3 Testing Failsafe Behaviour

In [7]:
recent_predictions_stable = ["WALKING", "WALKING", "WALKING", "WALKING"]
recent_predictions_unstable = ["RUNNING", "WALKING", "RUNNING", "WALKING"]

print("Stable sequence unstable?:", is_unstable(recent_predictions_stable))
print("Unstable sequence unstable?:", is_unstable(recent_predictions_unstable))

base_event = UXEvent(
    event_type="SHOW_ALERT",
    message="Risky activity detected.",
    severity="high",
    metadata={}
)

print("Event without failsafe:", base_event)
print("Event with failsafe:", apply_failsafe(base_event, unstable=True))


Stable sequence unstable?: False
Unstable sequence unstable?: True
Event without failsafe: UXEvent(event_type='SHOW_ALERT', message='Risky activity detected.', severity='high', metadata={})
Event with failsafe: UXEvent(event_type='REQUEST_CONFIRMATION', message='Unstable detection – user confirmation required.', severity='medium', metadata={})


### Reflection

Observe how the failsafe logic changes system behaviour.
Why is it important to downgrade alerts when predictions are unstable?

## Lab Step 3 – Human-in-the-Loop (HITL) Integration

In this step, you will add a human decision point to the AI workflow.
The system will request confirmation or allow override before executing certain actions.


### 3.1 Human Confirmation and Override

Human-in-the-Loop mechanisms allow users or supervisors to confirm or cancel AI-generated events.
This is especially important when actions are safety-critical or uncertainty is high.


In [8]:
def human_decision(event: UXEvent, simulated_response: str = "CONFIRM") -> UXEvent:
    """
    Simulates a human response to an AI-generated event.
    simulated_response can be 'CONFIRM' or 'CANCEL'; any value other than
    'CONFIRM' is treated as a cancellation.
    """
    if event.event_type != "REQUEST_CONFIRMATION":
        return event

    if simulated_response == "CONFIRM":
        return UXEvent(
            event_type="SHOW_ALERT",
            message="Alert confirmed by user.",
            severity="high",
            metadata=event.metadata
        )

    return UXEvent(
        event_type="NO_ACTION",
        message="Alert cancelled by user.",
        severity="low",
        metadata=event.metadata
    )
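
`human_decision` treats any response other than `"CONFIRM"` as a cancellation. In an interactive deployment the response may also be missing (timeout) or free-form text. The sketch below shows one way to normalise such input before it reaches `human_decision`; `normalise_response` and its accepted spellings are assumptions for illustration, and whether cancelling or escalating is the safer default depends on the application.

```python
from typing import Optional

# Spellings accepted as an explicit confirmation (illustrative assumption).
_CONFIRM_WORDS = {"confirm", "yes", "y"}


def normalise_response(raw: Optional[str]) -> str:
    """Map raw user input to 'CONFIRM' or 'CANCEL'.

    Anything that is not an explicit confirmation, including no response
    at all (None), resolves to 'CANCEL'.
    """
    if raw is None:
        return "CANCEL"
    return "CONFIRM" if raw.strip().lower() in _CONFIRM_WORDS else "CANCEL"


for raw in ["CONFIRM", " yes ", "maybe", "", None]:
    print(repr(raw), "->", normalise_response(raw))
```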


### 3.2 End-to-End HITL Simulation

We now simulate the full decision chain:
prediction → UX event → failsafe → human confirmation.


In [9]:
# Base prediction scenario
activity = "RUNNING"
confidence = 0.85
restricted_area = True

# Step 1: prediction to UX event
base_event = map_prediction_to_event(activity, confidence, restricted_area)

# Step 2: failsafe logic
recent_predictions = ["RUNNING", "WALKING", "RUNNING", "WALKING"]
unstable = is_unstable(recent_predictions)
safe_event = apply_failsafe(base_event, unstable)

# Step 3: human-in-the-loop
final_event = human_decision(safe_event, simulated_response="CONFIRM")

print("Base event:", base_event)
print("After failsafe:", safe_event)
print("After HITL:", final_event)


Base event: UXEvent(event_type='SHOW_ALERT', message='Risky activity detected in restricted area.', severity='high', metadata={'activity': 'RUNNING', 'confidence': 0.85, 'restricted_area': True})
After failsafe: UXEvent(event_type='REQUEST_CONFIRMATION', message='Unstable detection – user confirmation required.', severity='medium', metadata={'activity': 'RUNNING', 'confidence': 0.85, 'restricted_area': True})
After HITL: UXEvent(event_type='SHOW_ALERT', message='Alert confirmed by user.', severity='high', metadata={'activity': 'RUNNING', 'confidence': 0.85, 'restricted_area': True})


### Reflection

Observe how the final system decision depends on both AI logic and human input.
Why is this combination important for safety-critical applications?


## Lab Step 4 - End-to-End Integration: From Sensor Data to Safe Action

In this final lab step, we connect everything into a single end-to-end pipeline:

1. Sensor data input  
2. On-device model inference  
3. Secure inference and validation (Safe Inference API)  
4. Decision logic and UX mapping  
5. Failsafe mechanisms  
6. Human-in-the-Loop interaction  
7. Final system action  

Goal: run a realistic flow using the real model and (optionally) real data, while keeping the notebook runnable even if some artifacts are missing.


In [10]:
import os
import numpy as np
import onnxruntime as ort
from dataclasses import dataclass
from typing import Dict, Any, Optional, List

### 4.1 Sensor Data Input

In this step, we load a sensor window from a dataset (preferably the benchmark windows created in Unit 2.1/2.2) or generate a synthetic window if no dataset is available. 

- **Dataset Loading:** If available, we load sensor data (in this case, from an NPZ file) using the appropriate keys. The sensor window (`X_window`) is loaded and its shape and data type are printed for inspection.
- **Synthetic Data:** If no dataset is found, a synthetic window matching the model's input shape will be generated to allow further testing of the pipeline.

**Important Notes:**
1. **Window Size and Shape:** The `X_window` data is expected to match the dimensions `(W, C)`, where `W` is the window length and `C` is the number of channels. 
2. **Data Type:** The window's data type should be `float32` to be compatible with the model.
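3. **Range and Normalization:** In this example, the loaded window contains large-magnitude values (up to 1007.5), which indicates unscaled data that is likely to fail range validation. This step only reports the range and does not modify the data; Step 4.3 builds a synthetic, normalized window to use if validation rejects the dataset window.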



In [12]:
import os
import numpy as np

# ---------- Paths ----------
DATA_DIR = os.path.join("..", "data")
NPZ_PATH = os.path.join(DATA_DIR, "uca_ehar_preprocessed_win100_step50.npz")

# ---------- Load one window ----------
X_window = None
loaded_key = None

if os.path.exists(NPZ_PATH):
    npz = np.load(NPZ_PATH, allow_pickle=False)

    # Try these keys in order (common patterns)
    candidate_keys = ["X_bench", "X_test", "X", "X_windows"]

    for k in candidate_keys:
        if k in npz:
            X = npz[k]
            # Expect (N, W, C)
            if isinstance(X, np.ndarray) and X.ndim == 3 and X.shape[0] > 0:
                X_window = X[0].astype(np.float32)  # (W, C)
                loaded_key = k
                break

if X_window is None:
    print("No dataset window found. We will generate a synthetic window in Step 4.2.")
else:
    print(f"Loaded window from NPZ key '{loaded_key}' with shape {X_window.shape}")

# ---------- Sanity checks ----------
if X_window is not None:
    print("X_window dtype:", X_window.dtype)
    print("Any NaN?:", bool(np.isnan(X_window).any()))
    print("Any Inf?:", bool(np.isinf(X_window).any()))
    mn, mx = float(np.min(X_window)), float(np.max(X_window))
    print("X_window min/max:", mn, mx)

    # This is only a diagnostic threshold. We do NOT modify data here.
    if mx > 20 or mn < -20:
        print("WARNING: The loaded window looks unscaled (large magnitude values).")
        print("We will use a synthetic or normalized window later to avoid validation failures.")
    else:
        print("The loaded window magnitude looks reasonable for normalized IMU input.")


Loaded window from NPZ key 'X_test' with shape (100, 7)
X_window dtype: float32
Any NaN?: False
Any Inf?: False
X_window min/max: -5.519999980926514 1007.5
We will use a synthetic or normalized window later to avoid validation failures.
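
If scaling is needed, one common option is per-channel standardisation. The sketch below is an assumption, not the preprocessing used in Unit 2.1: it standardises a synthetic `(W, C)` window with statistics computed from that window alone, whereas a deployed pipeline would normally reuse the mean and standard deviation saved at training time. `standardise_window` is a hypothetical helper.

```python
import numpy as np


def standardise_window(window: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Zero-mean, unit-variance scaling per channel for a (W, C) window."""
    mean = window.mean(axis=0, keepdims=True)   # shape (1, C)
    std = window.std(axis=0, keepdims=True)     # shape (1, C)
    return ((window - mean) / (std + eps)).astype(np.float32)


# Synthetic unscaled window: one channel carries a large offset,
# similar in magnitude to the 1007.5 maximum reported above.
rng = np.random.default_rng(0)
raw = rng.normal(0.0, 1.0, size=(100, 7)).astype(np.float32)
raw[:, 6] = raw[:, 6] * 5.0 + 1000.0

scaled = standardise_window(raw)
print("raw min/max   :", float(raw.min()), float(raw.max()))
print("scaled min/max:", float(scaled.min()), float(scaled.max()))
```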


### 4.2 On-Device Model Inference

In this step, we load the optimized on-device model produced in **Unit 2.1** and run a **baseline inference pass**.

At this stage:
- We do **not** apply any secure validation or safety logic.
- We intentionally observe the **raw model behavior**.
- This establishes a baseline before introducing the secure inference layer in Step 4.3.

The goal is to understand:
- How the model expects its inputs
- What the raw outputs look like
- How predictions and confidence scores are derived


In [13]:
import os
import numpy as np
import onnxruntime as ort

# ---------------------------------------------------------------------
# Model loading
# ---------------------------------------------------------------------

MODELS_DIR = os.path.join("..", "models")
BASELINE_MODEL_PATH = os.path.join(MODELS_DIR, "har_baseline.onnx")

assert os.path.exists(BASELINE_MODEL_PATH), f"Model not found at: {BASELINE_MODEL_PATH}"

session = ort.InferenceSession(
    BASELINE_MODEL_PATH,
    providers=["CPUExecutionProvider"]
)

input_tensor = session.get_inputs()[0]
output_tensor = session.get_outputs()[0]

input_name = input_tensor.name
output_name = output_tensor.name
input_shape = input_tensor.shape  # e.g. ['unk__121', 100, 7]

print("Baseline model loaded successfully ✅")
print("Input name :", input_name)
print("Input shape:", input_shape)
print("Output name:", output_name)
print("Output shape:", output_tensor.shape)

# ---------------------------------------------------------------------
# Prepare input for inference
# ---------------------------------------------------------------------
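# Model expects input shape (1, W, C)

# If Step 4.1 found no dataset window, fall back to a synthetic one so the cell still runs
if X_window is None:
    _, win_len, n_channels = input_shape
    X_window = np.random.default_rng(0).uniform(
        -1.0, 1.0, size=(int(win_len), int(n_channels))
    ).astype(np.float32)
    print("No dataset window available. Using a synthetic window.")

X_input = X_window[np.newaxis, :, :].astype(np.float32)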

print("\nPrepared inference input")
print("Inference input shape:", X_input.shape)
print("Inference input dtype:", X_input.dtype)

# ---------------------------------------------------------------------
# Run baseline (non-secure) inference
# ---------------------------------------------------------------------

raw_output = session.run(
    [output_name],
    {input_name: X_input}
)[0]

# raw_output has shape (1, num_classes); use the single batch row
scores = raw_output[0]
predicted_class = int(np.argmax(scores))
confidence = float(scores[predicted_class])

print("\nRaw model output:", raw_output)
print("Predicted class:", predicted_class)
print("Confidence score:", confidence)


Baseline model loaded successfully ✅
Input name : input
Input shape: ['unk__121', 100, 7]
Output name: dense_1
Output shape: ['unk__122', 8]

Prepared inference input
Inference input shape: (1, 100, 7)
Inference input dtype: float32

Raw model output: [[0. 0. 0. 0. 0. 1. 0. 0.]]
Predicted class: 5
Confidence score: 1.0
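
The raw output above is already a one-hot-like vector that sums to 1, which suggests (but does not confirm) that the exported model ends in a softmax layer. Whether the outputs are logits or probabilities matters for how confidence is derived: applying softmax to values that are already probabilities flattens them. The sketch below illustrates the effect on the printed output; `stable_softmax` is a local helper, not part of ONNX Runtime.

```python
import numpy as np


def stable_softmax(x: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over a 1-D vector."""
    shifted = x - np.max(x)
    exps = np.exp(shifted)
    return exps / exps.sum()


# The raw output printed above, treated as a probability vector.
probs = np.array([0, 0, 0, 0, 0, 1, 0, 0], dtype=np.float32)

direct_conf = float(probs.max())                  # confidence read directly
double_conf = float(stable_softmax(probs).max())  # softmax applied a second time

print("Direct confidence        :", direct_conf)
print("After a redundant softmax:", round(double_conf, 4))
```

With 8 classes, a redundant softmax over values in [0, 1] can never exceed e / (e + 7) ≈ 0.28, so a pipeline that applies it would stay below a 0.3 confidence threshold regardless of the input.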


### 4.3 Secure Inference and Validation

In this step, we run inference through a **secure validation layer**.

Why this matters:
- Real sensor inputs may be malformed (wrong shape/dtype), contain NaNs/Infs, or be outside the expected value range.
- A secure wrapper prevents unsafe inputs from reaching the model and returns a **structured result** that downstream logic can handle safely.

We follow a **hybrid approach**:
- **Preferred:** reuse `secure_predict()` from Unit 2.2 (`src/edge_inference_secure.py`) if available.
- **Fallback:** use a minimal `secure_predict()` implemented in this notebook.

Important:  
In Step 4.1 we detected that the dataset window may be **unscaled** (very large magnitudes).  
To keep the pipeline stable, we also build a **safe window** (synthetic, normalized) and test secure inference on both:
1. `X_window` (dataset window, may be unscaled)
2. `X_window_safe` (guaranteed valid window for the model)


In [14]:
import os, sys
import numpy as np

# ---------------------------------------------------------------------
# (A) Ensure repo root is on sys.path (so we can import src/ from notebooks/)
# ---------------------------------------------------------------------
repo_root = os.path.abspath(os.path.join(os.getcwd(), ".."))
if repo_root not in sys.path:
    sys.path.insert(0, repo_root)

# ---------------------------------------------------------------------
# (B) Derive expected (W, C) from the model input metadata
#     input_shape is like ['unk__121', 100, 7]
# ---------------------------------------------------------------------
_, W, C = input_shape
W, C = int(W), int(C)

# ---------------------------------------------------------------------
# (C) Build a "safe" window that is guaranteed to pass validation
#     (synthetic normalized IMU-like data in [-1, 1])
# ---------------------------------------------------------------------
rng = np.random.default_rng(0)
X_window_safe = rng.uniform(-1.0, 1.0, size=(W, C)).astype(np.float32)

# ---------------------------------------------------------------------
# (D) Local (fallback) validation + secure predict
# ---------------------------------------------------------------------
def validate_window_local(window: np.ndarray,
                          expected_shape=(W, C),
                          expected_dtype=np.float32,
                          min_value=-4.0,
                          max_value=4.0):
    if not isinstance(window, np.ndarray):
        return False, "not_numpy_array"
    if window.shape != expected_shape:
        return False, f"wrong_shape_expected_{expected_shape}_got_{window.shape}"
    if window.dtype != expected_dtype:
        return False, f"wrong_dtype_expected_{expected_dtype}_got_{window.dtype}"
    if not np.isfinite(window).all():
        return False, "contains_nan_or_inf"
    mn, mx = float(window.min()), float(window.max())
    if mn < min_value or mx > max_value:
        return False, f"out_of_range_expected_{min_value}_to_{max_value}_got_{mn:.3f}_to_{mx:.3f}"
    return True, None

def softmax(x: np.ndarray) -> np.ndarray:
    x = x - np.max(x)
    ex = np.exp(x)
    return ex / (np.sum(ex) + 1e-12)

def secure_predict_fallback(session, input_name: str, output_name: str, window: np.ndarray):
    ok, err = validate_window_local(window)
    if not ok:
        return {"ok": False, "error": err, "prediction": None, "confidence": None}

    x = window[None, :, :].astype(np.float32)  # (1, W, C)
    out = session.run([output_name], {input_name: x})[0][0]  # (num_classes,)

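    # If the model already ends in a softmax layer (non-negative outputs that
    # sum to ~1), a second softmax would flatten the scores, so skip it.
    already_probs = bool(np.all(out >= 0.0) and np.isclose(float(np.sum(out)), 1.0, atol=1e-3))
    probs = out if already_probs else softmax(out)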
    pred = int(np.argmax(probs))
    conf = float(probs[pred])

    return {"ok": True, "error": None, "prediction": pred, "confidence": conf}

# ---------------------------------------------------------------------
# (E) Preferred: try to import secure_predict from src (Unit 2.2)
#     If it fails (missing / still stub / raises errors), we fall back.
# ---------------------------------------------------------------------
secure_predict_src = None
try:
    from src.edge_inference_secure import secure_predict as secure_predict_src
    print("Imported secure_predict() from src.edge_inference_secure ✅")
except Exception as e:
    print("Could not import src.secure_predict(). Will use fallback. Reason:", repr(e))

def secure_predict_hybrid(session, input_name: str, output_name: str, window: np.ndarray):
    if secure_predict_src is not None:
        try:
            # Note: src implementation should accept (session, input_name, output_name, window)
            return secure_predict_src(session, input_name, output_name, window)
        except NotImplementedError:
            print("[hybrid] src.secure_predict is a placeholder. Using fallback.")
        except Exception as e:
            print("[hybrid] src.secure_predict failed. Using fallback. Reason:", repr(e))
    return secure_predict_fallback(session, input_name, output_name, window)

# ---------------------------------------------------------------------
# (F) Run secure inference on:
#     1) dataset window (may fail validation due to scaling)
#     2) safe synthetic window (should pass)
# ---------------------------------------------------------------------
print("\n--- Secure inference on dataset window (X_window) ---")
result_dataset = secure_predict_hybrid(session, input_name, output_name, X_window)
print("Result:", result_dataset)

print("\n--- Secure inference on safe window (X_window_safe) ---")
result_safe = secure_predict_hybrid(session, input_name, output_name, X_window_safe)
print("Result:", result_safe)

# We'll use the safe result for downstream steps if the dataset one is rejected.
secure_result = result_dataset if result_dataset["ok"] else result_safe
print("\nSelected secure_result for downstream steps:", secure_result)



Imported secure_predict() from src.edge_inference_secure ✅

--- Secure inference on dataset window (X_window) ---
Result: {'ok': False, 'error': 'out_of_range_expected_-4.0_to_4.0', 'prediction': None, 'confidence': None}

--- Secure inference on safe window (X_window_safe) ---
Result: {'ok': True, 'error': None, 'prediction': 1, 'confidence': 0.18003952503204346}

Selected secure_result for downstream steps: {'ok': True, 'error': None, 'prediction': 1, 'confidence': 0.18003952503204346}
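
To see the kinds of malformed input the validation layer is meant to catch, the sketch below runs a compact stand-in validator over a few deliberately broken windows. `check_window` mirrors the checks in `validate_window_local` but is self-contained and returns simplified error codes; the `(100, 7)` shape and the `[-4, 4]` range are taken from the cells above.

```python
import numpy as np


def check_window(window, expected_shape=(100, 7), lo=-4.0, hi=4.0):
    """Return None if the window is valid, else a short error code."""
    if not isinstance(window, np.ndarray):
        return "not_numpy_array"
    if window.shape != expected_shape:
        return "wrong_shape"
    if window.dtype != np.float32:
        return "wrong_dtype"
    if not np.isfinite(window).all():
        return "contains_nan_or_inf"
    if window.min() < lo or window.max() > hi:
        return "out_of_range"
    return None


good = np.zeros((100, 7), dtype=np.float32)
with_nan = good.copy()
with_nan[3, 2] = np.nan

cases = {
    "valid": good,
    "python list": good.tolist(),
    "wrong shape": good[:50],
    "float64": good.astype(np.float64),
    "contains NaN": with_nan,
    "unscaled": good + np.float32(1007.5),
}
errors = {name: check_window(w) for name, w in cases.items()}
for name, err in errors.items():
    print(f"{name:>12}: {err}")
```

The checks run from cheapest to most expensive, so a wrong type or shape is rejected before any element-wise scan of the data.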


### 4.4 Decision Logic and UX Mapping

At this stage, the system has a *secure inference result* produced by the Safe Inference layer.

The goal of this step is to translate the model output into a **user-facing decision**, taking into account:
- Whether inference was successful
- The predicted class
- The confidence score
- Conservative safety thresholds

This logic bridges **ML outputs** and **UX/system behavior**, which is a critical aspect of on-device AI systems.


#### Decision Policy

We apply the following simple policy:

- If inference failed → **No Action**
- If confidence ≥ high threshold (default 0.7) → **Automatic Action**
- If confidence is between the low and high thresholds (default 0.3 to 0.7) → **Request Human Confirmation**
- If confidence < low threshold (default 0.3) → **No Action (failsafe)**

These thresholds are illustrative and would normally be tuned per application.


In [15]:
def decision_logic(secure_result,
                   high_conf_threshold=0.7,
                   low_conf_threshold=0.3):
    """
    Map secure inference results to UX/system actions.
    
    Returns a dictionary describing the chosen action.
    """

    if not secure_result.get("ok", False):
        return {
            "action": "NO_ACTION",
            "reason": "Inference failed",
            "prediction": None,
            "confidence": None
        }

    # Treat a missing or None confidence as 0.0 so the comparisons below cannot fail
    confidence = secure_result.get("confidence") or 0.0
    prediction = secure_result.get("prediction")

    if confidence >= high_conf_threshold:
        return {
            "action": "AUTO_CONFIRM",
            "reason": "High confidence prediction",
            "prediction": prediction,
            "confidence": confidence
        }

    if confidence >= low_conf_threshold:
        return {
            "action": "REQUEST_CONFIRMATION",
            "reason": "Medium confidence prediction",
            "prediction": prediction,
            "confidence": confidence
        }

    return {
        "action": "NO_ACTION",
        "reason": "Low confidence prediction (failsafe)",
        "prediction": prediction,
        "confidence": confidence
    }


In [16]:
decision = decision_logic(secure_result)

print("Decision logic output:")
for k, v in decision.items():
    print(f"  {k}: {v}")


Decision logic output:
  action: NO_ACTION
  reason: Low confidence prediction (failsafe)
  prediction: 1
  confidence: 0.18003952503204346


#### Expected Outcome

Depending on the confidence level produced in Step 4.3, the system will:

- Automatically confirm the prediction
- Ask for human confirmation
- Or take no action at all

This ensures that **model uncertainty does not directly translate into unsafe or confusing user experiences**.
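
The decision dictionary carries an action but no user-facing text. One way to bridge it back to the message and severity fields used in Lab Step 1 is a small lookup table, sketched below. `ACTION_TO_UX`, `decision_to_ux`, and the message wording are illustrative assumptions, not part of the course code.

```python
from typing import Any, Dict

# Illustrative message and severity for each action (assumed wording).
ACTION_TO_UX = {
    "AUTO_CONFIRM": ("Activity recognised.", "low"),
    "REQUEST_CONFIRMATION": ("Please confirm the detected activity.", "medium"),
    "NO_ACTION": ("", "low"),
}


def decision_to_ux(decision: Dict[str, Any]) -> Dict[str, Any]:
    """Attach a user-facing message and severity to a decision dict."""
    action = decision.get("action", "NO_ACTION")
    # Unknown actions fall back to NO_ACTION so nothing unexpected is shown.
    message, severity = ACTION_TO_UX.get(action, ACTION_TO_UX["NO_ACTION"])
    return {**decision, "message": message, "severity": severity}


example = {
    "action": "REQUEST_CONFIRMATION",
    "reason": "Medium confidence prediction",
    "prediction": 1,
    "confidence": 0.55,
}
ux = decision_to_ux(example)
unknown = decision_to_ux({"action": "SOMETHING_ELSE"})
print(ux["message"], "/", ux["severity"])
```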


### 4.5 Failsafe Mechanisms

Even with secure validation, on-device predictions can still be *uncertain* (low confidence) or *unstable* (confidence fluctuates).

Failsafe mechanisms ensure the system behaves conservatively by:
- suppressing actions when confidence is low
- requiring repeated agreement over multiple windows (“stability over time”)
- preventing repeated triggers (“cooldown”)

In this step we run two mini-simulations:

1) **Realistic stream (based on the current secure result):**  
   Confidence remains low, so the system should consistently choose **NO_ACTION**.

2) **Demonstration stream (intentionally mixed):**  
   We inject a short high-confidence streak to show how the failsafe gate:
   - only allows action after **K consecutive high-confidence windows**
   - applies **cooldown** after an action


In [18]:
import numpy as np

def failsafe_gate(decisions,
                  require_consecutive=3,
                  min_conf_for_action=0.7,
                  cooldown_steps=5):
    """
    Apply failsafe gating on a sequence of decisions.

    Policy:
    - Only allow AUTO_CONFIRM if confidence >= min_conf_for_action for
      `require_consecutive` consecutive windows.
    - After an allowed AUTO_CONFIRM, enforce a cooldown period where actions
      are suppressed (NO_ACTION).
    - Every other proposal, including REQUEST_CONFIRMATION, is emitted as
      NO_ACTION by this gate.

    Returns:
      gated_actions: list of action strings after failsafe gating
    """
    gated_actions = []
    streak = 0
    cooldown = 0

    for d in decisions:
        conf = d.get("confidence") if d.get("confidence") is not None else 0.0
        proposed = d.get("action", "NO_ACTION")

        if cooldown > 0:
            gated_actions.append("NO_ACTION")
            cooldown -= 1
            streak = 0
            continue

        if conf >= min_conf_for_action:
            streak += 1
        else:
            streak = 0

        if proposed == "AUTO_CONFIRM" and streak >= require_consecutive:
            gated_actions.append("AUTO_CONFIRM")
            cooldown = cooldown_steps
            streak = 0
        else:
            gated_actions.append("NO_ACTION")

    return gated_actions

def propose_action_from_conf(conf: float) -> str:
    """Simple proposal policy before failsafe gating."""
    if conf >= 0.7:
        return "AUTO_CONFIRM"
    if conf >= 0.3:
        return "REQUEST_CONFIRMATION"
    return "NO_ACTION"

def run_simulation(conf_stream, title, require_consecutive=3, cooldown_steps=3):
    proposed_decisions = []
    for c in conf_stream:
        proposed_decisions.append({
            "action": propose_action_from_conf(float(c)),
            "confidence": float(c),
            "prediction": secure_result.get("prediction", None),
            "reason": "simulated"
        })

    gated = failsafe_gate(
        proposed_decisions,
        require_consecutive=require_consecutive,
        min_conf_for_action=0.7,
        cooldown_steps=cooldown_steps
    )

    print(f"\n=== {title} ===")
    print("Confidence stream:")
    print([round(float(c), 3) for c in conf_stream])
    print("Proposed actions (before failsafe):")
    print([d["action"] for d in proposed_decisions])
    print("Gated actions (after failsafe):")
    print(gated)

# -------------------------------------------------------------
# Simulation 1: Realistic stream around current secure_result confidence
# -------------------------------------------------------------
rng = np.random.default_rng(1)
base_conf = float(secure_result.get("confidence") or 0.0)
conf_stream_realistic = np.clip(base_conf + rng.normal(0, 0.08, size=12), 0.0, 1.0)

run_simulation(
    conf_stream_realistic,
    title="Simulation 1 (Realistic): Low-confidence stream → conservative behavior",
    require_consecutive=3,
    cooldown_steps=3
)

# -------------------------------------------------------------
# Simulation 2: Demonstration stream that includes a high-confidence streak
# (to show the gate + cooldown in action)
# -------------------------------------------------------------
conf_stream_demo = np.array([
    0.25, 0.35, 0.55,   # includes REQUEST_CONFIRMATION, but not AUTO_CONFIRM
    0.72, 0.78, 0.81,   # 3 consecutive highs → AUTO_CONFIRM allowed once
    0.85, 0.90,         # would be high, but cooldown suppresses
    0.40, 0.20,         # drop back down
    0.75, 0.80, 0.82    # another streak → may allow again after cooldown
], dtype=np.float32)

run_simulation(
    conf_stream_demo,
    title="Simulation 2 (Demo): High-confidence streak → gated AUTO_CONFIRM + cooldown",
    require_consecutive=3,
    cooldown_steps=3
)




=== Simulation 1 (Realistic): Low-confidence stream → conservative behavior ===
Confidence stream:
[0.208, 0.246, 0.206, 0.076, 0.252, 0.216, 0.137, 0.227, 0.209, 0.204, 0.182, 0.224]
Proposed actions (before failsafe):
['NO_ACTION', 'NO_ACTION', 'NO_ACTION', 'NO_ACTION', 'NO_ACTION', 'NO_ACTION', 'NO_ACTION', 'NO_ACTION', 'NO_ACTION', 'NO_ACTION', 'NO_ACTION', 'NO_ACTION']
Gated actions (after failsafe):
['NO_ACTION', 'NO_ACTION', 'NO_ACTION', 'NO_ACTION', 'NO_ACTION', 'NO_ACTION', 'NO_ACTION', 'NO_ACTION', 'NO_ACTION', 'NO_ACTION', 'NO_ACTION', 'NO_ACTION']

=== Simulation 2 (Demo): High-confidence streak → gated AUTO_CONFIRM + cooldown ===
Confidence stream:
[0.25, 0.35, 0.55, 0.72, 0.78, 0.81, 0.85, 0.9, 0.4, 0.2, 0.75, 0.8, 0.82]
Proposed actions (before failsafe):
['NO_ACTION', 'REQUEST_CONFIRMATION', 'REQUEST_CONFIRMATION', 'AUTO_CONFIRM', 'AUTO_CONFIRM', 'AUTO_CONFIRM', 'AUTO_CONFIRM', 'AUTO_CONFIRM', 'REQUEST_CONFIRMATION', 'NO_ACTION', 'AUTO_CONFIRM', 'AUTO_CONFIRM', 'AUTO_C

#### Interpretation

- The **realistic simulation** often results in NO_ACTION throughout, which is expected when the model is uncertain.
- The **demo simulation** illustrates how the failsafe logic behaves:
  - actions are only allowed after stable high confidence
  - cooldown prevents repeated triggers

This reinforces the engineering goal: **prefer conservative behavior unless the model is consistently confident.**
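
`failsafe_gate` processes a complete list of decisions, whereas a real-time system sees one window at a time. The sketch below restates the same streak and cooldown rules as a small stateful class; `StreamingFailsafeGate` is a hypothetical name, and the confidence values are the demo stream from the cell above.

```python
class StreamingFailsafeGate:
    """Hypothetical one-window-at-a-time version of failsafe_gate."""

    def __init__(self, require_consecutive: int = 3,
                 min_conf_for_action: float = 0.7,
                 cooldown_steps: int = 3) -> None:
        self.require_consecutive = require_consecutive
        self.min_conf_for_action = min_conf_for_action
        self.cooldown_steps = cooldown_steps
        self.streak = 0
        self.cooldown = 0

    def step(self, action: str, confidence: float) -> str:
        """Return the gated action for a single proposed decision."""
        if self.cooldown > 0:
            # Still cooling down: suppress everything and reset the streak.
            self.cooldown -= 1
            self.streak = 0
            return "NO_ACTION"

        if confidence >= self.min_conf_for_action:
            self.streak += 1
        else:
            self.streak = 0

        if action == "AUTO_CONFIRM" and self.streak >= self.require_consecutive:
            self.cooldown = self.cooldown_steps
            self.streak = 0
            return "AUTO_CONFIRM"
        return "NO_ACTION"


gate = StreamingFailsafeGate()
confs = [0.25, 0.35, 0.55, 0.72, 0.78, 0.81, 0.85, 0.90, 0.40, 0.20, 0.75, 0.80, 0.82]
gated = [gate.step("AUTO_CONFIRM" if c >= 0.7 else "NO_ACTION", c) for c in confs]
allowed_at = [i for i, a in enumerate(gated) if a == "AUTO_CONFIRM"]
print(gated)
print("AUTO_CONFIRM allowed at indices:", allowed_at)
```

Keeping the streak and cooldown counters on an object lets the gate sit directly in an inference loop. As in the batch version, proposals other than `AUTO_CONFIRM` come out as `NO_ACTION`; whether confirmation requests should instead pass through to the human step is a design choice.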



### 4.6 Human-in-the-Loop Interaction

Even with secure validation and failsafe gating, edge AI systems often require a **Human-in-the-Loop (HITL)** mechanism.

Why HITL matters:
- Some decisions are safety-critical or ambiguous.
- Confidence may be moderate (not high enough for automatic action).
- The system may need confirmation, correction, or override.

In this step, we simulate a HITL workflow:

- If the system proposes `REQUEST_CONFIRMATION`, we ask for a user confirmation.
- Since this is a notebook, we simulate human feedback with a simple function:
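  - In the **realistic case**, we pass the decision from Step 4.4 through unchanged; a human is consulted only if that decision is `REQUEST_CONFIRMATION`.
  - In the **demo case**, we force a `REQUEST_CONFIRMATION` decision and show the accept, reject, and timeout outcomes.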

The output is a final decision that can be executed safely in Step 4.7.


In [20]:
def human_in_loop(decision,
                  human_response=None,
                  default_on_no_response="NO_ACTION"):
    """
    Simulate Human-in-the-Loop confirmation.

    Inputs:
      - decision: dict from decision_logic()
      - human_response: one of {"YES", "NO", None}
        * YES: accept prediction
        * NO : reject prediction
        * None: no response / timeout
      - default_on_no_response: fallback action if no response

    Returns:
      final_decision: dict with final action
    """
    action = decision.get("action", "NO_ACTION")

    # Only intervene when system requests confirmation
    if action != "REQUEST_CONFIRMATION":
        return {
            **decision,
            "final_action": action,
            "hitl_used": False,
            "human_response": None
        }

    # Handle human response
    if human_response == "YES":
        return {
            **decision,
            "final_action": "CONFIRMED_BY_HUMAN",
            "hitl_used": True,
            "human_response": "YES"
        }

    if human_response == "NO":
        return {
            **decision,
            "final_action": "REJECTED_BY_HUMAN",
            "hitl_used": True,
            "human_response": "NO"
        }

    # No response / timeout
    return {
        **decision,
        "final_action": default_on_no_response,
        "hitl_used": True,
        "human_response": None,
        "reason": (decision.get("reason", "") + " | No human response (timeout)").strip(" |")
    }


# -------------------------------------------------------------
# Realistic case: use the decision we already computed in Step 4.4
# (Often NO_ACTION when confidence is low)
# -------------------------------------------------------------
final_realistic = human_in_loop(decision, human_response=None)

print("=== Realistic case (using current decision) ===")
for k, v in final_realistic.items():
    print(f"  {k}: {v}")


# -------------------------------------------------------------
# Demo case: force a REQUEST_CONFIRMATION decision to show HITL behavior
# -------------------------------------------------------------
demo_decision_request = {
    "action": "REQUEST_CONFIRMATION",
    "reason": "Medium confidence prediction",
    "prediction": secure_result.get("prediction", 0),
    "confidence": 0.55
}

final_demo_yes = human_in_loop(demo_decision_request, human_response="YES")
final_demo_no  = human_in_loop(demo_decision_request, human_response="NO")
final_demo_none = human_in_loop(demo_decision_request, human_response=None)

print("\n=== Demo case (REQUEST_CONFIRMATION) - Human says YES ===")
for k, v in final_demo_yes.items():
    print(f"  {k}: {v}")

print("\n=== Demo case (REQUEST_CONFIRMATION) - Human says NO ===")
for k, v in final_demo_no.items():
    print(f"  {k}: {v}")

print("\n=== Demo case (REQUEST_CONFIRMATION) - No response (timeout) ===")
for k, v in final_demo_none.items():
    print(f"  {k}: {v}")


=== Realistic case (using current decision) ===
  action: NO_ACTION
  reason: Low confidence prediction (failsafe)
  prediction: 1
  confidence: 0.18003952503204346
  final_action: NO_ACTION
  hitl_used: False
  human_response: None

=== Demo case (REQUEST_CONFIRMATION) - Human says YES ===
  action: REQUEST_CONFIRMATION
  reason: Medium confidence prediction
  prediction: 1
  confidence: 0.55
  final_action: CONFIRMED_BY_HUMAN
  hitl_used: True
  human_response: YES

=== Demo case (REQUEST_CONFIRMATION) - Human says NO ===
  action: REQUEST_CONFIRMATION
  reason: Medium confidence prediction
  prediction: 1
  confidence: 0.55
  final_action: REJECTED_BY_HUMAN
  hitl_used: True
  human_response: NO

=== Demo case (REQUEST_CONFIRMATION) - No response (timeout) ===
  action: REQUEST_CONFIRMATION
  reason: Medium confidence prediction | No human response (timeout)
  prediction: 1
  confidence: 0.55
  final_action: NO_ACTION
  hitl_used: True
  human_response: None


#### Interpretation

- In the **realistic case**, the system often produces `NO_ACTION`, so HITL is not invoked.
- In the **demo case**, we show how HITL:
  - confirms uncertain predictions (`CONFIRMED_BY_HUMAN`)
  - rejects them (`REJECTED_BY_HUMAN`)
  - or defaults safely on timeout (`NO_ACTION`)

This reflects real deployment patterns where edge AI systems must support **confirmation and override** rather than relying purely on automation.
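
In the cell above, `human_response=None` stands in for a timeout. One way such a timeout could arise in practice is to wait on a queue that a UI callback fills. The sketch below is an assumption, not part of the lab code: `wait_for_human_response` and `timeout_s` are hypothetical names, and the queue stands in for whatever channel delivers the user's button press.

```python
import queue
from typing import Optional


def wait_for_human_response(response_queue: queue.Queue,
                            timeout_s: float = 5.0) -> Optional[str]:
    """Return "YES", "NO", or None (timeout or unrecognized input)."""
    try:
        raw = response_queue.get(timeout=timeout_s)
    except queue.Empty:
        return None  # nobody answered within timeout_s seconds
    answer = str(raw).strip().upper()
    # Anything other than a clear YES/NO is treated like no response.
    return answer if answer in ("YES", "NO") else None


# Usage: a UI thread would call responses.put(...); here we do it inline.
responses = queue.Queue()
responses.put(" yes ")
print(wait_for_human_response(responses, timeout_s=0.1))  # an answer is waiting
print(wait_for_human_response(responses, timeout_s=0.1))  # queue is empty, so this times out
```

The returned value can be passed directly as `human_response` to `human_in_loop`, so a slow or ambiguous answer ends in the same safe default as a timeout.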


### 4.7 Final System Action

In the final step, we convert the *final decision* into an explicit **system action**.

The purpose is to ensure the pipeline ends with an output that is:
- safe (never triggers on invalid/uncertain inputs)
- interpretable (clear reason codes)
- actionable (can be connected to UI, logs, or device behavior)

We map the `final_action` of the final decision to one of three system events:
- `NO_ACTION` (including the timeout fallback) → `NO_ACTION`: do nothing (failsafe)
- `AUTO_CONFIRM` / `CONFIRMED_BY_HUMAN` → `TRIGGER_ALERT`: trigger an application action (e.g., an alert)
- `REJECTED_BY_HUMAN` → `HUMAN_OVERRIDE`: record the override and do nothing


In [21]:
def map_to_system_event(final_decision):
    """
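    Convert a final decision dict into a system-level event.

    Any final action other than AUTO_CONFIRM, CONFIRMED_BY_HUMAN, or
    REJECTED_BY_HUMAN (including unknown values) maps to NO_ACTION.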
    """
    final_action = final_decision.get("final_action", final_decision.get("action", "NO_ACTION"))
    prediction = final_decision.get("prediction")
    confidence = final_decision.get("confidence")
    reason = final_decision.get("reason", "")

    # Default event structure
    event = {
        "event_type": "NO_ACTION",
        "message": "No action taken.",
        "prediction": prediction,
        "confidence": confidence,
        "reason": reason
    }

    if final_action in ("AUTO_CONFIRM", "CONFIRMED_BY_HUMAN"):
        event["event_type"] = "TRIGGER_ALERT"
        event["message"] = f"Action triggered for class={prediction} (final_action={final_action})."
        return event

    if final_action == "REJECTED_BY_HUMAN":
        event["event_type"] = "HUMAN_OVERRIDE"
        event["message"] = "Human rejected the suggested prediction. No action taken."
        return event

    # NO_ACTION or timeout fallback → keep defaults
    return event


# -------------------------------------------------------------
# Realistic case: use the result from Step 4.6
# -------------------------------------------------------------
system_event_realistic = map_to_system_event(final_realistic)

print("=== System event (Realistic case) ===")
for k, v in system_event_realistic.items():
    print(f"  {k}: {v}")


# -------------------------------------------------------------
# Demo cases: show how the final system action changes
# -------------------------------------------------------------
system_event_demo_yes = map_to_system_event(final_demo_yes)
system_event_demo_no = map_to_system_event(final_demo_no)
system_event_demo_none = map_to_system_event(final_demo_none)

print("\n=== System event (Demo YES) ===")
for k, v in system_event_demo_yes.items():
    print(f"  {k}: {v}")

print("\n=== System event (Demo NO) ===")
for k, v in system_event_demo_no.items():
    print(f"  {k}: {v}")

print("\n=== System event (Demo timeout) ===")
for k, v in system_event_demo_none.items():
    print(f"  {k}: {v}")


=== System event (Realistic case) ===
  event_type: NO_ACTION
  message: No action taken.
  prediction: 1
  confidence: 0.18003952503204346
  reason: Low confidence prediction (failsafe)

=== System event (Demo YES) ===
  event_type: TRIGGER_ALERT
  message: Action triggered for class=1 (final_action=CONFIRMED_BY_HUMAN).
  prediction: 1
  confidence: 0.55
  reason: Medium confidence prediction

=== System event (Demo NO) ===
  event_type: HUMAN_OVERRIDE
  message: Human rejected the suggested prediction. No action taken.
  prediction: 1
  confidence: 0.55
  reason: Medium confidence prediction

=== System event (Demo timeout) ===
  event_type: NO_ACTION
  message: No action taken.
  prediction: 1
  confidence: 0.55
  reason: Medium confidence prediction | No human response (timeout)


#### Interpretation

- The **realistic case** often produces `NO_ACTION`, which is appropriate when confidence is low.
- The **demo YES** case produces a `TRIGGER_ALERT` event, showing how the system can act after confirmation.
- The **demo NO** case produces a `HUMAN_OVERRIDE` event, demonstrating safe rejection.
- The **timeout** case defaults safely to `NO_ACTION`.

This completes the integrated pipeline end-to-end with safety-by-design behavior.
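
The event dict is meant to be connected to a UI, logs, or device behavior. A minimal sketch of that last hop is a handler table keyed by `event_type`. The handler functions and the `dispatch_event` helper below are hypothetical names. Routing unknown event types to the no-op handler is an assumed failsafe choice, not something the lab prescribes.

```python
import logging
from typing import Any, Callable, Dict

logger = logging.getLogger("system_events")


def handle_no_action(event: Dict[str, Any]) -> str:
    logger.info("No action: %s", event.get("reason", ""))
    return "noop"


def handle_trigger_alert(event: Dict[str, Any]) -> str:
    # A real application would raise a UI alert or drive a device here.
    logger.warning("ALERT: %s", event.get("message", ""))
    return "alert_shown"


def handle_human_override(event: Dict[str, Any]) -> str:
    # Record the rejection so it can be reviewed later; take no action.
    logger.info("Override recorded: %s", event.get("reason", ""))
    return "override_logged"


HANDLERS: Dict[str, Callable[[Dict[str, Any]], str]] = {
    "NO_ACTION": handle_no_action,
    "TRIGGER_ALERT": handle_trigger_alert,
    "HUMAN_OVERRIDE": handle_human_override,
}


def dispatch_event(event: Dict[str, Any]) -> str:
    """Route an event to its handler; unknown types fall back to no-op."""
    handler = HANDLERS.get(event.get("event_type"), handle_no_action)
    return handler(event)


example = {"event_type": "TRIGGER_ALERT",
           "message": "Action triggered for class=1.",
           "reason": "Medium confidence prediction"}
print(dispatch_event(example))
```

Keeping the handlers in a table means a new event type needs only a new entry, and anything unmapped does nothing rather than something unexpected.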


## Wrap-Up and Key Takeaways

In this unit, you implemented and validated a **complete end-to-end secure AI pipeline**, moving beyond isolated model execution to a **system-level perspective**.

### What You Built

You integrated all core elements of a trustworthy on-device AI system:

1. **Sensor data ingestion**  
   - Real or synthetic sensor windows aligned with the model interface.

2. **On-device model inference**  
   - Execution of an optimized ONNX model with explicit inspection of inputs and outputs.

3. **Secure inference and validation**  
   - A safety layer ensuring correct shape, type, and value ranges before inference.
   - Graceful handling of invalid or unsafe inputs.

4. **Decision logic and UX mapping**  
   - Translation of raw predictions into user-facing actions based on confidence thresholds.

5. **Failsafe mechanisms**  
   - Conservative behavior under uncertainty, noise, or unstable confidence streams.

6. **Human-in-the-Loop interaction**  
   - Optional human confirmation, rejection, or timeout handling.

7. **Final system action**  
   - A safe, interpretable system event (alert, override, or no action).
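
Items 4 and 5 can be condensed into a small three-band rule with two failsafe checks in front of it. The sketch below is illustrative only: `decide_action`, the thresholds `low=0.4` and `high=0.8`, and the spread-based stability check are assumptions. They are not the values or the `decision_logic()` used earlier in this notebook.

```python
import math
from typing import Any, Dict, Optional, Sequence


def decide_action(prediction: int,
                  recent_confidences: Sequence[float],
                  low: float = 0.4,         # assumed threshold
                  high: float = 0.8,        # assumed threshold
                  max_spread: float = 0.3   # assumed stability limit
                  ) -> Dict[str, Any]:
    """Map the latest confidence to NO_ACTION / REQUEST_CONFIRMATION / AUTO_CONFIRM."""

    def result(action: str, reason: str,
               confidence: Optional[float] = None) -> Dict[str, Any]:
        return {"action": action, "reason": reason,
                "prediction": prediction, "confidence": confidence}

    # Failsafe 1: reject empty, NaN, or out-of-range confidence values.
    if len(recent_confidences) == 0 or any(
            math.isnan(c) or not 0.0 <= c <= 1.0 for c in recent_confidences):
        return result("NO_ACTION", "Invalid confidence (failsafe)")

    confidence = recent_confidences[-1]

    # Failsafe 2: a widely varying confidence stream is treated as unstable.
    if max(recent_confidences) - min(recent_confidences) > max_spread:
        return result("NO_ACTION", "Unstable confidence (failsafe)", confidence)

    if confidence < low:
        return result("NO_ACTION", "Low confidence prediction (failsafe)", confidence)
    if confidence < high:
        return result("REQUEST_CONFIRMATION", "Medium confidence prediction", confidence)
    return result("AUTO_CONFIRM", "High confidence prediction", confidence)


for window in ([0.18], [0.55, 0.60], [0.90, 0.92], [0.20, 0.95]):
    print(window, "->", decide_action(1, window)["action"])
```

The returned dict uses the same keys as the decisions shown in Step 4.6, so it could be passed on to a confirmation step. Only the middle band ever asks the human.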

### Why This Matters

This lab demonstrates that **AI systems are not models alone**.

A production-ready AI solution must:
- Be robust to invalid inputs
- Avoid unsafe automation
- Communicate uncertainty
- Support human oversight
- Fail safely by design

These principles are essential in domains such as:
- Wearables and health monitoring
- Smart mobility and IoT
- Industrial edge AI
- Safety-critical human–machine interaction

### Key Insight

> **Trustworthy AI emerges from system design, not model accuracy alone.**

With this unit, you have completed **Module 2**, progressing through:
- model optimization (Unit 2.1),
- secure integration (Unit 2.2),
- full system orchestration (Unit 2.3).

You are now ready to move forward.
