## Demo POMDP: Early Lung Cancer Decision Model

This simplified partially observable Markov decision process (POMDP) models sequential medical decision-making under uncertainty for early-stage lung cancer.

### States (Hidden True Condition)
- `Healthy`: No disease
- `PreCancer`: Early signs of potential cancer
- `Cancer`: Developed cancer

### Actions (Medical Choices)
- `Wait`: Take no action, monitor
- `OrderCTScan`: Request a CT scan to get more information
- `StartTreatment`: Begin treatment immediately

### Observations
- `NoAbnormality`: Scan appears normal
- `SuspiciousFinding`: Scan shows something abnormal

In [1]:
states = ["Healthy", "PreCancer", "Cancer"]
actions = ["Wait", "OrderCTScan", "StartTreatment"]
observations = ["NoAbnormality", "SuspiciousFinding"]

## Transition Model: P(s' | s, a)

This defines how the patient's **true health state** changes after each action.

### Rules:

- **Wait**:
  - Healthy stays Healthy most of the time, but can develop PreCancer.
  - PreCancer may progress to Cancer.
  - Cancer remains Cancer.

- **OrderCTScan**:
  - Does **not affect the health state** — it's purely diagnostic.

- **StartTreatment**:
  - Can cause disease to **regress** from `PreCancer` or `Cancer` to an earlier state, with some probability.
  - Leaves a `Healthy` patient unchanged and never lets the disease progress.

These transition probabilities are defined in a nested dictionary structure:  
`T[action][current_state][next_state]`

In [2]:
import numpy as np

# Transition probabilities: T[action][current_state][next_state]
T = {
    "Wait": {
        "Healthy":     {"Healthy": 0.9, "PreCancer": 0.1, "Cancer": 0.0},
        "PreCancer":   {"Healthy": 0.0, "PreCancer": 0.7, "Cancer": 0.3},
        "Cancer":      {"Healthy": 0.0, "PreCancer": 0.0, "Cancer": 1.0},
    },
    "OrderCTScan": {
        "Healthy":     {"Healthy": 1.0, "PreCancer": 0.0, "Cancer": 0.0},
        "PreCancer":   {"Healthy": 0.0, "PreCancer": 1.0, "Cancer": 0.0},
        "Cancer":      {"Healthy": 0.0, "PreCancer": 0.0, "Cancer": 1.0},
    },
    "StartTreatment": {
        "Healthy":     {"Healthy": 1.0, "PreCancer": 0.0, "Cancer": 0.0},
        "PreCancer":   {"Healthy": 0.6, "PreCancer": 0.4, "Cancer": 0.0},
        "Cancer":      {"Healthy": 0.2, "PreCancer": 0.3, "Cancer": 0.5},
    }
}
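
Each inner dictionary in a table like this is a probability distribution, so its values should sum to 1. The following is a minimal sketch of such a sanity check, not part of the original model; the helper name `check_rows_sum_to_one` and the tolerance are assumptions made for illustration, and a small excerpt of the table is repeated so the block runs on its own.

```python
import math

def check_rows_sum_to_one(model, tol=1e-9):
    """Raise ValueError if any model[action][state] distribution does not sum to 1."""
    for action, rows in model.items():
        for state, dist in rows.items():
            total = sum(dist.values())
            if not math.isclose(total, 1.0, abs_tol=tol):
                raise ValueError(f"{action}/{state} sums to {total}, expected 1.0")
    return True

# Excerpt of the transition table, repeated here so the sketch is self-contained
T_excerpt = {
    "Wait": {
        "Healthy":   {"Healthy": 0.9, "PreCancer": 0.1, "Cancer": 0.0},
        "PreCancer": {"Healthy": 0.0, "PreCancer": 0.7, "Cancer": 0.3},
        "Cancer":    {"Healthy": 0.0, "PreCancer": 0.0, "Cancer": 1.0},
    },
}
rows_ok = check_rows_sum_to_one(T_excerpt)
```

In the notebook, the same helper could be called on the full `T` dictionary; because it only assumes the `model[action][state][outcome]` nesting, it should apply to any table with that shape.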

## Observation Model: P(o | s', a)

This defines how likely a doctor is to observe certain findings, given the **resulting state** `s'` after taking action `a`.

### Rules:

- **OrderCTScan**:
  - High chance of detecting abnormalities if the patient is in `PreCancer` or `Cancer`.
  - Small chance of false positives if the patient is `Healthy`.

- **Wait** and **StartTreatment**:
  - Do **not generate new diagnostic information**: the observation is always `NoAbnormality`.

Observations depend on the **resulting state** `s'` reached after the transition, not on the state before the action.

Structure:  
`O[action][resulting_state][observation]`

In [3]:
# Observation model: O[action][resulting_state][observation]
O = {
    "Wait": {
        "Healthy":     {"NoAbnormality": 1.0, "SuspiciousFinding": 0.0},
        "PreCancer":   {"NoAbnormality": 1.0, "SuspiciousFinding": 0.0},
        "Cancer":      {"NoAbnormality": 1.0, "SuspiciousFinding": 0.0},
    },
    "OrderCTScan": {
        "Healthy":     {"NoAbnormality": 0.95, "SuspiciousFinding": 0.05},
        "PreCancer":   {"NoAbnormality": 0.3,  "SuspiciousFinding": 0.7},
        "Cancer":      {"NoAbnormality": 0.1,  "SuspiciousFinding": 0.9},
    },
    "StartTreatment": {
        "Healthy":     {"NoAbnormality": 1.0, "SuspiciousFinding": 0.0},
        "PreCancer":   {"NoAbnormality": 1.0, "SuspiciousFinding": 0.0},
        "Cancer":      {"NoAbnormality": 1.0, "SuspiciousFinding": 0.0},
    }
}

## Diagnosis Update (Bayes' Rule)

After taking an action and receiving an observation, we update our **diagnosis** — the probability distribution over the patient's true condition — using:

$$
\text{diagnosis}(s') = \eta \cdot P(o \mid s', a) \cdot \sum_{s} P(s' \mid s, a) \cdot \text{diagnosis}(s)
$$

Where:

- `diagnosis(s)`: current diagnosis for state $s$, i.e. the subjective probability that the patient is in $s$ (the *belief state* in POMDP terminology)
- $P(s' \mid s, a)$: transition probability to resulting state $s'$
- $P(o \mid s', a)$: observation likelihood in resulting state $s'$
- $\eta$: normalization constant to ensure probabilities sum to 1
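
As a worked instance of the formula, take a uniform diagnosis of $1/3$ per state, the action `OrderCTScan`, and the observation `SuspiciousFinding`. The scan leaves the state unchanged, so the sum over $s$ is simply $1/3$ for every $s'$. Writing $\tilde{d}(s')$ for the unnormalized value (a symbol introduced here only for this example) and using the scan likelihoods 0.05, 0.7, and 0.9 from the observation model:

$$
\begin{aligned}
\tilde{d}(\text{Healthy}) &= 0.05 \cdot \tfrac{1}{3} \approx 0.0167 \\
\tilde{d}(\text{PreCancer}) &= 0.7 \cdot \tfrac{1}{3} \approx 0.2333 \\
\tilde{d}(\text{Cancer}) &= 0.9 \cdot \tfrac{1}{3} = 0.3 \\
\eta &= \frac{1}{0.0167 + 0.2333 + 0.3} = \frac{1}{0.55}
\end{aligned}
$$

Multiplying each $\tilde{d}(s')$ by $\eta$ gives an updated diagnosis of roughly 0.030 for `Healthy`, 0.424 for `PreCancer`, and 0.545 for `Cancer`.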

This update is implemented in the `update_diagnosis` function.

In [4]:
def update_diagnosis(diagnosis, action, observation, states, T, O):
    updated = {}
    for s_prime in states:
        prob_obs = O[action][s_prime][observation]
        total = 0.0
        for s in states:
            total += T[action][s][s_prime] * diagnosis[s]
        updated[s_prime] = prob_obs * total
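    # Normalize; a zero total means the observation is impossible under the
    # model for this action (e.g. "SuspiciousFinding" after "Wait")
    total_prob = sum(updated.values())
    if total_prob == 0:
        raise ValueError(
            f"Observation {observation!r} has zero probability after action {action!r}"
        )
    for s in states:
        updated[s] /= total_prob
    return updated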

## Simulate One Action-Observation Cycle

This step simulates what happens when the doctor takes an action:

1. A **true state** is sampled based on the current diagnosis (probabilities over health states).
2. The system transitions to a **resulting state** based on the transition model.
3. An **observation** is generated based on the resulting state and the action taken.
4. The doctor **updates their diagnosis** using Bayes' Rule, incorporating the new observation.

This logic is implemented in the `simulate_step` function, which returns:
- The previous (sampled) true state
- The resulting state after transition
- The generated observation
- The updated diagnosis

In [5]:
import random

def simulate_step(diagnosis, action, states, T, O):
    # Sample a true resulting state based on current diagnosis and transition model
    current_state = random.choices(states, weights=[diagnosis[s] for s in states])[0]
    resulting_state = random.choices(
        states,
        weights=[T[action][current_state][s_prime] for s_prime in states]
    )[0]

    # Sample an observation based on the resulting state and action.
    # Observation names are read from O instead of relying on a global list.
    obs_dist = O[action][resulting_state]
    obs_names = list(obs_dist)
    observation = random.choices(obs_names, weights=[obs_dist[o] for o in obs_names])[0]

    # Update diagnosis
    updated_diagnosis = update_diagnosis(diagnosis, action, observation, states, T, O)

    return {
        "previous_state": current_state,
        "resulting_state": resulting_state,
        "observation": observation,
        "updated_diagnosis": updated_diagnosis
    }
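
Steps 1 to 3 above all draw from a discrete distribution with `random.choices`, so each run can produce a different outcome. The sketch below shows one way to make such draws repeatable by passing a seeded `random.Random` instance. The helper `sample_from` and the seed value are hypothetical and do not appear in the original notebook.

```python
import random

def sample_from(distribution, rng):
    """Draw one key from an {outcome: probability} dict using the given RNG."""
    outcomes = list(distribution)
    weights = [distribution[o] for o in outcomes]
    return rng.choices(outcomes, weights=weights)[0]

# Two generators with the same (arbitrary) seed produce identical draws
rng_a = random.Random(42)
rng_b = random.Random(42)

# CT scan observation distribution for a PreCancer patient, from the model above
scan_given_precancer = {"NoAbnormality": 0.3, "SuspiciousFinding": 0.7}

draws_a = [sample_from(scan_given_precancer, rng_a) for _ in range(1000)]
draws_b = [sample_from(scan_given_precancer, rng_b) for _ in range(1000)]

# Empirical frequency should land near the model probability of 0.7
suspicious_rate = draws_a.count("SuspiciousFinding") / len(draws_a)
```

Using a dedicated generator instead of the module-level functions keeps the simulation's randomness separate from any other code that also calls `random`.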

## Running a Sample Simulation Step

We start with an **initial diagnosis** (equal probability for all states) and simulate the outcome of taking the `"OrderCTScan"` action.

The simulation prints:
- The sampled previous state (hidden from the doctor)
- The resulting state after the transition (also hidden from the doctor)
- The observation received
- The updated diagnosis (revised probabilities for each state)

In [6]:
# Initial diagnosis: uniform uncertainty
initial_diagnosis = {
    "Healthy": 1/3,
    "PreCancer": 1/3,
    "Cancer": 1/3
}

# Simulate one decision step using "OrderCTScan"
result = simulate_step(initial_diagnosis, "OrderCTScan", states, T, O)

# Print results
print("Previous (sampled) true state:", result["previous_state"])
print("Resulting state after action:", result["resulting_state"])
print("Observation received:", result["observation"])
print("Updated diagnosis:")
for state, prob in result["updated_diagnosis"].items():
    print(f"  {state}: {prob:.3f}")

Previous (sampled) true state: Cancer
Resulting state after action: Cancer
Observation received: SuspiciousFinding
Updated diagnosis:
  Healthy: 0.030
  PreCancer: 0.424
  Cancer: 0.545
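
The same update can be chained to see how repeated evidence sharpens the diagnosis. The sketch below is a simplified, self-contained illustration rather than the notebook's own code: it assumes the doctor orders two scans in a row and both come back as `SuspiciousFinding`. Because `OrderCTScan` leaves the state unchanged, the transition sum drops out and only the observation likelihood is needed. The helper `scan_update` is a hypothetical name.

```python
def scan_update(diagnosis, likelihood):
    """Bayes update for an action with an identity transition (state unchanged)."""
    unnormalized = {s: likelihood[s] * p for s, p in diagnosis.items()}
    total = sum(unnormalized.values())
    return {s: v / total for s, v in unnormalized.items()}

# P(SuspiciousFinding | state) under OrderCTScan, copied from the observation model
suspicious_likelihood = {"Healthy": 0.05, "PreCancer": 0.7, "Cancer": 0.9}

diagnosis = {"Healthy": 1/3, "PreCancer": 1/3, "Cancer": 1/3}
history = [diagnosis]
for _ in range(2):  # two consecutive suspicious scans (assumed scenario)
    diagnosis = scan_update(diagnosis, suspicious_likelihood)
    history.append(diagnosis)

for step, d in enumerate(history):
    print(step, {s: round(p, 3) for s, p in d.items()})
```

Under these assumptions the first scan reproduces the diagnosis printed above, and the second raises the probability of `Cancer` from about 0.545 to about 0.622 while `Healthy` falls below 0.01.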
