# Tutorial 1: Understanding HERON

**Goal:** Understand what HERON solves and why its design matters.

**Time:** ~5 minutes

---

## The Problem

You want to train RL agents for a **power grid with multiple microgrids**. Each microgrid has devices (generators, batteries) managed by a coordinator. It seems simple until you account for the following:

| Challenge | What It Means |
|-----------|---------------|
| **Hierarchical Control** | Devices → Microgrids → System Operator |
| **Partial Observability** | A battery shouldn't see the entire grid |
| **Realistic Timing** | Real systems have delays and different update rates |
| **Swappable Coordination** | Test price signals vs setpoints vs consensus |

### The Typical Approach (Without HERON)

```python
import numpy as np

# Manually filter observations for each agent
def get_obs_for_battery(global_state):
    return global_state['battery_1']['soc']  # Manual filtering

def get_obs_for_microgrid(global_state):
    # np.array (not np.concatenate) because each entry is a scalar
    return np.array([
        global_state['battery_1']['soc'],
        global_state['gen_1']['output'],
        # ... 20 more lines of manual filtering
    ])

# Rewrite for each coordination protocol
if protocol == 'setpoint':
    ...  # 50 lines of setpoint logic
elif protocol == 'price_signal':
    ...  # 50 different lines
```

**Result:** 500+ lines of tightly coupled boilerplate that is hard to experiment with.

### With HERON

```python
class BatterySOC(FeatureProvider):
    visibility = ['owner', 'upper_level']  # Declarative - no filtering code
    soc: float = 0.5

class MyEnv(PettingZooParallelEnv):
    def __init__(self):
        super().__init__(env_id="my_env")
        self.register_agent(microgrid)  # HERON handles the rest
```

**Result:** ~50 lines, clean separation of concerns, easy experimentation.

## The Key Insight: Agent-Centric vs Environment-Centric

**PettingZoo (environment-centric):** The environment decides everything.
```python
# Environment controls what each agent sees
# Black box - how does env filter observations?
obs, rewards, terminations, truncations, infos = env.step(actions)
```

**HERON (agent-centric):** Agents are first-class citizens.
```python
# Each feature declares its own visibility
class BatterySOC(FeatureProvider):
    visibility = ['owner', 'upper_level']  # Battery + coordinator can see

# Agent observes based on its level in hierarchy
obs = agent.observe()  # Automatically filtered by visibility rules
```
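
To make the declarative idea concrete, here is a minimal, framework-free sketch of how visibility tags could drive observation filtering. It does not use HERON's classes: `Feature`, `roles_for`, `observe`, and the `parent_of` mapping are hypothetical names chosen for illustration, only the `owner`, `upper_level`, and `public` tags are modeled, and HERON's actual resolution rules may differ.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    """Hypothetical stand-in for a feature with declared visibility."""
    name: str
    owner: str    # ID of the agent that owns this feature
    value: float
    visibility: list = field(default_factory=lambda: ["owner"])


def roles_for(observer: str, feature: Feature, parent_of: dict) -> set:
    """Return the visibility roles the observer holds for this feature."""
    roles = {"public"}                       # every agent holds the public role
    if observer == feature.owner:
        roles.add("owner")
    if parent_of.get(feature.owner) == observer:
        roles.add("upper_level")             # observer coordinates the owner
    return roles


def observe(observer: str, features: list, parent_of: dict) -> dict:
    """Keep each feature whose visibility list shares a role with the observer."""
    return {
        f.name: f.value
        for f in features
        if roles_for(observer, f, parent_of) & set(f.visibility)
    }


# Assumed toy hierarchy: two batteries under one microgrid coordinator.
parent_of = {"bat_1": "mg_0", "bat_2": "mg_0"}
features = [
    Feature("bat_1.soc", owner="bat_1", value=0.5, visibility=["owner", "upper_level"]),
    Feature("bat_2.soc", owner="bat_2", value=0.8, visibility=["owner", "upper_level"]),
]

bat_1_obs = observe("bat_1", features, parent_of)  # only its own SOC
mg_0_obs = observe("mg_0", features, parent_of)    # both batteries
print(bat_1_obs)
print(mg_0_obs)
```

The filtering logic lives in one place and reads the tags, so adding a new feature means declaring its visibility rather than editing every per-agent observation function.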

### Why This Matters

| Traditional Approach | HERON Approach |
|---------------------|----------------|
| Observation filtering is manual | Visibility is declarative (`visibility = [...]`) |
| Protocol logic mixed with agent logic | Protocols are pluggable components |
| Single execution mode | Dual modes: sync (training) + event-driven (testing) |
| Agents are stateless policy wrappers | Agents have state, timing, hierarchy |

## The 4 Core Abstractions

### 1. FeatureProvider — Observable State
A piece of state with declared visibility.

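```python
from dataclasses import dataclass

import numpy as np
from heron.core.feature import FeatureProvider

@dataclass
class BatterySOC(FeatureProvider):
    visibility = ['owner', 'upper_level']  # Who can see this
    soc: float = 0.5

    def vector(self) -> np.ndarray:
        return np.array([self.soc])
```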

### 2. Agent — Autonomous Entity
Has state, observes, acts. Two types: `FieldAgent` (devices) and `CoordinatorAgent` (managers).

```python
class SimpleBattery(FieldAgent):
    tick_interval = 1.0  # For event-driven mode
    
    def observe(self, global_state=None):
        return self.state.vector()
    
    def step(self, action):
        self.state.features['soc'].soc += action[0] * 0.1
```

### 3. Protocol — Coordination Mechanism
Defines how coordinators and devices communicate. Swap without changing agent code.

```python
microgrid = SimpleMicrogrid(protocol=SetpointProtocol())      # Direct control
# OR
microgrid = SimpleMicrogrid(protocol=PriceSignalProtocol())   # Market-based
# OR
microgrid = SimpleMicrogrid(protocol=ConsensusProtocol())     # Distributed
```
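
The swap works because the coordinator only talks to a common interface. The following is a rough sketch of that pattern in plain Python, not HERON's implementation: `ToySetpointProtocol`, `ToyPriceSignalProtocol`, `ToyCoordinator`, and the `messages` method are all hypothetical, and the even split of power is an assumption made for the example.

```python
class ToySetpointProtocol:
    """Direct control: split the requested power evenly into setpoints."""

    def messages(self, decision, device_ids):
        share = decision / len(device_ids)
        return {d: {"setpoint": share} for d in device_ids}


class ToyPriceSignalProtocol:
    """Market-based: broadcast one price and let devices decide locally."""

    def messages(self, decision, device_ids):
        return {d: {"price": decision} for d in device_ids}


class ToyCoordinator:
    """Coordinator code is identical regardless of the protocol it is given."""

    def __init__(self, device_ids, protocol):
        self.device_ids = device_ids
        self.protocol = protocol

    def coordinate(self, decision):
        # Delegate message construction to whichever protocol was plugged in.
        return self.protocol.messages(decision, self.device_ids)


devices = ["mg_0_bat", "mg_0_gen"]
setpoint_msgs = ToyCoordinator(devices, ToySetpointProtocol()).coordinate(10.0)
price_msgs = ToyCoordinator(devices, ToyPriceSignalProtocol()).coordinate(10.0)
print(setpoint_msgs)
print(price_msgs)
```

Only the constructor argument changes between the two runs; `ToyCoordinator.coordinate` is untouched, which is the property the protocol abstraction is meant to give you.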

### 4. Environment Adapter — RL Framework Bridge
Wraps HERON agents for RLlib/StableBaselines compatibility.

```python
class MyEnv(PettingZooParallelEnv):  # HERON adapter
    def __init__(self):
        super().__init__(env_id="my_env")
        self.register_agent(agent)  # HERON tracks agents
        # Dual modes: step() for sync, run_event_driven() for async
```

## Dual Execution Modes

HERON's key differentiator: **train fast, test realistically**.

### Synchronous Mode (Training)
All agents step together. Fast and deterministic for RL training.

```
t=0: All observe → All act → Environment steps → Rewards
t=1: All observe → All act → Environment steps → Rewards
```
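
As an illustration of that lockstep loop, here is a small self-contained sketch. It is not HERON's `step()` implementation: `ToyBattery`, `sync_step`, the placeholder policy, and the reward are assumptions for the example. The SOC update mirrors the `SimpleBattery` rule above, with clipping to [0, 1] added.

```python
class ToyBattery:
    """Hypothetical device: state of charge nudged by a scalar action."""

    def __init__(self, soc=0.5):
        self.soc = soc

    def observe(self):
        return self.soc

    def step(self, action):
        # soc += action * 0.1, clipped to the valid range [0, 1]
        self.soc = min(1.0, max(0.0, self.soc + action * 0.1))


def sync_step(agents, policies):
    """One synchronous tick: all observe, then all act, then rewards."""
    observations = {name: a.observe() for name, a in agents.items()}
    actions = {name: policies[name](obs) for name, obs in observations.items()}
    for name, agent in agents.items():
        agent.step(actions[name])
    # Assumed toy reward: stay close to 50% state of charge.
    return {name: -abs(a.soc - 0.5) for name, a in agents.items()}


agents = {"bat_1": ToyBattery(0.2), "bat_2": ToyBattery(0.9)}
# Placeholder policy: charge when below 50%, discharge otherwise.
policies = {name: (lambda soc: 1.0 if soc < 0.5 else -1.0) for name in agents}

for t in range(3):
    rewards = sync_step(agents, policies)
    print(t, {n: round(a.soc, 2) for n, a in agents.items()})
```

Every agent sees a snapshot taken at the same instant and every action lands immediately, which is what makes this mode fast and reproducible but unrealistic.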

### Event-Driven Mode (Testing)
Agents tick at their own rates with realistic delays. Validates policy robustness.

```
t=0.00s: Battery1 ticks (interval=1s)
t=0.05s: Battery1 action takes effect (act_delay=0.05s)
t=1.00s: Battery1, Battery2 tick
t=5.00s: Coordinator ticks (interval=5s)
```
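
A timeline like this can be produced by a priority queue of timestamped events. The sketch below uses Python's standard `heapq` and is only an approximation of the idea: `simulate_events` and the `"tick"`/`"effect"` event kinds are hypothetical names, every agent is assumed to start ticking at t=0, and HERON's own scheduler may work differently.

```python
import heapq


def simulate_events(tick_intervals, act_delay, until):
    """Process tick and action-effect events in time order up to `until` seconds."""
    # Each entry is (time, sequence_number, kind, agent). The sequence number
    # breaks ties so entries with equal times pop in insertion order.
    queue, seq, log = [], 0, []
    for agent in tick_intervals:
        heapq.heappush(queue, (0.0, seq, "tick", agent))
        seq += 1
    while queue:
        time, _, kind, agent = heapq.heappop(queue)
        if time > until:
            break
        log.append((round(time, 2), kind, agent))
        if kind == "tick":
            # The action chosen at this tick lands after the actuation delay,
            # and the agent schedules its own next tick.
            heapq.heappush(queue, (time + act_delay, seq, "effect", agent))
            heapq.heappush(queue, (time + tick_intervals[agent], seq + 1, "tick", agent))
            seq += 2
    return log


log = simulate_events({"battery_1": 1.0, "coordinator": 5.0}, act_delay=0.05, until=5.0)
for entry in log[:6]:
    print(entry)
```

Because each agent reschedules itself with its own interval, fast devices and slow coordinators interleave naturally, and a policy trained in lockstep gets exposed to stale observations and delayed actions.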

### Why Both Modes?

| Mode | Use Case | Speed | Realism |
|------|----------|-------|---------|
| Synchronous | RL training | Fast | Low |
| Event-driven | Deployment validation | Slow | High |

**The workflow:**
1. Train policy in synchronous mode (fast)
2. Test in event-driven mode with realistic timing
3. If the policy degrades, retrain with timing awareness

**This cannot be achieved by wrapping PettingZoo.** Per-agent tick intervals and action delays have to be part of the agent and scheduling model, so the support must be architectural.

## What You'll Build in This Tutorial Series

A complete multi-agent RL case study for power grid control:

```
SimpleMicrogridEnv (PettingZooParallelEnv)
├── mg_0 (CoordinatorAgent)
│   ├── mg_0_bat (FieldAgent - Battery)
│   └── mg_0_gen (FieldAgent - Generator)
├── mg_1 (CoordinatorAgent)
│   └── ...
└── mg_2 (CoordinatorAgent)
    └── ...
```

### Tutorial Roadmap

| # | Notebook | What You'll Build | Key HERON Concept |
|---|----------|-------------------|-------------------|
| 01 | This notebook | Conceptual foundation | Agent-centric design |
| 02 | Features & State | `BatterySOC`, `GenOutput` | Visibility-based filtering |
| 03 | Agents | `SimpleBattery`, `SimpleMicrogrid` | Hierarchical agents |
| 04 | Environment | `SimpleMicrogridEnv` | HERON adapters |
| 05 | Training | MAPPO training script | RLlib integration |
| 06 | Event-Driven | Dual-mode testing | Realistic timing validation |

**Total time:** ~1 hour to understand + implement from scratch.

## Quick Sanity Check

Let's verify HERON is importable:

```python
# Verify imports work
from heron.core.feature import FeatureProvider
from heron.agents.field_agent import FieldAgent
from heron.agents.coordinator_agent import CoordinatorAgent
from heron.protocols.vertical import SetpointProtocol

print("HERON imports successful!")
print("FeatureProvider visibility options: public, owner, upper_level, system")
```

```
HERON imports successful!
FeatureProvider visibility options: public, owner, upper_level, system
```

## Key Takeaways

| HERON Contribution | What It Enables |
|--------------------|-----------------|
| **Agent-centric architecture** | Agents have state, timing, observability (not just policy wrappers) |
| **Declarative visibility** | `visibility = ['owner']` — no manual filtering |
| **Pluggable protocols** | Swap `SetpointProtocol` ↔ `PriceSignalProtocol` without changing agent code |
| **Dual execution modes** | Train synchronously, test with realistic timing |
| **HERON adapters** | Full PettingZoo/RLlib compatibility |

---

**Next:** [02_features_and_state.ipynb](02_features_and_state.ipynb) — Build FeatureProviders with visibility