# Quantitative Green Infrastructure (QGI) for Urban Flood Mitigation — MVP Congestion MARL Notebook

**Supervision:** Prof. Hang Ma, QGI Lab (HMARL Team)

## Introduction
This notebook describes a minimum viable product (MVP) for a multi-agent reinforcement learning (MARL) approach to urban flood mitigation using quantitative green infrastructure (QGI). It is a self-contained, lightweight scaffold designed to run in Colab with standard packages, and it lists clear next steps toward a full implementation.

## Problem Statement
Urban flooding is exacerbated by climate change and rapid urbanization. We seek a scalable decision-making system that allocates green infrastructure actions across multiple agents (e.g., neighborhood catchments) to reduce peak congestion (inflows) and minimize flooding impacts.

## Research Questions
1. How can MARL coordinate distributed QGI decisions to reduce flood risk?
2. How do forecasting accuracy and uncertainty affect control performance?
3. What is the best tradeoff between mitigation performance and deployment cost?

## Methodology
We combine (1) a synthetic congestion signal generator, (2) a forecasting module, and (3) a multi-agent controller. The MVP focuses on wiring these components with simple placeholders that can be expanded into a full pipeline.

## Architecture Diagram (text)
```
[Rain/Runoff Signals] -> [Forecasting Module] -> [Multi-Agent Controller]
        |                          |                     |
        v                          v                     v
   [Synthetic Data]           [Predictions]        [QGI Actions]
        |                                                |
        v                                                v
           [Environment / Hydrology Surrogate (Gymnasium)]
```

## MDP Formulation
- **States (s):** recent runoff/congestion history + forecast summary + local storage.
- **Actions (a):** per-agent QGI actions (e.g., retention, infiltration, diversion).
- **Transition (P):** hydrology surrogate driven by rainfall + actions.
- **Reward (r):** negative congestion/flooding + penalties for action costs.
- **Episode:** fixed horizon (e.g., 24–72 timesteps).
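
The state description above can be made concrete with a small observation builder. This is a minimal sketch under stated assumptions: the function name `build_observation`, the window length `k`, and the choice of forecast mean and peak as the "forecast summary" are all hypothetical and are not fixed by the formulation.

```python
import numpy as np

def build_observation(history, forecast, storage, k=4):
    """Concatenate the last k congestion readings, a forecast summary, and local storage."""
    history = np.asarray(history, dtype=float)
    forecast = np.asarray(forecast, dtype=float)

    # Recent history, left-padded with zeros when fewer than k readings exist.
    recent = np.zeros(k)
    n = min(k, len(history))
    if n > 0:
        recent[k - n:] = history[-n:]

    # Forecast summary: mean and peak over the horizon (an assumed choice).
    if len(forecast) > 0:
        summary = np.array([forecast.mean(), forecast.max()])
    else:
        summary = np.zeros(2)

    # Final layout: [recent (k values), forecast mean, forecast peak, storage]
    return np.concatenate([recent, summary, [float(storage)]])

obs = build_observation(history=[0.8, 1.0, 1.2], forecast=[1.2, 1.2], storage=0.5)
print(obs.shape)  # (7,)
```

Each agent would receive one such vector per timestep; a fixed length keeps the input compatible with a standard policy network.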

## Forecasting Module
We start with a trivial persistence baseline (an AR-like model is a natural extension) and a minimal PyTorch LSTM stub that can be expanded into a full model.
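
For the AR-like option, one possible form is a first-order autoregressive fit by ordinary least squares. The sketch below is an assumption about what "AR-like" could mean here, not a method the notebook prescribes; `forecast_ar1` is a hypothetical name chosen to mirror the persistence baseline's signature.

```python
import numpy as np

def forecast_ar1(history, horizon=6):
    """Fit x[t+1] ~ a * x[t] + b by least squares, then iterate the fit forward."""
    history = np.asarray(history, dtype=float)
    if len(history) < 3:
        # Too few points to fit two parameters: fall back to persistence.
        last = history[-1] if len(history) else 0.0
        return np.full(horizon, last)

    x, y = history[:-1], history[1:]
    design = np.column_stack([x, np.ones_like(x)])
    (a, b), *_ = np.linalg.lstsq(design, y, rcond=None)

    preds = np.empty(horizon)
    last = history[-1]
    for h in range(horizon):
        last = a * last + b  # multi-step forecast by recursion
        preds[h] = last
    return preds

# A series that follows x[t+1] = 0.5 * x[t] + 1 exactly.
demo = [0.0, 1.0, 1.5, 1.75, 1.875]
print(forecast_ar1(demo, horizon=2))
```

On a noisy series the recursion decays toward the fitted mean, so this baseline is mainly useful at short horizons.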

In [None]:
# Minimal packages (standard Colab installs)
import numpy as np

In [None]:
# Synthetic congestion signal generator

def generate_congestion_series(T=96, seed=42):
    rng = np.random.default_rng(seed)
    t = np.arange(T)
    # Simple periodic pattern + noise
    signal = 1.0 + 0.5 * np.sin(2 * np.pi * t / 24) + 0.1 * rng.normal(size=T)
    return np.clip(signal, 0.0, None)

series = generate_congestion_series()
series[:10]

In [None]:
# Forecasting baseline: persistence

def forecast_persistence(history, horizon=6):
    if len(history) == 0:
        return np.zeros(horizon)
    return np.full(horizon, history[-1])

forecast_persistence(series[:10], horizon=4)

In [None]:
# Optional PyTorch LSTM placeholder (kept minimal)
try:
    import torch
    import torch.nn as nn

    class TinyLSTM(nn.Module):
        def __init__(self, input_size=1, hidden_size=8):
            super().__init__()
            self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
            self.head = nn.Linear(hidden_size, 1)

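        def forward(self, x):
            # x: (batch, seq_len, input_size); out: (batch, seq_len, hidden_size)
            out, _ = self.lstm(x)
            # One-step-ahead prediction from the last time step: (batch, 1)
            return self.head(out[:, -1])

    model = TinyLSTM()
except Exception as exc:
    model = None  # keep the name defined so later cells can check for it
    print("PyTorch not available in this environment:", exc)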
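
Before the LSTM stub can be trained, the series has to be cut into input windows and next-step targets. The following is a minimal NumPy-only sketch; `make_windows` and the default `window=12` are hypothetical choices, and the shapes are picked to match the `batch_first=True` layout used by `TinyLSTM`.

```python
import numpy as np

def make_windows(series, window=12):
    """Return X of shape (n, window, 1) and y of shape (n, 1) for one-step-ahead training."""
    series = np.asarray(series, dtype=float)
    n = len(series) - window
    if n <= 0:
        raise ValueError("series must be longer than window")
    # X[i] holds series[i : i + window]; y[i] is the value right after that window.
    X = np.stack([series[i:i + window] for i in range(n)])
    y = series[window:]
    return X[..., None], y[:, None]

X, y = make_windows(np.arange(10), window=3)
print(X.shape, y.shape)  # (7, 3, 1) (7, 1)
```

If PyTorch is available, the arrays can be converted with `torch.from_numpy(X).float()` before being passed to the model.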

## RL Algorithm / Coordination
The MVP uses a minimal MAPPO-style loop skeleton with no dependencies beyond NumPy. The loop only collects rollouts under a placeholder zero-action policy; policy and value updates are left for expansion.
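
One building block a fuller MAPPO-style loop would likely need is an advantage estimator. PPO-style methods commonly use generalized advantage estimation (GAE); the sketch below assumes a single fixed-horizon episode with a shared team reward and value estimates from a centralized critic. The function name and default hyperparameters are illustrative assumptions, not values taken from this project.

```python
import numpy as np

def compute_gae(rewards, values, last_value=0.0, gamma=0.99, lam=0.95):
    """Generalized advantage estimation over one episode (no mid-episode resets)."""
    rewards = np.asarray(rewards, dtype=float)
    values = np.asarray(values, dtype=float)
    T = len(rewards)
    advantages = np.zeros(T)
    next_value = last_value  # bootstrap value after the final step
    gae = 0.0
    for t in reversed(range(T)):
        # TD residual: r_t + gamma * V(s_{t+1}) - V(s_t)
        delta = rewards[t] + gamma * next_value - values[t]
        # Exponentially weighted sum of future TD residuals
        gae = delta + gamma * lam * gae
        advantages[t] = gae
        next_value = values[t]
    returns = advantages + values  # regression targets for the critic
    return advantages, returns

# With gamma = lam = 1 and a zero critic, advantages reduce to reward-to-go.
adv, ret = compute_gae([1.0, 2.0, 3.0], [0.0, 0.0, 0.0], gamma=1.0, lam=1.0)
print(adv)
```

In a cooperative setting with a shared reward, the same advantage vector can be reused for every agent's policy update.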

In [None]:
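# Minimal Gymnasium-style environment placeholder (plain Python, no gymnasium import).
# Note: reset() returns only the observation, whereas gymnasium.Env.reset() returns (obs, info).

class FloodEnv:
    def __init__(self, n_agents=3, horizon=24):
        self.n_agents = n_agents
        self.horizon = horizon
        self.t = 0
        self.state = None

    def reset(self, seed=None):
        # seed is accepted for API familiarity but unused: the placeholder dynamics are deterministic
        self.t = 0
        self.state = np.zeros(self.n_agents, dtype=float)
        return self.state

    def step(self, actions):
        # Placeholder transition: each step adds a constant inflow of 0.2 per agent,
        # each unit of action removes 0.1, and congestion is floored at zero.
        actions = np.asarray(actions, dtype=float)
        congestion = np.maximum(self.state + 0.2 - 0.1 * actions, 0.0)
        # Shared team reward: negative total congestion minus a quadratic action cost
        reward = float(-congestion.sum() - 0.05 * (actions ** 2).sum())
        self.state = congestion
        self.t += 1
        terminated = self.t >= self.horizon
        info = {"congestion": congestion}
        # Gymnasium-style 5-tuple: (obs, reward, terminated, truncated, info)
        return self.state, reward, terminated, False, info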

In [None]:
# Simple step loop for sanity check

env = FloodEnv(n_agents=3, horizon=5)
state = env.reset()
for _ in range(5):
    actions = np.zeros(env.n_agents)
    state, reward, terminated, truncated, info = env.step(actions)
    print(state, reward, terminated)

In [None]:
# MAPPO-like training loop skeleton: runnable rollout only, no policy or value updates yet

n_agents = 3
n_episodes = 2

env = FloodEnv(n_agents=n_agents, horizon=10)

for ep in range(n_episodes):
    obs = env.reset()
    done = False
    ep_return = 0.0

    while not done:
        # Placeholder policy: zero actions
        actions = np.zeros(n_agents)
        next_obs, reward, terminated, truncated, info = env.step(actions)
        done = terminated or truncated
        ep_return += reward
        obs = next_obs

    print(f"Episode {ep} return: {ep_return:.2f}")

## Baselines
- Rule-based (static) QGI allocation.
- Centralized single-agent RL (one controller selects all actions, so no inter-agent coordination is needed).
- Forecasting: persistence vs. naive moving average.
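
The moving-average forecasting baseline can be sketched with the same interface as `forecast_persistence`. The name `forecast_moving_average` and the default `window=4` are assumptions made for illustration.

```python
import numpy as np

def forecast_moving_average(history, horizon=6, window=4):
    """Repeat the mean of the last `window` observations over the forecast horizon."""
    if window < 1:
        raise ValueError("window must be at least 1")
    history = np.asarray(history, dtype=float)
    if len(history) == 0:
        return np.zeros(horizon)
    # Slicing handles histories shorter than the window by using all available points.
    return np.full(horizon, history[-window:].mean())

print(forecast_moving_average([1, 2, 3, 4, 5], horizon=2, window=2))  # [4.5 4.5]
```

With `window=1` this reduces to the persistence baseline, which makes the two easy to compare on the same series.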

## Expected Contributions
- A minimal, reproducible MARL pipeline for QGI control.
- Insights into coordination vs. centralized control.
- Benchmarks for forecasting + control coupling.

## Experimental Plan / Data
- Start with synthetic signals and simplified hydrology surrogate.
- Introduce real rainfall/runoff datasets when available.
- Compare MARL vs. baselines using cumulative congestion and cost.
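
The comparison on cumulative congestion and cost can be sketched as a small evaluation helper. To stay self-contained, the block re-implements the placeholder dynamics of `FloodEnv` inline (constant inflow 0.2, relief 0.1 per unit of action, cost weight 0.05); the helper name `rollout_metrics` and both example policies are hypothetical stand-ins for the real baselines.

```python
import numpy as np

def rollout_metrics(policy, n_agents=3, horizon=24,
                    inflow=0.2, relief=0.1, cost_weight=0.05):
    """Run one episode of the placeholder dynamics and accumulate congestion and cost."""
    state = np.zeros(n_agents)
    total_congestion, total_cost = 0.0, 0.0
    for _ in range(horizon):
        actions = np.asarray(policy(state), dtype=float)
        # Same update rule as the placeholder environment
        state = np.maximum(state + inflow - relief * actions, 0.0)
        total_congestion += float(state.sum())
        total_cost += float(cost_weight * (actions ** 2).sum())
    return {"cumulative_congestion": total_congestion, "cumulative_cost": total_cost}

def zero_policy(state):
    """Do-nothing baseline."""
    return np.zeros_like(state)

def static_policy(state, level=1.0):
    """Rule-based baseline: every agent applies the same fixed action level."""
    return np.full_like(state, level)

for name, policy in [("zero", zero_policy), ("static", static_policy)]:
    print(name, rollout_metrics(policy, horizon=3))
```

Reporting the two metrics separately, rather than only the summed reward, keeps the mitigation versus cost tradeoff visible when MARL policies are added to the comparison.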

## Timeline
1. **Week 1–2:** MVP scaffolding + synthetic data.
2. **Week 3–4:** Forecasting module integration.
3. **Week 5–6:** MAPPO training + baseline comparison.
4. **Week 7+:** Real data integration + refinement.

## Conclusion
This notebook provides a lightweight, executable template for a QGI congestion MARL MVP. It outlines the problem, methodology, and a minimal code scaffold to be expanded into a full experimental pipeline.

## References
- Sutton, R. S., & Barto, A. G. (2018). *Reinforcement Learning: An Introduction*.
- Yu, C. et al. (2021). *The Surprising Effectiveness of PPO in Cooperative, Multi-Agent Games*.
- Urban hydrology and green infrastructure survey literature.

## Next Steps
- [ ] Replace synthetic data with real rainfall/runoff series.
- [ ] Implement proper hydrology surrogate or link to SWMM.
- [ ] Add MAPPO policy/value networks and rollout storage.
- [ ] Evaluate coordination vs. centralized baselines.
- [ ] Add uncertainty-aware forecasting.