# A conceptual framework combining the following:

## Components:

1. Data-Driven Robust Optimization: (https://www.mit.edu/~dbertsim/papers/Robust%20Optimization/Data-driven%20robust%20optimization.pdf)
   * Construct uncertainty sets with probabilistic guarantees from both historical and real-time data.
   * These sets adapt over time based on hypothesis testing or Bayesian updates, so they remain as tight as possible while maintaining the required confidence level.
   * Quote from the paper: "We propose a novel schema for designing uncertainty sets for robust optimization from data using hypothesis tests. Sets designed using our schema imply a probabilistic guarantee and are typically much smaller than corresponding data poor variants. Models built from these sets are thus less conservative than conventional robust approaches, yet retain the same robustness guarantees."

2. Learning-Based Predictive Models (LBPM):
   * Use function approximators (e.g., neural networks, Gaussian Processes, or hybrid physics-plus-ML models) to capture system dynamics.
   * Continuously refine model parameters with new data, ensuring the predictive model remains accurate in a changing environment.

3. Distributed Model Predictive Control (DMPC):
   * Each agent solves a local MPC problem with robust constraints derived from the data-driven uncertainty sets.
   * Agents coordinate via decentralized mechanisms (e.g., ADMM, consensus) to ensure global objectives (e.g., overall cost minimization) and feasibility constraints are met.

4. Multi-Agent Reinforcement Learning (MARL):
   * A global or partially-shared value function (or policy network) is updated based on agents' local experiences.
   * The robust DMPC layer "encodes" near-term decision-making and safety constraints, while the MARL algorithm focuses on higher-level or long-term objectives.
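
As a minimal sketch of the first component, the snippet below sizes an interval uncertainty set for one scalar parameter from logged samples. It assumes a split-calibration construction based on an order statistic, which is simpler than, and not the same as, the hypothesis-test schema of the cited paper. The names `calibrated_interval` and `tightened_upper_bound` and the synthetic data are hypothetical.

```python
import math
import random
import statistics


def calibrated_interval(samples, alpha=0.1):
    """Return (center, half_width) of an interval built from i.i.d. scalar samples.

    Assumption (not from the source): the first half of the data fixes the
    center, and the second half sizes the interval through an order statistic.
    For i.i.d. data a fresh sample then falls inside with probability at
    least 1 - alpha, averaged over the draw of the calibration data.
    """
    samples = list(samples)
    split = len(samples) // 2
    shape_data, calib_data = samples[:split], samples[split:]

    center = statistics.fmean(shape_data)

    # Sorted absolute deviations of the held-out calibration samples.
    scores = sorted(abs(x - center) for x in calib_data)
    n = len(scores)
    k = math.ceil((n + 1) * (1.0 - alpha))
    if k > n:
        raise ValueError("too few calibration samples for this alpha")
    return center, scores[k - 1]


def tightened_upper_bound(x_max, half_width):
    """Robust form of x + w <= x_max for every w in [-half_width, half_width]."""
    return x_max - half_width


# Synthetic stand-in for logged values of an uncertain parameter.
rng = random.Random(0)
history = [rng.gauss(1.0, 0.1) for _ in range(2000)]
center, half_width = calibrated_interval(history, alpha=0.1)
print(f"uncertainty set: [{center - half_width:.3f}, {center + half_width:.3f}]")
```

A smaller `half_width` means `tightened_upper_bound` removes less of the feasible region, which is the sense in which tighter sets make the robust MPC layer less conservative.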


## Key Interactions

1. LBPM $\leftrightarrow$ Data-Driven Uncertainty Sets:
   * As more data arrives, the learning-based model and the uncertainty sets are jointly updated.
   * The tighter the uncertainty sets become, the less conservative the MPC solutions and, consequently, the better the MARL agents' reward signals.

2. DMPC $\leftrightarrow$ MARL:
   * DMPC ensures short-horizon stability and constraint satisfaction.
   * MARL learns from the improved system trajectories (less conservative, yet robust to real uncertainties) to optimize long-horizon performance.

3. Decentralized Coordination (ADMM, Consensus):
   * Each agent's local MPC solution must be consistent with its neighbors' decisions (e.g., shared states or constraints).
   * MARL uses partial or full coordination signals for collective policy updates, avoiding the need for a single centralized controller. 
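
The coordination step can be illustrated with a minimal consensus-ADMM sketch. It assumes each agent holds a scalar quadratic cost `0.5 * weights[i] * (x - targets[i])**2` as a hypothetical stand-in for its local MPC objective, and that all agents must agree on one shared value. The function name and the numbers are illustrative only.

```python
def consensus_admm(weights, targets, rho=1.0, iterations=200):
    """Let agents agree on one shared scalar using consensus ADMM.

    Agent i privately holds the cost 0.5 * weights[i] * (x - targets[i])**2,
    a hypothetical placeholder for a local MPC objective.
    """
    n = len(weights)
    x = [0.0] * n  # each agent's local copy of the shared variable
    u = [0.0] * n  # scaled dual variables, one per agent
    z = 0.0        # current consensus value
    for _ in range(iterations):
        # Local step: minimize the private cost plus a quadratic penalty
        # for disagreeing with the current consensus value (closed form).
        x = [(a * b + rho * (z - ui)) / (a + rho)
             for a, b, ui in zip(weights, targets, u)]
        # Coordination step: average the agents' proposals.
        z = sum(xi + ui for xi, ui in zip(x, u)) / n
        # Dual step: accumulate each agent's remaining disagreement.
        u = [ui + xi - z for xi, ui in zip(x, u)]
    return z, x


z, local_copies = consensus_admm([1.0, 2.0, 4.0], [10.0, 20.0, 30.0])
print(z, local_copies)
```

For these quadratic costs the agreed value approaches the weighted average of the targets, and no agent ever needs to see another agent's cost, only the shared value `z`.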

## Example Scenario: Multi-Stage Chemical Process

### Overview: 
1. Process Layout:
   * Multiple reaction stages in series or parallel, each with local temperature, pressure, and concentration controls.
   * Shared resources like raw materials, energy (steam, cooling water), or intermediate storage tanks.
2. Control Objectives:
   * Maximize throughput or yield of a target chemical product
   * Maintain product quality (e.g., purity, consistent composition)
   * Minimize energy consumption, emissions, or operating costs
3. RL can be a good fit because:
   * Partial physical models: Engineers often have approximate kinetic equations and mass/energy balance equations, but real-world complexities remain uncertain.
   * Abundant data: Modern plants generate massive logs, allowing for adaptive data-driven modelling.
   * Decentralized control: Each reactor or unit needs its own local controller, but they must coordinate to maintain overall throughput and quality.


### Applying Framework to Multi-Stage Chemical Processes
1. Data-Driven Uncertainty Sets
   * Data Sources:
     - Historical Logs
     - Online Sensor Streams
   * Building Uncertainty sets:
     - Construct hypothesis-test-based bounds on parameters that are subject to variation, such as reaction rates and heat transfer coefficients, capturing the data-driven "most-likely" fluctuations.
     - Update the uncertainty sets periodically as new data arrives so they stay current.
   * Reduced Conservatism with Robust Guarantees:
     - Traditional robust approaches might assume "worst-case" reaction rates for all conditions.
     - Data-driven sets allow the system to relax constraints when evidence shows smaller deviations, while still maintaining a high-probability guarantee of safety and feasibility.
2. Learning-Based Predictive Models
   * Hybrid Physical + Data-Driven Model:
     - Physical Model Component: approximates mass and energy balances, with standard (Arrhenius) kinetics for reactions.
     - Residual / Error-term Model: a neural network (NN) that captures unmodeled interactions (side reactions, catalyst deactivation, etc.).
   * Online Model Refinement:
     - Update the residual function with real-time data, capturing time-varying factors (such as catalyst poisoning or varying feed composition).
     - Keep track of predictive uncertainty (e.g., Bayesian) and feed that uncertainty into the robust optimization process.
3. Distributed Model Predictive Control (DMPC)
   * This part requires a gradual transition that retains PID's simplicity while leveraging MPC's predictive capabilities.
     - Implement PID-MPC hybrid control (define the MPC cost function in terms of the PID gains), which allows MPC to act as a PID-like function approximator while keeping its predictive and constraint-handling advantages.
     - Use MPC to auto-tune PID parameters.
     - 
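
The hybrid physical plus data-driven model from item 2 can be sketched as follows. This is a simplified illustration under stated assumptions: the residual is linear in its features and fitted by recursive least squares (swapped in here for the neural network to keep the example dependency-free), and the constants `A_HYP` and `EA_HYP`, the feature choice, and the synthetic measurements are all hypothetical.

```python
import math

R_GAS = 8.314    # J/(mol K), universal gas constant
A_HYP = 1.0e7    # hypothetical pre-exponential factor
EA_HYP = 5.0e4   # hypothetical activation energy, J/mol


def arrhenius_rate(temperature_k, pre_exponential, activation_energy):
    """Physical component: k(T) = A * exp(-Ea / (R * T))."""
    return pre_exponential * math.exp(-activation_energy / (R_GAS * temperature_k))


class ResidualRLS:
    """Residual model, linear in its features, updated by recursive least squares."""

    def __init__(self, n_features, forgetting=0.99, initial_cov=1e3):
        self.theta = [0.0] * n_features
        self.P = [[initial_cov if i == j else 0.0 for j in range(n_features)]
                  for i in range(n_features)]
        self.forgetting = forgetting  # values below 1 discount old data

    def predict(self, phi):
        return sum(t * f for t, f in zip(self.theta, phi))

    def update(self, phi, target):
        n = len(phi)
        P_phi = [sum(self.P[i][j] * phi[j] for j in range(n)) for i in range(n)]
        denom = self.forgetting + sum(f * p for f, p in zip(phi, P_phi))
        gain = [p / denom for p in P_phi]
        error = target - self.predict(phi)
        self.theta = [t + g * error for t, g in zip(self.theta, gain)]
        # P stays symmetric, so phi^T P equals P_phi transposed.
        self.P = [[(self.P[i][j] - gain[i] * P_phi[j]) / self.forgetting
                   for j in range(n)] for i in range(n)]


def hybrid_rate(temperature_k, features, residual):
    """Prediction = physical model + learned residual correction."""
    return arrhenius_rate(temperature_k, A_HYP, EA_HYP) + residual.predict(features)


# Online refinement on synthetic data: the "plant" deviates from the
# physical model by an offset and a slow drift with catalyst age.
residual = ResidualRLS(n_features=2)
for step in range(50):
    age = step / 10.0
    temperature_k = 340.0 + step
    physics = arrhenius_rate(temperature_k, A_HYP, EA_HYP)
    measured = physics + 0.05 - 0.02 * age
    residual.update([1.0, age], measured - physics)

print(residual.theta, hybrid_rate(360.0, [1.0, 2.0], residual))
```

The forgetting factor is one simple way to let the residual track time-varying effects such as catalyst deactivation; the matrix `P` also gives a rough indication of parameter uncertainty that could feed the uncertainty sets, although that link is not implemented here.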