# Generality

The generality of a trained machine learning model can be categorized into three levels:

## Levels of Generality in Machine Learning Models for CFD

$\underline{\textbf{Basic Generality (Traditional Definition)}}$

Definition: A trained model has a high degree of generality if it can accurately predict different flow cases, for example varying Reynolds number (Re), angle of attack (AoA), and geometry, without retraining.

- Common in: Surrogate models, physics-informed neural networks (PINNs), and some ML-based turbulence models.

- Limitations: The training dataset must cover a wide range of flow conditions for the model to generalize effectively.
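
The coverage requirement above can be made concrete with a simple check of whether a query case lies inside the range of conditions seen during training. The following is a minimal sketch, not a method from the source: the `FlowCase` class, the helper names, and the numerical values are all hypothetical, and a bounding box over (Re, AoA) is only a crude proxy for "covered by the training data" (it ignores geometry and gaps inside the box).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowCase:
    """Hypothetical description of one flow case (illustrative only)."""
    reynolds: float  # Reynolds number (Re)
    aoa_deg: float   # angle of attack (AoA) in degrees


def coverage_box(train_cases):
    """Return ((Re_min, Re_max), (AoA_min, AoA_max)) spanned by the training cases."""
    re_vals = [c.reynolds for c in train_cases]
    aoa_vals = [c.aoa_deg for c in train_cases]
    return (min(re_vals), max(re_vals)), (min(aoa_vals), max(aoa_vals))


def is_interpolation(query, train_cases):
    """True if the query lies inside the bounding box of the training conditions."""
    (re_lo, re_hi), (aoa_lo, aoa_hi) = coverage_box(train_cases)
    return re_lo <= query.reynolds <= re_hi and aoa_lo <= query.aoa_deg <= aoa_hi


# Illustrative training conditions: two Re values crossed with two AoA values.
train = [
    FlowCase(1e5, 0.0), FlowCase(1e5, 10.0),
    FlowCase(1e6, 0.0), FlowCase(1e6, 10.0),
]

print(is_interpolation(FlowCase(5e5, 5.0), train))   # inside the training box
print(is_interpolation(FlowCase(5e5, 30.0), train))  # AoA outside the training box
```

A query flagged as outside the box is an extrapolation case, where a model with only basic generality should not be expected to remain accurate.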

$\underline{\textbf{Generality in a Diffusion Model (Junior Level)}}$

Definition: A trained diffusion model can generate corresponding flow realizations under different conditioning inputs (e.g., RANS, sparse sensor data, low-resolution LES) without retraining.

1. Key Strength:
   - Zero-shot conditioned generation: The model is trained unconditionally, yet at sampling time it can accept different types of constraints (e.g., RANS, sparse data) and generate realizations consistent with them, without any additional training.

2. Limitation:
   - Flow cases must remain within the distribution of the training data.
   - If a case lies far outside the training set (e.g., different Re, AoA, or flow regime), the model may fail to generate accurate results.

- Example: If trained on LES of a single airfoil shape at AoA = 5°, the model can reconstruct LES-quality flow fields from RANS inputs, but it may not generalize well to a different airfoil or to AoA = 30°.
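
The idea of zero-shot conditioning can be illustrated with a toy sketch. It is not a diffusion model and not the method of any particular paper: an analytic standard-normal score stands in for a learned unconditional score network, a single noisy "sensor" observes the first component of a two-component state, and plain unadjusted Langevin sampling replaces a full reverse-diffusion sampler. All function names and parameter values are assumptions chosen for illustration. The point it shows is that the unconditional model is left untouched and the conditioning enters only at sampling time, through the gradient of the observation likelihood.

```python
import math
import random


def prior_score(x):
    """Score (gradient of log density) of a standard normal prior N(0, I).

    Stand-in for a learned unconditional score network (assumption)."""
    return [-xi for xi in x]


def likelihood_score(x, y, sigma_y):
    """Gradient of log p(y | x) when only x[0] is observed with Gaussian noise."""
    return [(y - x[0]) / sigma_y**2, 0.0]


def sample_posterior(y, sigma_y, n_chains=1000, n_steps=300, step=0.01, seed=0):
    """Unadjusted Langevin sampling of p(x | y): prior score plus likelihood score."""
    rng = random.Random(seed)
    noise_scale = math.sqrt(2.0 * step)
    samples = []
    for _ in range(n_chains):
        x = [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]  # start from the prior
        for _ in range(n_steps):
            s_prior = prior_score(x)
            s_like = likelihood_score(x, y, sigma_y)
            x = [
                xi + step * (sp + sl) + noise_scale * rng.gauss(0.0, 1.0)
                for xi, sp, sl in zip(x, s_prior, s_like)
            ]
        samples.append(x)
    return samples


y_obs, sigma_y = 1.0, 0.5  # illustrative sensor reading and noise level
samples = sample_posterior(y_obs, sigma_y)
mean_x0 = sum(s[0] for s in samples) / len(samples)
mean_x1 = sum(s[1] for s in samples) / len(samples)
analytic_mean_x0 = y_obs / (1.0 + sigma_y**2)  # exact posterior mean for this Gaussian toy
print(mean_x0, analytic_mean_x0, mean_x1)
```

In this toy, the observed component is pulled toward the sensor value while the unobserved component keeps its prior statistics. Swapping the sensor likelihood for a different constraint changes only `likelihood_score`, which mirrors why no retraining is needed when the conditioning input changes.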


$\underline{\textbf{High-Level Generality in a Diffusion Model (Senior Level)}}$

Definition: A trained diffusion model not only produces flow realizations under different conditioning inputs but also generalizes to entirely different flow regimes (e.g., laminar-to-turbulent transitions, wakes, new geometries, high Re, different angles of attack) without retraining.

- Requirements:
  - A vast and diverse training dataset covering multiple flow regimes, geometries, Re, and AoA values (Jianxun Wang suggests on the order of $10^3$ subsequences for large-scale generalization).
  - A more advanced architecture, possibly incorporating latent-space regularization, physics constraints, and multi-scale representations to ensure generality across vastly different flows.

- Challenge: This level is difficult to achieve in practice because the model must learn a broad probability distribution that spans many different flow regimes. Training to cover high-dimensional, multi-regime turbulence physics requires very large computational resources.

| Level | What It Means | Example | Limitation |
|:--------:|:--------:|:--------:|:--------:|
| Basic Generality (Traditional ML) | Model generalizes across different flow cases (Re, AoA, geometries) without retraining. | Surrogate models trained across multiple AoA values predict unseen cases. | Requires diverse training data across all expected conditions. |
| Junior Level Generality (Diffusion Model) | Generates correct flow realizations given different conditioning inputs without retraining. | Trained model reconstructs LES fields from RANS input, but only for the trained airfoil at a specific AoA. | Limited to flow conditions similar to the training data. |
| Senior Level Generality (Diffusion Model) | Generates flow fields across completely different flow regimes, geometries, Re, and AoA without retraining. | Single model handles laminar-to-turbulent transitions, wakes, compressible flows, and different geometries. | Requires vast training data and advanced model architecture. |

