# Advanced ML Diagnostics and Interpretability

## Detailed Analysis

This script performs rigorous validation and interpretability analysis on the trained machine learning models (specifically Model 'Geo' and Model 'A'). It goes beyond simple accuracy metrics to assess whether the models have learned physically meaningful correlations, how robust they are to observational noise, and how their performance scales with data volume.

The analysis is divided into four main components: data efficiency (Learning Curves), stress testing (Noise Robustness), physical interpretation (Probability Correlations), and feature attribution (SHAP analysis).

## Physics and Math

### 1. Observational Noise Simulation
To test the robustness of the classifier against real-world measurement errors (e.g., from NICER), Gaussian noise is injected into the Radius feature:

$$
R_{noisy} = R_{true} + \delta, \quad \delta \sim \mathcal{N}(0, \sigma^2)
$$

The model accuracy is then evaluated as a function of the noise standard deviation $\sigma$ (in km).

### 2. Feature Attribution (SHAP)
SHapley Additive exPlanations (SHAP) quantify the contribution of each feature to an individual prediction. For a given prediction, the Shapley value $\phi_i$ is the change in the log-odds of the classification output attributable to feature $i$, computed as a weighted average of that feature's marginal contribution over all possible feature coalitions.
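
For reference, the standard Shapley definition can be written out as follows. The notation is introduced here for illustration and does not come from the script: $F$ is the full feature set, $S$ a coalition that excludes feature $i$, and $f_S(\mathbf{x})$ the model output when only the features in $S$ are known.

$$
\phi_i = \sum_{S \subseteq F \setminus \{i\}} \frac{|S|!\,(|F| - |S| - 1)!}{|F|!} \left[ f_{S \cup \{i\}}(\mathbf{x}) - f_S(\mathbf{x}) \right]
$$

The attributions are additive: together with a base value $\phi_0$ (the expected model output over the background data), they sum to the prediction being explained.

$$
f(\mathbf{x}) = \phi_0 + \sum_{i \in F} \phi_i
$$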

### 3. Probability Correlations
The classifier outputs a probability $P(\text{Quark} | \mathbf{x})$. To verify that the model is learning physics rather than artifacts, this probability is correlated against internal microphysical parameters $\theta$ (such as central density $\epsilon_c$ or sound speed $c_s^2$) that were **not** provided as inputs to Model A.

$$
\text{Correlation Map}: \quad \theta \longleftrightarrow P(\text{Quark})
$$

If the model is physically sound, $P(\text{Quark})$ should show structured correlations with these hidden variables (e.g., transitions in slope or density).

## Code Walkthrough

### 1. Analysis Orchestrator
The `run_advanced_analysis` function serves as the driver. It accepts the full DataFrame `df`, the dictionary of trained models, and the test split (`X_test`, `y_test`). It selectively runs diagnostics for Model 'Geo' (Mass-Radius only) and Model 'A' (Observables with Tidal Deformability).

```python
def run_advanced_analysis(df, models_dict, X_test, y_test):
    # ...
    # 1. Learning Curves
    plot_learning_curve(models_dict['Geo'], df, 'Geo', ...)
    
    # 2. Noise Robustness
    plot_noise_robustness(models_dict['A'], X_test, y_test)
    
    # 3. Physics Correlations
    # ...
    
    # 4. SHAP Analysis
    plot_shap_analysis(...)
```

### 2. Physical Correlation Plots
The function `plot_probability_kde` visualizes the relationship between the predicted probability of a star being a Quark star and its underlying physical properties.

*   **Logic:** It creates a Kernel Density Estimation (KDE) contour plot.
*   **Purpose:** To reveal whether the model implicitly learns boundaries associated with the conformal speed-of-sound limit ($c_s^2=1/3$) or the topological slope transition ($dR/dM=0$), even though those quantities were not explicit inputs.
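
The density estimate behind such a contour plot can be sketched as below. This is a minimal illustration, not the body of `plot_probability_kde`: it assumes `scipy.stats.gaussian_kde` as the estimator (the script may use a different one), and the arrays `theta` and `p_quark` are synthetic stand-ins for a hidden parameter and the predicted probability.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)

# Synthetic stand-in data (hypothetical): a hidden parameter theta and a
# predicted probability that rises smoothly with it, plus some scatter.
theta = rng.uniform(0.0, 1.0, size=500)
p_quark = 1.0 / (1.0 + np.exp(-10.0 * (theta - 0.5)))
p_quark = np.clip(p_quark + rng.normal(0.0, 0.05, size=theta.size), 0.0, 1.0)

# gaussian_kde expects data with shape (n_dims, n_samples).
kde = gaussian_kde(np.vstack([theta, p_quark]))

# Evaluate the density on a regular grid spanning both axes.
theta_grid, p_grid = np.meshgrid(np.linspace(0, 1, 60), np.linspace(0, 1, 60))
density = kde(np.vstack([theta_grid.ravel(), p_grid.ravel()]))
density = density.reshape(theta_grid.shape)
```

The resulting `density` array has the same shape as the grid, so it can be handed to a contour routine such as matplotlib's `contourf(theta_grid, p_grid, density)`. A ridge that bends sharply at a particular value of the hidden parameter is the kind of structure the correlation plots are meant to expose.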

### 3. Learning Curves
The `plot_learning_curve` function assesses overfitting and data saturation.

*   **Cross-Validation:** It uses `GroupKFold` to ensure that training and validation splits do not share points from the same EoS curve.
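*   **Metric:** It plots training accuracy and cross-validation accuracy as the training set size increases. A gap between the two curves that narrows as more data is added, with validation accuracy levelling off, indicates a well-generalized model; a gap that persists points to overfitting.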
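
A group-aware learning curve of this kind can be sketched with scikit-learn as follows. The estimator (`RandomForestClassifier`), the synthetic data, and the `groups` array of EoS identifiers are all assumptions made for illustration; the actual models and column names in the script may differ.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GroupKFold, learning_curve

rng = np.random.default_rng(0)

# Synthetic stand-in data: 40 hypothetical EoS curves, 25 stars per curve.
n_curves, n_per_curve = 40, 25
groups = np.repeat(np.arange(n_curves), n_per_curve)  # EoS id for every star
curve_label = rng.integers(0, 2, size=n_curves)       # one class per curve
y = curve_label[groups]
mass = rng.uniform(1.0, 2.2, size=groups.size)
radius = 12.0 - 1.5 * y + rng.normal(0.0, 0.5, size=groups.size)
X = np.column_stack([mass, radius])

cv = GroupKFold(n_splits=5)

# Sanity check: no EoS curve appears on both sides of any split.
for train_idx, val_idx in cv.split(X, y, groups):
    assert set(groups[train_idx]).isdisjoint(groups[val_idx])

train_sizes, train_scores, val_scores = learning_curve(
    RandomForestClassifier(n_estimators=50, random_state=0),
    X, y,
    groups=groups,
    cv=cv,
    train_sizes=np.linspace(0.2, 1.0, 5),
    scoring="accuracy",
)

# Mean train/validation gap at each training-set size.
gap = train_scores.mean(axis=1) - val_scores.mean(axis=1)
```

Passing `groups` matters because stars on the same EoS curve are strongly correlated; a plain K-fold split would let the model see near-duplicates of the validation points and inflate the validation score.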

### 4. Noise Robustness Test
The `plot_noise_robustness` function simulates the degradation of Model A's performance under increasingly poor observational conditions.

```python
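# Body of the loop over noise levels `sigma` (km)
X_noisy = X_test.copy()  # assumed: fresh copy so noise does not accumulate
noise = np.random.normal(0, sigma, size=len(X_noisy))
X_noisy['Radius'] += noise  # perturb only the Radius column
acc = model.score(X_noisy[required_cols], y_test)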
```

This snippet is the body of a loop that evaluates the model accuracy for noise levels $\sigma$ ranging from 0.0 to 2.0 km.
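
The full sweep can be sketched end to end as below. To keep the example dependency-free it makes several assumptions: `ThresholdModel` is a hypothetical stand-in for the trained classifier, the test set is synthetic, and a plain NumPy array with a column index replaces the DataFrame used in the script.

```python
import numpy as np


class ThresholdModel:
    """Hypothetical stand-in classifier: predicts class 1 when Radius < cut."""

    def __init__(self, cut, radius_col):
        self.cut = cut
        self.radius_col = radius_col

    def predict(self, X):
        return (X[:, self.radius_col] < self.cut).astype(int)

    def score(self, X, y):
        return float(np.mean(self.predict(X) == y))


def noise_sweep(model, X_test, y_test, sigmas, radius_col, seed=0):
    """Return the accuracy at each noise level; X_test is never modified."""
    rng = np.random.default_rng(seed)
    accuracies = []
    for sigma in sigmas:
        X_noisy = X_test.copy()  # fresh copy per noise level
        X_noisy[:, radius_col] += rng.normal(0.0, sigma, size=len(X_noisy))
        accuracies.append(model.score(X_noisy, y_test))
    return np.array(accuracies)


# Synthetic test set: columns are [Mass, Radius], with two radius clusters.
rng = np.random.default_rng(1)
y_test = rng.integers(0, 2, size=2000)
radius = np.where(y_test == 1, 11.0, 13.0) + rng.normal(0.0, 0.3, size=y_test.size)
X_test = np.column_stack([rng.uniform(1.0, 2.2, size=y_test.size), radius])

sigmas = np.linspace(0.0, 2.0, 9)  # km, matching the range quoted above
acc = noise_sweep(ThresholdModel(cut=12.0, radius_col=1), X_test, y_test, sigmas, radius_col=1)
```

Copying the test set inside the loop ensures each noise level is applied to the clean data, so the accuracy at a given $\sigma$ reflects that noise level alone rather than the accumulated noise of earlier iterations.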

### 5. SHAP Analysis
The `plot_shap_analysis` function generates a "beeswarm" plot using the `shap` library (if available). This visualization shows:
*   **Feature Importance:** Which features (Mass, Radius, Lambda) drive the decision.
*   **Directionality:** Whether high or low values of a feature push the prediction toward "Quark" or "Hadronic".
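
For intuition about what each dot in the beeswarm encodes, Shapley values can be computed exactly by brute force for a tiny model. The sketch below is not how the `shap` library works internally: it assumes a hypothetical toy log-odds function of three features and a single baseline point in place of a background dataset.

```python
from itertools import combinations
from math import factorial

FEATURES = ("Mass", "Radius", "Lambda")


def log_odds(x):
    """Hypothetical toy model with an interaction term; not the trained classifier."""
    return (0.5 * x["Mass"] - 1.2 * x["Radius"] + 0.8 * x["Lambda"]
            + 0.3 * x["Mass"] * x["Radius"])


def coalition_value(x, baseline, coalition):
    """Model output with coalition features taken from x, the rest from baseline."""
    mixed = {f: (x[f] if f in coalition else baseline[f]) for f in FEATURES}
    return log_odds(mixed)


def shapley_values(x, baseline):
    """Exact Shapley values by enumerating every coalition of the other features."""
    n = len(FEATURES)
    phi = {}
    for i in FEATURES:
        others = [f for f in FEATURES if f != i]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_i = coalition_value(x, baseline, set(subset) | {i})
                without_i = coalition_value(x, baseline, set(subset))
                total += weight * (with_i - without_i)
        phi[i] = total
    return phi


# Standardized features: the baseline sits at the mean (zero) of each feature.
baseline = {"Mass": 0.0, "Radius": 0.0, "Lambda": 0.0}
x = {"Mass": 1.0, "Radius": -2.0, "Lambda": 0.5}
phi = shapley_values(x, baseline)
```

In this toy case the low `Radius` value receives the largest positive attribution, and the three values sum to the difference between the model output at `x` and at the baseline. The sign of each value is what the beeswarm shows as directionality, and its magnitude across many samples is what it shows as importance.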

## Visualization Output

This script generates four types of figures in the `plots/` directory:

1.  **`fig_ml_learning_curve_{model}.pdf`**: Shows the training and validation accuracy vs. the number of training samples.
2.  **`fig_ml_noise_robustness.pdf`**: A line plot showing how Model A's accuracy decays as radius noise increases. Reference lines mark the typical NICER radius uncertainty ($\approx 0.5$ km).
3.  **`fig_corr_{model}_{tag}.pdf`**: Contour maps showing the density of predictions. The x-axis represents a physical quantity (e.g., Log Tidal, Central Density), and the y-axis represents the predicted $P(\text{Quark})$.
4.  **`fig_ml_shap_beeswarm_{model}.pdf`**: A summary plot where each dot is a test sample. The x-position represents the SHAP value (impact on model output), and the color represents the feature value (Red=High, Blue=Low).