# Interactive G2 Structure Learning Workflow

This notebook provides an interactive alternative to running the command-line scripts described in the README. Each cell corresponds to a step in the workflow for learning G2 structures on Calabi-Yau Links.

**Workflow Overview:**
1. Train the CY metric model (Step 1)
2. Generate G2 sample data (Step 2)
3. Train the G2 models: 3-form and metric (Steps 3-4)
4. Validate the CY metric and the G2 identities (Steps 5-7)

Execute cells in order, modifying parameters as needed for your experiments.
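
Since every cell below calls a script by relative path, a quick pre-flight check can save a failed run. The following is a minimal sketch that assumes the notebook is started from the repository root; the helper name `missing_scripts` is hypothetical, while the script paths are the ones used in the cells of this notebook.

```python
from pathlib import Path

# Script paths taken from the command cells in this notebook.
WORKFLOW_SCRIPTS = [
    "run_cy.py",
    "sampling.py",
    "run_g2.py",
    "analysis/cy_kahlerity.py",
    "analysis/g2_identities_analytic.py",
    "analysis/g2_identities_model.py",
]


def missing_scripts(root="."):
    """Return the workflow scripts that are not present under `root`."""
    root = Path(root)
    return [name for name in WORKFLOW_SCRIPTS if not (root / name).is_file()]


missing = missing_scripts(".")
if missing:
    # Assumption: the notebook is meant to run from the repository root.
    print("Missing scripts (is the working directory the repository root?):")
    for name in missing:
        print("  ", name)
else:
    print("All workflow scripts found.")
```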

## Step 1: Train CY Metric Model

Train a neural network to learn the Ricci-flat Kähler metric on the Calabi-Yau threefold using the cymetric package. The trained model will be saved to `models/cy_models/cy_metric_model_run{N}.keras`.

**Optional arguments:**
- `--n-points` - Number of training points to generate (default: 100000)
- `--n-layers` - Number of hidden layers (default: 5)
- `--n-hidden` - Hidden units per layer (default: 64)
- `--activation` - Activation function (default: gelu)
- `--n-epochs` - Number of training epochs (default: 500)
- `--data-dir` - Directory for training data (default: ./samples/cy_data)
- `--save-dir` - Directory to save trained model (default: ./models/cy_models)
- `--run-number` - Run number for model naming (default: auto-increment)

In [None]:
!python run_cy.py
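
When several options change between experiments, it can be easier to assemble the command in Python than to edit the `!python` line by hand. The sketch below uses a hypothetical helper, `build_command`, that turns keyword arguments into the dashed flags listed above; the values `20000` and `100` are illustrative only, not recommended settings.

```python
import shlex
import sys


def build_command(script, **options):
    """Build an argv list such as [python, script, '--n-epochs', '100'].

    Keyword names use underscores and are converted to dashed flags,
    so n_epochs=100 becomes '--n-epochs 100'. Options set to None are skipped,
    which leaves the script's own default in effect.
    """
    argv = [sys.executable, script]
    for name, value in options.items():
        if value is None:
            continue
        argv += ["--" + name.replace("_", "-"), str(value)]
    return argv


# Illustrative values; run_number=None keeps the auto-increment default.
cmd = build_command("run_cy.py", n_points=20000, n_epochs=100, run_number=None)
print(shlex.join(cmd))
# To launch it from Python: import subprocess; subprocess.run(cmd, check=True)
```

Passing the list to `subprocess.run(..., check=True)` raises an exception on a non-zero exit code, which stops a notebook run at the failing step.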

## Step 2: Generate G2 Sample Data

Create training data by sampling points on the CY and computing analytical G2 forms. Data will be saved to `samples/link_data/`.

**Optional arguments:**
- `--n-points` - Number of base points to sample (default: all training points)
- `--n-rotations` - Number of random rotations per base point (default: 4)
- `--cy-run-number` - CY model run number to use (default: most recent)
- `--cy-model` - Path to trained CY metric model (overrides `--cy-run-number`)
- `--cy-config` - Path to CY model config file (overrides `--cy-run-number`)
- `--cy-data-dir` - Directory containing CY training data (default: ./samples/cy_data)
- `--output-dir` - Directory to save G2 dataset (default: ./samples/link_data)

In [None]:
!python sampling.py --n-points 5000
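
Before training, it may help to confirm what the sampling step wrote. The sketch below lists the arrays stored in an NPZ file with their shapes and dtypes. It assumes NumPy is installed and that the file contains plain numeric arrays (object arrays would need `allow_pickle=True`); the array names inside the file are not documented here, so the code only reports whatever it finds. The helper name `summarize_npz` is hypothetical.

```python
from pathlib import Path

import numpy as np


def summarize_npz(path):
    """Return {array_name: (shape, dtype_string)} for every array in an NPZ file."""
    with np.load(path) as data:
        return {key: (data[key].shape, str(data[key].dtype)) for key in data.files}


# Default training-data location used by run_g2.py.
train_path = Path("samples/link_data/g2_train.npz")
if train_path.exists():
    for key, (shape, dtype) in summarize_npz(train_path).items():
        print(f"{key:20s} shape={shape} dtype={dtype}")
else:
    print(f"{train_path} not found; run the sampling cell first.")
```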

## Step 3: Train G2 3-form Model

Train a neural network to predict the G2 defining 3-form φ. Model saved to `models/link_models/3form_run{N}.keras`.

**Optional arguments:**
- `--task` - Task to train: '3form' or 'metric' (overrides hps.yaml setting)
- `--n-epochs` - Number of training epochs (overrides hps.yaml setting)
- `--hps` - Path to hyperparameters YAML file (default: ./hyperparameters/hps.yaml)
- `--train-data` - Path to training data NPZ file (default: ./samples/link_data/g2_train.npz)
- `--val-data` - Path to validation data NPZ file (default: ./samples/link_data/g2_val.npz)
- `--output-dir` - Directory to save trained models (default: ./models/link_models)
- `--plots-dir` - Directory to save plots (default: ./plots)

In [None]:
!python run_g2.py --task 3form

## Step 4: Train G2 Metric Model

Train a neural network to predict the G2 metric. Model saved to `models/link_models/metric_run{N}.keras`.

**Optional arguments:** (same as Step 3)
- `--task` - Task to train: '3form' or 'metric' (overrides hps.yaml setting)
- `--n-epochs` - Number of training epochs (overrides hps.yaml setting)
- `--hps` - Path to hyperparameters YAML file (default: ./hyperparameters/hps.yaml)
- `--train-data` - Path to training data NPZ file (default: ./samples/link_data/g2_train.npz)
- `--val-data` - Path to validation data NPZ file (default: ./samples/link_data/g2_val.npz)
- `--output-dir` - Directory to save trained models (default: ./models/link_models)
- `--plots-dir` - Directory to save plots (default: ./plots)

In [None]:
!python run_g2.py --task metric
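
The validation steps accept a `--g2-run-number`, so it is useful to know which runs exist on disk. The following sketch scans the model directory for the `{task}_run{N}.keras` naming pattern described above and reports the highest `N`. The helper `latest_run` is hypothetical and does not reproduce how the scripts themselves resolve "most recent".

```python
import re
from pathlib import Path


def latest_run(model_dir, prefix):
    """Return the highest N among files named '{prefix}_run{N}.keras', or None."""
    model_dir = Path(model_dir)
    if not model_dir.is_dir():
        return None
    pattern = re.compile(rf"^{re.escape(prefix)}_run(\d+)\.keras$")
    runs = []
    for path in model_dir.glob("*.keras"):
        match = pattern.match(path.name)
        if match:
            runs.append(int(match.group(1)))
    return max(runs, default=None)


for prefix in ("3form", "metric"):
    print(prefix, "latest run:", latest_run("models/link_models", prefix))
```

Run numbers are compared as integers, so `run10` correctly ranks above `run9`.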

## Step 5: Validate CY Metric Kählerity

Check that the learned CY metric satisfies dω = 0 (Kählerity condition). Results saved to `plots/`.

**Optional arguments:**
- `--cy-run-number` - CY model run number to use (default: most recent)
- `--n-train` - Number of training points to check (default: 0)
- `--n-val` - Number of validation points to check (default: all)
- `--epsilon` - Epsilon for numerical derivative (default: 1e-12)
- `--data-dir` - Directory containing dataset.npz and basis.pickle (default: ./samples/cy_data)
- `--output-dir` - Directory to save output plots (default: ./plots)

In [None]:
!python analysis/cy_kahlerity.py --cy-run-number 1
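
The `--epsilon` option sets the step used for numerical derivatives. How the analysis scripts differentiate internally is not shown in this notebook, so the following is only a toy illustration, on a scalar function, of why it can be worth trying more than one step size: in double precision, large steps suffer from truncation error and very small steps from round-off. The function `central_difference` is a hypothetical stand-in, not the scripts' method.

```python
import math


def central_difference(f, x, eps):
    """Approximate f'(x) with the symmetric quotient (f(x+eps) - f(x-eps)) / (2*eps)."""
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)


x0 = 1.0
exact = math.cos(x0)  # exact derivative of sin at x0
for eps in (1e-3, 1e-6, 1e-9, 1e-12):
    approx = central_difference(math.sin, x0, eps)
    print(f"eps={eps:.0e}  abs error={abs(approx - exact):.3e}")
```

If the validation residuals change noticeably when `--epsilon` is varied, the step size, rather than the model, may be limiting the measured accuracy.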

## Step 6: Validate G2 Identities (Analytic)

Check that the analytical G2 construction satisfies the three identities: φ∧ψ = 7·Vol(g), dψ = 0, and dφ = ω². This validates the data generation pipeline.

**Optional arguments:**
- `--cy-run-number` - CY model run number to use (default: auto-detect from dataset)
- `--n-points` - Number of points to check (default: all; a random subset is drawn if fewer than the dataset size)
- `--epsilon` - Epsilon for numerical derivative (default: 1e-12)
- `--rotation-epsilon` - Epsilon for global U(1) phase rotation when computing derivatives (default: 1e-12)
- `--g2-data` - Path to G2 dataset (default: ./samples/link_data/g2_test.npz)
- `--cy-data-dir` - Directory containing CY data (default: ./samples/cy_data)
- `--output-dir` - Directory to save output plots (default: ./plots)

In [None]:
!python analysis/g2_identities_analytic.py --cy-run-number 1

## Step 7: Validate G2 Identities (Model)

Check that the learned G2 models satisfy the same three identities as in Step 6, now evaluated on model predictions rather than on the analytical construction. This is the final end-to-end check of the trained models.

**Optional arguments:**
- `--g2-run-number` - G2 model run number to use (default: most recent)
- `--cy-run-number` - CY model run number to use (default: auto-detect from dataset)
- `--n-points` - Number of points to check (default: all; a random subset is drawn if fewer than the dataset size)
- `--epsilon` - Epsilon for numerical derivative (default: 1e-12)
- `--rotation-epsilon` - Epsilon for global U(1) phase rotation when computing derivatives (default: 1e-12)
- `--g2-data` - Path to G2 test dataset (default: ./samples/link_data/g2_test.npz)
- `--cy-data-dir` - Directory containing CY data (default: ./samples/cy_data)
- `--output-dir` - Directory to save output plots (default: ./plots)

In [None]:
!python analysis/g2_identities_model.py --cy-run-number 1 --g2-run-number 1
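
The validation steps write their plots to `./plots` by default. The plot file names are not listed in this notebook, so the sketch below simply shows the most recently modified files in that directory; the helper `recent_files` is hypothetical.

```python
from pathlib import Path


def recent_files(directory, limit=10):
    """Return up to `limit` files in `directory`, newest first by modification time."""
    directory = Path(directory)
    if not directory.is_dir():
        return []
    files = [p for p in directory.iterdir() if p.is_file()]
    files.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return files[:limit]


for path in recent_files("plots"):
    print(path)
```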