# Geko Fitting Demo: Simplified Usage

This notebook demonstrates how to use `geko` to fit JWST grism spectroscopy data using a simple Python interface.

## Overview

`geko` is a Python package for analyzing JWST grism spectroscopy and morphology data. It uses:
- **JAX** for accelerated numerical computation
- **Numpyro** for Bayesian inference via MCMC
- **Kinematic models** to fit galaxy rotation curves

## Required Data Structure

All data files should be organized in a single base directory (specified by `save_runs_path`). The code expects the following structure:

### Directory Structure

```
<save_runs_path>/                     # Base directory (e.g., 'fitting_results/')
├── <output_name>/                    # Subfolder for your specific galaxy/run
│   ├── spec_2d_*_ID<ID>_comb.fits    # 2D grism spectrum (required)
│   └── <ID>_output                   # Fit results (generated after run)
├── morph_fits/                       # Morphology results directory
│   └── summary_<ID>_image_F150W_svi.cat  # PySersic Sersic profile fits
├── psfs/                             # PSF files directory
│   ├── mpsf_jw018950.gn.f444w.fits   # PSF for GOODS-N
│   ├── mpsf_jw035770.f356w.fits      # PSF for GOODS-N-CONGRESS
│   └── mpsf_jw018950.gs.f444w.fits   # PSF for GOODS-S-FRESCO
└── catalogs/                         # Optional: catalog directory
    └── <master_catalog>.cat          # Master catalog (can be anywhere)
```

**Example**: If you set:
- `save_runs_path='/path/to/data/'`
- `output_name='my_galaxy'`
- `source_id=12345`
- `field='GOODS-N'`

The code will look for:
- Grism spectrum: `/path/to/data/my_galaxy/spec_2d_GDN_F444W_ID12345_comb.fits`
- Morphology: `/path/to/data/morph_fits/summary_12345_image_F150W_svi.cat`
- PSF: `/path/to/data/psfs/mpsf_jw018950.gn.f444w.fits`
- Results saved to: `/path/to/data/my_galaxy/12345_output`
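
The path convention above can be sketched with `pathlib`. The helper `expected_paths` and the `FIELD_FILES` table are illustrative names, not part of `geko`; they only restate the file names listed in this notebook.

```python
from pathlib import Path

# Field -> (grism file prefix, PSF file name), copied from the naming
# conventions listed in this notebook. This table is NOT part of the geko API.
FIELD_FILES = {
    "GOODS-N": ("spec_2d_GDN_F444W", "mpsf_jw018950.gn.f444w.fits"),
    "GOODS-N-CONGRESS": ("spec_2d_GDN_F356W", "mpsf_jw035770.f356w.fits"),
    "GOODS-S-FRESCO": ("spec_2d_FRESCO_F444W", "mpsf_jw018950.gs.f444w.fits"),
}


def expected_paths(save_runs_path, output_name, source_id, field):
    """Return the paths this notebook describes (hypothetical helper)."""
    base = Path(save_runs_path)
    prefix, psf_name = FIELD_FILES[field]
    return {
        "grism": base / output_name / f"{prefix}_ID{source_id}_comb.fits",
        "morphology": base / "morph_fits" / f"summary_{source_id}_image_F150W_svi.cat",
        "psf": base / "psfs" / psf_name,
        "results": base / output_name / f"{source_id}_output",
    }


paths = expected_paths("/path/to/data/", "my_galaxy", 12345, "GOODS-N")
for kind, path in paths.items():
    print(f"{kind:>10}: {path}")
```

Printing the dictionary before a run is a quick way to confirm that `output_name`, `source_id`, and `field` point at the files you intend to fit.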

### Required Files

#### 1. Master Catalog File

An ASCII table containing source properties. Its path is passed via the `master_cat` parameter, and the file can be located anywhere.

Required columns:
- `ID`: Source identifier (must match your `source_id`)
- `zspec`: Spectroscopic redshift
- `<line>_lambda`: Observed wavelength of the emission line (e.g., `H_alpha_lambda` for H-alpha, whose rest-frame wavelength is 6562.8 Å)
- `fit_flux_cgs`: Log of integrated emission line flux (log erg/s/cm²)
- `fit_flux_cgs_e`: Error on log flux
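
A quick way to catch a missing column before starting a long fit is to check the catalog header. The sketch below assumes a whitespace-delimited header line that may start with `#`; the helper `missing_columns` is hypothetical and the header strings are made up for illustration.

```python
def missing_columns(header, line="H_alpha"):
    """Return required column names absent from a header line (hypothetical helper)."""
    required = ["ID", "zspec", f"{line}_lambda", "fit_flux_cgs", "fit_flux_cgs_e"]
    present = set(header.lstrip("#").split())
    return [name for name in required if name not in present]


# Made-up header lines, used only to exercise the check.
good_header = "# ID zspec H_alpha_lambda fit_flux_cgs fit_flux_cgs_e"
bad_header = "# ID zspec fit_flux_cgs"

print(missing_columns(good_header))  # → []
print(missing_columns(bad_header))   # → ['H_alpha_lambda', 'fit_flux_cgs_e']
```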

#### 2. Grism Spectrum FITS File

Located at: `<save_runs_path>/<output_name>/spec_2d_*_ID<source_id>_comb.fits`

File naming convention depends on the field:
- **GOODS-N**: `spec_2d_GDN_F444W_ID<source_id>_comb.fits`
- **GOODS-N-CONGRESS**: `spec_2d_GDN_F356W_ID<source_id>_comb.fits`
- **GOODS-S-FRESCO**: `spec_2d_FRESCO_F444W_ID<source_id>_comb.fits`

The FITS file should contain:
- Extension 0: 2D spectrum data (flux vs wavelength and spatial position)
- Extension 1: Error/uncertainty map
- WCS information for wavelength calibration

#### 3. PySersic Morphology File

Located at: `<save_runs_path>/morph_fits/summary_<source_id>_image_F150W_svi.cat` (or `F182M`)

ASCII catalog from [PySersic](https://github.com/astropath/pysersic) fits containing:
- Sersic index (n)
- Effective radius (r_eff)
- Position angle (PA)
- Axis ratio (q)
- Centroid positions (x0, y0)

These morphological parameters are used to set priors for the kinematic fitting.

#### 4. PSF Files

Located at: `<save_runs_path>/psfs/mpsf_*.fits`

Field-specific point spread function FITS files:
- `mpsf_jw018950.gn.f444w.fits` for GOODS-N
- `mpsf_jw035770.f356w.fits` for GOODS-N-CONGRESS  
- `mpsf_jw018950.gs.f444w.fits` for GOODS-S-FRESCO

The code automatically selects the appropriate PSF based on the `field` parameter.
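
Before launching a fit, it can help to confirm that every input file is in place. The following is a minimal sketch using only the standard library; `find_missing` is a hypothetical helper, and the demonstration runs against a throwaway directory rather than real data.

```python
import tempfile
from pathlib import Path


def find_missing(base_dir, relative_paths):
    """Return the relative paths that do not exist under base_dir (hypothetical helper)."""
    base = Path(base_dir)
    return [rel for rel in relative_paths if not (base / rel).exists()]


# Demonstrate on a temporary directory that contains only the PSF file.
with tempfile.TemporaryDirectory() as tmp:
    psf = Path(tmp) / "psfs" / "mpsf_jw018950.gn.f444w.fits"
    psf.parent.mkdir(parents=True)
    psf.touch()

    required = [
        "my_galaxy/spec_2d_GDN_F444W_ID12345_comb.fits",
        "morph_fits/summary_12345_image_F150W_svi.cat",
        "psfs/mpsf_jw018950.gn.f444w.fits",
    ]
    missing = find_missing(tmp, required)

print(missing)  # the grism spectrum and morphology catalog are reported as missing
```

In practice you would pass your own `save_runs_path` and the three input paths for your source and field.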

## Running the Fit

Once you have prepared all required files, running the fit is straightforward:

In [1]:
# Import required modules
from geko.fitting import run_geko_fit
from geko.config import FitConfiguration

# JAX configuration
import jax
jax.config.update('jax_enable_x64', True)

print("Imports successful!")

Imports successful!


### Basic Usage

In [2]:
# Define parameters
source_id = 191250                     # Source ID in your catalog
field = 'GOODS-S-FRESCO'               # Field name
output_name = 'my_galaxy'              # Name of output folder
master_catalog = '/Users/lola/ASTRO/JWST/grism_project/testing_geko_demo/catalogs/my_galaxies_cat'  # Path to master catalog
emission_line = 'H_alpha'              # Emission line name (e.g., matches `H_alpha_lambda` in the catalog)
parametric = True                      # Use parametric Sersic morphology
save_runs_path = '/Users/lola/ASTRO/JWST/grism_project/testing_geko_demo/'  # Base directory for inputs and results

# Optional parameters (with defaults)
grism_filter = 'F444W'                 # Grism filter
delta_wave_cutoff = 0.005              # Wavelength bin size (microns)
factor = 3                             # Spatial oversampling factor
wave_factor = 4                        # Wavelength oversampling factor
model_name = 'Disk'                    # Kinematic model type

# MCMC parameters (kept very small here so the demo runs quickly)
num_chains = 1                         # Number of MCMC chains
num_warmup = 5                         # Warmup iterations
num_samples = 20                       # Sampling iterations

In [2]:
# Run the fit (no config file needed!)
inference_data = run_geko_fit(
    output=output_name,
    master_cat=master_catalog,
    line=emission_line,
    parametric=parametric,
    save_runs_path=save_runs_path,
    num_chains=num_chains,
    num_warmup=num_warmup,
    num_samples=num_samples,
    source_id=source_id,               # NEW: Direct parameter
    field=field,                        # NEW: Direct parameter
    grism_filter=grism_filter,         # Optional
    delta_wave_cutoff=delta_wave_cutoff,  # Optional
    factor=factor,                      # Optional
    wave_factor=wave_factor,            # Optional
    model_name=model_name,              # Optional
    config=None                         # Optional: custom configuration
)

Cont sub done
Flipping map! (Mod B)
(15, 15)
Disk model created
Disk object created
Rotating the prior by  0.0  radians, from  2.964  radians to  2.964  radians
Setting parametric priors:  10.175309523224001 59.206811076802936 3.342 1.297 0.18970094750436658 15.165 15.0425


  kin_model.disk.set_parametric_priors(pysersic_summary, [int_flux, int_flux_err], z_spec, wavelength, delta_wave, theta_rot = theta_rot, shape = obs_map.shape[0])


MCMC settings: 1 chains, 20 samples, 5 warmup, max_tree_depth=10, target_accept=0.8
step size:  0.001
warmup:  5
samples:  20


sample: 100%|██████████| 25/25 [03:51<00:00,  9.26s/it, 127 steps of size 2.57e-02. acc. prob=0.60]


done

                          mean       std    median      5.0%     95.0%     n_eff     r_hat
         unscaled_PA     -1.97      0.86     -1.73     -3.08     -0.39      5.79      1.59
   unscaled_PA_morph     -0.03      0.76     -0.21     -0.90      1.12     10.80      0.95
         unscaled_Va      0.75      0.03      0.75      0.71      0.81     17.57      0.95
  unscaled_amplitude      0.19      0.05      0.19      0.14      0.30      8.15      0.99
          unscaled_i      1.67      1.20      1.56     -0.29      3.70     36.31      0.96
          unscaled_n     -0.86      0.11     -0.90     -0.92     -0.56     10.66      1.03
      unscaled_r_eff      0.86      0.06      0.87      0.79      0.98      8.37      0.96
        unscaled_r_t      0.82      0.11      0.85      0.59      0.94     10.76      0.95
     unscaled_sigma0      0.07      0.05      0.05      0.01      0.14      9.37      0.95
         unscaled_v0      0.66      0.90      0.41     -0.36      2.39      8.64    



Error in make_mask, trying with lower sigma_rms


<Figure size 1000x1000 with 0 Axes>

### Using Custom Configuration (Optional)

You can set custom priors using the `FitConfiguration` class. 

**Config contains:**
- **Morphology priors**: No defaults - must come from PySersic or manual specification
- **Kinematic priors**: Have defaults but can be overridden
- **MCMC settings**: Have defaults but can be overridden

**Scenario 1: You have PySersic fits (typical)**
- PySersic priors are loaded automatically for morphology
- You can optionally provide a config to override kinematic priors (Va, sigma0 ranges)
- Morphology stays from PySersic unless you explicitly set it in config

**Scenario 2: You don't have PySersic fits (rare)**
- You **must** provide a config with all morphology priors explicitly set
- An error is raised if any morphology prior is missing
- You can still override kinematic priors if desired

In [4]:
## Example 1: Override kinematic priors (you have PySersic fits)
# Only override the kinematic priors, keep PySersic morphology
from geko.config import FitConfiguration, KinematicPriors

config = FitConfiguration(
    kinematics=KinematicPriors(
        Va_min=50.0,        # Minimum asymptotic velocity (km/s)
        Va_max=300.0,       # Maximum asymptotic velocity (km/s)
        sigma0_min=10.0,    # Minimum velocity dispersion (km/s)
        sigma0_max=150.0    # Maximum velocity dispersion (km/s)
    )
)

# Morphology is None - will use PySersic values automatically
print("Kinematic priors will be overridden, morphology will come from PySersic")
config.print_summary()

# Run fit - PySersic morphology + custom kinematic priors
# inference_data_selective = run_geko_fit(
#     output=output_name,
#     master_cat=master_catalog,
#     line=emission_line,
#     parametric=parametric,
#     save_runs_path=save_runs_path,
#     num_chains=num_chains,
#     num_warmup=num_warmup,
#     num_samples=num_samples,
#     source_id=source_id,
#     field=field,
#     config=config  # Kinematic override only
# )

## Example 2: Complete config with morphology (no PySersic available)
# Set all morphology manually but keep the default kinematic priors
from geko.config import MorphologyPriors

config_full = FitConfiguration(
    morphology=MorphologyPriors(
        # Position angle (degrees) - normal prior
        PA_mean=90.0,
        PA_std=30.0,
        # Inclination (degrees) - truncated normal prior
        inc_mean=55.0,
        inc_std=15.0,
        # Effective radius (pixels) - truncated normal
        r_eff_mean=3.0,
        r_eff_std=1.0,
        r_eff_min=0.5,
        r_eff_max=10.0,
        # Sersic index - truncated normal
        n_mean=1.0,
        n_std=0.5,
        n_min=0.5,
        n_max=4.0,
        # Central coordinates (pixels) - normal
        xc_mean=0.0,
        xc_std=2.0,
        yc_mean=0.0,
        yc_std=2.0,
        # Amplitude - truncated normal
        amplitude_mean=100.0,
        amplitude_std=50.0,
        amplitude_min=1.0,
        amplitude_max=1000.0
    )
)

print("\nComplete config set - can run without PySersic file")
config_full.print_summary()

# This works even without a PySersic file
# Run fit - manual morphology priors + default kinematic priors
inference_data_full = run_geko_fit(
    output=output_name,
    master_cat=master_catalog,
    line=emission_line,
    parametric=parametric,
    save_runs_path=save_runs_path,
    num_chains=num_chains,
    num_warmup=num_warmup,
    num_samples=num_samples,
    source_id=source_id,
    field=field,
    config=config_full  # Manual morphology, default kinematic priors
)

Kinematic priors will be overridden, morphology will come from PySersic
Geko Configuration Summary

Morphology Priors:
  Not set (will use PySersic priors)

Kinematic Priors:
  Va_min: 50.0 (default: -1000)
  Va_max: 300.0 (default: 1000)
  sigma0_min: 10.0 (default: 0)
  sigma0_max: 150.0 (default: 500.0)

MCMC Settings:
  num_chains: 4
  num_warmup: 500
  num_samples: 1000
  target_accept_prob: 0.8
  max_tree_depth: 10
  step_size: 0.1

Complete config set - can run without PySersic file
Geko Configuration Summary

Morphology Priors:
  PA_mean: 90.0
  PA_std: 30.0
  inc_mean: 55.0
  inc_std: 15.0
  r_eff_mean: 3.0
  r_eff_std: 1.0
  r_eff_min: 0.5
  r_eff_max: 10.0
  n_mean: 1.0
  n_std: 0.5
  n_min: 0.5
  n_max: 4.0
  xc_mean: 0.0
  xc_std: 2.0
  yc_mean: 0.0
  yc_std: 2.0
  amplitude_mean: 100.0
  amplitude_std: 50.0
  amplitude_min: 1.0
  amplitude_max: 1000.0

Kinematic Priors:
  Va_min: -1000
  Va_max: 1000
  sigma0_min: 0
  sigma0_max: 500.0

MCMC Settings:
  num_chains: 4
  nu

  kin_model.disk.set_parametric_priors(pysersic_summary, [int_flux, int_flux_err], z_spec, wavelength, delta_wave, theta_rot = theta_rot, shape = obs_map.shape[0])


MCMC settings: 1 chains, 20 samples, 5 warmup, max_tree_depth=10, target_accept=0.8
step size:  0.1
warmup:  5
samples:  20


sample:  44%|████▍     | 11/25 [05:20<21:58, 94.15s/it, 1023 steps of size 2.30e-03. acc. prob=1.00]

## What Happens During the Fit

The `run_geko_fit` function automatically:

1. **Preprocessing** (`run_full_preprocessing`):
   - Loads 2D grism spectrum and error maps
   - Loads PSF for the appropriate field
   - Creates wavelength grid
   - Initializes `Grism` object for dispersion modeling
   - Initializes kinematic model (e.g., `DiskModel`)

2. **Prior Setup** (if `parametric=True`):
   - Loads PySersic morphology results from `morph_fits/` directory
   - Extracts integrated emission line flux from master catalog
   - Sets morphological priors (position angle, inclination, etc.) based on imaging data
   - Applies field-specific rotation corrections to align coordinate systems

3. **MCMC Sampling**:
   - Creates `Fit_Numpyro` instance with observation data
   - Runs NUTS (No-U-Turn Sampler) with specified chains, warmup, and samples
   - Applies custom configuration if provided via `config` parameter
   - Creates source mask to focus on high S/N regions

4. **Postprocessing** (`process_results`):
   - Computes best-fit kinematic model
   - Calculates velocity at effective radius (v_re)
   - Generates diagnostic plots and summary statistics
   - Saves fit results and plots to output directory

5. **Output**:
   - Returns `arviz.InferenceData` object with posterior samples
   - Saves MCMC results to `<save_runs_path>/<output_name>/<source_id>_output`
   - Saves plots and summary tables to the same directory
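
To illustrate what "velocity at effective radius" means in step 4, here is a minimal sketch that assumes an arctangent rotation curve, V(r) = (2/π) · Va · arctan(r / r_t). This form is a common choice that is consistent with the `Va` and `r_t` parameters in the fit summary, but it is an assumption here; `geko`'s actual parameterization may differ. The function name and the input values are illustrative only.

```python
import math


def arctan_velocity(r, v_a, r_t):
    """Rotation velocity at radius r for an arctangent curve (assumed model form)."""
    return (2.0 / math.pi) * v_a * math.atan(r / r_t)


# Illustrative values, not taken from a real fit:
# asymptotic velocity 200 km/s, turnover radius equal to the effective radius.
r_eff = 3.0   # pixels
r_t = 3.0     # pixels
v_a = 200.0   # km/s

v_re = arctan_velocity(r_eff, v_a, r_t)
print(f"v_re = {v_re:.1f} km/s")  # → v_re = 100.0 km/s
```

When `r_eff` equals `r_t`, the arctangent term is π/4, so the velocity is exactly half the asymptotic value.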

## Analyzing Results

After the fit completes, you can analyze the results:

In [None]:
import arviz as az
import matplotlib.pyplot as plt

# Summary statistics
print(az.summary(inference_data))

# Trace plots
az.plot_trace(inference_data)
plt.tight_layout()
plt.show()

# Corner plot (corner >= 2.2 accepts an arviz.InferenceData object directly)
import corner
corner.corner(inference_data)
plt.show()

## Common Emission Lines

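The demo above passes the `line` parameter by name (`'H_alpha'`), matching the `<line>_lambda` column in the master catalog. For reference, the rest-frame wavelengths of common lines in Angstroms are: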

- **H-alpha**: 6562.8 Å
- **H-beta**: 4861.3 Å  
- **[OIII]**: 5006.8 Å
- **[OII]**: 3727.0 Å

The code will automatically calculate the observed wavelength based on the redshift in your master catalog.
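
The redshift relation behind that calculation is λ_obs = λ_rest · (1 + z). The sketch below applies it to the wavelengths listed above; the dictionary keys other than `H_alpha` and the function name are illustrative and are not taken from `geko`.

```python
# Rest-frame wavelengths in Angstroms, as listed in this notebook.
# Only the 'H_alpha' key is used in the demo; the other keys are illustrative.
REST_WAVELENGTHS_A = {
    "H_alpha": 6562.8,
    "H_beta": 4861.3,
    "OIII": 5006.8,
    "OII": 3727.0,
}


def observed_wavelength_micron(line, z):
    """Observed wavelength in microns for a line at redshift z (hypothetical helper)."""
    return REST_WAVELENGTHS_A[line] * (1.0 + z) / 1.0e4


lam = observed_wavelength_micron("H_alpha", 5.0)
print(f"{lam:.4f} micron")  # → 3.9377 micron
```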

## Tips for Success

1. **Check your data quality**: Ensure grism spectra have good S/N
2. **Set appropriate priors**: Use photometric measurements to constrain morphology
3. **Run sufficient samples**: Typically 1000-5000 samples after warmup
4. **Monitor convergence**: Check R-hat values < 1.01
5. **Validate results**: Inspect model residuals and posterior distributions
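
The demo run above used only 5 warmup and 20 sampling iterations and reported an r_hat of 1.59 for `unscaled_PA`, which is far from converged. To show what R-hat measures, here is a minimal standard-library sketch of the classic (non-split) Gelman-Rubin statistic on synthetic chains. ArviZ and NumPyro report more robust split variants, so treat this as an illustration of the idea rather than a replacement for their diagnostics.

```python
import random
import statistics


def gelman_rubin(chains):
    """Classic (non-split) Gelman-Rubin R-hat for equal-length chains of one parameter."""
    n = len(chains[0])
    # Mean of the within-chain sample variances.
    within = statistics.mean(statistics.variance(c) for c in chains)
    # n times the sample variance of the chain means.
    between = n * statistics.variance([statistics.mean(c) for c in chains])
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5


rng = random.Random(0)
# Four synthetic chains drawn from the same distribution: R-hat should be near 1.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(4)]
# Four synthetic chains centered at different locations: R-hat should be well above 1.
stuck = [[rng.gauss(shift, 1.0) for _ in range(1000)] for shift in (0.0, 1.0, 2.0, 3.0)]

print(f"mixed chains: R-hat = {gelman_rubin(mixed):.3f}")
print(f"stuck chains: R-hat = {gelman_rubin(stuck):.3f}")
```

This statistic compares chains against each other, so it needs at least two chains; running several chains (rather than `num_chains = 1` as in the quick demo) makes the diagnostic more informative.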