# Geko Fitting Demo: Simplified Usage

This notebook demonstrates how to use `geko` to fit JWST grism spectroscopy data **without needing a YAML config file**.

## Overview

`geko` is a Python package for analyzing JWST grism spectroscopy and morphology data. It uses:
- **JAX** for accelerated numerical computation
- **NumPyro** for Bayesian inference via MCMC
- **Kinematic models** to fit galaxy rotation curves

## Required Data Structure

**IMPORTANT**: All data files should be placed in a single base directory (specified by `save_runs_path`). By default, this is `fitting_results/`, but you can use any directory.

### Directory Structure

```
<save_runs_path>/                     # Base directory (e.g., 'fitting_results/' or 'data/')
├── <output_name>/                    # Subfolder for your galaxy
│   └── spec_2d_*_ID<ID>_comb.fits    # 2D grism spectrum (auto-located based on field)
├── morph_fits/                       # Morphology results directory
│   └── summary_<ID>_image_F150W_svi.cat  # Sersic fits (F150W or F182M)
├── psfs/                             # PSF directory
│   ├── mpsf_jw018950.gn.f444w.fits   # PSF for GOODS-N
│   ├── mpsf_jw035770.f356w.fits      # PSF for GOODS-N-CONGRESS
│   └── mpsf_jw018950.gs.f444w.fits   # PSF for GOODS-S-FRESCO
└── <master_catalog>.cat              # Master catalog (can be anywhere, specified separately)
```

**Example**: If you set `save_runs_path='my_data/'`, `output_name='my_galaxy'`, and `source_id=12345`, the code will look for:
- `my_data/my_galaxy/spec_2d_GDN_F444W_ID12345_comb.fits` (for GOODS-N)
- `my_data/morph_fits/summary_12345_image_F150W_svi.cat`
- `my_data/psfs/mpsf_jw018950.gn.f444w.fits`

**Note**: No config file needed! All parameters are set directly in Python.

### Required Files

#### 1. Master Catalog File

Path specified as `master_cat` parameter (can be anywhere on your system).

An ASCII table with columns:
- `ID`: Source identifier (must match your `source_id`)
- `zspec`: Spectroscopic redshift
- `<line>_lambda`: Observed wavelength of emission line (e.g., `6562_lambda` for H-alpha)
- `fit_flux_cgs`: Log of integrated emission line flux (log erg/s/cm²)
- `fit_flux_cgs_e`: Error on log flux
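
Before launching a fit, it can help to confirm that the catalog header contains the columns listed above. The sketch below is not part of `geko`: the helper names `missing_columns` and `linear_flux` are hypothetical, the header is assumed to be a single whitespace-delimited line (optionally starting with `#`), and "log" is assumed to mean base 10.

```python
import math

# Columns named in the list above; the line column is built from a prefix such as "6562".
REQUIRED = ("ID", "zspec", "fit_flux_cgs", "fit_flux_cgs_e")


def missing_columns(header_line, line_prefix="6562"):
    """Return the required column names that are absent from a header line."""
    names = header_line.lstrip("#").split()
    wanted = REQUIRED + (f"{line_prefix}_lambda",)
    return [name for name in wanted if name not in names]


def linear_flux(log_flux, log_flux_err):
    """Convert a log10 flux and its error to linear units (first-order propagation)."""
    flux = 10.0 ** log_flux
    return flux, math.log(10.0) * flux * log_flux_err


header = "# ID zspec 6562_lambda fit_flux_cgs fit_flux_cgs_e"
print(missing_columns(header))   # an empty list means nothing is missing
print(linear_flux(-17.0, 0.05))  # hypothetical catalog values
```

If your catalog uses a different delimiter or a multi-line header, adapt the parsing step accordingly.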

#### 2. Grism Spectrum FITS File

Located at: `<save_runs_path>/<output_name>/spec_2d_*_ID<source_id>_comb.fits`

File naming depends on the field:
- **GOODS-N**: `spec_2d_GDN_F444W_ID<source_id>_comb.fits`
- **GOODS-N-CONGRESS**: `spec_2d_GDN_F356W_ID<source_id>_comb.fits`
- **GOODS-S-FRESCO**: `spec_2d_FRESCO_F444W_ID<source_id>_comb.fits`

The FITS file should contain:
- Extension with 2D spectrum data
- Extension with error/uncertainty map
- WCS information for wavelength calibration

#### 3. PySersic Morphology File

Located at: `<save_runs_path>/morph_fits/summary_<source_id>_image_F150W_svi.cat` (or F182M)

ASCII catalog from PySersic fits containing columns:
- Sersic profile parameters (n, r_eff, etc.)
- Position angle
- Axis ratio
- Centroid positions

#### 4. PSF Files

Located at: `<save_runs_path>/psfs/mpsf_*.fits`

Field-specific PSF FITS files, selected automatically based on the `field` parameter.
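
The file locations described in this section can be checked before running a fit. The following is a minimal sketch, not a `geko` function: `expected_inputs` is a hypothetical helper, and the two lookup tables simply restate the per-field file names listed above. The morphology filter defaults to F150W and can be switched to F182M.

```python
from pathlib import Path

# Per-field file name pieces, copied from the lists in this section.
SPEC_PREFIX = {
    "GOODS-N": "GDN_F444W",
    "GOODS-N-CONGRESS": "GDN_F356W",
    "GOODS-S-FRESCO": "FRESCO_F444W",
}
PSF_FILE = {
    "GOODS-N": "mpsf_jw018950.gn.f444w.fits",
    "GOODS-N-CONGRESS": "mpsf_jw035770.f356w.fits",
    "GOODS-S-FRESCO": "mpsf_jw018950.gs.f444w.fits",
}


def expected_inputs(save_runs_path, output_name, source_id, field, morph_filter="F150W"):
    """Build the input paths that the directory layout above implies."""
    base = Path(save_runs_path)
    spec_name = f"spec_2d_{SPEC_PREFIX[field]}_ID{source_id}_comb.fits"
    morph_name = f"summary_{source_id}_image_{morph_filter}_svi.cat"
    return {
        "spectrum": base / output_name / spec_name,
        "morphology": base / "morph_fits" / morph_name,
        "psf": base / "psfs" / PSF_FILE[field],
    }


paths = expected_inputs("my_data/", "my_galaxy", 12345, "GOODS-N")
for label, path in paths.items():
    status = "found" if path.exists() else "MISSING"
    print(f"{label:>10}: {path} [{status}]")
```

An unknown field name raises a `KeyError`, which is a quick way to catch a typo in `field`.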

## Running the Fit

Once you have prepared all required files, running the fit is straightforward:

In [3]:
# Import required modules
from geko.fitting import run_geko_fit
from geko.config import FitConfiguration

# JAX configuration
import jax
jax.config.update('jax_enable_x64', True)

print("Imports successful!")

Imports successful!


### Basic Usage

In [10]:
# Define parameters
source_id = 191250                     # Source ID in your catalog
field = 'GOODS-S-FRESCO'               # Field name
output_name = 'my_galaxy'              # Name of output folder
master_catalog = '/Users/lola/ASTRO/JWST/grism_project/catalogs/fresco_Ha_cat'  # Path to master catalog
emission_line = 'H_alpha'              # Emission line to fit (H-alpha here)
parametric = True                      # Use parametric Sersic morphology
save_runs_path = '/Users/lola/ASTRO/JWST/grism_project/testing_geko_demo/'    # Base directory for input data and results

# Optional parameters (with defaults)
grism_filter = 'F444W'                 # Grism filter
delta_wave_cutoff = 0.005              # Wavelength bin size (microns)
factor = 5                             # Spatial oversampling factor
wave_factor = 10                       # Wavelength oversampling factor
model_name = 'Disk'                    # Kinematic model type

# MCMC parameters
num_chains = 2                         # Number of MCMC chains
num_warmup = 500                       # Warmup iterations
num_samples = 1000                     # Sampling iterations

# Run the fit (no config file needed!)
inference_data = run_geko_fit(
    output=output_name,
    master_cat=master_catalog,
    line=emission_line,
    parametric=parametric,
    save_runs_path=save_runs_path,
    num_chains=num_chains,
    num_warmup=num_warmup,
    num_samples=num_samples,
    source_id=source_id,               # NEW: Direct parameter
    field=field,                        # NEW: Direct parameter
    grism_filter=grism_filter,         # Optional
    delta_wave_cutoff=delta_wave_cutoff,  # Optional
    factor=factor,                      # Optional
    wave_factor=wave_factor,            # Optional
    model_name=model_name,              # Optional
    config=None                         # Optional: custom configuration
)

Using direct parameters (no config file)
Cont sub done
Flipping map! (Mod B)
(15, 15)
Disk model created
Disk object created
Rotating the prior by  0.0  radians, from  2.964  radians to  2.964  radians
Setting parametric priors:  10.175309523224001 59.206811076802936 3.342 1.297 0.18970094750436658 15.165 15.0425




MCMC settings: 2 chains, 1000 samples, 500 warmup, max_tree_depth=10, target_accept=0.8
step size:  0.001
warmup:  500
samples:  1000




KeyboardInterrupt: 

### Using Custom Configuration (Optional)

You can override default priors using the `FitConfiguration` class:

In [None]:
# Create custom configuration
config = FitConfiguration()

# Set custom priors for specific parameters
config.set_prior('Va', prior_type='uniform', bounds=[50, 300])
config.set_prior('r_t', prior_type='uniform', bounds=[0.5, 5.0])
config.set_prior('sigma0', prior_type='uniform', bounds=[10, 150])

# Print configuration summary
config.print_summary()

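# Run fit with custom config (reuses the variables defined in Basic Usage)
inference_data = run_geko_fit(
    output=output_name,
    master_cat=master_catalog,
    line=emission_line,
    parametric=parametric,
    save_runs_path=save_runs_path,
    num_chains=num_chains,
    num_warmup=num_warmup,
    num_samples=num_samples,
    source_id=source_id,  # Needed when no config file is used
    field=field,          # Needed when no config file is used
    config=config         # Use custom configuration
)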

## What Happens During the Fit

The `run_geko_fit` function automatically:

1. **Preprocessing** (`run_full_preprocessing`):
   - Loads 2D grism spectrum and error maps
   - Loads PSF for the appropriate field
   - Creates wavelength grid
   - Initializes `Grism` object for dispersion modeling
   - Initializes kinematic model (e.g., `DiskModel`)

2. **Prior Setup** (if `parametric=True`):
   - Loads PySersic morphology results
   - Extracts integrated emission line flux from master catalog
   - Sets morphological priors based on imaging data

3. **MCMC Sampling**:
   - Creates `Fit_Numpyro` instance
   - Runs NUTS sampler with specified chains, warmup, and samples
   - Applies custom configuration if provided

4. **Output**:
   - Returns `arviz.InferenceData` object with posterior samples
   - Saves results to `<save_runs_path>/<source_id>_output`

## Analyzing Results

After the fit completes, you can analyze the results:

In [None]:
import arviz as az
import matplotlib.pyplot as plt

# Summary statistics
print(az.summary(inference_data))

# Trace plots
az.plot_trace(inference_data)
plt.tight_layout()
plt.show()

# Corner plot
import corner
# corner.corner accepts an arviz.InferenceData object directly;
# pass var_names=[...] to restrict the plot if the posterior has many parameters
corner.corner(inference_data)
plt.show()

## Common Emission Lines

The Basic Usage example selects the line by name (`line='H_alpha'`). For reference, the rest-frame wavelengths of common lines, in Angstroms, are:

- **H-alpha**: 6562.8 Å
- **H-beta**: 4861.3 Å  
- **[OIII]**: 5006.8 Å
- **[OII]**: 3727.0 Å

The code will automatically calculate the observed wavelength based on the redshift in your master catalog.
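
The conversion from rest-frame to observed wavelength is the standard redshift relation, λ_obs = λ_rest × (1 + z). As an illustration only (this is not `geko` code, and the redshift below is a made-up value), the following sketch computes the observed wavelength in microns for the lines listed above, which is handy for checking that a line falls inside the grism filter you intend to use.

```python
# Rest-frame wavelengths in Angstroms, taken from the list above.
REST_ANGSTROM = {
    "H-alpha": 6562.8,
    "H-beta": 4861.3,
    "[OIII]": 5006.8,
    "[OII]": 3727.0,
}


def observed_micron(rest_angstrom, z):
    """Observed wavelength in microns for a rest wavelength in Angstroms."""
    return rest_angstrom * (1.0 + z) * 1e-4  # 1 Angstrom = 1e-4 micron


z = 5.0  # hypothetical redshift, not taken from any catalog
for name, rest in REST_ANGSTROM.items():
    print(f"{name:8s} {observed_micron(rest, z):.4f} micron")
```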

## Tips for Success

1. **Check your data quality**: Ensure grism spectra have good S/N
2. **Set appropriate priors**: Use photometric measurements to constrain morphology
3. **Run sufficient samples**: Typically 1000-5000 samples after warmup
4. **Monitor convergence**: Check that R-hat values are below 1.01 for all parameters
5. **Validate results**: Inspect model residuals and posterior distributions
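
To make the R-hat criterion in tip 4 concrete, here is a minimal sketch of the classic Gelman-Rubin statistic for a single parameter, written with the standard library only. It is an illustration under simplifying assumptions: the function name `gelman_rubin` is hypothetical, and ArviZ's `az.rhat` uses a rank-normalized split variant by default, so its values will differ somewhat from this simpler formula. For real fits, prefer the `r_hat` column of `az.summary(inference_data)`.

```python
import math
from statistics import mean, variance


def gelman_rubin(chains):
    """Classic (non-split) R-hat for one parameter; chains are equal-length lists."""
    n = len(chains[0])
    within = mean([variance(chain) for chain in chains])           # W: mean within-chain variance
    between = n * variance([mean(chain) for chain in chains])      # B: scaled variance of chain means
    pooled = (n - 1) / n * within + between / n                    # pooled variance estimate
    return math.sqrt(pooled / within)


# Two chains exploring the same region versus two chains stuck in different regions.
agreeing = [[1.0, 2.0, 3.0, 4.0, 5.0], [1.0, 2.0, 3.0, 4.0, 5.0]]
disagreeing = [[1.0, 2.0, 3.0, 4.0, 5.0], [11.0, 12.0, 13.0, 14.0, 15.0]]

print(gelman_rubin(agreeing))
print(gelman_rubin(disagreeing))
```

Chains that disagree about the parameter's location inflate the between-chain term and push R-hat well above 1, which is the signal to run longer warmup or revisit the priors.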