# Creating a Feature Array

The first step in training a model with Synference is to create a feature array from the generated model library. This feature array will be used as the training data for the inference model.

This can include any of the following:
- Normalization: Scaling the features to a common range.
- Noise modelling: Adding realistic noise to the features to simulate observational conditions.
- Customising features: Selecting specific features or combinations of features that are most relevant for the inference task.
- Adding new features: E.g. colors, supplementary parameters, or moving parameters from being inferred to being part of the features.
- Simulating missing data in the training set. 
- Transforming parameters or features to improve model performance.

First, let's load the library and create a basic feature array.

In [5]:
from synference import SBI_Fitter

fitter = SBI_Fitter.init_from_hdf5(model_name="test",
                                   hdf5_path="../example_grids/test_model_grid.hdf5")

The ```create_feature_array``` method exposes some of the basic functionality for creating a feature array, but more complex feature array creation can be achieved with the ```create_feature_array_from_raw_photometry``` method.


In [2]:
?fitter.create_feature_array

Signature:
fitter.create_feature_array(
    flux_units: str = 'AB',
    extra_features: list = None,
    **kwargs,
)
Docstring:
Create a feature array from the raw observation grid.

A simpler wrapper for
`create_feature_array_from_raw_photometry` with default values.
This function will create a feature array from the raw observation grid
with no noise, and all photometry in mock catalogue used.
File:      ~/Documents/PhD/synference/src/synference/sbi_runner.py
Type:      method

The default call therefore creates a feature array of noiseless fluxes in AB magnitudes for all filters in the library, and the parameters to be inferred are the full set of model parameters.

In [4]:
fitter.observation_type

In [3]:
fitter.create_feature_array();

ValueError: Observation type None not supported. Please use 'photometry' or 'spectra'.

In [None]:
fitter.plot_histogram_feature_array();

There are some easy changes we can try out here. First, let's try changing the flux unit to log10 nJy.

In [None]:
fitter.create_feature_array(flux_units="log10 nJy")
fitter.plot_histogram_feature_array();
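
To make the unit change concrete, here is a minimal standalone sketch of the standard conversion between AB magnitudes, nJy, and log10 nJy. It does not call Synference; the helper names are hypothetical and only illustrate the arithmetic, using the AB zero point of 3631 Jy.

```python
import math

# AB zero point: 3631 Jy, expressed in nJy (1 Jy = 1e9 nJy).
AB_ZERO_POINT_NJY = 3631e9


def ab_mag_to_njy(mag):
    """Convert an AB magnitude to a flux density in nJy."""
    return AB_ZERO_POINT_NJY * 10 ** (-0.4 * mag)


def njy_to_ab_mag(flux_njy):
    """Convert a positive flux density in nJy to an AB magnitude."""
    return -2.5 * math.log10(flux_njy / AB_ZERO_POINT_NJY)


def njy_to_log10_njy(flux_njy):
    """Take the base-10 logarithm of a positive flux density in nJy."""
    return math.log10(flux_njy)


# Hypothetical example magnitudes (not taken from the test grid).
mags = [25.0, 28.0, 30.0]
fluxes = [ab_mag_to_njy(m) for m in mags]
log_fluxes = [njy_to_log10_njy(f) for f in fluxes]

for m, f, lf in zip(mags, fluxes, log_fluxes):
    print(f"AB = {m:.1f}  flux = {f:.3f} nJy  log10(flux/nJy) = {lf:.3f}")
```

Both AB magnitudes and log10 nJy are logarithmic in flux, so they differ only by a linear rescaling; neither is defined for zero or negative fluxes.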

We can also specify the flux unit as a unyt unit by passing in ```flux_units=nJy```.

In [None]:
from unyt import nJy

fitter.create_feature_array(flux_units=nJy);

Finally, we can use asinh magnitudes, which require a softening parameter. Here we will use 1 nJy.

In [None]:
fitter.create_feature_array(flux_units="asinh", asinh_softening_parameters=1 * nJy);
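
For intuition about what the softening parameter does, the sketch below implements the asinh magnitude definition of Lupton et al. (1999) on an AB-like scale. This is an assumption about the general form only: Synference's own implementation may differ in detail, and the function names here are hypothetical.

```python
import math

# AB zero point: 3631 Jy, expressed in nJy.
AB_ZERO_POINT_NJY = 3631e9


def ab_mag(flux_njy):
    """Classical AB magnitude; only defined for positive fluxes."""
    return -2.5 * math.log10(flux_njy / AB_ZERO_POINT_NJY)


def asinh_mag(flux_njy, softening_njy):
    """Asinh magnitude with softening flux `softening_njy` (in nJy).

    Approaches the AB magnitude when flux >> softening, and stays finite
    for zero or negative fluxes.
    """
    scale = 2.5 / math.log(10)
    return -scale * (
        math.asinh(flux_njy / (2.0 * softening_njy))
        + math.log(softening_njy / AB_ZERO_POINT_NJY)
    )


# Bright, faint, zero and negative fluxes with a 1 nJy softening.
for flux in [1000.0, 1.0, 0.0, -5.0]:
    print(flux, asinh_mag(flux, softening_njy=1.0))
```

The softening parameter sets the flux scale below which the magnitude becomes approximately linear in flux, which is why noisy non-detections can still be used as features.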

We can also add extra features, either taken from the parameter array or computed as colors, and remove features such as the fluxes in certain filters. Here we will add 'redshift' and the 'F444W-F356W' color, and remove the F090W filter.


In [None]:
fitter.create_feature_array(
    extra_features=["redshift", "F444W-F356W"], photometry_to_remove=["JWST/NIRCam.F090W"]
);
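
The feature bookkeeping in the call above can be sketched with plain Python. The toy values, the `build_features` helper, and the convention that a color 'A-B' is the magnitude in A minus the magnitude in B are all assumptions made for illustration, not Synference's confirmed behaviour.

```python
# Hypothetical toy table: AB magnitudes per filter for two model galaxies.
photometry = {
    "JWST/NIRCam.F090W": [27.1, 26.4],
    "JWST/NIRCam.F356W": [26.0, 25.9],
    "JWST/NIRCam.F444W": [25.8, 26.1],
}
# Hypothetical parameter table for the same two galaxies.
parameters = {"redshift": [6.5, 7.2]}


def build_features(photometry, parameters, colors, extra_parameters, remove):
    """Assemble named feature columns from photometry and parameters."""
    # Keep every filter that was not explicitly removed.
    features = {k: list(v) for k, v in photometry.items() if k not in remove}
    # A color is the difference of two magnitudes (assumed order: first - second).
    for first, second in colors:
        name = f"{first.split('.')[-1]}-{second.split('.')[-1]}"
        features[name] = [a - b for a, b in zip(photometry[first], photometry[second])]
    # Move selected parameters into the feature set.
    for name in extra_parameters:
        features[name] = list(parameters[name])
    return features


features = build_features(
    photometry,
    parameters,
    colors=[("JWST/NIRCam.F444W", "JWST/NIRCam.F356W")],
    extra_parameters=["redshift"],
    remove=["JWST/NIRCam.F090W"],
)
print(list(features))
```

A parameter that is moved into the features, such as redshift here, is then treated as known input rather than something to be inferred.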

Since most SED fitting parameters are sensitive only to colors rather than absolute fluxes, we may wish to normalise the fluxes in some way. Here we will normalise the fluxes to the F200W filter.

In [None]:
fitter.create_feature_array(normalize_method="JWST/NIRCam.F200W");
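
To illustrate why normalising to a reference filter removes the dependence on overall brightness, here is a small standalone sketch. The `normalize_to_band` helper and the choice to drop the reference column afterwards are assumptions for illustration; Synference may handle the reference band differently.

```python
# Hypothetical linear fluxes (nJy) for two galaxies with the same SED shape;
# the second is ten times brighter than the first.
fluxes = {
    "JWST/NIRCam.F150W": [10.0, 100.0],
    "JWST/NIRCam.F200W": [20.0, 200.0],
    "JWST/NIRCam.F444W": [40.0, 400.0],
}


def normalize_to_band(fluxes, reference_key):
    """Divide every band by the reference band, object by object.

    The reference band itself is dropped because it would be 1 everywhere
    (an assumption of this sketch).
    """
    reference = fluxes[reference_key]
    return {
        key: [f / r for f, r in zip(values, reference)]
        for key, values in fluxes.items()
        if key != reference_key
    }


normalized = normalize_to_band(fluxes, "JWST/NIRCam.F200W")
print(normalized)
```

After normalisation both galaxies have identical features, so only the SED shape remains. In magnitude units the same operation is a subtraction of the reference magnitude.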

## Modelling Noise

We can apply a simple scatter model to the feature array to simulate observational noise. 

In [None]:
depths = 3 * nJy  # 5 sigma depth of 30.2 AB magnitudes

fitter.create_feature_array(scatter_fluxes=True, depths=depths)
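
As a rough picture of what a simple depth-based scatter model does, the sketch below perturbs fluxes with Gaussian noise whose standard deviation is the 5 sigma depth divided by 5. The `scatter_fluxes` function here is a hypothetical standalone helper, not the Synference implementation, and whether the per-band uncertainty is also appended as a feature is left open.

```python
import random


def scatter_fluxes(fluxes_njy, depth_njy, n_sigma=5.0, seed=None):
    """Add Gaussian noise to fluxes given an n-sigma limiting depth.

    Returns the noisy fluxes and the 1-sigma uncertainty for each flux.
    """
    rng = random.Random(seed)
    sigma = depth_njy / n_sigma  # 1-sigma noise level in nJy
    noisy = [flux + rng.gauss(0.0, sigma) for flux in fluxes_njy]
    errors = [sigma] * len(fluxes_njy)
    return noisy, errors


# Hypothetical noiseless fluxes in nJy, scattered with a 3 nJy (5 sigma) depth.
true_fluxes = [50.0, 5.0, 0.5]
noisy_fluxes, flux_errors = scatter_fluxes(true_fluxes, depth_njy=3.0, seed=42)

for true, noisy, err in zip(true_fluxes, noisy_fluxes, flux_errors):
    print(f"true = {true:6.2f}  noisy = {noisy:6.2f}  sigma = {err:.2f} nJy")
```

Faint sources can scatter to negative fluxes under this model, which is one reason asinh magnitudes or linear flux units are useful once noise is included.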

## Missing Fluxes

## Changing the Parameter Array

We can also change the parameter array, which is created automatically when we create the feature array. Here we will infer only 'stellar_mass' by removing all of the other parameters.

In [None]:
fitter.create_feature_array(
    parameters_to_remove=["redshift", "tau_v", "tau", "peak_age", "log10metallicity"]
)

fitter.plot_histogram_parameter_array();