# Spectral simulations with gammapy

## Studying systematic effects on spectral parameters

### Objective

**Quantify the systematic errors on spectral parameters caused by a possible absolute energy scale bias using simulations.**

### Steps

* Create a fake observation of the Crab using the CTA alpha configuration and build a 1D spectrum dataset for this observation
* Simulate a log-parabola source spectrum many times with ON-OFF statistics, fit each realization, and measure the expected statistical uncertainty
* Build a custom spectral model that accounts for a possible systematic bias in the absolute energy scale
* Repeat the simulations with the energy scale bias included and measure the systematic uncertainty it introduces on the fitted spectral parameters
* Do the same for an uncertainty on the alpha_onoff parameter (the ratio of ON to OFF acceptance)

In [None]:
import numpy as np
from astropy.coordinates import SkyCoord
from astropy.table import Table
import astropy.units as u
from gammapy.data import Observation, observatory_locations, FixedPointingInfo, PointingMode
from gammapy.irf import load_irf_dict_from_file
from gammapy.maps import MapAxis, Map, RegionGeom
from gammapy.datasets import SpectrumDataset, Datasets, SpectrumDatasetOnOff
from gammapy.makers import SpectrumDatasetMaker, SafeMaskMaker
from gammapy.modeling import Fit
from gammapy.modeling.models import SkyModel, LogParabolaSpectralModel
import matplotlib.pyplot as plt 

In [None]:
irfs = load_irf_dict_from_file("$GAMMAPY_DATA/cta-caldb/Prod5-South-20deg-AverageAz-14MSTs37SSTs.180000s-v0.1.fits.gz")

#### Define pointing positions

Here we use a wobble position around the Crab.

In [None]:
target = SkyCoord(83.6333, 22.0133, unit="deg", frame="icrs")

pointing_position = target.directional_offset_by(90*u.deg, 1*u.deg)
pointing = XX
print(pointing)

We assume a single 3-hour observation (for simplicity we don't create a list of shorter runs).

In [None]:
livetime = 3 * u.h
location = observatory_locations["cta_south"]

obs = XX

In [None]:
print(obs)

#### Defining the reduced dataset geometry

We prepare the 1D spectrum geometry. We need to provide the binning in reconstructed and true energy for the counts, the OFF counts and the IRFs.

We take a circular region of radius 0.1 degree around the Crab nebula.

In [None]:
energy = MapAxis.from_energy_bounds(0.05, 100, 5, unit='TeV', per_decade=True)
energy_true = XX

geom = RegionGeom.create("icrs;circle(83.633, 22.014, 0.1)", axes=[energy])
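
To see what a "5 bins per decade" binning between 0.05 and 100 TeV looks like, here is a minimal numpy sketch. The helper `log_edges_per_decade` is hypothetical and not part of gammapy; rounding the fractional bin count up is an assumption, so check the `MapAxis.from_energy_bounds` documentation for the exact behaviour.

```python
import numpy as np


def log_edges_per_decade(e_min, e_max, bins_per_decade):
    """Return log-spaced bin edges with about `bins_per_decade` bins per decade."""
    n_decades = np.log10(e_max / e_min)
    # Assumption: a fractional number of bins is rounded up
    n_bins = int(np.ceil(n_decades * bins_per_decade))
    return np.geomspace(e_min, e_max, n_bins + 1)


edges = log_edges_per_decade(0.05, 100.0, 5)  # energies in TeV
n_bins = len(edges) - 1
print(f"{n_bins} bins from {edges[0]:.2f} to {edges[-1]:.0f} TeV")
```

The range spans about 3.3 decades, so the requested 5 bins per decade cannot be matched exactly; the bin width in log-energy is adjusted slightly to cover the full range.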

In [None]:
reference_dataset = XX

#### Instantiate the data reduction Makers
- `SpectrumDatasetMaker` projects the events and IRFs onto the chosen geometry. Here we require that PSF leakage be corrected.
- We don't create a background maker since there are no counts to rely on here. We are only interested in the expected background in the ON region.
- `SafeMaskMaker` creates a boolean mask, stored on the `Dataset`, that defines the safe energy range. Here we require the energy bias to be less than 10%.


In [None]:
maker = XXX
safe_mask_maker = XXX

#### Data reduction 
Now we can perform the data reduction to create the reference `Dataset`. 

In [None]:
reference_dataset = XXX
reference_dataset = XXX

As expected, the reference dataset contains only the expected number of background counts and no signal.

In [None]:
reference_dataset.peek();

#### Model definition

Source models in gammapy are `SkyModel` objects. They usually combine a `SpectralModel`, a `SpatialModel` and possibly a `TemporalModel`.

Here we only need a `SpectralModel`. It is the only mandatory component.
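
For reference, the log-parabola shape used in this notebook can be written as a plain numpy function. This is a sketch for illustration, not gammapy's implementation; it assumes the natural-logarithm convention of `LogParabolaSpectralModel`, and the function name `log_parabola` is hypothetical.

```python
import numpy as np


def log_parabola(energy, amplitude, alpha, beta, reference=1.0):
    """Log parabola: amplitude * x**(-alpha - beta * ln(x)), with x = E / reference."""
    x = np.asarray(energy, dtype=float) / reference
    return amplitude * x ** (-alpha - beta * np.log(x))


energies = np.array([0.1, 1.0, 10.0])  # TeV
flux = log_parabola(energies, amplitude=3.8e-11, alpha=2.5, beta=0.25)

# Local spectral index: -dln(flux)/dln(E) = alpha + 2 * beta * ln(E / reference)
local_index = 2.5 + 2 * 0.25 * np.log(energies / 1.0)
print(flux)
print(local_index)
```

At the reference energy the flux equals the amplitude and the local index equals `alpha`; `beta` controls how quickly the spectrum steepens with energy.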

In [None]:
spectral_model = LogParabolaSpectralModel(
        alpha=2.5,
        beta=0.25, 
        amplitude="3.8e-11 cm-2s-1TeV-1",
        reference="1 TeV",
    )

model = XXX

In [None]:
reference_dataset.models = XXX

#### Create ON-OFF datasets

Since no background maker was applied, the datasets are regular datasets: they contain a background model but no ON and OFF acceptance and no OFF counts.

We therefore have to convert them to the proper format by adding the required information. The OFF counts will be simulated later.

#### Create acceptance vectors

We assume an energy-independent alpha_onoff = 1/10:

In [None]:
acceptance = Map.from_geom(geom, data=1, unit="")
acceptance_off = XXX

#### Fake counts

Here we create the ON-OFF datasets and fake their content.

We perform a large number of simulations to explore the distribution of fitted parameters.

Have a look at the documentation of `SpectrumDatasetOnOff.from_spectrum_dataset` to see how to create a `SpectrumDatasetOnOff` from a `SpectrumDataset` (called `reference_dataset` here).
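
The statistical content of an ON-OFF simulation can be sketched with plain numpy, independently of the gammapy API. The expected counts per energy bin below are made-up numbers for illustration only; the OFF region is assumed to collect `1 / alpha_onoff` times the ON-region background.

```python
import numpy as np

rng = np.random.default_rng(0)

alpha_onoff = 0.1  # ratio of ON to OFF acceptance
n_sim = 10_000

# Hypothetical expected counts in three energy bins (illustrative values)
mu_signal = np.array([120.0, 60.0, 15.0])
mu_background = np.array([30.0, 10.0, 2.0])

# ON counts: signal plus background; OFF counts: background scaled by 1/alpha
n_on = rng.poisson(mu_signal + mu_background, size=(n_sim, 3))
n_off = rng.poisson(mu_background / alpha_onoff, size=(n_sim, 3))

# Background-subtracted excess for every realization
excess = n_on - alpha_onoff * n_off
print(excess.mean(axis=0), excess.std(axis=0))
```

Averaged over many realizations the excess recovers the expected signal, while its spread gives the statistical fluctuation of a single observation.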


In [None]:
%%time
n_sim = 100
simulated_datasets = []

for i in range(n_sim):
    dataset_on_off = SpectrumDatasetOnOff.from_spectrum_dataset( XXX )
    dataset_on_off.fake( XXX )
    simulated_datasets.append(dataset_on_off)

#### Perform the fit

Here we fit a log parabola to the data and explore the distribution of the fitted parameters.

In [None]:
%%time

results = []
fit = Fit()
for dataset in simulated_datasets:
    XXXX


We convert the list of dictionaries into an astropy `Table`, and then into a pandas `DataFrame`.

In [None]:
fitted_params = Table(results).to_pandas()

We compute the statistical errors from the distribution of the MC realizations.

In [None]:
mean = fitted_params.mean()
uncertainty = fitted_params.std()

In [None]:
for name in ['amplitude', 'alpha', 'beta']:
    print(f"{name} :\t {mean[name]:.2e} -+ {uncertainty[name]:.2e}")

#### Looking at the simulation results

We can use the corner package to visualize the correlations between the fitted parameters.

In [None]:
import corner

fitted_params['amplitude'] *= 1e11

figure = corner.corner(fitted_params,quantiles=[0.16, 0.5, 0.84],
              show_titles=True, title_kwargs={"fontsize": 12})

## Second exercise: Exploring systematic effects: energy scale bias

#### Creating a biased log-parabola model

A systematic bias in the energy reconstruction that is not accounted for by the energy dispersion could have a significant impact on the fitted spectral parameters of the Crab nebula.

We propose to explore this idea in the code below.

We first use gammapy's support for custom models to define a log parabola with a constant systematic energy bias. This bias comes in addition to the one stored in the `edisp` and quantified by the shower simulations.

In [None]:
from gammapy.modeling import Parameter
from gammapy.modeling.models import SpectralModel, LogParabolaSpectralModel

class BiasedLogParabolaSpectralModel(SpectralModel):
    tag = "BiasedLogParabolaSpectralModel"
    amplitude = Parameter("amplitude", "1e-12 cm-2 s-1 TeV-1", min=0, is_norm=True)
    alpha = Parameter("alpha", 2.5, min=0)
    beta = Parameter("beta", 0.5)
    reference = Parameter("reference", "1 TeV", frozen=True)
    bias = Parameter("bias", 1, min=0)
        
    @staticmethod
    def evaluate(energy, amplitude, alpha, beta, reference, bias):
        energy = bias.value * energy
        logpwl = LogParabolaSpectralModel.evaluate(
            energy=energy,
            alpha=alpha,
            beta=beta,
            amplitude=amplitude,
            reference=reference,
        )
        return logpwl

We can look at the resulting spectra. The impact on the amplitude is clearly significant.

In [None]:
biased_spectral_model = BiasedLogParabolaSpectralModel()
biased_spectral_model.bias.value=0.9
ax = biased_spectral_model.plot([0.01,40]*u.TeV, energy_power=2)
biased_spectral_model.bias.value=1.1
biased_spectral_model.plot([0.01,40]*u.TeV, energy_power=2, ax=ax)
biased_spectral_model.bias.value=1.0
biased_spectral_model.plot([0.01,40]*u.TeV, energy_power=2, ax=ax);
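
To put a number on the shift, the sketch below evaluates a plain numpy log parabola at `bias * E` and compares it with the unbiased value at 1 TeV. It uses the default `alpha=2.5` and `beta=0.5` of the class above; `log_parabola` is a hypothetical helper that assumes the natural-logarithm convention, not a gammapy function.

```python
import numpy as np


def log_parabola(energy, amplitude, alpha, beta, reference=1.0):
    """Log parabola: amplitude * x**(-alpha - beta * ln(x)), with x = E / reference."""
    x = np.asarray(energy, dtype=float) / reference
    return amplitude * x ** (-alpha - beta * np.log(x))


alpha, beta = 2.5, 0.5  # defaults of BiasedLogParabolaSpectralModel
energy = 1.0  # TeV

ratios = {}
for bias in (0.9, 1.0, 1.1):
    # Same operation as in `evaluate`: the model is computed at bias * energy
    biased = log_parabola(bias * energy, 1.0, alpha, beta)
    unbiased = log_parabola(energy, 1.0, alpha, beta)
    ratios[bias] = float(biased / unbiased)
    print(f"bias = {bias:.1f}: flux ratio at 1 TeV = {ratios[bias]:.2f}")
```

Because the spectrum is steep, a 10% shift in energy scale changes the flux at fixed energy by a few tens of percent, which is why the amplitude is the most affected parameter.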

In [None]:
biased_spectral_model = BiasedLogParabolaSpectralModel()

biased_spectral_model.alpha.value = spectral_model.alpha.value
biased_spectral_model.beta.value = spectral_model.beta.value
biased_spectral_model.amplitude.value = spectral_model.amplitude.value

In [None]:
biased_model = SkyModel(spectral_model=biased_spectral_model, name="biased_crab")

#### Fake counts taking bias into account

Here we use the biased model and randomize the bias values. We then create the ON-OFF datasets and fake their content.

We assume a typical uncertainty on the energy scale of 3 percent.
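
One possible way to randomize the bias is to draw one multiplicative factor per simulation from a Gaussian centred on 1 with a width of 0.03. The Gaussian shape is an assumption made here for illustration, as are the names `energy_scale_sigma` and `bias_values`.

```python
import numpy as np

rng = np.random.default_rng(42)

n_sim = 100
energy_scale_sigma = 0.03  # assumed 3% uncertainty on the energy scale

# One bias factor per simulated observation (assumption: Gaussian distribution)
bias_values = rng.normal(loc=1.0, scale=energy_scale_sigma, size=n_sim)

print(f"mean bias = {bias_values.mean():.3f}, spread = {bias_values.std():.3f}")
```

Each drawn value would then be assigned to the `bias` parameter of the biased model before faking the corresponding dataset.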

In [None]:
%%time
simulated_biased_datasets = []

for i in range(n_sim):
    XXX

#### Perform the fit

Here we fit a log parabola without a bias to the data and explore the distribution of the fitted parameters.

In [None]:
%%time

results_biased = []
fit = Fit()
for dataset in simulated_biased_datasets:
    XXX

In [None]:
fitted_params_with_energy_bias = Table(results_biased).to_pandas()

In [None]:
total_mean = fitted_params_with_energy_bias.mean()
total_uncertainty = fitted_params_with_energy_bias.std()

In [None]:
for name in ['amplitude', 'alpha', 'beta']:
    print(f"{name} :\t {total_mean[name]:.2e} -+ {total_uncertainty[name]:.2e}")

To separate statistics from systematics, we assume that the total uncertainty is the quadratic sum of the statistical and systematic errors.

In [None]:
systematic_uncertainty = np.sqrt(total_uncertainty**2 - uncertainty**2)
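
With a finite number of simulations, the estimated total spread can come out slightly smaller than the statistical one, in which case the square root above returns NaN. A minimal sketch of a guarded version follows; the helper `systematic_from_total` and the input numbers are hypothetical.

```python
import numpy as np


def systematic_from_total(total, stat):
    """Quadrature subtraction sqrt(total**2 - stat**2), clipped at zero."""
    total = np.asarray(total, dtype=float)
    stat = np.asarray(stat, dtype=float)
    variance = total**2 - stat**2
    # Negative differences come from finite-sample noise; treat them as zero
    return np.sqrt(np.clip(variance, 0.0, None))


# Illustrative values: clear systematic, none, and total below statistical
total = np.array([0.05, 0.030, 0.010])
stat = np.array([0.03, 0.030, 0.012])
syst = systematic_from_total(total, stat)
print(syst)
```

A clipped value of zero should be read as "no systematic contribution resolved with this number of simulations", not as proof that the effect is absent.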

In [None]:
for name in ['amplitude', 'alpha', 'beta']:
    print(f"{name} :\t {total_mean[name]:.2e} -+\t"
          f" {uncertainty[name]:.2e} (stat) -+\t"
          f" {systematic_uncertainty[name]:.2e} (sys)")

## Third exercise: Exploring systematic effects by adding uncertainty of alpha_onoff (ratio of ON/OFF acceptance)

#### Simulated data with an additional bias of alpha_onoff

A systematic bias in the estimation of alpha_onoff can also occur.
In the following we assume that the ON acceptance is known only to within 5%.
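
The effect can be illustrated with a single-bin numpy toy model: the true ON acceptance fluctuates by 5% from one simulation to the next, while the analysis keeps using the nominal alpha_onoff. The expected counts are made-up numbers, and the Gaussian fluctuation of the acceptance is an assumption.

```python
import numpy as np

rng = np.random.default_rng(1)

n_sim = 10_000
alpha_nominal = 0.1  # value assumed in the analysis
acceptance_sigma = 0.05  # assumed 5% uncertainty on the ON acceptance

# Hypothetical expected counts in one energy bin (illustrative values)
mu_signal = 120.0
mu_off = 300.0

# True alpha differs from the nominal one in each simulation
acceptance_on = rng.normal(1.0, acceptance_sigma, size=n_sim)
alpha_true = alpha_nominal * acceptance_on

# Background in the ON region follows the true alpha
n_on = rng.poisson(mu_signal + alpha_true * mu_off)
n_off = rng.poisson(mu_off, size=n_sim)

# The analysis subtracts the background with the nominal alpha
excess = n_on - alpha_nominal * n_off
print(f"excess = {excess.mean():.1f} +- {excess.std():.1f}")
```

The mean excess stays close to the true signal, but its spread grows by the extra term `alpha_nominal * acceptance_sigma * mu_off` added in quadrature, which matters most where the background dominates.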


In [None]:
%%time
simulated_biased_datasets = []

for i in range(n_sim):
    XXX

In [None]:
%%time

results_biased = []
fit = Fit()
for dataset in simulated_biased_datasets:
    XXX

In [None]:
fitted_params_with_alpha_bias = Table(results_biased).to_pandas()
total_mean = fitted_params_with_alpha_bias.mean()
total_uncertainty = fitted_params_with_alpha_bias.std()

In [None]:
for name in ['amplitude', 'alpha', 'beta']:
    print(f"{name} :\t {total_mean[name]:.2e} -+ {total_uncertainty[name]:.2e}")

In [None]:
systematic_uncertainty = np.sqrt(total_uncertainty**2 - uncertainty**2)

In [None]:
for name in ['amplitude', 'alpha', 'beta']:
    print(f"{name} :\t {total_mean[name]:.2e} -+\t"
          f" {uncertainty[name]:.2e} (stat) -+\t"
          f" {systematic_uncertainty[name]:.2e} (sys)")

## Going further

- Plot the average fitted spectrum with its error butterfly
- Combine the two effects. Notice that the fitted parameters become biased.
- Introduce an uncertainty on the effective area by adding an uncertainty on the spectral index and amplitude of the simulated source (e.g. by multiplying it by a `PowerLawNormSpectralModel`)