# 3. Estimation of the flux of a variable source, Mrk421

In the first two introductory notebooks, we considered the Crab Nebula because it is the brightest **steady** emitter in the gamma-ray sky. Let us now examine a source whose flux is not constant in time, Mrk421.

In [None]:
# - basic imports (numpy, astropy, regions, matplotlib)
import numpy as np
import astropy.units as u
from astropy.time import Time 
from astropy.coordinates import SkyCoord
from astropy.io.fits.verify import VerifyWarning
from regions import PointSkyRegion, CircleSkyRegion
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import logging
import warnings

# - Gammapy's imports
from gammapy.maps import Map, MapAxis, WcsGeom, RegionGeom
from gammapy.data import DataStore, Observation
from gammapy.datasets import SpectrumDataset, Datasets, FluxPointsDataset
from gammapy.makers import (
    SpectrumDatasetMaker,
    WobbleRegionsFinder,
    ReflectedRegionsBackgroundMaker,
)
from gammapy.modeling.models import (
    PowerLawSpectralModel,
    LogParabolaSpectralModel,
    ConstantSpectralModel,
    SkyModel,
    LinearTemporalModel,
    SineTemporalModel
)
from gammapy.modeling import Fit
from gammapy.estimators import (
    FluxPointsEstimator,
    LightCurveEstimator
)
from gammapy.stats import WStatCountsStatistic

# - setting up logging and ignoring warnings
logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)

# ignore all warnings, including astropy's FITS VerifyWarning
warnings.filterwarnings("ignore")
warnings.simplefilter("ignore", VerifyWarning)

## 3.1. Data Reduction
As before, let us load all the Mrk421 observations and reduce the data.   
Let us focus on the days between the 10th and the 20th of April 2013, when the most intense variability was observed.

In [None]:
# load observations
datastore = DataStore.from_dir("../acme_magic_odas_data/Mrk421")
observations = datastore.get_observations(required_irf=["rad_max", "aeff", "edisp"])
print(f"total observations : {len(observations)}")

# select observations between 10th and 20th of April
times = ["2013-04-10T00:00:00", "2013-04-20T00:00:00"]
time_intervals = Time(times, format='isot', scale='utc')
observations = observations.select_time(time_intervals)
print(f"selected observations : {len(observations)}")

In [None]:
# let us define the parameters of the spectrum extraction
# - we need to specify only the center of the on region,
# its radius will be fetched from the RAD_MAX table.
source_coords = SkyCoord.from_name("Mrk421")
on_region = PointSkyRegion(source_coords)

# - let us define the energy axes over which we want:
# -- to bin the counts (estimated energies) and
energy_axis = MapAxis.from_energy_bounds(
    10, 1e5, nbin=20, per_decade=False, unit="GeV", name="energy"
)
# -- to interpolate the IRFs (true energies)
energy_axis_true = MapAxis.from_energy_bounds(
    10, 1e5, nbin=28, per_decade=False, unit="GeV", name="energy_true"
)

# let us create an empty dataset with this spatial and energy structure / binning
geom = RegionGeom.create(region=on_region, axes=[energy_axis])
dataset_empty = SpectrumDataset.create(geom=geom, energy_axis_true=energy_axis_true)
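
To see what the binning above amounts to, the sketch below reproduces the bin edges with plain `numpy`. It assumes that `MapAxis.from_energy_bounds` with `per_decade=False` builds `nbin` logarithmically spaced bins between the two bounds; the helper name `log_edges` is hypothetical.

```python
import numpy as np


def log_edges(e_min, e_max, nbin):
    """Return nbin + 1 logarithmically spaced bin edges (hypothetical helper)."""
    return np.geomspace(e_min, e_max, nbin + 1)


# same bounds (in GeV) and number of bins as the two axes defined above
edges_reco = log_edges(10, 1e5, 20)
edges_true = log_edges(10, 1e5, 28)

# the range 10 GeV - 100 TeV spans four decades
n_decades = np.log10(1e5 / 10)
print(f"estimated energy: {20 / n_decades:.0f} bins per decade")
print(f"true energy:      {28 / n_decades:.0f} bins per decade")
```

The true energy axis is slightly finer than the estimated one, which is a common choice when the energy dispersion has to be interpolated.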

In [None]:
dataset_maker = SpectrumDatasetMaker(
    containment_correction=False, selection=["counts", "exposure", "edisp"]
)
# use 3 off regions to estimate the background
region_finder = WobbleRegionsFinder(n_off_regions=3)
bkg_maker = ReflectedRegionsBackgroundMaker(region_finder=region_finder)

datasets = Datasets()

for observation in observations:
    dataset = dataset_maker.run(
        dataset_empty.copy(name=str(observation.obs_id)), observation
    )
    dataset_on_off = bkg_maker.run(dataset, observation)
    datasets.append(dataset_on_off)

In [None]:
datasets[0].peek()

We obtained the one-dimensional data sets as before.
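
Behind each ON/OFF data set there is a simple counting experiment. As a rough illustration (not the `Gammapy` implementation), the sketch below computes the excess and the Li & Ma significance for one ON region and three OFF regions, assuming equal acceptance so that the background scaling is $\alpha = 1/3$. The counts are made-up numbers and `li_ma_significance` is a hypothetical helper.

```python
import numpy as np


def li_ma_significance(n_on, n_off, alpha):
    """Li & Ma (1983) significance for ON/OFF counts with exposure ratio alpha."""
    n_tot = n_on + n_off
    term_on = n_on * np.log((1 + alpha) / alpha * n_on / n_tot)
    term_off = n_off * np.log((1 + alpha) * n_off / n_tot)
    # the sign distinguishes an excess from a deficit
    return np.sign(n_on - alpha * n_off) * np.sqrt(2 * (term_on + term_off))


n_on, n_off = 100, 150  # made-up counts, for illustration only
alpha = 1 / 3           # one ON region, three OFF regions of equal acceptance

excess = n_on - alpha * n_off
significance = li_ma_significance(n_on, n_off, alpha)
print(f"excess = {excess:.1f}, significance = {significance:.2f} sigma")
```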

## 3.2. Light curve: flux vs time estimation

We are now concerned with studying the flux as a function of time: how could we measure it?      
One option would certainly be to repeat the spectrum estimation demonstrated previously for each observation separately, to measure spectral changes day by day or even run by run. To measure variability, however, it is common practice in astronomy to compile a _light curve_, a graph of the light intensity as a function of time. To represent the gamma-ray brightness of our source we will consider its integral flux above a certain energy threshold, $E_{\rm thr}$, which can depend on the analysis or on the physics we are interested in:

$$
    \phi(E > E_{\rm thr})\,[{\rm cm}^{-2}\,{\rm s}^{-1}] = 
    \int_{E_{\rm thr}}^{\infty} \frac{{\rm d}\phi}{{\rm d}E}(E; \hat{\theta})\,{\rm d}E
$$

where $\hat{\theta}$ represents the parameters of the spectral model. How are these treated if we aim to estimate the integral flux? One could in principle perform the likelihood fit of the previous notebook to determine $\frac{{\rm d}\phi}{{\rm d}E}(E; \hat{\theta})$ using all the events in a given interval of time (this might be an observation, a day, a week), and then simply integrate the spectrum obtained.
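
As an illustration of the integral above, the sketch below integrates a log-parabola spectrum numerically with plain `numpy`. All parameter values are placeholders chosen for this example, the infinite upper limit is replaced by 100 TeV, and the helper names `log_parabola` and `integral_flux` are hypothetical.

```python
import numpy as np


def log_parabola(energy, amplitude, reference, alpha, beta10):
    """Log parabola with a base-10 logarithm in the exponent."""
    x = energy / reference
    return amplitude * x ** (-alpha - beta10 * np.log10(x))


def integral_flux(e_min, e_max, n_points=2000, **pars):
    """Integrate dphi/dE between e_min and e_max (trapezoidal rule in ln E)."""
    energy = np.geomspace(e_min, e_max, n_points)
    # dE = E dln(E), so we integrate E * dphi/dE over ln(E)
    integrand = energy * log_parabola(energy, **pars)
    steps = np.diff(np.log(energy))
    return np.sum(0.5 * (integrand[1:] + integrand[:-1]) * steps)


pars = dict(
    amplitude=1e-10,  # TeV-1 cm-2 s-1, placeholder value
    reference=0.3,    # TeV, placeholder value
    alpha=2.1,        # placeholder value
    beta10=0.4,       # placeholder value
)

flux = integral_flux(1.0, 100.0, **pars)  # energies in TeV
print(f"flux(E > 1 TeV) = {flux:.2e} cm-2 s-1")
```

Integrating in $\ln E$ rather than in $E$ keeps the sampling uniform over the decades covered by a steeply falling spectrum.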

It is more common practice to fix the parameters of the adopted spectrum and to adjust only the normalisation $\phi_0$, exactly as we did when we computed flux points. In that case, we readjusted the best-fit $\phi_0$ using all the events within an estimated energy bin. We will now readjust it using the events in a single time interval.
Given the parallel between flux points and light curve estimation, `Gammapy` provides a `LightCurveEstimator` that works very similarly to the `FluxPointsEstimator`, i.e. it repeats the likelihood fit in each of the time bins (instead of the energy ones).
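
The idea of refitting only the normalisation in each time bin can be sketched with a toy Poisson likelihood. This is a deliberate simplification and an assumption of this example: it ignores the OFF-region background handled by the actual estimator and assumes the same exposure in every time bin. The observed counts and `npred_ref` (the counts predicted by the reference spectrum) are made-up numbers. With the spectral shape fixed, the only free parameter is a scale factor `norm`, whose maximum-likelihood value is the ratio of observed to predicted counts.

```python
import numpy as np


def poisson_nll(norm, counts, npred_ref):
    """Negative log-likelihood (up to a constant) for predicted counts norm * npred_ref."""
    mu = norm * npred_ref
    return np.sum(mu - counts * np.log(mu))


def fit_norm(counts, npred_ref):
    """Analytic maximum-likelihood estimate of the scale factor."""
    return counts.sum() / npred_ref.sum()


# counts predicted per energy bin by the reference spectrum (made-up numbers)
npred_ref = np.array([40.0, 25.0, 10.0, 3.0])
# observed counts per energy bin in three time bins (made-up numbers)
counts_per_time_bin = np.array([
    [38, 27, 9, 4],
    [85, 48, 22, 5],
    [19, 12, 6, 1],
])

# one scale factor per time bin: this is the toy "light curve"
norms = np.array([fit_norm(c, npred_ref) for c in counts_per_time_bin])
print(norms)
```

Multiplying each `norm` by the integral flux of the reference model would give the flux in that time bin.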

Let us then compute the light curve for Mrk421. Which model should we assume? In the paper presenting this dataset ([MAGIC Collaboration (2020)](https://ui.adsabs.harvard.edu/abs/2020ApJS..248...29A/abstract)) we read:


> [...] the VHE gamma-ray spectrum from the full nine day data set [...] is well represented by the following log-parabola function:    
> $$ \frac{{\rm d}\phi}{{\rm d}E} = \phi_0 \left( \frac{E}{0.3\,{\rm TeV}} \right)^{- 2.14 - 0.45 \log_{10}(\frac{E}{0.3\,{\rm TeV}})} $$

The amplitude parameter is not specified. Let us see whether we can recover the same spectral parameters by fitting all the data together: let us stack the data sets and fit them!

In [None]:
dataset_stacked = datasets.stack_reduce()

# let us use the LP model, use the same reference energy of the paper
spectral_model = LogParabolaSpectralModel(
    amplitude=5e-12 * u.Unit("TeV-1 cm-2 s-1"),
    reference=0.3 * u.TeV,
    alpha=2.3 * u.Unit(""),
    beta=0.1 * u.Unit(""),
)
model = SkyModel(spectral_model=spectral_model, name="Mrk421")

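# let us use a reasonable energy range for fitting
e_min = 0.08 * u.TeV
e_max = 10 * u.TeV
# the mask has to be assigned to the dataset to take effect in the fit
dataset_stacked.mask_fit = dataset_stacked.counts.geom.energy_mask(e_min, e_max)

# assign the model to the stacked dataset
dataset_stacked.models = [model]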

# run the fit!
fit = Fit()
results = fit.run(datasets=dataset_stacked)
print(results)
print(spectral_model)

In [None]:
print(f"alpha = {spectral_model.alpha.value:.2f} +/- {spectral_model.alpha.error:.2f}")
# the log in Gammapy's log parabola is in base e
# the one in the paper is in base 10, make the conversion
beta = spectral_model.beta.value / np.log10(np.e)
beta_err = spectral_model.beta.error / np.log10(np.e)
print(f"beta = {beta:.2f} +/- {beta_err:.2f}")
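
The base conversion above can be checked numerically. The sketch below is an illustration with plain `numpy`, assuming the convention stated in the comment, namely that `Gammapy`'s log parabola uses a natural logarithm in the exponent while the paper uses a base-10 one. It evaluates both forms with the values quoted in the paper and verifies that they agree once $\beta$ is converted.

```python
import numpy as np

alpha = 2.14
beta_10 = 0.45                     # curvature quoted in the paper (base 10)
beta_e = beta_10 * np.log10(np.e)  # equivalent curvature for a natural log

x = np.geomspace(0.1, 100, 50)     # E / E_ref, dimensionless

shape_paper = x ** (-alpha - beta_10 * np.log10(x))
shape_natural = x ** (-alpha - beta_e * np.log(x))

print(f"beta (natural log) = {beta_e:.3f}")
print("shapes agree:", np.allclose(shape_paper, shape_natural))
```

Going the other way, as in the cell above, means dividing the fitted natural-log $\beta$ by $\log_{10} e$.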

We obtained values quite similar to those of the average spectrum presented in the paper. Let us go ahead and compute the light curve (LC) with the `LightCurveEstimator`: if no time interval is specified, one flux point per observation is estimated. Let us compute the light curve above 1 TeV, as done in the paper.

In [None]:
# energy range in which the flux has to be integrated
energy_edges = [1 * u.TeV, 100 * u.TeV]

# remember to assign the model to the datasets
datasets.models = [model]

light_curve_estimator_run_wise = LightCurveEstimator(
    energy_edges=energy_edges,
    source="Mrk421",
    reoptimize=False,
    n_sigma_ul=3,
)

light_curve_run_wise = light_curve_estimator_run_wise.run(datasets)

In [None]:
fig, ax = plt.subplots(figsize=(12, 5)) 
light_curve_run_wise.plot(
    ax=ax,
    sed_type="flux",
    marker="."
)
ax.set_yscale("linear")
plt.show()

Let us compute nightly fluxes as well, along with the flux of a single night.

In [None]:
# eleven consecutive one-day intervals, from MJD 56391.5 to MJD 56402.5
daily_time_intervals = [
    Time([mjd_start, mjd_start + 1], format="mjd", scale="utc")
    for mjd_start in 56391.5 + np.arange(11)
]

light_curve_estimator_daily = LightCurveEstimator(
    energy_edges=energy_edges,
    time_intervals=daily_time_intervals,
    source="Mrk421",
    reoptimize=False,
    n_sigma_ul=3,
)

light_curve_estimator_day = LightCurveEstimator(
    energy_edges=energy_edges,
    time_intervals=[Time([56394.5, 56395.5], format="mjd", scale="utc")],
    source="Mrk421",
    reoptimize=False,
    n_sigma_ul=3,
)

light_curve_daily = light_curve_estimator_daily.run(datasets)
light_curve_day = light_curve_estimator_day.run(datasets)
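
The bin edges above are given in Modified Julian Date (MJD). As a quick sanity check, the sketch below converts a few of them to calendar dates using only the standard library, starting from the MJD epoch of 1858-11-17 00:00 and ignoring leap seconds (a simplification). The helper `mjd_to_datetime` is hypothetical. Edges at half-integer MJD fall at noon UTC, so each bin runs from noon to noon.

```python
from datetime import datetime, timedelta

# MJD 0 corresponds to 1858-11-17 00:00
MJD_EPOCH = datetime(1858, 11, 17)


def mjd_to_datetime(mjd):
    """Convert an MJD value to a naive UTC datetime (leap seconds ignored)."""
    return MJD_EPOCH + timedelta(days=mjd)


for mjd in (56391.5, 56394.5, 56395.0, 56402.5):
    print(mjd, "->", mjd_to_datetime(mjd).isoformat())
```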

In [None]:
fig, ax = plt.subplots(figsize=(12, 5))
light_curve_run_wise.plot(
    ax=ax,
    sed_type="flux",
    marker=".",
    label="run-wise binning",
    alpha=0.6,
)
light_curve_day.plot(
    ax=ax,
    sed_type="flux",
    marker=",",
    label="one night",
)
light_curve_daily.plot(
    ax=ax,
    sed_type="flux",
    marker=",",
    label="nightly binning",
    elinewidth=2,
    
)
ax.set_yscale("linear")
ax.set_ylim([0, 4e-10])
plt.show()

The 13th of April definitely looks like the day with the highest flux and the most variability.         
Therefore it might be interesting to check the flux on time scales shorter than a day, or than the 20 minutes of a run. 
We can define time intervals, for example of 10 minutes, that are even shorter than the run duration.   
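
One way to build such contiguous intervals is to compute every edge from the start time and an integer index, which avoids accumulating rounding errors from repeated additions. The sketch below uses only the standard library `datetime` module rather than `astropy.time.Time`; the helper `split_interval` is hypothetical and the start and stop times are illustrative.

```python
from datetime import datetime, timedelta


def split_interval(start, stop, step):
    """Split [start, stop] into contiguous (t_start, t_stop) pairs of length step.

    A trailing remainder shorter than step is dropped.
    """
    n_steps = int((stop - start) / step)
    edges = [start + i * step for i in range(n_steps + 1)]
    return list(zip(edges[:-1], edges[1:]))


start = datetime(2013, 4, 12, 12, 0)
stop = datetime(2013, 4, 13, 12, 0)
intervals = split_interval(start, stop, timedelta(minutes=10))

print(len(intervals))
print(intervals[0])
print(intervals[-1])
```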

We will again use the `select_time` function of the observations.

In [None]:
# first select only the night of the 13th of April from the observations
time_day_start = Time("2013-04-12T12:00")
time_day_end = Time("2013-04-13T12:00")
time_interval_day = Time([time_day_start, time_day_end])

observations_day = observations.select_time(time_interval_day)
print(f"selected observations : {len(observations_day)}")

In [None]:
# let us now split these observations into even smaller chunks of 10 minutes
duration = 10 * u.min

# build the edges of the 10-minute intervals covering the day
time_intervals_10min = [time_day_start]

# stop as soon as the last edge reaches the end of the day
while time_intervals_10min[-1] < time_day_end:
    time_intervals_10min.append(time_intervals_10min[-1] + duration)

time_intervals_10min = [
    Time([tstart, tstop]) for tstart, tstop in zip(time_intervals_10min[:-1], time_intervals_10min[1:])
]

# and now cut the observations in these time intervals
short_observations = observations_day.select_time(time_intervals_10min)
# check that observations have been filtered
print(f"observations after time filtering: {len(short_observations)}")
print(short_observations[1].gti)

We have to make data sets out of the new observations.

In [None]:
short_datasets = Datasets()

for observation in short_observations:
    dataset = dataset_maker.run(
        dataset_empty.copy(), observation
    )
    dataset_on_off = bkg_maker.run(dataset, observation)
    short_datasets.append(dataset_on_off)

In [None]:
# remember to assign the model to the datasets
short_datasets.models = [model]

light_curve_estimator_10min = LightCurveEstimator(
    energy_edges=energy_edges,
    time_intervals=time_intervals_10min,
    source="Mrk421",
    reoptimize=False,
    n_sigma_ul=3,
)

light_curve_10min = light_curve_estimator_10min.run(short_datasets)

In [None]:
fig, ax = plt.subplots(figsize=(12, 5))
light_curve_10min.plot(
    ax=ax,
    sed_type="flux",
    marker=".",
    label="10 min binning",
    alpha=0.6,
    time_format="mjd"
)
light_curve_run_wise.plot(
    ax=ax,
    sed_type="flux",
    marker=".",
    label="run-wise binning",
    alpha=0.6,
    time_format="mjd"
)
light_curve_daily.plot(
    ax=ax,
    sed_type="flux",
    marker=",",
    label="nightly binning",
    elinewidth=2,
    time_format="mjd",
    alpha=0.2
    
)
ax.set_xlim([56394.8, 56395.2])
ax.set_yscale("linear")
#ax.set_ylim([0, 4e-10])
plt.show()

Let us try a simple fit of the light curve: let us start with a linear temporal model (a sine could be tried next).

In [None]:
# create a flux points dataset from the light curve computed above
dataset_fp = FluxPointsDataset(data=light_curve_day, name="dataset_lc")

In [None]:
# let us define the temporal model, midnight of that day is t_ref
linear_time_model = LinearTemporalModel(
    alpha=2, beta=5 / u.d, t_ref=56395 * u.d
)
linear_time_model.alpha.frozen = True
# let us also add a constant spectral model and let us fit
spectral_model = ConstantSpectralModel(const=1e-10 * u.Unit("TeV-1 cm-2 s-1"))

lc_model = SkyModel(
    spectral_model=spectral_model,
    temporal_model=linear_time_model,
    name="time_model",
)

dataset_fp.models = lc_model
print(dataset_fp)

In [None]:
fit = Fit()
result = fit.run(dataset_fp)
display(result.parameters.to_table())

In [None]:
dataset_fp.plot_spectrum(axis_name="time")
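
For intuition about what such a fit does, the sketch below performs the analogous weighted least-squares fit of a straight line with plain `numpy`. It assumes a linear time dependence of the form $f(t) = a + b\,(t - t_{\rm ref})$ and uses synthetic, made-up flux points with equal uncertainties; it is not the `Gammapy` fit machinery.

```python
import numpy as np

t_ref = 56395.0                                # MJD, midnight of the night considered
time = t_ref + np.linspace(-0.1, 0.1, 9)       # synthetic time stamps (MJD)
true_intercept, true_slope = 2.0e-10, 5.0e-10  # cm-2 s-1 and cm-2 s-1 d-1, made up
flux = true_intercept + true_slope * (time - t_ref)
flux_err = np.full_like(flux, 2.0e-11)         # made-up uncertainties

# weighted straight-line fit; np.polyfit expects weights equal to 1 / sigma
slope, intercept = np.polyfit(time - t_ref, flux, deg=1, w=1 / flux_err)

# chi-square of the best-fit line, close to zero here because the data are noiseless
model_flux = intercept + slope * (time - t_ref)
chi2 = np.sum(((flux - model_flux) / flux_err) ** 2)
print(f"slope = {slope:.2e} per day, intercept = {intercept:.2e}, chi2 = {chi2:.2f}")
```

Measuring time relative to `t_ref` keeps the intercept and the slope weakly correlated, which is the same reason a reference time close to the data is used in the temporal model above.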