Background systematics as a nuisance parameter #3955

Open · moralejo opened this issue May 17, 2022 · 14 comments

@moralejo

At low energies, particularly in single-telescope analysis (but also in stereo, for weak sources observed over long periods), the gamma-ray excess may be just a few percent of the residual background. In such cases, systematic errors in the background normalization factor (of, say, ~1%) would have a large effect on the estimated gamma-ray flux. This is currently not taken into account.

It would be good to include this uncertainty in Gammapy's calculations. A possible way is to include the normalization factor as a nuisance parameter (with e.g. a Gaussian prior, adding a term to the likelihood). It could be an energy-bin-wise nuisance parameter (although we might expect some correlations among neighboring bins).

@adonath commented May 20, 2022

Thanks @moralejo for the feature proposal!

Alternatively, one can also fit the background normalization and spectral shape using various built-in models.

This effectively treats the background norm as a nuisance parameter, but without the prior information. Taking into account e.g. a Gaussian prior is technically already possible by subclassing MapDataset or SpectrumDataset, along the lines of:

from gammapy.datasets import MapDataset

mean, sigma = 1.0, 0.01  # prior mean and width of the background norm

class MyMapDataset(MapDataset):
    """Map dataset with a Gaussian prior on the background norm."""
    def stat_sum(self):
        stat = super().stat_sum()  # summed -2 * log Poisson likelihood
        # Gaussian prior penalty term on the background norm
        stat += (self.background_model.norm.value - mean) ** 2 / sigma ** 2
        return stat
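
A hypothetical usage sketch (my_dataset stands for a MyMapDataset instance built the usual way, with a background model attached; Fit is the standard Gammapy fitting interface):

from gammapy.modeling import Fit

# the Gaussian penalty enters automatically, since the minimizer
# evaluates the overridden stat_sum()
fit = Fit()
result = fit.run(datasets=[my_dataset])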

Handling priors on parameters is definitely on the feature list for upcoming versions of Gammapy.

@moralejo

Thanks, @adonath. But I am not sure I understand (I have quite limited experience with Gammapy, sorry): is that background normalization independent for on and off (in point-like analysis)? That is what we really need - but then the prior (defining how much the background can differ between the on and off regions, e.g. 1%) is absolutely needed. Without a prior on that background difference, the excess, no matter how large, could always be fitted by a different background in on and off, right?

@adonath commented May 20, 2022

Thanks @moralejo, I now understand better what you are trying to achieve.

> Is that background normalization independent for on and off?

No, it isn't. In the case of an on-off analysis, the background in the on region is "hardwired" to alpha * n_off via the "wstat" (on/off) Poisson likelihood. Currently there is no simple mechanism to introduce an additional scaling factor for the background model in the SpectrumDatasetOnOff and the "wstat" likelihood.
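
For reference, the analytically profiled wstat statistic is available as a standalone function in gammapy.stats; a minimal illustration, with made-up count values:

from gammapy.stats import wstat

# -2 log L with the background profiled out analytically, given
# n_on counts in ON, n_off counts in OFF, exposure ratio alpha,
# and a predicted signal mu_sig in the ON region
stat = wstat(n_on=25, n_off=200, alpha=0.1, mu_sig=5.0)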

As the likelihood function including the priors looks different, I assume the approach of analytically "marginalizing" over the background as done in "wstat" does not work anymore either. Do you have an example / write-up of what the likelihood function looks like in your use case?

BTW: You can now write math expressions in GitHub Markdown https://github.blog/changelog/2022-05-19-render-mathematical-expressions-in-markdown/

@moralejo

I was thinking of the typical point-like analysis we do for LST-1 data, using wobble observations with one or more off positions.

As of now, the Poisson parameter of the background (say, $\mu_{bg}$) is a nuisance parameter which enters both the on and off likelihood terms: as $\mu_{bg}$ in the term containing $N_\text{on}$, and as $\mu_{bg} / \alpha$ in the $N_\text{off}$ term.

The idea is to make the normalization $\alpha$ also a nuisance parameter, with a certain prior, say a Gaussian of 1% standard deviation. There would be an additional Gaussian term in the likelihood, i.e. $G(\alpha \mid \alpha_0,\, 0.01\,\alpha_0)$, with e.g. $\alpha_0 = 1$ in the case of one off region. The normalization could then take values of $\alpha$ different from $\alpha_0$, but with a penalty from the Gaussian term.

Of course this will complicate finding the best value for $\mu_{bg}$ (which for fixed normalization can be obtained analytically from a second-degree equation).

@adonath commented May 24, 2022

Thanks @moralejo, now I think I understand what you are trying to achieve.

As far as I can tell there is currently no simple way to achieve something like this in Gammapy; however, it is probably still possible. Maybe @registerrier has a more concrete idea here.

@moralejo

> Thanks @moralejo, now I think I understand what you are trying to achieve.
>
> As far as I can tell there is currently no simple way to achieve something like this in Gammapy; however, it is probably still possible. Maybe @registerrier has a more concrete idea here.

Indeed, we had a conversation in Bologna about this - I should have mentioned it. I think it can be done, but since the two nuisance parameters $\alpha$ and $\mu_{bg}$ are somewhat degenerate, determining both of them analytically becomes a bit more complicated.

For non-point-like analysis I guess the equivalent feature would be a 2D background model in which a certain (relative) level of deviations w.r.t. a (possibly symmetric) template is allowed, with those deviations being limited by a prior (e.g. a Gaussian of 1% standard deviation). It is just a way of introducing a realistic limit to our knowledge of the background (which otherwise could be made arbitrarily small, in terms of relative uncertainty, just by adding more statistics).

@SeiyaNozaki

I tried to apply an energy-dependent background normalization factor with existing functions. Here I introduce a function that creates a dataset using a CircleAnnulusSkyRegion as the background region:

import astropy.units as u
from regions import CircleAnnulusSkyRegion
from gammapy.maps import RegionGeom
from gammapy.datasets import SpectrumDataset

def make_bg_dataset(target_position, energy_axis, energy_axis_true, observation):
    """Create a spectrum dataset in an annular background region around target_position."""
    region = CircleAnnulusSkyRegion(
        center=target_position,
        inner_radius=0.3 * u.deg,
        outer_radius=0.4 * u.deg,
    )

    geom = RegionGeom.create(region=region, axes=[energy_axis])

    dataset_empty = SpectrumDataset.create(
        geom=geom, energy_axis_true=energy_axis_true
    )

    # dataset_maker is the SpectrumDatasetMaker defined earlier in the analysis
    dataset = dataset_maker.run(dataset_empty, observation)

    return dataset

Then, energy-dependent normalization factors are computed from the number of counts in the background regions defined around both the ON and OFF positions:

# observations, dataset_empty, dataset_maker, bkg_maker, safe_mask_masker,
# counts and datasets are all defined earlier in the analysis
for observation in observations:
    dataset = dataset_maker.run(
        dataset_empty.copy(name=str(observation.obs_id)), observation
    )
    counts.fill_events(observation.events)
    dataset_on_off = bkg_maker.run(dataset, observation)
    dataset_on_off = safe_mask_masker.run(dataset_on_off, observation)

    # background region around the ON (source) position
    dataset_bg_on = make_bg_dataset(
        target_position,
        energy_axis,
        energy_axis_true,
        observation
    )

    # background region around the OFF position
    dataset_bg_off = make_bg_dataset(
        dataset_on_off.counts_off.geom.center_skydir,
        energy_axis,
        energy_axis_true,
        observation
    )

    # fold the energy-dependent normalization factor into alpha
    # via the off-region acceptance
    dataset_on_off.acceptance_off *= dataset_bg_off.counts / dataset_bg_on.counts

    datasets.append(dataset_on_off)

The background regions are these circular annuli:

[image: counts maps with the annular background regions around the ON and OFF positions]

I think this should work in principle, if I understand the code correctly, although I have not yet tried it with real data. What do you think of this way of computing an energy-dependent background normalization?

@adonath commented Jun 14, 2022

Thanks @SeiyaNozaki! As far as I understand, you have basically implemented a ring background estimation method for spectral analysis. This is a long-open feature request in Gammapy (#1112). It would be great to support a RegionGeom in the RingBackgroundMaker, basically following your implementation!

However, in general it is different from what @moralejo proposed, which requires an additional prior on the alpha parameter during fitting. This cannot simply be achieved with the existing SpectrumDatasetOnOff.

@SeiyaNozaki

Hi @adonath,

> As far as I understand, you have basically implemented a ring background estimation method for spectral analysis. This is a long-open feature request in Gammapy (#1112).

Good to know, let me implement it based on this approach later!

> However, in general it is different from what @moralejo proposed, which requires an additional prior on the alpha parameter during fitting. This cannot simply be achieved with the existing SpectrumDatasetOnOff.

Yes, let's say this is an alternative way to estimate the background level based on a signal-free region for the on/off samples. Please let me explain this approach again next Monday in the LST-focused slot during the coding sprint.

@moralejo commented Jul 13, 2022

Hi @adonath,

below is a proposal for how to introduce both the background rate and the on/off normalization as nuisance parameters. I have tested the approach in a notebook (where can I share it?) and it seems to work well.

We measure $N_\text{on}$ and $N_\text{off}$ events in the ON and OFF regions, within a given bin of $E_\text{reco}$.

$\mu$ is the Poisson parameter of the background in the ON region. It is a nuisance parameter (no prior).

$\alpha$ is the ratio of the background acceptance in the ON region to that in the OFF region (i.e. the $\alpha$ in the Li & Ma formula). The Poisson parameter of the background in the OFF region is $\mu / \alpha$.

We will consider $\alpha$ as a nuisance parameter, but with a Gaussian prior, $P(\alpha) = G(\alpha \mid \alpha_0, \sigma_\alpha)$. The value $\sigma_\alpha$ represents the systematic uncertainty of $\alpha$, which may come e.g. from non-uniformities of the camera response.

$\theta$ refers to all parameters which describe the energy spectrum of the gamma-ray source (including also EBL absorption if needed). The value $g(\theta)$ is the Poisson parameter of the gamma-ray signal in the $E_\text{reco}$ bin in question, and is obtained using the instrument IRF (for whatever values of $\theta$ we have during the likelihood maximization).

$P(N_\text{on}, N_\text{off} \mid \theta, \mu, \alpha, \alpha_0, \sigma_\alpha) = \text{Poisson}(N_\text{on} \mid \mu+g(\theta)) \times \text{Poisson}(N_\text{off} \mid \mu/\alpha) \times G(\alpha \mid \alpha_0, \sigma_\alpha)$

$-\log(P) = \mu + g(\theta) - N_\text{on} \log(\mu + g(\theta)) + \log(N_\text{on}!) + \mu/\alpha - N_\text{off} \log(\mu/\alpha) + \log(N_\text{off}!) + \log(\sigma_\alpha \sqrt{2\pi}) + \frac{(\alpha - \alpha_0)^2}{2 \sigma_\alpha^2}$

Given $\theta$, $N_\text{on}$ and $N_\text{off}$, we now want to find the values of $\alpha$ and $\mu$ that maximize $P$, i.e. minimize $-\log(P)$. Let's keep only the relevant terms:

$-\log(P) = \mu -N_\text{on} \log(\mu + g(\theta)) + \mu/\alpha - N_\text{off} \log(\mu) + N_\text{off} \log(\alpha) + \frac{\alpha^2 - 2 \alpha \alpha_0}{2 \sigma_\alpha^2} + \text{constants}$

Differentiating w.r.t. $\mu$:

$\frac{\partial (-\log(P))}{\partial \mu} = 1 - \frac{N_\text{on}}{\mu + g(\theta)} + \frac{1}{\alpha} - \frac{N_\text{off}}{\mu} = 0$

Writing just $g$ for $g(\theta)$ to simplify the notation, and multiplying through by $\mu \alpha (\mu + g)$:

$\mu \alpha (\mu + g) - N_\text{on} \mu \alpha + \mu (\mu + g) - N_\text{off} \alpha (\mu + g) = 0 $

$(1 + \alpha) \mu^2 + [g (1+\alpha) - \alpha (N_\text{on} + N_\text{off})] \mu - N_\text{off} \alpha g = 0$, which is a second-order equation (we keep the $\mu >0$ solution)

Solving the equation above with a fixed $\alpha$ (for example, $\alpha = \alpha_0$) we obtain the $\mu$ that maximizes the contribution of the given E$_\text{reco}$ bin to the likelihood
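
Explicitly, writing the quadratic as $a \mu^2 + b \mu + c = 0$ with $a = 1+\alpha$, $b = g(1+\alpha) - \alpha (N_\text{on} + N_\text{off})$ and $c = -N_\text{off}\,\alpha\,g$, the $\mu > 0$ root is

$\mu = \frac{-b + \sqrt{b^2 - 4ac}}{2a}$

(since $a > 0$ and $c \le 0$, this is the only non-negative solution).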

Now differentiating w.r.t. $\alpha$:

$\frac{\partial (-\log(P))}{\partial \alpha} = -\frac{\mu}{\alpha^2} + \frac{N_\text{off}}{\alpha} + \frac{\alpha - \alpha_0}{\sigma_\alpha^2} = 0$

Multiplying by $\alpha^2$:

$-\mu + \alpha N_\text{off} + \alpha^2 \frac{\alpha - \alpha_0}{\sigma_\alpha^2} = 0$

and by $\sigma_\alpha^2$:

$\alpha^3 - \alpha^2 \alpha_0 + \alpha N_\text{off} \sigma^2_\alpha - \mu \sigma^2_\alpha = 0$, which is a cubic equation in $\alpha$.

In both equations only one of the solutions is sensible.

To find both $\alpha$ and $\mu$ we can iterate: find $\mu$ for a starting fixed $\alpha$, then fix $\mu$ and find $\alpha$, and so on until convergence (a small change in $-2\log L$ between steps) is achieved.
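
A minimal numerical sketch of this alternating scheme for a single $E_\text{reco}$ bin (the function names, root selection and convergence test are mine, not Gammapy API; it assumes non-zero counts):

import numpy as np

def solve_mu(n_on, n_off, g, alpha):
    # positive root of the second-order equation
    # (1+alpha) mu^2 + [g (1+alpha) - alpha (n_on+n_off)] mu - n_off alpha g = 0
    a = 1 + alpha
    b = g * (1 + alpha) - alpha * (n_on + n_off)
    c = -n_off * alpha * g
    return (-b + np.sqrt(b ** 2 - 4 * a * c)) / (2 * a)

def solve_alpha(mu, n_off, alpha_0, sigma_alpha):
    # real root of alpha^3 - alpha_0 alpha^2 + n_off sigma^2 alpha - mu sigma^2 = 0,
    # keeping the (sensible) solution closest to alpha_0
    roots = np.roots([1.0, -alpha_0, n_off * sigma_alpha ** 2, -mu * sigma_alpha ** 2])
    real = roots[np.abs(roots.imag) < 1e-9].real
    return real[np.argmin(np.abs(real - alpha_0))]

def stat(n_on, n_off, g, mu, alpha, alpha_0, sigma_alpha):
    # -2 log(P), dropping the terms that are constant in mu and alpha
    return 2 * (mu + g - n_on * np.log(mu + g)
                + mu / alpha - n_off * np.log(mu / alpha)
                + (alpha - alpha_0) ** 2 / (2 * sigma_alpha ** 2))

def profile_mu_alpha(n_on, n_off, g, alpha_0=1.0, sigma_alpha=0.01, tol=1e-6):
    # alternate between the two analytic solutions until -2 log(P) stabilizes
    alpha, old = alpha_0, np.inf
    for _ in range(100):
        mu = solve_mu(n_on, n_off, g, alpha)
        alpha = solve_alpha(mu, n_off, alpha_0, sigma_alpha)
        new = stat(n_on, n_off, g, mu, alpha, alpha_0, sigma_alpha)
        if abs(old - new) < tol:
            break
        old = new
    return mu, alpha

For example, profile_mu_alpha(n_on=1100, n_off=1000, g=80.0) profiles out the background for a bin with one off region ($\alpha_0 = 1$) and a 1% prior width.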

@adonath commented Sep 23, 2022

From the Gammapy side this would require adapting our likelihood API. I think we should give ourselves at least half a year for that. I'll set the 1.2 milestone here.

@adonath added this to the 1.2 milestone on Sep 23, 2022
@moralejo

I think that perhaps this could be implemented in a simpler (and probably nearly equivalent) way.

If the likelihood terms for well-populated bins (i.e. those in which the background systematics are relevant) are written in the Gaussian approximation, wouldn't it be enough to replace the standard deviation of the Gaussian term for the on region, originally $\sqrt{\mu + g(\theta)}$, by a larger value, say $\sqrt{1.01\,\mu + g(\theta)}$, for a ~1% background systematic uncertainty? The mean of the Gaussian should stay unchanged, i.e. $\mu + g(\theta)$.
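
Spelled out, the proposed per-bin ON term of $-2\log L$ in the Gaussian approximation (up to constants) would be

$\frac{\left(N_\text{on} - \mu - g(\theta)\right)^2}{1.01\,\mu + g(\theta)}$

i.e. the variance is inflated by the extra background term while the mean $\mu + g(\theta)$ is untouched.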

@adonath commented Nov 19, 2022

Thanks @moralejo! We have started to work on collecting use-cases for extended likelihood functions / priors. @nbiederbeck will present some first ideas for a new API at the next coding sprint and then we will write the proposal up in a PIG (proposal for improving Gammapy). I hope this will happen within the next few weeks. Once we have a draft we will come back to you and ask for your feedback!
