
Shapesys modifier correlated between samples #1967

Open · 1 task done
alexander-held opened this issue Aug 28, 2022 · 2 comments · Fixed by #1977
Labels: bug (Something isn't working)

Comments

@alexander-held (Member)

Summary

Is there a meaningful way to correlate shapesys modifiers across samples? If not, models where this is done should be flagged as invalid.

This is somewhat related to #1899.

OS / Environment

n/a

Steps to Reproduce

spec = {
    "channels": [
        {
            "name": "SR",
            "samples": [
                {
                    "data": [50],
                    "modifiers": [
                        {
                            "data": [9],
                            "name": "abc",
                            "type": "shapesys",
                        },
                        {
                            "data": None,
                            "name": "Signal strength",
                            "type": "normfactor",
                        },
                    ],
                    "name": "Signal",
                },
                {
                    "data": [150],
                    "modifiers": [
                        {
                            "data": [7],
                            "name": "abc",
                            "type": "shapesys",
                        }
                    ],
                    "name": "Background",
                },
            ],
        }
    ],
    "measurements": [{"config": {"parameters": [], "poi": ""}, "name": "meas"}],
    "observations": [{"data": [160], "name": "SR"}],
    "version": "1.0.0",
}

import pyhf

ws = pyhf.Workspace(spec)
model = ws.model()
data = ws.data(model)
pyhf.set_backend("numpy", "minuit")
fit_result = pyhf.infer.mle.fit(data, model, return_uncertainties=True)
for par_name, par_res in zip(model.config.par_names(), fit_result):
    print(f"{par_name}: {par_res[0]:.3f} +/- {par_res[1]:.3f}")

File Upload (optional)

No response

Expected Results

The script above prints

Signal strength: 0.200 +/- 0.291
abc[0]: 1.000 +/- 0.047

as the result. Changing the shapesys data for the signal sample has no impact on the result at all, while changing it for the background sample does change the result.

I believe only a single Poisson rate is being set, since there is just one parameter controlling both modifiers and that single parameter is what gets constrained. The parameter does seem to scale both samples correctly, but there is only a single constraint term. I do not know whether it would be more sensible to create one constraint term per sample while keeping the parameter effect correlated, or to catch this scenario and raise a warning.
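One way to make this visible (a minimal sketch, reusing the spec from the reproducer above and the same pyhf API calls it uses) is to inspect the model's parameters and auxiliary data; a single auxdata entry corresponds to a single Poisson constraint term backing both correlated shapesys modifiers:

# Sketch: check that only one auxdata entry backs the two correlated shapesys
# modifiers. Assumes `spec` from the reproducer above is already defined.
import pyhf

model = pyhf.Workspace(spec).model()
print(model.config.par_names())  # one gamma parameter, e.g. ['Signal strength', 'abc[0]']
print(model.config.auxdata)      # a single entry -> a single Poisson constraint term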

Actual Results

No warnings are raised about the model being potentially invalid.

pyhf Version

pyhf-0.7.0rc2.dev30

Code of Conduct

  • I agree to follow the Code of Conduct
@alexander-held added the bug (Something isn't working) and needs-triage (Needs a maintainer to categorize and assign) labels on Aug 28, 2022
@matthewfeickert removed the needs-triage (Needs a maintainer to categorize and assign) label on Sep 7, 2022
@kratsg (Contributor)

kratsg commented Dec 5, 2023

Re-opening due to a user request from Luis's talk during the 2023 pyhf workshop.

@kratsg kratsg reopened this Dec 5, 2023
@alexander-held (Member, Author)

After thinking some more about this following a talk at the pyhf workshop (https://indico.cern.ch/event/1294577/contributions/5677127/), I think there is a meaningful way to correlate these modifiers across samples. Conceptually, this would be similar to staterror, but with some important differences in behavior.

A staterror term in a bin only needs a single float to keep track of the auxdata (essentially the constraint term width for the Gaussian). This is because the per-sample uncertainties are all combined in quadrature, and all samples then vary with that total MC statistical uncertainty.
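As a rough illustration (a sketch of the combination described above, not pyhf's actual implementation, borrowing the yields and uncertainties from the example spec purely for concreteness):

# Sketch: how a shared staterror combines per-sample MC uncertainties in one bin.
import math

nominal = [50.0, 150.0]  # per-sample yields in the bin
mc_unc = [9.0, 7.0]      # per-sample MC statistical uncertainties

total_unc = math.sqrt(sum(u**2 for u in mc_unc))  # combined in quadrature
rel_width = total_unc / sum(nominal)              # single Gaussian constraint width

print(f"one gamma parameter, Gaussian constraint width {rel_width:.4f}")  # ~0.057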

For shapesys, the way I am thinking about this would be to not combine uncertainties per sample in the same way, but only to correlate the nuisance parameter. That would be similar to e.g. histosys in a single bin, but histosys always uses a unit Gaussian constraint, so the relevant auxdata is the same no matter which histosys modifiers across samples are correlated; what is sample-specific there is the data in the histosys modifier itself. For a shapesys, the (Poisson) constraint term width would differ per sample, so we would need sample-specific auxdata to track that. As far as I know, this does not currently exist conceptually within pyhf.
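For concreteness, a sketch of how the per-sample Poisson constraint strengths would differ, using the shapesys convention that the auxdata is (nominal / uncertainty)**2 and the numbers from the example spec above:

# Sketch: per-sample shapesys auxdata, (nominal / uncertainty)**2, for the example
# spec. With correlated modifiers each sample would need its own auxdata entry,
# which pyhf does not track today.
samples = {"Signal": (50.0, 9.0), "Background": (150.0, 7.0)}

for name, (nominal, unc) in samples.items():
    aux = (nominal / unc) ** 2
    print(f"{name}: (nominal/unc)**2 = {aux:.1f}")
# Signal: ~30.9, Background: ~459.2 -> the constraint widths differ per sample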
