Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding custom modifiers to cabinetry #382

Open
rmnmllr opened this issue Dec 16, 2022 · 30 comments · May be fixed by #385
Open

Adding custom modifiers to cabinetry #382

rmnmllr opened this issue Dec 16, 2022 · 30 comments · May be fixed by #385
Labels
enhancement New feature or request

Comments

@rmnmllr
Copy link
Contributor

rmnmllr commented Dec 16, 2022

Hey @alexander-held

For our analysis we would like to have a sqrt(mu) dependence of our LQ interference term. We found these custom modfiers of @lukasheinrich here that enables pyhf to accept definitions of custom functions (see also this pyhf feature branch). I tried a simple example with pyhf to verify that it also works with sqrt:

import pyhf
import numpy as np

from add_custom_modifier import add_custom_modifier

new_params = {
    'm1': { 'inits': (1.0,), 'bounds': ((-5.0, 5.0),) },
    'm2': { 'inits': (1.0,), 'bounds': ((-5.0, 5.0),) }
}
expanded_pyhf = add_custom_modifier('customfunc', ['m1', 'm2'], new_params)
model_pyhf = pyhf.Model(
    {
        'channels': [
            {
                'name': 'singlechannel',
                'samples': [
                    {
                        'name': 'signal', 'data': [10] * 20,
                        'modifiers': [
                            { 'name': 'f_s', 'type': 'customfunc', 'data': { 'expr': 'm1' } },
                        ],
                    },
                    {
                        'name': 'background', 'data': [100] * 20, 'modifiers': [
                            { 'name': 'f_b', 'type': 'customfunc', 'data': { 'expr': 'm2' } },      # m2 should be set to 1.0, model doesn't work if I set 'expr': ''
                        ]
                    },
                    {
                        'name': 'interference', 'data': [-1] * 20,
                        'modifiers': [
                            { 'name': 'f_i', 'type': 'customfunc', 'data': { 'expr': '-sqrt(m1)' } },
                        ],
                    },                
                    ],
            }
        ]
    },
    modifier_set = expanded_pyhf,
    poi_name = 'm1',
    validate = False,
    batch_size = 1
)

print(f'\nmodel_pyhf.expected_actualdata([[4.0, 1.0]]): {model_pyhf.expected_actualdata([[4.0, 1.0]])}\n')

which gives the desired output:

{'name': 'f_b', 'type': 'customfunc', 'data': {'expr': 'm2'}}
{'name': 'f_i', 'type': 'customfunc', 'data': {'expr': '-sqrt(m1)'}}
{'name': 'f_s', 'type': 'customfunc', 'data': {'expr': 'm1'}}

model_pyhf.expected_actualdata([[4.0, 1.0]]): [[142. 142. 142. 142. 142. 142. 142. 142. 142. 142. 142. 142. 142. 142.
  142. 142. 142. 142. 142. 142.]]

Now it would be great if we can also use this feature in cabinetry. I tried already by manipulating a workspace JSON created by cabinetry and reading it back in, but obviously while trying to validate the workspace it fails since it can't find the modifiers.
I know that there might be some development (or merging) necessary on pyhf but it would be very benificial for us if we can start a harmonisation somewhere/somehow!

Many thanks already!
Roman

(more on my studies to custom modifiers can be found on this branch, our working cabinetry setup that we want to modfiy with srqt can be found on this branch)

@rmnmllr
Copy link
Contributor Author

rmnmllr commented Dec 19, 2022

@pariRieck kindly notified me that I didn't post any code related to the validation problem with cabinetry.

With the modified workspace:

ws_test.json

{
    "channels": [
        {
            "name": "rec_smim_vs_Nbjet_HighMassOSs2thh",
            "samples": [
                {
                    "name": "LQ",
                    "data": [
                        -0.05606459826231003,
                        -0.12818728387355804,
                        -0.2890488803386688,
                        -0.32937803864479065,
                        -0.31295812129974365,
                        -0.19933012127876282,
                        -0.37242090702056885,
                        -0.17492030560970306,
                        0.04373674467206001,
                        -0.06333830207586288,
                        -0.15742823481559753,
                        -0.22937218844890594,
                        -0.2610972225666046,
                        -0.2189781814813614,
                        -0.3174850046634674,
                        -0.4120495915412903,
                        -0.15686370432376862,
                        -0.021678276360034943,
                        -0.0045187827199697495,
                        -0.01344318501651287,
                        -0.02579873986542225,
                        -0.04982074350118637,
                        -0.07583940774202347,
                        0.004755964037030935,
                        -0.003164847381412983,
                        -0.023233694955706596,
                        4.7682831905149214e-07
                    ],
                    "modifiers": [
                        {
                            "name": "staterror_rec_smim_vs_Nbjet_HighMassOSs2thh",
                            "type": "staterror",
                            "data": [
                                0.008668658155208378,
                                0.02022405434559139,
                                0.09189963471064777,
                                0.055041314193191596,
                                0.05193619411764286,
                                0.05250817228062847,
                                0.12196702526253103,
                                0.09533493472770015,
                                0.04304259063548334,
                                0.008735758281095813,
                                0.021724028820482494,
                                0.07366337499255725,
                                0.13916412487069324,
                                0.05304763182865001,
                                0.0740552658748865,
                                0.09788409139292459,
                                0.0856684157637054,
                                0.031515757273781884,
                                0.0028551501550830223,
                                0.00452889907684713,
                                0.01178630152077521,
                                0.01785971625171565,
                                0.029409669632190254,
                                0.008514955090679089,
                                0.005638608568332566,
                                0.022561821378473186,
                                3.3716854266200553e-07
                            ]
                        },
                        {
                            "data": null,
                            "name": "Signal_norm",
                            "type": "normfactor"
                        },
                        {
                            "name": "lumi",
                            "type": "normsys",
                            "data": {
                                "hi": 1.02,
                                "lo": 0.98
                            }
                        },
                        {
                            "name": "f1",
                            "type": "customfunc",
                            "data": {
                                "expr": "srqt(LQ_sig)"
                            }
                        }
                    ]
                },
                {
                    "name": "Ztautau",
                    "data": [
                        436.16748046875,
                        488.24713134765625,
                        592.1552124023438,
                        689.1245727539062,
                        541.2816772460938,
                        382.540283203125,
                        399.01214599609375,
                        159.18698120117188,
                        86.46419525146484,
                        7.16304349899292,
                        13.451751708984375,
                        19.981090545654297,
                        17.612442016601562,
                        11.224842071533203,
                        8.581535339355469,
                        9.507404327392578,
                        3.3369927406311035,
                        1.5719773769378662,
                        9.999999717180685e-10,
                        9.999999717180685e-10,
                        0.1462988257408142,
                        1.0335540771484375,
                        1.1950773000717163,
                        0.6874598264694214,
                        0.5342414975166321,
                        0.7435778975486755,
                        0.34987324476242065
                    ],
                    "modifiers": [
                        {
                            "name": "staterror_rec_smim_vs_Nbjet_HighMassOSs2thh",
                            "type": "staterror",
                            "data": [
                                20.5831328383989,
                                12.685052357006015,
                                20.653157956521852,
                                12.434290751560727,
                                7.3875447209423735,
                                5.693878050538804,
                                4.438010832782596,
                                2.27962780062575,
                                1.7460472807358358,
                                3.2293445779617964,
                                2.937584300663341,
                                3.8563300760376307,
                                2.5811459297683035,
                                2.132442170617347,
                                1.1302369417272697,
                                1.158050378970395,
                                0.4328057657996733,
                                0.293585895025791,
                                1.2549671742314705,
                                0.6123342689795961,
                                0.42450101635581405,
                                0.6192830685386423,
                                0.6034584276428155,
                                0.2628000515705097,
                                0.26773556125933473,
                                0.5429703570495382,
                                0.20594280974269555
                            ]
                        },
                        {
                            "name": "lumi",
                            "type": "normsys",
                            "data": {
                                "hi": 1.02,
                                "lo": 0.98
                            }
                        },
                        {
                            "name": "Zjetsnorm",
                            "type": "normsys",
                            "data": {
                                "hi": 1.1,
                                "lo": 0.9
                            }
                        }
                    ]
                },
                {
                    "name": "ttbar",
                    "data": [
                        3.1341841220855713,
                        12.720019340515137,
                        19.068552017211914,
                        23.117305755615234,
                        20.081254959106445,
                        18.199512481689453,
                        21.11097526550293,
                        5.927351474761963,
                        2.1456751823425293,
                        9.777939796447754,
                        36.38203048706055,
                        58.69355773925781,
                        55.95661926269531,
                        55.57005310058594,
                        38.96144485473633,
                        45.38947296142578,
                        14.888962745666504,
                        5.933634281158447,
                        7.021153926849365,
                        30.105974197387695,
                        41.13555145263672,
                        38.76812744140625,
                        40.98802185058594,
                        26.757863998413086,
                        23.775829315185547,
                        9.067926406860352,
                        3.494324207305908
                    ],
                    "modifiers": [
                        {
                            "name": "staterror_rec_smim_vs_Nbjet_HighMassOSs2thh",
                            "type": "staterror",
                            "data": [
                                0.7048574962698888,
                                1.447747788335472,
                                1.934288774132403,
                                2.036424385977694,
                                2.7116461600003383,
                                1.6235258929340821,
                                1.7616352447362071,
                                0.937151105504632,
                                0.5434934672310043,
                                1.2492619788650297,
                                2.364247978851889,
                                3.534628792227429,
                                4.882973770900739,
                                3.3520027176752483,
                                2.5269571859100153,
                                2.539433692589492,
                                1.4886337537773227,
                                1.3047660819005515,
                                1.0198372166654062,
                                2.1343265167845056,
                                3.6262796121476577,
                                2.765231829052504,
                                3.5557478234557185,
                                1.9370341684741987,
                                3.161288352499607,
                                1.1018118503488556,
                                0.7184444459774411
                            ]
                        },
                        {
                            "name": "lumi",
                            "type": "normsys",
                            "data": {
                                "hi": 1.02,
                                "lo": 0.98
                            }
                        },
                        {
                            "name": "ttnorm",
                            "type": "normsys",
                            "data": {
                                "hi": 1.1,
                                "lo": 0.9
                            }
                        },
                        {
                            "name": "ttbar-syst-ME",
                            "type": "normsys",
                            "data": {
                                "hi": 1.2540423950845079,
                                "lo": 0.7459576049154921
                            }
                        },
                        {
                            "name": "ttbar-syst-ME",
                            "type": "histosys",
                            "data": {
                                "hi_data": [
                                    5.589620550961276,
                                    13.99669448298217,
                                    23.05127199221935,
                                    26.189907309427298,
                                    22.649043407791225,
                                    17.228079477101723,
                                    18.26573635509351,
                                    5.676765489357286,
                                    2.122144502273853,
                                    10.905925305753641,
                                    39.10644624315509,
                                    57.07296865144896,
                                    62.12938274594555,
                                    57.49210270903877,
                                    34.77925045594372,
                                    39.80091365068339,
                                    14.733990610146979,
                                    6.3252451428130225,
                                    8.138022528148616,
                                    30.679739376006957,
                                    39.23730053717147,
                                    42.34591514099523,
                                    34.866447106988176,
                                    22.082967859637037,
                                    23.489714640163406,
                                    6.983240303823213,
                                    3.2344820494254867
                                ],
                                "lo_data": [
                                    0.6787476932098668,
                                    11.443344198048104,
                                    15.08583204220448,
                                    20.04470420180317,
                                    17.513466510421665,
                                    19.170945486277184,
                                    23.95621417591235,
                                    6.17793746016664,
                                    2.1692058624112054,
                                    8.649954287141867,
                                    33.657614730966,
                                    60.314146827066665,
                                    49.78385577944508,
                                    53.64800349213311,
                                    43.14363925352894,
                                    50.97803227216817,
                                    15.04393488118603,
                                    5.542023419503872,
                                    5.904285325550115,
                                    29.532209018768434,
                                    43.033802368101966,
                                    35.19033974181727,
                                    47.1095965941837,
                                    31.432760137189135,
                                    24.061943990207688,
                                    11.15261250989749,
                                    3.7541663651863297
                                ]
                            }
                        },
                        {
                            "name": "ttbar-syst-ISR",
                            "type": "normsys",
                            "data": {
                                "hi": 1.220732355329648,
                                "lo": 0.7792676446703519
                            }
                        },
                        {
                            "name": "ttbar-syst-ISR",
                            "type": "histosys",
                            "data": {
                                "hi_data": [
                                    2.498763994006645,
                                    13.236306487707676,
                                    19.29230991126219,
                                    22.833228247251583,
                                    24.941273081593412,
                                    18.464820350901913,
                                    20.110985620174425,
                                    6.736669155932718,
                                    3.6911596602628554,
                                    9.457252447743462,
                                    33.866664046268596,
                                    57.86392692014527,
                                    61.54608874899057,
                                    51.63878535918566,
                                    40.261401546171015,
                                    43.00489522688136,
                                    14.475398692042585,
                                    6.010084991843683,
                                    5.240978017858997,
                                    29.013586505569137,
                                    38.59096029569867,
                                    43.213121490802266,
                                    34.89366445424466,
                                    26.35626576140594,
                                    26.585479022782987,
                                    10.249787376359388,
                                    4.099461211408759
                                ],
                                "lo_data": [
                                    3.7696042501644977,
                                    12.203732193322598,
                                    18.844794123161638,
                                    23.401383263978886,
                                    15.221236836619479,
                                    17.934204612476993,
                                    22.110964910831434,
                                    5.1180337935912075,
                                    0.6001907044222032,
                                    10.098627145152046,
                                    38.8973969278525,
                                    59.52318855837036,
                                    50.367149776400055,
                                    59.501320841986214,
                                    37.66148816330164,
                                    47.7740506959702,
                                    15.302526799290423,
                                    5.857183570473212,
                                    8.801329835839734,
                                    31.198361889206254,
                                    43.680142609574766,
                                    34.323133392010234,
                                    47.08237924692722,
                                    27.159462235420232,
                                    20.966179607588106,
                                    7.8860654373613155,
                                    2.8891872032030577
                                ]
                            }
                        },
                        {
                            "name": "ttbar-syst-PS",
                            "type": "normsys",
                            "data": {
                                "hi": 1.2050235431816265,
                                "lo": 0.7949764568183735
                            }
                        },
                        {
                            "name": "ttbar-syst-PS",
                            "type": "histosys",
                            "data": {
                                "hi_data": [
                                    3.3534550281978777,
                                    12.033856595625718,
                                    21.30899227415003,
                                    24.509509817305503,
                                    25.8481115859143,
                                    18.5719450620786,
                                    21.83413761556733,
                                    7.246702915986389,
                                    2.957846598160289,
                                    7.25676576413713,
                                    38.48344633986547,
                                    56.8197268584179,
                                    59.926051710955605,
                                    59.05327869163685,
                                    41.963464931389396,
                                    42.0285541082128,
                                    15.447970060940204,
                                    7.209872812613123,
                                    5.943613148226668,
                                    25.23561613233863,
                                    35.01958528729912,
                                    41.148430421680196,
                                    34.2366790814458,
                                    24.59702612672,
                                    24.277325919171936,
                                    7.490115417310103,
                                    4.371238319149527
                                ],
                                "lo_data": [
                                    2.914913215973265,
                                    13.406182085404556,
                                    16.8281117602738,
                                    21.725101693924966,
                                    14.314398332298591,
                                    17.827079901300305,
                                    20.38781291543853,
                                    4.608000033537537,
                                    1.3335037665247698,
                                    12.299113828758378,
                                    34.28061463425562,
                                    60.567388620097724,
                                    51.98718681443502,
                                    52.08682750953503,
                                    35.95942477808326,
                                    48.75039181463876,
                                    14.329955430392804,
                                    4.657395749703771,
                                    8.098694705472063,
                                    34.976332262436756,
                                    47.251517617974315,
                                    36.387824461132304,
                                    47.739364619726075,
                                    28.918701870106172,
                                    23.274332711199158,
                                    10.645737396410599,
                                    2.617410095462289
                                ]
                            }
                        }
                    ]
                },
                {
                    "name": "Zll",
                    "data": [
                        0.8972583413124084,
                        7.064095973968506,
                        3.542963981628418,
                        0.8430342674255371,
                        2.118805170059204,
                        3.334984064102173,
                        1.5606133937835693,
                        0.050484806299209595,
                        0.05545007809996605,
                        9.999999717180685e-10,
                        0.013289040885865688,
                        0.08560466766357422,
                        0.061958350241184235,
                        9.999999717180685e-10,
                        9.999999717180685e-10,
                        0.004126517102122307,
                        9.999999717180685e-10,
                        9.999999717180685e-10,
                        9.999999717180685e-10,
                        0.0003875246911775321,
                        9.999999717180685e-10,
                        9.999999717180685e-10,
                        9.999999717180685e-10,
                        9.999999717180685e-10,
                        9.999999717180685e-10,
                        9.999999717180685e-10,
                        9.999999717180685e-10
                    ],
                    "modifiers": [
                        {
                            "name": "staterror_rec_smim_vs_Nbjet_HighMassOSs2thh",
                            "type": "staterror",
                            "data": [
                                0.3401330686174674,
                                4.852262569144132,
                                2.2531419087598037,
                                0.3680943155314295,
                                1.1551888083782842,
                                2.360606035437923,
                                1.551998700276235,
                                0.03246505830827842,
                                0.0518583933421947,
                                0.0,
                                0.010456572068689982,
                                0.0491447290895443,
                                0.043815842545414034,
                                4.777328058235764,
                                0.0,
                                0.007661113437351681,
                                0.0,
                                0.0,
                                0.0,
                                0.000387524686374085,
                                0.0,
                                0.0,
                                0.0,
                                0.0,
                                0.0,
                                0.0,
                                0.0
                            ]
                        },
                        {
                            "name": "lumi",
                            "type": "normsys",
                            "data": {
                                "hi": 1.02,
                                "lo": 0.98
                            }
                        },
                        {
                            "name": "Zjetsnorm",
                            "type": "normsys",
                            "data": {
                                "hi": 1.1,
                                "lo": 0.9
                            }
                        }
                    ]
                },
                {
                    "name": "Wjets",
                    "data": [
                        25.98953628540039,
                        148.0875701904297,
                        466.211181640625,
                        292.91748046875,
                        119.51535034179688,
                        112.79962921142578,
                        104.71670532226562,
                        17.376272201538086,
                        37.127410888671875,
                        0.8891265392303467,
                        4.487545013427734,
                        0.7926654815673828,
                        5.391210079193115,
                        3.3131606578826904,
                        9.999999717180685e-10,
                        5.14890718460083,
                        9.999999717180685e-10,
                        0.5045974850654602,
                        9.999999717180685e-10,
                        0.6733123660087585,
                        0.3701707720756531,
                        0.19689325988292694,
                        0.17677272856235504,
                        0.23742935061454773,
                        0.010014912113547325,
                        9.999999717180685e-10,
                        9.999999717180685e-10
                    ],
                    "modifiers": [
                        {
                            "name": "staterror_rec_smim_vs_Nbjet_HighMassOSs2thh",
                            "type": "staterror",
                            "data": [
                                18.95457349884378,
                                70.00297990502952,
                                177.35522110928375,
                                91.98004964094874,
                                33.965511736785935,
                                26.088142090666643,
                                12.079689416039482,
                                12.11578110729692,
                                24.444370096270514,
                                0.9677067665549353,
                                1.974704473289153,
                                2.0124259685957564,
                                2.2705996231545655,
                                1.746895662738218,
                                19.601065862247825,
                                1.9193211214366754,
                                0.6241604219430951,
                                0.3235011961482161,
                                0.0,
                                0.4626067886785151,
                                0.2209095013802317,
                                0.09795756807217645,
                                0.1699487101937834,
                                0.4260972795655261,
                                0.010014912457366916,
                                0.0,
                                0.0
                            ]
                        },
                        {
                            "name": "lumi",
                            "type": "normsys",
                            "data": {
                                "hi": 1.02,
                                "lo": 0.98
                            }
                        },
                        {
                            "name": "Wjetsnorm",
                            "type": "normsys",
                            "data": {
                                "hi": 1.1,
                                "lo": 0.9
                            }
                        }
                    ]
                },
                {
                    "name": "singletop",
                    "data": [
                        0.4650433659553528,
                        3.586678981781006,
                        5.3690314292907715,
                        6.5188493728637695,
                        6.794899940490723,
                        5.230123996734619,
                        6.455552101135254,
                        1.416122555732727,
                        0.9780453443527222,
                        2.19258451461792,
                        5.467511177062988,
                        9.99152946472168,
                        12.938706398010254,
                        11.11521053314209,
                        6.745467185974121,
                        8.185778617858887,
                        2.5264368057250977,
                        1.9584273099899292,
                        0.30111247301101685,
                        1.7843446731567383,
                        5.020862579345703,
                        3.012359142303467,
                        3.426671266555786,
                        1.3434967994689941,
                        2.187044620513916,
                        0.59363853931427,
                        0.6312496066093445
                    ],
                    "modifiers": [
                        {
                            "name": "staterror_rec_smim_vs_Nbjet_HighMassOSs2thh",
                            "type": "staterror",
                            "data": [
                                0.2718062271031703,
                                0.7923469562463225,
                                0.9118307475001273,
                                1.002149339337283,
                                0.9966891769592305,
                                0.9332358116647141,
                                1.0168364172907858,
                                0.4762117878606902,
                                0.3712258868717275,
                                0.5923058979371326,
                                0.8984444292653833,
                                1.2518221626946582,
                                1.3480464625279092,
                                1.3001132731189493,
                                1.016808287743095,
                                1.1206117848615749,
                                0.6771452482618175,
                                0.6090799639166785,
                                0.2144043880922945,
                                0.529493511178739,
                                0.8710017207447484,
                                0.6197534547258692,
                                0.7132123735532212,
                                0.418982653955289,
                                0.5525910152642597,
                                0.39046022026138144,
                                0.3224762788264719
                            ]
                        },
                        {
                            "name": "lumi",
                            "type": "normsys",
                            "data": {
                                "hi": 1.02,
                                "lo": 0.98
                            }
                        },
                        {
                            "name": "singletopnorm",
                            "type": "normsys",
                            "data": {
                                "hi": 1.1,
                                "lo": 0.9
                            }
                        }
                    ]
                },
                {
                    "name": "multiboson",
                    "data": [
                        5.772249698638916,
                        13.322222709655762,
                        19.44859504699707,
                        20.207860946655273,
                        18.15636444091797,
                        14.918745994567871,
                        20.061100006103516,
                        8.208643913269043,
                        6.3627095222473145,
                        0.16463185846805573,
                        0.9877825975418091,
                        0.8409641981124878,
                        0.5360387563705444,
                        0.9228699207305908,
                        1.1696816682815552,
                        0.6535868048667908,
                        0.44904208183288574,
                        0.678593635559082,
                        0.022182032465934753,
                        0.10862438380718231,
                        0.08747895807027817,
                        0.09262492507696152,
                        0.039744794368743896,
                        0.03034309111535549,
                        0.07780832052230835,
                        0.009849405847489834,
                        0.008091447874903679
                    ],
                    "modifiers": [
                        {
                            "name": "staterror_rec_smim_vs_Nbjet_HighMassOSs2thh",
                            "type": "staterror",
                            "data": [
                                0.7273612161108358,
                                1.120670974516059,
                                1.251605905903741,
                                1.406438716475429,
                                1.2567012483720807,
                                1.1372110327058536,
                                1.1542905228691978,
                                0.7829390382587853,
                                0.6832217717405377,
                                0.14013336880554994,
                                0.1979738831978867,
                                0.22565851796019412,
                                0.2772155648937874,
                                0.21525601444031867,
                                0.30849609485526763,
                                0.14112416269366374,
                                0.12435971213698106,
                                0.43622231998384436,
                                0.0068668539426901066,
                                0.06510054634385384,
                                0.028647073491025403,
                                0.03377145617810225,
                                0.019076840232356063,
                                0.027789509477681693,
                                0.026461215915870768,
                                0.010656429555943602,
                                0.006163517185353037
                            ]
                        },
                        {
                            "name": "lumi",
                            "type": "normsys",
                            "data": {
                                "hi": 1.02,
                                "lo": 0.98
                            }
                        },
                        {
                            "name": "multibosonnorm",
                            "type": "normsys",
                            "data": {
                                "hi": 1.1,
                                "lo": 0.9
                            }
                        }
                    ]
                }
            ]
        }
    ],
    "measurements": [
        {
            "name": "rec_smim_vs_Nbjet_HighMassOS",
            "config": {
                "parameters": [
                    {
                        "name": "LQ_sig",
                        "inits": [
                            0
                        ],
                        "bounds": [
                            [
                                0,
                                100
                            ]
                        ]
                    }
                ],
                "poi": "LQ_sig"
            }
        }
    ],
    "observations": [
        {
            "name": "rec_smim_vs_Nbjet_HighMassOSs2thh",
            "data": [
                472.42572021484375,
                673.0277099609375,
                1105.7955322265625,
                1032.72900390625,
                707.9483032226562,
                537.0233764648438,
                552.9171142578125,
                192.1658935546875,
                133.13348388671875,
                20.18732452392578,
                60.78990936279297,
                90.38542175292969,
                92.4969711303711,
                77.36933898925781,
                38.831180572509766,
                68.88927459716797,
                20.546157836914062,
                10.64723014831543,
                7.0030951499938965,
                32.448814392089844,
                46.76036071777344,
                43.103553771972656,
                45.82628631591797,
                29.05659294128418,
                26.584938049316406,
                10.414992332458496,
                4.483538627624512
            ]
        }
    ],
    "version": "1.0.0"
}

and executing:

import json

import pyhf
from add_custom_modifier import add_custom_modifier

### LQ signal sqrt attempt
new_params = {
    'LQ': { 'inits': (1.0,), 'bounds': ((-5.0, 5.0),) },
}
expanded_pyhf = add_custom_modifier('customfunc', ['LQ'], new_params)

with open('ws_test.json') as fin:
    ws_test = json.load(fin)

model_test = pyhf.Model(ws_test, modifier_set=expanded_pyhf, poi_name = 'LQ', validate=True, batch_size=1)

I get the following error:

Traceback (most recent call last):
  File "/terabig/muellerr/.conda/envs/fit/lib/python3.9/site-packages/pyhf/schema/validator.py", line 93, in validate
    return validator.validate(spec)
  File "/terabig/muellerr/.conda/envs/fit/lib/python3.9/site-packages/jsonschema/validators.py", line 304, in validate
    raise error
jsonschema.exceptions.ValidationError: {'name': 'f1', 'type': 'customfunc', 'data': {'expr': 'srqt(LQ)'}} is not valid under any of the given schemas

Failed validating 'anyOf' in schema['properties']['channels']['items']['properties']['samples']['items']['properties']['modifiers']['items']:
    {'anyOf': [{'$ref': '#/definitions/modifier/histosys'},
               {'$ref': '#/definitions/modifier/lumi'},
               {'$ref': '#/definitions/modifier/normfactor'},
               {'$ref': '#/definitions/modifier/normsys'},
               {'$ref': '#/definitions/modifier/shapefactor'},
               {'$ref': '#/definitions/modifier/shapesys'},
               {'$ref': '#/definitions/modifier/staterror'}]}

On instance['channels'][0]['samples'][0]['modifiers'][3]:
    {'data': {'expr': 'srqt(LQ)'}, 'name': 'f1', 'type': 'customfunc'}

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/terabig/muellerr/Leptoquarks/fitting/test_modifier.py", line 45, in <module>
    model_test = pyhf.Model(ws_test, modifier_set=expanded_pyhf, poi_name = 'LQ', validate=True, batch_size=1)
  File "/terabig/muellerr/.conda/envs/fit/lib/python3.9/site-packages/pyhf/pdf.py", line 766, in __init__
    schema.validate(self.spec, self.schema, version=self.version)
  File "/terabig/muellerr/.conda/envs/fit/lib/python3.9/site-packages/pyhf/schema/validator.py", line 95, in validate
    raise pyhf.exceptions.InvalidSpecification(err, schema_name)
pyhf.exceptions.InvalidSpecification: {'name': 'f1', 'type': 'customfunc', 'data': {'expr': 'srqt(LQ)'}} is not valid under any of the given schemas.
        Path: channels[0].samples[0].modifiers[3]
        Instance: {'name': 'f1', 'type': 'customfunc', 'data': {'expr': 'srqt(LQ)'}} Schema: model.json

@alexander-held
Copy link
Member

Hi @rmnmllr, thanks for all the information provided here! This issue looks to me independent of cabinetry (at least at this stage of the error you post) and on the side of pyhf. The default schema against which you validate does not contain support for custom modifiers, so you could either validate against a custom schema or skip validation (validate=False). I imagine @kratsg may also have ideas about how to best approach this workflow.

@kratsg
Copy link

kratsg commented Dec 19, 2022

You can also pass in your own schema to validate against, but we should have custom function expressions allowable at some point.

@rmnmllr
Copy link
Contributor Author

rmnmllr commented Dec 19, 2022

Hi @rmnmllr, thanks for all the information provided here! This issue looks to me independent of cabinetry (at least at this stage of the error you post) and on the side of pyhf. The default schema against which you validate does not contain support for custom modifiers, so you could either validate against a custom schema or skip validation (validate=False). I imagine @kratsg may also have ideas about how to best approach this workflow.

Thank you! Yes I'm aware that this relates to pyhf and as you wrote with the pyhf example above I can validate=False. I know this also doesn't relate to cabinetry but I tried just now cabinetry's:

model_cabi, data_cabi = cabinetry.model_utils.model_and_data(ws_test)

and I can only disable the validation going into the pyhf module here and comment it out. But it creates more problems and I don't like editing modules.

You can also pass in your own schema to validate against, but we should have custom function expressions allowable at some point.
That would be great for us @kratsg! Is there a timeline on this? :)

@alexander-held alexander-held added the enhancement New feature or request label Dec 19, 2022
@alexander-held
Copy link
Member

cabinetry currently doesn't support passing through the setting to turn validation off, or to provide a custom schema:

model = workspace.model(
modifier_settings={
"normsys": {"interpcode": "code4"},
"histosys": {"interpcode": "code4p"},
}
) # use HistFactory InterpCode=4 (default in pyhf since v0.6.0)

We can add a validate argument to model_utils.model_and_data to turn off validation completely. I do not remember the details of how exactly pyhf handles custom schemas, but I vaguely recall that you can set where to find them through pyhf API? If so, that may already work at the moment and nothing for that would be needed on the cabinetry side but I would need to confirm (@kratsg do you think this can / should be handled fully within pyhf?).

@kratsg
Copy link

kratsg commented Dec 19, 2022

details of how exactly pyhf handles custom schemas

it's in our docs: https://scikit-hep.org/pyhf/_generated/pyhf.schema.Schema.html#pyhf.schema.Schema

so you can always pass in the path to your custom schema and have it pass validation.

@kratsg
Copy link

kratsg commented Dec 19, 2022

That would be great for us @kratsg! Is there a timeline on this? :)

Timeline: unclear. We expect it to be in for 0.8.0 but there's a few things in the way first (multi-POI, etc..)

@alexander-held
Copy link
Member

@rmnmllr please have a look at #385, which adds the relevant arguments to model_utils.model_and_data to support the example you provided above. To turn validation back on, you would need to have a suitable JSON specification to use, and that would be set purely via pyhf API as far as I can tell. We could consider providing this also through cabinetry for convenience, but it is not clear to me whether that makes it easier to use or more confusing (as then there would be more than one way to do this, direct pyhf or via an intermediate cabinetry layer).

I added a poi_name argument as well for now, but I am not completely happy with it. The workspace specification you provide is in my view "broken" in the sense that the POI you set does not exist. That could be fixed in the future by moving the custom modifier into the workspace specification (not supported by pyhf yet), and then this issue would not exist regardless, or you could skip the POI for now in the specification (set it to "") and only set it dynamically later as needed. The cabinetry fit API is built around the idea of not needed to set it in the model itself, which makes handling multiple APIs easier for now that pyhf does not support this directly.

@rmnmllr
Copy link
Contributor Author

rmnmllr commented Dec 21, 2022

@rmnmllr please have a look at #385, which adds the relevant arguments to model_utils.model_and_data to support the example you provided above. To turn validation back on, you would need to have a suitable JSON specification to use, and that would be set purely via pyhf API as far as I can tell. We could consider providing this also through cabinetry for convenience, but it is not clear to me whether that makes it easier to use or more confusing (as then there would be more than one way to do this, direct pyhf or via an intermediate cabinetry layer).

I added a poi_name argument as well for now, but I am not completely happy with it. The workspace specification you provide is in my view "broken" in the sense that the POI you set does not exist. That could be fixed in the future by moving the custom modifier into the workspace specification (not supported by pyhf yet), and then this issue would not exist regardless, or you could skip the POI for now in the specification (set it to "") and only set it dynamically later as needed. The cabinetry fit API is built around the idea of not needed to set it in the model itself, which makes handling multiple APIs easier for now that pyhf does not support this directly.

Cool, thank you! I hope I can take a look before the holidays starting the day after tomorrow!
Just as a sanity check for testing, I pull the feature branch and pip install <local-path-to-repo>? Anything else I need to be aware of?

@alexander-held
Copy link
Member

Yes, that should work. You can also install it directly via:

pip install git+https://github.com/scikit-hep/cabinetry.git@feat/custom-modifier-models

@rmnmllr
Copy link
Contributor Author

rmnmllr commented Dec 22, 2022

Ok I cleaned up/simplified my workspace:

{
    "channels": [
        {
            "name": "rec_smim_vs_Nbjet_HighMassOSs2thh",
            "samples": [
                {
                    "name": "LQ",
                    "data": [
                        -0.05606459826231003,
                        -0.12818728387355804,
                        -0.2890488803386688,
                        -0.32937803864479065,
                        -0.31295812129974365
                    ],
                    "modifiers": [
                        {
                            "name": "f1",
                            "type": "customfunc",
                            "data": {
                                "expr": "-srqt(m1)"
                            }
                        }
                    ]
                }
            ]
        }
    ],
    "measurements": [
        {
            "name": "rec_smim_vs_Nbjet_HighMassOS",
            "config": {
                "parameters": [
                    {
                        "name": "LQ",
                        "inits": [
                            0
                        ],
                        "bounds": [
                            [
                                0,
                                10
                            ]
                        ]
                    }
                ],
                "poi": "m1"
            }
        }
    ],
    "observations": [
        {
            "name": "rec_smim_vs_Nbjet_HighMassOSs2thh",
            "data": [
                472.42572021484375,
                673.0277099609375,
                1105.7955322265625,
                1032.72900390625,
                707.9483032226562
            ]
        }
    ],
    "version": "1.0.0"
}

and I can confirm that I can create the model with cabinetry by doing:

import json

import cabinetry

from add_custom_modifier import add_custom_modifier

new_params = { 'm1': { 'inits': (1.0,), 'bounds': ((-5.0, 5.0),) } }
expanded_pyhf = add_custom_modifier('customfunc', ['m1'], new_params)

with open('test_mod_ws.json') as fin:
    ws_test = json.load(fin)

model, data = cabinetry.model_utils.model_and_data(ws_test, validate=False, modifier_set=expanded_pyhf)

Now if I try:

cabinetry.fit.fit(model, data)

I get a long list of errors:

Eval failed for data [472.42572021484375, 673.0277099609375, 1105.7955322265625, 1032.72900390625, 707.9483032226562] pars: [1.0]
Traceback (most recent call last):
  File "/terabig/muellerr/.conda/envs/fit-dev/lib/python3.9/site-packages/pyhf/pdf.py", line 942, in logpdf
    result = self.make_pdf(pars).log_prob(data)
  File "/terabig/muellerr/.conda/envs/fit-dev/lib/python3.9/site-packages/pyhf/pdf.py", line 904, in make_pdf
    mainpdf = self.main_model.make_pdf(pars)
  File "/terabig/muellerr/.conda/envs/fit-dev/lib/python3.9/site-packages/pyhf/pdf.py", line 633, in make_pdf
    lambdas_data = self.expected_data(pars)
  File "/terabig/muellerr/.conda/envs/fit-dev/lib/python3.9/site-packages/pyhf/pdf.py", line 695, in expected_data
    deltas, factors = self._modifications(pars)
  File "/terabig/muellerr/.conda/envs/fit-dev/lib/python3.9/site-packages/pyhf/pdf.py", line 660, in _modifications
    [self.modifiers_appliers[k].apply(pars) for k in self._factor_mods],
  File "/terabig/muellerr/.conda/envs/fit-dev/lib/python3.9/site-packages/pyhf/pdf.py", line 660, in <listcomp>
    [self.modifiers_appliers[k].apply(pars) for k in self._factor_mods],
  File "/terabig/muellerr/Leptoquarks/fitting/add_custom_modifier.py", line 108, in apply
    results = tensorlib.astensor([f(deps) for f in self.funcs])
  File "/terabig/muellerr/Leptoquarks/fitting/add_custom_modifier.py", line 108, in <listcomp>
    results = tensorlib.astensor([f(deps) for f in self.funcs])
  File "/terabig/muellerr/Leptoquarks/fitting/add_custom_modifier.py", line 11, in func
    return ne.evaluate(expression,local_dict=dict(zip(deps,d)))
  File "/terabig/muellerr/.conda/envs/fit-dev/lib/python3.9/site-packages/numexpr/necompiler.py", line 817, in evaluate
    _names_cache[expr_key] = getExprNames(ex, context)
  File "/terabig/muellerr/.conda/envs/fit-dev/lib/python3.9/site-packages/numexpr/necompiler.py", line 704, in getExprNames
    ex = stringToExpression(text, {}, context)
  File "/terabig/muellerr/.conda/envs/fit-dev/lib/python3.9/site-packages/numexpr/necompiler.py", line 289, in stringToExpression
    ex = eval(c, names)
  File "<expr>", line 1, in <module>
TypeError: 'VariableNode' object is not callable
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/terabig/muellerr/.conda/envs/fit-dev/lib/python3.9/site-packages/cabinetry/fit/__init__.py", line 482, in fit
    fit_results = _fit_model(
  File "/terabig/muellerr/.conda/envs/fit-dev/lib/python3.9/site-packages/cabinetry/fit/__init__.py", line 293, in _fit_model
    fit_results = _fit_model_pyhf(
  File "/terabig/muellerr/.conda/envs/fit-dev/lib/python3.9/site-packages/cabinetry/fit/__init__.py", line 91, in _fit_model_pyhf
    result, corr_mat, best_twice_nll, result_obj = pyhf.infer.mle.fit(
  File "/terabig/muellerr/.conda/envs/fit-dev/lib/python3.9/site-packages/pyhf/infer/mle.py", line 131, in fit
    return opt.minimize(
  File "/terabig/muellerr/.conda/envs/fit-dev/lib/python3.9/site-packages/pyhf/optimize/mixins.py", line 194, in minimize
    result = self._internal_minimize(
  File "/terabig/muellerr/.conda/envs/fit-dev/lib/python3.9/site-packages/pyhf/optimize/mixins.py", line 53, in _internal_minimize
    result = self._minimize(
  File "/terabig/muellerr/.conda/envs/fit-dev/lib/python3.9/site-packages/pyhf/optimize/opt_minuit.py", line 117, in _minimize
    minimizer.migrad(ncall=maxiter)
  File "/terabig/muellerr/.conda/envs/fit-dev/lib/python3.9/site-packages/iminuit/minuit.py", line 694, in migrad
    fm = migrad(ncall, self._tolerance)
  File "/terabig/muellerr/.conda/envs/fit-dev/lib/python3.9/site-packages/pyhf/optimize/opt_numpy.py", line 30, in func
    return objective(constrained_pars, data, pdf)[0]
  File "/terabig/muellerr/.conda/envs/fit-dev/lib/python3.9/site-packages/pyhf/infer/mle.py", line 53, in twice_nll
    return -2 * pdf.logpdf(pars, data)
  File "/terabig/muellerr/.conda/envs/fit-dev/lib/python3.9/site-packages/pyhf/pdf.py", line 942, in logpdf
    result = self.make_pdf(pars).log_prob(data)
  File "/terabig/muellerr/.conda/envs/fit-dev/lib/python3.9/site-packages/pyhf/pdf.py", line 904, in make_pdf
    mainpdf = self.main_model.make_pdf(pars)
  File "/terabig/muellerr/.conda/envs/fit-dev/lib/python3.9/site-packages/pyhf/pdf.py", line 633, in make_pdf
    lambdas_data = self.expected_data(pars)
  File "/terabig/muellerr/.conda/envs/fit-dev/lib/python3.9/site-packages/pyhf/pdf.py", line 695, in expected_data
    deltas, factors = self._modifications(pars)
  File "/terabig/muellerr/.conda/envs/fit-dev/lib/python3.9/site-packages/pyhf/pdf.py", line 660, in _modifications
    [self.modifiers_appliers[k].apply(pars) for k in self._factor_mods],
  File "/terabig/muellerr/.conda/envs/fit-dev/lib/python3.9/site-packages/pyhf/pdf.py", line 660, in <listcomp>
    [self.modifiers_appliers[k].apply(pars) for k in self._factor_mods],
  File "/terabig/muellerr/Leptoquarks/fitting/add_custom_modifier.py", line 108, in apply
    results = tensorlib.astensor([f(deps) for f in self.funcs])
  File "/terabig/muellerr/Leptoquarks/fitting/add_custom_modifier.py", line 108, in <listcomp>
    results = tensorlib.astensor([f(deps) for f in self.funcs])
  File "/terabig/muellerr/Leptoquarks/fitting/add_custom_modifier.py", line 11, in func
    return ne.evaluate(expression,local_dict=dict(zip(deps,d)))
  File "/terabig/muellerr/.conda/envs/fit-dev/lib/python3.9/site-packages/numexpr/necompiler.py", line 817, in evaluate
    _names_cache[expr_key] = getExprNames(ex, context)
  File "/terabig/muellerr/.conda/envs/fit-dev/lib/python3.9/site-packages/numexpr/necompiler.py", line 704, in getExprNames
    ex = stringToExpression(text, {}, context)
  File "/terabig/muellerr/.conda/envs/fit-dev/lib/python3.9/site-packages/numexpr/necompiler.py", line 289, in stringToExpression
    ex = eval(c, names)
  File "<expr>", line 1, in <module>
TypeError: 'VariableNode' object is not callable

which I guess is expected because add_custom_modifier is not implemented in pyhf yet. At this point, is there anything else I can do besides waiting for the feature to be implemented in pyhf?

@alexander-held
Copy link
Member

alexander-held commented Dec 22, 2022

Hi @rmnmllr, this issue is due to a typo: instead of "expr": "-srqt(m1)" it should be "expr": "-sqrt(m1)" in your specification.

For completeness (and since I may want to find this again in the future myself), here's another small setup that works with #385 and scikit-hep/pyhf#1991:

import cabinetry
from pyhf.experimental.modifiers import add_custom_modifier

cabinetry.set_logging()

spec = {
    "channels": [
        {
            "name": "singlechannel",
            "samples": [
                {
                    "name": "signal",
                    "data": [10, 10],
                    "modifiers": [
                        {
                            "name": "f1",
                            "type": "customfunc_sq",
                            "data": {"expr": "sqrt_mu**2"},
                        },
                    ],
                }
            ],
        },
    ],
    "measurements": [{"name": "meas", "config": {"parameters": [], "poi": ""}}],
    "observations": [{"name": "singlechannel", "data": [10.0, 10.0]}],
    "version": "1.0.0",
}

new_params = {
    "sqrt_mu": {"inits": (1.0,), "bounds": ((-5.0, 5.0),)},
}
expanded_pyhf = add_custom_modifier("customfunc_sq", ["sqrt_mu"], new_params)
model, data = cabinetry.model_utils.model_and_data(
    spec, validate=False, modifier_set=expanded_pyhf
)
data = [15] * 2 + model.config.auxdata
fit_results = cabinetry.fit.fit(model, data)

@rmnmllr
Copy link
Contributor Author

rmnmllr commented Dec 22, 2022

Hi @rmnmllr, this issue is due to a typo: instead of "expr": "-srqt(m1)" it should be "expr": "-sqrt(m1)" in your specification.

Arghh, thanks for spotting this - works now!

@alexander-held
Copy link
Member

It may be a good idea to catch this in pyhf with an informative error message (assuming this is possible, haven't looked at the details). Could you perhaps comment on that in scikit-hep/pyhf#1991 (can probably just link to your comment in this issue) so it does not get forgotten?

@rmnmllr
Copy link
Contributor Author

rmnmllr commented Jan 11, 2023

Hi @alexander-held,

How do I implement the custom modifier function with the configuration schema of cabinetry?

I assume it's in the NormFactors section of the YAML but from the docs I'm not sure if I can provide data and type like in the example workspaces above. I tried nevertheless but it fails validation here by comparing to the config schema config.json.

I tried adding:

"Data": {
    "type": "object"
},
"Type": {
    "type": "string"
}

to the config.json which makes it past validation but doesn't affect my workspace as shown as following.

With myconfig.yaml:

General:
  Measurement: "rec_smim_vs_Nbjet_HighMassOS"
  HistogramFolder: "outputs/"
  POI: "Signal_norm"
  InputPath: "fitter-2022-12-14-rebin3/{SamplePath}{VariationPath}.root:{RegionPath}"

Regions:
  - Name: "rec_smim_vs_Nbjet_HighMassOSs0tem_0bjets"
    RegionPath: "rec_smim_vs_Nbjet_HighMassOSs0tem_0bjets"

Samples:
  - Name: "Data"
    SamplePath: "data"
    Data: True

  - Name: "LQ"
    SamplePath: "LQ-gen-betaL33_1p5TeV"

NormFactors:
  - Name: "f1"
    Samples: "LQ"
    Data: { "expr": "-srqt(m1)" }
    Type: "customfunc"
    Nominal: 0
    Bounds: [0, 100]

and executing:

import cabinetry

config = cabinetry.configuration.load("myconfig.yml")
ws = cabinetry.workspace.build(config) 
cabinetry.workspace.save(ws, workspace_path)

I get for the saved ws:

{
    "channels": [
        {
            "name": "rec_smim_vs_Nbjet_HighMassOSs0tem_0bjets",
            "samples": [
                {
                    "data": [
                        -2.8630847930908203,
                        -0.28795331716537476,
                        0.019545821473002434
                    ],
                    "modifiers": [
                        {
                            "data": [
                                0.26342490191721557,
                                0.09063038118573631,
                                0.05747892970426467
                            ],
                            "name": "staterror_rec_smim_vs_Nbjet_HighMassOSs0tem_0bjets",
                            "type": "staterror"
                        },
                        {
                            "data": null,
                            "name": "f1",
                            "type": "normfactor"
                        }
                    ],
                    "name": "LQ"
                }
            ]
        }
    ],
    "measurements": [
        {
            "config": {
                "parameters": [
                    {
                        "bounds": [
                            [
                                0,
                                100
                            ]
                        ],
                        "inits": [
                            0
                        ],
                        "name": "f1"
                    }
                ],
                "poi": "Signal_norm"
            },
            "name": "rec_smim_vs_Nbjet_HighMassOS"
        }
    ],
    "observations": [
        {
            "data": [
                109277.0,
                10338.0,
                1582.0
            ],
            "name": "rec_smim_vs_Nbjet_HighMassOSs0tem_0bjets"
        }
    ],
    "version": "1.0.0"
}

Maybe I'm making a mistake at implementation, any help would be greatly appriciated!

@rmnmllr
Copy link
Contributor Author

rmnmllr commented Jan 11, 2023

Another problem I encounter. If I use your example above and modify:
"data": [10, 10] to "data": [-1, -1]
"data": {"expr": "sqrt_mu**2"} to "data": {"expr": "sqrt(sqrt_mu)"}

import cabinetry
from add_custom_modifier import add_custom_modifier

cabinetry.set_logging()

spec = {
    "channels": [
        {
            "name": "singlechannel",
            "samples": [
                {
                    "name": "signal",
                    "data": [
                        -1,
                        -1
                    ],
                    "modifiers": [
                        {
                            "name": "f1",
                            "type": "customfunc_sq",
                            "data": {"expr": "sqrt(sqrt_mu)"},
                        },
                    ],
                }
            ],
        },
    ],
    "measurements": [{"name": "meas", "config": {"parameters": [], "poi": ""}}],
    "observations": [{"name": "singlechannel", "data": [
                15,
                15
    ]}],
    "version": "1.0.0",
}

new_params = {
    "sqrt_mu": {"inits": (1.0,), "bounds": ((-5.0, 5.0),)},
}
expanded_pyhf = add_custom_modifier("customfunc_sq", ["sqrt_mu"], new_params)
model, data = cabinetry.model_utils.model_and_data(
    spec, validate=False, modifier_set=expanded_pyhf
)
# data = [15] * 2 + model.config.auxdata
fit_results = cabinetry.fit.fit(model, data)

I get the "estimated distance to minimum too large" error:

ERROR - pyhf.optimize.mixins -      corr: None
      fun: nan
 hess_inv: None
  message: 'Optimization failed. Estimated distance to minimum too large.'
   minuit: <FMin algorithm='Migrad' edm=nan edm_goal=0.0002 errordef=1.0 fval=nan has_accurate_covar=False has_covariance=True has_made_posdef_covar=False has_parameters_at_limit=False has_posdef_covar=False has_reached_call_limit=False has_valid_parameters=False hesse_failed=False is_above_max_edm=True is_valid=False nfcn=165 ngrad=0 reduced_chi2=nan time=0.05489987274631858>
(Param(number=0, name='sqrt_mu', value=1.0, error=0.0, merror=None, is_const=False, is_fixed=False, lower_limit=-5.0, upper_limit=5.0),)
     nfev: 165
     njev: 0
  success: False
      unc: None
        x: <ValueView sqrt_mu=1.0>
Traceback (most recent call last):
  File "/terabig/muellerr/.conda/envs/fit-dev/lib/python3.9/site-packages/pyhf/optimize/mixins.py", line 64, in _internal_minimize
    assert result.success
AssertionError
Traceback (most recent call last):
  File "/terabig/muellerr/.conda/envs/fit-dev/lib/python3.9/site-packages/pyhf/optimize/mixins.py", line 64, in _internal_minimize
    assert result.success
AssertionError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/terabig/muellerr/Leptoquarks/fitting/test_modifier_alex.py", line 44, in <module>
    fit_results = cabinetry.fit.fit(model, data)
  File "/terabig/muellerr/.conda/envs/fit-dev/lib/python3.9/site-packages/cabinetry/fit/__init__.py", line 482, in fit
    fit_results = _fit_model(
  File "/terabig/muellerr/.conda/envs/fit-dev/lib/python3.9/site-packages/cabinetry/fit/__init__.py", line 293, in _fit_model
    fit_results = _fit_model_pyhf(
  File "/terabig/muellerr/.conda/envs/fit-dev/lib/python3.9/site-packages/cabinetry/fit/__init__.py", line 91, in _fit_model_pyhf
    result, corr_mat, best_twice_nll, result_obj = pyhf.infer.mle.fit(
  File "/terabig/muellerr/.conda/envs/fit-dev/lib/python3.9/site-packages/pyhf/infer/mle.py", line 131, in fit
    return opt.minimize(
  File "/terabig/muellerr/.conda/envs/fit-dev/lib/python3.9/site-packages/pyhf/optimize/mixins.py", line 194, in minimize
    result = self._internal_minimize(
  File "/terabig/muellerr/.conda/envs/fit-dev/lib/python3.9/site-packages/pyhf/optimize/mixins.py", line 67, in _internal_minimize
    raise exceptions.FailedMinimization(result)
pyhf.exceptions.FailedMinimization: Optimization failed. Estimated distance to minimum too large.

Are negative values in the samples not working?
And for my understanding, why did you choose sqrt_mu**2 as your expression? Aren't you scaling by the quadratic term then?

@alexander-held
Copy link
Member

Hi @rmnmllr,

How do I implement the custom modifier function with the configuration schema of cabinetry?

this is currently not supported. The settings that cabinetry supports can be found here. We would have to add support for this, which may be a bit easier once the feature is finalized on the side of pyhf to avoid duplication of work in case anything changes there. For now you would have to build a workspace without that functionality in cabinetry and then post-process it (open the spec and you can modify it manually).

"data": [10, 10] to "data": [-1, -1]

Where are you using -1 in your example? I don't see it, but certainly a negative observation does not make sense physically and cannot work.

pyhf.exceptions.FailedMinimization: Optimization failed. Estimated distance to minimum too large.

One likely source of this issue is that the model predicts negative yields somewhere, resulting in a NaN likelihood (see also scikit-hep/pyhf#2093). MIGRAD (which we use to minimize) is not very good at recovering from that. You can clip the yield to prevent that (see the pyhf issue) or (ideally) define a model that cannot run into this problem to begin with.

And for my understanding, why did you choose sqrt_mu**2 as your expression? Aren't you scaling by the quadratic term then?

This was just a toy example with no deep meaning, I probably picked that to easily compare to an expression-less setup where I scale by mu directly.

@rmnmllr
Copy link
Contributor Author

rmnmllr commented Jan 11, 2023

this is currently not supported. The settings that cabinetry supports can be found here. We would have to add support for this, which may be a bit easier once the feature is finalized on the side of pyhf to avoid duplication of work in case anything changes there. For now you would have to build a workspace without that functionality in cabinetry and then post-process it (open the spec and you can modify it manually).

Ok, makes sense - thank you for your answers!

Where are you using -1 in your example? I don't see it, but certainly a negative observation does not make sense physically and cannot work.

Ah sorry, I copy-pasted from the wrong file - it's in the channels section and not observations (I edited the above code).
We have interference terms in the samples that can be both positive and negative in the same histogram, this works fine without the custom modifier.

You can clip the yield to prevent that (see the pyhf issue) or (ideally) define a model that cannot run into this problem to begin with.

You mean by setting the bracket and/or bound?

@alexander-held
Copy link
Member

You mean by setting the bracket and/or bound?

bound for normalization factors, and clipping is done in the Model construction, see scikit-hep/pyhf#1845.

I have not tried out samples with negative yields, I could not point spontaneously to any issue (as long as the total stays positive) but it may be worth some sanity checks (e.g. positive yield with negative normalization attached instead).

@rmnmllr
Copy link
Contributor Author

rmnmllr commented Jan 11, 2023

I have not tried out samples with negative yields, I could not point spontaneously to any issue (as long as the total stays positive) but it may be worth some sanity checks (e.g. positive yield with negative normalization attached instead).

I just manage to do it by using our values:

import cabinetry
from add_custom_modifier import add_custom_modifier

cabinetry.set_logging()

spec = {
    "channels": [
        {
            "name": "singlechannel",
            "samples": [
                {
                    "name": "signal",
                    "data": [
                        -0.05606459826231003,
                        -0.12818728387355804,
                        -0.2890488803386688,
                        -0.32937803864479065,
                        -0.31295812129974365
                    ],
                    "modifiers": [
                        {
                            "name": "f1",
                            "type": "customfunc",
                            "data": {"expr": "-sqrt(sqrt_mu)"},
                        },
                    ],
                }
            ],
        },
    ],
    "measurements": [{"name": "meas", "config": {"parameters": [], "poi": ""}}],
    "observations": [
        {
            "name": "singlechannel",
            "data": [
                472.42572021484375,
                673.0277099609375,
                1105.7955322265625,
                1032.72900390625,
                707.9483032226562
            ]
        }
    ],
    "version": "1.0.0",
}

new_params = {
    "sqrt_mu": {"inits": (1.0,), "bounds": ((-5.0, 5.0),)},
}
expanded_pyhf = add_custom_modifier("customfunc", ["sqrt_mu"], new_params)
model, data = cabinetry.model_utils.model_and_data(
    spec, validate=False, modifier_set=expanded_pyhf
)
fit_results = cabinetry.fit.fit(model, data)

which gives:

{'name': 'f1', 'type': 'customfunc', 'data': {'expr': '-sqrt(sqrt_mu)'}}
INFO - pyhf.pdf - adding modifier sqrt_mu (1 new nuisance parameters)
INFO - cabinetry.fit - performing maximum likelihood fit
INFO - cabinetry.fit - Migrad status:
┌─────────────────────────────────────────────────────────────────────────┐
│                                Migrad                                   │
├──────────────────────────────────┬──────────────────────────────────────┤
│ FCN = 5.152e+04                  │              Nfcn = 30               │
│ EDM = 2.21e-07 (Goal: 0.0002)    │                                      │
├──────────────────────────────────┼──────────────────────────────────────┤
│          Valid Minimum           │       SOME Parameters at limit       │
├──────────────────────────────────┼──────────────────────────────────────┤
│ Below EDM threshold (goal x 10)  │           Below call limit           │
├───────────────┬──────────────────┼───────────┬─────────────┬────────────┤
│  Covariance   │     Hesse ok     │ Accurate  │  Pos. def.  │ Not forced │
└───────────────┴──────────────────┴───────────┴─────────────┴────────────┘
DEBUG - cabinetry.fit - -2 log(L) = 51518.694084 at best-fit point
INFO - cabinetry.fit - fit results (with symmetric uncertainties):
INFO - cabinetry.fit - sqrt_mu =  5.0000 +/- 0.0013

I can't really expalin why the trial before didn't work.

Working towards the end game of getting a fit.limit plot, adding

limit_results = cabinetry.fit.limit(model, data)

produces the following error:

INFO - cabinetry.fit - calculating 95% confidence level upper limit for sqrt_mu
DEBUG - cabinetry.fit - setting lower parameter bound for POI to 0
INFO - cabinetry.fit - determining observed upper limit
W VariableMetricBuilder No free parameters.
/terabig/muellerr/.conda/envs/fit-dev/lib/python3.9/site-packages/pyhf/optimize/opt_minuit.py:134: IMinuitWarning: Hesse called with all parameters fixed
  minimizer.hesse()
W VariableMetricBuilder No free parameters.
W VariableMetricBuilder No free parameters.
DEBUG - cabinetry.fit - sqrt_mu = 0.1000, observed CLs = 0.6254
W VariableMetricBuilder No free parameters.
W VariableMetricBuilder No free parameters.
W VariableMetricBuilder No free parameters.
DEBUG - cabinetry.fit - sqrt_mu = 5.0000, observed CLs = 0.5065
ERROR - cabinetry.fit - CLs values at 0.1000 and 5.0000 do not bracket CLs=0.0500, try a different starting bracket
Traceback (most recent call last):
  File "/terabig/muellerr/Leptoquarks/fitting/test_modifier_alex.py", line 56, in <module>
    limit_results = cabinetry.fit.limit(model, data)
  File "/terabig/muellerr/.conda/envs/fit-dev/lib/python3.9/site-packages/cabinetry/fit/__init__.py", line 959, in limit
    res = scipy.optimize.root_scalar(
  File "/terabig/muellerr/.conda/envs/fit-dev/lib/python3.9/site-packages/scipy/optimize/_root_scalar.py", line 249, in root_scalar
    r, sol = methodc(f, a, b, args=args, **kwargs)
  File "/terabig/muellerr/.conda/envs/fit-dev/lib/python3.9/site-packages/scipy/optimize/_zeros_py.py", line 783, in brentq
    r = _zeros._brentq(f, a, b, xtol, rtol, maxiter, args, full_output, disp)
ValueError: f(a) and f(b) must have different signs

But maybe this is taking it a step too far for the moment..?

@alexander-held
Copy link
Member

DEBUG - cabinetry.fit - sqrt_mu = 0.1000, observed CLs = 0.6254
W VariableMetricBuilder No free parameters.
W VariableMetricBuilder No free parameters.
W VariableMetricBuilder No free parameters.
DEBUG - cabinetry.fit - sqrt_mu = 5.0000, observed CLs = 0.5065
ERROR - cabinetry.fit - CLs values at 0.1000 and 5.0000 do not bracket CLs=0.0500, try a different starting bracket

means you need a bigger starting bracket, it has to be chosen such that CLs=0.05 lies within the POI values you are starting with but 5 is too low.

@rmnmllr
Copy link
Contributor Author

rmnmllr commented Jan 25, 2023

Hi @alexander-held,
Sorry for the silence, but I can report some progress. I was able to make cabinetry work with custom modifiers. My method was generating a JSON workspace with the normal cabinetry.config.load and then adjusting the JSON normfactor block in the ["channels"][i]["samples"]["modifiers"] block:

{
    "data": null,
    "name": "Signal_norm",
    "type": "normfactor"
},

to:

{
    "name": "f0",
    "type": "customfunc",
    "data": {
        "expr": "sqrt(mu)"
    }
}

However this method is quite hacky and I struggle a lot with minimization errors. You can check my cabinetry setup here with the run_cabinetry.py and the modified workspace betaL33_1p5TeV_allhh_mod.json. I also gave a summery presentation at the high mass ditau meeting here (both links are ATLAS internal).

@alexander-held
Copy link
Member

We can make this easier in cabinetry by supporting it in the config. I would ideally like to first have this merged into pyhf (ideally with a corresponding release), then merge #385, and then merge such support in a separate PR to keep the individual changes more limited in scope. In the meantime I could open another PR for such changes and we can have a branch that tracks the combined changes to cabinetry.

How would you like to be specifying these things in the config? If we can find a good syntax, it should not be too much work to propagate that through for workspace building. Perhaps something like the following?

NormFactors:
  - Name: "mu"
    Samples: "sample_1"
    Nominal: 1
    Bounds: [0, 10]
  - Name: "f0"
    Samples: "sample_2"
    Expression: "sqrt(mu)"

@rmnmllr
Copy link
Contributor Author

rmnmllr commented Jan 26, 2023

That sounds and looks great to me!

If you create your model like this, would the Name: "f0" stay fixed throughout the pyhf workspace? I noticed in my custom JSON workspace that if I have multiple channels and the custom function modifier has the same name e.g. f0 then I get an error:

Traceback (most recent call last):
  File "/terabig/muellerr/Leptoquarks/fitting/run_cabinetry.py", line 345, in <module>
    run_cabinetry(args["inputs"], args["variables"], args["couplings"], gen_input, plot, limit, ranking, args["custom_modifiers"])
  File "/terabig/muellerr/Leptoquarks/fitting/run_cabinetry.py", line 146, in run_cabinetry
    model, data = cabinetry.model_utils.model_and_data(ws_mod, validate=False, modifier_set=expanded_pyhf)
  File "/terabig/muellerr/.conda/envs/fit-dev/lib/python3.9/site-packages/cabinetry/model_utils.py", line 79, in model_and_data
    model = workspace.model(
  File "/terabig/muellerr/.conda/envs/fit-dev/lib/python3.9/site-packages/pyhf/workspace.py", line 443, in model
    return Model(modelspec, **config_kwargs)
  File "/terabig/muellerr/.conda/envs/fit-dev/lib/python3.9/site-packages/pyhf/pdf.py", line 771, in __init__
    modifiers, _nominal_rates = _nominal_and_modifiers_from_spec(
  File "/terabig/muellerr/.conda/envs/fit-dev/lib/python3.9/site-packages/pyhf/pdf.py", line 133, in _nominal_and_modifiers_from_spec
    raise exceptions.InvalidModel(
pyhf.exceptions.InvalidModel: Trying to add paramset customfunc/f0 on LQ sample in rec_smim_vs_Nbjet_HighMassOSs2thh_1bjets channel but other paramsets exist with the same name.

but I guess this will be handled by pyhf as it's not directly related to cabinetry.

@rmnmllr
Copy link
Contributor Author

rmnmllr commented Mar 1, 2023

hey @alexander-held,
maybe it's worth updating you with the latest progress.

I managed to implement the custom scaling functions by modifying the workspace (as suggested by you above) as follows. In our analysis we have two terms that need to be scaled differently: interference by g2 and BSM by g2**2. The following code does that in the workspace:

def add_custom_func(workspace, equation="g2**2"):
    """Adds a custom function `equation` to the `pyhf` workspace.

    Note: POI not changed here, change happens in `run_cabinetry`

    Args:
        workspace (dict): model given by `cabinetry.workspace.build(config)`
        equation (str, optional): equation to scale the signal, POI needs to be called `g2`

    Returns:
        workspace (dict): modified workspace
    """

    for channel in workspace["channels"]:
        for sample in channel["samples"]:

            if sample["name"] == "LQ_BSM":
                for i, modifier in enumerate(sample["modifiers"]):
                    if modifier["type"] == "normfactor":
                        sample["modifiers"][i]["name"] = "customfunc_" + channel["name"] + "_BSM"
                        sample["modifiers"][i]["type"] = "customfunc"
                        sample["modifiers"][i]["data"] = { "expr": equation }

            elif sample["name"] == "LQ_Int":
                for i, modifier in enumerate(sample["modifiers"]):
                    if modifier["type"] == "normfactor":
                        sample["modifiers"][i]["name"] = "customfunc_" + channel["name"] + "_Int"
                        sample["modifiers"][i]["type"] = "customfunc"
                        sample["modifiers"][i]["data"] = { "expr": "g2" }
            else:
                continue

    return workspace

(from here)
I use then the following configuration (excerpt, full one):

General:
  Measurement: "rec_smim_vs_Nbjet_HighMassOS"
  HistogramFolder: "outputs/"
  POI: "g2"
  InputPath: "to_be_set_at_execution/{SamplePath}{VariationPath}.root:{RegionPath}"

Regions:
  - Name: "HighMassOSs2thh_0bjets"
    RegionPath: "rec_smim_vs_Nbjet_HighMassOSs2thh_0bjets"
  ...

Samples:
  - Name: "Data"
    SamplePath: "asimov-Sh2211"
    Data: True

  - Name: "LQ_Int"
    SamplePath: "to_be_filled_at_execution"

  - Name: "LQ_BSM"
    SamplePath: "to_be_filled_at_execution"
  ...

Systematics:
  ...

NormFactors:
  - Name: "IntAndBSM"
    Samples: ["LQ_Int", "LQ_BSM"]
    Nominal: 0
    Bounds: [-20, 20]

  ...

with the following code snippet from here:

    ...
            if custom_func:
                ws = add_custom_func(ws)

                inits = ws["measurements"][0]["config"]["parameters"][0]["inits"][0]
                bounds = ws["measurements"][0]["config"]["parameters"][0]["bounds"][0]

                new_params = {
                    "g2": {
                        'inits': (inits,),
                        'bounds': ((bounds[0], bounds[1]),)
                    }
                }
                expanded_pyhf = add_custom_modifier('customfunc', ['g2'], new_params)

                model, data = cabinetry.model_utils.model_and_data(ws, validate=False, modifier_set=expanded_pyhf)
    ...

First results are promising, and testing the individual terms separately also worked.

The only thing I'm uncertain of is the scaling of two input samples with one POI g2. Did I set up the config right in the NormFactors block? Or do you see any other dealbreaker? I got suspicious because I have yield tables with 0.0 $\pm$ 0.0 on the first entry of the Normfactors: Samples: list and 0.0 $\pm$ some value on the second.

@alexander-held
Copy link
Member

Are you removing the IntAndBSM normalization factor dynamically or is that meant to be there? As that is set to a nominal value of 0 and affects your leptoquark signal, you would get pre-fit yields of zero for that signal. If you don't need this normalization factor then you could skip it here, although I just realized that the cabinetry config validation would fail in that case since the minimum number of normalization factors is currently set to one. I don't know what will happen if you skip that validation. We could relax that requirement but in principle you would still want a normalization factor, just right now we cannot directly add the expressions there without having proper support implemented.

yield tables with 0.0 ± 0.0 on the first entry of the Normfactors: Samples: list and 0.0 ± some value on the second

I am not sure what you mean by this list?

@rmnmllr
Copy link
Contributor Author

rmnmllr commented Mar 1, 2023

Are you removing the IntAndBSM normalization factor dynamically or is that meant to be there?

Isn't this adjusted here:

                new_params = {
                    "g2": {
                        'inits': (inits,),
                        'bounds': ((bounds[0], bounds[1]),)
                    }
                }
                expanded_pyhf = add_custom_modifier('customfunc', ['g2'], new_params)

At least in the model workspace I don't see any normalizing factor anymore e.g.:

    ...
                        {
                            "data": {
                                "expr": "g2"
                            },
                            "name": "customfunc_HighMassOSs0tem_0bjets_Int",
                            "type": "customfunc"
                        },
                        {
                            "data": {
                                "expr": "g2**2"
                            },
                            "name": "customfunc_HighMassOSs0tem_0bjets_BSM",
                            "type": "customfunc"
                        },
    ...

but I see a init value (another?) in:

    "measurements": [
        {
            "config": {
                "parameters": [
                    {
                        "bounds": [
                            [
                                -20,
                                20
                            ]
                        ],
                        "inits": [
                            0
                        ],
                        "name": "IntAndBSM"
                    },
    ...

Maybe I'm getting confused by the similar terms..

yield tables with 0.0 ± 0.0 on the first entry of the Normfactors: Samples: list and 0.0 ± some value on the second

I am not sure what you mean by this list?

I mean the list that gets saved under Samples:

NormFactors:
  - Name: "IntAndBSM"
    Samples: ["LQ_Int", "LQ_BSM"]   # <-
    Nominal: 0
    Bounds: [-20, 20]

Depending on the order of the terms in the list, the post-fit yield error is non-zero for the second entry but 0.0 for the first.

@alexander-held
Copy link
Member

alexander-held commented Mar 1, 2023

Isn't this adjusted here:

That is probably the case, at least I would not know where else the modifier disappears. In that case the name "add" is a bit misleading though (cc @kratsg / @lukasheinrich who can presumably confirm).

Having the parameter left over in the config should not be a problem as long as it appears nowhere else in the workspace anyway, so also the init of zero there should not be the problem.

What is the best-fit value for g? I thought before you were talking about the pre-fit yields, but for post-fit it only matters what you get out of the fit. If that is zero then it would explain what you see.

Depending on the order of the terms in the list, the post-fit yield error is non-zero for the second entry but 0.0 for the first.

You can get the samples in the right order from model.config.samples, but the yield tables should also show what each entry corresponds to. Does the behavior change depending on the ordering here? It should not, so I am not sure I understand what you mean.

@rmnmllr
Copy link
Contributor Author

rmnmllr commented Mar 2, 2023

What is the best-fit value for g? I thought before you were talking about the pre-fit yields, but for post-fit it only matters what you get out of the fit. If that is zero then it would explain what you see.

It's indeed the post-fit yields, sorry for not specifying earlier.
From the ML fit I get:

g2 = -0.0000 +/- 3.0680

I don't think they change, I couldn't fully test limits etc. due to some troubles with (most likely) our systematics

@alexander-held
Copy link
Member

That parameter being zero explains why you get zero yields for the samples it acts on. I imagine the lower bound for that parameter is also zero? This would make the uncertainty estimate unreliable, so you may want to relax the bounds in that case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants