Integrating systematics handling into nanoevents #529

lgray · 2021-05-13T20:41:32Z

Sketching out ideas for rich/extensible/efficient systematics handling in nanoevents.

Note / warning: This is not meant to suggest a buy-in framework for systematics, but rather to engineer a naming convention for extensions by which systematics can be added to an analysis. This naming convention is then assumed by statistics tools to create variations and ensembles expecting common data. What is here right now is attempting to get to the point where a schema can be defined (doing this beforehand is self-defeating). So while it may look like I'm writing a deeply opt-in framework, what I am opting in to is agreeing to an interface: that we should put systematics in a field called "systematics" in a physics object or event. I am not trying to extend nanoevents such that everyone has to use it forever to calculate systematics. I am trying to alter nanoevents to suit the naming convention we are solving for.

New methods class: Systematic. If an object has systematics associated to it then it should inherit from this.

Also adds a systematics subdir to nanoaod.methods; there is a simple "Up/Down" systematics class there right now.

Right now this covers NanoEvents, and then the NanoAOD object classes.

It may make more sense to tie it directly to NanoCollection so that everything gets it.

There is one special keyword so far, "weight", which tells the systematics to append weight_SystematicName to the class instead of altering the value of a feature in the object. It's clunky right now but it gets the job done correctly.
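A minimal sketch of the naming convention described above, using plain dicts in place of a real NanoEvents record. The helper `vary` and the names `PtScale` and `IsoSF` are hypothetical and are not coffea API; the sketch only illustrates that a "weight" systematic appends a new weight_<name> field, while any other target replaces the value of an existing feature in a varied copy.

```python
def vary(record, name, what, up, down):
    """Toy helper (hypothetical, not coffea API): build up/down copies of a record."""
    varied = {}
    for direction, values in (("up", up), ("down", down)):
        copy = dict(record)  # shallow copy of the nominal fields
        if what == "weight":
            # special keyword: append a new field, leave existing features alone
            copy[f"weight_{name}"] = values
        else:
            # any other target: replace the value of that feature in the copy
            copy[what] = values
        varied[direction] = copy
    return varied


muons = {"pt": [30.0, 45.0], "eta": [0.1, -1.2]}

pt_scale = vary(
    muons, "PtScale", "pt",
    up=[pt * 1.05 for pt in muons["pt"]],
    down=[pt * 0.95 for pt in muons["pt"]],
)
iso_sf = vary(muons, "IsoSF", "weight", up=[1.02, 1.01], down=[0.98, 0.99])

print(sorted(iso_sf["up"]))    # field names of the weight-varied copy
print(sorted(pt_scale["up"]))  # field names of the pt-varied copy
```

In this toy, the nominal record is never modified; each variation is a separate view that differs from nominal in exactly one field.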

I'll try to keep systematics.ipynb working while I hack on this.

Guiding ideas:

This is to index and evaluate systematics in a numpythonic way
"fancy indexing for systematics"
specification for an array form that you can build any systematics evaluation / ensemble building machinery on top of.
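To make the "fancy indexing" idea concrete, here is a small pure-Python sketch, assuming variations are stored per name and direction. The `Variations` class and the labels `JES` and `JER` are illustrative inventions, not the schema this PR defines; the point is only that ensemble-building code can be written against labels and arrays without knowing how each variation was produced.

```python
class Variations:
    """Toy container (hypothetical): nominal values plus named up/down variations."""

    def __init__(self, nominal):
        self.nominal = list(nominal)
        self._table = {}

    def add(self, name, up, down):
        self._table[name] = {"up": list(up), "down": list(down)}

    def __getitem__(self, key):
        # supports variations["nominal"] and variations["JES", "up"]
        if key == "nominal":
            return self.nominal
        name, direction = key
        return self._table[name][direction]

    def ensemble(self):
        """Yield (label, values) for nominal and every stored variation."""
        yield "nominal", self.nominal
        for name, directions in self._table.items():
            for direction, values in directions.items():
                yield f"{name}_{direction}", values


jet_pt = Variations([50.0, 80.0])
jet_pt.add("JES", up=[52.0, 83.0], down=[48.0, 77.0])
jet_pt.add("JER", up=[51.0, 81.0], down=[49.0, 79.0])

# downstream machinery only needs the labels and the arrays
sums = {label: sum(values) for label, values in jet_pt.ensemble()}
```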

lgray · 2022-02-24T18:00:16Z

To finish and merge:

improve speed of eventwise weights (right now it needs to read a lot for ... reasons?)
documentation

lgray · 2022-03-07T22:08:28Z

@nsmith- can you take a look?

lgray · 2022-03-10T14:29:33Z

@nsmith- ping?

lgray · 2022-03-11T04:14:42Z

@andrzejnovak too

andrzejnovak · 2022-03-11T16:34:33Z

Neat! Two UX questions. Should there be a specific one-way syst option, rather than filling up or down with nominal, for example? Could add_systematics also take a 2xN array directly, or does it only take a function so that it can be lazy?

lgray · 2022-03-11T17:20:20Z

@andrzejnovak

So UpDownSystematic is meant to be an example and it's possible to implement any set of variations as a systematic when starting from the Systematic base class. I guess the real answer to this is I should include more example classes!
You could accomplish that by passing a lambda that returns the arrays, but yeah it's all built around lazy functions as it is implemented right now.
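A rough sketch of the lambda trick mentioned above. `LazyVariation` is a hypothetical stand-in for the lazy machinery, not the class in this PR, and it assumes the varying function receives the nominal values as its only argument. A function is evaluated on first access; precomputed arrays are wrapped in a lambda that ignores its input.

```python
class LazyVariation:
    """Hypothetical stand-in: evaluate the varying function only on first access."""

    def __init__(self, nominal, varying_function):
        self._nominal = nominal
        self._function = varying_function
        self._cache = None

    def materialize(self):
        if self._cache is None:
            self._cache = self._function(self._nominal)
        return self._cache


calls = []  # records each real evaluation, to make the laziness visible


def scale_pt(pt):
    calls.append("scale_pt")
    return ([p * 1.05 for p in pt], [p * 0.95 for p in pt])


pt = [30.0, 45.0]

# nothing is computed when the variation is registered
lazy = LazyVariation(pt, scale_pt)

# already-computed (up, down) arrays: wrap them in a lambda that ignores its input
precomputed = ([31.0, 46.0], [29.0, 44.0])
wrapped = LazyVariation(pt, lambda _nominal: precomputed)
up, down = wrapped.materialize()
```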

lgray · 2022-03-23T20:46:15Z

@nsmith- ping

nsmith-

I think it looks good, modulo users creating reference cycles in the user-supplied variation function.
I'm a bit confused how the UI looks w.r.t. looping over the weight systematic variations. I expected some function that multiplies all the nominal weights together, but it looks like here we only track up/down.
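For illustration, one way a "multiply all the nominal weights together" helper could look, assuming per-event weights are kept per source and a variation replaces exactly one factor. The function `total_weight` and the weight names are hypothetical; this PR does not provide such a function.

```python
from math import prod

# per-event nominal weights, keyed by source (illustrative numbers)
nominal = {"pileup": [1.00, 0.90], "trigger": [0.98, 0.97]}

# varied weights, keyed by (source, direction)
varied = {
    ("pileup", "up"): [1.10, 1.00],
    ("pileup", "down"): [0.90, 0.80],
}


def total_weight(nominal, varied, shift=None):
    """Product of all weights per event, optionally swapping in one variation."""
    factors = dict(nominal)
    if shift is not None:
        name, _direction = shift
        factors[name] = varied[shift]
    # zip(*...) regroups the per-source lists into per-event tuples
    return [prod(per_event) for per_event in zip(*factors.values())]


w_nominal = total_weight(nominal, varied)
w_pileup_up = total_weight(nominal, varied, shift=("pileup", "up"))
```

Looping over the keys of `varied` then gives one total weight per variation, with every other source held at nominal.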

coffea/nanoevents/methods/base.py

nsmith- · 2022-03-25T20:00:06Z

coffea/nanoevents/methods/systematics/UpDownSystematic.py

+        )
+
+        self["__systematics__", f"__{name}__"] = awkward.virtual(
+            varying_function,


I'm worried it will be very hard for user-provided functions not to reference the array this virtual array is being attached to, and hence we get memory leaks.

Even though it's not a complete awkward schema, should we instead put the systematics in a dictionary or functor that's returned to the user?
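The concern can be reproduced with plain Python objects. In the sketch below, `Record` is a hypothetical stand-in for the array a systematic is attached to, and the behaviour shown relies on CPython reference counting. A closure stored on the record that also captures the record forms a cycle, so the record stays alive until the cycle collector runs. Keeping the variation functions in a separate dictionary, and passing the input in as an argument, is one possible reading of the dictionary suggestion and avoids the cycle.

```python
import gc
import weakref


class Record:
    """Hypothetical stand-in for the array a systematic is attached to."""

    def __init__(self, pt):
        self.pt = pt
        self.systematics = {}


def make_cyclic():
    rec = Record([30.0, 45.0])
    # the lambda captures rec, and rec stores the lambda: a reference cycle
    rec.systematics["scale"] = lambda: [p * 1.05 for p in rec.pt]
    return weakref.ref(rec)


def make_acyclic():
    rec = Record([30.0, 45.0])
    # the function takes its input as an argument and lives outside the record
    variations = {"scale": lambda pt: [p * 1.05 for p in pt]}
    return weakref.ref(rec), variations


gc.disable()  # make the effect deterministic for the demonstration
try:
    cyclic_ref = make_cyclic()
    acyclic_ref, variations = make_acyclic()
    cyclic_alive = cyclic_ref() is not None    # kept alive by the cycle
    acyclic_alive = acyclic_ref() is not None  # freed by reference counting
    gc.collect()                               # the cycle collector breaks the cycle
    cyclic_alive_after_gc = cyclic_ref() is not None
finally:
    gc.enable()
```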

As it is right now, the pattern almost always asks that you're referencing/altering an array that's in the object you're adding a systematic to.
The corrections or variations themselves are likely not a part of the record array though.

The "what" specification in add_systematic is usually some value in the record array. The input variations are not.

"should we instead put the systematics in a dictionary or functor that's returned to the user"

Can you sketch what this would look like for a user? I didn't quite follow

coffea/util.py

lgray marked this pull request as draft May 13, 2021 20:49

lgray force-pushed the systematics_work branch from 57e1e46 to cd7741c Compare February 24, 2022 17:57

lgray marked this pull request as ready for review February 24, 2022 17:59

lgray added 9 commits March 7, 2022 11:25

first working version of systematics toy schema

686f264

move systematics into nanoevents

31d2f95

move systematics into nanoevents, add to physics objects

070824b

correct up/down for resolution example

dfc130f

point to correct location of input file

1527893

weights and event weights

5a78301

fix errors in systematics notebook

3af6205

lint

46c1c49

do not go through the class for kinds

e9edfd7

lgray force-pushed the systematics_work branch from cd7741c to e9edfd7 Compare March 7, 2022 17:26

lgray added 2 commits March 7, 2022 14:29

systematics on event level variables are now fast

4dbe842

at least add docstrings

d4b6c07

lgray requested a review from nsmith- March 7, 2022 22:12

lgray changed the title from "[WIP] Integrating systematics handling into nanoevents" to "Integrating systematics handling into nanoevents" Mar 7, 2022

Merge branch 'master' into systematics_work

7b99ff1

nsmith- requested changes Mar 25, 2022

address comments

4b059dd

nsmith- approved these changes Apr 1, 2022

Merge branch 'master' into systematics_work

9f8be91

lgray merged commit 198dd99 into master Apr 1, 2022

nsmith- deleted the systematics_work branch April 14, 2022 21:24

alexander-held mentioned this pull request Apr 16, 2022

Object shape and values changing with nanoevents weight systematics #661

Open

nsmith- left a comment

nsmith- Mar 25, 2022

lgray Mar 31, 2022

nsmith- Apr 6, 2022
