This repository was archived by the owner on Jan 27, 2023. It is now read-only.

Question on behavior of weights and variances #97

Closed
@matthewfeickert

From the discussions on Gitter today, motivated by matthewfeickert/heputils#24, I think I am confused by how uproot3-methods treats weights and variances. Here is a short example:

$ cat requirements.txt 
uproot~=4.0.6
uproot3~=3.14.4
$ pip list | grep "uproot"
uproot            4.0.6
uproot3           3.14.4
uproot3-methods   0.10.0

# issue.py
import numpy as np
import uproot
import uproot3

if __name__ == "__main__":
    bins = np.arange(0, 8)
    counts = np.array([2 ** x for x in range(len(bins[:-1]))])
    with uproot3.recreate("test.root", compression=uproot3.ZLIB(4)) as outfile:
        outfile["data"] = (counts, bins)

    root_file = uproot.open("test.root")
    hist = root_file["data"]

    print(f"hist has weights: {hist.weighted}")
    print(f"hist values: {hist.values()}")
    print("\nvariances are square of uncertainties\n")
    print(f"hist errors: {hist.errors()}")
    print(f"hist variances: {hist.variances()}")
    assert hist.variances().tolist() == np.square(hist.errors()).tolist()
    print("\nbut errors are not sqrt of values\n")

    print(f"expected errors to be: {np.sqrt(hist.values())}")

$ python issue.py
hist has weights: True
hist values: [ 1  2  4  8 16 32 64]

variances are square of uncertainties

hist errors: [ 1.  2.  4.  8. 16. 32. 64.]
hist variances: [1.000e+00 4.000e+00 1.600e+01 6.400e+01 2.560e+02 1.024e+03 4.096e+03]

but errors are not sqrt of values

expected errors to be: [1.         1.41421356 2.         2.82842712 4.         5.65685425
 8.        ]

The help for .errors gives:

Help on method errors in module uproot.behaviors.TH1:

errors(flow=False) method of uproot.dynamic.Model_TH1I_v3 instance
    Args:
        flow (bool): If True, include underflow and overflow bins before and
            after the normal (finite-width) bins.
    
    Errors (uncertainties) in the :ref:`uproot.behaviors.TH1.Histogram.values`
    as a 1, 2, or 3 dimensional ``numpy.ndarray`` of ``numpy.float64``.
    
    If ``fSumw2`` (weights) are available, they will be used in the
    calculation of the errors. If not, errors are assumed to be the square
    root of the values.
    
    Setting ``flow=True`` increases the length of each dimension by two.
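
To make the rule in that help text concrete, here is a minimal NumPy-only sketch. It is not uproot's implementation: `errors_sketch` is a hypothetical helper, and the sample entries, weights, and bin edges are made up for illustration. It takes the square root of the per-bin sum of squared weights when one is supplied, and falls back to the square root of the values otherwise.

```python
import numpy as np


def errors_sketch(values, sumw2=None):
    """Hypothetical helper mirroring the rule described in the help text."""
    values = np.asarray(values, dtype=np.float64)
    if sumw2 is not None and len(sumw2) > 0:
        # Weighted case: error is the square root of the sum of squared weights.
        return np.sqrt(np.asarray(sumw2, dtype=np.float64))
    # Unweighted case: Poisson-like errors, sqrt of the bin contents.
    return np.sqrt(values)


# Illustrative data (assumed): four entries with weights, two bins.
x = np.array([0.5, 0.6, 0.7, 1.5])
w = np.array([0.5, 0.5, 2.0, 1.0])
edges = np.array([0.0, 1.0, 2.0])

values, _ = np.histogram(x, bins=edges, weights=w)      # sum of weights per bin
sumw2, _ = np.histogram(x, bins=edges, weights=w ** 2)  # sum of squared weights per bin

weighted_errors = errors_sketch(values, sumw2)
counting_errors = errors_sketch(values)

print(f"values: {values}")
print(f"sumw2: {sumw2}")
print(f"errors with sumw2: {weighted_errors}")
print(f"errors without sumw2: {counting_errors}")
```

In this sketch the two results differ in the first bin, because the entries there do not all have unit weight.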

I think(?) this is due to the behavior of

valuesarray = numpy.empty(len(content) + 2, dtype=content.dtype)
valuesarray[1:-1] = content
valuesarray[0] = 0
valuesarray[-1] = 0
out.extend(valuesarray)
out._fSumw2 = valuesarray ** 2

where fSumw2 is set to the square of the bin contents, so it seems that the weights are taken to be the values of the NumPy array and the errors come out equal to the values. This is not what I would have expected.
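
The effect of that last assignment can be checked with NumPy alone. The sketch below reuses the padding from the snippet and compares two choices for the sum of squared weights: the square of the contents, as written, and the contents themselves, which is what unit-weight fills would give (an assumption about the intended Poisson case, not something uproot3 does). The names `sumw2_as_written` and `sumw2_poisson` are illustrative.

```python
import numpy as np

content = np.array([1, 2, 4, 8, 16, 32, 64])

# Same padding as the snippet: empty underflow and overflow bins.
valuesarray = np.empty(len(content) + 2, dtype=content.dtype)
valuesarray[1:-1] = content
valuesarray[0] = 0
valuesarray[-1] = 0

# As written: each bin behaves like one entry whose weight is the bin content.
sumw2_as_written = valuesarray ** 2

# Assumed Poisson case: N unit-weight entries per bin, so sum(w^2) = N.
sumw2_poisson = valuesarray.astype(np.float64)

# Drop the flow bins before comparing with the printed output.
errors_as_written = np.sqrt(sumw2_as_written)[1:-1]
errors_poisson = np.sqrt(sumw2_poisson)[1:-1]

print(f"errors with sumw2 = values ** 2: {errors_as_written}")
print(f"errors with sumw2 = values: {errors_poisson}")
```

The first array matches the `hist errors` output above, and the second matches the expected square-root errors.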

What is the proper way to create an uproot3 TH1 with Poisson uncertainties?
