Question on behavior of weights and variances #97
Comments
The confusing thing about this (I believe this is what you're hitting) is that the TH1 is a NumPy array. That array corresponds to the values (bin contents) of the histogram. It might also have a
The issue is that ROOT's TH1 is a subclass of TArray, and in Uproot 3 I made TArray a NumPy array subclass. This confusion motivated a less direct modeling of C++ classes in Uproot 4, in which Python objects representing C++ objects contain their superclasses instead of inheriting from them. Not only did this introduce the strange semantics in which the histogram values are not an attribute (they are the object), but it also made other subclasses of TH1, such as TProfile, have incorrect
At least, that's what I think you're running into above.
The problem is this line:
out._fSumw2 = valuesarray ** 2
which sets fSumw2 assuming that the values in the ndarray were made with a single fill with a large weight, rather than
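To make the effect of that assignment concrete, here is a minimal pure-Python sketch. It does not call uproot3 or uproot3-methods; the variable names are illustrative only, and it assumes the reported error is the square root of the per-bin sum of squared weights. The bin contents are the ones used in the example later in this thread.

```python
import math

# Bin contents from the example in this thread: 1, 2, 4, ..., 64
values = [2 ** x for x in range(7)]

# What the quoted line amounts to: each bin is treated as ONE fill
# whose weight equals the bin content, so sumw2 = content ** 2.
sumw2_single_fill = [v ** 2 for v in values]

# Poisson-style assumption: each bin was filled `content` times with
# weight 1, so sumw2 = content * 1 ** 2 = content.
sumw2_unit_fills = list(values)

# Assumed convention: error = sqrt(sum of squared weights)
errors_single_fill = [math.sqrt(s) for s in sumw2_single_fill]
errors_unit_fills = [math.sqrt(s) for s in sumw2_unit_fills]

print("single large fill:", errors_single_fill)  # equal to the bin contents
print("unit-weight fills:", errors_unit_fills)   # sqrt of the bin contents
```

Under the first assumption the errors equal the bin contents themselves; under the second they are the square roots of the bin contents, which is what one would expect for Poisson uncertainties.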
Hm, it seems like this will always happen though, given that it is present here using only uproot3:
# uproot3_only.py
import numpy as np
import uproot3

if __name__ == "__main__":
    # Bin edges 0..7 give 7 bins; contents are 1, 2, 4, ..., 64
    bins = np.arange(0, 8)
    counts = np.array([2 ** x for x in range(len(bins[:-1]))])

    # Write the (counts, edges) pair as a histogram
    with uproot3.recreate("test.root", compression=uproot3.ZLIB(4)) as outfile:
        outfile["data"] = (counts, bins)

    # Read it back and inspect the stored sum of squared weights
    root_file = uproot3.open("test.root")
    hist = root_file["data"]
    print(f"hist._fSumw2 ({type(hist._fSumw2)}): {hist._fSumw2}")
    print(f"hist values ({type(hist.values)}): {hist.values}")
    print(f"hist variances ({type(hist.variances)}): {hist.variances}")
So won't it always have
and then the variances are set from it? :/ I may be missing something obvious, but from what Henry has pointed out it seems like this will always be the case if you're writing a histogram from NumPy arrays.
@henryiii to be clear, is this just boiling down to whether one needs to do
I think I now understand that this is also highlighting a difference in how ROOT and
# example_pyroot.py
import uproot
from ROOT import TH1F, TFile, kFALSE, kTRUE


def write_root_file(filename):
    n_bins = 7
    hist = TH1F("data", "", n_bins, 0, n_bins)
    hist.SetStats(kFALSE)
    # ROOT bin indices start at 1 (bin 0 is the underflow bin)
    for bin_idx in range(0, n_bins):
        hist.SetBinContent(bin_idx + 1, 2 ** bin_idx)
    hist.Sumw2(kTRUE)

    write_file = TFile(filename, "recreate")
    hist.Write()
    write_file.Close()


if __name__ == "__main__":
    write_root_file("pyroot_output.root")

    # Read the ROOT-written histogram back with uproot
    root_file = uproot.open("pyroot_output.root")
    hist = root_file["data"]
    print(f"hist has weights: {hist.weighted}")
    print(f"hist values: {hist.values()}")
    print(f"hist errors: {hist.errors()}")
    print(f"hist variances: {hist.variances()}")
gives
I need to think on this more, as I obviously don't understand the problem as well as everyone else, but would changing the behavior of
The call
If you have a bin with 5 entries, the usual thing to do is assume that it was filled 5 times, so
I think the above answers @kratsg's question: yes, when it says "sumw2", it's each weight that's squared, not the sum of them, so setting this with
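The distinction between squaring each weight and squaring the summed weight can be sketched with a single bin. This is a plain-Python illustration, not uproot or ROOT code; `fill` is a hypothetical helper introduced here only for the example.

```python
def fill(weights):
    """Accumulate one bin from a list of fill weights (hypothetical helper).

    Returns (sum of weights, sum of squared weights). Each weight is
    squared before summing; the sum itself is never squared.
    """
    sumw = sum(weights)
    sumw2 = sum(w ** 2 for w in weights)
    return sumw, sumw2


# Five fills with weight 1: content 5, sumw2 5
content_a, sumw2_a = fill([1.0] * 5)

# One fill with weight 5: content 5, sumw2 25
content_b, sumw2_b = fill([5.0])

print(content_a, sumw2_a)
print(content_b, sumw2_b)
```

Both bins end up with the same content, but only the first has a sum of squared weights equal to its content, which is the case that corresponds to the "filled 5 times" assumption described above.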
@jpivarski If you're happy with the solution Henry poses above I have a branch on
Sorry, just catching up. Yes: this sounds like a good solution, though I'll understand it better when I see it as a diff in a PR. Thanks!
From the discussions on Gitter today motivated by matthewfeickert/heputils#24 I think I am confused by the behavior of uproot3-methods' treatment of weights and variances. Here is a short example
As help for .errors gives
I think(?) this is due to the behavior of uproot3-methods/uproot3_methods/classes/TH1.py, lines 347 to 353 in b722ee6, where it seems that the weights are taken to be the values of the NumPy array; this is not what I would have expected.
What is the proper way to create an uproot3 TH1 with Poisson uncertainties?