[RF] Range not considered when computing integral of RooParamHistFunc #7182
The same problem also exists for RooHistFunc.
The logic for summing over histogram bins in different ranges used in RooHistPdf is also implemented in RooParamHistFunc. This means the range is now considered when computing integrals of RooParamHistFunc. RooParamHistFunc allows you to scale the counts in each bin with a parameter. The interface of RooDataHist::sum was extended with a function parameter to inject the logic of scaling the bin weight depending on the bin index. This commit partly fixes issue root-project#7182. We still need to implement the range feature in RooHistFunc.
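The idea of injecting a per-bin scaling function into a range-aware bin sum can be sketched without ROOT. This is only an illustration under stated assumptions: `Hist1D` and `sumInRange` are hypothetical names, the binning is uniform and one-dimensional, and partially covered bins are weighted by their overlap fraction. It is not the actual `RooDataHist::sum` signature.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for a 1D binned dataset (not the RooDataHist class).
struct Hist1D {
    double lo;                   // lower edge of the first bin
    double hi;                   // upper edge of the last bin
    std::vector<double> counts;  // one entry per uniform bin

    double binWidth() const { return (hi - lo) / static_cast<double>(counts.size()); }
};

// Sum the bin contents over [a, b]. Bins that are only partially covered
// contribute in proportion to the covered fraction of their width.
// scale(i) is the injected logic: it multiplies the weight of bin i,
// which is how a per-bin parameter could enter the sum.
double sumInRange(const Hist1D& h, double a, double b,
                  const std::function<double(std::size_t)>& scale) {
    assert(a <= b);
    const double w = h.binWidth();
    double total = 0.0;
    for (std::size_t i = 0; i < h.counts.size(); ++i) {
        const double binLo = h.lo + w * static_cast<double>(i);
        const double binHi = binLo + w;
        const double overlap = std::min(b, binHi) - std::max(a, binLo);
        if (overlap <= 0.0) continue;  // bin lies outside the range
        total += h.counts[i] * scale(i) * (overlap / w);
    }
    return total;
}
```

Passing a function that always returns 1 gives the plain range-restricted sum, while a function that looks up a parameter by bin index gives the scaled variant described above.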
While working on this issue, I noticed that also

```cpp
// g++ $(root-config --cflags) -o testHistPdf testHistPdf.cc $(root-config --libs) -lRooFitCore -lRooFit -g
#include "RooRealVar.h"
#include "RooHistPdf.h"
#include "RooDataHist.h"
#include "TH2D.h"
#include "TF2.h"

#include <iostream>

using namespace RooFit;

int main(int argc, char const *argv[]) {
    RooRealVar x("x", "x", 0, 10);
    RooRealVar y("y", "y", 0.05);

    // Fill a 2D histogram from the formula "y < 0.1"
    TH2D h2("h2", "h2", 20, 0, 10, 30, 0, 10);
    TF2 f2("f2", "y < 0.1");
    h2.FillRandom("f2", 1000);

    RooArgSet argSet{x, y};
    RooDataHist dh("dh", "dh", argSet, &h2);
    RooHistPdf phf("phf", "", argSet, dh);

    // Named range "R1": lower half of x, full range of y
    x.setRange("R1", 0, 5);
    y.setRange("R1", 0, 10);

    // Integrate over x and y, full range
    auto int1 = phf.createIntegral(argSet, argSet);
    std::cout << int1->getVal() << std::endl;

    // Integrate over x and y, range "R1"
    auto int2 = phf.createIntegral(argSet, argSet, "R1");
    std::cout << int2->getVal() << std::endl;

    // Integrate over x only (y is a slice variable), full range
    auto int3 = phf.createIntegral(x, x);
    std::cout << int3->getVal() << std::endl;

    // Integrate over x only (y is a slice variable), range "R1"
    auto int4 = phf.createIntegral(x, x, "R1");
    std::cout << int4->getVal() << std::endl;

    return 0;
}
```

Obviously, the integral of a constant function over only half of the x range should be half of the value you get for the full range, even if there is an additional variable in the slice set. However, this is what the program above outputs in ROOT master:
The first 3 integrals are correct, but the final one (custom range for x and slice of y) gives the wrong result (0.493 expected). I thought about how the integration should be done correctly for all of the
After a0fa4fa, this fixes the remaining analytical integral problems with the RooHistPdf/RooHistFunc/RooParamHistFunc family that are reported in issue root-project#7182 and Jira ticket ROOT-7413.

The starting point of this commit is the following observation: usage of RooHistPdf is far more common than RooHistFunc and its analytical integration code supports more special cases. So one can simply take the analytical integration code from RooHistPdf and substitute it for the existing code in RooHistFunc. However, there is one change that needs to be made when copy-pasting: the `correctForBinSize` in `RooDataHist::sum()` needs to be enabled because integrating the pdf and the function described by the histogram shape is slightly different:

- RooHistPdf: the integral of an empirical PDF is simply the integral of histogram counts
- RooHistFunc/RooParamHistFunc: here, the bin counts need to be multiplied by the bin volume

As the documentation of `RooDataHist::sum()` states, the bin volume is "the M-dimensional bin volume, (M = N(sumSet))". So for the bin volume, one has to take the product of the bin widths of each dimension in the sum arg set, and apparently not the slice dimensions. This seems to be wrongly implemented in two overloads of `RooDataHist::sum()`. The second overload (separate sumSet and sliceSet but no custom ranges) calculates the bin volume of the sliceSet as opposed to the sumSet. The third overload with the custom ranges calculates the full volume considering all arguments, not only the sumSet.

After implementing this correctly, `RooHistFunc::analyticalIntegral()` gives the correct results, but `RooHistPdf` is broken in the case where argSet is only a strict subset and there are no custom ranges (second overload of `RooDataHist::sum()`). This can be fixed by disabling a normalization of 1 over the bin volume that was only enabled in this case for reasons that are not clear to me.
With all these changes, the problems reported in GitHub issue root-project#7182 and Jira ticket ROOT-7413 are fixed. Furthermore, the bin normalization capabilities of `RooDataHist::sum()` are also used in `RooParamHistFunc` now. Further improvements could be the reduction of code duplication and implementation of unit tests for the various analytical integral configurations.
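The bin volume convention discussed in this commit message can be illustrated with a small ROOT-free sketch. Everything here is an assumption for illustration: `Hist2D` and `sumOverX` are hypothetical names, the binning is uniform, x is the only variable in the sum set, and y is a slice variable fixed to one bin. The point is only that the bin size correction uses the x bin width (M = 1) and never the y bin width.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 2D histogram with uniform bins (not the RooDataHist class).
struct Hist2D {
    double xLo, xHi, yLo, yHi;
    std::size_t nx, ny;
    std::vector<double> counts;  // row-major: counts[ix * ny + iy]

    double xWidth() const { return (xHi - xLo) / static_cast<double>(nx); }
    double yWidth() const { return (yHi - yLo) / static_cast<double>(ny); }
};

// Sum over x in [a, b] for a fixed y bin iy (the slice).
// correctForBinSize == false: PDF-like case, plain sum of counts.
// correctForBinSize == true:  function-like case, counts times bin volume,
// where the volume is the product of widths over the summed dimensions only.
// Here that is just xWidth(); yWidth() is deliberately not used because
// y is a slice variable, not part of the sum set.
double sumOverX(const Hist2D& h, std::size_t iy, double a, double b,
                bool correctForBinSize) {
    assert(a <= b && iy < h.ny);
    const double w = h.xWidth();
    const double volume = correctForBinSize ? w : 1.0;
    double total = 0.0;
    for (std::size_t ix = 0; ix < h.nx; ++ix) {
        const double binLo = h.xLo + w * static_cast<double>(ix);
        const double binHi = binLo + w;
        const double overlap = std::min(b, binHi) - std::max(a, binLo);
        if (overlap <= 0.0) continue;  // bin lies outside the range
        total += h.counts[ix * h.ny + iy] * (overlap / w) * volume;
    }
    return total;
}
```

For a flat histogram, restricting x to half of its range halves the result in both modes, which is the sanity check used in the reproducer for this issue.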
After a0fa4fa, this fixes the remaining analytical integral problems with the RooHistPdf/RooHistFunc/RooParamHistFunc family that are reported in issue root-project#7182 and Jira ticket ROOT-7413. To use the same code for all three classes, the logic of RooDataHist::sum() that is already used for RooHistPdf needs to be changed/corrected in the following ways:
After a0fa4fa, this commit fixes the remaining analytical integral problems with the RooHistFunc reported in issue root-project#7182 and Jira ticket ROOT-7413.

The starting point of this commit is the following observation: usage of RooHistPdf is far more common than RooHistFunc and its analytical integration code supports more special cases. So one can simply take the hopefully bug-free analytical integration code from RooHistPdf and substitute it for the existing code in RooHistFunc. However, there is one change that needs to be made when copy-pasting: the `correctForBinSize` in `RooDataHist::sum()` needs to be enabled because integrating the pdf and the function described by the histogram shape is slightly different:

- RooHistPdf: the integral of an empirical PDF is simply the integral of histogram counts divided by a normalization factor
- RooHistFunc/RooParamHistFunc: here, the bin counts need to be multiplied by the bin volume

Code duplication is avoided by moving the integration code into static functions in `RooHistPdf` that `RooHistFunc` can also use.

With all these changes, the problems reported in GitHub issue root-project#7182 and Jira ticket ROOT-7413 are fixed. Furthermore, the bin volume normalization capabilities of `RooDataHist::sum()` are also used in `RooParamHistFunc` now.
A new testRooParamHistFunc was introduced. The analytic integration of a RooParamHistFunc is tested both for trivial and non-trivial parameters, since the integration over subranges was problematic (as reported in issue root-project#7182).
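The idea behind such a test can be sketched without ROOT: integrate a piecewise-constant function with per-bin parameters analytically over a subrange and compare against a numerical integral, once with all parameters equal to 1 and once with non-trivial values. The helper names below (`evalBinned`, `integrateAnalytic`, `integrateNumeric`, `closeTo`) are hypothetical and are not taken from the actual testRooParamHistFunc.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;

// Value of the piecewise-constant function: counts[i] * params[i] in bin i, 0 outside.
double evalBinned(const Vec& edges, const Vec& counts, const Vec& params, double x)
{
    auto it = std::upper_bound(edges.begin(), edges.end(), x);
    if (it == edges.begin() || it == edges.end()) {
        return 0.0;  // x is outside the binned domain
    }
    const std::size_t i = static_cast<std::size_t>(it - edges.begin()) - 1;
    return counts[i] * params[i];
}

// Analytic integral over [lo, hi]: scaled bin content times the overlap length.
double integrateAnalytic(const Vec& edges, const Vec& counts, const Vec& params,
                         double lo, double hi)
{
    assert(edges.size() == counts.size() + 1 && counts.size() == params.size());
    double total = 0.0;
    for (std::size_t i = 0; i < counts.size(); ++i) {
        const double overlap = std::min(hi, edges[i + 1]) - std::max(lo, edges[i]);
        if (overlap > 0.0) {
            total += counts[i] * params[i] * overlap;
        }
    }
    return total;
}

// Reference value from a midpoint rule with nSteps equal steps.
double integrateNumeric(const Vec& edges, const Vec& counts, const Vec& params,
                        double lo, double hi, int nSteps)
{
    const double dx = (hi - lo) / nSteps;
    double total = 0.0;
    for (int k = 0; k < nSteps; ++k) {
        total += evalBinned(edges, counts, params, lo + (k + 0.5) * dx) * dx;
    }
    return total;
}

bool closeTo(double a, double b, double tol)
{
    return std::abs(a - b) < tol;
}
```

Checking a subrange that ends in the middle of a bin, such as [0, 5] for bins of width 2, exercises the partial-bin case that the issue is about.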
The RooParamHistFunc story continues. It was noticed that the range is still not considered in the integral when you clone the integral, as in this example:

    #include "RooDataHist.h"
    #include "RooFitResult.h"
    #include "RooGlobalFunc.h"
    #include "RooParamHistFunc.h"
    #include "RooRealSumPdf.h"
    #include "RooRealVar.h"
    #include "RooUniform.h"
    #include "TF1.h"
    #include "TH1.h"
    #include <iostream>

    void testRooParamHistFunc2() {
       using namespace RooFit;
       RooRealVar x("x","x",0,10);
       x.setRange("R1",0,5);   // subrange used for the integral
       TF1 f1("f1","1");
       TH1D h1("h1","h1",10,0,10);
       h1.FillRandom("f1",50);
       TH1D h2("h2","h2",10,0,10);
       h2.FillRandom("f1",50);
       RooDataHist dh1("dh1","dh1",x,&h1);
       RooDataHist dh2("dh2","dh2",x,&h2);
       RooParamHistFunc ph("ph","",x,dh1);
       RooUniform uni("uni", "uni", RooArgList(x));
       RooRealVar frac("frac", "frac", 0.5, 0.0, 1.0);
       RooRealSumPdf model{"model", "model", ph, uni, frac};
       // fitTo is a RooAbsPdf method, so the fit is done on the model
       auto fitResult = model.fitTo(dh2, PrintLevel(0), Save());
       auto integral = model.createIntegral(x,x,"R1");
       auto integralClone = static_cast<RooAbsReal*>(integral->cloneTree());
       // both values should agree if the clone respects the range
       std::cout << integral->getVal(x) << std::endl;
       std::cout << integralClone->getVal(x) << std::endl;
    }

So this issue can't be closed yet.
The range is not used in RooParamHistFunc::analyticalIntegralWN (see https://root.cern/doc/master/RooParamHistFunc_8cxx_source.html#l00247), as reported in https://root-forum.cern.ch/t/createintegral-giving-wrong-results/43508.
Simple code to reproduce: