Example Notebook #29
Hi @hammannr
Hey @WolfXeHD, thanks a lot for checking! I wrote a sentence about how to use mybinder. The link of the badge will always open the Git HEAD of the repo (which I think makes sense). To open a specific branch, you can specify it in the link. For the mybinder branch, for example, you can just open https://mybinder.org/v2/gh/XENONnT/GOFevaluation/mybinder. Cheers!
Hi @hammannr
I would either turn this code snippet into a function in the notebook or add it to the utils of the package, as it might be useful for users as well. When thinking about it, the Is it expected that the binned GOF tests in the example give a p-value of 0? Should we maybe fix a random seed when generating data? That's it for a quick first look.
Hey @WolfXeHD, thanks a lot for catching this! No, you are absolutely right, we should not expect a p-value of zero for the parent model. I messed up the transposing, reshaping and inverting of matrices. Now I think everything should be correct, and the test produces a reasonable p-value. I set the alpha of the reference samples to 1 and only plot the first 1000 so one can see what's going on in the plot. I'm not so sure about the code snippet. It might be misleading, as it only evaluates the PDF at the center of each bin. This is fine for a fine-grained binning but introduces systematic errors in coarse binnings where the PDF has large gradients. I would prefer that users take care of getting the inputs right themselves, but we can discuss that! The convention for the dunder methods that I tried to follow here is that You mean fixing the random seed of the toy data used to get the empirical distribution of the test statistic under the null hypothesis? I'm afraid that this will give rise to artifacts, and if your statistics are sufficiently large, the p-values should be the same anyway. I want to look into the binomial error of the p-value in the future. Returning this together with the p-value will probably be a better option than just artificially removing the fluctuation, I think. Cheers,
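A minimal sketch of the binomial error on the p-value mentioned above, assuming the p-value is estimated as the fraction of toy datasets whose test statistic is at least as extreme as the observed one. The function name and the toy count of 1000 are hypothetical and not part of GOFevaluation; only the p-value of 0.128 is borrowed from the README output reviewed in this PR.

```python
import math


def p_value_with_binomial_error(n_exceeding, n_toys):
    """Estimate a toy-MC p-value and its binomial standard error.

    Hypothetical helper, not part of GOFevaluation.
    n_exceeding: number of toys with a test statistic at least as
                 extreme as the one observed in data.
    n_toys:      total number of toys drawn under the null hypothesis.
    """
    if n_toys <= 0:
        raise ValueError("n_toys must be positive")
    if not 0 <= n_exceeding <= n_toys:
        raise ValueError("n_exceeding must lie between 0 and n_toys")

    # Point estimate: fraction of toys at least as extreme as the data.
    p_hat = n_exceeding / n_toys
    # Standard error of a binomial proportion: sqrt(p (1 - p) / N).
    sigma = math.sqrt(p_hat * (1.0 - p_hat) / n_toys)
    return p_hat, sigma


# Assumed numbers: 128 of 1000 toys exceed the observed test statistic.
p_value, p_error = p_value_with_binomial_error(128, 1000)
print(f"p-value = {p_value} +/- {p_error}")
```

The error shrinks like 1/sqrt(n_toys), so reporting it alongside the p-value shows how much of the run-to-run fluctuation is expected without fixing a seed. Note that this simple estimate returns an error of zero when no toy or every toy exceeds the observed value, which understates the uncertainty in those edge cases.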
Hi @hammannr
Good!
@@ -84,27 +95,38 @@ gof_object.get_gofs(d_min=d_min)
 # OUTPUT:
 # OrderedDict([('ADTestTwoSampleGOF', 1.6301454042304904),
 # ('KSTestTwoSampleGOF', 0.14),
-# ('PointToPointGOF', 0.00048491049630050576)])
+# ('PointToPointGOF', -0.7324060759792504)])
Why does only PointToPointGOF change?
The definition of the test statistic was changed in #25, but I forgot to adapt the README accordingly.
# PointToPointGOF
# gof = -0.7324060759792504
# p-value = 0.128
New output is great!
@@ -2,5 +2,3 @@ numpy
 scipy
 sklearn
 matplotlib
-flake8
-pytest
It's nice to get rid of these dependencies, as we only need them for CI checks.
This PR adds an example notebook that can be used as a guide for using the package for the first time.
Additionally, a few more small changes:
- Improve the `__str__` method of the `GOFTest` class so that `print(gof_test)` provides a more readable result.
- Add mybinder to the repo. This way one can give the package a try without installing it.
- Include coveralls to monitor code coverage (and improve it in the future).
- Make linting a bit pickier to make sure the code remains readable in the future (and remove my formatting sins from the past).
- Update the readme.
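As a rough illustration of the `__str__` change listed above, here is a minimal sketch. `ExampleGOFTest` and its `gof` and `pvalue` attributes are hypothetical stand-ins, not the package's actual class, and the layout only mimics the name / gof / p-value output quoted in the review comments on this PR.

```python
class ExampleGOFTest:
    """Hypothetical stand-in for a GOF test class (not the real GOFTest)."""

    def __init__(self, gof=None, pvalue=None):
        # Assumed attribute names; the real class may store results differently.
        self.gof = gof
        self.pvalue = pvalue

    def __str__(self):
        # One line for the test name, then one line per result.
        lines = [
            self.__class__.__name__,
            f"gof = {self.gof}",
            f"p-value = {self.pvalue}",
        ]
        return "\n".join(lines)


# Values taken from the README output quoted in this PR.
gof_test = ExampleGOFTest(gof=-0.7324060759792504, pvalue=0.128)
summary = str(gof_test)
print(gof_test)
```

Because `print` calls `__str__`, defining it on the class is enough to make `print(gof_test)` show a readable summary instead of the default object representation.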