Add fisher's exact #373
Conversation
Force-pushed from 1aedcb0 to e6c8e72
* rework data drift metrics
* fix format and imports
* fix notebooks
* add empty check after data clean for drift + some refactoring
* fix imports
* add threshold for DatasetDriftMetric, add tails in DatasetDriftMetric visual
* refactor data drift
* refactor data drift
* add tests for DatasetDriftMetric
* fix checks and titles for drift
* fix style
* update title in ColumnDriftMetric
* implement columns for DatasetDriftMetric and DataDriftTable
* fix data structure and json output for DataDriftTable
* fix data structure and json output for DatasetDriftMetric
* fix after main merge
* fix with black
* add reworked ColumnRegExpMetric
* move ColumnRegExpMetric to a separate module, fix visual, add unittests
* fix table in html view, update an example
* fix ColumnRegExpMetric import in notebooks
* fix notebook imports
* add tabs for ColumnRegExpMetric
* fix after main merge
* fix after main merge
* fix imports with isort
Added some examples of tests and test presets usage. Removed an outdated example with metrics.
* fix warning about duplicated columns in correlation calculation in data drift
* make a new list, do not modify num_feature_names
add fisher's exact, its test and doc
Hey @SangamSwadiK, the test looks very good! It is designed to work with binary data, and I have a corner case which I believe is worth supporting.
Could you please take a look and support this scenario? I also suggest raising a ValueError that will appear if there are more than 2 unique values in the input data, explicitly telling the user that the test should be applied to binary data only. I think it might be more straightforward compared to
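The suggested guard could look like the following minimal sketch (the helper name `check_binary` is hypothetical, not part of the library):

```python
import pandas as pd

def check_binary(series: pd.Series) -> None:
    # Hypothetical guard: Fisher's exact test works on a 2x2 contingency
    # table, so reject inputs with more than two unique values.
    n_unique = series.nunique()
    if n_unique > 2:
        raise ValueError(
            f"Fisher's exact test expects binary data, "
            f"but the input has {n_unique} unique values."
        )

check_binary(pd.Series([0, 1, 1, 0]))  # OK: binary input passes silently
```

A non-binary input such as `pd.Series([0, 1, 2])` would raise the ValueError with an explicit message, as suggested.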
d38be57
to
9e4eea2
Compare
Hi @emeli-dral! Here's a summary of the changes I made.
Please let me know if any changes are needed or if I missed something. Thanks for reviewing!
Hi @SangamSwadiK, thank you so much for the changes you made! I have a question about the corner cases we discussed before.
I see that there are no failures, which is great! But I'm a little surprised that we do not detect drift here.
i.e., if there are multiple zeros in the contingency table (in the format below), e.g.:
I think these cases have to be handled explicitly with some
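For illustration (a hedged sketch, not the PR's code), the corner case with zeros on one diagonal of the contingency table can be fed directly to `scipy.stats.fisher_exact`:

```python
from scipy.stats import fisher_exact

# Reference is all one category, current is all the other: two cells
# of the 2x2 contingency table are zero.
table = [[10, 0],   # reference: 10 of category A, 0 of category B
         [0, 10]]   # current:    0 of category A, 10 of category B
odds_ratio, p_value = fisher_exact(table)
# The split is as extreme as possible, so the p-value is very small
# and drift would be detected at any common significance level.
```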
I think there were some bugs 😅
Force-pushed from 88c4575 to a09e386
Hi @SangamSwadiK, I have a suggestion for how to act on it. Indeed, the test itself will not compute when we have only 1 category in reference and 1 (different) category in current. However, this can happen in practice when you deal with binary features. From the user standpoint, when this happens, it is something you want to know about. If all values differ between reference and current, we can safely assume that the feature has drifted. I would suggest implementing this case separately: instead of raising the ValueError, we can return the "drift detected" outcome when all values are different (without actually computing the statistical test).
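A minimal sketch of that suggestion (the function name and the table construction are assumptions, not the merged implementation):

```python
import pandas as pd
from scipy.stats import fisher_exact

def fisher_drift(reference: pd.Series, current: pd.Series,
                 threshold: float = 0.05) -> bool:
    # Shortcut: if reference and current share no categories at all,
    # report drift without computing the statistical test.
    if set(reference.unique()).isdisjoint(current.unique()):
        return True
    # Otherwise build the 2x2 contingency table and run the test.
    # (Assumes the combined data has exactly two categories.)
    values = pd.concat([reference, current], ignore_index=True)
    groups = ["ref"] * len(reference) + ["cur"] * len(current)
    table = pd.crosstab(values, pd.Series(groups)).values
    _, p_value = fisher_exact(table)
    return p_value < threshold
```

With fully disjoint inputs the function returns True immediately; with identical distributions the test is run and no drift is reported.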
I am also labelling the issues as HF-accepted to acknowledge all the work done!
Hi @emeli-dral!! Apologies for repeatedly tagging you. I think the tests cover the case where we have 1 category in reference and 1 (different) category in current. I thought of handling the different cases explicitly, but since the test covered it, I stopped there. I didn't write them explicitly because the p-values vary a lot with different lengths of reference and current, even when they are flipped entirely. This is because of the permutation in Fisher's test: the p-values are affected by the permutations of the table. Please check the examples below;
I had previously thought of implementing Hamming distance or its complement to cover these cases explicitly, as you just suggested, since the p-values differ based on the length of the input arrays. Thanks for reviewing!!
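The point about p-values depending on input length can be seen directly with `scipy.stats.fisher_exact` (a sketch; these tables are illustrative, not the exact examples from the thread):

```python
from scipy.stats import fisher_exact

# The same fully "flipped" pattern at two sample sizes: the direction
# of the effect is identical, but the p-value shrinks as n grows.
_, p_small = fisher_exact([[5, 0], [0, 5]])    # 10 observations total
_, p_large = fisher_exact([[50, 0], [0, 50]])  # 100 observations total
# p_small is orders of magnitude larger than p_large, even though both
# tables represent a complete flip between reference and current.
```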
Hi @SangamSwadiK, I checked the latest version and I like it a lot! "I didn't write them explicitly because the p-values vary a lot with different lengths of both reference and current even though they are flipped entirely" - I agree with your approach. I think it is better to return the p-value here; it might be useful for something like drift size comparison, when users need to order features by drift size and act on that ordering. Thank you a lot for your contribution! I'm happy to merge 🥳 @elenasamuylova this is a very nice example of how pytests should be implemented.
Issue:
#345 (Add fisher's exact test for binary data)
What does this implement/fix?
This adds Fisher's exact test, its documentation, and the necessary changes in the How-to examples.
cc @mertbozkir