#330 add checks for metrics #358
Conversation
Maybe we need to check that there are no NaNs per row? I think that would be a good start! 😄
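To make the per-row idea concrete, here is a minimal sketch, not the PR's code. The helper name `rows_with_nan` is hypothetical, and it assumes the input can be converted to a float array; 1-D input is treated as one value per row.

```python
import numpy as np
from numpy.typing import ArrayLike


def rows_with_nan(array: ArrayLike) -> np.ndarray:
    """Return the indices of the rows that contain at least one NaN.

    Hypothetical helper for illustration only.
    """
    values = np.asarray(array, dtype=float)
    if values.ndim == 1:
        # Treat a 1-D input as a column: one value per row.
        values = values.reshape(-1, 1)
    # any(axis=1) flags a row as soon as one of its entries is NaN.
    return np.flatnonzero(np.isnan(values).any(axis=1))
```

Reporting the offending row indices, rather than a single boolean over all values, would make the error message more useful when an input is rejected.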
…y over all values
Awesome @LacombeLouis!! Thanks for the idea.
Codecov Report: All modified lines are covered by tests ✅
Thank you for the PR! Great job :)
Thank you for the PR! It's great!
Thinking out loud here: do you think it would be smart to add a check for the metrics where we expect a score between 0 and 1? @JumpingDino @vincentblot28 @thibaultcordier
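As a rough illustration of what such a range check could look like (the name `check_array_in_unit_interval` and the error message are assumptions, not part of this PR):

```python
import numpy as np
from numpy.typing import ArrayLike


def check_array_in_unit_interval(array: ArrayLike) -> None:
    """Raise a ValueError if any value lies outside [0, 1].

    Hypothetical helper for illustration only.
    """
    values = np.asarray(array, dtype=float)
    # Comparisons with NaN are False, so NaN entries are rejected as well.
    inside = (values >= 0) & (values <= 1)
    if not inside.all():
        raise ValueError("Expected all values to lie in [0, 1].")
```

Testing for "inside the interval" and negating it, rather than testing for "outside", means a NaN cannot slip through the check.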
Excellent PR, thank you for your contribution! A few suggestions to refine your code:
- Replace all `NDArray` with `ArrayLike`.
- Delete some line breaks.
You can commit these suggestions directly on GitHub if you wish.
Hi people, with the current code we cover all the steps; however, when I change the
Hi @JumpingDino,
- Could someone explain the major differences between `ArrayLike` and `NDArray` as an argument type, please? Maybe you can give some references and/or ideas about the benefits of using the `ArrayLike` type.
`ArrayLike`: https://numpy.org/devdocs/reference/typing.html#numpy.typing.ArrayLike
- A Union representing objects that can be coerced into an ndarray.
`NDArray`: https://numpy.org/devdocs/reference/typing.html#numpy.typing.NDArray
- Can be used during runtime for typing arrays with a given dtype and unspecified shape.
`ArrayLike` is more generic than `NDArray`, except that an `ArrayLike` must be cast to an `NDArray` before it can be used. In the end, we use the parameters as NumPy arrays. Sometimes you may have to manipulate pandas-type arrays, which is why we want to take all types of array into account.
Looking at the error you encountered, I think we can assume that the parameters are already of type `NDArray` (if this is not the case, we can simply cast them with numpy and use the `check_array` function to check).
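To illustrate the distinction, here is a small sketch of the "accept `ArrayLike`, work on `NDArray`" pattern. The function names `to_ndarray` and `mean_abs_error` are made up for this example and are not taken from the project.

```python
import numpy as np
from numpy.typing import ArrayLike, NDArray


def to_ndarray(y: ArrayLike) -> NDArray:
    """Coerce an array-like input (list, tuple, ndarray, ...) to an ndarray."""
    return np.asarray(y)


def mean_abs_error(y_true: ArrayLike, y_pred: ArrayLike) -> float:
    """Toy metric: generic types at the boundary, ndarrays internally."""
    y_true_arr = to_ndarray(y_true)
    y_pred_arr = to_ndarray(y_pred)
    # From here on, array arithmetic is safe because both are ndarrays.
    return float(np.mean(np.abs(y_true_arr - y_pred_arr)))
```

With this pattern a caller can pass plain lists or NumPy arrays, and `np.asarray` also accepts a pandas Series, while the body of the metric only ever deals with `NDArray`.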
I've suggested a modification. You can add it directly in GitHub (you have a function that lets you commit suggestions directly in the browser).
- Do you think it makes sense to implement these sanity checks (NaN, inf, length) in a single function? That way I would expect less verbosity in our `metrics.py` functions. What do you think?
Indeed, it could make sense to implement them in a single function. But as it stands, your contribution fulfils the desired objective. You are free to make this change if you wish.
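For reference, a single entry point could look roughly like the sketch below. The name `check_metric_inputs` and the error messages are assumptions for illustration; the inputs are assumed to be numeric and at least 1-D.

```python
import numpy as np
from numpy.typing import ArrayLike


def check_metric_inputs(*arrays: ArrayLike) -> None:
    """Hypothetical combined check: same length, no NaN, no inf."""
    converted = [np.asarray(a, dtype=float) for a in arrays]
    # Compare first-axis lengths across all inputs.
    lengths = {len(a) for a in converted}
    if len(lengths) > 1:
        raise ValueError(f"Arrays have different lengths: {sorted(lengths)}")
    for a in converted:
        if np.isnan(a).any():
            raise ValueError("Array contains NaN values.")
        if np.isinf(a).any():
            raise ValueError("Array contains infinite values.")
```

Each metric would then need only one call, for example `check_metric_inputs(y_true, y_pred)`, at the cost of less specific control over which checks run for which metric.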
fixing typos Co-authored-by: Thibault Cordier <124613154+thibaultcordier@users.noreply.github.com>
fixing docstring from check_array_nan Co-authored-by: Thibault Cordier <124613154+thibaultcordier@users.noreply.github.com>
fixing docstring from check_array_inf Co-authored-by: Thibault Cordier <124613154+thibaultcordier@users.noreply.github.com>
fixing docstring from check_arrays_length Co-authored-by: Thibault Cordier <124613154+thibaultcordier@users.noreply.github.com>
I think we're good to go, right? What do you think? I'm available for any refinements, and thanks for the knowledge :)
I agree with you, we could finish this PR and open a new issue for more advanced checks.
Yes, just one more thing: can you add a new line in HISTORY.rst specifying that you are adding new checks, and a new line in CONTRIBUTING.rst adding your name (if you wish)? I'll approve your PR next :)
Description
Sanity checks are useful for metric calculation. It's a good idea to ensure that the vectors (y_pred, y_probs) are the same size and don't contain NaNs or infs. For this, two functions were created:
Fixes #330
Type of change
Create some checks in `utils.py`; these functions are used by the metrics in `metrics.py`.
How Has This Been Tested?
Creation of tests in `test_utils.py`.
Checklist
make lint
make type-check
make tests
make coverage
make doc