DOC: Describe signal.find_peaks and related functions behavior for NaNs #8705
@larsoner These are only small changes to the docstring and I think it would be nice to have this included in 1.1.0. |
The second commit addresses the behavior of the functions if NaNs are present in the signal. This came up in #8708. It might be worth discussing whether this needs to be addressed further. It might be worth it to raise a warning message if any NaNs are encountered (a candidate for 1.2?). |
I don't think we need to warn so long as it's documented clearly in the
docstring. Maybe put it first in the notes, and somewhere in an example?
… |
Yeah, I was debating that with myself as well. I'll change it. For the future, is there a general rule of thumb when to use a warning or a simple note? |
+1 for I don't know of any rule of thumb. You don't want to warn too much or too little. To me this would fall into the "too much" category but this is just my subjective judgment based on expected use cases. |
I decided not to use the I'll have a look and try to come up with an elegant way to deal with NaNs. Otherwise I'll add warning messages if NaNs are found in the input. In the meantime I'm hesitant to introduce any in-code changes here as I don't want to block this from being potentially merged before 1.1 branches. So that would be another PR. |
I am arguing against such a warning so long as the behavior of NaN is properly documented. Are you saying we should have a warning if NaN is in the input? |
... and actually the tests should probably be updated to test that |
Yes, unless I think of another way to deal with NaNs without completely rewriting the algorithms.
I agree although I don't see it as a priority because in most cases the documentation states to "expect the unexpected" if NaNs are contained in the input. I hope nobody tries to use that as a feature. 😉 |
Since the comparisons always evaluate to False, isn't |
Not really. If we compare a finite number to nan, it is "not smaller", "not larger" and "not equal" all at the same time. That means that, depending on the comparator chosen at different points in the algorithm, nan is treated as +inf or -inf. Replacing it with either one of those values would change the behavior. In some cases this might actually be useful, but in your proposed case we don't gain anything. If we replace with +inf, we find new peaks at those positions. If we never return those inf-peaks, we don't gain anything compared to just leaving them nan, because the neighbouring samples still can't be peaks. (Or did I misunderstand your suggested solution?) |
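To make the comparison argument above concrete, here is a minimal sketch in plain Python. The function `naive_local_maxima` is a hypothetical illustration of a strict "larger than both neighbours" test; it is not the algorithm used by `scipy.signal.find_peaks`, and it only assumes the IEEE 754 comparison rules for NaN.

```python
nan = float("nan")

# Every ordered or equality comparison against NaN evaluates to False.
comparisons = [1.0 < nan, 1.0 > nan, 1.0 == nan, nan < 1.0, nan > 1.0]
all_false = not any(comparisons)


def naive_local_maxima(x):
    """Return indices i with x[i-1] < x[i] > x[i+1] (strict local maxima).

    Hypothetical illustration only; NOT scipy.signal.find_peaks.
    """
    return [i for i in range(1, len(x) - 1) if x[i - 1] < x[i] > x[i + 1]]


clean = [0.0, 2.0, 0.0, 3.0, 0.0]
with_nan = [0.0, 2.0, nan, 3.0, 0.0]

peaks_clean = naive_local_maxima(clean)   # both 2.0 and 3.0 qualify
peaks_nan = naive_local_maxima(with_nan)  # the NaN blocks both neighbours
```

In this sketch the NaN sample is not a peak itself, and the samples on either side of it cannot be peaks either, because each of them fails one of the two strict comparisons.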
I'm not proposing a solution -- I'm trying to describe the behavior of the current algorithm for the user when their input has NaNs. If it can be distilled to something equivalent that is simple to explain to the user then we can just clearly document this behavior rather than emitting a warning. So something like "Locations with the value NaN will be treated like It might not be possible to distill the other functions like |
Well in case you didn't see it, I think the documentation for find_peaks already states almost what you want:
|
Correct, those are more complicated to judge. That's why my warning in those cases was a bit more general. |
Ahh okay. In that case, can you make a PR to add warnings for those other functions? I agree it makes sense to have them. It would be good to have for 1.1.0 if possible. Adding them, then using |
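For reference, a warning of the kind discussed here could look roughly like the sketch below. The helper name `warn_if_nan`, the message text and the choice of `RuntimeWarning` are all assumptions made for illustration; nothing in this thread fixes them, and the sketch uses only the standard library rather than NumPy to stay self-contained.

```python
import math
import warnings


def warn_if_nan(x, func_name):
    """Emit a RuntimeWarning if the sequence x contains NaN.

    Hypothetical helper; returns True if a warning was emitted.
    """
    if any(math.isnan(v) for v in x):
        warnings.warn(
            f"{func_name}: input contains NaN; results may be unexpected",
            RuntimeWarning,
            stacklevel=2,  # point the warning at the caller, not this helper
        )
        return True
    return False


# Usage: record warnings so that they can be inspected afterwards.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    flagged_clean = warn_if_nan([0.0, 1.0, 0.0], "peak_widths")
    flagged_nan = warn_if_nan([0.0, float("nan"), 0.0], "peak_widths")
```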
I'll try. But if today is the deadline I have a feeling there won't be enough time left to properly review & iterate... |
@larsoner On another note... do you consider this ready to be merged today? |
I can rapidly iterate today if you can :) But more seriously 1) I don't think it should be too difficult or require too much iteration since it's warning emission, and 2) I'd also consider it a form of bug fix, which means that tomorrow isn't a hard deadline (can backport). |
The updated docstring says
Is this the API that you designed, or is it an accident of the implementation (i.e. the code was not written with NaNs in mind, but if there are NaNs in the data, that is how it happens to be treated)? If the latter, it might be safer to simply state in the docstring that "the behavior of the function with data containing NaNs is undefined". That way there is no commitment to an API. Then if explicit handling of NaN is added later, there is no need to worry about backwards compatibility. |
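If the behavior with NaNs is documented as undefined, one option on the caller's side is to make sure no NaN ever reaches the function. The sketch below splits a signal into maximal NaN-free runs so that each run can be analysed on its own. The helper `finite_runs` is hypothetical and not part of SciPy, and treating the runs independently is an assumption that may not suit every signal.

```python
import math


def finite_runs(x):
    """Yield (start_index, values) for each maximal NaN-free run in x.

    Hypothetical caller-side helper; not part of scipy.signal.
    """
    start = None
    for i, v in enumerate(x):
        if math.isnan(v):
            if start is not None:
                yield start, x[start:i]  # close the run that just ended
                start = None
        elif start is None:
            start = i  # a new NaN-free run begins here
    if start is not None:
        yield start, x[start:]  # the final run extends to the end of x


nan = float("nan")
signal = [1.0, 2.0, nan, 3.0, nan, nan, 4.0, 5.0]
runs = list(finite_runs(signal))
```

Each start index can be added back to any peak index found within a run to recover the position in the original signal.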
This makes sense to me. I was probably over-engineering a bit before. I think we can just say this in all the functions for now. @WarrenWeckesser any opinion on |
Yes, I didn't have NaNs in mind when writing those functions, so it's just behavior resulting from the implementation.
I agree, here. |
No directive is probably fine, but that's not a strong opinion. |
To avoid having to wait for CI to build the doc, here are the images with |
Looks good to me |
The behavior of these functions is undefined for data containing NaN. This should at least be clearly documented with a warning in the docstrings. In the future support for NaNs might be added.
as well as a typo I noticed in THANKS.txt.