BUG: raise error trying to coerce object arrays containing timedelta64('NaT') to StringDType #26024
You could even use a Python set in principle. But then you have to build it on the first call or during module init. For best speed a bloom filter :). But no, I don't think we have a super short and specific pattern here.
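To make the set idea concrete, here is a minimal sketch in Python rather than the C used in NumPy. The names `KNOWN_SCALAR_TYPES`, `is_known_scalar_type_chain`, and `is_known_scalar_type_set` are hypothetical, and the list of types is an illustrative assumption, not the list the real function checks.

```python
# Hypothetical stand-in for the types the real C function compares against.
KNOWN_SCALAR_TYPES = frozenset({int, float, complex, bool, str, bytes})


def is_known_scalar_type_chain(tp):
    """Series of identity checks, mirroring a chain of if statements."""
    if tp is int:
        return True
    if tp is float:
        return True
    if tp is complex:
        return True
    if tp is bool:
        return True
    if tp is str:
        return True
    if tp is bytes:
        return True
    return False


def is_known_scalar_type_set(tp):
    """Same answer via one hash lookup in a set built once at import time."""
    return tp in KNOWN_SCALAR_TYPES
```

Both versions match exact types only, so a subclass of `int` is not treated as known. The set has to exist before the first lookup, which is the "first call or module init" cost mentioned above.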
No real suggestion, but the top of the function seems a bit duplicated with numpy/numpy/_core/src/common/get_attr_string.h (lines 7 to 37 in dda030f).
Given short-circuiting, at least writing it as a series of
p.s. Really getting off-topic, but also somewhat duplicated with numpy/numpy/_core/src/multiarray/dtypemeta.c (lines 830 to 860 in dda030f).
Good point,
Good call about all these similarities. The stringdtype code predates adding this to numpy. There was definitely some copy/pasting that can be reverted now that the code lives in numpy.
The code changes themselves look good! (And maybe best to do any refactoring elsewhere!)
Maybe... but I went ahead and did it before I saw this comment 🙃. Was simple enough though...
That `python_builtins_are_known_scalar_types` name is not brilliant (`is_python_version_of_scalar_type` might be better), but that's really out of scope here... So, approving...
No idea why the sanitizer job is filling its log with
Yeah, that's #25875 which seems to have gotten particularly annoying today for some reason. I disabled that action and it shouldn't show up in future CI logs until we figure out what's wrong and re-enable it.
Indeed, seems good to disable, though hopefully the alternatives work: it caught a few bugs in my FFT gufunc implementations!
Thanks, let's put this in. I just remembered we actually do have a faster version for finding the builtin scalar types, I think (it sorts them and does a binary search for finding them when you do ...). But let's fix the bug for now.
Without the change to `stringdtype_is_known_scalar_type`, the second iteration of the for loop in the test would fail. Datetime already fails, so this just makes timedelta consistent with that.
I discovered this running the pandas test suite against my pandas string dtype built on StringDType.
@seberg is there maybe a better way to write `is_known_scalar_type` than using a bunch of if statements?