Improve error message for unsupported datetime dtypes in Pandas extra #3518
Comments
I'm currently working on fixing this issue and wanted to ask for some guidance. I ran the example in the Stack Overflow question and got the following error:
I noticed that a deprecation message is emitted from "extra/pandas/impl.py", specifically in a function called elements_and_dtype. Would it be appropriate to place a deprecation notice specific to dtype=pd.datetime there? The function is at hypothesis/hypothesis-python/src/hypothesis/extra/pandas/impl.py, lines 60 to 99 in 173de52.
Additionally, the error is actually raised from "extra/numpy.py", in code that appears to check the numpy dtype.kind of the given dtype. Would it be appropriate to place an error message specific to dtype=pd.datetime there? My thinking is that the deprecation notice mentioned above should be enough for users to deduce that they may have passed a deprecated datetime object as the dtype, but in the issue description you said that:
Does this mean the suggestion should go in the raised error message? There is one more piece I want to explore, the try_convert function in hypothesis.internal.validation, but I wanted to ask for guidance first in case I'm overcomplicating things (as I tend to do). Thanks!
In this StackOverflow question, the asker notes that column('timestamp', dtype=pd.datetime) doesn't actually create a datetime column in the generated dataframe. This turns out to be a fairly general issue: passing dtype=datetime.datetime will actually create an object column too, and even passing dtype="datetime64" is problematic as Hypothesis will choose various time-resolution units. I think we should therefore:
- Suggest dtype="datetime64[ns]" as part of the error message
- Emit note_deprecation(...) if we can detect this and it wasn't previously an error
- Emit note_deprecation(...) if the time-resolution unit is not ns - others are valid in Numpy but not Pandas
- Handle pd.Timedelta and related dtype arguments in the same way (see docs here)
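The cases described in this issue can be told apart with plain NumPy. The following is a minimal sketch of that detection, not the Hypothesis implementation: the helper name time_dtype_hint and its message wording are hypothetical, and it only returns a hint string rather than calling note_deprecation(...) or raising.

```python
import datetime

import numpy as np


def time_dtype_hint(dtype):
    """Hypothetical helper: return a hint string when `dtype` is unlikely to
    produce a nanosecond datetime/timedelta column, otherwise None."""
    # Python date/time classes (pd.datetime was an alias of datetime.datetime)
    # are not NumPy time dtypes; np.dtype() resolves them to the object dtype.
    if isinstance(dtype, type) and issubclass(
        dtype, (datetime.date, datetime.timedelta)
    ):
        return (
            f"{dtype.__name__} resolves to an object dtype; "
            'try "datetime64[ns]" or "timedelta64[ns]"'
        )

    resolved = np.dtype(dtype)
    # kind "M" is datetime64 and kind "m" is timedelta64.
    if resolved.kind in ("M", "m"):
        # A unit-less "datetime64" reports the unit "generic".
        unit, _ = np.datetime_data(resolved)
        if unit != "ns":
            base = "datetime64" if resolved.kind == "M" else "timedelta64"
            return f'time unit is {unit!r}, not "ns"; try "{base}[ns]"'
    return None
```

With this sketch, time_dtype_hint("datetime64[ns]") returns None, while datetime.datetime, "datetime64" and "timedelta64[s]" each produce a hint that names the nanosecond dtype to use instead.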