Support mean() for Timestamps. #21583

Closed
sorenwacker opened this issue Jun 22, 2018 · 8 comments
Labels
Closing Candidate (May be closeable, needs more eyeballs); Datetime (Datetime data dtype); Dtype Conversions (Unexpected or buggy dtype conversions); Enhancement; Numeric Operations (Arithmetic, Comparison, and Logical operations); Reduction Operations (sum, mean, min, max, etc.)

Comments

@sorenwacker

Fitting a linear model to Timestamps currently does not work out of the box, and pandas no longer supports the ols function. Using, e.g., seaborn to make a fit raises an error:

TypeError: reduction operation 'mean' not allowed for this dtype

I wonder whether the dtype can be changed so that mean is supported, and whether this would make it compatible with other packages such as scikit-learn, seaborn, etc. The mean of some dates could also be useful for other use cases. It does make sense to calculate the mean of some dates, doesn't it?
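
As a rough illustration of the request, the sketch below averages a few dates. It assumes a recent pandas release in which `Series.mean()` accepts datetime64 data; the sample dates and variable names are made up for illustration. The second computation averages the offsets from the earliest date, which relies only on Timedelta arithmetic and may help on versions that still raise the TypeError above.

```python
import pandas as pd

# Hypothetical sample data, not taken from the issue.
dates = pd.Series(pd.to_datetime(["2018-06-20", "2018-06-22", "2018-06-24"]))

# Direct reduction; assumes a pandas version where mean() supports datetime64.
mean_date = dates.mean()

# Workaround sketch: average the offsets from a reference date, then shift back.
origin = dates.min()
mean_via_offsets = origin + (dates - origin).mean()

print(mean_date, mean_via_offsets)
```

Both values should land on the middle date here, since the three sample dates are evenly spaced.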

@sorenwacker
Author

Note that the question of how to fit a model using datetimes/timestamps comes up frequently on Stack Overflow in different variations.
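
One common way to fit a line against timestamps is to convert them to a plain numeric axis first. The sketch below is only an illustration under that assumption: it uses `numpy.polyfit` rather than the removed pandas ols function or seaborn, and the data, the choice of "elapsed days" as the unit, and all variable names are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical daily observations; y follows 2 * day + 1 exactly.
timestamps = pd.Series(pd.date_range("2018-06-01", periods=5, freq="D"))
y = np.array([1.0, 3.0, 5.0, 7.0, 9.0])

# Express each timestamp as elapsed days since the first one (plain floats).
start = timestamps.iloc[0]
elapsed_days = ((timestamps - start) / pd.Timedelta(days=1)).to_numpy()

# Ordinary least-squares line through (elapsed_days, y).
slope, intercept = np.polyfit(elapsed_days, y, deg=1)

# Predict for a new timestamp by applying the same conversion.
new_x = (pd.Timestamp("2018-06-11") - start) / pd.Timedelta(days=1)
prediction = slope * new_x + intercept
```

Dividing by a `Timedelta` keeps the unit explicit, so the fitted slope reads as "change in y per day" rather than per nanosecond.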

@gfyoung added the Enhancement, Datetime, and Dtype Conversions labels Jun 22, 2018
@gfyoung
Member

gfyoung commented Jun 22, 2018

cc @jreback @jorisvandenbossche

@jreback
Contributor

jreback commented Jun 22, 2018

I believe we have an issue for this. Can you check whether this is a duplicate?

@gfyoung
Member

gfyoung commented Jun 22, 2018

@jreback : It's similar to #17382, but that's for Timedelta instead of Timestamp.

@mroeschke
Member

Some thoughts:

  1. Would this change describe()? Currently we provide top, freq, first, last (categorical summary) for Timestamp data.

  2. What about other ops? (std, median, quantile, etc.) For example std of Timestamp data should probably return a Timedelta but how about skew and kurtosis (or do they not make sense)?
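
To make the return types in point 2 concrete, here is a small sketch. It assumes a recent pandas release where these reductions are available for datetime64 data (older releases may raise a TypeError for some of them); the sample dates are hypothetical, and skew and kurtosis are left out on purpose since their meaning is the open question.

```python
import pandas as pd

# Hypothetical sample data.
dates = pd.Series(pd.to_datetime(["2018-06-20", "2018-06-22", "2018-06-24"]))

middle = dates.median()       # a point in time, so a Timestamp
spread = dates.std()          # a duration, so a Timedelta (assumed supported)
lower = dates.quantile(0.25)  # an interpolated point in time, so a Timestamp

print(type(middle).__name__, type(spread).__name__, type(lower).__name__)
```

Location-style statistics (median, quantile) stay on the time axis, while spread-style statistics (std) are naturally expressed as durations.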

@sorenwacker
Copy link
Author

I agree: median, std, quantiles, etc. would be very useful too. Also, subtraction of dates would be nice. Why are they called first and last when they could be min and max as well?
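
On the subtraction point, a short sketch of how date differences behave in pandas as far as I know: subtracting two Timestamps gives a Timedelta, and `min()` and `max()` are available as reductions on datetime data. The dates and variable names are hypothetical.

```python
import pandas as pd

start = pd.Timestamp("2018-06-20")
end = pd.Timestamp("2018-06-24")

# Timestamp minus Timestamp yields a Timedelta.
span = end - start

# Hypothetical sample data.
dates = pd.Series(pd.to_datetime(["2018-06-20", "2018-06-22", "2018-06-24"]))

# Element-wise differences between consecutive dates; the first entry is NaT.
gaps = dates.diff()

# min() and max() work on datetime data, so the overall range is a Timedelta.
total_range = dates.max() - dates.min()
```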

@jbrockmendel added this to Reductions in DatetimeArray Refactor Nov 16, 2018
@jbrockmendel self-assigned this Oct 17, 2019
@jbrockmendel added the Numeric Operations and Reduction Operations labels Sep 21, 2020
@jbrockmendel
Member

Is there anything left to do here?

@jbrockmendel added the Closing Candidate label Nov 16, 2021
@mroeschke
Member

Yes, this appears to work now, so we can close.

5 participants