convert datetime64 from/to vectorized dates #4824

Closed
kasal opened this issue Jun 24, 2014 · 4 comments

Comments

@kasal

kasal commented Jun 24, 2014

In our applications, we often represent datetime as a np.array with dtype
[('YY', '<i4'), ('MM', '<i4'), ('DD', '<i4'), ('hh', '<i4'), ('mm', '<i4'), ('ss', '<f8')]

Can this be converted to datetime64 using vectorized methods?

Of course, converting to strings and reparsing is not an option; it would be awfully slow.

@sebix

sebix commented Sep 6, 2014

This page is for bug reports and feature requests for NumPy; consider posting this question on Stack Overflow.

@kasal
Author

kasal commented Sep 7, 2014

This was indeed a bug report, about both the documentation and the methods available for datetime64.

The documentation focuses on initializing datetime64 from a list of strings. That is irrelevant for huge time series.

  1. I believe that using numpy to work with large arrays is a fairly common use case; if I'm right, could you please add some hints to the docs?

  2. It seems that the situation described above needs some tricks with timedelta64 to get it right.
    I believe that importing a time series this way must be fairly common; it would be very nice if you could consider supporting this type of import directly.

Thank you very much for considering these bug / enhancement reports.

@sebix

sebix commented Sep 7, 2014

Sorry to be pedantic, but this is now a bug/"documentation request", the first text was not.

ad 1) I fully agree that the docs lack usage examples for conversions from other formats, such as the one you present.
ad 2) My approach would be to convert every field except the first (the year) to a timedelta64, which is straightforward with .astype(). The second step is to sum the columns; I wouldn't put much effort into optimizing that step, just use a loop or similar.

import numpy as np
olddtype = np.dtype([('YY', '<i4'), ('MM', '<i4'), ('DD', '<i4'), ('hh', '<i4'), ('mm', '<i4'), ('ss', '<f8')])
d = np.array([(2014, 12, 24, 12, 0, 45),
              (2012, 2, 14, 23, 29, 52),
              (2013, 10, 4, 8, 49, 21)], dtype=olddtype)
# datetime64[Y] counts years from 1970; months and days are 1-based, so shift them to offsets.
years = (d['YY'] - 1970).astype('datetime64[Y]')
months = (d['MM'] - 1).astype('timedelta64[M]')
days = (d['DD'] - 1).astype('timedelta64[D]')
# Each addition promotes the result to the finer unit; the float seconds are truncated to whole seconds.
d2 = (years + months + days + d['hh'].astype('timedelta64[h]')
      + d['mm'].astype('timedelta64[m]') + d['ss'].astype('timedelta64[s]'))
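
Since the 'ss' field is a float and the issue title asks for conversion in both directions, here is a rough sketch that keeps fractional seconds and also converts back. The function names fields_to_datetime64 and datetime64_to_fields, the FIELDS constant, and the choice of millisecond resolution are my own assumptions for illustration, not anything NumPy provides.

```python
import numpy as np

FIELDS = np.dtype([('YY', '<i4'), ('MM', '<i4'), ('DD', '<i4'),
                   ('hh', '<i4'), ('mm', '<i4'), ('ss', '<f8')])

def fields_to_datetime64(rec):
    """Hypothetical helper: structured date fields -> datetime64[ms]."""
    # Calendar part: years since 1970, then zero-based month and day offsets.
    day = ((rec['YY'] - 1970).astype('datetime64[Y]')
           + (rec['MM'] - 1).astype('timedelta64[M]')
           + (rec['DD'] - 1).astype('timedelta64[D]'))
    # Time of day in whole milliseconds, so fractional seconds survive.
    ms = (rec['hh'].astype('i8') * 3600000
          + rec['mm'].astype('i8') * 60000
          + np.round(rec['ss'] * 1000).astype('i8'))
    return day.astype('datetime64[ms]') + ms.astype('timedelta64[ms]')

def datetime64_to_fields(t):
    """Hypothetical inverse: datetime64 array -> structured date fields."""
    t = t.astype('datetime64[ms]')
    # Truncate to coarser units; differences between them give the fields.
    year = t.astype('datetime64[Y]')
    month = t.astype('datetime64[M]')
    day = t.astype('datetime64[D]')
    out = np.empty(t.shape, dtype=FIELDS)
    out['YY'] = year.astype('i8') + 1970
    out['MM'] = (month - year).astype('i8') + 1
    out['DD'] = (day - month).astype('i8') + 1
    ms = (t - day).astype('i8')  # milliseconds since midnight
    out['hh'] = ms // 3600000
    out['mm'] = ms // 60000 % 60
    out['ss'] = (ms % 60000) / 1000.0
    return out
```

Both functions stay vectorized, with no string parsing and no Python-level loop over rows. If sub-millisecond precision matters, the same pattern should work with a finer unit such as 'us' or 'ns', at the cost of a narrower representable date range.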

@kasal
Author

kasal commented Sep 7, 2014

> this is now a bug/"documentation request", the first text was not

I agree. I deliberately prefer to submit a rough bug/doc report quickly; I would forget it if I put it off.
This one was indeed far from clear, sorry for that.

Thank you for your explanation.
