to_datetime should support ISO week year #16607

Closed
buyology opened this issue Jun 5, 2017 · 15 comments · Fixed by #25541
Labels
Compat pandas objects compatability with Numpy or Python functions Timeseries
Comments

@buyology

buyology commented Jun 5, 2017

to_datetime does not currently seem to support ISO week year like strptime does:

In [38]: datetime.date(2016, 1, 1).strftime('%G-%V')
Out[38]: '2015-53'

In [39]: datetime.datetime.strptime(datetime.date(2016, 1, 1).strftime('%G-%V')+'-1', '%G-%V-%u')
Out[39]: datetime.datetime(2015, 12, 28, 0, 0)

In [41]: pd.to_datetime(datetime.date(2016, 1, 1).strftime('%G-%V')+'-1', format='%G-%V-%u')
        ---------------------------------------------------------------------------
        TypeError                                 Traceback (most recent call last)
        /Users/Robin/.pyenv/versions/3.6.1/lib/python3.6/site-packages/pandas/core/tools/datetimes.py in _convert_listlike(arg, box, format, name, tz)
            443             try:
        --> 444                 values, tz = tslib.datetime_to_datetime64(arg)
            445                 return DatetimeIndex._simple_new(values, name=name, tz=tz)

        pandas/_libs/tslib.pyx in pandas._libs.tslib.datetime_to_datetime64 (pandas/_libs/tslib.c:33275)()

        TypeError: Unrecognized value type: <class 'str'>

        During handling of the above exception, another exception occurred:

        ValueError                                Traceback (most recent call last)
        <ipython-input-41-7ce30c959690> in <module>()
        ----> 1 pd.to_datetime(datetime.date(2016, 1, 1).strftime('%G-%V')+'-1', format='%G-%V-%u')

        /Users/Robin/.pyenv/versions/3.6.1/lib/python3.6/site-packages/pandas/core/tools/datetimes.py in to_datetime(arg, errors, dayfirst, yearfirst, utc, box, format, exact, unit, infer_datetime_format, origin)
            516         result = _convert_listlike(arg, box, format)
            517     else:
        --> 518         result = _convert_listlike(np.array([arg]), box, format)[0]
            519 
            520     return result

        /Users/Robin/.pyenv/versions/3.6.1/lib/python3.6/site-packages/pandas/core/tools/datetimes.py in _convert_listlike(arg, box, format, name, tz)
            445                 return DatetimeIndex._simple_new(values, name=name, tz=tz)
            446             except (ValueError, TypeError):
        --> 447                 raise e
            448 
            449     if arg is None:

        /Users/Robin/.pyenv/versions/3.6.1/lib/python3.6/site-packages/pandas/core/tools/datetimes.py in _convert_listlike(arg, box, format, name, tz)
            412                     try:
            413                         result = tslib.array_strptime(arg, format, exact=exact,
        --> 414                                                       errors=errors)
            415                     except tslib.OutOfBoundsDatetime:
            416                         if errors == 'raise':

        pandas/_libs/tslib.pyx in pandas._libs.tslib.array_strptime (pandas/_libs/tslib.c:63124)()

        pandas/_libs/tslib.pyx in pandas._libs.tslib.array_strptime (pandas/_libs/tslib.c:63003)()

        ValueError: 'G' is a bad directive in format '%G-%V-%u'
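Until to_datetime accepts these directives, one possible workaround is to parse with the standard library first. This is a minimal sketch, assuming Python 3.6+ (where strptime understands %G, %V and %u); the helper name parse_iso_weeks and its weekday parameter are made up for illustration and are not part of pandas.

```python
import datetime


def parse_iso_weeks(values, weekday=1):
    """Parse 'ISOyear-ISOweek' strings (e.g. '2015-53') into datetimes.

    Hypothetical helper: `weekday` is the ISO weekday to pin each week to
    (1 = Monday ... 7 = Sunday), because %G and %V alone do not name a day.
    """
    fmt = "%G-%V-%u"
    return [
        datetime.datetime.strptime(f"{value}-{weekday}", fmt)
        for value in values
    ]


# Week 53 of ISO year 2015 starts in December 2015,
# week 1 of ISO year 2016 starts in January 2016.
week_starts = parse_iso_weeks(["2015-53", "2016-01"])
print(week_starts)
```

The resulting list of datetime objects can then be handed to pd.to_datetime without a format string, since it no longer needs to interpret the ISO directives itself.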

INSTALLED VERSIONS
------------------
commit: None

pandas: 0.20.1
pytest: 3.1.0
pip: 9.0.1
setuptools: 28.8.0
Cython: 0.25.2
numpy: 1.12.1
scipy: 0.19.0
xarray: None
IPython: 6.0.0
sphinx: None
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: None
tables: 3.4.2
numexpr: 2.6.2
feather: None
matplotlib: 2.0.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.999999999
sqlalchemy: 1.1.10
pymysql: None
psycopg2: 2.7.1 (dt dec pq3 ext lo64)
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: None

@jreback
Contributor

jreback commented Jun 5, 2017

sure could be added.

pull-requests are welcome.

jreback added the Compat (pandas objects compatability with Numpy or Python functions), Difficulty Intermediate, and Timeseries labels on Jun 5, 2017
jreback added this to the Next Major Release milestone on Jun 5, 2017
@rosygupta

@jreback can you help me get started on a pull request for this? For example, where should I look to fix it?

@buyology
Author

buyology commented Jun 7, 2017

The relevant part should be in array_strptime in pandas/_libs/tslib.pyx.

A good reference implementation is the datetime module's strptime: https://github.com/python/cpython/blob/6f0eb93183519024cb360162bdd81b9faec97ba6/Lib/_strptime.py#L321
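To give a feel for what the parser has to do once it has read an ISO year, ISO week and ISO weekday, here is a rough sketch of the conversion to a calendar date. It relies on the ISO 8601 rule that January 4 always falls in week 1. The function name iso_to_gregorian is an assumption for illustration; this is not the CPython or pandas code, which work with day-of-year arithmetic instead.

```python
import datetime


def iso_to_gregorian(iso_year, iso_week, iso_weekday):
    """Convert an ISO (year, week, weekday) triple to a datetime.date.

    Illustrative helper, not taken from CPython or pandas.
    iso_weekday follows %u: 1 = Monday ... 7 = Sunday.
    """
    # January 4 is always inside ISO week 1 of its ISO year.
    jan4 = datetime.date(iso_year, 1, 4)
    # Step back to the Monday that starts ISO week 1.
    week1_monday = jan4 - datetime.timedelta(days=jan4.isoweekday() - 1)
    # Move forward by whole weeks, then by days within the week.
    return week1_monday + datetime.timedelta(
        weeks=iso_week - 1, days=iso_weekday - 1
    )


# Same input as the example in the issue: ISO 2015, week 53, Monday.
print(iso_to_gregorian(2015, 53, 1))
```

A quick sanity check for any implementation is to round-trip through date.isocalendar(), which returns the same triple for a given date.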

@rosygupta

@buyology Could I be assigned this task so that I can work on its pull request?

@buyology
Author

buyology commented Jun 7, 2017

@rosygupta feel free to work on this :-)

@rosygupta

@buyology This will be my first attempt to make a PR here. Do you think this would be the right task to take on?

@buyology
Author

buyology commented Jun 7, 2017

@rosygupta it involves some cython, but should be pretty straightforward as long as you get the environment up and running properly. otherwise, if you want something more lightweight, go and look for novice-labeled issues 🚀

@rosygupta

@buyology Seems achievable. Where do I look for the tests for this particular piece to check my code?

@TomAugspurger
Contributor

@rosygupta this is a new feature request, so there aren't existing tests for it. Similar tests to what you would need to add are in https://github.com/pandas-dev/pandas/blob/73930c58e8eac4031608bb8c4bf624d77e1d1dcb/pandas/tests/indexes/datetimes/test_tools.py

@rosygupta

Hey, I'm not sure why not all test classes are executed when I run the tests. Only 6 classes are being executed. Can someone clarify?

@rosygupta

@buyology @TomAugspurger Opened a PR. I need some guidance.

@RjLi13
Contributor

RjLi13 commented Jan 19, 2019

Hey, I'm curious what's happening with this issue. I took a look at the referenced PR @rosygupta made, and it looks like the PR fixes the issue; it just needs to be rebased / fix any merge conflicts now? Any update on getting it merged?

@jreback
Contributor

jreback commented Jan 19, 2019

> just needs to be rebased / fix any merge conflicts now?

you are welcome to do this

@RjLi13
Contributor

RjLi13 commented Jan 20, 2019

Sure I can tackle this.

@RjLi13
Contributor

RjLi13 commented Jan 20, 2019

So the datetime code has been moved out of _libs/tslib.pyx making a rebase hard to do. Is it better to move her changes out and reapply them manually?

jreback modified the milestones: Contributions Welcome, 0.25.0 on Mar 10, 2019