Specify format for strptime in to_datetime #2213

Closed
wesm opened this issue Nov 9, 2012 · 6 comments

@wesm
Member

wesm commented Nov 9, 2012

http://stackoverflow.com/questions/13133458/pandas-casting-iso-string-to-datetime64

@paulproteus

The fix for this would begin by adding a new keyword argument to the to_datetime() function in pandas/tseries/tools.py. I suggest calling the keyword argument "time_format".

You would need to:

  • Modify the docstring on that code to explain the new argument
  • Modify the code so it calls datetime.strptime() appropriately
  • Make sure your new code handles errors in the same way as the function generally does (pay attention to errors='ignore' vs. errors='raise')
  • Write a test case covering this new code.
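
The steps above could be sketched roughly as follows. This is only an illustration using the standard library, not the pandas implementation: the function name to_datetime_sketch is hypothetical, the keyword name time_format follows the suggestion above, and the errors='ignore' behaviour (return the input unchanged) is an assumption about how the real function treats failures.

```python
from datetime import datetime


def to_datetime_sketch(values, time_format=None, errors="ignore"):
    """Hypothetical stand-in for to_datetime(); not the real pandas code.

    time_format: strptime-style format string (name is an assumption).
    errors: 'raise' re-raises parse errors; 'ignore' returns the input as-is.
    """
    if time_format is None:
        # The real function would fall back to its existing parser here.
        return list(values)

    result = []
    for value in values:
        try:
            result.append(datetime.strptime(value, time_format))
        except (ValueError, TypeError):
            if errors == "raise":
                raise
            # Assumed 'ignore' semantics: hand back the unparsed input.
            return list(values)
    return result


print(to_datetime_sketch(["20121109T101500"], time_format="%Y%m%dT%H%M%S"))
```

A test case for the new code would then cover at least three paths: a string that matches the format, a string that does not match with errors='ignore', and the same string with errors='raise'.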

@dundo4he

I wanted to try out the Issue2213 branch, but I got an error when I tried to compile and install the package. The error message was:

error: install-base or install-platbase supplied, but installation scheme is incomplete

What I did was:

python setup.py build_ext
python setup.py install --install-base=/tmp --install-platbase=/tmp

What does this error mean? How do I solve it?

I googled this error message but could not find an answer. I also asked on the IRC channel, but did not get an answer there either.

@wesm
Member Author

wesm commented Dec 26, 2012

I've never used those options before. For development, I would suggest building the C extensions in place and working directly from the base of the git clone:

python setup.py build_ext --inplace
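
When working from an in-place build, it can help to confirm that Python resolves pandas to the git clone rather than to an installed copy. A minimal sketch using only the standard library; the helper name module_origin is hypothetical:

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# Run from the base of the git clone; the path printed should point
# inside the clone if the in-place build is the one being picked up.
print(module_origin("pandas"))
```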

@dundo4he

@wesm Yes, I built the C extensions in place this way, but I did not see the improvement that Emily mentioned when comparing the Issue2213 branch with the master branch.

Here is what I did:

I first wanted to test the performance of the Issue2213 branch.

mkdir pandas_1224
cd pandas_1224
git clone git://github.com/six5532one/pandas.git
cd pandas
git checkout Issue2213
git status
#### Then I added one line (print "hello world") to pandas/tslib.pyx
python setup.py build_ext --inplace
ipython
import pandas
rng = pandas.date_range('1/1/2000', periods=20000, freq='ms')
strings = [x.strftime('%Y%m%dT%H%M%S.%f') for x in rng]
timeit pandas.to_datetime(strings)
#### Lots of "hello world" lines are printed, so this is definitely the Issue2213 branch
quit()

Then, I wanted to see the performance of the master branch.

git commit -m "Issue2213-print" -a
git checkout master
git status
python setup.py build_ext --inplace
ipython
import pandas
rng = pandas.date_range('1/1/2000', periods=20000, freq='ms')
strings = [x.strftime('%Y%m%dT%H%M%S.%f') for x in rng]
timeit pandas.to_datetime(strings)
#### This time no "hello world" is printed, so this is definitely the master branch
quit()

I did not see any improvement. What did I do wrong?

@wesm
Member Author

wesm commented Dec 27, 2012

You did not specify the date format in the first pandas.to_datetime usage. If you don't pass the format string, it falls back on the slower dateutil parser.
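
For illustration, passing the format explicitly might look like the sketch below. It assumes the keyword is named format, as in released pandas versions; the name used on the Issue2213 branch at the time is not stated in this thread. The number of periods is reduced to keep the example quick.

```python
import pandas as pd

# Same layout as the strings generated in the comment above.
FMT = "%Y%m%dT%H%M%S.%f"

rng = pd.date_range("1/1/2000", periods=2000, freq="ms")
strings = [x.strftime(FMT) for x in rng]

# Passing the format tells to_datetime exactly how to parse each string,
# so it does not need to guess the layout.
parsed = pd.to_datetime(strings, format=FMT)

print(parsed[:3])
```

To compare timings in IPython, run timeit on this call and on the call without the format argument, on the same branch.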

@wesm
Member Author

wesm commented Feb 17, 2013

This is done in 015447a
