Add tsfromtxt-like ability to convert multiple text file columns into a single date column #1186

Closed
wesm opened this issue May 2, 2012 · 12 comments

Comments

@wesm
Member

commented May 2, 2012

No description provided.

@wesm

Member Author

commented May 2, 2012

example data from @timmie

ID,date,NominalTime,ActualTime,TDew,TAir,Windspeed,Precip,WindDir
 KORD,1999027, 19:00:00, 18:56:00, 0.8100, 2.8100, 7.2000, 0.0000, 280.0000
 KORD,1999027, 20:00:00, 19:56:00, 0.0100, 2.2100, 7.2000, 0.0000, 260.0000
 KORD,1999027, 21:00:00, 20:56:00, -0.5900, 2.2100, 5.7000, 0.0000, 280.0000
 KORD,1999027, 21:00:00, 21:18:00, -0.9900, 2.0100, 3.6000, 0.0000, 270.0000
 KORD,1999027, 22:00:00, 21:56:00, -0.5900, 1.7100, 5.1000, 0.0000, 290.0000
 KORD,1999027, 23:00:00, 22:56:00, -0.5900, 1.7100, 4.6000, 0.0000, 280.0000
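
For illustration, here is a minimal plain-Python sketch of what combining the date and NominalTime columns of this sample into one timestamp involves. It assumes the seven-digit date field is a year followed by a day of year (1999027 read as day 27 of 1999), which the thread does not state. The helper name combine_date_time is hypothetical and is not part of pandas.

```python
import csv
from datetime import datetime
from io import StringIO

# First two rows of the sample above, including the leading spaces.
SAMPLE = """\
ID,date,NominalTime,ActualTime,TDew,TAir,Windspeed,Precip,WindDir
 KORD,1999027, 19:00:00, 18:56:00, 0.8100, 2.8100, 7.2000, 0.0000, 280.0000
 KORD,1999027, 20:00:00, 19:56:00, 0.0100, 2.2100, 7.2000, 0.0000, 260.0000
"""


def combine_date_time(date_field, time_field):
    """Join a date field and a time field and parse them as one datetime.

    Assumption: date_field is YYYYDDD (year followed by day of year).
    """
    text = date_field.strip() + " " + time_field.strip()
    return datetime.strptime(text, "%Y%j %H:%M:%S")


reader = csv.reader(StringIO(SAMPLE))
header = [name.strip() for name in next(reader)]
date_col = header.index("date")
time_col = header.index("NominalTime")

# One combined timestamp per data row.
timestamps = [combine_date_time(row[date_col], row[time_col]) for row in reader]
print(timestamps[0])
```

The requested feature would do this joining inside the text reader, so that the combined column can be used directly as a date index.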

@ghost ghost assigned changhiskhan May 5, 2012

@changhiskhan

Contributor

commented May 5, 2012

Isn't this in #1174 already?

@timmie

Contributor

commented May 7, 2012

The following is linked and may be considered:
#854 (comment)

and after this issue here is solved, we may start with:
#1180

@timmie

Contributor

commented May 7, 2012

Regarding this comment:
#1186 (comment)

Can both issues be merged?

@wesm

Member Author

commented May 14, 2012

This looks good now. We'll need to write docs prior to release.

@wesm wesm closed this May 14, 2012

@wesm

Member Author

commented May 14, 2012

The only remaining question might be: should the combined columns be dropped from the resulting DataFrame?

@timmie

Contributor

commented May 15, 2012

"The only remaining question might be: should the combined columns be dropped from the resulting DataFrame?"

I think it's possible because they're in the index.

@changhiskhan

Contributor

commented May 16, 2012

We should actually add another keyword to control that, and perhaps one more to control whether the new date columns are placed first or last.

@timmie

Contributor

commented May 17, 2012

Could you add a small documentation snippet?

I have seen that there's a test file, but something more to the front would be nice.

BTW, is the website with the dev docs updated on a nightly basis?

@changhiskhan

Contributor

commented May 17, 2012

The docs are not updated nightly right now but will be soon.
Thanks for the suggestion. I will add a snippet to the docs.

@timmie

Contributor

commented May 23, 2012

This is still not transparent to me:

This works well:

from cStringIO import StringIO  # Python 2; on Python 3 use: from io import StringIO

import pandas as pd
from pandas import read_table

# Five sample rows: station ID, date, nominal time, actual time, measurements
data = '''\
KORD,19990127, 19:00:00, 18:56:00, 0.8100, 2.8100, 7.2000, 0.0000, 280.0000
KORD,19990127, 20:00:00, 19:56:00, 0.0100, 2.2100, 7.2000, 0.0000, 260.0000
KORD,19990127, 21:00:00, 20:56:00, -0.5900, 2.2100, 5.7000, 0.0000, 280.0000
KORD,19990127, 21:00:00, 21:18:00, -0.9900, 2.0100, 3.6000, 0.0000, 270.0000
KORD,19990127, 22:00:00, 21:56:00, -0.5900, 1.7100, 5.1000, 0.0000, 290.0000
'''

# Repeat the rows to get a larger test input
data = data * 2000

# Combine columns 1 and 2 into 'date1', and columns 1 and 3 into 'date2'
mydata = read_table(StringIO(data), sep=',', header=None, parse_dates={'date1': [1, 2], 'date2': [1, 3]})

myseries = pd.Series(data=mydata['X.5'], index=mydata.date1)

but changing to

mydata = read_table(StringIO(data), sep=',', header=None, parse_dates=[1, 2])

does not return the desired combined date-time column.

@changhiskhan

Contributor

commented May 23, 2012

You want parse_dates=[[1, 2]]. The nested list asks for columns 1 and 2 to be combined into a single date column.

parse_dates=[1, 2] means you want to parse each of columns 1 and 2 as dates, separately.

I'll see if I can make the docs a little more clear before the 0.8 release.
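
The difference between the flat and the nested form can be sketched in plain Python. This only illustrates the two meanings and is not pandas' implementation: the function apply_parse_dates, the per-column FORMATS table, and the choice to drop the source columns after combining are all assumptions made for the example.

```python
from datetime import datetime

# First fields of one row from the sample data in this thread.
ROW = ["KORD", "19990127", " 19:00:00", " 18:56:00", " 0.8100"]

# Hypothetical per-column formats for this sample only.
FORMATS = {1: "%Y%m%d", 2: "%H:%M:%S", 3: "%H:%M:%S"}


def apply_parse_dates(row, spec):
    """Mimic the two spec shapes discussed above (illustration only).

    An int entry parses that column on its own, in place.
    A list entry joins those columns and parses them as one value, which
    is placed first; the source columns are dropped (an assumption here).
    """
    out = list(row)
    combined = []
    drop = set()
    for entry in spec:
        if isinstance(entry, int):
            out[entry] = datetime.strptime(row[entry].strip(), FORMATS[entry])
        else:
            text = " ".join(row[i].strip() for i in entry)
            fmt = " ".join(FORMATS[i] for i in entry)
            combined.append(datetime.strptime(text, fmt))
            drop.update(entry)
    kept = [value for i, value in enumerate(out) if i not in drop]
    return combined + kept


flat = apply_parse_dates(ROW, [1, 2])      # two separately parsed columns
nested = apply_parse_dates(ROW, [[1, 2]])  # one combined date-time column
print(flat[1], flat[2])
print(nested[0])
```

With the flat spec, the time column is parsed without any date information, so it cannot yield the intended timestamp. Only the nested spec produces the combined 1999-01-27 19:00:00 value.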
