Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Creating datetime index from year and month columns combined #3

Closed
SophMC opened this issue May 13, 2016 · 2 comments
Closed

Creating datetime index from year and month columns combined #3

SophMC opened this issue May 13, 2016 · 2 comments

Comments

@SophMC
Copy link
Owner

SophMC commented May 13, 2016

I have taken the mean and std of ws grouped by year and month.
This is the first 5 lines of the new dataframe which I stored the values in.

year month mean std
0 1984 3 5.060840 2.280239
1 1984 4 5.219944 1.490117
2 1984 5 5.595610 1.396281
3 1984 6 6.520606 1.927716
4 1984 7 6.095756 1.958878
year int64
month int64
dtype: object

I want to create a datetime object from the year and month columns and make it the index of the dataframe, preferably using: pd.to_datetime and preferably using only one line of code!

These attempts don't work:
df.index=pd.to_datetime(year + month)
df.index=pd.to_datetime(df.year + df.month)
df['date'] = df.apply(lambda row: datetime(row['Year'], row['Month'], axis=1)) #with the intention to change it to the index after.
df['date']= df.apply(lambda x:datetime.strptime("{0} {1}".format(x['year'],x['month']), "%Y%m"),axis=1)

Though i've already done something similar using pd.to_datetime which does work and I'm not sure why it does, as nothing above seems to. I suspect it is something to do with the new dataframe I created:
wind['date_time'] = pd.to_datetime(wind.date_time) + wind.hour.astype('timedelta64[h]')

wind.date_time is already a datetime object created by read_csv when I read the file in and tell it to parse the first 3 columns together.

I can do it if I use a loop:

for i in range(0,len(df)):
year_n, month_n = df.index[i]
year_new.append(datetime(year_n,month_n,1))

But I guess that's slow and not the best use of pandas...

@SophMC
Copy link
Owner Author

SophMC commented Jun 5, 2016

Solution
Turn the groupby MultiIndex (consisting of [year[month]]) into two columns [year][month]:
wind_group = wind_group.reset_index()

Join year and month into one column with the format yyyy/mm and apply to_datetime for each row:
wind_group['date'] = wind_group[['year', 'month']].apply(lambda x: pd.to_datetime('/'.join(x)), axis=1)

Make the new datetimeIndex column ['date'] the dataFrame index:
wind_group.index=wind_group['date']

@SophMC SophMC closed this as completed Jun 5, 2016
@aaren
Copy link

aaren commented Jun 5, 2016

Actually lambda here is ok - the function is pretty simple. Main thing is readability - if the lambda isn't short and simple then use a separate def.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants