ENH: Create ability to localize tz with data already in that tz (fall transition) #4230

Closed
rockg opened this issue Jul 13, 2013 · 3 comments

@rockg
Contributor

rockg commented Jul 13, 2013

Data is often already presented in a local timezone (e.g., with two 1:00 AM hours at the fall DST transition). Currently there is no way to localize such data, because tz_localize raises an ambiguous time error. Please see the link for an example.

http://stackoverflow.com/questions/17370826/create-pandas-timezone-aware-datetimeindex-on-already-local-timezone

I think the simple assumption that the second occurrence of the repeated hour is non-DST is usually a sound one, and it should be available as an option. Perhaps there is already a way to do this, but I have yet to find it.
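
To make the report concrete, here is a minimal sketch of the failure being described. The timestamps are hypothetical (the 2013-11-03 US/Eastern fall transition), and the exact exception class depends on the pandas version, so the sketch catches a generic exception rather than naming one.

```python
import pandas as pd

# Hypothetical naive hourly stamps for the US/Eastern fall transition on
# 2013-11-03, where the 01:00 wall-clock hour occurs twice.
naive = pd.DatetimeIndex([
    "2013-11-03 00:00",
    "2013-11-03 01:00",
    "2013-11-03 01:00",
    "2013-11-03 02:00",
])

try:
    # With default settings pandas refuses to guess which 01:00 is DST.
    naive.tz_localize("US/Eastern")
    raised = False
except Exception as exc:
    raised = True
    print(type(exc).__name__, exc)
```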

@rockg
Contributor Author

rockg commented Jul 13, 2013

The resulting times in that answer are 9pm EDT/EST, not 1am EDT/EST, so they are 5 hours off. The desired output shows how it should look (generated from date_range). Also, applying any shift to that answer does not preserve the right times. The initial times are US/Eastern, not UTC, so any conversion distorts them.

On Sat, Jul 13, 2013 at 10:59 AM, jreback notifications@github.com wrote:

How does the answer not solve the problem?



@nehalecky
Contributor

I agree with @rockg: the answer given shifts the times, because it first parses the local 'US/Eastern' timestamps as 'UTC' (which they are not) and then converts them back to local time.
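
A short sketch of that shift, using one hypothetical timestamp: labelling a local 1am reading as UTC and then converting to US/Eastern moves the wall-clock time instead of preserving it.

```python
import pandas as pd

# A reading that was recorded at 01:00 local (US/Eastern) wall-clock time.
naive = pd.DatetimeIndex(["2013-11-03 01:00"])

# Treating it as UTC and converting changes the wall-clock hour.
shifted = naive.tz_localize("UTC").tz_convert("US/Eastern")

print(naive[0])    # the original local reading
print(shifted[0])  # no longer 01:00 local
```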

I have also been dealing with pandas' inability to parse time series presented in local time when the series contains a DST transition. I have been working on a general solution, but haven't had the time to complete it (I don't know much about the time series indexing internals just yet)!

In the meantime, I get around this with a simple (and quite hackish) function. Assume a DataFrame, df, whose naive DatetimeIndex is continuous (i.e., runs around the clock, all day, all night, 365.25 days a year) and has a constant frequency (i.e., every timestamp offset is equal). Then you can simply generate a time range based on this frequency and assign it to the index.

import pandas as pd

def parse_local_timeseries(df, tz):
    # Most common spacing between consecutive naive timestamps.
    deltas = pd.Series(df.index[1:] - df.index[:-1])
    main_freq = deltas.value_counts().index[0]
    # Rebuild the index as a tz-aware range spanning the same start and end.
    df.index = pd.date_range(df.index[0], df.index[-1], freq=main_freq, tz=tz)
    return df

A nice side effect of this function is a built-in error check: if your time series is missing any timestamps, assigning the generated date range to the df fails because the df and the generated DatetimeIndex differ in size.
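
A minimal sketch of that size check, independent of the function above. The data is hypothetical: hourly readings across the 2013-11-03 US/Eastern fall transition with the 02:00 reading deliberately left out.

```python
import pandas as pd

# Hypothetical hourly readings; 01:00 repeats (fall transition) and 02:00 is missing.
idx = pd.DatetimeIndex([
    "2013-11-03 00:00",
    "2013-11-03 01:00",
    "2013-11-03 01:00",
    "2013-11-03 03:00",
])
df = pd.DataFrame({"value": range(len(idx))}, index=idx)

# The tz-aware range covers every hour between the first and last stamp,
# so it has more entries than the gappy frame.
full = pd.date_range(idx[0], idx[-1], freq=pd.Timedelta(hours=1), tz="US/Eastern")

try:
    df.index = full
    mismatch = False
except ValueError as exc:
    # pandas rejects an index whose length differs from the frame's.
    mismatch = True
    print(exc)
```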

Re a better approach: I might open up a more general issue in this regard soon. So many ideas, so little time. :)

rockg added a commit to rockg/pandas that referenced this issue Oct 2, 2013
Fix to issue pandas-dev#4230 which allows to localize an index which is
implicitly in a tz (e.g., reading from a file) by passing infer_dst to
tz_localize.
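
The commit above names an infer_dst argument. In later pandas releases the same inference is requested with ambiguous="infer" on tz_localize, which is what this sketch assumes; the timestamps are hypothetical. The repeated hour is resolved by order of appearance, with the first occurrence treated as DST and the second as standard time.

```python
import pandas as pd

# Hypothetical naive stamps already in US/Eastern wall-clock time,
# with 01:00 appearing twice at the 2013-11-03 fall transition.
naive = pd.DatetimeIndex([
    "2013-11-03 00:00",
    "2013-11-03 01:00",
    "2013-11-03 01:00",
    "2013-11-03 02:00",
])

# Ask pandas to infer DST from the order of the repeated times.
aware = naive.tz_localize("US/Eastern", ambiguous="infer")

# UTC offset in hours for each localized stamp.
offsets = [ts.utcoffset().total_seconds() / 3600 for ts in aware]
print(offsets)
```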
@jreback
Contributor

jreback commented Oct 2, 2013

closed by #4706

@jreback jreback closed this as completed Oct 2, 2013