BUG: Add timezone to quote times in Options. #29
Conversation
|
Setting the timezone is a bit odd and makes working with these times much harder for the user, so unless you have a really compelling reason I would not put the tz |
|
Ya, I'm on the fence about the second one. I was unaware that it wasn't efficiently represented. I think we should just put the timezone on the quote time (first commit). |
|
look at the dtypes of the frame before/after |
|
I see - datetime64 vs object. Looks like it doesn't matter if it's in the index (both DatetimeIndex)? I'm fine with leaving them naive. Thoughts @aisthesis? |
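For anyone who wants to repeat the before/after dtype check, here is a minimal sketch. The frame and the column names (`Quote_Time`, `Quote_Time_tz`) are made up for illustration and are not the actual output of the Options reader. On a recent pandas the localized column should report a timezone-aware datetime dtype; older versions may show `object`, as noted above.

```python
import pandas as pd

# Hypothetical frame; the column names are assumptions for illustration only.
df = pd.DataFrame(
    {"Quote_Time": pd.to_datetime(["2015-03-02 16:00:00", "2015-03-02 16:00:00"])}
)

# dtype before: a plain (tz-naive) datetime64 column
naive_dtype = df["Quote_Time"].dtype

# Attach a timezone to the naive wall-clock times
df["Quote_Time_tz"] = df["Quote_Time"].dt.tz_localize("America/New_York")

# dtype after: compare this against the naive one
aware_dtype = df["Quote_Time_tz"].dtype

print(naive_dtype)
print(aware_dtype)
```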
|
The problem with naive is that applications will typically just interpret it as UTC, which it definitely isn't in the case of quote time. I feel like quote time at least should have a clear timezone. For performance in getting the data, the overhead in retrieval looks to me like it's coming from connecting to various Yahoo sources and not from Pandas putting everything together. But I could be wrong there, as I only see it as consumer. It's also clear that the expirations are in a different spot both substantively (there's no actual time but just a date) and programmatically (they're a Pandas Timeseries within the index whereas quote time is a numpy datetime64 because numpy really only understands floats inside the array). What do you guys think of making quote time have a timezone but leaving expiration tz-naive? |
|
I disagree with jreback that setting the timezone is odd, and most definitely that it makes working with these times more difficult for the user. Speaking as a user, not having the timezone creates problems for me. If you decide to leave it naive, I'll have to put in my own code to set the timezone in both cases. I can't speak to the internal efficiency issue within Pandas. |
45ffc63 to
2543d0d
Compare
|
As I see it, we have 3 options:
I'm inclined to the 2nd one. The timezone for the quote time has a meaning and would be useful for people not on the east coast. I'm iffy on adding a timezone to the expiry date. However, if we start adding option quotes for European exchanges, then timezones for expiry dates might start mattering. @jreback @aisthesis Thoughts? |
|
You have to be consistent across the various methods. E.g. if you choose to put a timezone on, say, quote time, then the other data methods should do the same (I mean for, say, stock data). But to be honest I suspect most people either keep this data as relative and naive (e.g. 4pm, i.e. the reported time but as a naive time), or convert to UTC. |
|
@jreback I suspect very few do anything with it as naive. You run into problems very quickly, as I have, unless your local machine doing the processing happens to be on the U.S. East Coast. Naive isn't a problem as long as you're just doing something like pulling it from command line to check it out. But as soon as the date matters in a program, you're forced (as I've done) simply to insert the proper timezone. Moreover, for those who keep the data as naive timezone, it wouldn't matter if it were correctly specified. Whereas not specifying it definitively creates problems for some. In other words, even if it doesn't particularly matter for some users, specifying the timezone hurts no one and helps most users who actually use the time programmatically. Yahoo! presumably is just providing it as a string with no tz specification, right? |
|
Maybe the least common denominator would be to put everything in UTC. The problem I have with naive timezones is that there is no reliable contract as to what the time actually is. So, I can't write reliable client code comparing times. Common use cases: Has the option expired? Is the given I can live with any of the following solutions:
I find it unsatisfactory to have a naive timezone with no way to tell what the corresponding UTC time actually is. |
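To make the "has the option expired?" case concrete, here is a rough sketch using only the standard library. It assumes the naive times are US Eastern wall-clock times and, purely for illustration, that expiry falls at 4 pm Eastern on the expiration date; `to_utc` and `has_expired` are hypothetical helper names, not part of any library.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Assumption: the naive times represent US Eastern wall-clock time.
NEW_YORK = ZoneInfo("America/New_York")


def to_utc(naive_local: datetime, tz: ZoneInfo = NEW_YORK) -> datetime:
    """Attach `tz` to a naive wall-clock time and convert it to UTC."""
    return naive_local.replace(tzinfo=tz).astimezone(timezone.utc)


def has_expired(expiry_local: datetime, now_utc: datetime) -> bool:
    """Compare a naive Eastern expiry against an aware UTC 'now'."""
    return now_utc >= to_utc(expiry_local)


# Hypothetical expiry: 4 pm Eastern on 2015-03-20 (daylight time, UTC-4).
expiry = datetime(2015, 3, 20, 16, 0)
now = datetime(2015, 3, 20, 19, 30, tzinfo=timezone.utc)

# Reading the naive expiry as UTC would wrongly treat 19:30 UTC as past expiry.
print(to_utc(expiry), has_expired(expiry, now))
```

Once the zone is known the comparison is unambiguous; without it, client code has to guess.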
|
Just to continue this discussion. What if we just make objects that have a time attached to them (ie: just option quote time currently) have the correct time zone, while leaving objects that are just dates naive. Does that work? Or do we need to be all or none? |
|
That would work for my use case. |
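A sketch of what that split could look like from the user's side, assuming a frame with a date-only `Expiry` column and a `Quote_Time` column (both names and the data are hypothetical, and this is not the change made in the PR):

```python
import pandas as pd

# Hypothetical option-chain frame; column names are assumptions for illustration.
chain = pd.DataFrame(
    {
        "Expiry": pd.to_datetime(["2015-03-20", "2015-04-17"]),
        "Quote_Time": pd.to_datetime(["2015-03-02 16:00", "2015-03-02 16:00"]),
    }
)

# Only the column that carries an actual time of day gets a timezone;
# the date-only Expiry column stays tz-naive.
chain["Quote_Time"] = chain["Quote_Time"].dt.tz_localize("America/New_York")

# Users who prefer UTC can convert without guessing the source zone.
quote_utc = chain["Quote_Time"].dt.tz_convert("UTC")

print(chain.dtypes)
print(quote_utc)
```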
|
What was the final verdict here? Are the Timestamps all in EST now? |
|
They are still all naive. @jreback is your concern about performance alleviated now in 0.17+? http://pandas.pydata.org/pandas-docs/stable/whatsnew.html#whatsnew-0170-tz |
|
yep putting them in tz would be ok now |
Fixes #28