Timezone info lost when broadcasting scalar datetime to DataFrame #11682
Comments
this is covered by #11672
jreback added the Bug, Indexing, and Timezones labels on Nov 24, 2015
jreback added this to the 0.18.0 milestone on Nov 24, 2015
jreback referenced this issue on Nov 24, 2015: BUG: GH11616 fixes timezone selection error #11672 (Closed)
jreback added a commit that referenced this issue on Nov 27, 2015: e838266 (varun-kr + jreback), closed by #11672
jreback closed this on Nov 27, 2015
ajenkins-cargometrics commented Nov 23, 2015
I've encountered a bug in pandas 0.16.2, where when using broadcasting to assign a datetime.datetime value to a whole column of a DataFrame, the timezone info is lost. Here is an example:
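A minimal sketch of such a reproduction follows. The frame contents, the column name 'a', the timestamp value, and the use of `datetime.timezone.utc` are assumptions chosen for illustration. Per the report, on an affected version such as 0.16.2 the dtype printed for column 'b' carries no timezone.

```python
import datetime

import pandas as pd

# Illustrative frame and timestamp; these names and values are assumptions,
# not the reporter's original example.
df = pd.DataFrame({"a": [1, 2, 3]})
dt = datetime.datetime(2015, 11, 23, 12, 0, tzinfo=datetime.timezone.utc)

# Broadcast the scalar datetime to every row of a new column.
df["b"] = dt

print(dt.tzinfo)      # timezone attached to the scalar
print(df["b"].dtype)  # shows whether the column kept the timezone
```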
Note how dt has a timezone attached, but the values in the 'b' column don't. The problem only occurs when broadcasting a scalar datetime to a column, not when assigning an array or series. Also, the problem only occurs when using the builtin datetime.datetime class, not pandas's Timestamp class.

I've tracked the problem down to the pandas.core.common._infer_dtype_from_scalar function, which is called during the assignment. It contains this code for handling scalar datetimes:
The problem is that the Timestamp.value property returns an integer value which doesn't contain the timezone information, so the timezone is lost. This problem occurs for datetime.datetime, but not for pandas.Timestamp, because the code looks for the 'tz' attribute, which is specific to Timestamp. If the getattr call were changed to look at the 'tzinfo' attribute instead, this code would work correctly for both pandas.Timestamp and datetime.datetime values. So a fix for this code which works for both datetime and Timestamp would be:
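As a small standard-library illustration of the attribute difference behind this fix (not the pandas patch itself; `scalar_timezone` is a hypothetical helper and the timestamp value is arbitrary), the two lookups behave as follows on a timezone-aware datetime.datetime:

```python
import datetime

# An aware standard-library datetime (illustrative value).
dt = datetime.datetime(2015, 11, 23, 12, 0, tzinfo=datetime.timezone.utc)

# datetime.datetime has no 'tz' attribute, so this lookup falls back to None
# and the value would be treated as timezone-naive.
tz_via_tz = getattr(dt, "tz", None)

# 'tzinfo' is defined on datetime.datetime, so the timezone is found.
tz_via_tzinfo = getattr(dt, "tzinfo", None)


def scalar_timezone(val):
    """Hypothetical helper: return the timezone of a datetime-like scalar, or None."""
    return getattr(val, "tzinfo", None)


print(tz_via_tz)
print(tz_via_tzinfo)
print(scalar_timezone(dt))
```

Because pandas.Timestamp is a subclass of datetime.datetime, a 'tzinfo' lookup of this kind applies to both types.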
I checked, and this bug still exists in the latest version of the pandas source. Nevertheless, here is the output of show_versions() on my machine: