API/BUG: should df.loc[:,'col'] be the same as df['col'] for assignment #6149

Closed
jreback opened this issue Jan 28, 2014 · 6 comments · Fixed by #6159
Labels: API Design, Bug, Dtype Conversions (unexpected or buggy dtype conversions)

@jreback
Contributor

jreback commented Jan 28, 2014

see also: http://stackoverflow.com/questions/21415432/pandas-v0-13-0-setting-dataframe-values-of-type-datetime64ns

I believe this changed after 0.12; the two forms behaved the same in 0.12.

It boils down to this: if the row indexer is a null slice (in other words, all rows are selected), should
dtype conversion be done, or not (as is currently the case)?

In [45]: df = pd.DataFrame({'date':pd.date_range('2000-01-01','2000-01-5'),'val' : np.arange(5)})

In [46]: df
Out[46]: 
        date  val
0 2000-01-01    0
1 2000-01-02    1
2 2000-01-03    2
3 2000-01-04    3
4 2000-01-05    4

[5 rows x 2 columns]

In [47]: df['date'] = 0

In [48]: df
Out[48]: 
   date  val
0     0    0
1     0    1
2     0    2
3     0    3
4     0    4

[5 rows x 2 columns]

In [49]: df = pd.DataFrame({'date':pd.date_range('2000-01-01','2000-01-5'),'val' : np.arange(5)})

In [53]: df.loc[:,'date'] = np.array([0])

In [54]: df
Out[54]: 
        date  val
0 1970-01-01    0
1 1970-01-01    1
2 1970-01-01    2
3 1970-01-01    3
4 1970-01-01    4

[5 rows x 2 columns]
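
For anyone reproducing this on a different pandas release, the comparison above can be sketched as a standalone script. This is only a sketch: `make_frame`, `set_with_brackets`, `set_with_loc` and `dtype_after` are hypothetical helper names, not pandas API. The `.loc` outcome depends on the installed version, so the script reports whatever happens rather than assuming a result.

```python
import warnings

import numpy as np
import pandas as pd


def make_frame():
    # Same frame as in the session above (hypothetical helper name).
    return pd.DataFrame({"date": pd.date_range("2000-01-01", "2000-01-05"),
                         "val": np.arange(5)})


def set_with_brackets(df):
    # Replaces the whole column object.
    df["date"] = 0


def set_with_loc(df):
    # Null-slice row indexer: every row of the column is selected.
    df.loc[:, "date"] = np.array([0])


def dtype_after(assign):
    # Apply one assignment to a fresh frame and report the dtype of 'date'.
    # Warnings are silenced and errors are reported as text, because the
    # .loc behaviour is version dependent (an assumption, not a guarantee).
    df = make_frame()
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        try:
            assign(df)
        except Exception as exc:
            return "raised " + type(exc).__name__
    return str(df["date"].dtype)


via_brackets = dtype_after(set_with_brackets)
via_loc = dtype_after(set_with_loc)
print("df['date'] = 0               ->", via_brackets)
print("df.loc[:, 'date'] = array(0) ->", via_loc)
```

If the two printed results differ, the installed version still treats the two forms differently for full-column assignment.
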
@ghost

ghost commented Jan 29, 2014

Tough one.

@jreback
Contributor Author

jreback commented Jan 29, 2014

I was going to put this in the docs for now and then think about it for 0.14.
Yes?

+.. _indexing.getitem_vs_loc:
+
+``[]`` vs loc
+-------------
+
+``[]`` can behave slightly differently from a similar ``.loc``
+expression for setitem. An example will help explain what is occurring.
+
+.. ipython:: python
+
+   t  = date_range('2000-01-01','2000-01-05')
+   v  = np.arange(0,len(t))
+   df = DataFrame({'date':t,'val':v})
+   df
+   df.dtypes
+
+Setting via ``[]`` overwrites **the entire column**, so the dtype of the resultant column
+will be the same as that of the provided value.
+
+.. ipython:: python
+
+   df['date'] = v[0]
+   df.dtypes
+
+On the other hand, ``df.loc[:,'date']`` says something different. It says: overwrite the
+current data with the value that is provided, but don't change the dtype, even
+though it's slicing the entire column. This is done to avoid having to check for dtype changes
+each time this kind of set operation is performed.
+
+.. ipython:: python
+
+   df = DataFrame({'date':t,'val':v})
+   df.loc[:,'date'] = v[0]
+   df
+   df.dtypes
+

@ghost

ghost commented Jan 29, 2014

hmm. That's a corner case. I think that part of the docs is there to help
the user understand how to use pandas. That snippet answers a question I don't think
99.999% of users will be asking themselves while reading the docs and is
therefore out of place.

perhaps a SO question with accepted answer and a GH issue are good enough
to cover this? Sometimes more is less.

@jreback
Contributor Author

jreback commented Jan 29, 2014

ok...going to move this to 0.14 to think about it

@ghost

ghost commented Jan 29, 2014

my 2c, when you assign an entire column, the resulting dtype should match what
would have been there if you passed in the same (broadcasted in this case)
column data at construction time. There may be details I'm unaware of,
but that's what i'd intuitively expect.
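
A minimal sketch of that expectation, using only the public `DataFrame` constructor and `[]` assignment. The variable names `constructed` and `assigned` are illustrative. It checks that overwriting a whole column via `[]` gives the same dtype as passing the same broadcast scalar at construction time; it does not show what `.loc` does.

```python
import numpy as np
import pandas as pd

# Column data passed at construction time; the scalar 0 is broadcast to 5 rows.
constructed = pd.DataFrame({"date": 0, "val": np.arange(5)})

# Start from a datetime64 column, then overwrite the whole column via [].
assigned = pd.DataFrame({"date": pd.date_range("2000-01-01", "2000-01-05"),
                         "val": np.arange(5)})
assigned["date"] = 0

# The proposal in this comment: both routes should yield the same dtype.
same_dtype = constructed["date"].dtype == assigned["date"].dtype
print(constructed["date"].dtype, assigned["date"].dtype, same_dtype)
```
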

@jreback
Contributor Author

jreback commented Jan 29, 2014

ok...i can change pretty easily ....maybe will do tomorrow..
