Contributor

@jreback jreback commented Oct 17, 2017

closes #17894

xref #16823

@jreback jreback added API Design Indexing Related to indexing on series/frames, not to indexes themselves labels Oct 17, 2017
@jreback jreback added this to the 0.21.0 milestone Oct 17, 2017
Contributor Author

jreback commented Oct 17, 2017

cc @toobaz @jorisvandenbossche

Contributor Author

jreback commented Oct 17, 2017

In [1]: df = pd.DataFrame()

In [2]: df['foo'] = 1

In [3]: df
Out[3]: 
Empty DataFrame
Columns: [foo]
Index: []

In [4]: df.dtypes
Out[4]: 
foo    int64
dtype: object

In [5]: df = pd.DataFrame()

# 17895
In [6]: df.loc[1] = 1
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-6-24a18e3c59c6> in <module>()
----> 1 df.loc[1] = 1

~/pandas/pandas/core/indexing.py in __setitem__(self, key, value)
    192             key = com._apply_if_callable(key, self.obj)
    193         indexer = self._get_setitem_indexer(key)
--> 194         self._setitem_with_indexer(indexer, value)
    195 
    196     def _has_valid_type(self, k, axis):

~/pandas/pandas/core/indexing.py in _setitem_with_indexer(self, indexer, value)
    421                     # no columns and scalar
    422                     if not len(self.obj.columns):
--> 423                         raise ValueError("cannot set a frame with no defined "
    424                                          "columns")
    425 

ValueError: cannot set a frame with no defined columns

So the revert puts us back to the original behavior: in [2] we still set the column dtype, though the value itself is not assigned.
Doing the reverse in [6] does raise (xref #17895), so we are inconsistent here.
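
For anyone who wants to check the two cases from the session above side by side, here is a minimal sketch. It assumes only that pandas is importable as pd. The helper name try_loc_scalar is made up for illustration, and whether the .loc case raises (and with which message) may differ from the traceback in [6] on other pandas versions, so the helper just reports what it sees.

```python
import pandas as pd

# Case [2]: scalar assigned to a new column of a frame with no index.
df = pd.DataFrame()
df["foo"] = 1

# The column is created, but there are no rows to hold the value.
col_shape = df.shape
col_dtype = str(df["foo"].dtype)


def try_loc_scalar():
    """Hypothetical helper: attempt case [6] and report what happened."""
    empty = pd.DataFrame()
    try:
        empty.loc[1] = 1
    except Exception as exc:  # the traceback in [6] shows ValueError
        return "raised {}: {}".format(type(exc).__name__, exc)
    return "no error, shape {}".format(empty.shape)


print(col_shape, col_dtype)
print(try_loc_scalar())
```

The first print should show an empty frame with one column; the second line shows whichever behavior the installed version has for the .loc case.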

Member

toobaz commented Oct 17, 2017

> so revert puts us back to original. we still are changing the column dtype, though the value is not assigned.

I think this is nice

> Doing the reverse [6] does raise (xref #17895) so inconsistent here.

I agree, I will look at it


codecov bot commented Oct 17, 2017

Codecov Report

Merging #17902 into master will decrease coverage by 0.01%.
The diff coverage is 50%.

@@            Coverage Diff             @@
##           master   #17902      +/-   ##
==========================================
- Coverage   91.23%   91.22%   -0.02%     
==========================================
  Files         163      163              
  Lines       50105    50101       -4     
==========================================
- Hits        45715    45703      -12     
- Misses       4390     4398       +8

Flag         Coverage Δ
#multiple    89.03% <50%> (ø) ⬆️
#single      40.32% <50%> (-0.06%) ⬇️

Impacted Files                  Coverage Δ
pandas/core/reshape/pivot.py    96.35% <ø> (-0.03%) ⬇️
pandas/core/frame.py            97.75% <50%> (-0.1%) ⬇️
pandas/io/gbq.py                25% <0%> (-58.34%) ⬇️
pandas/core/groupby.py          92.03% <0%> (+0.04%) ⬆️

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5bf7f9a...365e32c. Read the comment docs.

Contributor Author

jreback commented Oct 17, 2017

anyone want to warn on this operation?

Contributor Author

jreback commented Oct 17, 2017

@toobaz btw watched your talk at EuroPython: https://www.youtube.com/watch?v=4JwpDGrMsJE

very nice!

Member

toobaz commented Oct 17, 2017

> very nice!

Thanks! I was asked to replicate at a couple of other conferences, so any suggestions to improve are welcome.

Contributor Author

jreback commented Oct 17, 2017

near the end of your talk

In [16]: df = pd.DataFrame(['a b c']*10000, columns=['col'])

In [18]: %timeit pd.DataFrame(df['col'].apply(lambda x: pd.Series(x.split())))
1.37 s +- 33.4 ms per loop (mean +- std. dev. of 7 runs, 1 loop each)

In [20]: %timeit pd.DataFrame(df['col'].apply(lambda x: x.split()))
3.59 ms +- 240 us per loop (mean +- std. dev. of 7 runs, 100 loops each)

In [24]: %timeit df['col'].str.split(expand=True)
14.2 ms +- 140 us per loop (mean +- std. dev. of 7 runs, 100 loops each)

I would also show [24], as it is the idiomatic approach: slightly slower than [20], but it handles NaN values.
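
To make the NaN point concrete, here is a small sketch on a three-row Series. The sample data and variable names are made up for illustration, and dtype=object is set explicitly as an assumption so the result does not depend on the default string dtype. It also shows that the apply variant from [20] produces a single column of lists rather than expanded columns, so [20] and [24] do not build quite the same result.

```python
import numpy as np
import pandas as pd

# Illustrative data: one missing value in the middle.
s = pd.Series(["a b c", np.nan, "d e f"], dtype=object, name="col")

# Idiomatic: one column per token; the missing input stays missing.
expanded = s.str.split(expand=True)

# apply with a plain Python split hits the float NaN and fails.
try:
    s.apply(lambda x: x.split())
    apply_error = None
except AttributeError as exc:
    apply_error = str(exc)

# Without the NaN, apply returns a Series of lists, so wrapping it in
# DataFrame gives a single column rather than three.
lists = pd.DataFrame(s.dropna().apply(lambda x: x.split()))

print(expanded)
print(apply_error)
print(lists.shape)
```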

Member

toobaz commented Oct 18, 2017

Aha, cool, I had missed this, thanks.

@jreback jreback merged commit bbaa576 into pandas-dev:master Oct 18, 2017
Contributor Author

jreback commented Oct 18, 2017

ok reverting for now. any thoughts on warning though?

Member

toobaz commented Oct 18, 2017

I didn't reply above because I'm slightly against introducing a warning in this case. I see warnings as something that helps you understand implementation caveats, not the API. But this might be a personal view.

On the other hand, I think that once we also fix #17895 we should clearly document the behavior.

yeemey pushed a commit to yeemey/pandas that referenced this pull request Oct 20, 2017
…h no index ( pandas-dev#16823) (pandas-dev#16968)" (pandas-dev#17902)

* Revert "ERR: Raise ValueError when setting scalars in a dataframe with no index ( pandas-dev#16823) (pandas-dev#16968)"

This reverts commit f9ba6fe.

* TST: explicit test on setting scalars on empty frame

closes pandas-dev#17894
alanbato pushed a commit to alanbato/pandas that referenced this pull request Nov 10, 2017
No-Stream pushed a commit to No-Stream/pandas that referenced this pull request Nov 28, 2017

Successfully merging this pull request may close these issues.

Allow setting a column with a scalar and no index (revert #16823)
3 participants