Skip to content

Conversation

holymonson
Copy link
Contributor

@holymonson holymonson commented Jul 20, 2018


@TomAugspurger Hope you can review this.

BTW, perhpas you may want to change Iterable in is_list_like() to Sequence.

gen = (i for i in range(10))
pandas.core.dtypes.inference.is_list_like(gen)
# True
len(gen)
# Traceback (most recent call last):
#   File "<stdin>", line 1, in <module>
# TypeError: object of type 'generator' has no len()
gen[0]
# Traceback (most recent call last):
#   File "<stdin>", line 1, in <module>
# TypeError: 'generator' object is not subscriptable

@holymonson
Copy link
Contributor Author

This test failed. But a single-column DataFrame seems reasonable to me.

# corner, silly
# TODO: Fix this Exception to be better...
with tm.assert_raises_regex(ValueError, 'constructor not '
'properly called'):
DataFrame((1, 2, 3))

pd.DataFrame((1, 2, 3))
#   0
#0  1
#1  2
#2  3

@holymonson holymonson force-pushed the iterable_in_constructor branch from b498a5c to d1c1bc2 Compare July 20, 2018 10:48

expected = Series(list(range(10)), dtype='int64')
result = Series(range(10), dtype='int64')
result = Series(Iter(), dtype='int64')
Copy link
Member

@mroeschke mroeschke Jul 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cant this simply be iter(range(10))?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

range is list-like, subclass of Sequence, which doesn't need consuming by data = list(data). I'd like to add back range(10) test and keep both.

@holymonson holymonson force-pushed the iterable_in_constructor branch from d1c1bc2 to 4fb3b38 Compare July 21, 2018 02:52
@holymonson
Copy link
Contributor Author

Allow pd.DataFrame((1, 2, 3)) because pd.DataFrame([1, 2, 3]) is valid. They should be the same.

@codecov
Copy link

codecov bot commented Jul 21, 2018

Codecov Report

Merging #21987 into master will decrease coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #21987      +/-   ##
==========================================
- Coverage   92.02%   92.02%   -0.01%     
==========================================
  Files         170      170              
  Lines       50707    50704       -3     
==========================================
- Hits        46661    46658       -3     
  Misses       4046     4046
Flag Coverage Δ
#multiple 90.42% <100%> (-0.01%) ⬇️
#single 42.31% <80%> (ø) ⬆️
Impacted Files Coverage Δ
pandas/core/series.py 94.11% <100%> (ø) ⬆️
pandas/core/frame.py 97.19% <100%> (-0.01%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 693e235...a2ace52. Read the comment docs.

@gfyoung gfyoung added Enhancement Internals Related to non-user accessible pandas implementation Low-Memory labels Jul 22, 2018
@gfyoung gfyoung requested a review from TomAugspurger July 22, 2018 22:45
expected = DataFrame([[1, 2, 3]] * 10)
result = DataFrame(Iter())
tm.assert_frame_equal(result, expected)

Copy link
Member

@gfyoung gfyoung Jul 22, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reference the issue number in a comment below the functional definition.

Applies to all newly-added tests.

@holymonson holymonson force-pushed the iterable_in_constructor branch from 4fb3b38 to 166f4f6 Compare July 23, 2018 03:10
@TomAugspurger
Copy link
Contributor

Allow pd.DataFrame((1, 2, 3)) because pd.DataFrame([1, 2, 3]) is valid. They should be the same.

Context: That goes back a long way to DataMatrix: c34ac74#diff-225c6e44f997b9b698ab52a48b223018R39

+1 for allowing it.

Copy link
Contributor

@TomAugspurger TomAugspurger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Merging later today, unless someone else beats me to it.

@jreback
Copy link
Contributor

jreback commented Jul 24, 2018

will have a look

don’t merge

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pls add a whatsnew note, other enhancements in 0.24

copy=copy)
elif isinstance(data, (list, types.GeneratorType)):
if isinstance(data, types.GeneratorType):
elif (isinstance(data, collections.Iterable)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a comment here (1 line above) which defines what this clause is checking

elif isinstance(data, (set, frozenset)):
raise TypeError("{0!r} type is unordered"
"".format(data.__class__.__name__))
elif (isinstance(data, collections.Iterable)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

@jreback jreback added this to the 0.24.0 milestone Jul 25, 2018
@holymonson holymonson force-pushed the iterable_in_constructor branch from 166f4f6 to a2ace52 Compare July 25, 2018 11:33
@jreback jreback merged commit be6ad72 into pandas-dev:master Jul 26, 2018
@jreback
Copy link
Contributor

jreback commented Jul 26, 2018

thanks @holymonson !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Enhancement Internals Related to non-user accessible pandas implementation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Cannot produce DataFrame from dict-yielding iterator.

5 participants