series str as an iterator #3638

Closed
hayd opened this Issue · 9 comments

3 participants

@hayd
Collaborator

This was noticed on SO by @DSM. It seems that the Series str method is a never-ending iterator. I have no idea whether this should be classified as a bug or just misuse. But here goes:

In [224]: g = (i for i in ds.str)

In [225]: next(g)
Out[225]:
google       NaN
wikimedia    NaN
wikipedia    NaN
wikitravel   NaN
dtype: float64

In [226]: next(g)
Out[226]:
google       NaN
wikimedia    NaN
wikipedia    NaN
wikitravel   NaN
dtype: float64

In [227]: next(g)
Out[227]:
google       NaN
wikimedia    NaN
wikipedia    NaN
wikitravel   NaN
dtype: float64

In [228]: next(g)
Out[228]:
google       NaN
wikimedia    NaN
wikipedia    NaN
wikitravel   NaN
dtype: float64

In [229]: list(g)  # lalalala
@jreback
Owner

Not sure if this is a bug or a feature:

The reason __getitem__ exists in str (and thus gets invoked when str is iterated) is so that element-wise slicing works:

In [18]: ds
Out[18]: 
google        40
wikimedia     22
wikipedia     10
wikitravel    33
dtype: int64

In [19]: Series(ds.index).str[0:5]
Out[19]: 
0    googl
1    wikim
2    wikip
3    wikit
dtype: object

not sure how useful that is though
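
For context, the endless iteration looks like Python's legacy sequence protocol at work: when an object defines __getitem__ but not __iter__, iter() calls __getitem__ with 0, 1, 2, ... until an IndexError is raised. A minimal sketch of that mechanism, using a hypothetical EndlessStr class (not the pandas implementation) whose __getitem__ never raises:

```python
class EndlessStr:
    """Hypothetical stand-in for an accessor with element-wise __getitem__."""

    def __init__(self, values):
        self._values = values

    def __getitem__(self, key):
        # Index each string; out-of-range positions become None instead of
        # raising IndexError, so the caller never sees the end of the data.
        out = []
        for v in self._values:
            try:
                out.append(v[key])
            except IndexError:
                out.append(None)
        return out


# No __iter__ is defined, so iter() falls back to __getitem__(0), (1), (2), ...
g = iter(EndlessStr(["google", "wikimedia"]))
first = next(g)  # characters at position 0
# Past the longest string every call returns only None values, and because
# IndexError is never raised, list(g) would never finish.
```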

@cpcloud
Collaborator

You could have it iterate like zip and just raise StopIteration when the last character of the shortest string is reached, or you could iterate to the last character of the longest string and provide NaNs for the rest. The latter seems more pandas-like.
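
The two options can be illustrated on plain Python strings with zip and itertools.zip_longest. This is only a sketch of the semantics being discussed; the words list is made up and nothing here is pandas code:

```python
from itertools import zip_longest

words = ["google", "wiki"]  # illustrative data only

# Option 1: stop at the last character of the shortest string, like zip.
shortest = [list(chars) for chars in zip(*words)]

# Option 2: run to the last character of the longest string and fill the
# gaps with NaN.
nan = float("nan")
longest = [list(chars) for chars in zip_longest(*words, fillvalue=nan)]
```

Here shortest has one row per character of "wiki", while longest has one row per character of "google", with NaN standing in for the missing positions of "wiki".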

@hayd
Collaborator

I'm a bit confused: next(g) above always returns the same thing (the length of the index doesn't seem to change).

@cpcloud
Collaborator

@hayd @jreback check out my branch series-str-iter-3638. What do you think?

@cpcloud
Collaborator

Could change any to all to get the first type of behavior I mentioned. Haven't tested it; gotta run out. Will test tomorrow or later tonight.
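
A rough sketch of how the any/all choice could drive such an iterator. This is not the code from the branch; get, iter_chars, and keep_going are hypothetical names, and plain lists of strings stand in for a Series:

```python
import math


def _is_missing(x):
    # NaN marks a position past the end of a string.
    return isinstance(x, float) and math.isnan(x)


def get(values, i):
    """Hypothetical element-wise s[i]; NaN where i is out of range."""
    return [v[i] if i < len(v) else math.nan for v in values]


def iter_chars(values, keep_going=any):
    """Yield one row of characters per position.

    keep_going=any: continue while any string still has a character
                    (runs to the longest string, NaN-filled).
    keep_going=all: continue only while every string has a character
                    (stops at the shortest string, like zip).
    """
    if not values:
        return  # all() over no elements is True, so guard against looping forever
    i = 0
    row = get(values, i)
    while keep_going(not _is_missing(x) for x in row):
        yield row
        i += 1
        row = get(values, i)
```

With ["google", "wiki"], the default yields six rows and keep_going=all yields four.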

@hayd
Collaborator

I think any would be best, but... would anyone ever use this?

I suppose it makes sense, analogous to iterating through a string. I certainly have no idea what else it could/should do, perhaps more of an egg than a feature... :)

@cpcloud
Collaborator
@hayd
Collaborator

I'm sold. pr?

@cpcloud
Collaborator

Sure. Also, I will revert to self.get to avoid branching every time (no need to check for a slice when iterating).

@jreback jreback closed this in #3645