Add stack argument to str.findall #4428

hayd · 2013-08-01T10:05:20Z

Some string methods (e.g. findall) return a Series of lists, which is inconvenient for further analysis:

import pandas as pd
df = pd.DataFrame([['@a @b'], ['@a'], ['@c']], columns=['tweets'])
at_mentions = df.tweets.str.findall(r'@[a-zA-Z0-9_]+')

In [3]: at_mentions
Out[3]:
0    [@a, @b]
1        [@a]
2        [@c]
Name: tweets, dtype: object

I think it would be nice to be able to stack the results:

In [4]: at_mentions.apply(pd.Series).stack()
Out[4]:
tweets
1       0    @a
        1    @b
2       0    @a
3       0    @c
dtype: object

see this SO answer

This could be implemented directly like that, or perhaps made more efficient. Thoughts?
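For reference, here is a minimal self-contained sketch of the workaround described above. The level names `tweet` and `match` are only illustrative, and the explicit `dropna()` is an assumption added so the padding values are removed regardless of how the installed pandas version handles missing values in `stack()`.

```python
import pandas as pd

df = pd.DataFrame([["@a @b"], ["@a"], ["@c"]], columns=["tweets"])

# One list of matches per row.
at_mentions = df["tweets"].str.findall(r"@[a-zA-Z0-9_]+")

# Spread each list across columns (shorter lists are padded with NaN),
# then stack the columns into an inner index level.
wide = at_mentions.apply(pd.Series)
stacked = wide.stack().dropna()

# Illustrative names: outer level = original row, inner level = match position.
stacked = stacked.rename_axis(["tweet", "match"])
print(stacked)
```

Note that `apply(pd.Series)` builds one Series per row, so this is likely to be slow on large inputs.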


jreback · 2013-08-01T12:27:33Z

I actually like this (as the default)

In [7]: at_mentions.apply(lambda x: pd.Series(1, index=x)).fillna(0)
Out[7]: 
   @a  @b  @c
0   1   1   0
1   1   0   0
2   0   0   1
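A possible alternative sketch for the same indicator table, not taken from this thread: join each list with a separator and let `str.get_dummies` split it again. It assumes the separator `|` can never occur inside a match, which holds for the regex used here.

```python
import pandas as pd

tweets = pd.Series(["@a @b", "@a", "@c"], name="tweets")
at_mentions = tweets.str.findall(r"@[a-zA-Z0-9_]+")

# Join each list into one "|"-separated string, then expand it into
# one indicator column per distinct mention.
dummies = at_mentions.str.join("|").str.get_dummies(sep="|")
print(dummies)
```

Unlike the per-row `Series` construction, this does not fail when a mention is repeated within one tweet; the repeated mention simply yields a single 1.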

cpcloud · 2013-08-01T12:29:05Z

clever....that's the sweet dummy variable hack

jreback · 2013-09-28T17:26:33Z

related is #4685, pushing to 0.14

mroeschke · 2020-05-03T19:53:58Z

I think explode solves this use case. Happy to reopen if this doesn't solve the original use case.

In [7]: at_mentions.explode()
Out[7]:
0    @a
0    @b
1    @a
2    @c
Name: tweets, dtype: object
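If the two-level index from the original request is still wanted, one way to rebuild it on top of `explode` is sketched below. The level names `tweet` and `match` are illustrative, and the approach assumes the original index has no duplicate labels.

```python
import pandas as pd

tweets = pd.Series(["@a @b", "@a", "@c"], name="tweets")
at_mentions = tweets.str.findall(r"@[a-zA-Z0-9_]+")

# explode() repeats the original row label once per list element.
exploded = at_mentions.explode()

# Number the matches within each original row: 0, 1, ... per row label.
position = exploded.groupby(level=0).cumcount()

# Combine row label and match position into a MultiIndex.
new_index = pd.MultiIndex.from_arrays(
    [exploded.index, position], names=["tweet", "match"]
)
stacked = exploded.set_axis(new_index)
print(stacked)
```

Rows whose list is empty come out of `explode()` as a single NaN entry, so a `dropna()` may be needed when some tweets contain no mentions.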

jreback modified the milestones: 0.15.0, 0.14.0 Mar 19, 2014

jreback modified the milestones: 0.16.0, Next Major Release Mar 3, 2015

datapythonista modified the milestones: Contributions Welcome, Someday Jul 8, 2018

mroeschke closed this as completed May 3, 2020
