Some of the string methods return a Series of lists (e.g. findall) which isn't that convenient for doing more analysis:
    df = pd.DataFrame([['@a @b'], ['@a'], ['@c']], columns=['tweets'])
    at_mentions = df.tweets.str.findall('@[a-zA-Z0-9_]+')

    In [3]: at_mentions
    Out[3]:
    0    [@a, @b]
    1        [@a]
    2        [@c]
    Name: tweets, dtype: object
I think it would be nice to be able to stack the results:
    In [4]: at_mentions.apply(pd.Series).stack()
    Out[4]:
    tweets
    1       0    @a
            1    @b
    2       0    @a
    3       0    @c
    dtype: object
see this SO answer
We could just do it directly like that, or perhaps make it more efficient. Thoughts?
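For what "more efficient" might look like, here is a rough sketch that builds the stacked result in one pass instead of creating a Series per row. The helper name `stack_lists` is made up for illustration and is not a pandas API; it assumes every entry in the input is a list (no NaN rows).

```python
import pandas as pd


def stack_lists(s):
    """Flatten a Series of lists into a Series indexed by (row label, position).

    Hypothetical helper, not part of pandas. Assumes every value is a list.
    """
    rows, positions, values = [], [], []
    for label, items in s.items():
        for pos, item in enumerate(items):
            rows.append(label)
            positions.append(pos)
            values.append(item)
    index = pd.MultiIndex.from_arrays([rows, positions])
    return pd.Series(values, index=index, dtype=object)


df = pd.DataFrame([['@a @b'], ['@a'], ['@c']], columns=['tweets'])
at_mentions = df.tweets.str.findall('@[a-zA-Z0-9_]+')
stacked = stack_lists(at_mentions)
print(stacked)
```

The outer index level keeps the original row label and the inner level is the position within each list, so the result can still be joined back to `df`.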
I actually like this (as the default)
    In [7]: at_mentions.apply(lambda x: pd.Series(1, index=x)).fillna(0)
    Out[7]:
       @a  @b  @c
    0   1   1   0
    1   1   0   0
    2   0   0   1
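The same indicator table can probably be had without a per-row `apply`, by joining each list into a delimited string and letting `str.get_dummies` split it again. This is only a sketch: it assumes the separator `'|'` never occurs inside a mention, which holds for the regex used above.

```python
import pandas as pd

df = pd.DataFrame([['@a @b'], ['@a'], ['@c']], columns=['tweets'])
at_mentions = df.tweets.str.findall('@[a-zA-Z0-9_]+')

# Join each list into one string ('@a|@b'), then expand it into 0/1 columns.
# Assumption: '|' cannot appear in a mention, so it is safe as a separator.
dummies = at_mentions.str.join('|').str.get_dummies(sep='|')
print(dummies)
```

Unlike the `fillna(0)` version, this yields integer columns directly, one per distinct mention.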
clever....that's the sweet dummy variable hack
related is #4685, pushing to 0.14
I think explode solves this use case. Happy to reopen if this doesn't solve the original use case.
    In [7]: at_mentions.explode()
    Out[7]:
    0    @a
    0    @b
    1    @a
    2    @c
    Name: tweets, dtype: object
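As a quick illustration of the "more analysis" the original post was after, the exploded Series can be counted directly, and the within-row position that `stack()` produced can be recovered with `groupby(...).cumcount()` if it is needed. This is a sketch on the example data from the issue, not a claim about how `explode` is implemented.

```python
import pandas as pd

df = pd.DataFrame([['@a @b'], ['@a'], ['@c']], columns=['tweets'])
at_mentions = df.tweets.str.findall('@[a-zA-Z0-9_]+')

exploded = at_mentions.explode()

# Frequency of each mention across all tweets.
counts = exploded.value_counts()

# Rebuild a (row label, position) index like the stacked output.
position = exploded.groupby(level=0).cumcount()
stacked = exploded.set_axis(
    pd.MultiIndex.from_arrays([exploded.index, position.to_numpy()])
)
print(counts)
print(stacked)
```

Because `explode` repeats the original row label, the result also lines up with `df` for a plain `join` on the index.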