Vectorize pandas.Series.apply #355

Closed
wants to merge 5 commits into
from

Contributor

aman-thakral commented Nov 10, 2011

When profiling my code, I noticed that numpy.vectorize took about 25% less time than pandas.Series.apply.

Execution times:
numpy.vectorize: 3.3 seconds
pandas.Series.apply: 4.4 seconds
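
A comparison like the one above could be reproduced with a sketch along these lines. The function `scale` and the input size are hypothetical, since the thread does not show the profiled code, and the timings will vary by machine and by pandas and NumPy version.

```python
import timeit

import numpy as np
import pandas as pd


def scale(x):
    # Hypothetical element-wise function; the profiled function is not given in the thread.
    return x * 2.0 + 1.0


s = pd.Series(np.arange(10_000, dtype=float))

# Path 1: pandas' own element-wise application.
via_apply = s.apply(scale)

# Path 2: wrap the function with np.vectorize, call it on the raw values,
# and rebuild a Series with the original index.
via_vectorize = pd.Series(np.vectorize(scale)(s.values), index=s.index)

# Time both paths over a few repetitions.
t_apply = timeit.timeit(lambda: s.apply(scale), number=5)
t_vectorize = timeit.timeit(lambda: np.vectorize(scale)(s.values), number=5)
print(f"Series.apply: {t_apply:.4f} s, np.vectorize: {t_vectorize:.4f} s")
```

Both paths should produce the same values; only the per-element call overhead differs.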

Contributor

aman-thakral commented on 8264a12 Oct 18, 2011

fixed #251

Owner

wesm commented Nov 10, 2011

Hey, be careful to add features in branches rather than on master :) I will take a look at the apply patch; it looks like a no-brainer. You may want to reset your fork to match wesm/master exactly, since you've got a couple of merge commits.

Contributor

aman-thakral commented Nov 10, 2011

Resetting my fork is definitely a good idea. I guess the same reasoning could also apply to Series.map, unless you have some use-case distinction between map and apply in mind (besides map also accepting a dict). It would be good to document that distinction as well :)
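
The dict distinction mentioned above can be illustrated with a minimal sketch. It assumes the behavior of current pandas releases, which may differ from the 2011 code under discussion, and the sample data is made up.

```python
import pandas as pd

s = pd.Series(["cat", "dog", "cat"])

# map accepts a dict and looks each value up as a key.
mapped_by_dict = s.map({"cat": "feline", "dog": "canine"})

# Both map and apply accept a callable and call it on each element.
mapped_by_func = s.map(str.upper)
applied = s.apply(str.upper)

print(mapped_by_dict.tolist())
print(mapped_by_func.tolist())
print(applied.tolist())
```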

Owner

wesm commented Nov 11, 2011

np.vectorize has some string-handling problems: changing Series.map and Series.apply to use it caused unit test failures. I'll patch this separately very quickly (maybe by writing a Cython function) and make sure the speed is comparable.
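
The comment above does not say which string problems were hit. One documented behavior that may be related is that, without `otypes`, np.vectorize picks the output type by calling the function on the first input element. The sketch below is not the fix described in this thread. It only shows the `otypes=[object]` option, which keeps the returned Python objects as they are. The function `describe` is hypothetical.

```python
import numpy as np


def describe(x):
    # Hypothetical function returning strings of different lengths.
    return "small" if x < 2 else "considerably larger"


values = np.array([1, 2, 3])

# otypes=[object] skips output-type inference and stores the returned
# Python objects unchanged in an object array.
as_objects = np.vectorize(describe, otypes=[object])(values)

print(as_objects.dtype)
print(as_objects.tolist())
```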

Owner

wesm commented Nov 11, 2011

See the commit above: I added a new Cython function that's about 25% faster than np.vectorize and gets the types right. Thanks for pointing this out!

wesm closed this Nov 11, 2011
