PERF: Release GIL on some datetime ops #11263

Merged: jreback merged 1 commit into pandas-dev:master from chris-b1:tslib-gil on Oct 17, 2015

chris-b1 commented Oct 8, 2015

This is a WIP, but far enough along I thought I'd share and see if the approach was reasonable.

This releases the GIL on most vectorized field accessors (e.g. dt.year) and on conversion to and from Period. There may be other places it could be done; it would obviously be nice for parsing, but I'm not sure that's possible.

jreback commented Oct 8, 2015

ohh nice!

can u share some timings?

chris-b1 commented Oct 8, 2015

Here are some timings. I'm getting a pretty nice speedup: roughly 3.3x with 4 threads (25.8 ms vs. 7.71 ms). In the single-threaded case, things look about flat.

In [1]: import pandas as pd
   ...: from pandas.util.testing import test_parallel
In [2]: dti = pd.date_range('1900-1-1', periods=100000)

In [3]: def f():
   ...:     for i in range(4):
   ...:         dti.year
In [4]: @test_parallel(4)
   ...: def g():
   ...:     dti.year

In [8]: %timeit f()
10 loops, best of 3: 25.8 ms per loop

In [9]: %timeit g()
100 loops, best of 3: 7.71 ms per loop
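
For readers unfamiliar with the test_parallel decorator used in the session above, the following is a minimal sketch of what a decorator of that kind might do: start the wrapped function in N threads and wait for all of them to finish. The name run_in_threads and the record_call workload are hypothetical, used only for illustration; this is not pandas' implementation. With such a decorator, g() above only finishes faster than f() if the wrapped operation releases the GIL.

```python
import threading
from functools import wraps


def run_in_threads(num_threads):
    """Hypothetical stand-in for a test_parallel-style decorator."""
    def decorator(func):
        @wraps(func)
        def inner(*args, **kwargs):
            # One thread per requested worker, all calling the same function.
            threads = [
                threading.Thread(target=func, args=args, kwargs=kwargs)
                for _ in range(num_threads)
            ]
            for t in threads:
                t.start()
            for t in threads:
                t.join()  # block until every thread has finished
        return inner
    return decorator


calls = []
calls_lock = threading.Lock()


@run_in_threads(4)
def record_call():
    # Placeholder workload (an assumption): just record which thread ran.
    with calls_lock:
        calls.append(threading.current_thread().name)


record_call()
print(len(calls))  # 4: the function body ran once per thread
```

Swapping the placeholder workload for something like dti.year would reproduce the shape of the g() benchmark, assuming pandas is installed.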
pandas/tslib.pyx (outdated diff)
@@ -3849,6 +3849,7 @@ def get_time_micros(ndarray[int64_t] dtindex):
@cython.wraparound(False)
@cython.boundscheck(False)
def get_date_field(ndarray[int64_t] dtindex, object field):

kawochen commented Oct 15, 2015

If you declared field as char[:] instead, would you be able to nogil the whole thing until the raise?

chris-b1 commented Oct 16, 2015

Hmm, I tried that out, but Cython doesn't seem to take a view of strings like that? http://stackoverflow.com/questions/28203670/how-to-use-cython-typed-memoryviews-to-accept-strings-from-python
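
One plausible explanation, offered as an assumption rather than something stated in the thread: a typed memoryview such as char[:] asks the object for a writable buffer, while Python string types either expose no buffer at all (str on Python 3) or only a read-only one (bytes). The plain-Python sketch below shows that difference using the built-in memoryview; the helper buffer_info is hypothetical and does not involve Cython.

```python
def buffer_info(obj):
    """Report whether obj exposes a buffer and, if so, whether it is writable."""
    try:
        view = memoryview(obj)
    except TypeError:
        # The object does not implement the buffer protocol at all.
        return "no buffer"
    return "read-only" if view.readonly else "writable"


print(buffer_info("year"))              # no buffer
print(buffer_info(b"year"))             # read-only
print(buffer_info(bytearray(b"year")))  # writable
```

Under that assumption, only a mutable object such as a bytearray would satisfy a writable char[:] request.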

@jreback jreback added this to the 0.17.1 milestone Oct 16, 2015

jreback commented Oct 16, 2015

@chris-b1 looks good. Can you add a whatsnew note (perf) and squash?

@chris-b1 chris-b1 changed the title from (WIP) PERF: Release GIL on some datetime ops to PERF: Release GIL on some datetime ops Oct 16, 2015

chris-b1 commented Oct 17, 2015

@jreback - updated

jreback added a commit that referenced this pull request Oct 17, 2015

Merge pull request #11263 from chris-b1/tslib-gil
PERF: Release GIL on some datetime ops

@jreback jreback merged commit 7e5b223 into pandas-dev:master Oct 17, 2015

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed

jreback commented Oct 17, 2015

thanks!

jreback commented Oct 20, 2015

@chris-b1 can you add boundscheck(False) for these? (Do a clean, then make again to see the warnings.)

warning: pandas/src/period.pyx:144:24: Use boundscheck(False) for faster access
warning: pandas/src/period.pyx:145:23: Use boundscheck(False) for faster access
warning: pandas/src/period.pyx:147:55: Use boundscheck(False) for faster access
warning: pandas/src/period.pyx:148:19: Use boundscheck(False) for faster access
warning: pandas/src/period.pyx:169:24: Use boundscheck(False) for faster access
warning: pandas/src/period.pyx:170:19: Use boundscheck(False) for faster access
warning: pandas/src/period.pyx:172:15: Use boundscheck(False) for faster access
warning: pandas/src/period.pyx:172:53: Use boundscheck(False) for faster access
building 'pandas._period' extension

@chris-b1 chris-b1 deleted the chris-b1:tslib-gil branch Oct 21, 2015
