New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Greenlet.add_spawn_callback() and Greenlet.remove_spawn_callback() #1289

Merged
merged 6 commits into from Oct 16, 2018

Conversation

Projects
None yet
3 participants
@1st1
Contributor

1st1 commented Oct 15, 2018

These methods allow to set up code to intercept Greenlet objects
creation. This is useful to implement custom tracing of gevent tasks
and custom variants of gevent.local().

The performance impact is minuscule: one extra fast "if" check in
Greenlet.__init__.

@1st1

This comment has been minimized.

Contributor

1st1 commented Oct 15, 2018

Context: we use gevent heavily at Pinterest. Having this API in gevent would preclude us from maintaining a private fork. We also believe that this functionality is generally useful for any large-scale gevent deployment with custom instrumentation to trace tasks/io and persist task-local data.

@1st1 1st1 force-pushed the 1st1:task_hook branch from 67a7f3a to cad2f8e Oct 15, 2018

@jamadden

This comment has been minimized.

Member

jamadden commented Oct 15, 2018

Hi, and thanks for the contribution!

One question I have is why couldn't this be done with a subclass in your code base? What I mean is, is there someplace in gevent or elsewhere that specifically and only uses a hardcoded Greenlet.spawn or Greenlet() call? gevent generally tries to make using subclasses easy with things like the greenlet_class attribute of a Pool and similar; if there are places that are missing we can work on adding them.

I'm somewhat concerned by the way the reference to self now escapes __init__ potentially before subclass __init__ have run and finished their set up. What if the callbacks were called from start() and start_later() instead?

@1st1

This comment has been minimized.

Contributor

1st1 commented Oct 15, 2018

One question I have is why couldn't this be done with a subclass in your code base? What I mean is, is there someplace in gevent or elsewhere that specifically and only uses a hardcoded Greenlet.spawn or Greenlet() call?

Using Greenlet() directly to create tasks is a documented pattern and we have some examples of it in our (constantly growing) code base. I suspect that it's probably used in some third-party libraries as well. We simply can't be sure that some library that has always been using spawn() wouldn't start using Greenlet() at some point.

Therefore the most reliable option for us was monkey patching of the gevent.Greenlet.__init__ function. That way we were absolutely certain that all gevent.Greenlets (no matter how they were created) are instrumented and we always get complete traces. However monkey patching, as of gevent 1.3.x, no longer works, because Greenlet type is now compiled with Cython and Greenlet.__init__ is now readonly.

FWIW, this use case is one of the reasons why asyncio.set_task_factory() exists.

I'm somewhat concerned by the way the reference to self now escapes init potentially before subclass init have run and finished their set up. What if the callbacks were called from start() and start_later() instead?

Doing that in start() and start_later() would also work!

Show resolved Hide resolved src/gevent/greenlet.py Outdated
Show resolved Hide resolved src/gevent/greenlet.py Outdated
@1st1

This comment has been minimized.

Contributor

1st1 commented Oct 15, 2018

I've updated the PR to run callbacks in Greenlet.start() and Greenlet.start_later(). The CI should also be fixed now. 🤞

1st1 added some commits Oct 15, 2018

Add Greenlet.add_spawn_callback() and Greenlet.remove_spawn_callback()
These methods allow to set up code to intercept Greenlet objects
creation.  This is useful to implement custom tracing of gevent tasks
and custom variants of `gevent.local()`.

The performance impact is minuscule: one extra fast "if" check in
`Greenlet.__init__`.

@1st1 1st1 force-pushed the 1st1:task_hook branch from 89b2baa to 9e8833e Oct 15, 2018

@jamadden

This comment has been minimized.

Member

jamadden commented Oct 15, 2018

I've updated the PR to run callbacks in Greenlet.start() and Greenlet.start_later(). The CI should also be fixed now.

It looks like it actually is doing it from Greenlet.spawn() and Greenlet.spawn_later(). Is that what you meant? That means that Greenlet(a_function).start() will not call the callbacks.

@jamadden

This comment has been minimized.

Member

jamadden commented Oct 15, 2018

That way we were absolutely certain that all gevent.Greenlets (no matter how they were created) are instrumented and we always get complete traces.

Thanks, that makes sense and is pretty much what I was thinking.

@1st1

This comment has been minimized.

Contributor

1st1 commented Oct 15, 2018

It looks like it actually is doing it from Greenlet.spawn() and Greenlet.spawn_later(). Is that what you meant? That means that Greenlet(a_function).start() will not call the callbacks.

Good catch! Yes, I wanted to change start() and start_later() and was confused by similarly looking spawn() and spawn_later(). I've pushed another commit with improved unittests to make sure we test Greenlet().start(), Grenlet.spawn(), and Greenlet.spawn_later().

@1st1 1st1 force-pushed the 1st1:task_hook branch from b343f64 to 5dba8d3 Oct 15, 2018

1st1 added some commits Oct 15, 2018

@1st1

This comment has been minimized.

Contributor

1st1 commented Oct 16, 2018

Travis is green; AppVeyor has one failed test that seems to be unrelated to this PR.

@jamadden

This comment has been minimized.

Member

jamadden commented Oct 16, 2018

I agree the failure is unrelated. Thanks for the PR!

@jamadden jamadden merged commit 19af005 into gevent:master Oct 16, 2018

1 of 2 checks passed

continuous-integration/appveyor/pr AppVeyor build failed
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

jamadden added a commit that referenced this pull request Oct 16, 2018

@1st1

This comment has been minimized.

Contributor

1st1 commented Oct 16, 2018

Thank you!

@1st1 1st1 deleted the 1st1:task_hook branch Oct 16, 2018

@1st1

This comment has been minimized.

Contributor

1st1 commented Nov 19, 2018

Hi @jamadden, is there any ETA for the next release that includes this change?

@jamadden

This comment has been minimized.

Member

jamadden commented Nov 19, 2018

No specific ETA, but hopefully before end-of-year. (I'm pretty slammed right now with other commitments.) I'm also hoping manylinux2010 comes out so I can distribute libuv support in the binary linux wheels.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment