Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create App context callback for timer custom #18898

Merged
merged 1 commit into from
Jul 23, 2019

Conversation

pfreixes
Copy link
Collaborator

@pfreixes pfreixes commented Apr 28, 2019

When a timer reaches its deadline and it uses the combination of having
enabled the custom timer and using a completion queue of callback type we
must provide the App context callback at TLS level. Otherwise, a segfault
is produced during the enqueuing process of the application callback.

Resolves partially this issue #18721

What is missing?

Testing, before implementing it I would prefer on having the blessing on the intentionality of that PR and also some advice on how to implement a test for this PR.

I have the impression that test this fix should be doable at unit test level, so having an ad-hoc test into a new file that will basically call the grpc_custom_timer_callback function by mocking a timer->closure function that would try to enqueue something into the application callback context.

Another alternative is not doing any unit test at all considering that other commits related to that issue haven't considered it [1], maybe relying on a - future ? - end2end test?

Further work
If this PR is accepted future PR would be done for all of the custom iomgr files that lack, like tcp_custom or tcp_server_custom, on the application context.

/cc @vjpai and @markdroth happy to hear your comments

[1] 85d76b2

@pfreixes
Copy link
Collaborator Author

ping @markdroth and @vjpai

@vjpai
Copy link
Member

vjpai commented Apr 30, 2019

Do you have a test that fails without this and passes with it? Overall it looks good to me but I'd like to see it tested.

@pfreixes
Copy link
Collaborator Author

pfreixes commented May 1, 2019

@vjpai thanks for your review.

Agree in an ideal world a test would be needed, indeed I have the feeling that this could be covered with a unit test instead of using a general end2end test. But I have the feeling - seeing the current code - that the following pattern is not explicitly tested in any place

void foo(...) {
    grpc_core::ApplicationCallbackExecCtx callback_exec_ctx;
    grpc_core::ExecCtx exec_ctx;
    ....
}

As I was commenting there are other places where this fix needs to be applied, like tcp_custom.cc, most of them related to the usage of a custom iomgr together with a callback completion queue, testing these changes - being only one line - is not gonna be something trivial.

So IMO we have different options:

  1. Not testing it
  2. Cover these changes with unit tests.
  3. Implement a set of end2end tests that implicitly test all of the entry points/callbacks to the grpc library when a custom iomgr instance is used. This is gonna be a none negligible effort, considering that right now the custom iomgr does not have any end2end test. I'm wondering if current test coverage of the custom iomgr relays on other language implementations that have frameworks that are using the custom iomgr, like the gevent one for python [1].

Thoughts?

[1]

grpc_custom_iomgr_init(&gevent_socket_vtable,

@pfreixes
Copy link
Collaborator Author

hi, @vjpai any thoughts on that? Pending questions that I do have:

  1. Do we require a specific test for the change proposed? If so, what strategy is less painful?
  2. Is the CI failing? Seems that the CSharp test are failing.... but seems that it's a DNS issue problem which wouldn't be related with that change
  3. What does it mean having the Mergeable step failing?

@lidizheng lidizheng added lang/core release notes: no Indicates if PR should not be in release notes kokoro:run labels Jul 8, 2019
@lidizheng
Copy link
Contributor

The csharp test fails after re-run. It might signal a true regression.
@jtattermusch Can you take a look at the failed C# test log?

@jtattermusch
Copy link
Contributor

I think the C# issue looks like a pre-existing one:
#18637
https://source.cloud.google.com/results/invocations/bbd7497c-aaf1-4546-88c2-f63df33d9176/targets/github%2Fgrpc%2Fcsharp_windows_opt_native_default_coreclr/tests
@apolcyn might know more.

I tried rerunning the test to see if the test is failing consistently.

@pfreixes
Copy link
Collaborator Author

Seems that now are failing other tests ..... any suggestions on that? does it make sense on rebase this branch with the master one? Theoretically should make no difference ... but happy to do it if you prefer to diagnose a PR that is up to date with master.

@lidizheng
Copy link
Contributor

@pfreixes Rebase this PR is a good idea.

When a timer reaches its deadline and it uses the combination of having
enabled the custom timer and using a completion queue of callback type we
must provide the App context callback at TLS level. Otherwise, a segfault
is produced during the enqueuing process of the application callback.
@pfreixes pfreixes force-pushed the initialize_callbackctx_timer_custom branch from 9d93aaf to 884b68c Compare July 18, 2019 20:59
@pfreixes
Copy link
Collaborator Author

done @lidizheng !

@lidizheng
Copy link
Contributor

The internal presubmit tests is passed. Id for the change -> 258871136.
Our GitHub testing infrastructure seems having a bad time recently...

@lidizheng
Copy link
Contributor

Known failure: #19614 #19669

@lidizheng lidizheng merged commit cabc1bd into grpc:master Jul 23, 2019
@pfreixes
Copy link
Collaborator Author

Thanks for helping me on having this merged!

@lock lock bot locked as resolved and limited conversation to collaborators Oct 22, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
lang/core release notes: no Indicates if PR should not be in release notes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants