
fix(provider-sdk): use send_request to handle request timeout #1439

Closed
wants to merge 1 commit into main from fix_rpc_client_timeout

Conversation

@Iceber (Member) commented on Feb 4, 2024

Feature or Problem

rpc_client.request always uses the default timeout, so even if the user sets a larger timeout, the call is still limited by the default:

```rust
// Excerpt (abbreviated) from the provider SDK's send path: the caller-supplied
// timeout wraps self.request(..), which applies the default RPC timeout again.
let payload = if let Some(timeout) = timeout {
    match tokio::time::timeout(timeout, self.request(topic, nats_body)).await {
        Err(_) => Err(InvocationError::Timeout),
        Ok(Ok(data)) => Ok(data),
        // ...
```

```rust
// Excerpt (abbreviated) from RpcClient::request: self.timeout is always applied
// via maybe_timeout, so it caps whatever timeout the caller asked for.
pub async fn request(&self, subject: String, payload: Vec<u8>) -> InvocationResult<Vec<u8>> {
    match maybe_timeout(
        self.timeout,
        self.client
            .request(subject, payload.into())
        // ...
```

The fix: use send_request to send the request, so the caller-supplied timeout is no longer capped by the default.

Related Issues

Release Information

Consumer Impact

Testing

Unit Test(s)

Acceptance or Integration

Manual Verification

@Iceber force-pushed the fix_rpc_client_timeout branch 3 times, most recently from f92eded to 18b00bf on February 4, 2024 05:30
@Iceber changed the title from "fix(provider-sdk): request timeout always limited by the default timeout" to "fix(provider-sdk): timeout always limited by the default rpc timeout" on Feb 4, 2024
@Iceber (Member, Author) commented on Feb 4, 2024

It looks like wasmbus-rpc needs to be fixed as well; I'll submit a new PR over there after this one is approved.

@connorsmith256 (Contributor) commented

Hey @Iceber, request is meant to be a low-level function used within the RPC SDK. Using a consistent timeout value is by design: providers receive this timeout value from the host as default_rpc_timeout_ms in the HostData. (This field's type is Option<u64>, but that is for historical reasons. The host always provides a value.)
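A minimal sketch (not the SDK's actual code) of how a provider-side client might turn that HostData field into a concrete Duration; the 2-second fallback for the None case is an arbitrary illustration:

```rust
use std::time::Duration;

// Hypothetical helper: derive the RPC timeout from
// HostData::default_rpc_timeout_ms (milliseconds). The field is Option<u64>
// only for historical reasons, so the fallback below is illustrative.
fn default_rpc_timeout(default_rpc_timeout_ms: Option<u64>) -> Duration {
    default_rpc_timeout_ms
        .map(Duration::from_millis)
        .unwrap_or_else(|| Duration::from_secs(2))
}
```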

request is not meant to be directly called by providers. I think it's a mistake that it is marked as public and that the doc comment says it can be used for general NATS messages, because the client used by this function is configured with an RPC-specific connection + credentials + timeout (CC @thomastaylor312). Providers should call send to send RPC messages.

For providers that need to send messages with a custom timeout, we have send_timeout, which takes a custom timeout argument. This should be used rarely. The only use case I know of is for the HTTP server, which allows setting a custom timeout for handling requests.

I think you did find a bug, however: if a provider calls send_timeout with a timeout of 10s, but the RPC client's inner timeout was set to 2s, the inner future returned by maybe_timeout will resolve after 2s, not waiting the full timeout.
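A self-contained sketch of that interaction (illustrative durations and names, not the SDK's code): the caller's 10s timeout wraps an inner 2s timeout, and the inner one fires first.

```rust
use std::time::Duration;
use tokio::time::{sleep, timeout};

#[tokio::main]
async fn main() {
    // Stand-in for the underlying request; it would complete after 3 seconds.
    let request = async {
        sleep(Duration::from_secs(3)).await;
        "response"
    };

    // Inner wrapper models the RPC client's default timeout (2s).
    let inner = timeout(Duration::from_secs(2), request);

    // Outer wrapper models send_timeout's caller-supplied timeout (10s).
    let outer = timeout(Duration::from_secs(10), inner).await;

    // Prints the inner Elapsed error: the 2s default fired even though the
    // caller allowed 10s, so the effective timeout is always the shorter one.
    println!("{outer:?}");
}
```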

So I think the fix should be to remove the tokio::time::timeout wrappers from the provider SDK. The underlying NATS client already sends requests with its own timeout, so wrapping with tokio can only shorten the effective timeout, never extend it, and extending is the more common use case.

To extend the timeout, we will need to refactor the call to self.client.request (note that the maybe_timeout around self.client.publish does nothing, since publishing should never take more than a few milliseconds). Instead of calling async_nats::Client::request, we will need to use async_nats::Client::send_request and pass a request with a timeout. Here is an example where we did this in the host.
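A rough sketch of that refactor, assuming the async_nats Request builder described above; the function name, arguments, and error handling are illustrative, not the provider SDK's final code:

```rust
use std::time::Duration;
use async_nats::{Client, Request};

// Illustrative only: send an RPC request whose timeout is carried by the NATS
// request itself, so a caller-supplied value is not capped by the client's
// default timeout.
async fn request_with_timeout(
    client: &Client,
    subject: String,
    body: Vec<u8>,
    timeout: Option<Duration>,
) -> Result<async_nats::Message, async_nats::RequestError> {
    let request = Request::new()
        .payload(body.into())
        // None keeps the client's default timeout; Some(..) overrides it.
        .timeout(timeout);
    client.send_request(subject, request).await
}
```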

@Iceber (Member, Author) commented on Feb 5, 2024

> So I think the fix should be to remove the tokio::time::timeout wrappers from the provider SDK. The underlying NATS client already sends requests with its own timeout, so wrapping with tokio can only shorten the effective timeout, never extend it, and extending is the more common use case.
> To extend the timeout, we will need to refactor the call to self.client.request (note that the maybe_timeout around self.client.publish does nothing, since publishing should never take more than a few milliseconds). Instead of calling async_nats::Client::request, we will need to use async_nats::Client::send_request and pass a request with a timeout. Here is an example where we did this in the host.

Yes, it would be more appropriate to use send_request; I'll update this PR.

@Iceber force-pushed the fix_rpc_client_timeout branch 2 times, most recently from 073de22 to dcd2184 on February 5, 2024 09:07
@Iceber marked this pull request as draft on February 5, 2024 09:17
@Iceber force-pushed the fix_rpc_client_timeout branch 3 times, most recently from 99d35a0 to ab7f6e4 on February 5, 2024 15:08
@Iceber marked this pull request as ready for review on February 5, 2024 15:11
@Iceber changed the title from "fix(provider-sdk): timeout always limited by the default rpc timeout" to "fix(provider-sdk): use send_request to handle request timeout" on Feb 5, 2024
@Iceber (Member, Author) commented on Feb 5, 2024

Hi @connorsmith256, I replaced maybe_request with send_request to solve the timeout conflict in RpcClient.
PTAL, thanks!

@connorsmith256 (Contributor) left a comment

LGTM! Thanks for fixing this @Iceber. I'll wait to see if @thomastaylor312 has any other feedback before merging

EDIT: we're blocked on merging currently, as we debug an issue in CI. Stand by 🙏

@connorsmith256 (Contributor) commented

@Iceber would you mind rebasing off main? I think we've resolved the issue in CI, but it's still happening on your branch

@thomastaylor312 (Contributor) left a comment

This looks all good. Merge away @connorsmith256! Thanks for the fix @Iceber

Signed-off-by: Iceber Gu <caiwei95@hotmail.com>
@connorsmith256 (Contributor) commented

@Iceber we finally sorted out the issues with CI. I pushed your commit here and merged it, so I'll close this PR. Thanks again for the fix!
