
Exhausted heap space: Out of Memory error with remote-terminated long-running rpc call #458

Closed
matejkramny opened this issue Mar 13, 2021 · 7 comments

Comments

@matejkramny

matejkramny commented Mar 13, 2021

Hello,

I've isolated an issue with this library, although I am unable to debug it and find the precise cause. I would appreciate help with this.
The issue is with a long-running RPC which, if terminated by the remote end (e.g. the server is killed), causes the following error message:

E/DartVM  ( 4833): Exhausted heap space, trying to allocate 1752460688 bytes.
[UpdateStream] [FINE] gRPC Error (code: 2, codeName: UNKNOWN, message: Out of Memory, details: null, rawResponse: null)

Library: grpc 2.9.0 on Flutter mobile.

Repro steps

Set up a bidirectional or server stream (long-running); a minimal client-side sketch is included after the results below.

  1. Connect via the stream
  2. Kill the server
  3. Sometimes the correct error (Stream was terminated by peer) is thrown; otherwise it goes straight to step 6.
  4. Restart the connection
  5. Kill the server
  6. The Out of Memory error is thrown

Expected result: no heap-space exhaustion errors

Actual result: the Out of Memory error shown above
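
For reference, here's a minimal sketch of the kind of listen loop I'm running. The protoc-generated stub and request types are omitted, so the stream is taken as a parameter; in the real code it comes from something like stub.updateStream(request) (placeholder names):

import 'package:grpc/grpc.dart';

// Sketch of the long-running server-stream call from the repro steps above.
// `updates` stands in for the stream returned by a generated stub method.
Future<void> listenForUpdates(ResponseStream<dynamic> updates) async {
  try {
    await for (final update in updates) {
      print(update); // handle each streamed message
    }
  } on GrpcError catch (e) {
    // Expected when the server is killed: "Stream was terminated by peer".
    // Actually seen after a kill / restart / kill cycle: code UNKNOWN,
    // message "Out of Memory" (the log quoted above).
    print('stream ended: $e');
  }
}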

Details

I've tried inspecting memory in the Dart debugger when the heap-space exhaustion happens, but I cannot find anything relevant.
Tracing the thrown error leads to the _terminate function in client/call.dart (but at line 0?). Dart cannot resolve the line number, and further frames up the stack are not shown, probably due to the stack-trace size limit or because Dart cannot locate the frames across Futures.

What is the best way to debug this and find the cause?

I would love to help via a PR, but I need to find the issue first.

@mraleph
Member

mraleph commented Mar 15, 2021

The debugger is not notified of OOM exceptions because it might OOM itself.

Try running your Dart server with --print_stacktrace_at_throw passed to it. It should print a stack trace for the OOM.
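
For a standalone Dart entry point that would be something along these lines (bin/main.dart is just a placeholder path):

dart --print_stacktrace_at_throw bin/main.dart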

@matejkramny
Author

The error is happening on a Dart client, sorry, I should have specified. The server is using a Go library.

Let me try to make a demo for you which can reproduce the issue.

@mraleph
Member

mraleph commented Mar 15, 2021 via email

@matejkramny
Author

Ah right, sorry, I misunderstood. I will try that and report back. Thank you.

@matejkramny
Author

matejkramny commented Mar 16, 2021

I've got something, but I'm not sure what I'm looking at.

[Screenshots attached: Screenshot_20210316_095055, Screenshot_20210316_094434]

@mraleph
Member

mraleph commented Mar 16, 2021

This looks like we are receiving some garbage from the server, which the client then misinterprets as a gRPC message. [60, 104, 116, 109, 108] is <html in ASCII.
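
That also explains the size of the failed allocation: a gRPC message frame is a 1-byte compressed flag followed by a 4-byte big-endian length, so when the body actually starts with <html, the four bytes of "html" get parsed as the length. A quick sketch in plain Dart (not library code) shows the arithmetic:

import 'dart:convert';
import 'dart:typed_data';

void main() {
  // The first bytes of an HTML error page instead of a gRPC frame.
  final bytes = Uint8List.fromList(ascii.encode('<html>'));
  final flag = bytes[0]; // 60, i.e. '<', read as the "compressed" flag
  // The next four bytes, "html", are read as a big-endian message length.
  final length = ByteData.sublistView(bytes, 1, 5).getUint32(0);
  print('flag=$flag length=$length'); // flag=60 length=1752460652
}

1752460652 bytes is the ~1.7 GB allocation from the original error message.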

You could try to patch the package by adding the following code before the Uint8List(...) allocation:

if (dataLength == 1752460652) {
  print(utf8.decode(chunkData));
}

This would print the content that the server sends back. I guess that when the server is shutting down, it takes the gRPC handler down before taking down the HTTP server, and then you receive a normal 404 back or something like that.

I can actually reproduce this behavior if I make a normal HTTP/2 server (not gRPC capable) and then try to use a gRPC client to talk to it.
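
Roughly along these lines, using package:http2 directly (adapted from that package's README server example; a gRPC client needs ChannelCredentials.insecure() to connect to this plaintext server):

import 'dart:convert';
import 'dart:io';

import 'package:http2/http2.dart';

// A plain HTTP/2 server that is not gRPC-aware: it answers every stream
// with an HTML body, roughly what a proxy error page looks like to a
// gRPC client.
Future<void> main() async {
  final server = await ServerSocket.bind('localhost', 8080);
  server.listen((socket) async {
    final connection = ServerTransportConnection.viaSocket(socket);
    await for (final stream in connection.incomingStreams) {
      stream.sendHeaders([
        Header.ascii(':status', '503'),
        Header.ascii('content-type', 'text/html'),
      ]);
      stream.sendData(utf8.encode('<html><body>503</body></html>'),
          endStream: true);
    }
  });
}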

There is no check in _GrpcMessageConversionSink that validates that the stream we are getting actually follows the gRPC protocol; it just assumes that it does.

I guess we could put something like this in:

diff --git a/lib/src/shared/streams.dart b/lib/src/shared/streams.dart
index cfd5869..aa6aff1 100644
--- a/lib/src/shared/streams.dart
+++ b/lib/src/shared/streams.dart
@@ -117,7 +117,10 @@ class _GrpcMessageConversionSink extends ChunkedConversionSink<StreamMessage> {
       // TODO(jakobr): Handle duplicate header names correctly.
       headers[ascii.decode(header.name)] = ascii.decode(header.value);
     }
-    // TODO(jakobr): Check :status, go to error mode if not 2xx.
+    final contentType = headers['content-type'];
+    if (contentType == null || !contentType.startsWith('application/grpc')) {
+      throw GrpcError.unimplemented('Received non-gRPC message');
+    }
     _out.add(GrpcMetadata(headers));
   }

@matejkramny
Author

Thanks, that's very helpful!
It turns out the remote nginx proxy was returning HTTP 503 and disregarding the gRPC protocol, causing this issue.

1752460652 bytes is ~1.7 GB. I'm not sure what a sane default would be, though.
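
For context, grpc-go and grpc-java default their maximum receive message size to 4 MiB, so one option would be a guard along those lines. The sketch below is purely illustrative; checkFrameLength is a hypothetical helper, not the package's API:

import 'package:grpc/grpc.dart';

// Hypothetical guard sketched around the 4 MiB default receive limit used
// by other gRPC runtimes; not part of package:grpc, just an illustration
// of one possible "sane default".
const int maxReceiveMessageSize = 4 * 1024 * 1024; // 4 MiB

void checkFrameLength(int dataLength) {
  if (dataLength > maxReceiveMessageSize) {
    throw GrpcError.resourceExhausted(
        'Received message of $dataLength bytes exceeds the '
        '$maxReceiveMessageSize-byte limit');
  }
}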
