Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[C++][FlightRPC] Segfault when sending record batch >2GB #28935

Closed
asfimport opened this issue Jul 2, 2021 · 3 comments
Closed

[C++][FlightRPC] Segfault when sending record batch >2GB #28935

asfimport opened this issue Jul 2, 2021 · 3 comments

Comments

@asfimport
Copy link
Collaborator

When sending a record batch > 2GiB, the server will segfault. Although Flight checks for this case and returns an error, it turns out that gRPC always tries to increment the refcount of the result buffer whether the serialization handler returned successfully or not:

// From gRPC 1.36
Status CallOpSendMessage::SendMessagePtr(const M* message,
                                         WriteOptions options) {
  msg_ = message;
  write_options_ = options;
  // Store the serializer for later since we have access to the message
  serializer_ = [this](const void* message) {
    bool own_buf;
    // TODO(vjpai): Remove the void below when possible
    // The void in the template parameter below should not be needed
    // (since it should be implicit) but is needed due to an observed
    // difference in behavior between clang and gcc for certain internal users
    Status result = SerializationTraits<M, void>::Serialize(
        *static_cast<const M*>(message), send_buf_.bbuf_ptr(), &own_buf);
    if (!own_buf) {
      // XXX(lidavidm): This should perhaps check result.ok(), or Serialize should
      // unconditionally initialize send_buf_
      send_buf_.Duplicate();
    }
    return result;
  };
  return Status();
}

Hence when Flight returns an error without initializing the buffer, we get a segfault.

Originally reported on StackOverflow: https://stackoverflow.com/questions/68230146/pyarrow-flight-do-get-segfault-when-pandas-dataframe-over-3gb

Reporter: David Li / @lidavidm
Assignee: David Li / @lidavidm

PRs and other links:

Note: This issue was originally created as ARROW-13253. Please see the migration documentation for further details.

@asfimport
Copy link
Collaborator Author

David Li / @lidavidm:
Now this is annoying: gRPC fails an assertion later on anyways if the serialization handler raises an error. For Flight to do what it intends, it needs to move the check up. But then we may want to move the computation of the body size up as well, or we'll be duplicating work.

@asfimport
Copy link
Collaborator Author

David Li / @lidavidm:
Linking an upstream bug report.

@asfimport
Copy link
Collaborator Author

yibocai#1:
Issue resolved by pull request 10663
#10663

@asfimport asfimport added this to the 5.0.0 milestone Jan 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants