Description
Describe the bug, including details regarding any error messages, version, and platform.
This is the Cython (.pyx) implementation of FlightClient.authenticate:
arrow/python/pyarrow/_flight.pyx
Line 1440 in 6a28035
1440 def authenticate(self, auth_handler, options: FlightCallOptions = None):
1441 """Authenticate to the server.
1442
1443 Parameters
1444 ----------
1445 auth_handler : ClientAuthHandler
1446 The authentication mechanism to use.
1447 options : FlightCallOptions
1448 Options for this call.
1449 """
1450 cdef:
1451 unique_ptr[CClientAuthHandler] handler
1452 CFlightCallOptions* c_options = FlightCallOptions.unwrap(options)
1453
1454 if not isinstance(auth_handler, ClientAuthHandler):
1455 raise TypeError(
1456 "FlightClient.authenticate takes a ClientAuthHandler, "
1457 "not '{}'".format(type(auth_handler)))
1458 handler.reset((<ClientAuthHandler> auth_handler).to_handler())
1459 with nogil:
1460 check_flight_status(
1461 self.client.get().Authenticate(deref(c_options),
1462 move(handler)))
Note that the object stored in handler is the one returned by to_handler, which is defined later in the same file:
arrow/python/pyarrow/_flight.pyx
Line 2501 in 6a28035
2482 cdef class ClientAuthHandler(_Weakrefable):
2483 """Authentication plugin for a client."""
2484
[...]
2500
2501 cdef PyClientAuthHandler* to_handler(self):
2502 cdef PyClientAuthHandlerVtable vtable
2503 vtable.authenticate = _client_authenticate
2504 vtable.get_token = _get_token
2505 return new PyClientAuthHandler(self, vtable)
The call to Authenticate on line 1461 of _flight.pyx goes to line 860 of grpc_client.cc:
860 Status Authenticate(const FlightCallOptions& options,
861 std::unique_ptr<ClientAuthHandler> auth_handler) override {
862 auth_handler_ = std::move(auth_handler);
863 ClientRpc rpc(options);
864 return AuthenticateInternal(rpc);
865 }
Many calls in the same file use the auth_handler_ member of the struct, which is declared as
std::shared_ptr<ClientAuthHandler> auth_handler_;
and is the member assigned on line 862 above. DoPut is one example:
997 Status DoPut(const FlightCallOptions& options,
998 std::unique_ptr<internal::ClientDataStream>* out) override {
999 using GrpcStream = ::grpc::ClientReaderWriter<pb::FlightData, pb::PutResult>;
1000
1001 auto rpc = std::make_shared<ClientRpc>(options);
1002 RETURN_NOT_OK(rpc->SetToken(auth_handler_.get()));
1003 std::shared_ptr<GrpcStream> stream = stub_->DoPut(&rpc->context);
1004 *out = std::make_unique<GrpcClientPutStream>(std::move(rpc), std::move(stream));
1005 return Status::OK();
1006 }
SetToken, in the same file:
86 /// \brief Add an auth token via an auth handler
87 Status SetToken(ClientAuthHandler* auth_handler) {
88 if (auth_handler) {
89 std::string token;
90 RETURN_NOT_OK(auth_handler->GetToken(&token));
91 context.AddMetadata(kGrpcAuthHeader, token);
92 }
93 return Status::OK();
94 }
There is a race between two threads. One thread calls auth_handler_.get() to obtain a raw pointer from the auth_handler_ shared_ptr and then uses that pointer inside SetToken. Another thread calls Authenticate, which changes the value of auth_handler_ through its assignment operator; that assignment can delete the previously held object. If the deletion happens after the first thread has called auth_handler_.get() but before SetToken uses the raw pointer, SetToken dereferences a pointer to an object that has already been destroyed.
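For illustration, one possible way to close this kind of window is to copy the shared_ptr under a mutex and call GetToken through the copy, so the handler stays alive for the duration of the call even if it is replaced concurrently. The sketch below is a standalone model, not the Arrow code and not a proposed patch: DemoAuthHandler, DemoClient, TokenForCall and RunDemo are hypothetical names made up for this example.

```cpp
#include <atomic>
#include <memory>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for an auth handler; not the Arrow ClientAuthHandler.
struct DemoAuthHandler {
  explicit DemoAuthHandler(std::string token) : token_(std::move(token)) {}
  std::string GetToken() const { return token_; }
  std::string token_;
};

// Hypothetical client that holds a replaceable handler.
class DemoClient {
 public:
  // Replaces the current handler, as Authenticate does in the issue.
  void Authenticate(std::unique_ptr<DemoAuthHandler> handler) {
    std::shared_ptr<DemoAuthHandler> fresh = std::move(handler);
    std::lock_guard<std::mutex> lock(mutex_);
    handler_.swap(fresh);
    // After the swap, 'fresh' owns the old handler. 'lock' is destroyed
    // first, so that reference is dropped only after the mutex is released.
  }

  // Takes a snapshot of the shared_ptr under the lock. The snapshot keeps
  // the handler alive while GetToken runs, even if Authenticate swaps
  // handler_ in another thread in the meantime.
  std::string TokenForCall() const {
    std::shared_ptr<DemoAuthHandler> snapshot;
    {
      std::lock_guard<std::mutex> lock(mutex_);
      snapshot = handler_;
    }
    return snapshot ? snapshot->GetToken() : std::string();
  }

 private:
  mutable std::mutex mutex_;
  std::shared_ptr<DemoAuthHandler> handler_;
};

// Runs 'callers' threads that fetch tokens while the calling thread
// re-authenticates 'iterations' times. Returns how many fetched tokens
// did not have the expected "token-" prefix.
int RunDemo(int callers, int iterations) {
  DemoClient client;
  client.Authenticate(std::make_unique<DemoAuthHandler>("token-0"));
  std::atomic<int> bad{0};
  std::vector<std::thread> threads;
  for (int i = 0; i < callers; ++i) {
    threads.emplace_back([&] {
      for (int j = 0; j < iterations; ++j) {
        if (client.TokenForCall().rfind("token-", 0) != 0) ++bad;
      }
    });
  }
  for (int j = 1; j <= iterations; ++j) {
    client.Authenticate(
        std::make_unique<DemoAuthHandler>("token-" + std::to_string(j)));
  }
  for (auto& t : threads) t.join();
  return bad.load();
}
```

Calling RunDemo(4, 1000) exercises the same shape of workload described in this report (many concurrent calls plus periodic re-authentication) against the model. The design choice here is that readers never touch handler_ without the lock and never use a raw pointer that outlives their own shared_ptr copy.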
My team builds a server and client libraries based on Flight. One of our customers ran into an issue while using hundreds of concurrent sessions to our service. Our client library called FlightClient.authenticate from a timer once every 5 minutes; running that concurrently with hundreds of calls to DoPut triggered the problem. We were able to reproduce the problem by artificially shortening the interval between calls to FlightClient.authenticate to 3 seconds, and we saw the same problem with either concurrent DoPut or DoGet. We saw the problem on pyarrow 16.0.0 on Ubuntu Linux 22.04.
More details about the symptoms we saw are here: deephaven/deephaven-core#5489
Component(s)
FlightRPC