Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARROW-10487 [FlightRPC][C++] Header-based auth in clients #8724

Conversation

lyndonbauto
Copy link
Contributor

@lyndonbauto lyndonbauto commented Nov 20, 2020

Added support for header based authentication in clients.

  • Added support for base 64 encoded username / password auth to match Java implementation
  • Added bearer token receiving and populating of call options to send back
  • Added unit tests that connects C++ client and a mock C++ server to test authentication

@github-actions
Copy link

@lyndonbauto
Copy link
Contributor Author

@lidavidm
Hi David,
This is currently a work in progress. I was wondering if you could take a look at how this was integrated into the C++ library and let me know if it is okay or if there is a better way to do it? Afterwards I will make any required code changes and fix the code linting/commenting.
Thanks!

@lidavidm lidavidm self-requested a review November 20, 2020 18:49
@@ -51,6 +51,7 @@

#include "arrow/flight/client_auth.h"
#include "arrow/flight/client_middleware.h"
#include "arrow/flight/client_header_auth_middleware.h"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this ordering alphabetical? If so this should be up one...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, was waiting for feedback on design before proper formatting. I will correct this.

@@ -328,7 +332,7 @@ class GrpcClientInterceptorAdapterFactory
: public grpc::experimental::ClientInterceptorFactoryInterface {
public:
GrpcClientInterceptorAdapterFactory(
std::vector<std::shared_ptr<ClientMiddlewareFactory>> middleware)
std::vector<std::shared_ptr<ClientMiddlewareFactory>>& middleware)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be const?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should either be const reference or a pointer.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I'd rather we store a reference to this interceptor factory instead of a reference to the middleware with a mutable pointer back, i.e. I'd rather have this class own the middleware as it currently does, and have FlightClient call a method of this class to add more middleware at runtime.

@@ -371,7 +375,7 @@ class GrpcClientInterceptorAdapterFactory
}

private:
std::vector<std::shared_ptr<ClientMiddlewareFactory>> middleware_;
std::vector<std::shared_ptr<ClientMiddlewareFactory>>& middleware_;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Intention is to keep the reference and not a copy?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be a pointer if not a copy.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was to keep the reference, but you got me thinking more and I think I should just take a copy and then make a function to allow adding to the middleware on the fly, since the issue is that I need to be able to add and remove middleware from the interceptor. I will rejig this.

@@ -963,8 +967,9 @@ class FlightClient::FlightClientImpl {

std::vector<std::unique_ptr<grpc::experimental::ClientInterceptorFactoryInterface>>
interceptors;
middleware = std::move(options.middleware);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Goal was so that middleware was retained in this class so I could push and pop it. I am going to adjust the implementation though.

@@ -993,6 +998,30 @@ class FlightClient::FlightClientImpl {
return Status::OK();
}

Status AuthenticateBasicToken(std::string username, std::string password,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should username/password be passed by const ref?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

Status AuthenticateBasicToken(std::string username, std::string password,
std::pair<std::string, std::string>* bearer_token) {
// Add bearer token factory to middleware so it can intercept the bearer token.
middleware.push_back(std::make_shared<ClientBearerTokenFactory>(bearer_token));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks odd to create the shared pointer after you've passed in the raw pointer....it seems like the method itself should take a shared pointer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The raw pointer is unpopulated, so it's passed to the BearerTokenFactory's constructor, which stores it and populated it when it receives the bearer token. I could make the client pass the whole factory in with the bearer token already inside it, but it's more work and requires they understand what's going on more than they otherwise would need to.

void ClientBearerTokenMiddleware::ReceivedHeaders(
const CallHeaders& incoming_headers) {
// Grab the auth token if one exists.
auto bearer_iter = incoming_headers.find(AUTH_HEADER);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const?

// Check if the value of the auth token starts with the bearer prefix, latch the token.
std::string bearer_val = bearer_iter->second.to_string();
if (bearer_val.size() > BEARER_PREFIX.size()) {
bool hasPrefix = std::equal(bearer_val.begin(), bearer_val.begin() + BEARER_PREFIX.size(), BEARER_PREFIX.begin(),

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like they use snake_case instead of camelCase.

void ClientBearerTokenMiddleware::CallCompleted(const Status& status) { }

void ClientBearerTokenFactory::StartCall(const CallInfo& info, std::unique_ptr<ClientMiddleware>* middleware) {
*middleware = std::unique_ptr<ClientBearerTokenMiddleware>(new ClientBearerTokenMiddleware(bearer_token));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::make_unique?


void ClientBearerTokenMiddleware::CallCompleted(const Status& status) { }

void ClientBearerTokenFactory::StartCall(const CallInfo& info, std::unique_ptr<ClientMiddleware>* middleware) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be better to pass a reference instead of a pointer?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't this is a method of the base class, I think it's done this way to allow you to assign a new unique pointer to it without exposing the other middlewares they are already holding in their vector.

std::string string_format(const std::string& format, const Args... args) {
// Check size requirement for new string and increment by 1 for null terminator.
size_t size = std::snprintf(nullptr, 0, format.c_str(), args ...) + 1;
if(size <= 0){

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: spacing between if (, ){

}

std::string base64_encode(const std::string& input) {
static const std::string base64_chars =

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't exist in the codebase already?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I found it. Will remove this.

@lidavidm
Copy link
Member

lidavidm commented Nov 20, 2020

Note, for formatting, use clang-format; for naming conventions, please follow Google C++ guidelines. See the developer docs.

In particular, use snake_case, use pointers instead of non-const references, and note that constants are named kCamelCase not UPPER_SNAKE_CASE.

#include <cctype>
#include <string>

const std::string AUTH_HEADER = "authorization";

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do these need to be in the header, or should they be only in the CC.
If in the header, I think they're created anew for each compilation unit, and thus should be defined as extern with the actual value defined in the CC to avoid multiple instantiations.

Copy link
Member

@lidavidm lidavidm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some initial comments.

Also:

  • We should have unit tests, not just integration tests
  • If you want this to actually run as part of the integration suite, you'll have to add it to the list of test scenarios and various other places; maybe take a look at the commit:
    0188e45

@@ -328,7 +332,7 @@ class GrpcClientInterceptorAdapterFactory
: public grpc::experimental::ClientInterceptorFactoryInterface {
public:
GrpcClientInterceptorAdapterFactory(
std::vector<std::shared_ptr<ClientMiddlewareFactory>> middleware)
std::vector<std::shared_ptr<ClientMiddlewareFactory>>& middleware)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should either be const reference or a pointer.

@@ -371,7 +375,7 @@ class GrpcClientInterceptorAdapterFactory
}

private:
std::vector<std::shared_ptr<ClientMiddlewareFactory>> middleware_;
std::vector<std::shared_ptr<ClientMiddlewareFactory>>& middleware_;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be a pointer if not a copy.

@@ -328,7 +332,7 @@ class GrpcClientInterceptorAdapterFactory
: public grpc::experimental::ClientInterceptorFactoryInterface {
public:
GrpcClientInterceptorAdapterFactory(
std::vector<std::shared_ptr<ClientMiddlewareFactory>> middleware)
std::vector<std::shared_ptr<ClientMiddlewareFactory>>& middleware)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I'd rather we store a reference to this interceptor factory instead of a reference to the middleware with a mutable pointer back, i.e. I'd rather have this class own the middleware as it currently does, and have FlightClient call a method of this class to add more middleware at runtime.

@@ -65,6 +65,9 @@ class ARROW_FLIGHT_EXPORT FlightCallOptions {

/// \brief IPC writer options, if applicable for the call.
ipc::IpcWriteOptions write_options;

/// \brief Metadata for client to add to context.
std::vector<std::pair<std::string, std::string>> metadata;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we call it headers elsewhere, so this should stay consistent with that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will correct this.

@@ -191,6 +194,14 @@ class ARROW_FLIGHT_EXPORT FlightClient {
Status Authenticate(const FlightCallOptions& options,
std::unique_ptr<ClientAuthHandler> auth_handler);

/// \brief Authenticate to the server using the given handler.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no handler in play here.

#include <cctype>
#include <string>

const std::string AUTH_HEADER = "authorization";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should be named kAuthHeader etc. and should go in the .cc file unless we actually want to expose these to users.

Or alternatively, the other constants are currently in one of the _internal.h headers.

namespace flight {

// TODO: Need to add documentation in this file.
void ARROW_FLIGHT_EXPORT AddBasicAuthHeaders(grpc::ClientContext* context,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is internal - it shouldn't be in a public header. (Ditto for the grpc include.)

@@ -0,0 +1,78 @@
// Licensed to the Apache Software Foundation (ASF) under one
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably be an _internal.h header since I don't think any of this is intended to be directly used outside of the implementation here.

*bearer_token = std::make_pair("", "");
}

template<typename ... Args>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need an entire templated function to concatenate two strings.

// Check size requirement for new string and increment by 1 for null terminator.
size_t size = std::snprintf(nullptr, 0, format.c_str(), args ...) + 1;
if(size <= 0){
throw std::runtime_error("Error during string formatting. Format: '" + format + "'.");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And Arrow disallows exceptions.

@lidavidm
Copy link
Member

By the way, for CMake formatting, please use cmake_format

Copy link
Member

@lidavidm lidavidm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - just two minor comments (one about the interface, one about a test)

@@ -1198,6 +1227,12 @@ Status FlightClient::Authenticate(const FlightCallOptions& options,
return impl_->Authenticate(options, std::move(auth_handler));
}

Status FlightClient::AuthenticateBasicToken(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we make this arrow::Result<std::pair<>>?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I will make this change.

std::pair<std::string, std::string> bearer_token;
// Note: Status intentionally ignored because it requires C++ server implementation of
// header auth. For now it returns an IOError.
arrow::Status status = client_->AuthenticateBasicToken(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the intent is to test that this fails, we can ASSERT_RAISES(IOError, status) - even if we had an implementation of header auth, we'd want this to fail, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, this was an artifact of my initial test implementation which was incorrect on the test server, and always caused an error. I forgot to remove it from the negative test once I fixed it. I will correct this.

[2] Switched return type of AuthenticateBasicToken to use arrow::Result instead of Status and removed bearer token from parameter list
@lyndonbauto
Copy link
Contributor Author

@lidavidm Hey David, I have made the requested changes. I got a random Javascript failure in one of the CI teardowns, not sure if this is accepted or if you can restart the CI build (I do not have permission to restart it myself).

@lyndonbauto
Copy link
Contributor Author

@lidavidm Hey David, I have made the requested changes. I got a random Javascript failure in one of the CI teardowns, not sure if this is accepted or if you can restart the CI build (I do not have permission to restart it myself).

@lidavidm All tests are now passing 🎉

@lidavidm lidavidm changed the title ARROW-10487 [FlightRPC][C++][Python] Header-based auth in clients ARROW-10487 [FlightRPC][C++] Header-based auth in clients Nov 25, 2020
@lidavidm lidavidm closed this in 65aa527 Nov 25, 2020
@lidavidm
Copy link
Member

Thanks @lyndonb-bq!

GeorgeAp pushed a commit to sirensolutions/arrow that referenced this pull request Jun 7, 2021
Added support for header based authentication in clients.
- Added support for base 64 encoded username / password auth to match Java implementation
- Added bearer token receiving and populating of call options to send back
- Added unit tests that connects C++ client and a mock C++ server to test authentication

Closes apache#8724 from lyndonb-bq/jduo/lyndon/flight-auth-cpp-redesign-client

Authored-by: Lyndon Bauto <lyndonb@bitquilltech.com>
Signed-off-by: David Li <li.davidm96@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants