feat(auth): implement SigV4 authentication for REST catalog#616

Open
plusplusjiajia wants to merge 15 commits into apache:main from plusplusjiajia:sigv4

@plusplusjiajia
Member

Implement AWS SigV4 authentication for the REST catalog client, following Java's RESTSigV4AuthManager and RESTSigV4AuthSession.

  • Extend AuthSession::Authenticate() with HTTPRequestContext (method, url, body) for SigV4 request signing
  • Add SigV4AuthSession: delegate-first auth → relocate conflicting Authorization header → sign with AWS SDK
  • Add SigV4AuthManager: wraps delegate AuthManager (default OAuth2), resolves credentials from properties or default chain
  • Body hash matches Java's SignerChecksumParams output: empty body → hex EMPTY_BODY_SHA256; non-empty body → Base64(SHA256(body))

plusplusjiajia force-pushed the sigv4 branch 9 times, most recently from d1c0732 to 4326282 (April 11, 2026 11:01)
Comment on lines +120 to +121
ICEBERG_PRECHECK(delegate_type != AuthProperties::kAuthTypeSigV4,
"Cannot delegate a SigV4 auth manager to another SigV4 auth manager");
Contributor

add delegate_type in error message?

Member Author

Good idea, done.

Comment thread .github/workflows/cpp-linter.yml Outdated
- name: Install dependencies
shell: bash
run: sudo apt-get update && sudo apt-get install -y libcurl4-openssl-dev
run: sudo apt-get update && sudo apt-get install -y libcurl4-openssl-dev ninja-build
Contributor

n00b question why is ninja-build required here?

Member Author

Good question — I added a build step in this PR so the linter can see the SigV4 code (needs compile_commands.json from a real build). I used cmake -G Ninja for speed and to be consistent with the other CI workflows, and Ninja is not preinstalled on ubuntu-24.04, hence the extra ninja-build package. Happy to switch to Make if you'd prefer.

Collaborator

Member Author

Thanks @zhjwpku, you're right: Ninja is pre-installed on the ubuntu-24.04 runner. Dropped ninja-build from the apt install step.

Comment on lines +252 to +260
if (session_token_it != properties.end() && !session_token_it->second.empty()) {
Aws::Auth::AWSCredentials credentials(access_key_it->second.c_str(),
secret_key_it->second.c_str(),
session_token_it->second.c_str());
return std::make_shared<Aws::Auth::SimpleAWSCredentialsProvider>(credentials);
}
Aws::Auth::AWSCredentials credentials(access_key_it->second.c_str(),
secret_key_it->second.c_str());
return std::make_shared<Aws::Auth::SimpleAWSCredentialsProvider>(credentials);
Contributor

Nit: could do only one return if Credentials are created in the conditional statement.

Member Author

Nice catch, done.

Comment on lines +287 to +290
auto it = properties.find(AuthProperties::kSigV4SigningName);
if (it != properties.end() && !it->second.empty()) {
return it->second;
}
Contributor

if(properties.count(AuthProperties::kSigV4SigningName) > 0) {
   // do work
}

might be a little less verbose than

 auto it = properties.find(AuthProperties::kSigV4SigningName);
  if (it != properties.end() && !it->second.empty()) {
    return it->second;
  }

Member Author

Thanks for the suggestion! I chose to keep the !it->second.empty() check on purpose: the intent is for an explicitly empty value (e.g., an env var set to "") to also fall through to the legacy key / default.
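The behavioral difference between the two checks can be sketched as follows. This is only an illustration, not the PR's code: the helper names `GetNonEmptyOr` and `GetPresentOr` are hypothetical, and a plain fallback string stands in for the "legacy key / default" step.

```cpp
#include <string>
#include <unordered_map>

using Properties = std::unordered_map<std::string, std::string>;

// Hypothetical helper mirroring the find() + !empty() pattern: a key that is
// present but set to "" is treated as unset and falls through to `fallback`.
std::string GetNonEmptyOr(const Properties& properties, const std::string& key,
                          const std::string& fallback) {
  auto it = properties.find(key);
  if (it != properties.end() && !it->second.empty()) {
    return it->second;
  }
  return fallback;
}

// Hypothetical helper mirroring the count()-only pattern: a key that is
// present but set to "" is returned as-is, so the fallback is never reached.
std::string GetPresentOr(const Properties& properties, const std::string& key,
                         const std::string& fallback) {
  if (properties.count(key) > 0) {
    return properties.at(key);
  }
  return fallback;
}
```

With a map containing the key mapped to an empty string, the first helper yields the fallback while the second yields the empty string, which is the case the author wants to avoid.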

const TableIdentifier& table,
const std::unordered_map<std::string, std::string>& properties,
std::shared_ptr<AuthSession> parent) {
auto* sigv4_parent = dynamic_cast<SigV4AuthSession*>(parent.get());
Contributor

use checked_pointer_cast instead

Member Author

Done here as well.

Result<std::shared_ptr<AuthSession>> SigV4AuthManager::ContextualSession(
const std::unordered_map<std::string, std::string>& context,
std::shared_ptr<AuthSession> parent) {
auto* sigv4_parent = dynamic_cast<SigV4AuthSession*>(parent.get());
Contributor

use checked_pointer_cast

Member Author

Thanks, that's much better! Done.

std::string signing_region_;
std::string signing_name_;
std::shared_ptr<Aws::Auth::AWSCredentialsProvider> credentials_provider_;
/// Shared signer instance, matching Java's single Aws4Signer per manager.
Contributor

I find this comment a bit confusing especially given that signer_ is a unique pointer that will be destroyed when SigV4AuthSession is destructed

Member Author

You're right, sorry about the confusion — the "shared signer" wording was misleading since signer_ is owned per-session via unique_ptr. I've removed the comment.

Comment thread src/iceberg/test/auth_manager_test.cc Outdated

std::unordered_map<std::string, std::string> headers;
ASSERT_THAT(session_result.value()->Authenticate(headers), IsOk());
ASSERT_THAT(session_result.value()->Authenticate(headers, {}), IsOk());
Contributor

It feels like Authenticate could accept a default value for the second parameter.

Member Author

Good point, thanks! Done

/// - IOError: Network or connection errors when reaching auth server
/// - RestError: HTTP errors from authentication service
virtual Status Authenticate(std::unordered_map<std::string, std::string>& headers) = 0;
virtual Status Authenticate(std::unordered_map<std::string, std::string>& headers,
Contributor

The current design splits the request context into two separate parameters (headers as in-out + HTTPRequestContext as a separate struct).
The Java implementation uses a cleaner "request-in, request-out" pattern where authenticate() receives the full HTTPRequest and returns a new immutable request with auth headers. I'd suggest aligning with Java by introducing an HTTPRequest type that encapsulates method, url, headers, and body together, and changing the signature to:

virtual Result<HTTPRequest> Authenticate(const HTTPRequest& request) = 0;

I'm open for this

Member Author

Thanks @lishuxu! Agreed: aligning with Java's request-in/request-out pattern is the right call. I'll address this in the current PR by introducing an HTTPRequest type (encapsulating method, url, headers, body) and changing the signature to Result<HTTPRequest> Authenticate(const HTTPRequest& request). Will push an update shortly. PTAL when it's ready.
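For readers unfamiliar with the pattern, here is a minimal sketch of "request-in, request-out". It is not the PR's implementation: the field layout of `HTTPRequest` is assumed from the four members named in the thread, `BearerSession` is a hypothetical stand-in for a delegate session, and the `Result` wrapper is omitted so the example stays self-contained.

```cpp
#include <map>
#include <string>
#include <utility>

// Assumed shape of the request type (method, url, headers, body).
struct HTTPRequest {
  std::string method;
  std::string url;
  std::map<std::string, std::string> headers;
  std::string body;
};

// Hypothetical session: takes the request by const reference and returns a
// decorated copy, so the caller's request is never mutated.
class BearerSession {
 public:
  explicit BearerSession(std::string token) : token_(std::move(token)) {}

  HTTPRequest Authenticate(const HTTPRequest& request) const {
    HTTPRequest authenticated = request;  // copy first, then add auth headers
    authenticated.headers["Authorization"] = "Bearer " + token_;
    return authenticated;
  }

 private:
  std::string token_;
};
```

One appeal of this shape for SigV4 is that a signer sees method, URL, headers, and body together, and a wrapping session can pass the delegate's output straight into the signing step.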

// ---- SigV4 AWS credential entries ----

/// AWS region for SigV4 signing.
inline static const std::string kSigV4SigningRegion = "rest.signing-region";
Contributor

We can remove the legacy key kSigV4Region/kSigV4Service

Member Author

Done.

Comment on lines +62 to +64
// ---- SigV4 AWS credential entries ----

/// AWS region for SigV4 signing.
Contributor

The names are self-explanatory. I think we can remove them to keep the code concise.

Member Author

Removed.

inline static const std::string kSigV4DelegateAuthType =
"rest.auth.sigv4.delegate-auth-type";

// ---- SigV4 AWS credential entries ----
Contributor

Nit: the // ---- SigV4 AWS credential entries ---- section header is redundant given the // ---- SigV4 entries ---- block above already covers SigV4 config. Merge them into a single section.

Member Author

Fixed.

}

#ifdef ICEBERG_BUILD_SIGV4
Result<std::unique_ptr<AuthManager>> MakeSigV4AuthManager(
Contributor

MakeSigV4AuthManager is implemented directly in auth_managers.cc, while all other factory functions (MakeNoopAuthManager, MakeBasicAuthManager, MakeOAuth2Manager) are defined in their own translation units and only declared in auth_manager_internal.h. Suggest moving the implementation to sigv4_auth_manager.cc for consistency.

Member Author

Good catch! Done.


#include "iceberg/catalog/rest/auth/auth_manager_internal.h"
#ifdef ICEBERG_BUILD_SIGV4
# include "iceberg/catalog/rest/auth/sigv4_auth_manager.h"
Contributor

Nit: auth_properties.h should come before the #ifdef ICEBERG_BUILD_SIGV4 block to maintain alphabetical include order.

Member Author

Thanks for catching it!

}

{
std::lock_guard<std::mutex> lock(signing_mutex_);
Contributor

The mutex guards signer_->SignRequest() because AWSAuthV4Signer::SignRequest reportedly mutates internal signer state. However, signer_ is per-session (not shared across sessions), so the mutex only matters if the same SigV4AuthSession instance is called concurrently from multiple threads.

In contrast, Java's RESTSigV4AuthManager shares a single Aws4Signer across all sessions — if the Java signer were stateful, it would need synchronization there. It's worth confirming whether AWSAuthV4Signer::SignRequest actually mutates this or just uses local state — if the latter, the mutex can be removed entirely.

Member Author

@lishuxu Good call — I checked the aws-sdk-cpp source (1.11.x, AWSAuthV4Signer) and you're right: for the symmetric SigV4 path we use, SignRequest does not mutate this, so the mutex is unnecessary. I've dropped it.

plusplusjiajia force-pushed the sigv4 branch 2 times, most recently from 8f61d0e to 655be23 (April 15, 2026 04:23)
return std::make_shared<Aws::Auth::DefaultAWSCredentialsProviderChain>();
}

std::string SigV4AuthManager::ResolveSigningRegion(
Contributor

ResolveSigningRegion manually reads AWS_REGION / AWS_DEFAULT_REGION and falls back to "us-east-1". Java delegates to DefaultAwsRegionProviderChain which also covers ~/.aws/config, EC2/ECS instance metadata. The AWS C++ SDK has an equivalent Aws::Config::EC2InstanceProfileConfigLoader and Aws::Environment::GetEnv. Consider using the SDK's built-in region resolution instead of reimplementing a subset of it.

Member Author

@lishuxu Good point. Switched to: return {Aws::Client::ClientConfiguration().region.c_str()};

Status SigV4AuthManager::Close() { return delegate_->Close(); }

Result<std::shared_ptr<Aws::Auth::AWSCredentialsProvider>>
SigV4AuthManager::MakeCredentialsProvider(
Contributor

Java's AwsProperties.restCredentialsProvider() supports loading a custom AwsCredentialsProvider via a class name property. C++ only supports static credentials and the default chain. This is a known gap — worth a // TODO comment for future extensibility.

Member Author

Done

Result<std::shared_ptr<AuthSession>> SigV4AuthManager::ContextualSession(
const std::unordered_map<std::string, std::string>& context,
std::shared_ptr<AuthSession> parent) {
auto sigv4_parent = internal::checked_pointer_cast<SigV4AuthSession>(std::move(parent));
Contributor

checked_pointer_cast in ContextualSession/TableSession compiles to static_pointer_cast in Release builds — a wrong type silently causes UB.

Member Author

You're right. Updated.
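The thread does not say which fix was applied. One common way to avoid the undefined behavior, sketched here under that caveat, is a runtime-checked cast followed by an explicit null test. The class bodies and the `ParentRegion` helper are hypothetical, and `std::optional` stands in for the project's error type.

```cpp
#include <memory>
#include <optional>
#include <string>

// Minimal stand-ins for the real session hierarchy.
struct AuthSession {
  virtual ~AuthSession() = default;
};
struct SigV4AuthSession : AuthSession {
  std::string region = "us-east-1";
};
struct OtherSession : AuthSession {};

// Returns the parent's region, or std::nullopt when `parent` is null or is
// not a SigV4AuthSession. Unlike static_pointer_cast, dynamic_pointer_cast
// yields an empty pointer on a type mismatch in every build mode.
std::optional<std::string> ParentRegion(const std::shared_ptr<AuthSession>& parent) {
  auto sigv4_parent = std::dynamic_pointer_cast<SigV4AuthSession>(parent);
  if (!sigv4_parent) {
    return std::nullopt;  // real code would return an error status here
  }
  return sigv4_parent->region;
}
```

The cost is one RTTI lookup per session creation, which is usually negligible next to credential resolution and signing.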

Comment thread src/iceberg/catalog/rest/http_client.cc Outdated
if (!first) url += "&";
auto ek = EncodeString(k);
auto ev = EncodeString(v);
url += (ek ? *ek : k) + "=" + (ev ? *ev : v);
Contributor

AppendQueryString silently falls back to the raw key/value when EncodeString fails. If encoding fails, the URL passed to Authenticate would differ from what the server receives, causing signature verification to fail. Consider propagating the error instead:
ICEBERG_ASSIGN_OR_RAISE(auto ek, EncodeString(k));
ICEBERG_ASSIGN_OR_RAISE(auto ev, EncodeString(v));
url += ek + "=" + ev;

Member Author

@lishuxu Good catch. Changed AppendQueryString to return Result<std::string>.
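A rough sketch of the error-propagating version follows. It makes several assumptions: `std::optional` replaces the project's `Result` type, `EncodeString` here is a simplified percent-encoder written for the example (it rejects embedded NUL bytes only to have a failure path to demonstrate), and `IsUnreserved` is a hypothetical helper.

```cpp
#include <map>
#include <optional>
#include <string>

// Hypothetical helper: RFC 3986 "unreserved" characters pass through as-is.
bool IsUnreserved(unsigned char c) {
  return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
         (c >= '0' && c <= '9') || c == '-' || c == '_' || c == '.' || c == '~';
}

// Simplified stand-in for the real encoder; fails on embedded NUL bytes.
std::optional<std::string> EncodeString(const std::string& in) {
  static const char kHex[] = "0123456789ABCDEF";
  std::string out;
  for (unsigned char c : in) {
    if (c == '\0') return std::nullopt;  // illustrative failure case
    if (IsUnreserved(c)) {
      out += static_cast<char>(c);
    } else {
      out += '%';
      out += kHex[c >> 4];
      out += kHex[c & 0x0F];
    }
  }
  return out;
}

// base_url must not already contain a query string.
std::optional<std::string> AppendQueryString(
    const std::string& base_url, const std::map<std::string, std::string>& params) {
  if (params.empty()) return base_url;
  std::string url = base_url + "?";
  bool first = true;
  for (const auto& [k, v] : params) {  // std::map iterates in sorted key order
    auto ek = EncodeString(k);
    auto ev = EncodeString(v);
    if (!ek || !ev) return std::nullopt;  // propagate, never fall back to raw text
    if (!first) url += "&";
    url += *ek + "=" + *ev;
    first = false;
  }
  return url;
}
```

Failing early keeps the signed URL and the transmitted URL identical, which is the property the reviewer is asking for.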

Member

wgtmac left a comment

Thanks for adding this! I have just completed the architectural review and didn't fully review the sigv4 manager yet. I have some preliminary questions here:

  • Should we also be compatible with the legacy rest.sigv4-enabled=true config (and others) when creating an auth manager?
  • How is this tested e2e? Any chance to have an integration test?


function(resolve_aws_sdk_dependency)
find_package(AWSSDK REQUIRED COMPONENTS core)
list(APPEND ICEBERG_SYSTEM_DEPENDENCIES AWSSDK)
Member

Here it records only AWSSDK for installed-package dependency discovery, while src/iceberg/catalog/rest/CMakeLists.txt exports aws-cpp-sdk-core in the REST install interface. The generated iceberg-config.cmake can only call find_dependency(AWSSDK) without COMPONENTS core, but AWS SDK’s CMake config loads component packages from AWSSDK_FIND_COMPONENTS. A downstream installed SigV4 build can therefore fail to find/link AWS core unless it happens to be on the default linker path.

I'd suggest to special-case find_dependency(AWSSDK COMPONENTS core) in the iceberg-config.cmake.in or otherwise export the AWS SDK dependency component-aware.

Member Author

Thanks @wgtmac , good point — handled in iceberg-config.cmake.in by special-casing AWSSDK to call find_dependency(AWSSDK COMPONENTS core) so downstream installed builds bring in aws-cpp-sdk-core.

Comment thread src/iceberg/catalog/rest/auth/auth_manager_internal.h
{AuthProperties::kAuthTypeBasic, MakeBasicAuthManager},
{AuthProperties::kAuthTypeOAuth2, MakeOAuth2Manager},
};
#ifdef ICEBERG_BUILD_SIGV4
Member

ditto, we don't need to use this macro here.

Member Author

@wgtmac Done.

Comment thread .github/workflows/cpp-linter.yml Outdated
mkdir build && cd build
cmake .. -G Ninja -DCMAKE_EXPORT_COMPILE_COMMANDS=ON
cmake .. -G Ninja -DCMAKE_EXPORT_COMPILE_COMMANDS=ON \
-DICEBERG_BUILD_SIGV4=ON \
Member

Why do we need changes in the file? Is it because unrecognized headers from sigv4_auth_manager.cc? Does disabling ICEBERG_BUILD_SIGV4 help in this case? I am thinking if we can add a dedicated ci workflow for aws-related stuff like S3 and SigV4

Member Author

@wgtmac Thanks for the suggestion — reverted cpp-linter.yml to the upstream pattern (no AWS SDK install) and added a dedicated sigv4_test.yml that installs aws-cpp-sdk-core via vcpkg and exercises the SigV4 build unit tests.

Comment thread CMakeLists.txt Outdated
option(ICEBERG_BUILD_BUNDLE "Build the battery included library" ON)
option(ICEBERG_BUILD_REST "Build rest catalog client" ON)
option(ICEBERG_BUILD_REST_INTEGRATION_TESTS "Build rest catalog integration tests" OFF)
option(ICEBERG_BUILD_SIGV4 "Build SigV4 authentication support (requires AWS SDK)" OFF)
Member

Please rebase on the latest main branch so we can see the option ICEBERG_S3. I think we should follow the same pattern to name it ICEBERG_SIGV4.

Member Author

@wgtmac Rebased on latest main and renamed ICEBERG_BUILD_SIGV4 → ICEBERG_SIGV4 to match the existing ICEBERG_S3 option.

cpr_params.Add({key, val});
if (params.empty()) return base_url;
std::map<std::string, std::string> sorted(params.begin(), params.end());
std::string url = base_url + "?";
Member

Here we assume that base_url will never contain &, which is true for the REST catalog use case, but HttpClient is an exported class, so it is worth adding a comment to avoid misuse.

Member Author

@wgtmac Thanks! Added a doc comment: base_url must not already contain a query string (? or &).

SigV4AuthSession(
std::shared_ptr<AuthSession> delegate, std::string signing_region,
std::string signing_name,
std::shared_ptr<Aws::Auth::AWSCredentialsProvider> credentials_provider,
Member

Generally it is not a good practice to expose AWS sdk internals like this. We can consider changing to sigv4_auth_manager_internal.h so we don't install it any more. This is also related to my other comment about installed headers.

Member Author

Thanks @wgtmac, good point — done.

namespace {

/// \brief Ensures AWS SDK is initialized exactly once per process.
/// ShutdownAPI is intentionally never called (leak-by-design) to avoid
Member

Is this the recommended approach?

Member Author

@wgtmac Thanks — went with the Arrow / Velox approach. Added a new public header auth/aws_sdk.h with InitializeAwsSdk() / FinalizeAwsSdk() / IsAwsSdkInitialized() / IsAwsSdkFinalized(). Lazy init is preserved as a fallback. Finalize uses a session ref-count (Velox-style) to refuse shutdown while any SigV4AuthSession is alive. ICEBERG_SIGV4=OFF returns NotSupported stubs so callers don't need #ifdef

Comment thread src/iceberg/test/sigv4_auth_test.cc Outdated
Comment on lines +226 to +228
// ---------- Tests ported from Java TestRESTSigV4AuthSession ----------

// Java: authenticateWithoutBody
Member

Suggested change
// ---------- Tests ported from Java TestRESTSigV4AuthSession ----------
// Java: authenticateWithoutBody

Let's remove comments like this and below.

Member Author

@wgtmac Done.

auto delegate_session,
delegate_->TableSession(table, properties, sigv4_parent->delegate()));

auto merged = MergeProperties(sigv4_parent->effective_properties(), properties);
Member

Quick question: is it intentional that table sessions inherit the parent session's effective SigV4 properties, including contextual overrides?

This seems slightly different from Java's current RESTSigV4AuthManager, where tableSession merges table properties with catalogProperties. The C++ precedence of catalog < context < table looks reasonable to me, but if this is deliberate, could we document it or keep the test that makes this behavior explicit?

Member Author

@wgtmac Thanks — agreed cross-runtime consistency wins here. Reverted to Java's behavior:

  • CatalogSession stores catalog_properties_ on the manager.
  • ContextualSession / TableSession both merge from catalog_properties_ + their own props (no parent inheritance for property merging).
  • Dropped the now-unused effective_properties() accessor.
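The merge rule described in the list above (catalog properties as the base, session-specific properties on top) could look roughly like this. The body of `MergeProperties` is an assumption, since the thread only shows its call site; the key "rest.signing-name" in the usage note is likewise illustrative, while "rest.signing-region" comes from the diff.

```cpp
#include <string>
#include <unordered_map>

using Properties = std::unordered_map<std::string, std::string>;

// Assumed semantics: start from `base` and let every entry in `overrides`
// replace the entry with the same key.
Properties MergeProperties(const Properties& base, const Properties& overrides) {
  Properties merged = base;
  for (const auto& [key, value] : overrides) {
    merged[key] = value;  // insert or overwrite
  }
  return merged;
}
```

For example, merging a catalog map holding "rest.signing-region" = "us-east-1" and "rest.signing-name" = "execute-api" with a table map holding "rest.signing-region" = "eu-west-1" keeps the catalog's signing name and takes the table's region. Because both ContextualSession and TableSession start from the catalog map, a contextual override never leaks into a table session.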

@plusplusjiajia
Member Author

@wgtmac Thanks for taking a look!
Legacy rest.sigv4-enabled: Java does carry a deprecation alias here. I'd rather not pull that over into a fresh client — iceberg-cpp has no historical users to preserve, and inheriting a property name Java is already trying to retire feels like the wrong direction. Happy to revisit if there's a concrete cross-runtime config-sharing case.
E2E: Java's TestRESTSigV4AuthSession is also header-only — it checks the headers the signer produces but never sends a real HTTP request. The first 12 of our 21 cases are direct ports, so we're at parity with Java's actual coverage.

plusplusjiajia force-pushed the sigv4 branch 2 times, most recently from 4872fc8 to fe41ba7 (May 17, 2026 00:44)
plusplusjiajia force-pushed the sigv4 branch 3 times, most recently from 6fda557 to aca047e (May 17, 2026 02:40)
@plusplusjiajia
Member Author

Thanks all — all review comments addressed; PTAL @wgtmac @lishuxu @evindj @zhjwpku

std::atomic<LifecycleState> g_state{LifecycleState::kUninitialized};
std::mutex g_lifecycle_mutex;
Aws::SDKOptions g_sdk_options;
std::atomic<size_t> g_active_session_count{0};
Collaborator

nit: we can avoid these global states by wrapping them in a Singleton instance.

Member Author

Thanks @zhjwpku — done, wrapped the four globals into an AwsSdkLifecycle singleton.

Comment thread src/iceberg/catalog/rest/CMakeLists.txt Outdated
rest_util.cc
types.cc)

list(APPEND ICEBERG_REST_SOURCES auth/sigv4_auth_manager.cc)
Collaborator

nit: why not just add this to the above set?

Member Author

@zhjwpku Good nit — done.

endforeach()
endif()

iceberg_install_all_headers(iceberg/catalog/rest)
Collaborator

But a header whose name contains _internal.h won't be installed. Is that what you want?

Member Author

@zhjwpku Yes, intentional. It addresses @wgtmac's earlier comment about not exposing AWS SDK types in the public install: iceberg_install_all_headers auto-skips any header containing internal.

delegate_->TableSession(table, properties, sigv4_parent->delegate()));

auto merged = MergeProperties(catalog_properties_, properties);
return WrapSession(std::move(delegate_session), merged);
Member

Performance & Design Flaw: In the Java implementation, TableSession and ContextualSession directly reuse the Aws4Signer and AWS properties from the parent session. Here, WrapSession is called every time to recreate the RestSigV4Signer and MakeCredentialsProvider. If DefaultAWSCredentialsProviderChain is used, this could trigger expensive EC2 IMDS network requests or file reads on every table session creation, leading to severe performance degradation. Consider reusing the signer from the parent session like Java does.

Result<std::shared_ptr<AuthSession>> SigV4AuthManager::InitSession(
HttpClient& init_client,
const std::unordered_map<std::string, std::string>& properties) {
ICEBERG_RETURN_UNEXPECTED(AwsSdkLifecycle::Instance().EnsureInitialized());
Member

Race Condition: There is no lock protecting the gap between EnsureInitialized() and WrapSession (which creates the SigV4AuthSession and increments active_session_count_). If another thread calls FinalizeAwsSdk() in this brief window, it could successfully shut down the AWS SDK (since count is still 0), and then this thread would proceed to create a session and use the closed SDK, causing a crash. Consider strengthening the lock protection around the session lifecycle and counter increment.
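One possible way to close that window, sketched under assumptions, is to make "ensure initialized" and "register a session" a single critical section. The class below is hypothetical: `SdkLifecycle` and its method names are not from the PR, and the real `Aws::InitAPI` / `Aws::ShutdownAPI` calls are replaced by state changes so the example compiles on its own.

```cpp
#include <cstddef>
#include <mutex>

// Hypothetical lifecycle guard for a process-wide SDK.
class SdkLifecycle {
 public:
  enum class State { kUninitialized, kInitialized, kFinalized };

  // Initializes on first use and registers the session under ONE lock, so
  // Finalize() cannot run between the two steps.
  bool AcquireSession() {
    std::lock_guard<std::mutex> lock(mutex_);
    if (state_ == State::kFinalized) return false;
    if (state_ == State::kUninitialized) {
      state_ = State::kInitialized;  // the SDK init call would go here
    }
    ++active_sessions_;
    return true;
  }

  void ReleaseSession() {
    std::lock_guard<std::mutex> lock(mutex_);
    if (active_sessions_ > 0) --active_sessions_;
  }

  // Refuses to shut down while any session is still registered.
  bool Finalize() {
    std::lock_guard<std::mutex> lock(mutex_);
    if (active_sessions_ > 0) return false;
    state_ = State::kFinalized;  // the SDK shutdown call would go here
    return true;
  }

 private:
  std::mutex mutex_;
  State state_ = State::kUninitialized;
  std::size_t active_sessions_ = 0;
};
```

Since the counter is only touched while the mutex is held, it can be a plain integer in this design, and no atomic memory ordering has to be chosen for it.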

auto sha256 = Aws::Utils::HashingUtils::CalculateSHA256(
Aws::String(delegate_request.body.data(), delegate_request.body.size()));
aws_request->SetHeaderValue("x-amz-content-sha256",
Aws::Utils::HashingUtils::Base64Encode(sha256));
Member

Spec Violation: AWS SigV4 specification strictly requires x-amz-content-sha256 to be a lowercase hexadecimal string, not Base64. Although the comment mentions this aligns with the Java implementation (which might incorrectly output Base64 due to misuse of SignerChecksumParams), blindly matching this bug is problematic. If the backend is a standard AWS service (like API Gateway), the Base64 header will cause signature validation to fail (SignatureDoesNotMatch). This should be fixed to use Hex encoding.
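To make the two encodings concrete: a SHA-256 digest is 32 bytes, so its hex form is always 64 characters, while its Base64 form is 44 characters. The helper below is a standalone sketch of the lowercase hex encoding; `ToLowerHex` is a hypothetical name, and the real code would presumably use the hashing utilities it already depends on.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Encodes raw digest bytes as lowercase hexadecimal, two characters per byte.
std::string ToLowerHex(const std::vector<std::uint8_t>& bytes) {
  static const char kDigits[] = "0123456789abcdef";
  std::string out;
  out.reserve(bytes.size() * 2);
  for (std::uint8_t b : bytes) {
    out += kDigits[b >> 4];    // high nibble
    out += kDigits[b & 0x0F];  // low nibble
  }
  return out;
}
```

The differing lengths give a quick way to tell from a captured request which encoding a client sent in x-amz-content-sha256.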

bool IsFinalized() const { return state_.load() == State::kFinalized; }

void IncrementSessionCount() {
active_session_count_.fetch_add(1, std::memory_order_relaxed);
Member

Concurrency Flaw: Using std::memory_order_relaxed for the atomic counter is unsafe here. In a multi-core environment, relaxed does not provide memory visibility guarantees. The thread executing FinalizeAwsSdk() might read a stale value (e.g., 0) from the CPU cache and prematurely close the SDK while sessions are still active. Consider using std::memory_order_acquire and std::memory_order_release at a minimum.
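A minimal sketch of the suggested orderings, using a hypothetical `SessionCounter` rather than the PR's class. Note that stronger ordering only affects visibility; it does not by itself make "read the count, then shut down" a single atomic step, so a lock or a compare-and-swap on a combined state would still be needed for that.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical session counter; not the PR's actual class.
class SessionCounter {
 public:
  // acq_rel: publishes this thread's prior writes and observes those of others.
  void Increment() { count_.fetch_add(1, std::memory_order_acq_rel); }

  void Decrement() {
    auto previous = count_.fetch_sub(1, std::memory_order_acq_rel);
    assert(previous > 0 && "Decrement without matching Increment");
    (void)previous;  // silences the unused warning when NDEBUG is defined
  }

  // acquire: pairs with the release half of the read-modify-write operations.
  bool HasActiveSessions() const {
    return count_.load(std::memory_order_acquire) > 0;
  }

 private:
  std::atomic<std::int64_t> count_{0};
};
```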

std::string value(aws_value.c_str(), aws_value.size());
for (const auto& [orig_name, orig_value] : original_headers) {
if (StringUtils::EqualsIgnoreCase(orig_name, name) && orig_value != value) {
signed_request.headers[std::string(kRelocatedHeaderPrefix) + orig_name] =

Redundant Logic: Because the earlier logic already relocated the original Authorization header to Original-Authorization in signing_headers, this loop will copy Original-Authorization once when it encounters it in aws_request->GetHeaders(). Later, when it encounters the newly generated Authorization header, this inner loop condition orig_value != value will match again, causing Original-Authorization to be redundantly overwritten with the same value. While functionally harmless, this double-insertion is unnecessary.

// Delegates the full resolution chain (AWS_DEFAULT_REGION / AWS_REGION env,
// ~/.aws/config profile, EC2/ECS IMDS, fallback us-east-1) to the AWS SDK.
// Set AWS_EC2_METADATA_DISABLED=true to skip IMDS on non-EC2 hosts.
return {Aws::Client::ClientConfiguration().region.c_str()};

Can we resolve the default region through the AWS region provider chain here? Java's AwsProperties.restSigningRegion() uses DefaultAwsRegionProviderChain when rest.signing-region is unset. ClientConfiguration().region defaults to us-east-1 in the C++ SDK, so configs that rely on AWS_REGION, AWS_DEFAULT_REGION, or the selected profile would sign requests for the wrong region.
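A rough sketch of what an explicit resolution step could look like. `ResolveSigningRegion` is a hypothetical helper, and the lookup order (explicit property, then `AWS_REGION`, then `AWS_DEFAULT_REGION`) is an assumption for illustration; profile and IMDS lookups, which a full provider chain would include, are omitted.

```cpp
#include <cassert>
#include <cstdlib>
#include <initializer_list>
#include <optional>
#include <string>
#include <unordered_map>

// Hypothetical helper. Assumed order: explicit property, AWS_REGION,
// AWS_DEFAULT_REGION. Returns nullopt instead of silently picking a default.
std::optional<std::string> ResolveSigningRegion(
    const std::unordered_map<std::string, std::string>& properties) {
  if (auto it = properties.find("rest.signing-region");
      it != properties.end() && !it->second.empty()) {
    return it->second;
  }
  for (const char* name : {"AWS_REGION", "AWS_DEFAULT_REGION"}) {
    const char* value = std::getenv(name);
    if (value != nullptr && *value != '\0') return std::string(value);
  }
  return std::nullopt;
}
```

Returning an empty optional lets the caller decide whether to fall back to `us-east-1` or to fail loudly, rather than signing for a region the user never chose.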

{AuthProperties::kAuthTypeNone, MakeNoopAuthManager},
{AuthProperties::kAuthTypeBasic, MakeBasicAuthManager},
{AuthProperties::kAuthTypeOAuth2, MakeOAuth2Manager},
{AuthProperties::kAuthTypeSigV4, MakeSigV4AuthManager},

Could we keep the Java legacy flag here as well? Java still treats rest.sigv4-enabled=true as SigV4 and strips that key before loading the delegate. With this path, a catalog that uses the old property but no rest.auth.type falls through to none/oauth2 and sends unsigned requests.


I think the legacy flag is still needed because we want to be compatible if users have that defined in their table properties.
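The legacy handling described in this thread could be sketched roughly as follows. `ResolveAuthType` is a hypothetical helper; the `"sigv4"` and `"none"` strings and the exact-match comparison against `"true"` are assumptions, and only the two property keys come from the discussion above.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

using Properties = std::unordered_map<std::string, std::string>;

// Hypothetical helper. Returns the effective auth type and, when the legacy
// flag selects SigV4, erases that flag so the delegate never sees it.
std::string ResolveAuthType(Properties& properties) {
  auto legacy = properties.find("rest.sigv4-enabled");
  if (legacy != properties.end() && legacy->second == "true") {
    properties.erase(legacy);
    return "sigv4";  // assumed value of AuthProperties::kAuthTypeSigV4
  }
  auto explicit_type = properties.find("rest.auth.type");
  // Assumed default; the real fallback may choose oauth2 in some cases.
  return explicit_type != properties.end() ? explicit_type->second : "none";
}
```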


Copilot AI left a comment


Pull request overview

This PR adds AWS SigV4 request signing support to the REST catalog client by extending authentication from “mutate headers” to “authenticate a full HTTP request (method/url/headers/body)”, enabling correct SigV4 canonicalization and signing. It also wires optional AWS SDK dependencies into both CMake and Meson builds and introduces dedicated SigV4 unit tests plus CI coverage.

Changes:

  • Introduce HttpRequest + update AuthSession::Authenticate() to return an authenticated request (required for SigV4 signing inputs).
  • Add SigV4AuthManager/SigV4AuthSession with AWS SDK lifecycle helpers, credential resolution, and header relocation behavior.
  • Update build system + CI to optionally enable SigV4 and run SigV4 unit tests when AWS SDK is available.

Reviewed changes

Copilot reviewed 23 out of 23 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| src/iceberg/test/sigv4_auth_test.cc | New unit tests covering SigV4 signing behavior, lifecycle, and header relocation. |
| src/iceberg/test/meson.build | Conditionally adds SigV4 auth test when aws-sdk-core is found. |
| src/iceberg/test/CMakeLists.txt | Adds SigV4 auth test target when ICEBERG_SIGV4 is enabled. |
| src/iceberg/test/auth_manager_test.cc | Updates tests to new Authenticate(HttpRequest) API. |
| src/iceberg/iceberg-config.cmake.in | Ensures exported CMake config properly finds AWSSDK core dependency. |
| src/iceberg/catalog/rest/meson.build | Adds SigV4 source, optional AWS SDK dep, ICEBERG_SIGV4 compile define, installs new headers. |
| src/iceberg/catalog/rest/http_request.h | Introduces public HttpRequest and HttpMethod used by signing/auth. |
| src/iceberg/catalog/rest/http_client.cc | Refactors request preparation to authenticate/sign full requests; adds query-string building. |
| src/iceberg/catalog/rest/endpoint.h | Moves HttpMethod/ToString declaration dependency to http_request.h. |
| src/iceberg/catalog/rest/CMakeLists.txt | Adds SigV4 source, links AWS SDK core when enabled, exports ICEBERG_SIGV4 definition. |
| src/iceberg/catalog/rest/auth/sigv4_auth_manager.cc | Implements SigV4 manager/session, AWS SDK lifecycle, signing and header relocation. |
| src/iceberg/catalog/rest/auth/sigv4_auth_manager_internal.h | Declares SigV4 manager/session types and constants. |
| src/iceberg/catalog/rest/auth/aws_sdk.h | Public API for AWS SDK init/finalize helpers. |
| src/iceberg/catalog/rest/auth/auth_session.h | Updates AuthSession API to authenticate full requests (not just headers). |
| src/iceberg/catalog/rest/auth/auth_session.cc | Updates default session implementation for new request-based API. |
| src/iceberg/catalog/rest/auth/auth_properties.h | Adds SigV4-related properties (region/service/credentials/session token). |
| src/iceberg/catalog/rest/auth/auth_managers.cc | Registers SigV4 auth manager factory in default registry. |
| src/iceberg/catalog/rest/auth/auth_manager_internal.h | Declares SigV4 auth manager factory hook. |
| meson.options | Adds sigv4 Meson feature toggle. |
| CMakeLists.txt | Adds ICEBERG_SIGV4 CMake option. |
| cmake_modules/IcebergThirdpartyToolchain.cmake | Adds AWS SDK dependency resolution when SigV4 is enabled. |
| ci/scripts/build_iceberg.sh | Adds sigv4 parameter and forwards toolchain file for non-Windows builds. |
| .github/workflows/sigv4_test.yml | New workflow building with SigV4 enabled and running unit tests on Ubuntu. |


Comment on lines +142 to +146

EXPECT_NE(headers.find("authorization"), headers.end());
EXPECT_TRUE(headers.at("authorization").starts_with("AWS4-HMAC-SHA256"));
EXPECT_NE(headers.find("original-authorization"), headers.end());
EXPECT_EQ(headers.at("original-authorization"), "Bearer my-oauth-token");
Comment on lines +212 to +213

EXPECT_EQ(headers.find("original-authorization"), headers.end());
Comment on lines +329 to +335
// Relocated delegate header should be in SignedHeaders
EXPECT_TRUE(auth_it->second.find("original-authorization") != std::string::npos)
<< "SignedHeaders should include 'original-authorization', got: "
<< auth_it->second;

// Relocated Authorization present
auto orig_it = headers.find("original-authorization");
Comment on lines +404 to +406

EXPECT_NE(headers.find("authorization"), headers.end());
EXPECT_EQ(headers.find("original-authorization"), headers.end());
cpr::Parameters cpr_params;
for (const auto& [key, val] : params) {
cpr_params.Add({key, val});
if (params.empty()) return base_url;
enum class HttpMethod : uint8_t { kGet, kPost, kPut, kDelete, kHead };

/// \brief Convert HttpMethod to string representation.
constexpr std::string_view ToString(HttpMethod method);
# under the License.

# SigV4 build + unit tests (Linux only; aws-cpp-sdk-core via vcpkg).
name: SigV4 Tests

It seems sigv4_test.yml overlaps with s3_test.yml. Could we merge them into a dedicated aws_test.yml workflow?

Currently, ICEBERG_S3 pulls the AWS SDK via Arrow, while ICEBERG_SIGV4 pulls it via vcpkg. Testing them separately hides a major risk: if a user enables both, mismatched AWS SDK versions could cause severe symbol conflicts or linking errors. By merging them, we can:

  1. Enable both -DICEBERG_S3=ON and -DICEBERG_SIGV4=ON in the same build to ensure they can safely co-exist.
  2. Save CI time by only building and running the related test targets (e.g., cmake --build . --target s3_test sigv4_auth_test followed by ctest -R "s3|sigv4").

Comment on lines +63 to +65
if [[ -n "${CMAKE_TOOLCHAIN_FILE:-}" ]]; then
CMAKE_ARGS+=("-DCMAKE_TOOLCHAIN_FILE=${CMAKE_TOOLCHAIN_FILE}")
fi

Suggested change
if [[ -n "${CMAKE_TOOLCHAIN_FILE:-}" ]]; then
CMAKE_ARGS+=("-DCMAKE_TOOLCHAIN_FILE=${CMAKE_TOOLCHAIN_FILE}")
fi

These lines are not necessary.

endmacro()

# Find system dependencies
# AWSSDK's CMake config dispatches sub-package finds via AWSSDK_FIND_COMPONENTS,

Can we keep the original # Find system dependencies comment?

# so a plain find_dependency would not bring in aws-cpp-sdk-core.
if("AWSSDK" IN_LIST ICEBERG_SYSTEM_DEPENDENCIES)
list(REMOVE_ITEM ICEBERG_SYSTEM_DEPENDENCIES AWSSDK)
include(CMakeFindDependencyMacro)

Suggested change
include(CMakeFindDependencyMacro)

This duplicates line 39 above.

# Find system dependencies
# AWSSDK's CMake config dispatches sub-package finds via AWSSDK_FIND_COMPONENTS,
# so a plain find_dependency would not bring in aws-cpp-sdk-core.
if("AWSSDK" IN_LIST ICEBERG_SYSTEM_DEPENDENCIES)

This special case does not look that elegant. Perhaps we can do the following instead:

In this file src/iceberg/iceberg-config.cmake.in:

  Remove lines 74-80 (the AWSSDK special case and redundant include):

  -# AWSSDK's CMake config dispatches sub-package finds via AWSSDK_FIND_COMPONENTS,
  -# so a plain find_dependency would not bring in aws-cpp-sdk-core.
  -if("AWSSDK" IN_LIST ICEBERG_SYSTEM_DEPENDENCIES)
  -  list(REMOVE_ITEM ICEBERG_SYSTEM_DEPENDENCIES AWSSDK)
  -  include(CMakeFindDependencyMacro)
  -  find_dependency(AWSSDK COMPONENTS core)
  -endif()
   iceberg_find_dependencies("${ICEBERG_SYSTEM_DEPENDENCIES}")

  Modify the foreach inside iceberg_find_dependencies (line 50):

     foreach(dependency ${dependencies})
  -    find_dependency(${dependency})
  +    find_dependency(${dependency} ${ICEBERG_FIND_EXTRA_ARGS_${dependency}})
     endforeach()

In the cmake_modules/IcebergThirdpartyToolchain.cmake

   function(resolve_aws_sdk_dependency)
     find_package(AWSSDK REQUIRED COMPONENTS core)
     list(APPEND ICEBERG_SYSTEM_DEPENDENCIES AWSSDK)
     set(ICEBERG_SYSTEM_DEPENDENCIES
         ${ICEBERG_SYSTEM_DEPENDENCIES}
         PARENT_SCOPE)
  +  set(ICEBERG_FIND_EXTRA_ARGS_AWSSDK "COMPONENTS;core" PARENT_SCOPE)
   endfunction()


AuthManagerRegistry CreateDefaultRegistry() {
return {
AuthManagerRegistry registry = {

Why add this indirection?

ICEBERG_ASSIGN_OR_RAISE(auto delegate_session, delegate_->ContextualSession(
context, sigv4_parent->delegate()));

auto merged = MergeProperties(catalog_properties_, context);

Should this also account for contextual credentials? Java builds the SigV4 AwsProperties from catalog properties plus both context.properties() and context.credentials(). If C++ keeps credentials separate from the context properties map, those credentials won't be able to override the catalog signing credentials here.

struct ICEBERG_REST_EXPORT HttpRequest {
HttpMethod method = HttpMethod::kGet;
std::string url;
std::unordered_map<std::string, std::string> headers;

This loses one bit of behavior that Java's HTTPHeaders keeps: repeated headers. That matters for the SigV4 conflict path, where Java can preserve multiple values for the same name and relocate only the conflicting originals. With an unordered_map, a delegate-provided duplicate header is collapsed before signing, so we can't match Java's updateRequestHeaders behavior in those cases.
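For illustration, a multi-valued header container and a relocation pass over it might look like the sketch below. `HeaderList` and `MergeSignedHeaders` are hypothetical names, the `Original-` prefix is assumed from the `Original-Authorization` header discussed in this PR, and the logic is only an approximation of the Java behavior described above, not a port of it.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical multi-valued header list: insertion order and duplicates kept.
using HeaderList = std::vector<std::pair<std::string, std::string>>;

bool EqualsIgnoreCase(std::string_view a, std::string_view b) {
  return a.size() == b.size() &&
         std::equal(a.begin(), a.end(), b.begin(), [](char x, char y) {
           return std::tolower(static_cast<unsigned char>(x)) ==
                  std::tolower(static_cast<unsigned char>(y));
         });
}

// Returns the signed headers plus every original value that conflicts with the
// signed values of the same name, relocated under "Original-<name>".
HeaderList MergeSignedHeaders(const HeaderList& original,
                              const HeaderList& signed_headers) {
  HeaderList merged = signed_headers;
  for (const auto& [name, value] : original) {
    bool name_signed = false;
    bool value_kept = false;
    for (const auto& [signed_name, signed_value] : signed_headers) {
      if (!EqualsIgnoreCase(name, signed_name)) continue;
      name_signed = true;
      if (signed_value == value) value_kept = true;
    }
    if (name_signed && !value_kept) {
      merged.emplace_back("Original-" + name, value);
    }
  }
  return merged;
}
```

Because duplicates survive in a `HeaderList`, repeated values that the signer left untouched stay in place and only the genuinely conflicting originals are relocated.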


@wgtmac wgtmac left a comment


Reviewed SigV4 core flow against Java reference (RESTSigV4AuthSession / RESTSigV4AuthManager). A few things to address:

auto sha256 = Aws::Utils::HashingUtils::CalculateSHA256(
Aws::String(delegate_request.body.data(), delegate_request.body.size()));
aws_request->SetHeaderValue("x-amz-content-sha256",
Aws::Utils::HashingUtils::Base64Encode(sha256));

This produces Base64-encoded SHA256. The AWS SigV4 spec and Java's SignerChecksumParams(algorithm=SHA256) both produce hex-encoded SHA256 for x-amz-content-sha256. When a C++ client sends a signed request with a body to a Java REST catalog (or any spec-compliant server), signature verification will fail because the canonical request includes this header value.

Use HashingUtils::HexEncode(sha256) instead of Base64Encode.


Thanks @wgtmac, this is intentional Java parity. Iceberg's Java RESTSigV4AuthSession routes through SignerChecksumParams.checksumHeaderName → AbstractAws4Signer.putChecksumHeader → BinaryUtils.toBase64, so Java also produces Base64. The RestSigV4Signer subclass exists because the C++ SDK has no checksumHeaderName equivalent. Changing C++ to hex would break interop with servers verifying via Java's path.


/// Matches Java RESTSigV4AuthSession: canonical headers carry
/// Base64(SHA256(body)), canonical request trailer uses hex.
class RestSigV4Signer : public Aws::Client::AWSAuthV4Signer {

This subclass exists solely to set m_includeSha256HashHeader = false, which is an SDK internal field. Java doesn't need this workaround — it passes the body via contentStreamProvider and lets the signer compute the hash itself through SignerChecksumParams.

Consider doing the same: only manually set x-amz-content-sha256 for the empty-body workaround, and let the signer handle non-empty bodies. That eliminates the SDK-internal dependency.

}

HttpRequest signed_request{.method = delegate_request.method,
.url = std::move(delegate_request.url),

Java relocates Authorization → Original-Authorization before signing (in convertHeaders). The signer never sees the delegate's Authorization.

Here the relocation happens after signing — the AWS signer signs a request that still contains the delegate's Authorization header. It works because the signed value always differs, but the signer is including a stale header in the canonical request. Cleaner to relocate before building the AWS request.
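A minimal sketch of relocating before the AWS request is built, assuming the flat `unordered_map` header type the PR currently uses. `RelocateAuthorization` and `IsAuthorizationName` are hypothetical helpers, not functions from the PR.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <string_view>
#include <unordered_map>
#include <utility>

using Headers = std::unordered_map<std::string, std::string>;

// Case-insensitive match against the header name "authorization".
bool IsAuthorizationName(std::string_view name) {
  constexpr std::string_view kLower = "authorization";
  return name.size() == kLower.size() &&
         std::equal(name.begin(), name.end(), kLower.begin(), [](char c, char l) {
           return std::tolower(static_cast<unsigned char>(c)) == l;
         });
}

// Renames a delegate-provided Authorization header to Original-Authorization.
// Call this before handing the headers to the signer. Returns true if a
// header was relocated.
bool RelocateAuthorization(Headers& headers) {
  auto it = std::find_if(headers.begin(), headers.end(), [](const auto& entry) {
    return IsAuthorizationName(entry.first);
  });
  if (it == headers.end()) return false;
  std::string value = std::move(it->second);
  headers.erase(it);
  headers.insert_or_assign("Original-Authorization", std::move(value));
  return true;
}
```

Run this way, the canonical request contains only `Original-Authorization` from the delegate, and the signer's own `Authorization` header is added afterwards without any conflict to resolve.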

context, sigv4_parent->delegate()));

auto merged = MergeProperties(catalog_properties_, context);
return WrapSession(std::move(delegate_session), merged);

Java merges catalog ∪ (context.properties ∪ context.credentials): SessionContext has two override sources. Here the merge is just catalog ∪ context, which is a flat map. Not necessarily wrong given the C++ API, but worth a comment noting the difference since context.credentials overrides are lost.
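If the C++ context were to carry credentials separately, the three-way merge could be sketched as below. `SigningContext` and `MergeSigningProperties` are hypothetical, and the precedence (credentials over context properties over catalog properties) is an assumption based on the description above.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

using Properties = std::unordered_map<std::string, std::string>;

// Hypothetical context with two override sources, mirroring the Java shape.
struct SigningContext {
  Properties properties;
  Properties credentials;
};

// Later sources override earlier ones: catalog < properties < credentials.
Properties MergeSigningProperties(const Properties& catalog,
                                  const SigningContext& context) {
  Properties merged = catalog;
  for (const auto& [key, value] : context.properties) {
    merged.insert_or_assign(key, value);
  }
  for (const auto& [key, value] : context.credentials) {
    merged.insert_or_assign(key, value);
  }
  return merged;
}
```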

auto manager_result = AuthManagers::Load("test-catalog", properties);
ASSERT_THAT(manager_result, IsOk());

auto catalog_session = manager_result.value()->CatalogSession(client_, properties);

This asserts Base64 length (44 chars), but it should be asserting hex length (64 chars). The test is validating the wrong encoding — it would pass even if the hash value were garbage, as long as it's 44 characters.
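A stricter check than a length comparison could validate the alphabet as well. The helper below is a hypothetical sketch, not part of the PR's test file; a test could go further and compare against a known digest, such as the SHA-256 of the empty string used in the usage note.

```cpp
#include <algorithm>
#include <cassert>
#include <string_view>

// Returns true only for exactly 64 lowercase hexadecimal characters, the
// shape of a hex-encoded SHA-256 digest.
bool IsLowerHexSha256(std::string_view value) {
  return value.size() == 64 &&
         std::all_of(value.begin(), value.end(), [](char c) {
           return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f');
         });
}
```

For example, `IsLowerHexSha256("e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855")` holds for the empty-body digest, while any 44-character Base64 value is rejected.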
