New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rgw: continuation of the auth rework -- AWSv4 #14885

Merged
merged 69 commits into from Jun 13, 2017

Conversation

Projects
None yet
5 participants
@rzarzynski
Contributor

rzarzynski commented Apr 30, 2017

This PR is the third part of the authentication & authorisation rework. It's focused on extending support for the AWSv4 schema across S3 auth engines.

Many low-level primitives have been dissected as-is from the current code and just encapsulated behind new interfaces. Potential clean-ups (like switching from std::string to boost::string_ref) will be pushed to this PR in separated commits.

The branch has been verified with s3-tests (both in AWSv2 and AWSv4), Tempest and AWS Java SDK (the AWSv4's streaming mode: UploadObjectSingleOperation.java).

TODO:

  • more testing: Keystone & AWSv4, Browser Upload & AWSv4, ...
  • clean-ups.

Depends on: #11179, #14432.

CC: @yehudasa, @mattbenjamin.

qsr = true;
}
if ((now_req < now - RGW_AUTH_GRACE_MINS * 60 ||

This comment has been minimized.

@rzarzynski

rzarzynski Apr 30, 2017

Contributor

It might be possible to unify this expiration check with its AWSv2 counterpart.

return -EINVAL;
}
list<string> auth_list;

This comment has been minimized.

@rzarzynski

rzarzynski Apr 30, 2017

Contributor

Switch to std::vector.

}
}
credential = std::move(kv["Credential"]);

This comment has been minimized.

@rzarzynski

rzarzynski Apr 30, 2017

Contributor

Switch to boost::string_ref.

/* grab access key id */
const size_t pos = credential.find("/");
access_key_id = credential.substr(0, pos);

This comment has been minimized.

@rzarzynski

rzarzynski Apr 30, 2017

Contributor

Switch to boost:string_ref.

This comment has been minimized.

@rzarzynski

rzarzynski May 17, 2017

Contributor

DONE. Switched to boost::string_view for both access_key_id and credential as well.

return true;
}
static inline void aws4_uri_encode(const string& src, string& dst)

This comment has been minimized.

@rzarzynski

rzarzynski Apr 30, 2017

Contributor

Rework the function's signature.

while (getline(cqs, keyval, '&')) {
string key, val;
istringstream kv(keyval);

This comment has been minimized.

@rzarzynski

rzarzynski Apr 30, 2017

Contributor

Rework the parsing.

map<string, string>::iterator last = canonical_qs_map.end();
--last;
for (map<string, string>::iterator it = canonical_qs_map.begin();

This comment has been minimized.

@rzarzynski

rzarzynski Apr 30, 2017

Contributor

Use the range for.

return canonical_hdrs;
}
std::string hash_string_sha256(const char* const data, const int len)

This comment has been minimized.

@rzarzynski

rzarzynski Apr 30, 2017

Contributor

Replace with appropriate variant of calc_hash_256.

string auth_str = http_auth;
#define AWS4_HMAC_SHA256_STR "AWS4-HMAC-SHA256"
#define CREDENTIALS_PREFIX_LEN (sizeof(AWS4_HMAC_SHA256_STR) - 1)

This comment has been minimized.

@rzarzynski

rzarzynski May 1, 2017

Contributor

Make this global, staticand constexpr.

const char *canonical_hdrs, const char *signed_hdrs, const char *request_payload_hash,
string& dest_str)
static std::string assemble_v4_canonical_request(
const char* const method,

This comment has been minimized.

@rzarzynski

rzarzynski May 1, 2017

Contributor

We're losing the information about length here.

string dest;
static std::string rgw_assemble_s3_v4_string_to_sign(
const char* const algorithm,
const char* const request_date,

This comment has been minimized.

@rzarzynski

rzarzynski May 1, 2017

Contributor

The same info drop here.

const std::string& hashed_qr)
{
const auto string_to_sign = \
rgw_assemble_s3_v4_string_to_sign(

This comment has been minimized.

@rzarzynski

rzarzynski May 1, 2017

Contributor

Maybe avoid the rgw_assemble_s3_v4_string_to_sign indirection.

char signature_k[CEPH_CRYPTO_HMACSHA256_DIGESTSIZE];
/* FIXME(rzarzynski): eradicate the reinterpret_cast. */
calc_hmac_sha256(reinterpret_cast<const char*>(signing_key.data()), CEPH_CRYPTO_HMACSHA256_DIGESTSIZE,

This comment has been minimized.

@rzarzynski

rzarzynski May 1, 2017

Contributor

Switch to the new variant of calc_hmac_256.

@@ -576,23 +576,18 @@ int RGWOp::verify_op_mask()
int RGWOp::do_aws4_auth_completion()

This comment has been minimized.

@rzarzynski

rzarzynski May 1, 2017

Contributor

In the future we might want to replace this with a more generic mechanism like RGWOp::complete_auth.

}
return -EINVAL;
return authorize_v2(store, auth_registry, s);

This comment has been minimized.

@rzarzynski

rzarzynski May 1, 2017

Contributor

Drop the _v2 suffix.

/* craft canonical headers */
boost::optional<std::string> canonical_headers = \
rgw::auth::s3::get_v4_canonical_headers(s->info, signed_hdrs, using_qs,
true /* FIXME: force_boto2_compat*/);

This comment has been minimized.

@rzarzynski

rzarzynski May 1, 2017

Contributor

IMPORATANT: force_boto2_compat.

ldout(s->cct, 10) << "delaying v4 auth" << dendl;
/* payload in a single chunk */
switch (s->op_type)

This comment has been minimized.

@rzarzynski

rzarzynski May 1, 2017

Contributor

Move the checks like this one into separated functions.

null_completer_factory);
}
std::tuple<AWSVerAbstractor::access_key_id_t,

This comment has been minimized.

@rzarzynski

rzarzynski May 1, 2017

Contributor

Too many items in the tuple. Maybe replace with a struct?

@rzarzynski

This comment has been minimized.

Contributor

rzarzynski commented May 1, 2017

The Teuthology run (http://pulpito.ceph.com/rzarzynski-2017-04-30_16:48:24-rgw-wip-rgw-auth-rework-cont-3-awsv4---basic-mira/) hasn't found a regression. The failures were caused by:

  • the well-known issue with 100 Continue on Apache (s3tests.functional.test_s3.test_100_continue),
  • the well-known Valgrind problem (strange leak of std::string memory from md_config_t seen in radosgw). Valgrind signalising a non-related leak outside of RadosGW (mons).
rgw::auth::s3::get_v4_signing_key(s->cct, credential_scope, *secret_key);
return rgw::auth::Completer::cmplptr_t(
new AWSv4ComplMulti(s,

This comment has been minimized.

@cbodley

cbodley May 1, 2017

Contributor

make_shared here?

This comment has been minimized.

@rzarzynski

rzarzynski May 1, 2017

Contributor

TODO.

This comment has been minimized.

@rzarzynski

rzarzynski May 4, 2017

Contributor

DONE in the exchange of making the completers' constructors public.

@rzarzynski

This comment has been minimized.

Contributor

rzarzynski commented May 10, 2017

@jmunhoz: could you please take a look?

@jmunhoz

This comment has been minimized.

Contributor

jmunhoz commented May 12, 2017

@rzarzynski sure, I will have that look in the next days

string_to_sign.append(date + "\n");
string_to_sign.append(credential_scope + "\n");
string_to_sign.append(prev_chunk_signature + "\n");
string_to_sign.append(std::string(AWS4_EMPTY_PAYLOAD_HASH) + "\n");

This comment has been minimized.

@rzarzynski

rzarzynski May 12, 2017

Contributor

FIXME.

/* It's expected that Completers would tend to implement many interfaces
* and be used not only in req_state::auth::completer. Ref counting their
* instances woild be helpful. */
typedef std::shared_ptr<Completer> cmplptr_t;

This comment has been minimized.

@cbodley

cbodley May 12, 2017

Contributor

re: the comment about shared vs. unique, where is this pointer shared outside of req_state::completer?

This comment has been minimized.

@rzarzynski

rzarzynski May 15, 2017

Contributor

It's passed to the front-end subsystem as the AWSv4 completers are also filters over IO.

This comment has been minimized.

@cbodley

cbodley May 15, 2017

Contributor

okay, thanks. i was hoping that we could stick with unique_ptr for req_state::completer and rely on the lifetime of req_state to share raw pointers elsewhere, but i see that RGWRestfulIO will outlive req_state. so using shared_ptr is probably easier than allowing filters to be removed from RGWRestfulIO before the req_state goes away

.append("\n")
.append(signed_hdrs)
.append("\n")
.append(request_payload_hash.data(), request_payload_hash.length());

This comment has been minimized.

@cbodley

cbodley May 12, 2017

Contributor

this pattern of chained append()s will likely require multiple (re)allocations. can you calculate the total size and reserve that much in advance?

This comment has been minimized.

@rzarzynski

rzarzynski May 15, 2017

Contributor

Sure! :-)

ldout(cct, 10) << "canonical request = " << canonical_req << dendl;
ldout(cct, 10) << "canonical request hash = "
<< buf_to_hex(canonical_req_hash).data() << dendl;

This comment has been minimized.

@cbodley

cbodley May 12, 2017

Contributor

this doesn't look safe - data() will return a pointer to memory that's destructed before it's passed as an argument to operator<<()

instead, consider writing a stream output operator for sha256_digest_t (or for a struct as_hex that wraps a const sha256_digest_t&, i.e. ldout(cct, 10) << as_hex{canonical_req_hash} << dendl;)

This comment has been minimized.

@rzarzynski

rzarzynski May 15, 2017

Contributor

this doesn't look safe - data() will return a pointer to memory that's destructed before it's passed as an argument to operator<<()

buf_to_hex constructs a temporary object. I believe its lifetime ends at the end of the full-expression. Let's verify that with the following snippet:

#include <iostream>

class PrintingDtor {
public:
  const char* data() const {
    return "PrintingDtor: data() called";
  }

  ~PrintingDtor() {
    std::cout << "PrintingDtor: dtor called" << std::endl;
  }
};

static PrintingDtor lf_time() {
  return PrintingDtor();
}

int main (void) {
  std::cout << "+++ OBJ LIFETIME TEST -- start" << std::endl;
  std::cout << "+++ OBJ LIFETIME TEST: "
                 << lf_time().data() << "; print end" << std::endl;
  std::cout << "+++ OBJ LIFETIME TEST -- end" << std::endl;
}
$ /tmp/templf 
+++ OBJ LIFETIME TEST -- start
+++ OBJ LIFETIME TEST: PrintingDtor: data() called; print end
PrintingDtor: dtor called
+++ OBJ LIFETIME TEST -- end

The binary has been produced by GCC 4.8.4 on AMD64.

This comment has been minimized.

@cbodley

cbodley May 15, 2017

Contributor

thanks, you're right. might still be worth changing as a cleanup/optimization

rgw::auth::s3::get_v4_signature(cct, std::move(signing_key),
string_to_sign);
return std::string(server_signature.data(), server_signature.size() - 1);

This comment has been minimized.

@cbodley

cbodley May 12, 2017

Contributor

i don't see any other callers of rgw::auth::s3::get_v4_signature() (correct me if i'm wrong) - can that just return std::string instead of std::array so you can avoid this conversion?

This comment has been minimized.

@rzarzynski

rzarzynski May 15, 2017

Contributor

Yup, that's the sole user of rgw::auth::s3::get_v4_signature. The idea is switch to sstring here and for rgw::auth::s3::get_v2_signature as the necessary size is known a priori (both for v2 and v4) .

const bool using_qs,
const bool force_boto2_compat)
{
map<string, string> canonical_hdrs_map;
istringstream sh(signedheaders);
istringstream sh(signedheaders.to_string());

This comment has been minimized.

@rzarzynski

rzarzynski May 15, 2017

Contributor

TODO: rework the parsing. Kill istringstream; use the new variant of parse_key_value instead.

This comment has been minimized.

@mattbenjamin
string_to_sign.append(credential_scope + "\n");
string_to_sign.append(prev_chunk_signature + "\n");
string_to_sign.append(std::string(AWS4_EMPTY_PAYLOAD_HASH) + "\n");
string_to_sign.append(date.data(), date.length());

This comment has been minimized.

@rzarzynski

rzarzynski May 15, 2017

Contributor

TODO: don't overuse the identifier.

* issuing dynamic allocations. */
template<std::size_t N = 128>
static inline boost::container::small_vector<char, N>
sview2cstr(const boost::string_view& sv)

This comment has been minimized.

@rzarzynski

rzarzynski May 15, 2017

Contributor

TODO: this might be unnecessary thanks to sstring. Switch to it if possible.

std::string access_key;
std::string signature;
std::string x_amz_algorithm;

This comment has been minimized.

@rzarzynski

rzarzynski May 15, 2017

Contributor

Maybe boost::string_view also here.

@rzarzynski

This comment has been minimized.

Contributor

rzarzynski commented May 18, 2017

jenkins retest this please (Agent went offline during the build)

const auto string_to_sign = std::string()
/* We aren't adding the extra byte for the terminating 0 as it would be
* annihilated by the "-1" coming from hexed_cr_hash. */

This comment has been minimized.

@cbodley

cbodley May 19, 2017

Contributor

note that string::reserve(n) guarantees room for n characters. the implementation will add 1 internally for the null terminator

  std::string a;
  a.reserve(36);
  std::cout << a << '[' << a.capacity() << ']' << std::endl;
  a.append(36, 'a');
  std::cout << a << '[' << a.capacity() << ']' << std::endl;

produces:

[36]
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa[36]

This comment has been minimized.

@rzarzynski

rzarzynski May 20, 2017

Contributor

Thanks for pointing this out and for sharing the knowledge! :-) I've just pushed the amended commit. Also a trivial merge conflict in rgw_op.cc has been resolved.

This comment has been minimized.

@jmunhoz

jmunhoz May 22, 2017

Contributor

@rzarzynski the new code looks good to me (great work!) Bear in mind I am also a bit concerned about the new code revamp, it adds lots of c++ stuff and while it adds great structure, cleanup, etc. I have the feeling we are losing control over simple things as the variable life scope, string lengths, etc.

Maybe we could stick to c++ advantages on the general structure and keeping the string handling and low level stuff under control with simple and explicit allocations in stack/heap where we can count bytes, knowing the pointer/ref is valid, etc. Not sure though. Just a suggestion to avoid possible off-by-one and friends.

This comment has been minimized.

@rzarzynski

rzarzynski May 22, 2017

Contributor

Hello @jmunhoz. Thanks for your review!

I've delegated the very-low-level things to dedicated functions (like the new variants of buf_to_hex, calc_hmac_sha256, parse_key_value, etc.) to avoid mixing them with the AWSv4 logic. Tracking such number of details on the business logic level seems overhelming to me.

The lifetime of objects typed as boost::string_view stays under assumption that req_info outlives the authentication phase.

This comment has been minimized.

@jmunhoz

jmunhoz May 22, 2017

Contributor

@rzarzynski sounds good. thanks!

@@ -635,24 +636,38 @@ namespace rgw {
namespace auth {
namespace s3 {
template<class T>
constexpr const T& max(const T& a, const T& b) {

This comment has been minimized.

@rzarzynski

rzarzynski May 23, 2017

Contributor

Can we move to src/common/backport14.h?

@@ -635,24 +636,38 @@ namespace rgw {
namespace auth {
namespace s3 {
template<class T>
constexpr const T& max(const T& a, const T& b) {
return a < b ? a : b;

This comment has been minimized.

@tchaikov

tchaikov May 23, 2017

Contributor

should be

return a < b ? b : a;

This comment has been minimized.

@rzarzynski

jmunhoz and others added some commits Mar 10, 2017

rgw: aws4: add AWS4 auth support for S3 Post Object API
RGW S3 supports HTTP POST requests so users can upload content directly.

This patch adds AWS v4 to handle form data properly.

Fixes: http://tracker.ceph.com/issues/18800

Signed-off-by: Javier M. Mellid <jmunhoz@igalia.com>
Signed-off-by: Radoslaw Zarzynski <rzarzynski@mirantis.com>
rgw: rework interfaces of AWSv4 helper primitives.
Signed-off-by: Radoslaw Zarzynski <rzarzynski@mirantis.com>
rgw: get_v4_canonical_request_hash doesn't depend on req_state anymore.
In AWSv4 the hash of real, transfered payload IS NOT necessary to form
a Canonical Request, and thus verify a Signature. x-amz-content-sha256
header lets get the information very early -- before seeing first byte
of HTTP body. As a consequence, we can decouple Signature verification
from payload's fingerprint check. Although RadosGW doesn't do that for
now, the situation will definitely change in the future.

Signed-off-by: Radoslaw Zarzynski <rzarzynski@mirantis.com>
rgw: dissect AWSv4's Canonical URI crafting into a separated function.
Signed-off-by: Radoslaw Zarzynski <rzarzynski@mirantis.com>
rgw: dissect AWSv4's Canonical QS crafting into a separated function.
Signed-off-by: Radoslaw Zarzynski <rzarzynski@mirantis.com>
rgw: dissect AWSv4's Canonical Headers crafting into a separated func…
…tion.

Signed-off-by: Radoslaw Zarzynski <rzarzynski@mirantis.com>
rgw: eradicate req_state::http_auth.
Signed-off-by: Radoslaw Zarzynski <rzarzynski@mirantis.com>
rgw: dissect basic AWSv4's credentials parsing into separated function.
Signed-off-by: Radoslaw Zarzynski <rzarzynski@mirantis.com>
rgw: rgw::auth::s3::get_v4_signature doesn't depend on req_state anym…
…ore.

Signed-off-by: Radoslaw Zarzynski <rzarzynski@mirantis.com>
rgw: drop req_state::aws4_auth::payload_hash as it doesn't need to be…
… global.

Signed-off-by: Radoslaw Zarzynski <rzarzynski@mirantis.com>

rzarzynski added some commits May 2, 2017

rgw: clean-up rgw::auth::s3::get_v4_signature().
Signed-off-by: Radoslaw Zarzynski <rzarzynski@mirantis.com>
rgw: use std::make_shared for AWSv4 completers creation.
Signed-off-by: Radoslaw Zarzynski <rzarzynski@mirantis.com>
rgw: clean-up AWSv4's Canonical QS crafting.
Signed-off-by: Radoslaw Zarzynski <rzarzynski@mirantis.com>
rgw: optimize and clean-up the AWSv4 signature processing.
Signed-off-by: Radoslaw Zarzynski <rzarzynski@mirantis.com>
rgw: introduce rgw::auth::s3::AWS4_HMAC_SHA256_STR to kill magics.
Signed-off-by: Radoslaw Zarzynski <rzarzynski@mirantis.com>
rgw: switch from boost::string_ref to string_view in AWSv4-related code.
Signed-off-by: Radoslaw Zarzynski <rzarzynski@mirantis.com>
rgw: switch from boost::string_ref to string_view in AWSv4-related co…
…de (part 2).

Signed-off-by: Radoslaw Zarzynski <rzarzynski@mirantis.com>
rgw: remove the duplicative trim_whitespace from rgw_common.cc.
Signed-off-by: Radoslaw Zarzynski <rzarzynski@mirantis.com>
rgw: drop 'using ceph::crypto::SHA256' from rgw_common.h.
Signed-off-by: Radoslaw Zarzynski <rzarzynski@mirantis.com>
rgw: rework and optimise crafting of AWSv4's canonical query string.
Signed-off-by: Radoslaw Zarzynski <rzarzynski@mirantis.com>
rgw; rework interface and implementation of url_decode.
This commit alters the url_decode() to remove its dependency
on Variable Length Array and unnecessary memory allocations.
It also adjust its signature to the sole usage pattern which
is spread across the code.

Signed-off-by: Radoslaw Zarzynski <rzarzynski@mirantis.com>
rgw: rework the implementation of rgw::auth::s3::get_v4_canonical_hea…
…ders().

Signed-off-by: Radoslaw Zarzynski <rzarzynski@mirantis.com>
rgw: use preallocated std::strings when concatenating in AWSv4.
Signed-off-by: Radoslaw Zarzynski <rzarzynski@mirantis.com>
rgw: replace magic strings in the AWSv4 code.
Signed-off-by: Radoslaw Zarzynski <rzarzynski@mirantis.com>
common/sstring: switch to boost::string_view as string_ref is depreca…
…ted.

Signed-off-by: Radoslaw Zarzynski <rzarzynski@mirantis.com>
rgw: switch to Ceph's sstring in AWS signature generation process.
Signed-off-by: Radoslaw Zarzynski <rzarzynski@mirantis.com>
common/backport14: add the constexpr-capable variant of std::max().
Signed-off-by: Radoslaw Zarzynski <rzarzynski@mirantis.com>
rgw: optimize AWSv4 parsing with Boost's small_vector.
Signed-off-by: Radoslaw Zarzynski <rzarzynski@mirantis.com>
rgw: only rename AWSv2AuthStrategy -> AWSAuthStrategy.
Signed-off-by: Radoslaw Zarzynski <rzarzynski@mirantis.com>
rgw: introduce string_to_sign_t abstraction to the AWS auth.
Signed-off-by: Radoslaw Zarzynski <rzarzynski@mirantis.com>
rgw: introduce rgw::auth::Strategy::apply() to deduplicate code.
Signed-off-by: Radoslaw Zarzynski <rzarzynski@mirantis.com>
rgw: handle the Boto2 compatibility of AWSv4 in an abstract way.
Signed-off-by: Radoslaw Zarzynski <rzarzynski@mirantis.com>
@mattbenjamin

I think this changeset is ready to merge, and in particular, that @tchaikov's point was addressed. In addition to posted Teuthology runs, I've run this code and seen no regression (I'm looking at a problem with a Java-initiated streaming upload in my test setup, but it reproduces identically with current master as with this change).

* doesn't provide the chaining ability in the type append(). It's required
* to concatenate string without reallocations in a way const-correct manner. */
template <class StringT>
static inline StringT create_n_reserve(const size_t reserve_len)

This comment has been minimized.

@mattbenjamin

mattbenjamin Jun 12, 2017

Contributor

s/n/and/;

map<string, string>::iterator last = canonical_qs_map.end();
--last;
if (using_qs && key == "X-Amz-Signature") {

This comment has been minimized.

@mattbenjamin

mattbenjamin Jun 12, 2017

Contributor

doesn't Amazon describe lowering these? initial caps looks strange in the code

const bool using_qs,
const bool force_boto2_compat)
{
map<string, string> canonical_hdrs_map;
istringstream sh(signedheaders);
istringstream sh(signedheaders.to_string());

This comment has been minimized.

@mattbenjamin

@mattbenjamin mattbenjamin self-assigned this Jun 12, 2017

@mattbenjamin

This comment has been minimized.

Contributor

mattbenjamin commented Jun 13, 2017

@tchaikov can you confirm your issues are addressed?

we believe this proven addressed

@mattbenjamin

This comment has been minimized.

Contributor

mattbenjamin commented Jun 13, 2017

retest this please

@mattbenjamin mattbenjamin merged commit 2b4b718 into ceph:master Jun 13, 2017

5 checks passed

Signed-off-by all commits in this PR are signed
Details
Unmodifed Submodules submodules for project are unmodified
Details
Unmodified Submodules submodules for project are unmodified
Details
default Build finished.
Details
make check make check succeeded
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment