Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor Stats::RawStatData into a StatsOptions struct #3629

Merged
merged 24 commits into from
Jul 13, 2018

Conversation

ambuc
Copy link
Contributor

@ambuc ambuc commented Jun 14, 2018

WIP: Refactor Stats::RawStatData into a Stats::StatsOptions struct.

Description:
This change allows us to deprecate the statics inside Stats::RawStatData. Some side effects of this change are:

  • a) HeapRawStatDataAllocator no longer performs stat name truncation,
  • b) we now construct BlockMemoryHashSet, HotRestartImpl, C/L/RdsSubscription, and ThreadLocalStoreImpl as functions of this Stats::StatsOptions struct, and
  • c) Stats::RawStatData now looks more like a set of libraries for computing stat padding, as opposed to a source of truth for the maximum allowable name lengths.
  • Finally, a chain of functions starting under server.cc (translateBootstrap, translateClusterManagerBootstrap, translateListener, translateCluster, translateVirtualHost) have had Stats::StatsOptions& added to their interfaces, so that Utility::checkObjNameLength() can be called with the necessary context.

Risk Level: Medium

Fixes: #3508

absl::string_view key() const {
return absl::string_view(name_, strnlen(name_, maxNameLength()));
absl::string_view keyGivenStatsOptions(const StatsOptions& stats_options) const {
return absl::string_view(name_, strnlen(name_, stats_options.maxNameLength()));
Copy link
Contributor

@jmarantz jmarantz Jun 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this really could be fixed in a separate PR, but the signature change makes it more obvious; IMO we should spend a few bytes and retain the actual length in the structure, rather than a nul byte, rather than having to call strlen on every access.

I think we need to support >256 characters so we should probably spend a few bytes and retain the length explicitly in the structure.

That will improve the speed (at a moderate cost of memory) and could be done in a very isolated way, reducing the # of changes in this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm kinda on-board with this. Can we not just pre-truncate the name? As in, why can't we just pass an optional length to which we want name_ truncated? As @jmarantz mentioned, this will require us to make name_ null-terminated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This gets truncated at write time in initialize() / safeInitialize(); i think we can remove the keyGivenStatsOptions entirely since name_ is already being null-terminated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this works! I will replace all instances of keyGivenStatsOptions with key().

class StatsOptions {
public:
virtual ~StatsOptions() {}
virtual size_t maxNameLength() const PURE;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please document these methods.


/**
* Returns the options structure that was used to construct the set.
*/
const BlockMemoryHashSetOptions& options() const { return control_->options; }
const BlockMemoryHashSetOptions& hash_set_options() const { return control_->hash_set_options; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: hashSetOptions()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I think i'm going to get rid of this function altogether, since it doesn't actually get called anywhere. If we need it later we can put it back.

absl::string_view key() const {
return absl::string_view(name_, strnlen(name_, maxNameLength()));
absl::string_view keyGivenStatsOptions(const StatsOptions& stats_options) const {
return absl::string_view(name_, strnlen(name_, stats_options.maxNameLength()));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm kinda on-board with this. Can we not just pre-truncate the name? As in, why can't we just pass an optional length to which we want name_ truncated? As @jmarantz mentioned, this will require us to make name_ null-terminated.

@@ -62,7 +61,8 @@ MainCommonBase::MainCommonBase(OptionsImpl& options) : options_(options) {
auto local_address = Network::Utility::getLocalAddress(options_.localAddressIpVersion());
Logger::Registry::initialize(options_.logLevel(), options_.logFormat(), log_lock);

stats_store_.reset(new Stats::ThreadLocalStoreImpl(restarter_->statsAllocator()));
stats_store_.reset(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

stats_store_ = std::make_unique<Stats::ThreadLocalStoreImpl>(restarter_->statsAllocator());

@@ -65,7 +67,7 @@ class HistogramTest : public testing::Test, public RawStatDataAllocator {
}));

EXPECT_CALL(*this, alloc("stats.overflow"));
store_.reset(new ThreadLocalStoreImpl(*this));
store_.reset(new ThreadLocalStoreImpl(options_, *this));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: std::make_unique

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can we use std::make_unique here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -477,15 +477,6 @@ TEST(TagProducerTest, CheckConstructor) {
"No regex specified for tag specifier and no default regex for name: 'test_extractor'");
}

// Validate truncation behavior of RawStatData.
TEST(RawStatDataTest, Truncate) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have an equivalent test that tests the truncation wherever it happens now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, HotRestartImplTest.truncateKey also tests this, but I'll put back in a native RawStatDataTest.Truncate test.

@@ -47,7 +47,7 @@ class AdminStatsTest : public testing::TestWithParam<Network::Address::IpVersion
}));

EXPECT_CALL(*this, alloc("stats.overflow"));
store_.reset(new Stats::ThreadLocalStoreImpl(*this));
store_.reset(new Stats::ThreadLocalStoreImpl(options_, *this));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: std::make_unique

@@ -153,14 +136,9 @@ bool TagExtractorImpl::extractTag(const std::string& stat_name, std::vector<Tag>
}

RawStatData* HeapRawStatDataAllocator::alloc(const std::string& name) {
RawStatData* data = static_cast<RawStatData*>(::calloc(RawStatData::size(), 1));
RawStatData* data = static_cast<RawStatData*>(::calloc(RawStatData::sizeGivenName(name), 1));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this no longer truncates, the names of the stats that overflow into the heap allocator in the hot restart case will no longer be truncated. This means that the overflowed names will be inconsistent with those that were allocated in the block allocator. Can we make sure that we truncate in that case to avoid the inconsistency?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds good. Rather than propagate a truncation throughout ThreadLocalStore, I've put the .substr() fn inline where make_stat calls the heap_allocator in `thread_local_store.cc.

@ambuc
Copy link
Contributor Author

ambuc commented Jun 18, 2018

I'm going to hold off on this until #3606, and maybe even close-and-reopen depending on how big the differential is.

@mattklein123 mattklein123 self-assigned this Jun 19, 2018
@ambuc ambuc force-pushed the refactor-StatsOptions branch 2 times, most recently from 9daa293 to 1c2620a Compare June 25, 2018 13:36
@ambuc
Copy link
Contributor Author

ambuc commented Jun 25, 2018

@mattklein123 This PR removes truncation from HeapRawStatDataAllocator, since it doesn't have context from the top-level options struct, and we want to get rid of deeply nested statics. Am I correct in thinking that this stat name truncation isn't necessary here? @jmarantz @mrice32

@mattklein123
Copy link
Member

Am I correct in thinking that this stat name truncation isn't necessary here?

I haven't looked at this PR yet. Do you mean is it not necessary because there is no limit inside the heap allocator? I don't think there is any particular requirement to do truncation.

@ambuc ambuc changed the title WIP: Refactor Stats::RawStatData into a StatsOptions struct. Refactor Stats::RawStatData into a StatsOptions struct Jun 25, 2018
Copy link
Contributor

@jmarantz jmarantz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the key question here: does this get a whole lot simpler given that @mattklein123 has indicated that no truncation is required except in the hot-restart allocator.

And in that case, it has the options already, so I think this PR can be a lot simpler.

*/
static size_t maxStatSuffixLength() { return MAX_STAT_SUFFIX_LENGTH; }
static uint64_t sizeGivenName(const std::string& name);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

name could be specified as absl::string_view with no perf or compatibility downside here, I think.


void RawStatData::safeInitialize(absl::string_view key, const StatsOptions& stats_options) {
ASSERT(!initialized());
if (key.size() > stats_options.maxNameLength()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe be more explicit here naming this function:

  truncateAndInit()

@ambuc
Copy link
Contributor Author

ambuc commented Jun 26, 2018

I think this PR doesn't get that much simpler given that only hot restart needs truncation. A lot of the complexity of this PR comes from Utility::checkObjNameLength() needing access to stats_options, which means CdsJson::translateCluster/LdsJson::translateListener/RdsJson::translateVirtualHost need to be passed stats_options, which means CdsSubscription needs a Stats::Scope, and that BootstrapJson::translateBootstrap needs stats_options... but the buck stops there, since translateBootstrap gets called in the top-level server.cc. I don't think this is too bad, since it allows us to preserve the same level of verbosity to users accidentally name clusters (or routes, or listeners) something too long.

@ambuc
Copy link
Contributor Author

ambuc commented Jun 26, 2018

@mattklein123 This is ready for review.

@@ -24,7 +25,7 @@ void BootstrapJson::translateClusterManagerBootstrap(
auto* cluster = bootstrap.mutable_static_resources()->mutable_clusters()->Add();
Config::CdsJson::translateCluster(*json_sds->getObject("cluster"),
absl::optional<envoy::api::v2::core::ConfigSource>(),
*cluster);
*cluster, stats_options);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is kind of an interesting maze to follow for those of us that don't know this part of the code deeply. But with a bit of grepping I think you can or should be able to pull the stats options from Cluster, which knows about ClusterManager, which knows about a bunch of things. I'm not sure if you can exactly get to the OptionsImpl from any of those, but it given all the things passed into ClusterManagerImpl's ctor, I think it would not be unreasonable to also have it see the options.

source/common/upstream/cluster_manager_impl.h

WDYT?

Copy link
Contributor Author

@ambuc ambuc Jun 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be reasonable for ClusterManager and ListenerManager and HttpConnectionManager (for routes) to know about some higher-level thing with access to a StatsOptions (such as ServerImpl). Right now I've embedded StatsOptions inside Scope, and Scope as a member of a Server::Configuration::FactoryContext, which is already present in a lot of these locations. I'm just not sure it would really save us that much complexity.

@mattklein123
Copy link
Member

@ambuc @mrice32 @jmarantz what's the status of this PR? Does it need a final review? Looks like it needs a master merge?

@ambuc ambuc force-pushed the refactor-StatsOptions branch 2 times, most recently from d3e0880 to 2de48fe Compare July 6, 2018 14:10
@ambuc
Copy link
Contributor Author

ambuc commented Jul 6, 2018

@mattklein123 Sorry for dropping the ball on this. I've done a merge, so this PR is now ready for a final review.

As noted above, the bulk of this PR isn't actually due to the addition of a statsOptions struct. Most of the delta comes from the eradication of a static call inside Utility::checkObjNameLength(). That utility now needs access to a top-level statsOptions struct for consistency, and I've had to modify a number of interfaces to accommodate this. If this makes the PR harder to read, I could tease it out into a separate one, but let me know what you think.

Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went through at a high level and have some small comments. This is tricky code so I'm mostly assuming that @jmarantz and @mrice32 already went through in detail. Generally LGTM, nice work!

virtual size_t maxNameLength() const PURE;

/**
* The max allowed length of the object part of a stat name.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the object part and the suffix part of a stat? Can you potentially provide an example in the comments?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's more color in /source/common/stats/stats_impl.h, which I'm not sure how much of to duplicate in the include file. So the object part is something like cluster.<cluster_name>.outlier_detection, and the suffix part is one of the statistics listed in the docs here like ejections_enforced_consecutive_gateway_failure. We just want to limit the lengths of these halves separately. Do you think it's worth adding that to the include too?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe just comment about where the other comments are? In general I would expect the public includes to have the detail so I can understand what things mean, but fine to link to other places. Whatever works. It's mostly that this nomenclature caught me off guard so I think it might confuse others also.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds good to me. I did some work in 732af4f.

ref_count_ = 1;

uint64_t xfer_size = key.size();
memcpy(name_, key.data(), xfer_size);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so where is initialize called now? Does this need some safety guard for cope size? If not, can you add a comment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So there are two initialize fns now -- one, RawStatData::initialize(key), which is used in cases where no truncation is performed (like in HeapRawStatDataAllocator::alloc(name)). The other, RawStatData::truncateAndInit(key, stats_options), gets called in BlockMemoryHashSet::insert(key), which has access to a local stats_options_ struct. So the former case doesn't guard for some max size.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we somehow assert that we aren't going to overwrite memory here? It's not super clear to me how we know that this is a safe copy (or if heap allocated pass in the size of the heap allocation and assert that)?

@@ -55,6 +56,7 @@ class CdsSubscription : public Http::RestApiFetcher,
Config::SubscriptionCallbacks<envoy::api::v2::Cluster>* callbacks_ = nullptr;
Config::SubscriptionStats stats_;
const absl::optional<envoy::api::v2::core::ConfigSource>& eds_config_;
const Stats::Scope& scope_;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of storing a reference to the scope here, would it better to store reference to the stat options? We are slightly splitting hairs but it potentially seems a bit clearer/safer to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense -- I put statsOptions into the scope struct, and found it made more contextual sense to add scope as an argument to a lot of higher-level constructors or factories. But as those factories call lower-level utilities (parseHttpConnectionManagerFromJson, it makes more sense to give them the specific arguments they need, like the statsOptions struct. I've changed the interfaces for CdsJson::translateCluster, LdsJson::translateListener, and some others.

return fmt::format("{}.{}.{}.{}", VERSION, sizeof(SharedMemory), max_num_stats,
max_stat_name_len);
stats_options.maxNameLength());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change doesn't change the shared memory layout at all, right? Just being extra paranoid I would double check to make sure nothing changes w/ hot restart and with the output versions, as we have had issues in this area in the past. I'm not sure what our test status is on hot restart versions changing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't change that last argument to fmt:;format(), since that value max_stat_name_len used to be the sum of max_obj_name_len and maxStatSuffixLength(), and it still is. But I'll rerun the hot_restart_impl_test on this and upstream master and just make sure.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reran this test for master and this branch and the hot restart versionString is the same. 😄 In master right now, the server/mocks.cc sets maxObjNameLength() to return 150 by default. This new test does not, so the versionStrings look different -- but if we set

stats_options_.max_obj_name_length_ = 150

somewhere in TEST_F(HotRestartImplTest, versionString), then the versionStrings are identical.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is somewhat unrelated to your change, but WDYT about actually committing the hot restart version string into a test somewhere and verifying that it is what we expect it to be? That would warn us about a change in the hot restart version that we didn't intend? Thoughts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Perhaps this goes in the hot restart test itself)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this -- I could hardcode that string into hotrestart_test.sh for comparison, merge in that PR first, and then verify that this PR doesn't need any changes to pass that test.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, thank you.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -51,6 +52,7 @@ class LdsSubscription : public Http::RestApiFetcher,
const LocalInfo::LocalInfo& local_info_;
Config::SubscriptionCallbacks<envoy::api::v2::Listener>* callbacks_ = nullptr;
Config::SubscriptionStats stats_;
const Stats::Scope& scope_;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similar comment here about scope (and options), maybe applies elsewhere also.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed -- see #3629 (comment).

memcpy(name, key.data(), xfer);
name[xfer] = '\0';
}
static uint64_t size() { return sizeof(TestValue); }
static uint64_t sizeGivenStatsOptions(const Stats::StatsOptions& stats_options) {
(void)stats_options;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: UNREFERENCED_PARAMATER or just delete arg variable name. Same elsewhere if I missed it.

This change allows us to deprecate the statics inside Stats::RawStatData. Some side effects of this change are: a) HeapRawStatDataAllocator no longer performs stat name truncation, b) we now construct BlockMemoryHashSet, HotRestartImpl, C/L/RdsSubscription, and ThreadLocalStoreImpl as functions of this Stats::StatsOptions struct, and c) Stats::RawStatData now looks more like a set of libraries for computing stat padding, as opposed to a source of truth for the maximum allowable name lengths. Finally, a chain of functions starting under server.cc (translateBootstrap, translateClusterManagerBootstrap, translateListener, translateCluster, translateVirtualHost) have had Stats::StatsOptions& added to their interfaces, so that Utility::checkObjNameLength() can be called with the necessary context.

Signed-off-by: James Buckland <jbuckland@google.com>
Signed-off-by: James Buckland <jbuckland@google.com>
Signed-off-by: James Buckland <jbuckland@google.com>
Signed-off-by: James Buckland <jbuckland@google.com>
Signed-off-by: James Buckland <jbuckland@google.com>
Signed-off-by: James Buckland <jbuckland@google.com>
Signed-off-by: James Buckland <jbuckland@google.com>
Signed-off-by: James Buckland <jbuckland@google.com>
Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, some small nits. Given the risk of this type of change would definitely like @mrice32 to take another pass though. Thank you!

* object name length can be overridden. The default initialization is used in IsolatedStatImpl, and
* the user-overridden struct is stored in Options.
*
* As noted in the comment above StatsOptionsImpl in source/common/stats/stats_impl.h, a stat name
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: errant space, start of sentence

* As noted in the comment above StatsOptionsImpl in source/common/stats/stats_impl.h, a stat name
* often contains both a string whose length is user-defined (cluster_name in the below example),
* and a specific statistic name generated by Envoy, which are often as long as
* `upstream_cx_connect_attempts_exceeded`. To make room for growth on both fronts, we limit the max
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: not sure the example here is really helpful, I would just delete it.

ref_count_ = 1;

uint64_t xfer_size = key.size();
memcpy(name_, key.data(), xfer_size);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we somehow assert that we aren't going to overwrite memory here? It's not super clear to me how we know that this is a safe copy (or if heap allocated pass in the size of the heap allocation and assert that)?

Signed-off-by: James Buckland <jbuckland@google.com>
Signed-off-by: James Buckland <jbuckland@google.com>
@ambuc
Copy link
Contributor Author

ambuc commented Jul 11, 2018

@mattklein123 I added an ASSERT in 2aa9d8c. HeapRawStatData::Allocator calls RawStatData::initialize() only after figuring out how many bytes to allocate, for which it uses RawStatData::sizeGivenName(name). sizeGivenName(name) does the round-up arithmetic and always returns a number larger than name.size(). This plus the ASSERT() makes me feel better about it.

mattklein123
mattklein123 previously approved these changes Jul 11, 2018
Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but would still like a final approval from @mrice32. Thank you!

@ambuc
Copy link
Contributor Author

ambuc commented Jul 11, 2018

@mattklein123 @mrice32 I want to write a golden file test for the internal representation (as hex maybe? not sure) of a stat just to be extremely sure that this didn't change. Thoughts on the maintainability of such a test?

@mattklein123
Copy link
Member

@mattklein123 @mrice32 I want to write a golden file test for the internal representation (as hex maybe? not sure) of a stat just to be extremely sure that this didn't change. Thoughts on the maintainability of such a test?

As long as it's well commented and someone other than you can understand the test and how to modify it, no objections from me.

Copy link
Member

@mrice32 mrice32 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really cool - I think it's simpler and more cohesive now! A few minor comments. I'm still looking through how the various options are set to make sure the behavior doesn't change, but it looks good so far.

// expensive to performing a map lookup, since both require copying a truncated version of the
// string before doing the hash lookup.
uint64_t num_bytes_to_allocate = RawStatData::sizeGivenName(name);
RawStatData* data = static_cast<RawStatData*>(::calloc(num_bytes_to_allocate, 1));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this wasn't being done before, but shouldn't we return early if calloc returns a nullptr so we don't segfault by trying to write to it? If we're not equipped to handle this failure at the callsite, then maybe we should just throw instead of returning nullptr?

name_[xfer_size] = '\0';
}

void RawStatData::truncateAndInit(absl::string_view key, const StatsOptions& stats_options) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There may be a subtlety here that I'm missing here, but can we just truncate the string_view, and pass it to initialize()? It seems like the majority of initialize() and truncateAndInit() are the same.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually really like this. I will refactor truncateAndInit() and checkAndInit() to use the same (private) initialize() method under the hood -- this makes the difference in logic very explicit.

ref_count_ = 1;

uint64_t xfer_size = key.size();
ASSERT(xfer_size <= num_bytes_allocated);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assert seems strange - isn't num_bytes_allocated the number of bytes allocated to store the entire stat object rather than just the portion reserved for name_?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, you are correct. This should be
ASSERT(sizeof(RawStatData) + xfer_size + 1 <= num_bytes_allocated). I'll change it.


std::string stats_table_prefix = fmt::format("{}table.{}", stat_prefix, table_name);
// Truncate the table prefix if the current string is too large.
if (stats_table_prefix.size() > remaining_size) {
stats_table_prefix = stats_table_prefix.substr(0, remaining_size);
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extra line?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Copy link
Member

@mrice32 mrice32 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've convinced myself that the hot restart layout hasn't changed. Outside of the comments above, LGTM.

@@ -37,7 +38,7 @@ class StatsThreadLocalStoreTest : public testing::Test, public RawStatDataAlloca
}));

EXPECT_CALL(*this, alloc("stats.overflow"));
store_.reset(new ThreadLocalStoreImpl(*this));
store_.reset(new ThreadLocalStoreImpl(options_, *this));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, tossing in a new comment because replies don't seem to show up at the bottom. nit: can we use std::make_unique here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@ambuc
Copy link
Contributor Author

ambuc commented Jul 11, 2018

@mattklein123 See #3843 for the internal memory representation test. I checked locally and the hash doesn't change, but I still want to merge that PR in first as proof.

…ion logic

Signed-off-by: James Buckland <jbuckland@google.com>
Signed-off-by: James Buckland <jbuckland@google.com>
htuch pushed a commit that referenced this pull request Jul 11, 2018
…ntation (#3843)

This PR adds a test to ensure that the internal memory representation of a RawStatData hasn't unintentionally changed. We take a hash of a struct by getting the Hex::encode() hex string representation of it, and then taking the HashUtil::xxHash64() hash of it. This will help ensure that #3629 and other PRs which deal with stats don't accidentally change RawStatData.

Risk Level: Low
Testing: N/A
Docs Changes: None
Release Notes: None

Signed-off-by: James Buckland <jbuckland@google.com>
…ions

Signed-off-by: James Buckland <jbuckland@google.com>

void RawStatData::checkAndInit(absl::string_view key, uint64_t num_bytes_allocated) {
uint64_t xfer_size = key.size();
ASSERT(sizeof(RawStatData) + xfer_size + 1 <= num_bytes_allocated);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like we compute the in-memory size of a particular RawStatData often. Do you think it's worth making some utility function that takes the length of the string (without the null terminator) and computes the size so we make sure we do it the same way every time?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this note. I did some refactoring work in aeb9251.

…structSizeWithOptions()

Signed-off-by: James Buckland <jbuckland@google.com>
mrice32
mrice32 previously approved these changes Jul 12, 2018
Copy link
Member

@mrice32 mrice32 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM other than a little weirdness in ThreadLocalStoreImpl. Thanks for working on this - this is awesome!

@@ -182,7 +183,8 @@ StatType& ThreadLocalStoreImpl::ScopeImpl::safeMakeStat(
if (stat == nullptr) {
parent_.num_last_resort_stats_.inc();
stat =
make_stat(parent_.heap_allocator_, name, std::move(tag_extracted_name), std::move(tags));
make_stat(parent_.heap_allocator_, name.substr(0, parent_.statsOptions().maxNameLength()),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems fishy. Since we now throw if the heap allocator fails to allocate, this statement will never be called in the non-hot restart case. However, it's still an embedded assumption about the specific behavior of all allocators - fixed length if they can fail and non-fixed length if they cannot fail. Is there any problem with changing the hot restart allocator to perform the fallback rather than relying on the caller? Since this is technically correct, I think it's fine to leave it this way for now, but please toss in a TODO to get rid of this assumption.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added this todo in cb0fd8a.

Signed-off-by: James Buckland <jbuckland@google.com>
RawStatData* data = static_cast<RawStatData*>(::calloc(num_bytes_to_allocate, 1));
data->initialize(name, num_bytes_to_allocate);
if (data == nullptr) {
throw EnvoyException("HeapRawStatDataAllocator: unable to allocate a new stat");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be caught. Can you throw some other type of exception? (Like whatever type new() throws).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed it to std::bad_alloc() in 4cad816.

Signed-off-by: James Buckland <jbuckland@google.com>
…ions

Signed-off-by: James Buckland <jbuckland@google.com>
…ions

Signed-off-by: James Buckland <jbuckland@google.com>
@mattklein123 mattklein123 merged commit c92a301 into envoyproxy:master Jul 13, 2018
@ambuc ambuc deleted the refactor-StatsOptions branch July 13, 2018 19:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants