Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HeapStatData with a distinct allocation mechanism for RawStatData #3710

Merged
merged 35 commits into from
Jul 31, 2018

Conversation

jmarantz
Copy link
Contributor

@jmarantz jmarantz commented Jun 24, 2018

Description:

This PR sets the stage for a memory-friendly stat-allocator for use with hot-restart disabled.

  • hot-restart stats are still in RawStatData, non-hot-restart stats are in HeapStatData. Previously they were in HeapRawStatData which derived from RawStatData, whereas HeapStatData is an alternate impl, not derived from RawStatData.
  • allocators are now responsible for truncation, and require StatsOptions to be passed on construction
  • IsolatedStore now truncates at 1M chars, rather than special-casing no-truncation. This is now tested.
  • some tests moved around as there is no allocator for RawStatData in stats_impl.cc anymore; it's
    in hot_restart_impl.cc
  • factored out a mocked testing stats-allocator from 3 different tests into a common one, changing
    it from being multi-inherited with the testing framework to being instantiated. This was needed because stat-allocators now require StatOptions in their constructor.

This is another step toward addressing #3585

Risk Level: Medium
Testing: //test/...
Docs Changes: N/A
Release Notes: N/A

Signed-off-by: Joshua Marantz <jmarantz@google.com>
…em in the .h

Also simplifies the prometheus tests in admin_test.cc for readability,
which seemed natural to me as I removed the references to CounterImpl
and GaugeImpl.

Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
…cator, factoring out the truncation to the base class. Committed for mainly for @ambuc's consideration.

Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
@mattklein123
Copy link
Member

@dnoe please take a first pass on this

@jmarantz
Copy link
Contributor Author

@mattklein123 @dnoe I mainly issued this to share the ideas here with @ambuc -- current thinking is we want his #3629 to go in first, I think.

@stale
Copy link

stale bot commented Jul 3, 2018

This pull request has been automatically marked as stale because it has not had activity in the last 7 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@stale stale bot added the stale stalebot believes this issue/PR has not been touched recently label Jul 3, 2018
@stale
Copy link

stale bot commented Jul 10, 2018

This pull request has been automatically closed because it has not had activity in the last 14 days. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@stale stale bot closed this Jul 10, 2018
@jmarantz
Copy link
Contributor Author

does pinging this re-open the PR? Now that #3629 is in I want to resolve conflicts & get this rolling.

@htuch htuch reopened this Jul 18, 2018
@stale stale bot removed the stale stalebot believes this issue/PR has not been touched recently label Jul 18, 2018
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
@jmarantz jmarantz changed the title WIP stats: Create HeapStatData with a distinct allocation mechanism for RawStatData HeapStatData with a distinct allocation mechanism for RawStatData Jul 19, 2018
@jmarantz
Copy link
Contributor Author

I think this is ready for a look, but note that I may have regressed the behavior of IsolatedStore. See comments at the relevant line.

// https://github.com/envoyproxy/envoy/pull/3629, which removed the shadowing
// of option size in a static.

/* explicit */ IsolatedStoreImpl(/*Stats::StatsOptions& stats_options*/)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note behavior regression here:

If someone (@ambuc @mattklein123 @mrice32) can share the desired behavior (not truncated at all vs truncated with the other stats) I will fix.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this comes down to the question of whether or not we expect an IsolatedStore to have stats which are interoperable with other stats. My instinct is no, due to its isolated nature -- in which case this behavior is not something anyone relies upon. Thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per discussion, I force the truncation at 1M bytes, so effectively we are not truncating in this PR.

@@ -478,40 +478,26 @@ TEST(TagProducerTest, CheckConstructor) {
"No regex specified for tag specifier and no default regex for name: 'test_extractor'");
}

// Check consistency of internal stat representation
TEST(RawStatDataTest, Consistency) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: moved to hot_restart_impl_test.cc, which allocates RawStatData*. stats_impl.cc no longer provides an allocator that allocates RawStatData*.

Copy link
Contributor Author

@jmarantz jmarantz Jul 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test -- still in hot_restart_impl_test.cc -- no longer covers truncation. That's covered in a separate new test in thread_local_store_test.cc, which is solely responsible for truncation.

This test's sole purpose is to make sure that the bit-pattern of a shared-memory stat allocation remains unchanged across releases, unless we also update SharedMemory::VERSION.

@@ -371,6 +374,27 @@ class TestAllocator : public RawStatDataAllocator {
std::unordered_map<std::string, CSmartPtr<RawStatData, freeAdapter>> stats_;
};

class MockedTestAllocator : public RawStatDataAllocator {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: this mocked allocator was previously repeated 3x in the codebase, all as part of multiple inheritance with a test class. What motivated the refactor in this PR, though, was the need to pass a StatsOptions& to the constructor, which is annoying to do for a test-class.

If desired I could split that out as a separate PR.

Signed-off-by: Joshua Marantz <jmarantz@google.com>
Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM from a skim, found two small things.

* in a shared memory block without internal pointers.
*/
class RawStatDataAllocator : public StatDataAllocator {
// Partially implements a StaDataAllocator, leaving alloc & free for subclasses.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo StaDataAllocator

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

RawStatData* alloc(const std::string& name) override;
void free(RawStatData& data) override;
explicit HeapStatDataAllocator(const StatsOptions& options) : StatDataAllocatorImpl(options) {}
// HeapStatDataAllocator() {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Signed-off-by: Joshua Marantz <jmarantz@google.com>
…inheritance for StatData.

Signed-off-by: Joshua Marantz <jmarantz@google.com>

HeapStatData* HeapStatDataAllocator::alloc(const std::string& name) {
Copy link
Member

@mrice32 mrice32 Jul 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mentioned this earlier, but the comment got buried under some other changes. I'm relatively confident that adding truncation in HeapStatDataAllocator is a user-facing change. Previous to this change we did not truncate stats in non hot restart mode. This behavior change should be documented in the release notes. I'm also not sure it makes sense to change this behavior if we ultimately do not want non hot restart stats to be truncated as is suggested in #3965 (since I believe that this is the existing behavior). I could be wrong about the behavior changing, but if so, please point out where they are currently being truncated on master.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading again I think you are right; it's pretty subtle though. In current master, we truncate when making a RawStatData in RawStatData::truncateAndInit(), but asymmetrically, we truncate at the call-site in ThreadLocalStoreImpl::safeMakeStat() only when calling the fallback. OK let me fix that; it's a little cleaner to just do this right.

Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
@jmarantz
Copy link
Contributor Author

jmarantz commented Jul 29, 2018

I restored the previous functionality, and added explicit tests for it. I made all stat-name truncation happen only in ThreadLocalStore, and added a Stat interface for determining whether an allocator requires truncation.

This changed a bunch of things -- including some changes I had previously made reverting back to the state they were in 'master'. But some interface changes, such as std::string --> absl::string_view and making the mock test allocator a separate class -- means the file-count is still 19. I think it's easiest to review this as changes from master, rather than reviewing the incremental commits.

…ge, rather than asserting, as thi is easier to test in release mode.

Signed-off-by: Joshua Marantz <jmarantz@google.com>
Copy link
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still looks good, but a bunch of minor comments on this pass. We really should just merge this and continue iterations in the next PR :)

* @param stats_options The stat optinos, which must live longer than the allocator.
*/
explicit StatDataAllocatorImpl(/*const StatsOptions& stats_options*/) {}
//: stats_options_(stats_options) {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: remove comment

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

public:
/**
* @param stats_options The stat optinos, which must live longer than the allocator.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: s/optinos/options/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

/**
* @param stats_options The stat optinos, which must live longer than the allocator.
*/
explicit StatDataAllocatorImpl(/*const StatsOptions& stats_options*/) {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: empty arg default constructor and explicit..

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

name = absl::string_view(name.data(), max_length);
}
}
return name;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Random thought that is unrelated to this PR (existing issue). How confident are we that two truncated stats won't conflict, e.g. if I truncated a.b.c.d and a.b.c.e to 5 characters, we get overlap.. it seems this is double bad, since we're both truncating and conflating..

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah...in the past if I wanted to truncate to 'n' I'd actually truncate to n-k, and use a k-character base64 hash of the untruncated string to disambiguate. That would still be pretty awful for stats as you'd have no idea which one you were looking at. Basically, truncation is bad and should be avoided.

@@ -178,18 +198,13 @@ StatType& ThreadLocalStoreImpl::ScopeImpl::safeMakeStat(
if (!central_ref) {
std::vector<Tag> tags;
std::string tag_extracted_name = parent_.getTagsForName(name, tags);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a comment explaining why it's not necessary to use the truncated name for tag extraction?


class MockedTestAllocator : public RawStatDataAllocator {
public:
MockedTestAllocator(const StatsOptions& stats_options) : alloc_(stats_options) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Signed-off-by: Joshua Marantz <jmarantz@google.com>
Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Changes in this code/area make my eyes bleed so just need to pull the trigger. Few small things.

/**
* @param stats_options The stat options, which must live longer than the allocator.
*/
StatDataAllocatorImpl() {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Is this default constructor needed? Doc comment seems out of date also.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops; not needed; in an earlier rev I needed a 1-arg constructor and removing the arg, I forgot to remove the whole thing.


// Note that the heap-allocator does not truncate itself; we have to
// truncate here if we are using heap-allocation as a fallback due to an
// exahusted shared-memroy block
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo memroy

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

key.remove_suffix(key.size() - options_.statsOptions().maxNameLength());
// The name is truncated in ThreadLocalStore before this is called.
size_t max_length = options_.statsOptions().maxNameLength();
if (name.length() > max_length) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this happen then? Should this just be an assert? The comment is a bit confusing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should not happen in production. It can happen in a unit test. It could be an ASSERT, but then it can't be tested. It could be a RELEASE_ASSERT, and be tested with a DEATH_TEST.

I'm happy to go with whatever is the best practice of those for the project.

I'll try to clarify the comment.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would probably just do a normal ASSERT as this seems like a programming error. If you feel strongly that it should be a RELEASE_ASSERT w/ a test that's fine also.

Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
mattklein123
mattklein123 previously approved these changes Jul 31, 2018
Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤞 😉

@mrice32 can you approve the most recent changes? Thank you for powering through this review!

@mattklein123 mattklein123 self-assigned this Jul 31, 2018
htuch
htuch previously approved these changes Jul 31, 2018
Copy link
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM pending @mrice32 final pass.

Copy link
Member

@mrice32 mrice32 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great! Thanks. A few small comments. I think that this code has gotten more complicated as we've added flexibility, and I think it could definitely use one or two PRs of work on simplification.

void free(Stats::RawStatData& data) override;
bool requiresBoundedStatNameSize() const override { return true; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this overridden here and in RawStatDataAllocator? Also why even have the RawStatDataAllocator intermediate class?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch - I didn't need this extra impl of requiresBoundedStatNameSize here, but I do need RawStatDataAllocator as it is used as a base class for test allocators as well as this.

@@ -319,47 +320,28 @@ TagProducerImpl::addDefaultExtractors(const envoy::config::metrics::v2::StatsCon
return names;
}

void HeapRawStatDataAllocator::free(RawStatData& data) {
// TODO(jmarantz): move this below HeapStatDataAllocator::alloc.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

? Seems like a strange TODO?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Normally it makes sense to group alloc/free together of a single class, but these got separated in a prior PR; leaving this here now for easier diffs. Will fix later in a follow-up.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. It seemed so trivial, I was a little confused as to why you didn't just do it now. Easier review is a good reason. That makes sense.


HeapStatData* HeapStatDataAllocator::alloc(absl::string_view name) {
// Do not truncate here for non-hot-restart case; truncate at the call-site in
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably a weird nit, but I think this comment is adding confusion. If we want to delegate truncation to the callsite, I think we can just make the comment (or something similar) since the truncation logic isn't really relevant here:

// Any expected truncation of name is done at the callsite. No truncation is
// required to use this allocator.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


// An unordered set of RawStatData pointers which keys off the key()
// TODO(jmarantz): See https://github.com/envoyproxy/envoy/pull/3927 which can
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like more of a TODO for @ambuc :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my experience, TODO's are generally signed by the author of the TODO, not necessarily by the person who is expected to do it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's leave a comment linking to issue #3585 which will address this in #3927, hopefully.

absl::string_view ThreadLocalStoreImpl::truncateStatNameIfNeeded(absl::string_view name) {
// If the main allocator requires stat name truncation, warn and truncate, before
// attempting to allocate.
if (alloc_.requiresBoundedStatNameSize()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: as I mentioned in person, I think the encapsulation would be cleaner if the allocators were constructed with their truncation settings and then they could handle any necessary truncation themselves (the fallback heap allocator would be set up to truncate, while the non-hot restart heap allocator would be set up to not truncate). Here, it seems like we're half delegating that responsibility by giving them a method to communicate whether they require truncation, but not giving them the responsibility to read the settings and do that truncation within alloc().

However, I think we should just try to get this PR in as is rather than making more interface changes, so this seems fine for now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think that's a follow-up change.

HeapStatData* alloc(absl::string_view name) override;
void free(HeapStatData& data) override;

// StatsDataAllocator
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit typo: StatDataAllocator

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This typo seems to still be here :)

Signed-off-by: Joshua Marantz <jmarantz@google.com>
@jmarantz jmarantz dismissed stale reviews from htuch and mattklein123 via 1b89d7f July 31, 2018 16:59
mrice32
mrice32 previously approved these changes Jul 31, 2018
Copy link
Member

@mrice32 mrice32 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM other than the typos


class RawStatDataAllocator : public StatDataAllocatorImpl<RawStatData> {
public:
// StatsDataAllocator
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: typo here, too: StatDataAllocator

HeapStatData* alloc(absl::string_view name) override;
void free(HeapStatData& data) override;

// StatsDataAllocator
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This typo seems to still be here :)

…sure...

Signed-off-by: Joshua Marantz <jmarantz@google.com>
Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK let's do this. Agree with @mrice32 that this code has become incredibly complicated with the different permutations of heap/hot-restart, size limits, etc. If there is anything we can do in followups to help with readability that would be awesome.

@mattklein123 mattklein123 merged commit d0f10fa into envoyproxy:master Jul 31, 2018
@jmarantz jmarantz deleted the stats-heap-divest branch July 31, 2018 20:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants