Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stats: add built-in log linear histogram support #3059

Conversation

ramaraochavali
Copy link
Contributor

@ramaraochavali ramaraochavali commented Apr 12, 2018

Description:
Implements built-in log linear histogram. Fixes the issue #1965

Risk Level: High

Testing:
Unit, Integration and Manual testing.

Docs Changes:
NA

Release Notes:
Will create PR for version update once this is approved.

Signed-off-by: Rama <rama.rao@salesforce.com>
Signed-off-by: Rama <rama.rao@salesforce.com>
Signed-off-by: Rama <rama.rao@salesforce.com>
Signed-off-by: Rama <rama.rao@salesforce.com>
Signed-off-by: Rama <rama.rao@salesforce.com>
Signed-off-by: Rama <rama.rao@salesforce.com>
Signed-off-by: Rama <rama.rao@salesforce.com>
Signed-off-by: Rama <rama.rao@salesforce.com>
Signed-off-by: Rama <rama.rao@salesforce.com>
Signed-off-by: Rama <rama.rao@salesforce.com>
Signed-off-by: Rama <rama.rao@salesforce.com>
Signed-off-by: Rama <rama.rao@salesforce.com>
Signed-off-by: Rama <rama.rao@salesforce.com>
Signed-off-by: Rama <rama.rao@salesforce.com>
Signed-off-by: Rama <rama.rao@salesforce.com>
Signed-off-by: Rama <rama.rao@salesforce.com>
Signed-off-by: Rama <rama.rao@salesforce.com>
Signed-off-by: Rama <rama.rao@salesforce.com>
Signed-off-by: Rama <rama.rao@salesforce.com>
Signed-off-by: Rama <rama.rao@salesforce.com>
Signed-off-by: Rama <rama.rao@salesforce.com>
Signed-off-by: Rama <rama.rao@salesforce.com>
Signed-off-by: Rama <rama.rao@salesforce.com>
Signed-off-by: Rama <rama.rao@salesforce.com>
Signed-off-by: Rama <rama.rao@salesforce.com>
Signed-off-by: Rama <rama.rao@salesforce.com>
Signed-off-by: Rama <rama.rao@salesforce.com>
Signed-off-by: Rama <rama.rao@salesforce.com>
Signed-off-by: Rama <rama.rao@salesforce.com>
Signed-off-by: Rama <rama.rao@salesforce.com>
virtual void beginMerge() PURE;
};

class ThreadLocalHistogramImpl;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to forward declare this because with out this, I have to explicitly add merge(histogram_t* target) to ThreadLocalHistogram interface which means we have to expose the libcircl to stats.h interface definition. Otherwise I have to use dynamic_cast in the ParentHistogramImpl merge method. LMK if there are any other better solutions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should be exposing libcircl to stats.h interface. Nor do I think we should be exposing ThreadLocalHistogramImpl. Can't you put the methods you need into the interface and use abstract classes except in the impl itself?

Scanning through stats.h, there are actually a couple of references to 'ThreadLocal' in StoreRoot, but otherwise there shouldn't really be references to a particular implementation from an interface. There's some kind of abstraction leak here, IMO.

Copy link
Contributor Author

@ramaraochavali ramaraochavali Apr 16, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. Thinking further, I think agree with you that ThreadLocalHistogram should not be in stats.h as it is not used by others in the system. I moved it to thread_local_store.h. Also I removed the addTlsHistogram method in ParentHistogram as that is an implementation detail. Now I think it looks clean. PTAL.

@ramaraochavali
Copy link
Contributor Author

@jmarantz @mattklein123 thanks for your quick reviews. Today I have addressed the review feedback.
The major ones include

  • Interface Split
  • JSON format support
  • Documenting threading model
    Also addressed all other review comments. PTAL and let me know any changes are required.

Copy link
Contributor

@jmarantz jmarantz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

flushing comments...

virtual void beginMerge() PURE;
};

class ThreadLocalHistogramImpl;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should be exposing libcircl to stats.h interface. Nor do I think we should be exposing ThreadLocalHistogramImpl. Can't you put the methods you need into the interface and use abstract classes except in the impl itself?

Scanning through stats.h, there are actually a couple of references to 'ThreadLocal' in StoreRoot, but otherwise there shouldn't really be references to a particular implementation from an interface. There's some kind of abstraction leak here, IMO.

/**
* This is used to keep track of all worker thread local histograms.
*/
virtual void addTlsHistogram(TlsHistogramSharedPtr hist_ptr) PURE;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WDYT about making this a HistogramSharedPtr -- which references only the interface.

@@ -273,5 +273,34 @@ void RawStatData::initialize(absl::string_view key) {
name_[xfer_size] = '\0';
}

const std::vector<double> HistogramStatisticsImpl::supported_quantiles_ = {
0, 0.25, 0.5, 0.75, 0.90, 0.95, 0.99, 0.999, 1};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the comments in this PR about how to initialize std::vector, and https://github.com/envoyproxy/envoy/blob/master/STYLE.md

The Google C++ style guide points out that non-PoD static and global variables are forbidden. This includes types such as std::string. We encourage the use of the advice in the C++ FAQ on the static initialization fiasco for how to best handle this.

void HistogramStatisticsImpl::refresh(const histogram_t* new_histogram_ptr) {
for (size_t i = 0; i < computed_quantiles_.size(); ++i) {
computed_quantiles_[i] = 0;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is fine; you could also use std::fill(computed_quantiles_.begin(), computed_quantiles_.end(), 0.0); which is fewer lines. Up to you.

FYI in practice what you had earlier with clear() had two potential problems with it:

  • it would be legal for std::vector to release the backing store for .data() after the clear(), in which case you might possibly SEGV. But it's also possible std::vector would retain the storage until the next operation. This should also be caught by running CI with a debug STL. See envoy has 33 failing tests when run in an STL debug mode #2556 for why we are not doing this yet.
  • I don't think clear() would have actually had this behavior of resetting the values to zero. So it's surprising that it would work in a test. But maybe they happened to be zero anyway. Which test was catching this?

Copy link
Contributor Author

@ramaraochavali ramaraochavali Apr 16, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have changed it to use the way you suggested. All the merge tests execute this logic. I think even though clear does not set it to 0, the hist_accumulate function might be updating it with the latest computed value. But I agree it is safe to initialize to zero and also added the ASSERT as you suggested in other comment.

for (size_t i = 0; i < computed_quantiles_.size(); ++i) {
computed_quantiles_[i] = 0;
}
hist_approx_quantile(new_histogram_ptr, supported_quantiles_.data(), supported_quantiles_.size(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually I'd put:
ASSERT(supported_quantiles_.size() == computed_quantiles_.size());
as that is what's being assumed in this line, without it being locally obvious that it must be the case.

HistogramStatisticsImpl(const histogram_t* histogram_ptr);

std::string summary() const override;
const std::vector<double>& supportedQuantiles() const override { return supported_quantiles_; }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably this method is where you'll want to lazy-initialize supported_quantiles_, and you should probably un-inline it.

if (params.size() == 0) {
// No Arguments so use the standard.
for (auto stat : all_stats) {
response.add(fmt::format("{}: {}\n", stat.first, stat.second));
}
for (auto histogram : all_histograms) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I misunderstood your comment. What if interval_value is NaN and cumulative_value is not, or vice versa? Does the algorithm guarantee that can't happen?


EXPECT_EQ(2, validateMerge());

store_->shutdownThreading();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was just thinking about calling initializeThreading() in SetUp(), and putting InSequence in the class object (or maybe a histogram-related subclass. I wasn't thinking about initiating the histograms themselves as they are not all common.

But for TearDown, can't you do all these 3 lines? They seem pretty common. I'm not sure about the EXPECT_CALL below...could that be done in TearDown, or in the test body, which would re-order it?


// Validate the summary format is as expected.
std::string expected_summary = "P0: 0, P25: 14, P50: 44, P75: 420, P90: 3220, P95: 3260, P99: "
"3292, P99.9: 3299.2, P100: 3300";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand the math here. None of the inputs are over 3200; why is P100=3300?

Can you add a simpler example and/or explain in a code comment why the value is that way?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added a simple test case now and validated that the quantile outs are as per the test here .

@postwait can help with the maths aspect here?

Signed-off-by: Rama <rama.rao@salesforce.com>
Signed-off-by: Rama <rama.rao@salesforce.com>
Signed-off-by: Rama <rama.rao@salesforce.com>
Signed-off-by: Rama <rama.rao@salesforce.com>
@@ -80,6 +80,19 @@ class StatsThreadLocalStoreTest : public testing::Test, public RawStatDataAlloca
store_->addSink(sink_);
}

void setUp() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

void SetUp() override { ....

then you can remove the explicit calls from the test methods. It has to be applicable to all the test-methods in the class, so you might want to make a subclass just for the histograms.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a added a new sub class and made this override and removed explicit calls. similar change for TearDown.

store_->initializeThreading(main_thread_dispatcher_, tls_);
}

void tearDown() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

void TearDown() override { ....

*all_threads_complete = true;
});

EXPECT_TRUE(*all_threads_complete);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks racy. runOnAllThreads returns instantly, calling the 'done' callback when it is done. The pattern I expect here is for you to establish a condition variable in your test, which you wait on here, and signal that condvar on the done callback.

You can also spin-loop on all_threads_complete since it is an atomic, but IMO it's better to use a condvar and make all_threads_complete a simple bool (not atomic, not shared).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have changed the test to use condvar.

// Ensure that the thread local call back and all_thread_complete call back are called.
std::shared_ptr<std::atomic<bool>> all_threads_complete =
std::make_shared<std::atomic<bool>>(false);
std::shared_ptr<std::atomic<uint64_t>> thread_local_calls =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this needs to be shared; it's owned by the test method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think earlier we ran in to issue if it is not shared with TSAN build, because multiple thread update it may be. That is why we changed it to shared.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's because before you put the condvar in, the test method could exit while threads were active. The condvar solves that properly, so you shouldn't need to share it anymore.

Signed-off-by: Rama <rama.rao@salesforce.com>
Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ramaraochavali thank you for your patience here. I think we are very close. This is looking awesome. I dropped in a few more comments. I still haven't looked through tests. @jmarantz LMK when the tests pass muster for you and I will skim though them.

@mrice32 can you also look at this point?

void refresh(const histogram_t* new_histogram_ptr);

private:
static const std::vector<double> supported_quantiles_;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this used anymore? I don't think so?

@@ -446,6 +482,7 @@ class IsolatedStoreImpl : public Store {
// Stats::Store
std::list<CounterSharedPtr> counters() const override { return counters_.toList(); }
std::list<GaugeSharedPtr> gauges() const override { return gauges_.toList(); }
std::list<ParentHistogramSharedPtr> histograms() const override { return parent_histograms_; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: You don't need the member variable, just returning {} should work.

std::unordered_set<std::string> names;
std::unique_lock<std::mutex> lock(lock_);
// TODO(ramaraochavali): As histograms don't share storage, there is a chance of duplicate names
// here. We need to process global storage for histograms similar to how we have a central storage
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"We need process global storage..."

if (names.insert(hist_name).second) {
ret.push_back(parent_hist);
} else {
ENVOY_LOG(debug, "duplicate histogram {}.{}", scope->prefix_, hist_name);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that this will potentially have an effect on output, should this one actually be a warn? I could go either way on this. It's conceivable this would happen in the data path, though really it should only happen at config load time, so I'm not that worried about log spam.

You might also make the log message something like "duplicate histogram {}.{}: data loss will occur on output".

Can you also open a tracking issue for this? I have ideas on how to fix, but given that this won't effect too many people I don't think it's super urgent (ironically it will for sure effect us at Lyft since we share stats across listeners in certain cases).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to warn and added this issue #3098

for (const TlsHistogramSharedPtr& tls_histogram : tls_histograms_) {
tls_histogram->merge(interval_histogram_);
}
histogram_t* hist_array[1];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think my comment was lost, but I'm sure you can do lock.unlock() before this line. The lock only needs to protect tls_histograms_. I would add a comment along these lines.

* current_active index via which it writes to the correct histogram.
* - When all workers have done, the main thread continues with the flush process where the
* "actual" merging happens.
* - As the active histograms are swapped in TLS histograms, On the main thread, we can be sure
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: s/On/on

@@ -62,8 +166,11 @@ class ThreadLocalStoreImpl : public StoreRoot {
};

// Stats::Store
// TODO(ramaraochavali): Consider changing the implementation of these methods to use vectors and
// use std::sort, rather than inserting into a map and pulling it out for better performance
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: period end of sentence.

for (const Stats::CounterSharedPtr& counter : server_.stats().counters()) {
all_stats.emplace(counter->name(), counter->value());
}

for (const Stats::GaugeSharedPtr& gauge : server_.stats().gauges()) {
all_stats.emplace(gauge->name(), gauge->value());
}
const std::list<Stats::ParentHistogramSharedPtr>& histograms = server_.stats().histograms();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

technically you are storing a reference to a temporary here. Just copy. The compiler will use RVO.

if (params.size() == 0) {
// No Arguments so use the standard.
for (auto stat : all_stats) {
response.add(fmt::format("{}: {}\n", stat.first, stat.second));
}
for (auto histogram : all_histograms) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ramaraochavali my concern about the text output version is that it is not good for 1-line grep, which I think is a common use case (at least it is for me). It might be ugly, but can we figure out a format for the text version which has everything on a single line so we can grep for the stat?


rapidjson::Value quantile_array(rapidjson::kArrayType);

for (size_t i = 0; i < histogram->intervalStatistics().supportedQuantiles().size(); ++i) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah +1 I think we can optimize. Right now since we don't have locked protos (or any protos) for the JSON output, I think we can change this stuff at any time. I would put in a TODO to think about optimizing when we switch over to protos for all admin JSON output.

EXPECT_EQ(*thread_local_calls, 1);
*all_threads_complete = true;
all_threads_complete_ = true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you need to hold the mutex while modifying all_threads_complete_ and notifying the condvar. I'd do that using a std::unique-lock scoped around the those two lines.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done. please check once

});

EXPECT_TRUE(*all_threads_complete);
std::unique_lock<std::mutex> lock(cv_mutex_);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just hold this over the cv_.wait call and not during the shutdown calls.

Copy link
Contributor Author

@ramaraochavali ramaraochavali Apr 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done. Must be sleeping when i coded this

@@ -47,6 +47,9 @@ class ThreadLocalInstanceImplTest : public testing::Test {
InstanceImpl tls_;
Event::MockDispatcher main_dispatcher_;
Event::MockDispatcher thread_dispatcher_;
std::condition_variable cv_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you spell out "condvar_" in the variable rather this abbreviation?

https://google.github.io/styleguide/cppguide.html#General_Naming_Rules

More importantly, this is only used in one test, so can you move these variables into the test? You could put them in a local struct there if you like, so that it's easier to capture them all at once for the nested function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved them to a local struct and renamed the variables. Can you PTAL at this specific test case and see if that is what you are looking for?

// Ensure that the thread local call back and all_thread_complete call back are called.
std::shared_ptr<std::atomic<bool>> all_threads_complete =
std::make_shared<std::atomic<bool>>(false);
std::shared_ptr<std::atomic<uint64_t>> thread_local_calls =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's because before you put the condvar in, the test method could exit while threads were active. The condvar solves that properly, so you shouldn't need to share it anymore.


EXPECT_EQ(1, validateMerge());

std::string expected_summary = "P0: 1, P25: 1.025, P50: 1.05, P75: 1.075, P90: 1.09, P95: 1.095, "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a weird artifact of the algorithm extrapolating off a degenerate case. Can I ask you to add another case where you add an even distribution of all integer values from 1 to 100 inclusive? I'm curious what you'll get. Maybe

PO: not sure....
P25: 25
P50: 50
P75: 75
P90: 90
P95: 95
P99: 99
P100: 100

or something close to it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added that test case and this is what we get
P0: 0, P25: 25, P50: 50, P75: 75, P90: 90, P95: 95, P99: 99, P99.9: 99.9, P100: 100

store_->addSink(sink_);
}

void SetUp() override {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect you want the InSequence object to be declared as a the test class member var, rather than a local member, I confess I don't know what that does. I'm suggesting that because maybe it enforces some deterministic threading model, but I'm not sure.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am also not sure on this. So just leaving it as is.

class HistogramTest : public testing::Test, public RawStatDataAllocator {
public:
HistogramTest() {
ON_CALL(*this, alloc(_)).WillByDefault(Invoke([this](const std::string& name) -> RawStatData* {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove the ctor and move these lines to SetUp().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}

const std::vector<double>& HistogramStatisticsImpl::supportedQuantiles() const {
static const std::vector<double> supported_quantiles_ = {0, 0.25, 0.5, 0.75, 0.90,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

local var should not have _ suffix.

}
histogram_t* hist_array[1];
hist_array[0] = interval_histogram_;
hist_accumulate(cumulative_histogram_, hist_array, ARRAY_SIZE(hist_array));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can just write

hist_accumulate(cumulative_histogram_, &interval_histogram_, 1);

and skip setting up the temp array

Signed-off-by: Rama <rama.rao@salesforce.com>
Signed-off-by: Rama <rama.rao@salesforce.com>
@ramaraochavali
Copy link
Contributor Author

@mattklein123 addressed all the comments. PTAL.

Signed-off-by: Rama <rama.rao@salesforce.com>
bool all_threads_complete_;
};

thread_local_verification_info thread_local_data;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you are going to define a type it should be called ThreadLocalVerificationInfo, but you don't need to define a type. Just do:

struct {
  std::atomic<uint64_t> thread_local_calls_{0};
  std::condition_variable condvar_;
  std::mutex condvar_mutex_;
  bool all_threads_complete_;
} thread_local_data;

thread_local_data.all_threads_complete_ = true;
thread_local_data.condvar_.notify_one();
});
std::unique_lock<std::mutex> lock(thread_local_data.condvar_mutex_);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

put the mutex lock and the wait into a scope, so the lock is released before you call shutdownGlobalThreading. This sounds familiar, right? I am never sure if all my github comments are getting through.

In general hold locks for the minimum amount of time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i did not get this specific comment. Will change now.

Copy link
Contributor

@jmarantz jmarantz Apr 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment was "just hold this over the cv_.wait call and not during the shutdown calls." and your response was "done. Must be sleeping when i coded this".

If you could iterate through the PR and look at all the comments exposed by "show outdated" links it would reduce the number of comments that wind up getting repeated, and speed up iteration.

std::unique_lock<std::mutex> lock(thread_local_data.condvar_mutex_);
thread_local_data.condvar_.wait(
lock, [&thread_local_data] { return thread_local_data.all_threads_complete_; });

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You had EXPECT_TRUE(thread_local_data.all_threads_complete) here, right? Where did that go?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch. Some how got deleted looks like. Will add it back

std::string h1_expected_summary =
"P0: 1, P25: 1.025, P50: 1.05, P75: 1.075, P90: 1.09, P95: 1.095, "
"P99: 1.099, P99.9: 1.0999, P100: 1.1";
std::string h2_expected_summary =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make these locals const. or just move the literals into their single use.

BTW I'm not really sure why you need the degenerate case; seems like you are testing behavior which is really at the boundary of sanity, and probably not that important. But it's nice to see that the even-distribution case is working.

Copy link
Contributor Author

@ramaraochavali ramaraochavali Apr 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i want to verify that case as well. I will leave it like that. I will make those variable local const.

expectCallAndAccumulate(h2, 1);
expectCallAndAccumulate(h2, 2);

EXPECT_EQ(2, validateMerge());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are not really validating that the merge did something sane because you are not showing the summary. WDYT about adding some data to both histograms so that, when merged, you get a result you expect?

Maybe have one histogram with [1, 1, 0, 0, 0, 0, 0, 0, 1, 1]; and see what the summary for that is. Make another histogram with [0, 0, 1, 1, 1, 1, 1, 1, 0, 0]. and see what the summary for that is. EXPECT_EQ on both of those. Then merge them, and EXPECT_EQ on the summary for that. Does that make sense?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the merge logic on our side is validated by the MultiMerge test. Are you trying to validate the libcircl hist_accumulate function? I tested that earlier IIRC and they were not same probably because of the insertion order. They will be close but little different. I will create a test case and send for @postwait's review.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no test called 'MultiMerge' but there are a number of tests that do merges, which check that the number of merged histograms is as expected. However, there's none I can see that looked at the values in the merged histogram. I scanned your message to @postwait and it might be reasonable for there to be some floating point sloppiness, in which case you could do inexact data matching in the test.

But it seems wrong to have no functional testing at all of the merged data.

I'm also fine with changing the data from what I suggested above to something that's easier to reason about in the context of the math behind circlehist.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jmarantz Now I added the test BasicHistogramMergeSummary that validates two set of values 1-50 and 51-100 and validates the merged summary is equal to [1-100].

* that can be asserted later.
*/
uint64_t validateMerge() {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove blank line here.

h2_interval_values;

/**
* Validates taht Histogram merge happens as desired and returns the processed histogram count
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/taht/that/


rapidjson::Value quantile_array(rapidjson::kArrayType);

for (size_t i = 0; i < histogram->intervalStatistics().supportedQuantiles().size(); ++i) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right a TODO should be here -- actually a warning not to use the data until another round of optimization.

if (params.size() == 0) {
// No Arguments so use the standard.
for (auto stat : all_stats) {
response.add(fmt::format("{}: {}\n", stat.first, stat.second));
}
for (auto histogram : all_histograms) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this format LGTM; easy to parse with a shell script if needed.

EXPECT_CALL(*this, alloc("stats.overflow"));
store_.reset(new ThreadLocalStoreImpl(*this));
store_->addSink(sink_);
InSequence s;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I admit I've never been a huge gmock user. But I looked up InSequence and I suspect this usage model is not quite right. Check out https://github.com/google/googletest/blob/master/googlemock/docs/ForDummies.md#ordered-vs-unordered-calls

I think this only affects what's in the scope of that declaration, which in this case is only one call. I would think it should either be declared at the top of SetUp or as a class member var.

Signed-off-by: Rama <rama.rao@salesforce.com>
Signed-off-by: Rama <rama.rao@salesforce.com>
@ramaraochavali
Copy link
Contributor Author

@jmarantz reg the json compact issue, I moved the TODO to the right place and created a envoy issue. Not sure if we want to warn not to use that till next optimization. @mattklein123 WDYT?

@ramaraochavali
Copy link
Contributor Author

ramaraochavali commented Apr 17, 2018 via email

Copy link
Contributor

@jmarantz jmarantz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for sticking through this. Mostly nits on my part now, except for the lack of a functional test for the merged histogram data.

I still want to look at the threading model -- I haven't dived into that yet, and I want to mainly to learn myself more about Envoy threading. I expect @mattklein123 will scrutinize that.

I raised an issue with github on why this PR unicorns...it's not that big. But I would suggest, if we need to clone it again, to squash the commits into one. We'd be losing the comments anyway, I think. But maybe the high commit-count is what blows up GH.

HistogramStatisticsImpl h2_interval_statistics(hist2_interval);

for (const Stats::ParentHistogramSharedPtr& histogram : histogram_list) {
if (histogram->name() == "h1") {
Copy link
Contributor

@jmarantz jmarantz Apr 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK this pattern of looping over the histograms is a bit awkward and I think it might be worth having a test-class helper-method:

typedef std::map<std::string, Stats::ParentHistogramSharedPtr> NameHistogramMap
NameHistogramMap makeHistogramMap(list) { ... };

with usage in tests

const Stats::ParentHistogramSharedPtr& h1 = histogram_map["h1"];
EXPECT_EQ(h1->cumulativeStatistics().summary(), h1_cumulative_statistics.summary());
...

rather than looping and if/elseing the cases. If it were just one of these then it might not be worth it, but there are several, here & below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

EXPECT_CALL(sink_, onHistogramComplete(Ref(histogram), record_value));
histogram.recordValue(record_value);

if (histogram.name().find("h1") != std::string::npos) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ping

@@ -82,6 +82,43 @@ TEST_F(ThreadLocalInstanceImplTest, All) {
tls_.shutdownThread();
}

// Validate ThreadLocal::runOnAllThreads behavior with all_thread_complete call back.
TEST_F(ThreadLocalInstanceImplTest, RunOnAllThreads) {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rm blank line

TEST_P(MetricsServiceIntegrationTest, BasicFlow) {
initialize();
// Send an empty request so that histogram values merged for cluster_0.
codec_client_ = makeHttpConnection(makeClientConnection((lookupPort("http"))));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extra parens around lookupPort

expectCallAndAccumulate(h2, 1);
expectCallAndAccumulate(h2, 2);

EXPECT_EQ(2, validateMerge());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no test called 'MultiMerge' but there are a number of tests that do merges, which check that the number of merged histograms is as expected. However, there's none I can see that looked at the values in the merged histogram. I scanned your message to @postwait and it might be reasonable for there to be some floating point sloppiness, in which case you could do inexact data matching in the test.

But it seems wrong to have no functional testing at all of the merged data.

I'm also fine with changing the data from what I suggested above to something that's easier to reason about in the context of the math behind circlehist.

Copy link
Member

@mrice32 mrice32 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry about the delay. Just starting to look through this - here are a few surface-level comments. Really cool stuff! Awesome work!

}

const std::vector<double>& HistogramStatisticsImpl::supportedQuantiles() const {
static const std::vector<double> supported_quantiles_ = {0, 0.25, 0.5, 0.75, 0.90,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: supported_quantiles_ -> supported_quantiles

// here. We need to process global storage for histograms similar to how we have a central storage
// in shared memory for counters/gauges.
for (ScopeImpl* scope : scopes_) {
for (auto name_histogram_pair : scope->central_cache_.histograms_) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: why not auto &?

tls_->runOnAllThreads(
[this]() -> void {
for (ScopeImpl* scope : scopes_) {
for (auto name_histogram_pair :
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: why not auto &?

void recordValue(uint64_t value) override;

bool used() const override { return flags_ & Flags::Used; }
void beginMerge() override { current_active_ = 1 - current_active_; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

}

void ThreadLocalHistogramImpl::merge(histogram_t* target) {
hist_accumulate(target, &histograms_[1 - current_active_], 1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make an inline class method "otherHistogramIndex()" for 1 - currentActive_, and also save it in a temp and use it in both places in this function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

bool used() const override { return flags_ & Flags::Used; }
void beginMerge() override {
// this switches the current_active_ between 1 and 0.
current_active_ = 1 - current_active_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

factor out '1 - current_active_' here too.

Signed-off-by: Rama <rama.rao@salesforce.com>
@ramaraochavali
Copy link
Contributor Author

Closing this and will create another PR as this is taking time to load getting unicorn for most of the folks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants