Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stats: add built-in log linear histogram support #2932

Conversation

ramaraochavali
Copy link
Contributor

@ramaraochavali ramaraochavali commented Mar 29, 2018

Description:
Implements built-in log linear histogram. Fixes the issue #1965

Risk Level: High

Testing:
Unit, Integration and Manual testing.

Docs Changes:
NA

Release Notes:

Signed-off-by: Rama <rama.rao@salesforce.com>
Signed-off-by: Rama <rama.rao@salesforce.com>
@ramaraochavali
Copy link
Contributor Author

@mattklein123 attempted the approach that we talked about in #1965. Can you PTAL? Still need to

  • Flush the histogram data to sinks
  • Correct and add tests

void ThreadLocalStoreImpl::mergeHistograms() {
if (!shutting_down_) {
std::unique_lock<std::mutex> lock(lock_);
// TODO: Wait till all threads have been checked-in
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mattklein123 I need your help here. How do I wait till all worker threads have finished? Is there a standard pattern in c++/envoy code here that I can use? Tried to look at latch/barrier support but could not find such thing. This is the only thing pending from basic implementation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a perfect world you could add barrier/notify functionality into the TLS itself, e.g., runOnAllThreadsBarrier(...). If you don't want to do that, I would just add a function to TLS to get the number of workers, then in the run on all threads lambda, decrement an atomic counter. Whichever wins the race to 0 can post back to the main thread.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the method in TLS system.

@mattklein123
Copy link
Member

@ramaraochavali sorry for the delay I will spend some devoted time to this tomorrow. @mrice32 if you have time it would also be good for you to take a quick look see.

Signed-off-by: Rama <rama.rao@salesforce.com>
Signed-off-by: Rama <rama.rao@salesforce.com>
@@ -101,7 +101,7 @@ TEST_F(StatsThreadLocalStoreTest, NoTls) {
EXPECT_EQ(&g1, &store_->gauge("g1"));

Histogram& h1 = store_->histogram("h1");
EXPECT_EQ(&h1, &store_->histogram("h1"));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since we the new approach, if tls is not there we are creating new one instead of returning central ref, this condition fails. I commented it for now and open for discussion on this.

Signed-off-by: Rama <rama.rao@salesforce.com>
@@ -135,6 +135,26 @@ class MetricsServiceSink : public Sink {
gauage_metric->set_value(value);
}

void flushHistogram(const Histogram& histogram) override {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as discussed, i am just pushing the summary quantiles not rich histogram in the first iteration. If some one is interested, I am open to publishing rich histogram data as well.

@ramaraochavali
Copy link
Contributor Author

@trabetti fyi. still WIP. But you can follow this PR

Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this is great and I think what we want to have. Very exciting I dropped in a bunch of comments to get started. This will need multiple review passes but we are definitely on the right track. Awesome!!

/**
* Flush a histogram.
*/
virtual void flushHistogram(const Histogram&){};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PURE (have every sink implement, even if they ignore).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes.. will change them. added them like that so that i can quickly test with out changing tests.

@@ -263,6 +268,11 @@ class Store : public Scope {
* @return a list of all known gauges.
*/
virtual std::list<GaugeSharedPtr> gauges() const PURE;

virtual std::list<HistogramSharedPtr> histograms() const {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same, PURE, every store should implement.

@@ -294,6 +304,8 @@ class StoreRoot : public Store {
* down.
*/
virtual void shutdownThreading() PURE;

virtual void mergeHistograms() {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same: please also add doc comments for all interface functions (above also).

void flushHistogram(const Histogram& histogram) override {
// ParentHistogramSharedPtr parent_histogram =
// std::dynamic_pointer_cast<HistogramParentImpl>(&histogram);
const HistogramParentImpl& parent_histogram =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't dynamic cast here. Add whatever interface functions you need.

@@ -74,6 +75,10 @@ class ThreadLocalStoreImpl : public StoreRoot {
ThreadLocal::Instance& tls) override;
void shutdownThreading() override;

void mergeHistograms() override;

void mergeInternal();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

private

for (ScopeImpl* scope : scopes_) {
for (auto histogram : tls_->getTyped<TlsCache>().scope_cache_[scope].histograms_) {
TlsHistogramSharedPtr tls_histogram_ptr =
std::dynamic_pointer_cast<ThreadLocalHistogramImpl>(histogram.second);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove dynamic_cast (by storing impls if needed). I will stop commenting but basically the change should have no dynamic_casts.

}
}
});
if (main_thread_dispatcher_) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think main_thread_dispatcher_ must be non-nullptr here. I would skip this check? Obviously this also needs to get moved inside the run on all threads lambda above, see my comment above.

@@ -139,6 +145,7 @@ void InstanceImpl::flushStats() {
server_stats_->days_until_first_cert_expiring_.set(
sslContextManager().daysUntilFirstCertExpires());

stats_store_.mergeHistograms();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do this, the merge will essentially always be one interval behind the flush interval. Is this what we want? Do we want to wait into the merge is complete? If so, my thinking was to have mergeHistograms() take a lambda which would fire when the merge is complete, and then the flush would complete.

Also, on this topic, for thought: I do wonder if the histogram merge interval should be separately configurable from the main stat flush interval. (E.g., let's say you wanted to flush stats/gauges every 500ms, you might want to wait longer to get more histogram data per interval). I would put in a TODO to consider this in the future.

Copy link
Contributor Author

@ramaraochavali ramaraochavali Mar 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not get the first part. Since we block till all tls histograms are merged then main merge goes in sequentially - is the lambda still required? I may be missing some thing

@@ -139,6 +145,7 @@ void InstanceImpl::flushStats() {
server_stats_->days_until_first_cert_expiring_.set(
sslContextManager().daysUntilFirstCertExpires());

stats_store_.mergeHistograms();
InstanceUtil::flushCountersAndGaugesToSinks(config_->statsSinks(), stats_store_);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you rename this function to something more appropriate? flushMetricsToSinks() ?

void ThreadLocalStoreImpl::mergeHistograms() {
if (!shutting_down_) {
std::unique_lock<std::mutex> lock(lock_);
// TODO: Wait till all threads have been checked-in
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a perfect world you could add barrier/notify functionality into the TLS itself, e.g., runOnAllThreadsBarrier(...). If you don't want to do that, I would just add a function to TLS to get the number of workers, then in the run on all threads lambda, decrement an atomic counter. Whichever wins the race to 0 can post back to the main thread.

dynamic_cast<const HistogramParentImpl&>(histogram);
io::prometheus::client::MetricFamily* metrics_family = message_.add_envoy_metrics();
metrics_family->set_type(io::prometheus::client::MetricType::SUMMARY);
metrics_family->set_name(parent_histogram.name());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

feels like this should go into the .cc

@@ -23,6 +23,10 @@

#include "absl/strings/string_view.h"

extern "C" {
#include <circllhist.h>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#include "circlehist.h"

<> should be used for system includes (e.g. included with the OS or compiler). Also I would've thought this should be in a subdir.

struct HistogramStatistics {
public:
HistogramStatistics(histogram_t* histogram_ptr) {
double quantile_in[] = {0, 0.25, 0.5, 0.75, 0.90, 0.95, 0.99, 0.999, 1};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

factor out this constant so it only appears once. I see it also in grpc_metrics_service_impl.h:149

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i am not able to make it const - because of my c++ knowledge levels :-). But made it class variable and used it every where.

quantile_out_[4], quantile_out_[5], quantile_out_[6], quantile_out_[7],
quantile_out_[8]);
}
double quantile_out_[9];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

double quantile_out_[ARRAY_SIZE(quantile_in)];

}

void beginMerge() {
if (current_active_ == 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: current_active_ = 1 - current_active_;

void addTlsHistogram(HistogramSharedPtr hist_ptr) { tls_histograms_.emplace_back(hist_ptr); }

void merge() {
hist_clear(interval_histogram_);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you comment about the threading model here? or maybe just point to Matt's design in github?

}

HistogramStatistics getIntervalHistogramStatistics() const {
HistogramStatistics interval_histogram(interval_histogram_);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggest: return HistogramStatistics(interval_histogram_);
...just for brevity

}

HistogramStatistics getCumulativieHistogramStatistics() const {
HistogramStatistics cumulative_histogram(cumulative_histogram_);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto.

for (ScopeImpl* scope : scopes_) {
for (auto histogram : scope->central_cache_.histograms_) {
if (names.insert(histogram.first).second) {
ret.push_back(histogram.second);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably worth a note here that the histograms will come back in arbitrary order.

suggestion: return a vector rather than a list, and build that vector after accumulating the set, so that:
a) you can let go of the lock before converting to a vector
b) you can do vector.reserve(names_.size()) before building it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jmarantz have not addressed this yet. will address in next round. :-)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realize on re-read that there's an established pattern of using lists rather than for returning collections of stats. @mattklein123 -- I'm curious why that choice?

But for this PR I retract this suggestion -- it probably doesn't save that much to release the lock earlier. I would like to see some comments as to why this de-duping is necessary though; that seems like something must have gone wrong upstream for it to be needed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't remember why this was a list. I think this was historical. Free free to return a vector if you want to change the various usages.

re: The de-dup, that is required due to the threaded store uses scopes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we incorporate the scopes into the histogram names somehow, so we have distinct ones for each scope? It seems like just dropping them would lose data.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm yes, this is a problem for histograms which don't use shared mem backing store. It's not an issue for counters/gauges. Good catch. This will need to be fixed somehow.

Copy link
Contributor

@jmarantz jmarantz Apr 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool... @ramaraochavali you can maybe leave a TODO here, and maybe even a warning log that we are dropping content?

@ambuc See also #2453

Copy link
Contributor Author

@ramaraochavali ramaraochavali Apr 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jmarantz I have added a todo and not changing the return type now based on your comment above about "I retract this suggestion". Is that fine?

std::unordered_set<std::string> names;
std::unique_lock<std::mutex> lock(lock_);
for (ScopeImpl* scope : scopes_) {
for (auto histogram : scope->central_cache_.histograms_) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The style-guide (https://google.github.io/styleguide/cppguide.html#auto) leaves some room for interpretation, but my reading is that in this case and a few others in this PR, you should use explicit types.

If you were using iterator objects, those are good candidates for 'auto' because their types are fairly obvious and not necessarily important to spell out.

e..g.

for (auto it = collection.begin(); it != collection.end(); ++it) {
  explicit_type& val = *it;
  ...
};

or

for (explicit_type& val : collection) {
  ...
};

But it's possible there's an established style here I'm going against the grain in suggesting.

Copy link
Contributor Author

@ramaraochavali ramaraochavali Apr 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am changing it like this to be type explicit. Is that fine?

  for (ScopeImpl* scope : scopes_) {
    for (auto histogram : scope->central_cache_.histograms_) {
      **const std::string& hist_name = histogram.first;
      const ParentHistogramSharedPtr& parent_hist = histogram.second;**
      if (names.insert(hist_name).second) {
        ret.push_back(parent_hist);
      }
    }
  }

Signed-off-by: Rama <rama.rao@salesforce.com>
Signed-off-by: Rama <rama.rao@salesforce.com>
Signed-off-by: Rama <rama.rao@salesforce.com>
@ramaraochavali
Copy link
Contributor Author

ramaraochavali commented Mar 31, 2018

@mattklein123 ready for another pass.

Signed-off-by: Rama <rama.rao@salesforce.com>
Signed-off-by: Rama <rama.rao@salesforce.com>
Signed-off-by: Rama <rama.rao@salesforce.com>
Signed-off-by: Rama <rama.rao@salesforce.com>
Signed-off-by: Rama <rama.rao@salesforce.com>
Signed-off-by: Rama <rama.rao@salesforce.com>
Signed-off-by: Rama <rama.rao@salesforce.com>
Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ramaraochavali definitely on the right track. I left my next dump of comments. Let's work on these and then I will take another pass. Thank you!


genrule(
name = "circllhist_config",
srcs = ["src/circllhist_config.h.in"],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we follow up on @jmillikin-stripe comment here? It does seem weird that the generated header is just being copied. Can you add a comment as to why this is OK if it is? cc @postwait

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The _config.h.in is an autoconf input template. It is not safe to copy it on all platforms. There is a

#ifdef HAVE_ALLOCA_H

in the circllhist.c file that needs that that conditionally set depending on system headers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@postwait thanks. What is your suggestion here? Should I copy based on platform? or which platforms it is good to copy?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

libcircllhist (like many open source projects) is auto-tools based. When you build it, it will build-time detect which features are available on your platform and act on that. I don't know how to do the same thing in cmake. I am pretty sure that HAVE_ALLOCA_H is the only important thing in that header though -- so if you can get cmake to provide that, you're set.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at this, it is only alloca() that is a problem. I've opened a PR on libcircllhist to resolve this. It should be possible shortly to just ignore that header completely.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@postwait thank you so much.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@postwait Awesome. Thanks. I got the new version and removed the genRule and it works.

/**
* Returns the Histogram Summary Statistics for the flush interval.
*/
virtual HistogramStatistics intervalStatistics() const PURE;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not actually sure how this compiles. I think it's luck if HistogramStatistics is returned by value. We either need to define HistogramStatistics in this file, or I think probably more optimally, create an abstract interface for HistogramStatistics and return by const ref. I'm not sure which one will be easier. (The question on whether we can return by const ref is not really dependent on whether we use an abstract interface or not.)

virtual void merge() PURE;

/**
* Returns the Histogram Summary Statistics for the flush interval.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

super nit: don't capitalize "Histogram Summary Statistics"

virtual HistogramStatistics intervalStatistics() const PURE;

/**
* Returns the Cumulative Histogram Summary Statistics.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

super nit: don't capitalize "Cumulative Histogram Summary Statistics"

/**
* Flush a histogram.
*/
virtual void flushHistogram(const Histogram&) PURE;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: const Histogram& histogram


bool used() const override { return flags_ & Flags::Used; }

HistogramStatistics intervalStatistics() const override { return HistogramStatistics(); }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still confused here about how we are returning the summary data...


Store& parent_;
std::list<TlsHistogramSharedPtr> tls_histograms_;
histogram_t* interval_histogram_;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think in looking at this now I would make HistogramStatistics an interface, then where you need it, have an impl that wraps the circl implementation, which you can then return via const reference.

void InstanceImpl::runOnAllThreadsWithBarrier(Event::PostCb cb, Event::PostCb main_callback) {
ASSERT(std::this_thread::get_id() == main_thread_id_);
ASSERT(!shutdown_);
worker_counter_ = registered_threads_.size();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not safe for concurrent uses of this function. Instead of using a member variable, allocate an atomic inside a shared_ptr, and capture it in the posted lambdas.

Also, this new function needs explicit tests in thread_local_impl_test

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it be used concurrently since it is allowed to be called only from main thread? Any way I will try with shared_ptr as it seems better compared to member variable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have changed it to use shared_ptr. I came across this http://floating.io/2017/07/lambda-shared_ptr-memory-leak/ - Should we worry about this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reg tests, I will add once the implementation comes to a sane stage.

Store& parent_;
};

/**
* Log Linear Histogram implementation per thread.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we move ThreadLocalHistogramImpl and HistogramParentImpl directly into thread_local_store. We are going to have to deal with some locking issues and I would like things to be more explicitly coupled. We can break it apart later if necessary.


bool used() const override {
bool any_tls_used = false;
for (auto tls_histogram : tls_histograms_) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All sues of tls_histograms_ have to be locked if we are going to do the used() stuff. I think what I would actually do is move your "merge_lock" into each histogram. and then just lock each function here that is needed. Let's go ahead and do that and I can review again. (Just trying to iteratively tighten stuff up so I can focus more on concurrency).

Copy link
Contributor Author

@ramaraochavali ramaraochavali Apr 3, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. Moved the merge_lock to histogram and locked the functions there. Looks like this is much cleaner. If you think locking is costly compared to just merging with out checking for used, we can do that. But I think would be better to check used while flushing though. WDYT?

Signed-off-by: Rama <rama.rao@salesforce.com>
Signed-off-by: Rama <rama.rao@salesforce.com>
Signed-off-by: Rama <rama.rao@salesforce.com>
Signed-off-by: Rama <rama.rao@salesforce.com>
@jmarantz
Copy link
Contributor

jmarantz commented Apr 5, 2018

@ramaraochavali There's a handful of comments I've left that don't look like they are addressed yet to me.

Can you take another pass over the 'Conversation' view, poking at the 'show outdated' links in GitHub?

Then I'll take another look. You can also DM me on Slack if you like and I can chat about C++ stuff if that helps.

Signed-off-by: Rama <rama.rao@salesforce.com>
@ramaraochavali
Copy link
Contributor Author

ramaraochavali commented Apr 6, 2018

@jmarantz I have looked at the conversations view and addressed all. Can you PTAL? Please let me know I missed addressing any comment by mistake.
I have modified tests to be more readable and pushed the commit. Please suggest what additional test cases need to be added?

Signed-off-by: Rama <rama.rao@salesforce.com>
Signed-off-by: Rama <rama.rao@salesforce.com>
for (const Stats::HistogramSharedPtr& histogram : histogram_list) {
if (histogram->name().find(hist_name) != std::string::npos) {
EXPECT_EQ(histogram->cumulativeStatistics().summary(), h1_cumulative_statistics.summary());
// EXPECT_EQ(histogram->intervalStatistics().summary(), h1_interval_statistics.summary());
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i will enable this check based on @postwait reply on the issue I asked him.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't find this communication. Can you send this issue to me directly? Or file an issue on libcircllhist?

Copy link
Contributor Author

@ramaraochavali ramaraochavali Apr 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@postwait Here is the issue
looks like found another corner case issue or may be i am misunderstanding.

Consider the code

histogram_t* main_thread_interval_hist = hist_alloc();
histogram_t* per_thread_interval_hist = hist_alloc();
hist_insert_intscale(per_thread_interval_hist, 1, 0, 1);
double in[9] = {0, 0.25, 0.5, 0.75, 0.90, 0.95, 0.99, 0.999, 1};
double out[9];
hist_approx_quantile(main_thread_interval_hist, in, 9,out);
histogram_t* hist_array[1];
hist_array[0] = per_thread_interval_hist;
hist_accumulate(main_thread_interval_hist, hist_array, ARRAY_SIZE(hist_array));
hist_clear(per_thread_interval_hist);
//hist_free(per_thread_interval_hist);
//per_thread_interval_hist = hist_alloc();
hist_array[0] = per_thread_interval_hist;
hist_insert_intscale(per_thread_interval_hist, 2, 0, 1);
//hist_clear(main_thread_interval_hist);
//hist_free(main_thread_interval_hist);
main_thread_interval_hist = hist_alloc();
hist_accumulate(main_thread_interval_hist, hist_array, ARRAY_SIZE(hist_array));
hist_approx_quantile(main_thread_interval_hist, in, 9,out);
std::cout<<"after accumulations :"<<out[0]<<"==> \n";

histogram_t* direct_hist = hist_alloc();
hist_insert_intscale(direct_hist, 2, 0, 1);
hist_approx_quantile(direct_hist, in, 9,out);
std::cout<<"direct :"<<out[0]<<"==> \n";

I expect direct and after accumulations to give the same output. But it gives 1 in first bin (rest of the bins are same) for accumulation case and 2 in the direct case.

If I change the code to free and hist_alloc it works correctly in both the cases. Is this is a corner case issue with hist_clear?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code as written should give 1 in the first case and 2 in the second.

You add 1 to a hist and accumulate into "main", then you add 2 into a cleared hist and accumulate into "main". main has an item in 1 and 2... q(0) == 1.

The second, you just add a 2... so q(0) == 2.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry confusing example of allocating without freeing. Hard to follow the example. I think you're right... There's an issue with empty bins.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed upstream.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@postwait thanks a lot. Updated the lib and it works great.

Signed-off-by: Rama <rama.rao@salesforce.com>
@ramaraochavali
Copy link
Contributor Author

@jmarantz @mattklein123 fyi, tested this manually today by running envoy and a test service continuously pushing requests for an hour. At the end of it the histograms are captured as follows

cluster.test.upstream_rq_time: 
 	 Interval:   P0: 0, P25: 0, P50: 0, P75: 0, P90: 0, P95: 1.01228, P99: 1.08246, P99.9: 1.09825, P100: 1.1 
 	 Cumulative: P0: 0, P25: 0, P50: 0, P75: 0, P90: 1.08335, P95: 2.07604, P99: 29.9808, P99.9: 33.4218, P100: 120

Signed-off-by: Rama <rama.rao@salesforce.com>
@ramaraochavali ramaraochavali changed the title [WIP] stats: add built-in log linear histogram support stats: add built-in log linear histogram support Apr 10, 2018
Signed-off-by: Rama <rama.rao@salesforce.com>
Signed-off-by: Rama <rama.rao@salesforce.com>
Copy link
Contributor

@jmarantz jmarantz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to understand how agnostic we want Envoy to be wrt to some of its dependencies. This binds Envoy's core source pretty tightly to circlehist, which might be the right thing in this case, but I know there are other histogram impls as well.

Part of the broad appeal of Envoy I thought in general was the ability to integrate it into lots of different environments. So it'd be good to have @htuch and @mattklein123 and @alyssawilk weigh in on this aspect for this PR.

@@ -114,6 +115,11 @@ class Metric {
* Returns the name of the Metric with the portions designated as tags removed.
*/
virtual const std::string& tagExtractedName() const PURE;

/**
* Indicates whether a metric has been used.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

has been used by what? What does 'used' mean here? From reading the code, does it mean non-empty? If so, could you call this empty() and invert it, per C++ conventions?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this actually is a method used in other Metric as well. Used here signifies whether it has been updated/a value has been set because of some thing happening like connection closing etc. This is helpful in flushing only metrics whose value has been modified. I just moved it to the Metric interface now because histogram also uses it. Earlier Counter and Gauge has this method. I am not sure if it is equivalent of empty.

Copy link
Contributor

@jmarantz jmarantz Apr 10, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK you shouldn't change the name as that would make the PR bigger.

But could you change to the comment something like:

  * Indicates whether the metric contains new data since the last flush().

@mattklein123 confirming that's accurate....

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is not since last flush, it indicates it has actually ever been set a value.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No? Then what is it used for? Above you said "flushing only metrics whose value has been modified."

Over time, couldn't you wind up with every metric in this state, if the bit isn't cleared during flush?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it works like this. If a metric value has ever been set it is considered "used" and it would be pushed to stats sinks on subsequent flusher irrespective of whether it has been updated in that interval or not. I think it is reasonable because and would allow only metrics that are really used in that envoy instance to be pushed rather than all metrics. @mattklein123 may have better idea on the history. So I am not changing any thing here.

@@ -294,6 +344,12 @@ class StoreRoot : public Store {
* down.
*/
virtual void shutdownThreading() PURE;

/**
* Called during the flush process to merge all the thread local histograms. The passed in
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, please clarify in the comment when the flush will complete, and in what thread.

HistogramStatisticsImpl(const HistogramStatisticsImpl&) = delete;
HistogramStatisticsImpl& operator=(HistogramStatisticsImpl const&) = delete;

std::string summary() const override {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

omve to cc file

public:
HistogramStatisticsImpl() : computed_quantiles_(supported_quantiles_.size(), 0.0) {}

HistogramStatisticsImpl(histogram_t* histogram_ptr)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move to cc file

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved

/**
* Clears the old computed values and refreshes it with values computed from passed histogram.
*/
void refresh(histogram_t* new_histogram_ptr) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move to cc file

interval_statistics_(interval_histogram_), cumulative_statistics_(cumulative_histogram_) {}

virtual ~HistogramParentImpl() {
hist_free(interval_histogram_);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all these non-trivial-bodies should be in cc file

@@ -91,6 +91,22 @@ void InstanceImpl::runOnAllThreads(Event::PostCb cb) {
cb();
}

void InstanceImpl::runOnAllThreadsWithBarrier(Event::PostCb cb,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does 'barrier' mean here, to a user of this method?

Of course you are using a shared std::atomic (nicely!) to track when to call the completion CB, but I don't really understand why the word 'barrier' wants to be in this method name.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 'barrier' was added to indicate that there is a barrier that waits for all threads to complete before calling the completion CB. If you have any better names I am open to changing this.

@@ -402,11 +404,21 @@ Http::Code AdminImpl::handlerStats(absl::string_view url, Http::HeaderMap& respo
all_stats.emplace(gauge->name(), gauge->value());
}

for (const Stats::HistogramSharedPtr& histogram : server_.stats().histograms()) {
all_histograms.emplace(histogram->name(),
fmt::format("\n \t Interval: {} \n \t Cumulative: {}",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: no need for space before second \n or the spaces around the tabs.

void validateMerge(std::string hist_name, std::vector<uint64_t> h1_cumulative_values,
std::vector<uint64_t> h2_cumulative_values,
std::vector<uint64_t> h1_interval_values,
std::vector<uint64_t> h2_interval_values, bool single_histogram = false) {
Copy link
Contributor

@jmarantz jmarantz Apr 10, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't give a default value to 'single_histogram'; it is not obvious why it should default to false, and you have no existing callers since this is a new method.

You could also just return the number of histograms at the end of the method, and do the EXPECT_EQ at the call-site, since you don't use 'single_histogram' for anything else.

@@ -189,6 +246,251 @@ TEST_F(StatsThreadLocalStoreTest, BasicScope) {
// Includes overflow stat.
EXPECT_CALL(*this, free(_)).Times(5);
}
TEST_F(StatsThreadLocalStoreTest, BasicSingleHistogramMerge) {
InSequence s;
store_->initializeThreading(main_thread_dispatcher_, tls_);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these test methods are very long and there are several of them. Do you think it makes sense to factor them out into a helper method? Then the differences between them may be easier to understand by passing in what varies.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about it and since the flow in each method/test case is little different, I felt it is difficult to factor them into a helper method.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK I'll try to point out some small helper-function opportunities that would make it easier to read.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have changed this.

@@ -189,6 +246,251 @@ TEST_F(StatsThreadLocalStoreTest, BasicScope) {
// Includes overflow stat.
EXPECT_CALL(*this, free(_)).Times(5);
}
TEST_F(StatsThreadLocalStoreTest, BasicSingleHistogramMerge) {
InSequence s;
store_->initializeThreading(main_thread_dispatcher_, tls_);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK I'll try to point out some small helper-function opportunities that would make it easier to read.

std::vector<uint64_t> h1_cumulative_values;
std::vector<uint64_t> h2_cumulative_values;
std::vector<uint64_t> h1_interval_values;
std::vector<uint64_t> h2_interval_values;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW you could declare all these in one line. Alternatively you could also make these vectors be class member vars, since you are using the same pattern in many tests.

EXPECT_CALL(sink_, onHistogramComplete(Ref(h1), 0));
h1.recordValue(0);
h1_cumulative_values.push_back(0);
h1_interval_values.push_back(0);
Copy link
Contributor

@jmarantz jmarantz Apr 10, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would turn this repeated 4-line stanza a helper method, e.g.

  expectCallAndAccmumute(expected_call, record_value, cumulative_value, interval_value) {
     ...
  }

...
  {
    expectCallAndAccmumute(43, 43, 43, 43);
    expectCallAndAccmumute(41, 41, 41, 41);

Note that when you factor out all this repeated stuff you might see patterns more easily. I imagine that in other sub-tests the 4 args will not all be the same?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. Good pattern. I have even simplified further by
expectCallAndAccumulate(Histogram& histogram, uint64_t record_value) as record_value is what we push to interval as well as cumulative histograms

@@ -80,6 +80,63 @@ class StatsThreadLocalStoreTest : public testing::Test, public RawStatDataAlloca
store_->addSink(sink_);
}

void validateMerge(std::string hist_name, std::vector<uint64_t> h1_cumulative_values,
std::vector<uint64_t> h2_cumulative_values,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you are not going to make these member vars (suggested below) you should definitely pass them by const ref.

@mattklein123
Copy link
Member

I'm trying to understand how agnostic we want Envoy to be wrt to some of its dependencies. This binds Envoy's core source pretty tightly to circlehist, which might be the right thing in this case, but I know there are other histogram impls as well. Part of the broad appeal of Envoy I thought in general was the ability to integrate it into lots of different environments. So it'd be good to have @htuch and @mattklein123 and @alyssawilk weigh in on this aspect for this PR.

I think this is the right approach for this feature (after cleanup and code review). Although it's possible to do this as a sink, the histograms are generally useful for local debugging on the admin endpoint regardless of sink. Also, we will likely want this functionality later for latency based load balancing.

Although there are multiple histogram implementations, after research I think it really comes down to this one and the HDR implementation which is more complicated. I think this implementation is pretty simple and a good place to start. If there is real pushback on having this feature in the core I would like to understand why. If perf is the concern, we can disable histogram collection via a stats store option. What else?

@mrice32
Copy link
Member

mrice32 commented Apr 11, 2018

@mattklein123 +1 to the idea of adding a stats store option to disable the histogram generation/collection.

Signed-off-by: Rama <rama.rao@salesforce.com>
Signed-off-by: Rama <rama.rao@salesforce.com>
Signed-off-by: Rama <rama.rao@salesforce.com>
@ramaraochavali
Copy link
Contributor Author

@mattklein123 +1 to disabling it based on config. But would like to do it in a follow-up PR as this one is already huge and would like avoid further churn here.

@ramaraochavali
Copy link
Contributor Author

@jmarantz addressed all the comments. Whereever I have not addressed I have replied with a question/clarification. PTAL and it should be good now.

@alyssawilk
Copy link
Contributor

alyssawilk commented Apr 11, 2018 via email

@mattrice
Copy link

@alyssawilk I think you probably wanted to ping @mrice32 instead of me. I do not feel I have the necessary familiarity with this project to render an opinion 😄

@mattklein123
Copy link
Member

mattklein123 commented Apr 11, 2018 via email

@mattklein123
Copy link
Member

mattklein123 commented Apr 11, 2018 via email

@ramaraochavali
Copy link
Contributor Author

Closing this and will open another one

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants