[docdb] Reduce memory footprint of the per tablet Histograms #7805

Closed
bmatican opened this issue Mar 24, 2021 · 1 comment

Comments

@bmatican

@amitanandaiyer brought this up as part of the investigation in #6676

We should be able to optimize our Histogram usage: instead of storing histograms per tablet, we can bubble them up to the table level. We could store references to them at the TSTabletManager level and pass them into the Tablet instances, which keeps most of the existing code the same.

Other notes

  • we are already doing per-table aggregations in Prometheus, so this would only affect the per-tablet versions in the JSON /metrics, which is less relevant anyway
  • we already reset the histograms whenever we query from Prometheus
  • need to do something special for GC, in case a table is fully deleted or a TS no longer serves any tablets of a particular table, but that should be easy to detect
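
To make the proposal above concrete, here is a minimal sketch of sharing one histogram per table and detecting when no tablet of a table is left. Everything in it is hypothetical: `Histogram`, `TableHistogramRegistry` and `Tablet` are simplified stand-ins, not the actual yb classes, and the registry only plays the role suggested for TSTabletManager. It assumes that holding weak references in the registry is an acceptable way to handle the GC case.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <mutex>
#include <string>

// Hypothetical stand-in for a histogram; it only tracks a count and a sum.
class Histogram {
 public:
  void Increment(uint64_t value) {
    std::lock_guard<std::mutex> lock(mu_);
    ++count_;
    sum_ += value;
  }
  uint64_t count() const {
    std::lock_guard<std::mutex> lock(mu_);
    return count_;
  }
  uint64_t sum() const {
    std::lock_guard<std::mutex> lock(mu_);
    return sum_;
  }

 private:
  mutable std::mutex mu_;
  uint64_t count_ = 0;
  uint64_t sum_ = 0;
};

// Hypothetical registry: hands out one shared histogram per table id.
// It keeps only weak references, so a histogram is freed as soon as the
// last tablet of its table goes away.
class TableHistogramRegistry {
 public:
  std::shared_ptr<Histogram> GetOrCreate(const std::string& table_id) {
    std::lock_guard<std::mutex> lock(mu_);
    std::weak_ptr<Histogram>& slot = by_table_[table_id];
    std::shared_ptr<Histogram> hist = slot.lock();
    if (!hist) {
      hist = std::make_shared<Histogram>();
      slot = hist;
    }
    return hist;
  }

  // Removes map entries for tables with no live tablets left on this server.
  // Returns the number of entries removed.
  std::size_t CollectGarbage() {
    std::lock_guard<std::mutex> lock(mu_);
    std::size_t removed = 0;
    for (auto it = by_table_.begin(); it != by_table_.end();) {
      if (it->second.expired()) {
        it = by_table_.erase(it);
        ++removed;
      } else {
        ++it;
      }
    }
    return removed;
  }

  std::size_t size() const {
    std::lock_guard<std::mutex> lock(mu_);
    return by_table_.size();
  }

 private:
  mutable std::mutex mu_;
  std::map<std::string, std::weak_ptr<Histogram>> by_table_;
};

// Hypothetical tablet: records into the table-level histogram it was given.
class Tablet {
 public:
  Tablet(const std::string& table_id, TableHistogramRegistry* registry)
      : write_latency_(registry->GetOrCreate(table_id)) {
    assert(write_latency_ != nullptr);
  }
  void RecordWriteLatency(uint64_t micros) { write_latency_->Increment(micros); }
  const std::shared_ptr<Histogram>& write_latency() const { return write_latency_; }

 private:
  std::shared_ptr<Histogram> write_latency_;
};
```

Two `Tablet` objects created with the same table id end up recording into the same `Histogram`, and once both are destroyed `CollectGarbage()` drops the stale entry. The tablet-side recording code does not change, which is the property the proposal is after.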
@bmatican bmatican added area/docdb YugabyteDB core features priority/high High Priority labels Mar 24, 2021
@bmatican bmatican added this to To do in Usability via automation Mar 24, 2021
amitanandaiyer added a commit that referenced this issue Apr 6, 2021
…of having Histograms (in TabletMetrics and other objects) separately for each tablet.

Summary:
Reduce memory footprint of the per tablet histograms.
 - Define Histogram metrics at a table-level MetricEntity, instead of having one for each tablet.
   We export them to prometheus-metrics as a roll-up at the table level, so there is no loss of precision here.
 - Counters/Gauges are still going to be at the tablet level.
   Counters using incr/decr could probably be moved up as well. However, gauges using set_value will need to stay at the tablet level.
   Metrics such as `is_raft_leader` do not make sense at the table level.
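
A small sketch of why the counter/gauge distinction above matters. `SharedCounter` and `Gauge` are hypothetical simplifications, not the actual metric classes. Increments from different tablets commute, so a single table-level counter still gives the correct roll-up. A `set_value` call from one tablet would overwrite whatever another tablet wrote.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical counter with incr/decr semantics. Updates are additive, so
// many tablets can share one instance and the total stays meaningful.
struct SharedCounter {
  std::atomic<int64_t> value{0};
  void IncrementBy(int64_t delta) { value.fetch_add(delta); }
  int64_t Get() const { return value.load(); }
};

// Hypothetical gauge with set_value semantics. The last writer wins, so
// sharing one instance across tablets loses every earlier tablet's state.
struct Gauge {
  std::atomic<int64_t> value{0};
  void set_value(int64_t v) { value.store(v); }
  int64_t Get() const { return value.load(); }
};

// Two tablets report row counts into one shared table-level counter.
inline int64_t SharedCounterTotal(int64_t tablet_a_rows, int64_t tablet_b_rows) {
  SharedCounter rows;
  rows.IncrementBy(tablet_a_rows);
  rows.IncrementBy(tablet_b_rows);
  return rows.Get();
}

// Two tablets report leadership into one shared gauge: tablet A's value is
// overwritten by tablet B's, so the shared gauge only reflects tablet B.
inline int64_t SharedGaugeAfterBothSet(bool a_is_leader, bool b_is_leader) {
  Gauge is_leader;
  is_leader.set_value(a_is_leader ? 1 : 0);
  is_leader.set_value(b_is_leader ? 1 : 0);
  return is_leader.Get();
}
```

With tablet A as leader and tablet B as follower, the shared gauge ends up at 0 and A's leadership is lost, which is why a metric like `is_raft_leader` has to stay per tablet.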

Test Plan:
Jenkins

spin up local yb-ctl
compare output from 127.0.0.1:9000/prometheus-metrics with and without the changes

{F15979}

{F15980}

Space usage in-alloc bytes

Before

{F15984}

vs after

{F15983}

Reviewers: bogdan, kannan, rahuldesirazu

Reviewed By: rahuldesirazu

Subscribers: ybase

Differential Revision: https://phabricator.dev.yugabyte.com/D11089
@amitanandaiyer

The images posted on the commit verify that the number of in-use histogram allocations under TabletMetrics goes down from 512 (i.e. the number of tablets) per call stack to 1 (the number of tables created) per call stack.

Before commit a4eb8ae:

07:40 ~/code/yugabyte-db [more_traces] $ grep -B 5 "yb::tablet::Table.*Metrics::Table.*Metrics" ~/Downloads/512_tablets.in_use_bytes.html | grep -A 1 -B 5 "yb::HistogramPrototype::Instantiate()" | grep -A 5 -B 2 "yb::HdrHistogram::Init()"
<tr>
<td>512</td><td>10485760</td><td>20480.00</td><td>512</td><td>10485760</td><td>20480.00</td><td><pre>yb::HdrHistogram::Init()
yb::HdrHistogram::HdrHistogram()
yb::Histogram::Histogram()
yb::HistogramPrototype::Instantiate()
yb::tablet::TabletMetrics::TabletMetrics()
--
--
--
--
<tr>
<td>512</td><td>10485760</td><td>20480.00</td><td>512</td><td>10485760</td><td>20480.00</td><td><pre>yb::HdrHistogram::Init()
yb::HdrHistogram::HdrHistogram()
yb::Histogram::Histogram()
yb::HistogramPrototype::Instantiate()
yb::tablet::TabletMetrics::TabletMetrics()
--
--
--
--
<tr>
<td>512</td><td>10485760</td><td>20480.00</td><td>512</td><td>10485760</td><td>20480.00</td><td><pre>yb::HdrHistogram::Init()
yb::HdrHistogram::HdrHistogram()
yb::Histogram::Histogram()
yb::HistogramPrototype::Instantiate()
yb::tablet::TabletMetrics::TabletMetrics()
--
--
--
--
<tr>
<td>512</td><td>10485760</td><td>20480.00</td><td>512</td><td>10485760</td><td>20480.00</td><td><pre>yb::HdrHistogram::Init()
yb::HdrHistogram::HdrHistogram()
yb::Histogram::Histogram()
yb::HistogramPrototype::Instantiate()
yb::tablet::TabletMetrics::TabletMetrics()
--
--
--
--
<tr>
<td>512</td><td>10485760</td><td>20480.00</td><td>512</td><td>10485760</td><td>20480.00</td><td><pre>yb::HdrHistogram::Init()
yb::HdrHistogram::HdrHistogram()
yb::Histogram::Histogram()
yb::HistogramPrototype::Instantiate()
yb::tablet::TabletMetrics::TabletMetrics()
--

After commit a4eb8ae:

07:39 ~/code/yugabyte-db [more_traces] $ grep -B 5 "yb::tablet::Table.*Metrics::Table.*Metrics" 512_tablets_with_fix.in_use_bytes.html | grep -A 1 -B 5 "yb::HistogramPrototype::Instantiate()" | grep -A 5 -B 2 "yb::HdrHistogram::Init()"
--
<tr>
<td>1</td><td>20480</td><td>20480.00</td><td>1</td><td>20480</td><td>20480.00</td><td><pre>yb::HdrHistogram::Init()
yb::HdrHistogram::HdrHistogram()
yb::Histogram::Histogram()
yb::HistogramPrototype::Instantiate()
yb::tablet::TabletMetrics::TabletMetrics()
--
--
--
--
<tr>
<td>1</td><td>20480</td><td>20480.00</td><td>1</td><td>20480</td><td>20480.00</td><td><pre>yb::HdrHistogram::Init()
yb::HdrHistogram::HdrHistogram()
yb::Histogram::Histogram()
yb::HistogramPrototype::Instantiate()
yb::tablet::TabletMetrics::TabletMetrics()
--
--
--
--
<tr>
<td>1</td><td>20480</td><td>20480.00</td><td>1</td><td>20480</td><td>20480.00</td><td><pre>yb::HdrHistogram::Init()
yb::HdrHistogram::HdrHistogram()
yb::Histogram::Histogram()
yb::HistogramPrototype::Instantiate()
yb::tablet::TabletMetrics::TabletMetrics()
--
--
--
--
<tr>
<td>1</td><td>20480</td><td>20480.00</td><td>1</td><td>20480</td><td>20480.00</td><td><pre>yb::HdrHistogram::Init()
yb::HdrHistogram::HdrHistogram()
yb::Histogram::Histogram()
yb::HistogramPrototype::Instantiate()
yb::tablet::TabletMetrics::TabletMetrics()
--
--
--
--
<tr>
<td>1</td><td>20480</td><td>20480.00</td><td>1</td><td>20480</td><td>20480.00</td><td><pre>yb::HdrHistogram::Init()
yb::HdrHistogram::HdrHistogram()
yb::Histogram::Histogram()
yb::HistogramPrototype::Instantiate()
yb::tablet::TabletMetrics::TabletMetrics()
--
07:40 ~/code/yugabyte-db [more_traces]
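
For reference, the numbers in the profiler rows above are consistent with each other: every row reports 20480 bytes per histogram, with 512 instances before the fix and 1 after. The constant names below are made up for illustration, and the values are copied from the output rather than measured independently.

```cpp
#include <cassert>
#include <cstdint>

// Values taken from the profiler rows above.
constexpr uint64_t kBytesPerHistogram = 20480;
constexpr uint64_t kNumTablets = 512;  // histogram instances before the fix
constexpr uint64_t kNumTables = 1;     // histogram instances after the fix

// In-use bytes attributed to one call stack for a given instance count.
constexpr uint64_t FootprintBytes(uint64_t instances) {
  return instances * kBytesPerHistogram;
}

constexpr uint64_t kSavedBytesPerCallStack =
    FootprintBytes(kNumTablets) - FootprintBytes(kNumTables);

static_assert(FootprintBytes(kNumTablets) == 10485760,
              "512 tablets: 10 MiB per call stack");
static_assert(FootprintBytes(kNumTables) == 20480,
              "1 table: 20 KiB per call stack");
```

So each histogram call stack drops from 10 MiB to 20 KiB in this 512-tablet, single-table setup.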

amitanandaiyer added a commit that referenced this issue Apr 17, 2021
…belonging to a table.

Summary:
As part of reducing the memory footprint of histograms, we want to track metrics
at the table level, since we only export/store Prometheus metrics rolled up to the table
level.

Implements rocksdb::Statistics based on Metric/MetricEntity framework, and uses
a table-level MetricEntity to instantiate histograms that are to be shared across
tablets.
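
A rough sketch of the shape this could take, under heavy assumptions: `StatisticsLike` is a made-up, much smaller interface than the real rocksdb::Statistics, and `TableMetricEntity`, `SimpleHistogram` and the `HistogramType` values are hypothetical placeholders rather than the actual Metric/MetricEntity classes. The point is only that each tablet gets its own statistics object, while the histograms it writes to live in a table-level entity.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstdint>
#include <memory>
#include <utility>

// Hypothetical histogram ids; the real set is much larger.
enum HistogramType : uint32_t { kDbGet = 0, kDbWrite = 1, kNumHistogramTypes = 2 };

// Hypothetical histogram that only tracks a count and a sum.
struct SimpleHistogram {
  std::atomic<uint64_t> count{0};
  std::atomic<uint64_t> sum{0};
  void Increment(uint64_t value) {
    count.fetch_add(1);
    sum.fetch_add(value);
  }
};

// Hypothetical table-level entity that owns one histogram per type.
struct TableMetricEntity {
  std::array<SimpleHistogram, kNumHistogramTypes> histograms;
};

// Made-up minimal statistics interface, for illustration only.
class StatisticsLike {
 public:
  virtual ~StatisticsLike() = default;
  virtual void MeasureTime(uint32_t type, uint64_t value) = 0;
};

// One instance per tablet; all instances of a table share the same entity,
// so the histogram memory is paid once per table rather than once per tablet.
class TableScopedStatistics : public StatisticsLike {
 public:
  explicit TableScopedStatistics(std::shared_ptr<TableMetricEntity> entity)
      : entity_(std::move(entity)) {}

  void MeasureTime(uint32_t type, uint64_t value) override {
    assert(type < kNumHistogramTypes);
    entity_->histograms[type].Increment(value);
  }

 private:
  std::shared_ptr<TableMetricEntity> entity_;
};
```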

Depends on D11089

Test Plan:
Jenkins

Before
{F16052}
After
{F16053}

Restart before and after, then compare the /prometheus-metrics output to make sure that all the metrics are exported correctly.

```
18:02 $ curl 127.0.0.1:9000/prometheus-metrics > after2
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  178k  100  178k    0     0  2805k      0 --:--:-- --:--:-- --:--:-- 2825k
dev-server-amitanand2:~/code/yugabyte-perf [:e00d18a|✚ 47…13⚑ 17]
18:03 $ for i in before after2 ; do cat $i | sort | sed 's/}.*/}/' > compare/$i; done
dev-server-amitanand2:~/code/yugabyte-perf [:e00d18a|✚ 47…13⚑ 17]
18:03 $ diff compare/*
dev-server-amitanand2:~/code/yugabyte-perf [:e00d18a|✚ 47…13⚑ 17]
18:03 $ md5sum compare/*
3b271eff87a4208f4e9f8719ce0391e5  compare/after2
3b271eff87a4208f4e9f8719ce0391e5  compare/before
dev-server-amitanand2:~/code/yugabyte-perf [:e00d18a|✚ 47…13⚑ 17]
```
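
The same check can be sketched in C++, for cases where it needs to live in a test rather than a shell session. This is an approximation of the pipeline above, not a port of it: it strips everything after the first `}` on each line, as `sed 's/}.*/}/'` does, but sorts after stripping instead of before. The function names and the sample metric lines used with it are invented.

```cpp
#include <algorithm>
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Reduces each line to "name{labels}" by dropping the value and timestamp
// that follow the first '}', then sorts the lines. Lines without a '}' are
// kept unchanged.
std::vector<std::string> NormalizeMetrics(const std::string& text) {
  std::vector<std::string> lines;
  std::istringstream in(text);
  std::string line;
  while (std::getline(in, line)) {
    const std::string::size_type brace = line.find('}');
    if (brace != std::string::npos) {
      line.erase(brace + 1);
    }
    lines.push_back(line);
  }
  std::sort(lines.begin(), lines.end());
  return lines;
}

// True when both dumps export the same set of metric series, ignoring
// values, timestamps and line order.
bool SameMetricSet(const std::string& before, const std::string& after) {
  return NormalizeMetrics(before) == NormalizeMetrics(after);
}
```

An empty `diff` in the shell session corresponds to `SameMetricSet` returning true: the two builds export the same series, even though the sampled values differ.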

Reviewers: bogdan, rahuldesirazu

Reviewed By: rahuldesirazu

Subscribers: ybase

Differential Revision: https://phabricator.dev.yugabyte.com/D11138
Usability automation moved this from To do to Done May 24, 2021
YintongMa pushed a commit to YintongMa/yugabyte-db that referenced this issue May 26, 2021
…instead of having Histograms (in TabletMetrics and other objects) separately for each tablet.

YintongMa pushed a commit to YintongMa/yugabyte-db that referenced this issue May 26, 2021
…tablets belonging to a table.
