
Port isolation from old TSDB PR #6841

Merged 2 commits into master from beorn7/isolation on Feb 28, 2020

Conversation

@beorn7 (Member) commented Feb 18, 2020

The original PR was prometheus-junkyard/tsdb#306 .

I tried to carefully adjust to the new world order, but please give this a very careful review, especially around iterator reuse (marked with a TODO).

On the bright side, I definitely found and fixed a bug in txRing.

@beorn7 (Member, Author) commented Feb 18, 2020

Fixes #1893.

@beorn7 (Member, Author) commented Feb 18, 2020

Note: The recently merged #6777 didn't trigger any formal merge conflicts, but it still breaks the tests here. I'll rebase at my next convenience. As far as I can see, this should not affect anything but tests, so please go ahead with the review.

@brian-brazil (Contributor) left a comment

Great to see this finally back on the table.

(Review comments on tsdb/isolation.go, tsdb/db_test.go, and tsdb/head.go — all marked resolved)
@roidelapluie (Member) commented

Just dropping a note that I would like, once this is ready for merge, to prombench this against master.

@brian-brazil (Contributor) commented

We should prombench as soon as it compiles. I wonder how my memory estimates from 3 years ago hold up.

@roidelapluie (Member) commented Feb 18, 2020

Is this documentation-worthy? Perhaps in tsdb/docs?

@beorn7 (Member, Author) commented Feb 18, 2020

/benchmark master

@roidelapluie (Member) commented

> /benchmark master

/prombench master

@prombot (Contributor) commented Feb 18, 2020

⏱️ Welcome to Prometheus Benchmarking Tool. ⏱️

Compared versions: PR-6841 and master

After successful deployment, the benchmarking metrics can be viewed at:

Other Commands:
To stop benchmark: /prombench cancel
To restart benchmark: /prombench restart v2.12.0

@beorn7 (Member, Author) commented Feb 19, 2020

Something locks up, and then the instances OOM.
Debugging now.
@gouthamve your eyes on this will be much appreciated.

@beorn7 (Member, Author) commented Feb 19, 2020

/prombench cancel

@prombot (Contributor) commented Feb 19, 2020

Benchmark cancel is in progress.

@roidelapluie (Member) commented

This pull request has been worked on by multiple people over the last few years. That alone shows both how much this feature is needed and how complex the issue is.

I want to drop a note here: as this is an important and risky change, and as release shepherd, this pull request is under my watch. If we do not settle on it (merge it) before the 5th of March, I will veto it for the 2.17 release.

That said, we are still two weeks away from that date, and the Prometheus and TSDB maintainers can merge it in the meantime without my intervention.

@krasi-georgiev (Contributor) commented

@beorn7 Prombench has Loki logs, so maybe those will help with the debugging.
I had a quick look and can't find the logs though; maybe @geekodour can point us to where the logs for this crashing Prometheus are.

@geekodour (Member) commented

@beorn7 (Member, Author) commented Feb 20, 2020

I had a look at the logs; that's where my conclusions are coming from. My next goal is to reproduce the crash locally so that I don't have to run prombench for eight hours. So far no luck.
It would be great if @gouthamve could chime in, as he probably has the most intimate understanding of how this code works.

@beorn7 (Member, Author) commented Feb 20, 2020

@roidelapluie yes sure. It's your call when the time has come. Currently, we don't even know if this will work at all.

@beorn7 (Member, Author) commented Feb 20, 2020

Looking at this dashboard, the hypothesis is the following:

For some reason, the low watermark got stuck around 02:45 UTC. The first crash happened a while after the head truncation at 05:00, which apparently ran into trouble (head chunks didn't drop). Things didn't look too bad after that, but then the server went into a tight crash loop.

I'll focus my investigation on possible reasons why the low watermark didn't get updated. (Wild guess: a scrape got canceled halfway through.)

Still, hints from people who know this better than I do are highly appreciated.
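
To make the low-watermark mechanics concrete, here is a minimal sketch (not the actual Prometheus code; the type, field, and method names are assumptions) of how isolation might track open append transactions and why an append that never commits or rolls back pins the low watermark:

```go
package main

import (
	"fmt"
	"sync"
)

// isolationState is a hypothetical tracker of in-flight append transactions.
// The real tsdb/isolation.go is structured differently; a map is enough here
// to illustrate why a stuck append pins the low watermark.
type isolationState struct {
	mtx          sync.Mutex
	lastAppendID uint64
	open         map[uint64]struct{} // appendIDs that have neither committed nor rolled back
}

func newIsolationState() *isolationState {
	return &isolationState{open: map[uint64]struct{}{}}
}

// newAppendID hands out the next appendID and marks it as open.
func (i *isolationState) newAppendID() uint64 {
	i.mtx.Lock()
	defer i.mtx.Unlock()
	i.lastAppendID++
	i.open[i.lastAppendID] = struct{}{}
	return i.lastAppendID
}

// closeAppend is called on Commit or Rollback.
func (i *isolationState) closeAppend(id uint64) {
	i.mtx.Lock()
	defer i.mtx.Unlock()
	delete(i.open, id)
}

// lowWatermark returns the smallest appendID that is still open, or the last
// assigned ID if nothing is open. A scrape that gets canceled halfway through
// (neither committing nor rolling back) keeps this value stuck, so old
// appendIDs can never be cleaned up.
func (i *isolationState) lowWatermark() uint64 {
	i.mtx.Lock()
	defer i.mtx.Unlock()
	if len(i.open) == 0 {
		return i.lastAppendID
	}
	var min uint64
	for id := range i.open {
		if min == 0 || id < min {
			min = id
		}
	}
	return min
}

func main() {
	iso := newIsolationState()
	stuck := iso.newAppendID() // simulate a scrape that never finishes
	for n := 0; n < 3; n++ {
		id := iso.newAppendID()
		iso.closeAppend(id) // normal commits
	}
	fmt.Println(iso.lowWatermark() == stuck) // true: the low watermark is pinned
}
```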

@beorn7 (Member, Author) commented Feb 20, 2020

/prombench master

@prombot (Contributor) commented Feb 20, 2020

⏱️ Welcome to Prometheus Benchmarking Tool. ⏱️

Compared versions: PR-6841 and master

After successful deployment, the benchmarking metrics can be viewed at:

Other Commands:
To stop benchmark: /prombench cancel
To restart benchmark: /prombench restart v2.12.0

@beorn7 (Member, Author) commented Feb 21, 2020

I removed the downsizing code. Prombench has now run for ~20 hours without issues. RAM usage has increased a bit, but not more than before, when the downsizing code was still included.

Maybe the downsizing code has a bug, which is plausible, but not proven yet.
Or we were just lucky in the current Prombench run.

Today I'm busy with other things, so I'll let Prombench run for a few days.

@prombot (Contributor) commented Feb 23, 2020

Benchmark tests are running for 3 days! If this is intended ignore this message otherwise you can cancel it by commenting: /prombench cancel

@bwplotka (Member) left a comment

Oh my. I think I see it, but...

> Making ingestion of all the samples from a single scrape atomic would solve the problem

Do we know how much overhead this simple solution would add? Would it really be too slow? Do we have data?

It looks good. I reviewed most of it, but I still can't wrap my head around the writeID and isolation logic; I will continue the review later. Also, can we remove/clarify the commented-out code? It does not help in review 😄

Some suggestions for now.

(Review comments on tsdb/db_test.go, tsdb/head.go, and tsdb/head_test.go — all marked resolved)
@beorn7 (Member, Author) commented Feb 27, 2020

/prombench cancel

@prombot (Contributor) commented Feb 27, 2020

Benchmark cancel is in progress.

@beorn7 (Member, Author) commented Feb 27, 2020

Benchmark looks OK, but we have to benchmark the new changes anyway. So canceling the current run.

@beorn7 (Member, Author) commented Feb 27, 2020

> Don't we need (*txRing) cleanupAppendIDsBelow when rolling back?

We do. I have added it, and amended a test so that it now exposes the bug.
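
For context, here is a minimal sketch of a txRing-style ring buffer and its cleanupAppendIDsBelow method (this is not the actual tsdb/isolation.go code; the layout and the add helper are assumptions). The point of the fix is that Rollback has to prune the ring just like Commit does:

```go
package main

import "fmt"

// txRing sketches a ring buffer of appendIDs kept per series.
type txRing struct {
	txIDs     []uint64 // ring buffer of appendIDs, one per pending sample
	txIDFirst int      // index of the oldest appendID in the ring
	txIDCount int      // number of appendIDs currently in the ring
}

// add appends a new appendID, growing the ring when it is full.
func (r *txRing) add(appendID uint64) {
	if r.txIDCount == len(r.txIDs) {
		// Grow and unroll the ring so the oldest entry lands at index 0.
		newRing := make([]uint64, r.txIDCount*2+4)
		for i := 0; i < r.txIDCount; i++ {
			newRing[i] = r.txIDs[(r.txIDFirst+i)%len(r.txIDs)]
		}
		r.txIDs = newRing
		r.txIDFirst = 0
	}
	r.txIDs[(r.txIDFirst+r.txIDCount)%len(r.txIDs)] = appendID
	r.txIDCount++
}

// cleanupAppendIDsBelow drops all appendIDs strictly below bound from the
// front of the ring. Commit already prunes the ring this way; the fix
// discussed above is that Rollback must do the same, otherwise rolled-back
// appendIDs linger in the ring.
func (r *txRing) cleanupAppendIDsBelow(bound uint64) {
	pos := r.txIDFirst
	for r.txIDCount > 0 {
		if r.txIDs[pos] >= bound {
			break
		}
		pos = (pos + 1) % len(r.txIDs)
		r.txIDCount--
	}
	r.txIDFirst = pos
}

func main() {
	var r txRing
	for id := uint64(1); id <= 5; id++ {
		r.add(id)
	}
	r.cleanupAppendIDsBelow(4)
	fmt.Println(r.txIDCount) // 2: only appendIDs 4 and 5 remain
}
```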

(Review comment on tsdb/head_test.go — marked resolved)
@beorn7 (Member, Author) commented Feb 28, 2020

/prombench master

@prombot (Contributor) commented Feb 28, 2020

⏱️ Welcome to Prometheus Benchmarking Tool. ⏱️

Compared versions: PR-6841 and master

After successful deployment, the benchmarking metrics can be viewed at:

Other Commands:
To stop benchmark: /prombench cancel
To restart benchmark: /prombench restart v2.12.0

@beorn7 (Member, Author) commented Feb 28, 2020

Prombench results:

  • 3% increase in CPU usage.
  • Avg query latencies increased 4% for query, 19% for query_range.
  • Avg go_memstats_alloc_bytes increased by 19%.

This is now much higher than before, but I believe that's because, with the buggy "clean up everything while no reads are in progress" approach we had before, we weren't really doing the full isolation work.

@beorn7 (Member, Author) commented Feb 28, 2020

I think those values are still within expectations, based on what @brian-brazil said above.

This will still, I guess, be noticed painfully by users with high-load tight-resource setups.

beorn7 and others added 2 commits February 28, 2020 14:17
Signed-off-by: beorn7 <beorn@grafana.com>
This has been ported from prometheus-junkyard/tsdb#306.

Original implementation by @brian-brazil, explained in detail in the
2nd half of this talk:
https://promcon.io/2017-munich/talks/staleness-in-prometheus-2-0/

The implementation was then processed by @gouthamve into the PR linked
above. Relevant slide deck:
https://docs.google.com/presentation/d/1-ICg7PEmDHYcITykD2SR2xwg56Tzf4gr8zfz1OerY5Y/edit?usp=drivesdk

Signed-off-by: beorn7 <beorn@grafana.com>
Co-authored-by: Brian Brazil <brian.brazil@robustperception.io>
Co-authored-by: Goutham Veeramachaneni <gouthamve@gmail.com>
@beorn7 (Member, Author) commented Feb 28, 2020

Rebased and squashed.

Last call for objections, otherwise I'll merge in about an hour or so…

@beorn7 (Member, Author) commented Feb 28, 2020

/prombench cancel

@prombot (Contributor) commented Feb 28, 2020

Benchmark cancel is in progress.

beorn7 merged commit d137cdd into master on Feb 28, 2020
beorn7 deleted the beorn7/isolation branch on February 28, 2020 at 16:48
@roidelapluie (Member) commented

Hooray 🎉🎉🎉🎉

@roidelapluie (Member) commented

@beorn7 @brian-brazil can we agree on a sentence to put at the beginning of the release notes about this? Especially something about the memory increase.

@beorn7 (Member, Author) commented Feb 28, 2020

I'm also running this now on a moderately loaded production server for comparison (500k series, 30k samples/s, very low query volume). The increase in RAM consumption is even more pronounced here: The peak value of go_memstats_heap_alloc_bytes jumps from 4.4GB to 7.2GB. container_memory_working_set_bytes goes from 8.2GB to 11.3GB.

I'll do a bit of heap analysis at my next convenience to see if there is any low-hanging fruit.

@beorn7 (Member, Author) commented Feb 28, 2020

This should definitely come with a big warning in the release notes. It will be a hard sell anyway, given that only very few users will ever have noticed the problems with isolation. I'd also keep reverting this on the agenda. We need to give it some thought…

@beorn7 (Member, Author) commented Feb 28, 2020

Peak container_memory_working_set_bytes (just before head truncation) is 11.7GB vs. 8.6GB, i.e. 36% increase.

@beorn7 (Member, Author) commented Feb 29, 2020

I think I have found the bug: When we replay the WAL, we dutifully append every sample with appendID 0. Those are all cleaned up after the next commit, but the ring buffer has then already grown to accommodate all samples in the WAL, and it will never shrink again.

The solution should be easy: appendID==0 should never be recorded. PR in preparation…
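
A minimal sketch of the kind of guard this describes (the types, field names, and method shape are assumptions for illustration, not the actual fix):

```go
package main

import "fmt"

// txRing is reduced to a bare slice here; only the growth behavior matters.
type txRing struct{ ids []uint64 }

func (r *txRing) add(id uint64) { r.ids = append(r.ids, id) }

// memSeries stands in for a series in the head; the txs field is an assumption.
type memSeries struct{ txs txRing }

// append sketches the guard: appendID 0 means "no isolation tracking", which
// is what WAL replay uses. Recording it would grow the ring to cover every
// replayed sample, and the ring never shrinks again afterwards.
func (s *memSeries) append(t int64, v float64, appendID uint64) {
	// ... the sample itself would be written to the head chunk here ...
	if appendID > 0 {
		s.txs.add(appendID)
	}
}

func main() {
	var s memSeries
	// Simulate WAL replay: many samples, all appended with appendID 0.
	for i := 0; i < 1000; i++ {
		s.append(int64(i), float64(i), 0)
	}
	fmt.Println(len(s.txs.ids)) // 0: nothing recorded, the ring stays empty
}
```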

@roidelapluie (Member) commented

> I think I have found the bug: When we replay the WAL, we dutifully append every sample with appendID 0. Those are all cleaned up after the next commit, but the ring buffer has then already grown to accommodate all samples in the WAL, and it will never shrink again.
>
> The solution should be easy: appendID==0 should never be recorded. PR in preparation…

I still think it might be worthwhile to have the downsizing code again.

@beorn7 (Member, Author) commented Feb 29, 2020

> I still think it might be worthwhile to have the downsizing code again.

Let's see how it turns out in practice. Downsizing could easily lead to oscillating behavior, which would be worse than no downsizing (spikier memory usage, more allocations, slower appends).

@bobrik (Contributor) commented Mar 23, 2020

I think this could fix #4580 as well.
