-
Notifications
You must be signed in to change notification settings - Fork 9.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Support for Native Histograms #11447
Conversation
We cannot just use prometheus/client_model directly because we want to stay consistent with the use of gogo-protobuf. So this converts metrics.proto to proto3 and edits it lightly so that it fits into the framework how prometheus/prometheus handles protobuf. Note that metrics.proto couldn't be merged into the prompb package because prompb already has an Exemplar type, which is different from the Exemplar type in metrics.proto. The directory structure seems to play a role in the protobuf world, so I better kept it. Signed-off-by: beorn7 <beorn@grafana.com>
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* histogram chunk Signed-off-by: Dieter Plaetinck <dieter@grafana.com> * xorAppender.AppendHistogram non-method Signed-off-by: Dieter Plaetinck <dieter@grafana.com> * basic histogram chunk test Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
* integer types and timestamp separation 1) unify types to int64. as suggested by beorn. we want to support counters going down (resets) even if we plan to create new chunks for now, in that case 2) histogram type doesn't know its own timestamp. include it separately in appending and iteration Signed-off-by: Dieter Plaetinck <dieter@grafana.com> * correction: count and zeroCount to remain unsigned to make api more resilient and that's what we use in protobuf anyway Signed-off-by: Dieter Plaetinck <dieter@grafana.com> * temp hack. Ganesh will fix Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
* Append sparse histograms into the Head block Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Add AtHistogram() to Iterator interface. Make HistoChunk conform to Chunk interface. Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
…#9015) * factor out different varbit schemes and include Beorn's "optimum" for buckets Signed-off-by: Dieter Plaetinck <dieter@grafana.com> * use more compact dod encoding scheme for SHS chunk columns Signed-off-by: Dieter Plaetinck <dieter@grafana.com> * remove FB VB and xor dod encoding because we won't use it Signed-off-by: Dieter Plaetinck <dieter@grafana.com> * HistoChunk metadata encoding Signed-off-by: Dieter Plaetinck <dieter@grafana.com> * add SparseHistogram.Copy() Signed-off-by: Dieter Plaetinck <dieter@grafana.com> * histogram test: test appending a few histograms Signed-off-by: Dieter Plaetinck <dieter@grafana.com> * add license headers Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
This "brings back" protobuf parsing, with the only goal to play with the new sparse histograms. The Prom-2.x style parser is highly adapted to the structure of the Prometheus text format (and later OpenMetrics). Some jumping through hoops is required to feed protobuf into it. This is not meant to be a model for the final implementation. It should just enable sparse histogram ingestion at a reasonable efficiency. Following known shortcomings and flaws: - No tests yet. - Summaries and legacy histograms, i.e. without sparse buckets, are ignored. - Staleness doesn't work (but this could be fixed in the appender, to be discussed). - No tricks have been tried that would be similar to the tricks the text parsers do (like direct pointers into the HTTP response body). That makes things weird here. Tricky optimizations only make sense once the final format is specified, which will almost certainly not be the old protobuf format. (Interestingly, I expect this implementation to be in fact much more efficient than the original protobuf ingestion in Prom-1.x.) - This is using a proto3 version of metrics.proto (mostly to be consistent with the other protobuf uses). However, proto3 sees no difference between an unset field. We depend on that to distinguish between an unset timestamp and the timestamp 0 (1970-01-01, 00:00:00 UTC). In this experimental code, we just assume that timestamp is never specified and therefore a timestamp of 0 always is interpreted as "not set". Signed-off-by: beorn7 <beorn@grafana.com>
Hacky implementation of protobuf parsing
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
Signed-off-by: beorn7 <beorn@grafana.com>
#9030) * bucketIterator which returns all valid bucket indices for a []span Signed-off-by: Dieter Plaetinck <dieter@grafana.com> * support for comparing []spans and generating interjections Signed-off-by: Dieter Plaetinck <dieter@grafana.com> * add license header Signed-off-by: Dieter Plaetinck <dieter@grafana.com> * assert order fix Signed-off-by: Dieter Plaetinck <dieter@grafana.com> * handle pathological 0-length span case more gracefully Signed-off-by: Dieter Plaetinck <dieter@grafana.com> * stale todo Signed-off-by: Dieter Plaetinck <dieter@grafana.com> * decode-recode histograms when new buckets appear Signed-off-by: Dieter Plaetinck <dieter@grafana.com> * factor out recoding and also add it to the fallback case Signed-off-by: Dieter Plaetinck <dieter@grafana.com> * make linter happy Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
* Modify query_range to serve only sparse histograms Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Finish CumulativeExpandSparseHistogram for positive schema Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Fix lint Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Fix bug and comment out tests for query_range Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Fix lint 2 Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* Do not panic on histoAppender.Append Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * M-map all chunks on shutdown Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Support negative schema for querying Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* Update querier.go to support Head compaction with histograms Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Add test for Head compaction with histograms Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Fix tests Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
Hardcode rate() for sparse histograms
Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
Sparsehistogram: SHS chunk recording and head cutting to head block
Signed-off-by: beorn7 <beorn@grafana.com>
Fix lint issues
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
Fix TSDB race while reading histograms
Signed-off-by: beorn7 <beorn@grafana.com>
Signed-off-by: beorn7 <beorn@grafana.com>
Fix interjections at the end
/prombench cancel Number of blocks equal after the next compaction. main branch did more compactions this time. |
Benchmark cancel is in progress. |
PromQL benchmark results. Not all pretty at the moment. Show results
|
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
With these changes, the "happy path" when the leading and trailing number of bits don't need an update, fewer operations are needed. The change is probably very marginal (no change in the benchmark added here, but the benchmark also doesn't cover non-changing values), and an argument could me made that avoiding pointers also has its benefits. However, I think that reducing the number of return values improves readability. Which convinced me that I should at least propose this. Signed-off-by: beorn7 <beorn@grafana.com>
One of the conversation that @beorn7 and I were having in the CNCF slack ( We have 2 options here:
I am in favour of option 1. Reason: in this case, I believe in shipping early and getting feedback early. And on top of that, I and Beorn won't have a whole lot of time before v2.40.0 process begins (both will be at a conference next week). Also, we are not 100% sure how the performce improvement efforts will work out; it might end up being that we should have merged it earlier and not have skipped v2.40.0 (hopefully this is not the case, but there is a chance). cc @bwplotka @roidelapluie @juliusv @Nexucis what are your thoughts on this? |
How bad is the perf decrease? From your comment at #11447 (comment) it sounds like it wasn't that noticeable (but benchmark has been cancelled since). I'd care about the real-world usage more than the Go micro benchmarks. If it's just a tiny perf decrease (<=5%), it should be ok IMO. |
Hmmmm, interesting. We were mostly looking at the Go micro benchmarks at the moment. While most are <5% slower, there were a bunch of queries that were worse. BUT, the prombench is interesting. If we look at the Now this is the inner_eval of the PromQL engine, but this is just internal execution of query (link). So the above http request duration matters more. I don't see any noticeable difference in memory usage as well (link) Well, it looks like we should give this a try and merge now. |
If that makes things significantly easier for you (and get more feedback faster), I'd say go for it then, yeah. Note, I'm saying that without having reviewed anything about this branch yet, just going on what you're saying on performance so far. |
Great. I was hoping the real-life performance of the queries wouldn't really get a dent from the changes. That seems to work out. Maybe other optimizations give us more than what the native histograms take. Or the overall difference is just so small that it disappears in the usual noise. Microbenchmarks are still important, and I still have a few ideas up my sleeves how we might avoid performance regressions there. Will work on them soon… |
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
To clarify (from discussion on the CNCF slack): I'll work on my planned optimization against the main thread. This can be merged now. |
YOLO! |
Let's goooo! I think it's fine to merge it for now. I did not review it, but early benchmarks looks nice - happy with this risk. |
🥳 |
…heus#9328 and brougt back by mistake in 095f572 as part of prometheus#11447
…heus#9328 and brougt back by mistake in 095f572 as part of prometheus#11447 Signed-off-by: machine424 <ayoubmrini424@gmail.com>
…heus#9328 and brougt back by mistake in 095f572 as part of prometheus#11447 Signed-off-by: machine424 <ayoubmrini424@gmail.com>
This PR merges all the coding work that has been done in
sparsehistogram
branch over the last 1 year into main branch. This PR is open for review, and in the mean time while this is up for review, we will keep working on thesparsehistogram
branch to close some P1 histogram issues.Design doc on native histograms: https://docs.google.com/document/d/1cLNv3aufPZb3fNfaJgdaRBZsInZKKIHo9E6HinJVbpM/edit
Some sneak peak: https://www.youtube.com/watch?v=T2GvcYNth9U
We would like this PR to be not squashed or rebased when merging.