Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Native histograms custom buckets storage #13662

Merged
merged 31 commits into from
Jun 27, 2024
Merged

Native histograms custom buckets storage #13662

merged 31 commits into from
Jun 27, 2024

Conversation

krajorama
Copy link
Member

@krajorama krajorama commented Feb 28, 2024

Implements parts of #13485
#13486
#13487
#13508

In short: implements the internal model, storage (TSDB) and basic unit tests for the storage and query.
No user visible impact yet. No scraping/remote write/configuration change.

* add custom buckets to native histogram model
* simple copy for custom bounds
* return errors for unsupported add/sub operations
* add test cases for string and update appendhistogram in scrape to account for new schema
* check fields which are supposed to be unused but may affect results in equals
* allow appending custom buckets histograms regardless of max schema

Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
@krajorama krajorama marked this pull request as draft February 28, 2024 13:08
@krajorama
Copy link
Member Author

/prombench main

@prombot
Copy link
Contributor

prombot commented Feb 28, 2024

@krajorama is not a org member nor a collaborator and cannot execute benchmarks.

@krajorama
Copy link
Member Author

/prombench main

@prombot
Copy link
Contributor

prombot commented Feb 28, 2024

⏱️ Welcome to Prometheus Benchmarking Tool. ⏱️

Compared versions: PR-13662 and main

After successful deployment, the benchmarking results can be viewed at:

Other Commands:
To stop benchmark: /prombench cancel
To restart benchmark: /prombench restart main

@bboreham
Copy link
Member

I don't see any difference. Also, prombench data doesn't have any native histograms.

@prombot
Copy link
Contributor

prombot commented Mar 2, 2024

Benchmark tests are running for 3 days! If this is intended ignore this message otherwise you can cancel it by commenting: /prombench cancel

@bboreham
Copy link
Member

bboreham commented Mar 2, 2024

/prombench cancel

@prombot
Copy link
Contributor

prombot commented Mar 2, 2024

Benchmark cancel is in progress.

@krajorama
Copy link
Member Author

I don't see any difference. Also, prombench data doesn't have any native histograms.

We made modifications to the engine, was wondering about any regression on the non-experimental part in floats.
Also my apologies, didn't realize I had to manually cancel.

zenador and others added 11 commits March 22, 2024 14:36
* add custom bounds to chunks encoding
* change custom buckets schema number
* rename custom bounds to custom values

Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
The size of histogram points are now bigger by 24 bytes due to the
custom values slice.

When histograms are loaded into partial results in vector selectors
we use HPoint type where the size is calculated as
(size of histogram + 8 for timestamp)/16.
https://github.com/prometheus/prometheus/blob/a3d1a46eda682590a80fb1f15959457dad1e5d91/promql/value.go#L176

When histograms are put into Sample type in range evaluations, the
Sample has more overhead and the size is calculated differently:
(size of histogram / 16) + 1 for time stamp.
https://github.com/prometheus/prometheus/blob/a3d1a46eda682590a80fb1f15959457dad1e5d91/promql/engine.go#L1928

When the size of the histogram is 16k, then the first calculation gives k
but the second gives k+1 for the sample count.
If the histogram size is 16k+8, then both would give k+1.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
…s and floats (#13828)

* Use single bit to differentiate between optimized bounds and floats

Use one bit to decide what kind of data to read/write.
This reduces storage need of floats from 72 bits to 65 bits and makes the
integers store in 5 to 32 bits instead of 16.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
Signed-off-by: George Krajcsovits <krajorama@users.noreply.github.com>
Co-authored-by: Jeanette Tan <jeanette.tan@grafana.com>
…buckets converted from classic histograms (#13794)

* modify unit test framework to automatically generate native histograms with custom buckets from classic histogram series
* add very basic tests for classic histogram converted into native histogram with custom bounds
* fix histogram_quantile for native histograms with custom buckets
* make loading with nhcb explicit
* evaluate native histograms with custom buckets on queries with explicit keyword
* use regex replacer
* use temp histogram struct for automatically loading converted nhcb

Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
Signed-off-by: George Krajcsovits <krajorama@users.noreply.github.com>
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
* process custom values in histogram unit test framework
* check for warnings when evaluating in unit test framework
* add test cases for custom buckets in test framework

Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
@krajorama krajorama marked this pull request as ready for review May 8, 2024 11:58
@krajorama krajorama changed the title Implement native histograms custom buckets Native histograms custom buckets storage May 8, 2024
@beorn7 beorn7 self-requested a review May 8, 2024 12:50
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
krajorama and others added 2 commits June 11, 2024 10:55
[nhcb branch] address review comments and merge with main
Currently we can only reduce the resolution of exponential native
histograms, so checking the schema for that is slightly more precise
than checking against max schema.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
@beorn7
Copy link
Member

beorn7 commented Jun 13, 2024

Let me know when you want me to have another look here.

krajorama and others added 7 commits June 18, 2024 15:21
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
native histograms: only reduce resolution for exponential histograms
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
[nhcb branch] update missed suggested change from code review
@zenador
Copy link
Contributor

zenador commented Jun 21, 2024

@beorn7 PTAL

krajorama and others added 2 commits June 24, 2024 07:51
Signed-off-by: George Krajcsovits <krajorama@users.noreply.github.com>
Some weird formatting issue in using comment suggestion

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
@beorn7
Copy link
Member

beorn7 commented Jun 26, 2024

Will give it another round of review ASAP.

Copy link
Member

@beorn7 beorn7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great epic stuff, thank you very much.

Would you prefer to keep the current commit structure, or would you rather like to squash everything into one commit? I would tend to do the former.
Or maybe you want to restructure the commits.

I'll wait for your pick before merging.

@zenador
Copy link
Contributor

zenador commented Jun 27, 2024

Thank you! Keeping the current structure sounds good.

@beorn7
Copy link
Member

beorn7 commented Jun 27, 2024

With the wild commit history, the DCO test got confused, but I think all commits are signed alright.

@beorn7 beorn7 merged commit 2e58d46 into main Jun 27, 2024
41 of 42 checks passed
@beorn7 beorn7 deleted the nhcb branch June 27, 2024 19:44
jan--f added a commit to jan--f/prometheus that referenced this pull request Jul 1, 2024
 Conflicts:
  promql/engine_test.go
    Resolved by picking main changes but adjusting total_samples for
    query "max_over_time(metricWith1HistogramEvery10Seconds[60s])[20s:5s]"
    to 312. Via prometheus#13662 this
    histogram now stores 13 values per timestamp, but via
    prometheus#13904 the range query
    is now left-open.
  promql/promqltest/testdata/functions.test
    Resolved by picking changes in main. See also
    prometheus#13662.
  promql/promqltest/testdata/histograms.test
    Resolved by picking changes in main. See also
    prometheus#13662, but adjust some
    range selectors (`s/5m/10m/`) to account for
    prometheus#13904
jan--f added a commit to jan--f/prometheus that referenced this pull request Jul 3, 2024
 Conflicts:
  promql/engine_test.go
    Resolved by picking main changes but adjusting total_samples for
    query "max_over_time(metricWith1HistogramEvery10Seconds[60s])[20s:5s]"
    to 312. Via prometheus#13662 this
    histogram now stores 13 values per timestamp, but via
    prometheus#13904 the range query
    is now left-open.
  promql/promqltest/testdata/functions.test
    Resolved by picking changes in main. See also
    prometheus#13662, but adjust some
    range selectors (`s/1m/2m/`) to account for
    prometheus#13904.
  promql/promqltest/testdata/histograms.test
    Resolved by picking changes in main. See also
    prometheus#13662, but adjust some
    range selectors (`s/5m/10m/`) to account for
    prometheus#13904.

Signed-off-by: Jan Fajerski <jfajersk@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants