New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
promtool: add a "tsdb dump-openmetrics" to dump in OpemMetrics format. #13194
Conversation
21b510e
to
2c3826b
Compare
model/labels/labels_common.go
Outdated
b.WriteByte('{') | ||
i := 0 | ||
ls.Range(func(l Label) { | ||
if i > 0 { | ||
b.WriteByte(',') | ||
b.WriteByte(' ') | ||
if addSpace { | ||
b.WriteByte(' ') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removing this space only breaks two "formatting" tests. Maybe we could get rid on it in 3.0
Keeping it as it "breaks" promtool tsdb dump
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest moving this into promtool where you need it.
The interface of Labels is too big already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you elaborate? Do you mean redefining the same String
on promtool side without the b.WriteByte(' ')
line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CondensedString
, yes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bboreham Could you take a look now? thanks.
f476e13
to
9b89bf5
Compare
cc @bboreham (I feel there are at least some naming improvements to make) |
396b764
to
3a6d65a
Compare
cmd/promtool/tsdb.go
Outdated
for ss.Next() { | ||
series := ss.At() | ||
lbs := series.Labels() | ||
labelsBuilder := labels.NewBuilder(lbs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe reuse the same builder?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recently added labels.DropMetricName()
which does this more efficiently.
However it's probably even more efficient to write out what you need, i.e. re-code CondensedString
to pull MetricName
out first and add the non-name labels after.
It could even print to stdout to save allocating a string, but that is perhaps overkill.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, thanks for DropMetricName
, I pushed a version using it.
I think it's better to pull out MetricName
once per metric than for every time series, even though the util is cheap.
I tested with delegating the whole formatting (even the printing) to CondensedString
and on my machine and for my testdata (prometheus' metrics), that didn't yield any improvement: more B/op as we add the metric name to every buffer.
With:
// CondensedString is labels.Labels.String() without spaces after the commas.
-func CondensedString(ls labels.Labels) string {
+func CondensedString(metricName string, ls labels.Labels, val float64, ts int64) string {
var b bytes.Buffer
+ b.WriteString(metricName)
b.WriteByte('{')
i := 0
ls.Range(func(l labels.Label) {
@@ -794,6 +795,7 @@ func CondensedString(ls labels.Labels) string {
i++
})
b.WriteByte('}')
+ b.WriteString(fmt.Sprintf(" %g %.3f\n", val, float64(ts)/1000))
return b.String()
}
@@ -801,13 +803,13 @@ func formatSeriesSetOpenMetrics(ss storage.SeriesSet) error {
for ss.Next() {
series := ss.At()
lbs := series.Labels()
- labelsBuilder := labels.NewBuilder(lbs)
metricName := lbs.Get(labels.MetricName)
- lbs = labelsBuilder.Del(labels.MetricName).Labels()
+ lbs = lbs.DropMetricName()
it := series.Iterator(nil)
for it.Next() == chunkenc.ValFloat {
ts, val := it.At()
- fmt.Printf("%s%s %g %.3f\n", metricName, CondensedString(lbs), val, float64(ts)/1000)
+ fmt.Print(CondensedString(metricName, lbs, val, ts))
goos: darwin
goarch: arm64
pkg: github.com/prometheus/prometheus/cmd/promtool
│ pr │ pr_delegate │
│ sec/op │ sec/op vs base │
XXX-10 551.8m ± 3% 556.8m ± 4% ~ (p=0.247 n=10)
│ pr │ pr_delegate │
│ B/op │ B/op vs base │
XXX-10 110.7Mi ± 0% 188.4Mi ± 0% +70.21% (p=0.000 n=10)
│ pr │ pr_delegate │
│ allocs/op │ allocs/op vs base │
XXX-10 3.473M ± 0% 3.807M ± 0% +9.61% (p=0.000 n=10)
and after delegating the printing as well:
// CondensedString is labels.Labels.String() without spaces after the commas.
-func CondensedString(ls labels.Labels) string {
+func CondensedString(metricName string, ls labels.Labels, val float64, ts int64) {
var b bytes.Buffer
+ b.WriteString(metricName)
b.WriteByte('{')
i := 0
ls.Range(func(l labels.Label) {
@@ -794,20 +795,21 @@ func CondensedString(ls labels.Labels) string {
i++
})
b.WriteByte('}')
- return b.String()
+ b.WriteString(fmt.Sprintf(" %g %.3f\n", val, float64(ts)/1000))
+ fmt.Print(b.String())
}
func formatSeriesSetOpenMetrics(ss storage.SeriesSet) error {
for ss.Next() {
series := ss.At()
lbs := series.Labels()
- labelsBuilder := labels.NewBuilder(lbs)
metricName := lbs.Get(labels.MetricName)
- lbs = labelsBuilder.Del(labels.MetricName).Labels()
+ lbs = lbs.DropMetricName()
it := series.Iterator(nil)
for it.Next() == chunkenc.ValFloat {
ts, val := it.At()
- fmt.Printf("%s%s %g %.3f\n", metricName, CondensedString(lbs), val, float64(ts)/1000)
+ CondensedString(metricName, lbs, val, ts)
}
got
goos: darwin
goarch: arm64
pkg: github.com/prometheus/prometheus/cmd/promtool
│ pr │ pr_deleg_print │
│ sec/op │ sec/op vs base │
XXX-10 551.8m ± 3% 557.6m ± 4% ~ (p=0.353 n=10)
│ pr │ pr_deleg_print │
│ B/op │ B/op vs base │
XXX-10 110.7Mi ± 0% 188.4Mi ± 0% +70.24% (p=0.000 n=10)
│ pr │ pr_deleg_print │
│ allocs/op │ allocs/op vs base │
XXX-10 3.473M ± 0% 3.807M ± 0% +9.61% (p=0.000 n=10)
( I tested with strings.Builder
, a slightly better sec and B, but 20% more allocs.)
Maybe There is a optimization we can take advantage of, but I didn't spent time on that, especially that the this OM dump looks way efficient than the existing one that no one looks to be complaining about:
goos: darwin
goarch: arm64
pkg: github.com/prometheus/prometheus/cmd/promtool
│ dump │ dump_openmetrics │
│ sec/op │ sec/op vs base │
XXX-10 679.3m ± 4% 551.8m ± 3% -18.77% (p=0.000 n=10)
│ dump │ dump_openmetrics │
│ B/op │ B/op vs base │
XXX-10 215.9Mi ± 0% 110.7Mi ± 0% -48.74% (p=0.000 n=10)
│ dump │ dump_openmetrics │
│ allocs/op │ allocs/op vs base │
XXX-10 4.083M ± 0% 3.473M ± 0% -14.92% (p=0.000 n=10)
again, the prom dump puts more stuff into buffers, and has only to pass metric names into strconv.Quote
.
I could add some benchmarks for the dump, if you want, but it'll be in another PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I didn't get my point across. Once you know the metric name you don't need to drop it from the Labels, you can just skip it in the loop. But it's not important.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is this benchmark code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I said, I can open a PR if you're interested in adding it.
It's just a:
func BenchmarkXXX(b *testing.B) {
b.ReportAllocs()
b.ResetTimer()
for n := 0; n < b.N; n++ {
err := dumpSamples(
context.Background(),
"/XXX/github/prometheus/data",
math.MinInt64,
math.MaxInt64,
[]string{"{__name__=~'(?s:.*)'}"},
formatSeriesSetOpenMetrics,
)
if err != nil {
panic(err)
}
}
}
I kept it send everything to stdout and grep
ped benchmarks results.
/XXX/github/prometheus/data
containing prometheus metrics that I made it scrape itself every 1s for some minutes. (of course I stopped it before running the benchmarks)
Maybe I shared the first benchstat twice, but I'm sure both the changes yielded inferior perfs than the code that was merged. (you'll be able to validate that)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I profiled that code, using TSDB data from TestTSDBDumpOpenMetrics
, and it spends all its time in loadDataAsQuerier
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you expand on that? Are you talking about ‘ formatSeriesSetOpenMetrics’ vs ‘ formatSeriesSet’?
When you have time to take a look at my PRs @bboreham it'd be great if you start with this one :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this; seems pretty close but I think it could be a little neater if the output function did what was needed directly.
cmd/promtool/tsdb.go
Outdated
for ss.Next() { | ||
series := ss.At() | ||
lbs := series.Labels() | ||
labelsBuilder := labels.NewBuilder(lbs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recently added labels.DropMetricName()
which does this more efficiently.
However it's probably even more efficient to write out what you need, i.e. re-code CondensedString
to pull MetricName
out first and add the non-name labels after.
It could even print to stdout to save allocating a string, but that is perhaps overkill.
This closes the loop, as the output can be fed into "tsdb create-blocks-from openmetrics" Native histograms are not supported. Signed-off-by: machine424 <ayoubmrini424@gmail.com>
lbs := series.Labels() | ||
metricName := lbs.Get(labels.MetricName) | ||
lbs = lbs.DropMetricName() | ||
it := series.Iterator(nil) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
btw the reason for passing a pointer to Iterator
is to re-use the memory of the old iterator.
cmd/promtool/tsdb.go
Outdated
for ss.Next() { | ||
series := ss.At() | ||
lbs := series.Labels() | ||
labelsBuilder := labels.NewBuilder(lbs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I didn't get my point across. Once you know the metric name you don't need to drop it from the Labels, you can just skip it in the loop. But it's not important.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
This closes the loop, as the output can be fed into "tsdb create-blocks-from openmetrics"
Native histograms are not supported.
Fixes #12022