lib/protoparser/opentelemetry: follow-up after 47892b4
- Rename the -opentelemetry.sanitizeMetrics command-line flag to the clearer -opentelemetry.usePrometheusNaming
- Clarify the description of the change at docs/CHANGELOG.md
- Rename promrelabel.SanitizeLabelNameParts to the clearer promrelabel.SplitMetricNameToTokens
- Properly split metric names at '_' char in promrelabel.SplitMetricNameToTokens.
- Add tests for various edge cases for Prometheus metric names' normalization
  according to the code at https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/b8655058501bed61a06bb660869051491f46840b/pkg/translator/prometheus/normalize_name.go
- Extract the code responsible for Prometheus metric names' normalization into a separate file (sanitize.go)

Updates #6037
Updates #6035
valyala committed Apr 2, 2024
1 parent 3de8656 commit fb42380
Showing 9 changed files with 291 additions and 140 deletions.
3 changes: 3 additions & 0 deletions README.md
@@ -1546,6 +1546,9 @@ VictoriaMetrics supports data ingestion via [OpenTelemetry protocol for metrics]
VictoriaMetrics expects `protobuf`-encoded requests at `/opentelemetry/v1/metrics`.
Set HTTP request header `Content-Encoding: gzip` when sending gzip-compressed data to `/opentelemetry/v1/metrics`.

VictoriaMetrics stores the ingested OpenTelemetry [raw samples](https://docs.victoriametrics.com/keyconcepts/#raw-samples) as is without any transformations.
Pass `-opentelemetry.usePrometheusNaming` command-line flag to VictoriaMetrics for automatic conversion of metric names and labels into Prometheus-compatible format.

See [How to use OpenTelemetry metrics with VictoriaMetrics](https://docs.victoriametrics.com/guides/getting-started-with-opentelemetry/).

## JSON line format
4 changes: 2 additions & 2 deletions docs/CHANGELOG.md
@@ -61,13 +61,13 @@ See also [LTS releases](https://docs.victoriametrics.com/lts-releases/).
* FEATURE: [vmctl](https://docs.victoriametrics.com/vmctl.html): support client-side TLS configuration for VictoriaMetrics destination specified via `--vm-*` cmd-line flags used in [InfluxDB](https://docs.victoriametrics.com/vmctl/#migrating-data-from-influxdb-1x), [Remote Read protocol](https://docs.victoriametrics.com/vmctl/#migrating-data-by-remote-read-protocol), [OpenTSDB](https://docs.victoriametrics.com/vmctl/#migrating-data-from-opentsdb), [Prometheus](https://docs.victoriametrics.com/vmctl/#migrating-data-from-prometheus) and [Promscale](https://docs.victoriametrics.com/vmctl/#migrating-data-from-promscale) migration modes.
* FEATURE: [vmctl](https://docs.victoriametrics.com/vmctl.html): split [explore phase](https://docs.victoriametrics.com/vmctl/#migrating-data-from-victoriametrics) in `vm-native` mode by time intervals when [--vm-native-step-interval](https://docs.victoriametrics.com/vmctl/#using-time-based-chunking-of-migration) is specified. This should reduce probability of exceeding complexity limits for number of selected series during explore phase. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5369).
* FEATURE: [graphite](https://docs.victoriametrics.com/#graphite-render-api-usage): add support for [aggregateSeriesLists](https://graphite.readthedocs.io/en/latest/functions.html#graphite.render.functions.aggregateSeriesLists), [diffSeriesLists](https://graphite.readthedocs.io/en/latest/functions.html#graphite.render.functions.diffSeriesLists), [multiplySeriesLists](https://graphite.readthedocs.io/en/latest/functions.html#graphite.render.functions.multiplySeriesLists) and [sumSeriesLists](https://graphite.readthedocs.io/en/latest/functions.html#graphite.render.functions.sumSeriesLists) functions. Thanks to @rbizos for [the pull request](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5809).
-* FEATURE: [vmagent](https://docs.victoriametrics.com/vmagent.html): added command line argument that enables OpenTelementry metric names and labels sanitization.
+* FEATURE: [OpenTelemetry](https://docs.victoriametrics.com/#sending-data-via-opentelemetry): add `-opentelemetry.usePrometheusNaming` command-line flag, which can be used for enabling automatic conversion of the ingested metric names and labels into Prometheus-compatible format. See [these docs](https://docs.victoriametrics.com/#sending-data-via-opentelemetry) and [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6037).

* BUGFIX: prevent from automatic deletion of newly registered time series when it is queried immediately after the addition. The probability of this bug has been increased significantly after [v1.99.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.99.0) because of optimizations related to registering new time series. See [this](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5948) and [this](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5959) issue.
* BUGFIX: [vmagent](https://docs.victoriametrics.com/vmagent.html): properly set `Host` header in requests to scrape targets if it is specified via [`headers` option](https://docs.victoriametrics.com/sd_configs/#http-api-client-options). Thanks to @fholzer for [the bugreport](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5969) and [the fix](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5970).
* BUGFIX: [vmagent](https://docs.victoriametrics.com/vmagent.html): properly set `Host` header in requests to scrape targets when [`server_name` option at `tls_config`](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#tls_config) is set. Previously the `Host` header was set incorrectly to the target hostname in this case.
* BUGFIX: do not drop `match[]` filter at [`/api/v1/series`](https://docs.victoriametrics.com/url-examples/#apiv1series) if `-search.ignoreExtraFiltersAtLabelsAPI` command-line flag is set, since missing `match[]` filter breaks `/api/v1/series` requests.
-* BUGFIX: [vmagent](https://docs.victoriametrics.com/vmagent.html): return proper resonses for [AWS Firehose](https://docs.aws.amazon.com/firehose/latest/dev/httpdeliveryrequestresponse.html#requestformat) requests according to [these docs](https://docs.aws.amazon.com/firehose/latest/dev/httpdeliveryrequestresponse.html#responseformat). See [this pull request](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6016).
+* BUGFIX: [vmagent](https://docs.victoriametrics.com/vmagent.html): return proper resonses for [AWS Firehose](https://docs.aws.amazon.com/firehose/latest/dev/httpdeliveryrequestresponse.html#requestformat) requests according to [these docs](https://docs.aws.amazon.com/firehose/latest/dev/httpdeliveryrequestresponse.html#responseformat). See [this pull request](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6016) and [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6037).
* BUGFIX: [vmctl](https://docs.victoriametrics.com/vmctl.html): properly parse TLS key and CA files for [InfluxDB](https://docs.victoriametrics.com/vmctl/#migrating-data-from-influxdb-1x) and [OpenTSDB](https://docs.victoriametrics.com/vmctl/#migrating-data-from-opentsdb) migration modes.
* BUGFIX: [vmui](https://docs.victoriametrics.com/#vmui): fix VictoriaLogs UI query handling to correctly apply `_time` filter across all queries. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5920).
* BUGFIX: [Single-node VictoriaMetrics](https://docs.victoriametrics.com/) and `vmselect` in [VictoriaMetrics cluster](https://docs.victoriametrics.com/cluster-victoriametrics/): limit duration of requests to /api/v1/labels, /api/v1/label/.../values or /api/v1/series with `-search.maxLabelsAPIDuration` duration. Before, `-search.maxExportDuration` value was used by mistake. The bug has been introduced in [v1.99.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.99.0). Thanks to @kbweave for the [pull request](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5992).
3 changes: 3 additions & 0 deletions docs/README.md
@@ -1549,6 +1549,9 @@ VictoriaMetrics supports data ingestion via [OpenTelemetry protocol for metrics]
VictoriaMetrics expects `protobuf`-encoded requests at `/opentelemetry/v1/metrics`.
Set HTTP request header `Content-Encoding: gzip` when sending gzip-compressed data to `/opentelemetry/v1/metrics`.

VictoriaMetrics stores the ingested OpenTelemetry [raw samples](https://docs.victoriametrics.com/keyconcepts/#raw-samples) as is without any transformations.
Pass `-opentelemetry.usePrometheusNaming` command-line flag to VictoriaMetrics for automatic conversion of metric names and labels into Prometheus-compatible format.

See [How to use OpenTelemetry metrics with VictoriaMetrics](https://docs.victoriametrics.com/guides/getting-started-with-opentelemetry/).

## JSON line format
3 changes: 3 additions & 0 deletions docs/Single-server-VictoriaMetrics.md
@@ -1557,6 +1557,9 @@ VictoriaMetrics supports data ingestion via [OpenTelemetry protocol for metrics]
VictoriaMetrics expects `protobuf`-encoded requests at `/opentelemetry/v1/metrics`.
Set HTTP request header `Content-Encoding: gzip` when sending gzip-compressed data to `/opentelemetry/v1/metrics`.

VictoriaMetrics stores the ingested OpenTelemetry [raw samples](https://docs.victoriametrics.com/keyconcepts/#raw-samples) as is without any transformations.
Pass `-opentelemetry.usePrometheusNaming` command-line flag to VictoriaMetrics for automatic conversion of metric names and labels into Prometheus-compatible format.

See [How to use OpenTelemetry metrics with VictoriaMetrics](https://docs.victoriametrics.com/guides/getting-started-with-opentelemetry/).

## JSON line format
10 changes: 7 additions & 3 deletions lib/promrelabel/relabel.go
@@ -663,11 +663,15 @@ func SanitizeLabelName(name string) string {
 	return labelNameSanitizer.Transform(name)
 }
 
-// SanitizeLabelNameParts returns label name slice generated from metric name divided by unsupported characters
-func SanitizeLabelNameParts(name string) []string {
-	return unsupportedLabelNameChars.Split(name, -1)
+// SplitMetricNameToTokens returns tokens generated from metric name divided by unsupported Prometheus characters
+//
+// See https://prometheus.io/docs/concepts/data_model/#metric-names-and-labels
+func SplitMetricNameToTokens(name string) []string {
+	return nonAlphaNumChars.Split(name, -1)
 }
 
+var nonAlphaNumChars = regexp.MustCompile(`[^a-zA-Z0-9]`)
+
 var labelNameSanitizer = bytesutil.NewFastStringTransformer(func(s string) string {
 	return unsupportedLabelNameChars.ReplaceAllString(s, "_")
 })
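
The effect of splitting on `[^a-zA-Z0-9]` can be tried outside the repository with a small standalone sketch. The names `nonAlphaNum` and `splitToTokens` are illustrative stand-ins and are not part of `promrelabel`.

```go
package main

import (
	"fmt"
	"regexp"
)

// nonAlphaNum matches every character outside [a-zA-Z0-9], so '_' is a separator too.
var nonAlphaNum = regexp.MustCompile(`[^a-zA-Z0-9]`)

// splitToTokens is a stand-in with the same shape as SplitMetricNameToTokens.
func splitToTokens(name string) []string {
	return nonAlphaNum.Split(name, -1)
}

func main() {
	fmt.Printf("%q\n", splitToTokens("foo_total.seconds")) // ["foo" "total" "seconds"]
	// Two adjacent separators yield an empty token between them.
	fmt.Printf("%q\n", splitToTokens("foo__bar")) // ["foo" "" "bar"]
}
```

Because `'_'` is treated as a separator, suffixes such as `total` or `seconds` that are already present in a metric name become separate tokens, which is what lets later code detect them and avoid appending duplicates.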
138 changes: 138 additions & 0 deletions lib/protoparser/opentelemetry/stream/sanitize.go
@@ -0,0 +1,138 @@
package stream

import (
	"flag"
	"slices"
	"strings"

	"github.com/VictoriaMetrics/VictoriaMetrics/lib/promrelabel"
	"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/opentelemetry/pb"
)

var (
	usePrometheusNaming = flag.Bool("opentelemetry.usePrometheusNaming", false, "Whether to convert metric names and labels into Prometheus-compatible format for the metrics ingested "+
		"via OpenTelemetry protocol; see https://docs.victoriametrics.com/#sending-data-via-opentelemetry")
)

// unitMap is obtained from https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/b8655058501bed61a06bb660869051491f46840b/pkg/translator/prometheus/normalize_name.go#L19
var unitMap = map[string]string{
	// Time
	"d":   "days",
	"h":   "hours",
	"min": "minutes",
	"s":   "seconds",
	"ms":  "milliseconds",
	"us":  "microseconds",
	"ns":  "nanoseconds",

	// Bytes
	"By":   "bytes",
	"KiBy": "kibibytes",
	"MiBy": "mebibytes",
	"GiBy": "gibibytes",
	"TiBy": "tibibytes",
	"KBy":  "kilobytes",
	"MBy":  "megabytes",
	"GBy":  "gigabytes",
	"TBy":  "terabytes",

	// SI
	"m": "meters",
	"V": "volts",
	"A": "amperes",
	"J": "joules",
	"W": "watts",
	"g": "grams",

	// Misc
	"Cel": "celsius",
	"Hz":  "hertz",
	"1":   "",
	"%":   "percent",
}

// perUnitMap is copied from https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/b8655058501bed61a06bb660869051491f46840b/pkg/translator/prometheus/normalize_name.go#L58
var perUnitMap = map[string]string{
	"s":  "second",
	"m":  "minute",
	"h":  "hour",
	"d":  "day",
	"w":  "week",
	"mo": "month",
	"y":  "year",
}

// See https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/b8655058501bed61a06bb660869051491f46840b/pkg/translator/prometheus/normalize_label.go#L26
func sanitizeLabelName(labelName string) string {
	if !*usePrometheusNaming {
		return labelName
	}
	return sanitizePrometheusLabelName(labelName)
}

func sanitizePrometheusLabelName(labelName string) string {
	if len(labelName) == 0 {
		return ""
	}
	labelName = promrelabel.SanitizeLabelName(labelName)
	if labelName[0] >= '0' && labelName[0] <= '9' {
		return "key_" + labelName
	} else if strings.HasPrefix(labelName, "_") && !strings.HasPrefix(labelName, "__") {
		return "key" + labelName
	}
	return labelName
}

// See https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/b8655058501bed61a06bb660869051491f46840b/pkg/translator/prometheus/normalize_name.go#L83
func sanitizeMetricName(m *pb.Metric) string {
	if !*usePrometheusNaming {
		return m.Name
	}
	return sanitizePrometheusMetricName(m)
}

func sanitizePrometheusMetricName(m *pb.Metric) string {
	nameTokens := promrelabel.SplitMetricNameToTokens(m.Name)

	unitTokens := strings.SplitN(m.Unit, "/", 2)
	if len(unitTokens) > 0 {
		mainUnit := strings.TrimSpace(unitTokens[0])
		if mainUnit != "" && !strings.ContainsAny(mainUnit, "{}") {
			if u, ok := unitMap[mainUnit]; ok {
				mainUnit = u
			}
			if mainUnit != "" && !slices.Contains(nameTokens, mainUnit) {
				nameTokens = append(nameTokens, mainUnit)
			}
		}

		if len(unitTokens) > 1 {
			perUnit := strings.TrimSpace(unitTokens[1])
			if perUnit != "" && !strings.ContainsAny(perUnit, "{}") {
				if u, ok := perUnitMap[perUnit]; ok {
					perUnit = u
				}
				if perUnit != "" && !slices.Contains(nameTokens, perUnit) {
					nameTokens = append(nameTokens, "per", perUnit)
				}
			}
		}
	}

	if m.Sum != nil && m.Sum.IsMonotonic {
		nameTokens = moveOrAppend(nameTokens, "total")
	} else if m.Unit == "1" && m.Gauge != nil {
		nameTokens = moveOrAppend(nameTokens, "ratio")
	}
	return strings.Join(nameTokens, "_")
}

func moveOrAppend(tokens []string, value string) []string {
	for i := range tokens {
		if tokens[i] == value {
			tokens = append(tokens[:i], tokens[i+1:]...)
			break
		}
	}
	return append(tokens, value)
}
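
The naming rules in sanitize.go can be explored without the VictoriaMetrics packages. The following is a simplified sketch under stated assumptions: `normalize` is a hypothetical stand-in that takes plain arguments instead of `*pb.Metric`, the `units` and `perUnits` maps are trimmed-down copies of `unitMap` and `perUnitMap`, and `moveOrAppend` is rewritten with `slices.Index` and `slices.Delete` (Go 1.21+) rather than copied from the file.

```go
package main

import (
	"fmt"
	"regexp"
	"slices"
	"strings"
)

var nonAlphaNum = regexp.MustCompile(`[^a-zA-Z0-9]`)

// Trimmed-down copies of unitMap and perUnitMap, enough for the demo.
var units = map[string]string{"s": "seconds", "m": "meters", "By": "bytes", "1": ""}
var perUnits = map[string]string{"s": "second", "m": "minute", "h": "hour"}

// moveOrAppend removes the first occurrence of value, if any, and appends value at the end.
func moveOrAppend(tokens []string, value string) []string {
	if i := slices.Index(tokens, value); i >= 0 {
		tokens = slices.Delete(tokens, i, i+1)
	}
	return append(tokens, value)
}

// normalize is a hypothetical, pb-free stand-in for sanitizePrometheusMetricName.
func normalize(name, unit string, monotonicSum, gauge bool) string {
	tokens := nonAlphaNum.Split(name, -1)

	// "m/s" splits into main unit "m" and per-unit "s"; a unit without '/' has no per-unit.
	mainUnit, perUnit, _ := strings.Cut(unit, "/")
	mainUnit = strings.TrimSpace(mainUnit)
	if mainUnit != "" && !strings.ContainsAny(mainUnit, "{}") {
		if u, ok := units[mainUnit]; ok {
			mainUnit = u
		}
		// Skip units that map to "" (such as "1") or are already present in the name.
		if mainUnit != "" && !slices.Contains(tokens, mainUnit) {
			tokens = append(tokens, mainUnit)
		}
	}
	perUnit = strings.TrimSpace(perUnit)
	if perUnit != "" && !strings.ContainsAny(perUnit, "{}") {
		if u, ok := perUnits[perUnit]; ok {
			perUnit = u
		}
		if !slices.Contains(tokens, perUnit) {
			tokens = append(tokens, "per", perUnit)
		}
	}

	// Monotonic sums end with "total"; unitless gauges end with "ratio".
	if monotonicSum {
		tokens = moveOrAppend(tokens, "total")
	} else if unit == "1" && gauge {
		tokens = moveOrAppend(tokens, "ratio")
	}
	return strings.Join(tokens, "_")
}

func main() {
	fmt.Println(normalize("foo", "s", false, false))                // foo_seconds
	fmt.Println(normalize("foo_total_seconds", "s", true, false))   // foo_seconds_total
	fmt.Println(normalize("foo", "m/s", false, false))              // foo_meters_per_second
	fmt.Println(normalize("foo", "1", false, true))                 // foo_ratio
}
```

The printed values match the corresponding cases in sanitize_test.go. Note that the token check is exact: a name already containing `second` suppresses the `per_second` suffix for unit `m/s`, which is why the test suite expects `foo_second_meters` for `foo_second`.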
127 changes: 127 additions & 0 deletions lib/protoparser/opentelemetry/stream/sanitize_test.go
@@ -0,0 +1,127 @@
package stream

import (
	"testing"

	"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/opentelemetry/pb"
)

func TestSanitizePrometheusLabelName(t *testing.T) {
	f := func(labelName, expectedResult string) {
		t.Helper()

		result := sanitizePrometheusLabelName(labelName)
		if result != expectedResult {
			t.Fatalf("unexpected result; got %q; want %q", result, expectedResult)
		}
	}

	f("", "")
	f("foo", "foo")
	f("foo_bar/baz:abc", "foo_bar_baz_abc")
	f("1foo", "key_1foo")
	f("_foo", "key_foo")
	f("__bar", "__bar")
}

func TestSanitizePrometheusMetricName(t *testing.T) {
	f := func(m *pb.Metric, expectedResult string) {
		t.Helper()

		result := sanitizePrometheusMetricName(m)
		if result != expectedResult {
			t.Fatalf("unexpected result; got %q; want %q", result, expectedResult)
		}
	}

	f(&pb.Metric{}, "")

	f(&pb.Metric{
		Name: "foo",
	}, "foo")

	f(&pb.Metric{
		Name: "foo",
		Unit: "s",
	}, "foo_seconds")

	f(&pb.Metric{
		Name: "foo_seconds",
		Unit: "s",
	}, "foo_seconds")

	f(&pb.Metric{
		Name: "foo",
		Sum: &pb.Sum{
			IsMonotonic: true,
		},
	}, "foo_total")

	f(&pb.Metric{
		Name: "foo_total",
		Sum: &pb.Sum{
			IsMonotonic: true,
		},
	}, "foo_total")

	f(&pb.Metric{
		Name: "foo",
		Sum: &pb.Sum{
			IsMonotonic: true,
		},
		Unit: "s",
	}, "foo_seconds_total")

	f(&pb.Metric{
		Name: "foo_seconds",
		Sum: &pb.Sum{
			IsMonotonic: true,
		},
		Unit: "s",
	}, "foo_seconds_total")

	f(&pb.Metric{
		Name: "foo_total",
		Sum: &pb.Sum{
			IsMonotonic: true,
		},
		Unit: "s",
	}, "foo_seconds_total")

	f(&pb.Metric{
		Name: "foo_seconds_total",
		Sum: &pb.Sum{
			IsMonotonic: true,
		},
		Unit: "s",
	}, "foo_seconds_total")

	f(&pb.Metric{
		Name: "foo_total_seconds",
		Sum: &pb.Sum{
			IsMonotonic: true,
		},
		Unit: "s",
	}, "foo_seconds_total")

	f(&pb.Metric{
		Name:  "foo",
		Gauge: &pb.Gauge{},
		Unit:  "1",
	}, "foo_ratio")

	f(&pb.Metric{
		Name: "foo",
		Unit: "m/s",
	}, "foo_meters_per_second")

	f(&pb.Metric{
		Name: "foo_second",
		Unit: "m/s",
	}, "foo_second_meters")

	f(&pb.Metric{
		Name: "foo_meters",
		Unit: "m/s",
	}, "foo_meters_per_second")
}
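
The label-name cases from TestSanitizePrometheusLabelName can also be reproduced in isolation. This sketch makes one assumption that the diff does not show: that `promrelabel.SanitizeLabelName` replaces every character outside `[a-zA-Z0-9_]` with `_`, which is consistent with the `foo_bar/baz:abc` case above. The names `unsupportedChars` and `sanitizeLabel` are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// unsupportedChars is an assumption about which characters
// promrelabel.SanitizeLabelName replaces; the real pattern is not shown in this diff.
var unsupportedChars = regexp.MustCompile(`[^a-zA-Z0-9_]`)

// sanitizeLabel is a standalone stand-in for sanitizePrometheusLabelName.
func sanitizeLabel(name string) string {
	if name == "" {
		return ""
	}
	name = unsupportedChars.ReplaceAllString(name, "_")
	switch {
	case name[0] >= '0' && name[0] <= '9':
		// Label names must not start with a digit.
		return "key_" + name
	case strings.HasPrefix(name, "_") && !strings.HasPrefix(name, "__"):
		// A single leading underscore gets a "key" prefix; a double underscore is kept.
		return "key" + name
	}
	return name
}

func main() {
	for _, in := range []string{"", "foo", "foo_bar/baz:abc", "1foo", "_foo", "__bar"} {
		fmt.Printf("%q -> %q\n", in, sanitizeLabel(in))
	}
}
```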
