
feat(remap, lua, codecs): add auto value to metric_tag_values #25376

Open
kaarolch wants to merge 3 commits into vectordotdev:master from kaarolch:support_auto_metric_tag_values

Conversation


@kaarolch kaarolch commented May 6, 2026

Summary

Add a new auto value to the metric_tag_values option of the remap and lua transforms.

In auto mode each tag is exposed using the shape that matches its underlying storage:

  • Single-value tags are exposed as strings (or null for bare tags).
  • Multi-value tags are exposed as arrays of strings (with null entries for bare values).

Writes follow the same convention -- assigning a string or null stores a single-value tag, assigning an array stores a multi-value tag. This preserves the on-the-wire shape of metrics that mix single- and multi-value tags, instead of either collapsing everything to a single string (single) or wrapping everything as an array (full).
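
The read and write rules above can be sketched in Rust as a pair of conversions. This is a simplified model, not the code in the PR: `StoredTag`, `ScriptValue`, `read_auto`, `write_auto`, and `round_trip` are hypothetical names, and the storage enum here records the tag's shape explicitly, which may differ from how Vector stores tag sets.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the value a VRL or Lua script sees for one tag.
#[derive(Debug, Clone, PartialEq)]
enum ScriptValue {
    Null,
    Str(String),
    Array(Vec<Option<String>>),
}

/// Hypothetical stand-in for how a tag is stored on the metric.
#[derive(Debug, Clone, PartialEq)]
enum StoredTag {
    /// One value; `None` is a bare tag.
    Single(Option<String>),
    /// Several values; `None` entries are bare values.
    Multi(Vec<Option<String>>),
}

/// Read side of `auto`: expose the shape the tag already has.
fn read_auto(tag: &StoredTag) -> ScriptValue {
    match tag {
        StoredTag::Single(None) => ScriptValue::Null,
        StoredTag::Single(Some(s)) => ScriptValue::Str(s.clone()),
        StoredTag::Multi(values) => ScriptValue::Array(values.clone()),
    }
}

/// Write side of `auto`: the assigned value's shape decides the storage.
fn write_auto(value: ScriptValue) -> StoredTag {
    match value {
        ScriptValue::Null => StoredTag::Single(None),
        ScriptValue::Str(s) => StoredTag::Single(Some(s)),
        ScriptValue::Array(values) => StoredTag::Multi(values),
    }
}

/// A no-op pass that reads every tag and writes it straight back.
fn round_trip(tags: &BTreeMap<String, StoredTag>) -> BTreeMap<String, StoredTag> {
    tags.iter()
        .map(|(k, v)| (k.clone(), write_auto(read_auto(v))))
        .collect()
}
```

Under this model, a tag set holding both a single-value `env` tag and a multi-value `region` tag comes out of `round_trip` unchanged, which is the property `auto` is meant to provide.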

When everything is wrapped as an array (full), the script has to perform an additional check for every tag. I agree that multi-value tags should be avoided, but formats such as DogStatsD allow them, and sometimes we have to process them.

Vector configuration

sources:
  http_metrics:
    type: http_server
    address: "0.0.0.0:8080"
    decoding:
      codec: influxdb
    path: "/api/v1/write"

transforms:
  inject_multi_value:
    type: remap
    inputs: [http_metrics]
    metric_tag_values: full
    source: |
      .tags.region = ["us-east-1", "us-west-2"]
      .tags.tier   = ["frontend", "backend", "cache"]

  iter_auto:
    type: remap
    inputs: [inject_multi_value]
    metric_tag_values: auto
    source: |
      # Iterate over every tag without caring whether it is a string or
      # an array. With `auto`, single-value tags stay scalars and
      # multi-value tags stay arrays after the assignment.
      .tags = map_values(object!(.tags)) -> |v| {
        if v == null { "null" } else { v }
      }

sinks:
  out:
    type: console
    inputs: [iter_auto]
    encoding:
      codec: json
      metric_tag_values: full

How did you test this PR?

  • cargo check -p vector --lib --all-features
  • cargo clippy --workspace --all-targets --all-features -- -D warnings
  • cargo test -p vector-core (covering the new vrl_target and
    lua/metric round-trip tests)
  • New unit tests in lib/vector-core/src/event/vrl_target.rs and
    lib/vector-core/src/event/lua/metric.rs exercising:
    • reading single-value tags as strings in auto mode
    • reading multi-value tags as arrays in auto mode
    • reading mixed-shape tags preserves both shapes
    • writing a string keeps the tag single-value
    • writing an array keeps the tag multi-value
    • writing null keeps the tag as a bare single-value tag
  • End-to-end POC against a local vector build using
    vector-poc-auto.yaml -- verified that for the same input metric,
    the iter_single branch collapsed multi-value tags, the iter_full
    branch wrapped single-value tags as arrays in the VRL view, and the
    iter_auto branch preserved both shapes through a noop
    map_values iteration.

Local result:

{"name":"[AUTO] http_requests_total_count","tags":{"env":"staging","instance":"node-1","region":["us-east-1","us-west-2"],"service":"api","tier":["frontend","backend","cache"]},"timestamp":"2026-05-06T06:48:48.170928Z","kind":"absolute","gauge":{"value":1.0}}
{"name":"[AUTO] http_requests_total_latency_ms","tags":{"env":"staging","instance":"node-1","region":["us-east-1","us-west-2"],"service":"api","tier":["frontend","backend","cache"]},"timestamp":"2026-05-06T06:48:48.170928Z","kind":"absolute","gauge":{"value":11.0}}

Change Type

  • Bug fix
  • New feature
  • Dependencies
  • Non-functional (chore, refactoring, docs)
  • Performance

The new auto value is purely additive. Existing configurations using
single (the default) or full keep their current behavior, and no
internal API used outside the transforms changed shape.

Is this a breaking change?

  • Yes
  • No

Does this PR include user facing changes?

  • Yes. Please add a changelog fragment based on our guidelines.
  • No. A maintainer will apply the no-changelog label to this PR.

References

Notes

  • Please read our Vector contributor resources.
  • Do not hesitate to use @vectordotdev/vector to reach out to us regarding this PR.
  • Some CI checks run only after we manually approve them.
    • We recommend adding a pre-push hook, please see this template.
    • Alternatively, we recommend running the following locally before pushing to the remote branch:
      • make fmt
      • make check-clippy (if there are failures it's possible some of them can be fixed with make clippy-fix)
      • make test
  • After a review is requested, please avoid force pushes to help us review incrementally.
    • Feel free to push as many commits as you want. They will be squashed into one before merging.
    • For example, you can run git merge origin master and git push.
  • If this PR introduces changes to Vector dependencies (modifies Cargo.lock), please
    run make build-licenses to regenerate the license inventory and commit the changes (if any). More details on the dd-rust-license-tool.

The `metric_tag_values` option (used by the `remap` and `lua` transforms)
gains an `auto` value that exposes single-value tags as strings and
multi-value tags as arrays, instead of forcing every tag into one form.
Writes follow the same convention -- assigning a string or null to a
tag stores it as a single tag, assigning an array stores it as a
multi-value tag. This preserves the on-the-wire shape of metrics that
mix single- and multi-value tags.

The change replaces the internal `multi_value_tags: bool` plumbing with
a `MetricTagMode` enum (`Single`/`Full`/`Auto`) in `vector-core`, with a
`From<MetricTagValues>` conversion at the codecs boundary. `VrlTarget`
and the Lua serialization paths dispatch on the new variant for both
read (precompute) and write (insert) operations.
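
As a rough illustration of that refactor, the sketch below uses local stand-ins for both enums. The variant names and the `From` conversion come from the description above; the derives, the `Default` choice (based on `single` being the default), and the `always_arrays` helper are assumptions made for the example.

```rust
/// Stand-in for the user-facing config enum on the codecs side.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
enum MetricTagValues {
    #[default]
    Single,
    Full,
    Auto,
}

/// Stand-in for the internal enum that replaces `multi_value_tags: bool`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MetricTagMode {
    Single,
    Full,
    Auto,
}

impl From<MetricTagValues> for MetricTagMode {
    fn from(values: MetricTagValues) -> Self {
        // An exhaustive match means a new config variant cannot be
        // forgotten here without a compile error.
        match values {
            MetricTagValues::Single => MetricTagMode::Single,
            MetricTagValues::Full => MetricTagMode::Full,
            MetricTagValues::Auto => MetricTagMode::Auto,
        }
    }
}

/// Hypothetical helper: the question the old boolean used to answer.
fn always_arrays(mode: MetricTagMode) -> bool {
    matches!(mode, MetricTagMode::Full)
}
```

A boolean can only distinguish two behaviors, so a third mode needs a wider type; an enum also lets read and write paths `match` on the mode and have the compiler flag any unhandled variant.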

Includes round-trip tests for Auto mode covering single/multi tag
shapes, and write tests for string, array, and null assignments.

Co-authored-by: Cursor <cursoragent@cursor.com>
@kaarolch kaarolch requested review from a team as code owners May 6, 2026 06:57
@github-actions github-actions Bot added the "docs review on hold" label (the documentation team reviews PRs only after a PR is approved by the COSE team) May 6, 2026
@github-actions github-actions Bot added the "domain: sources", "domain: transforms", "domain: external docs", and "domain: core" labels May 6, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ed2edf3fe3

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread lib/vector-core/src/event/lua/event.rs Outdated
super::{Event, LogEvent, Metric},
metric::LuaMetric,
};
use crate::event::vrl_target::MetricTagMode;

P1 Badge Avoid importing MetricTagMode from vrl-only module

MetricTagMode is now imported from crate::event::vrl_target, but vrl_target is compiled only with feature = "vrl" while the lua feature does not enable that feature in vector-core. This makes vector-core --no-default-features --features lua fail with an unresolved module/type, so the commit breaks a valid feature combination that previously compiled.

Comment thread lib/vector-core/src/event/vrl_target.rs Outdated
Two follow-up fixes for PR vectordotdev#25376.

P1 -- `vector-core --features lua` (without `vrl`) failed to compile
because both lua submodules imported `MetricTagMode` from
`crate::event::vrl_target`, but `vrl_target` is gated behind the `vrl`
feature. Move `MetricTagMode` to a new always-compiled
`event/metric_tag_mode.rs` module and re-export it from
`crate::event`. Both vrl_target and the lua modules now reference the
same type without forcing extra features. The codecs `From` impl
already uses the canonical `vector_core::event::MetricTagMode` path,
so no callers need to change.
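
The resulting layout might look roughly like the sketch below. Module and type names follow the description above; the inline `mod` blocks stand in for the real files, and the helper functions (`uses_arrays`, `is_auto`, `mode_name`) are hypothetical.

```rust
mod event {
    // Always compiled: no feature gate, so every feature combination can
    // name the type.
    mod metric_tag_mode {
        #[derive(Debug, Clone, Copy, PartialEq, Eq)]
        pub enum MetricTagMode {
            Single,
            Full,
            Auto,
        }
    }
    // Re-export so callers write `event::MetricTagMode`.
    pub use metric_tag_mode::MetricTagMode;

    // Only built with the `vrl` feature; takes the shared type from the parent.
    #[cfg(feature = "vrl")]
    pub mod vrl_target {
        use super::MetricTagMode;

        pub fn uses_arrays(mode: MetricTagMode) -> bool {
            mode != MetricTagMode::Single
        }
    }

    // Only built with the `lua` feature; no dependency on `vrl_target`.
    #[cfg(feature = "lua")]
    pub mod lua {
        use super::MetricTagMode;

        pub fn is_auto(mode: MetricTagMode) -> bool {
            mode == MetricTagMode::Auto
        }
    }
}

/// Code outside the gated modules uses the re-exported path and compiles
/// with neither feature enabled.
fn mode_name(mode: event::MetricTagMode) -> &'static str {
    match mode {
        event::MetricTagMode::Single => "single",
        event::MetricTagMode::Full => "full",
        event::MetricTagMode::Auto => "auto",
    }
}
```

Because the enum lives outside both gated modules, enabling only `lua` or only `vrl` no longer changes whether the type exists.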

P2 -- in `auto` mode an empty tag set (`TagValueSet::Empty`,
`len() == 0`) was emitted as `null` instead of `[]`, so a noop
`.tags = .tags` round-trip silently converted an empty multi-value tag
into a bare single-value tag. Change the discriminator in both
`get_auto_value_tags` (VRL) and `LuaMetricTags::into_lua` (Lua) so
only `len() == 1` scalarises -- empty sets keep their array shape.
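
A minimal sketch of the fixed discriminator, with a plain slice standing in for `TagValueSet` and a hypothetical `Exposed` enum standing in for the VRL or Lua value. It is not the code from `get_auto_value_tags` or `LuaMetricTags::into_lua`, only an illustration of the rule that exactly one stored value scalarises while empty sets stay arrays.

```rust
/// Hypothetical stand-in for the value exposed to a script.
#[derive(Debug, Clone, PartialEq)]
enum Exposed {
    Null,
    Str(String),
    Array(Vec<Option<String>>),
}

/// Decide the exposed shape from the number of stored values.
/// `None` entries model bare tag values.
fn expose_auto(values: &[Option<String>]) -> Exposed {
    match values {
        // Exactly one stored value: scalarise it.
        [Some(s)] => Exposed::Str(s.clone()),
        // Exactly one bare value: expose null.
        [None] => Exposed::Null,
        // Zero or several values: keep the array shape.
        _ => Exposed::Array(values.to_vec()),
    }
}
```

With this rule an empty set is exposed as an empty array rather than `null`, so writing it back does not turn it into a bare single-value tag.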

Adds two regression tests:
* `metric_auto_read_empty_tag_set_returns_array` (vrl_target)
* `auto_mode_empty_set_stays_array` (lua/metric)

Verified:
* `cargo check -p vector-core --no-default-features --features lua`
* `cargo check -p vector-core --no-default-features --features vrl`
* `cargo check -p vector-core --all-features`
* `cargo test -p vector-core --features lua,vrl --lib auto`
  (9 passed, including the 2 new tests)
* `cargo clippy -p vector-core --all-features --all-targets -- -D warnings`

Co-authored-by: Cursor <cursoragent@cursor.com>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2f9283d63a

Comment on lines +638 to +641
let value = match tag_set.len() {
1 => match tag_set.iter().next() {
Some(Some(s)) => Value::Bytes(s.as_bytes().to_vec().into()),
_ => Value::Null,

P2 Badge Preserve one-element arrays in auto metric tag mode

In auto mode, this branch scalarizes every tag set with len() == 1, which means an assignment like .tags.region = ["us-east-1"] is read back as a scalar string on the next transform pass instead of an array. That breaks the new mode’s documented write contract (arrays should remain multi-value) and can silently change downstream behavior that checks is_array(.tags.region).

