Skip to content

chore(config): Fix InlineSingleUseReferencesVisitor failing merges #23207

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jun 17, 2025

Conversation

bruceg
Copy link
Member

@bruceg bruceg commented Jun 13, 2025

Summary

The above visitor is intended to inline schemas in the references section for which there is only a single reference. This visitor is, however, unable to detect some cases where there are in fact multiple references, ending up inlining the schema multiple times. This has led to bloat in generated schemas.

In the core Vector schema, this affects 121 structures, shaving 844 lines off of the generated schema.

I've labelled this as a chore as it doesn't actually make any functional change to the result.

Change Type

  • Bug fix
  • New feature
  • Non-functional (chore, refactoring, docs)
  • Performance

Is this a breaking change?

  • Yes
  • No

How did you test this PR?

Does this PR include user facing changes?

  • Yes. Please add a changelog fragment based on our guidelines.
  • No. A maintainer will apply the "no-changelog" label to this PR.

Notes

  • Please read our Vector contributor resources.
  • Do not hesitate to use @vectordotdev/vector to reach out to us regarding this PR.
  • The CI checks run only after we manually approve them.
    • We recommend adding a pre-push hook, please see this template.
    • Alternatively, we recommend running the following locally before pushing to the remote branch:
      • cargo fmt --all
      • cargo clippy --workspace --all-targets -- -D warnings
      • cargo nextest run --workspace (alternatively, you can run cargo test --all)
      • ./scripts/check_changelog_fragments.sh
  • After a review is requested, please avoid force pushes to help us review incrementally.
    • Feel free to push as many commits as you want. They will be squashed into one before merging.
    • For example, you can run git merge origin master and git push.
  • If this PR introduces changes Vector dependencies (modifies Cargo.lock), please
    run cargo vdev build licenses to regenerate the license inventory and commit the changes (if any). More details here.

References

The above visitor is intended to inline schemas in the references section for
which there is only a single reference. This visitor is, however, unable to
detect some cases where there are in fact multiple references, ending up
inlining the schema multiple times. This has led to bloat in generated schemas.

In the core Vector schema, this affects 121 structures, shaving 844 lines off of
the generated schema.
@bruceg bruceg requested a review from a team as a code owner June 13, 2025 22:49
@bruceg bruceg added type: tech debt A code change that does not add user value. domain: config Anything related to configuring Vector domain: schemas Anything related to internal Vector event schemas no-changelog Changes in this PR do not need user-facing explanations in the release changelog labels Jun 13, 2025
@bruceg bruceg requested review from Copilot and removed request for a team June 13, 2025 22:51
Copilot

This comment was marked as outdated.

@datadog-vectordotdev
Copy link

datadog-vectordotdev bot commented Jun 13, 2025

Datadog Report

Branch report: bruceg/fix-inline-single-visitor
Commit report: 1c6f57e
Test service: vector

✅ 0 Failed, 7 Passed, 0 Skipped, 25.26s Total Time

@pront
Copy link
Member

pront commented Jun 16, 2025

@pront pront requested review from thomasqueirozb and Copilot June 16, 2025 18:18
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR refactors how InlineSingleUseReferencesVisitor counts schema references by replacing a HashSet of parent references with a simple occurrence count (usize), and adds tests for various reference contexts (arrays, composition keywords, additional/pattern properties).

  • Switched OccurrenceVisitor’s occurrence_map to track counts instead of sets.
  • Removed parent-scope deduplication logic and corresponding helper method.
  • Added new tests to cover array items, oneOf/anyOf/allOf, additionalProperties, and patternProperties.
Comments suppressed due to low confidence (3)

lib/vector-config/src/schema/visitors/inline_single.rs:143

  • [nitpick] Consider renaming occurrence_map to something like occurrence_count_map or adding a doc comment, since it now stores a usize count rather than a set of references, which improves clarity.
struct OccurrenceVisitor {

lib/vector-config/src/schema/visitors/inline_single.rs:318

  • The single_ref_multiple_usages test is still expected to fail (as noted in the TODO) but its #[ignore] attribute was removed, causing CI to break. Please either restore #[ignore] or implement the required logic to satisfy this test.
#[test]

lib/vector-config/src/schema/visitors/inline_single.rs:318

  • The multiple_refs_mixed_usages test is still marked as a known issue in the TODO but is no longer ignored; this will cause the test suite to fail. Please re-add #[ignore] or update the logic to pass this test.
#[test]

@bruceg bruceg requested a review from a team as a code owner June 16, 2025 21:19
@github-actions github-actions bot added the domain: external docs Anything related to Vector's external, public documentation label Jun 16, 2025
@@ -218,12 +218,8 @@ base: components: sources: static_metrics: configuration: {
type: float: {}
}
bins: {
description: """
A split representation of sketch bins.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting that this bit was deleted.

@pront pront added this pull request to the merge queue Jun 17, 2025
Merged via the queue into master with commit 11e1cea Jun 17, 2025
72 checks passed
@pront pront deleted the bruceg/fix-inline-single-visitor branch June 17, 2025 15:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain: config Anything related to configuring Vector domain: external docs Anything related to Vector's external, public documentation domain: schemas Anything related to internal Vector event schemas no-changelog Changes in this PR do not need user-facing explanations in the release changelog type: tech debt A code change that does not add user value.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants