Skip to content

dynamic_modules: add stats sink extension #44861

Open
prashanthjos wants to merge 8 commits into
envoyproxy:mainfrom
prashanthjos:dynamic-modules-stats-sink
Open

dynamic_modules: add stats sink extension #44861
prashanthjos wants to merge 8 commits into
envoyproxy:mainfrom
prashanthjos:dynamic-modules-stats-sink

Conversation

@prashanthjos
Copy link
Copy Markdown
Contributor

Commit Message: Adds envoy.stat_sinks.dynamic_modules, a contrib stats sink that delegates Envoy's flush and histogram callbacks to a shared object via the existing dynamic modules ABI. Includes new ABI hooks and snapshot readback callbacks, Go SDK bindings (StatSink / StatSinkConfigFactory / MetricSnapshot + cgo glue), a v3 proto, and full unit + integration tests with C test modules for symbol-missing and config-new-fail paths.

Alpha; security posture requires_trusted_downstream_and_upstream

Additional Description: Introduces four new event hooks and five snapshot readback callbacks in the dynamic modules ABI for stats sinks, a contrib C++ implementation that supports both name-based and explicit-path module loading (remote sources are rejected; stats sinks are a process-wide bootstrap resource), a new v3 proto under api/contrib/, and Go SDK bindings (StatSink interface, factory registry, and cgo glue that returns a C-owned handle to satisfy Go's
pointer-passing rules across the ABI boundary). Per-flush name caching is scoped to the call stack to give modules stable pointers during the callback without steady-state memory growth.

Risk Level: Low. New opt-in contrib extension; no behavioral change unless configured. Security posture is requires_trusted_downstream_and_upstream (matches access_loggers/dynamic_modules and bootstrap/dynamic_modules, since dynamic modules run in-process with full Envoy privileges).

Testing: Unit coverage under contrib/stat_sinks/dynamic_modules/test/ (config_test, sink_test, abi_impl_test) exercises factory registration, proto validation, both load paths, all four symbol-missing failures, null-return from config_new, snapshot callback correctness across counters/gauges/text readouts, and regression guards for -Wdangling-gsl on Metric::name(). Integration test boots real Envoy with a C test module that emits log markers via envoy_dynamic_module_callback_log and asserts them with EXPECT_LOG_CONTAINS. Manually validated end-to-end with a Go module posting metrics to an HTTP receiver across many flush cycles under traffic.

Docs Changes: API proto is fully commented (package-level protodoc-title, per-field docs including serialization semantics for google.protobuf.Any). No changes under docs/root/.
Release Notes: Added to changelogs/current.yaml under new_features, area: stat_sinks.
Platform Specific Features: None. Extension uses the existing dlopen-based dynamic modules infrastructure, which is already supported on Linux and macOS.
[Optional Runtime guard:]
[Optional Fixes #Issue]
[Optional Fixes commit #PR or SHA]
[Optional Deprecated:]
[Optional API Considerations:]

Adds `envoy.stat_sinks.dynamic_modules`, a contrib stats sink that delegates Envoy's flush and histogram callbacks to a shared object via
the existing dynamic modules ABI. Includes new ABI hooks and snapshot
readback callbacks, Go SDK bindings (StatSink / StatSinkConfigFactory /
MetricSnapshot + cgo glue), a v3 proto, and full unit + integration tests with C test modules for symbol-missing and config-new-fail paths.

Alpha; security posture requires_trusted_downstream_and_upstream

Signed-off-by: Prashanth Josyula <prashanth.16@gmail.com>
@mathetake
Copy link
Copy Markdown
Member

at least me and @agrawroh or @wbpcode can sponsor this so you can move this into a core stats extension

@prashanthjos
Copy link
Copy Markdown
Contributor Author

at least me and @agrawroh or @wbpcode can sponsor this so you can move this into a core stats extension

Thank you @mathetake I will wait for this build to finish to catch any errors, also do you mean I can move this code under source instead of contrib ?

Also here is an E2E example of how to use the Stats Sink Dynamic Module, along with a sample envoy.yaml

https://github.com/prashanthjos/go-stats-sink-dynamic-module/blob/main/http_sink.go
https://github.com/prashanthjos/go-stats-sink-dynamic-module/blob/main/envoy.yaml

@mathetake
Copy link
Copy Markdown
Member

do you mean I can move this code under source instead of contrib ?

yes

Signed-off-by: Prashanth Josyula <prashanth.16@gmail.com>
@repokitteh-read-only
Copy link
Copy Markdown

CC @envoyproxy/coverage-shephards: FYI only for changes made to (test/coverage.yaml).
envoyproxy/coverage-shephards assignee is @RyanTheOptimist
CC @envoyproxy/api-shepherds: Your approval is needed for changes made to (api/envoy/|docs/root/api-docs/).
envoyproxy/api-shepherds assignee is @wbpcode
CC @envoyproxy/api-watchers: FYI only for changes made to (api/envoy/|docs/root/api-docs/).

🐱

Caused by: #44861 was synchronize by prashanthjos.

see: more, trace.

@prashanthjos prashanthjos force-pushed the dynamic-modules-stats-sink branch 2 times, most recently from 8433009 to d8b3d69 Compare May 7, 2026 05:09
Signed-off-by: Prashanth Josyula <prashanth.16@gmail.com>
@prashanthjos prashanthjos force-pushed the dynamic-modules-stats-sink branch from d8b3d69 to e26108c Compare May 7, 2026 16:24
@prashanthjos
Copy link
Copy Markdown
Contributor Author

prashanthjos commented May 7, 2026

@mathetake @wbpcode @agrawroh I have moved all the code to the source, can you please take a look and leave your thoughts and in the mean time I will ask Gemini to review :)

@prashanthjos
Copy link
Copy Markdown
Contributor Author

/gemini review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces the DynamicModuleStatsSink extension, which allows Envoy to load shared object files via dlopen to implement custom statistics sink behavior. The extension provides a C ABI for modules to receive periodic metric snapshots (counters, gauges, and text readouts) and synchronous histogram observations. It includes a Go SDK to facilitate module development in Go, along with comprehensive documentation and tests covering configuration, ABI implementation, and end-to-end integration. I have no feedback to provide as there are no review comments.

@prashanthjos prashanthjos changed the title dynamic_modules: add stats sink extension (contrib) dynamic_modules: add stats sink extension May 8, 2026
}

// --------------------------------------------------------------------------
// Stats Sink snapshot callbacks
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you not have stats sink impl here like other extension points?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done !

development; capabilities and ABI are expected to evolve.

How it works
------------
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you delete this implementation detail?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deleted!

* ``google.protobuf.Struct`` -- serialized to JSON before delivery.
* Any other message type -- raw proto bytes.

Go-module caveats
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you delete this too?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deleted!

prefix: "envoy."

Configuration payloads
----------------------
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This too

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deleted!

Comment thread CODEOWNERS Outdated
/*/extensions/grpc_credentials/file_based_metadata @wozz @yanavlasov
/*/extensions/internal_redirect @botengyao @penguingao
/*/extensions/stat_sinks/dog_statsd @taiki45 @paul-r-gall
/*/extensions/stat_sinks/dynamic_modules @prashanthjos @mattklein123
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be @mathetake @agrawroh and @wbpcode and you

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added!

Comment thread test/coverage.yaml Outdated
@@ -91,3 +91,4 @@ directories:
source/extensions/config_subscription/rest: 94.9
source/extensions/matching/input_matchers/cel_matcher: 100.0
source/extensions/dynamic_modules/sdk/cpp: 0.0 # SDK code self not directly tested
source/extensions/stat_sinks/dynamic_modules: 95.0 # Symbol-resolution error paths not fully exercised.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you not add this unless absolute necessary

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is removed

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Coverage LGTM

DynamicModuleStatsSink::DynamicModuleStatsSink(DynamicModuleStatsSinkConfigSharedPtr config)
: config_(std::move(config)) {}

void DynamicModuleStatsSink::flush(Stats::MetricSnapshot& snapshot) {
Copy link
Copy Markdown
Member

@mathetake mathetake May 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to avoid all the vector allocations in here?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes this is removed


void DynamicModuleStatsSink::onHistogramComplete(const Stats::Histogram& histogram,
uint64_t value) {
// Metric::name() returns std::string by value; bind it to a local so the
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you remove this kind of unnecessary comments? Smells pretty much like AI

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deleted!

// server start without an init manager isn't a pattern we want to introduce.
absl::StatusOr<Extensions::DynamicModules::DynamicModulePtr> dynamic_module_or_error;
if (module_config.has_module()) {
if (module_config.module().has_remote()) {
Copy link
Copy Markdown
Member

@mathetake mathetake May 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might not be for this PR but can't we just reuse the existing code for loading? What makes stats sink special?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have corrected this code.

* @param snapshot_envoy_ptr is the pointer to the MetricSnapshot.
* @param index is the index of the counter (0-based).
* @param name_out is the output buffer for the counter name. The buffer is owned by Envoy and
* valid only during this call.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean by "during this call"? What's the expected lifetime here?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated, changed the wording to "guaranteed to be valid until the end of the envoy_dynamic_module_on_stat_sink_flush event hook", which matches the lifetime convention used by other ABI callbacks (e.g. HTTP filter headers/data). Applied the same fix to all three snapshot callbacks (counter, gauge, text readout).

* @param snapshot_envoy_ptr is the pointer to the MetricSnapshot.
* @param index is the index of the text readout (0-based).
* @param name_out is the output buffer for the text readout name. The buffer is owned by Envoy
* and valid only during this call.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

* @param name_out is the output buffer for the text readout name. The buffer is owned by Envoy
* and valid only during this call.
* @param value_out is the output buffer for the text readout value. The buffer is owned by Envoy
* and valid only during this call.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated!

*
* @param snapshot_envoy_ptr is the pointer to the MetricSnapshot.
* @param index is the index of the gauge (0-based).
* @param name_out is the output buffer for the gauge name. The buffer is owned by Envoy and
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated!

@RyanTheOptimist
Copy link
Copy Markdown
Contributor

/wait

@prashanthjos
Copy link
Copy Markdown
Contributor Author

/wait

@RyanTheOptimist I am going to address the comments, is the wait because of an upcoming release or you see any issue with the PR?

@prashanthjos
Copy link
Copy Markdown
Contributor Author

I will wait for @wbpcode, @agrawroh comments also and address them together.

@mathetake
Copy link
Copy Markdown
Member

@prashanthjos you won't need to wait for others to comment. /wait is to just add a convenient PR label for maintainers and it stops sending notification to them until you iterate on this PR (i.e. pushing commits).

Signed-off-by: Prashanth Josyula <prashanth.16@gmail.com>
Signed-off-by: Prashanth Josyula <prashanth.16@gmail.com>
Signed-off-by: Prashanth Josyula <prashanth.16@gmail.com>
Comment thread test/coverage.yaml Outdated
@mathetake
Copy link
Copy Markdown
Member

/coverage

@repokitteh-read-only
Copy link
Copy Markdown

Coverage for this Pull Request will be rendered here:

https://storage.googleapis.com/envoy-cncf-pr/44861/coverage/index.html

For comparison, current coverage on main branch is here:

https://storage.googleapis.com/envoy-cncf-postsubmit/main/coverage/index.html

The coverage results are (re-)rendered each time the CI Envoy/Checks (coverage) job completes.

🐱

Caused by: a #44861 (comment) was created by @mathetake.

see: more, trace.

Copy link
Copy Markdown
Member

@wbpcode wbpcode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this great contribution and the ABI LGTM. Some comments are added to the implementation.

/wait

@@ -0,0 +1,309 @@
#include "source/extensions/dynamic_modules/abi/abi.h"
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you keep these test files at the test/extensions/dynamic_modules like other modules? thanks.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the catch, I missed it, will move the test files to test/extensions/dynamic_modules/stat_sink/ to match the pattern used by the HTTP filter, bootstrap, listener, network, and UDP test directories. Will do this in the next push.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is done, I moved the files.

// compiled module or plugins.
var httpFilterConfigFactoryRegistry = make(map[string]shared.HttpFilterConfigFactory)
var networkFilterConfigFactoryRegistry = make(map[string]shared.NetworkFilterConfigFactory)
var statSinkConfigFactoryRegistry = make(map[string]shared.StatSinkConfigFactory)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking Golang SDK into account 👍 But please also add the rust SDK at least. And we need to ensure our test data also cover the rust and go, cpp SDK and test are optional but if you can provided one, it would be better. (AI should be pretty helpful for this type of tasks)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wbpcode this is already a big PR, should we do the rust SDK in one more iteration? Also just want to let you know that I am not a rust expert :)

Signed-off-by: Prashanth Josyula <prashanth.16@gmail.com>
@agrawroh
Copy link
Copy Markdown
Member

Thanks for adding this! Would be very useful

Signed-off-by: Prashanth Josyula <prashanth.16@gmail.com>
@prashanthjos
Copy link
Copy Markdown
Contributor Author

@wbpcode @mathetake please take another look at the PR when you get a chance

@prashanthjos
Copy link
Copy Markdown
Contributor Author

@mathetake @agrawroh @wbpcode gentle reminder !

Copy link
Copy Markdown
Member

@wbpcode wbpcode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I want to say we should add rust SDK and cpp SDK. But this PR is there for a while. We can left that tasks to others.

But because the go SDK is there, please add go based test data and test the feature works correctly like my previous comments. Then I am good to land this first.

Comment on lines +226 to +234

// CounterSnapshot is one counter in a MetricSnapshot.
// The Name field shares memory with Envoy's buffer and is only valid during the
// enclosing OnFlush callback — copy it with ToString() if you need to keep it.
type CounterSnapshot struct {
Name UnsafeEnvoyBuffer
Value uint64
Delta uint64
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please merge main, and then, you will find we will place different module's API into different file. Like the Envoy exposed API should be stats_sink.go and the interface the users need to implement should be placed in the stats_sink_api.go.

namespace {

const Envoy::Extensions::StatSinks::DynamicModules::StatSinkFlushContext*
toCtx(envoy_dynamic_module_type_stat_sink_snapshot_envoy_ptr ptr) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

toStatsSinkFlushContext. In the cpp, we prefer full name if it's not that long.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants