[log] [metric] Log Setting API, More TRACE Logs, and Gatherer Support #201

onelapahead · 2025-11-30T23:55:03Z

This is a slightly philosophical PR, but while working on codebases that use ff-common I often fail to find the log message I need, and I often have to fight to filter out the noise of logs I rarely need in production (though they are helpful in development). Originally adapted from #200.

In short, I think we log too much at the debug level, especially in these libraries, which should mostly log at trace, and not enough at the info level. I worry we depend too much on debug to "see everything", and we can't guarantee users will actually have debug on when they hit a new problem. We need to make sure info has "enough" (which is very hard to know) to debug known issues, bugs, and edge cases, and compensate with metrics and tracing. For unknown issues, or networking and database-related issues, that's where trace can be enabled temporarily to help triage.

Between human SREs, log aggregation systems, and agentic AI, every wasted byte has a cost in storage, processing, and context. A "less is more" mindset is necessary, especially as we consider adopting more modern, performant logging frameworks like https://github.com/uber-go/zap, which encourage structured logging and sampling.

All that said - this PR proposes two significant changes:

1. It moves many internal logs in ffapi, ffresty, fftls, and dbsql down to trace, to avoid logging tens to hundreds of lines per API request (especially when TLS and databases are in play).
2. It allows the log level of a process to be configured dynamically using a new PUT API on the monitoring server:
```
curl -X PUT "http://localhost:6000/logging?level=<info|trace|debug|...>"
```
This means a user can change the log level of a running process to see more detail when needed, especially the logs that are now at trace as described above, without continually over-logging.
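For callers that would rather script this from Go than from curl, the request could look roughly like the sketch below. It assumes only what the description states (a PUT to `/logging` with a `level` query parameter on the monitoring server, served from the root path); `buildLevelURL` and `setLogLevel` are illustrative names and are not part of ff-common.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"os"
	"time"
)

// buildLevelURL builds the request URL for the logging endpoint.
// The "/logging" path and "level" parameter come from the PR description;
// the function name is hypothetical, and the endpoint is assumed to sit at the root path.
func buildLevelURL(baseURL, level string) (string, error) {
	u, err := url.Parse(baseURL)
	if err != nil {
		return "", err
	}
	u.Path = "/logging"
	q := url.Values{}
	q.Set("level", level)
	u.RawQuery = q.Encode()
	return u.String(), nil
}

// setLogLevel sends the PUT request and returns the HTTP status code.
func setLogLevel(client *http.Client, baseURL, level string) (int, error) {
	target, err := buildLevelURL(baseURL, level)
	if err != nil {
		return 0, err
	}
	req, err := http.NewRequest(http.MethodPut, target, nil)
	if err != nil {
		return 0, err
	}
	res, err := client.Do(req)
	if err != nil {
		return 0, err
	}
	defer res.Body.Close()
	return res.StatusCode, nil
}

func main() {
	client := &http.Client{Timeout: 5 * time.Second}
	status, err := setLogLevel(client, "http://localhost:6000", "trace")
	if err != nil {
		fmt.Fprintln(os.Stderr, "request failed:", err)
		return
	}
	fmt.Println("status:", status)
}
```

Building the query with `url.Values` avoids hand-escaping the level string, which matters if the API later grows more parameters.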

Additionally, this exposes the prometheus.Gatherer of the metrics manager's Prometheus registry to allow for custom metrics exporting and filtering.

EnriqueL8

Thanks @onelapahead - I agree with moving those logs to trace. I've felt the same pain when debugging with logs polluted by that noise.

I have a few questions about the way the log level is set, and some confusion there.

EnriqueL8 · 2025-12-01T10:48:08Z

pkg/ffapi/apiserver.go

 func (as *apiServer[T]) createMuxRouter(ctx context.Context) (*mux.Router, error) {
 	r := mux.NewRouter().UseEncodedPath()
-	hf := as.handlerFactory()
+	hf := as.handlerFactory(logrus.InfoLevel)


So by default it's info level? I'm not sure I understand this one; I thought it would use the log level held on the apiServer object?

EnriqueL8 · 2025-12-01T10:48:53Z

pkg/ffapi/apiserver.go

+	if logLevel != "" {
+		ctx := log.WithLogFields(req.Context(), "new_level", logLevel)
+		log.L(ctx).Warn("changing log level", logLevel)
+		log.SetLevel(logLevel)
+	}


Should we validate that the value is one of the supported levels?
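One way to read this suggestion: logrus already ships `logrus.ParseLevel`, which returns an error for names it does not recognize, so the handler could parse before applying. The sketch below shows the same idea using only the standard library so it runs on its own; `supportedLevels` and `validateLevel` are hypothetical names, and the accepted set is an assumption modeled on the level names logrus parses.

```go
package main

import (
	"fmt"
	"strings"
)

// supportedLevels is an assumed allow-list modeled on the names logrus.ParseLevel accepts.
var supportedLevels = map[string]bool{
	"panic": true, "fatal": true, "error": true, "warn": true, "warning": true,
	"info": true, "debug": true, "trace": true,
}

// validateLevel normalizes the requested level and rejects unknown or empty names.
func validateLevel(raw string) (string, error) {
	level := strings.ToLower(strings.TrimSpace(raw))
	if !supportedLevels[level] {
		return "", fmt.Errorf("unsupported log level %q", raw)
	}
	return level, nil
}

func main() {
	for _, raw := range []string{"TRACE", "verbose"} {
		level, err := validateLevel(raw)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("accepted:", level)
	}
}
```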

EnriqueL8 · 2025-12-01T10:49:17Z

pkg/ffapi/apiserver.go

+
+	// TODO allow for toggling formatting (json, text), sampling, etc.
+
+	return http.StatusAccepted, nil


If no log level provided, shouldn't we return malformed?
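A minimal sketch of the behavior being asked for, assuming the level arrives as the `level` query parameter: a missing value yields 400, while a provided value is applied and yields 202 as in the diff. `setLevelStatus` is a hypothetical helper, and `apply` stands in for the real `log.SetLevel` call.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// setLevelStatus decides the HTTP status for a log-level request.
// apply is a stand-in for whatever actually changes the level.
func setLevelStatus(rawQuery string, apply func(level string)) int {
	query, err := url.ParseQuery(rawQuery)
	if err != nil {
		return http.StatusBadRequest
	}
	level := query.Get("level")
	if level == "" {
		// Nothing was requested, so report a malformed request instead of 202.
		return http.StatusBadRequest
	}
	apply(level)
	return http.StatusAccepted
}

func main() {
	apply := func(level string) { fmt.Println("level set to", level) }
	fmt.Println(setLevelStatus("", apply))
	fmt.Println(setLevelStatus("level=trace", apply))
}
```

If the endpoint later accepts other settings (formatting, sampling), "nothing provided" would need to mean that none of the known parameters were present, not just `level`.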

EnriqueL8 · 2025-12-01T10:57:36Z

pkg/ffapi/apiserver.go

 	r := mux.NewRouter().UseEncodedPath()
-	hf := as.handlerFactory() // TODO separate factory for monitoring ??
+	// This ensures logs aren't polluted with monitoring API requests such as metrics or probes
+	hf := as.handlerFactory(logrus.TraceLevel)


Did you mean to set info level instead of trace?
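Going by the code comment in the diff, the argument appears to be the level at which request logs are emitted, not the process log level. The sketch below illustrates that reading with the standard library's `log/slog` standing in for logrus (slog has no built-in trace level, so `levelTrace` is a custom value); `requestLogger` and `logsRequest` are hypothetical names and not the ffapi implementation.

```go
package main

import (
	"bytes"
	"fmt"
	"log/slog"
	"net/http"
	"net/http/httptest"
)

const (
	levelInfo = slog.LevelInfo
	// levelTrace is a custom level below slog.LevelDebug (-4).
	levelTrace = slog.Level(-8)
)

// requestLogger returns middleware that logs each request at the given level.
func requestLogger(logger *slog.Logger, level slog.Level) func(http.Handler) http.Handler {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			logger.Log(r.Context(), level, "request", "method", r.Method, "path", r.URL.Path)
			next.ServeHTTP(w, r)
		})
	}
}

// logsRequest reports whether a request log line is written when the process
// runs at processLevel and the middleware logs at requestLevel.
func logsRequest(processLevel, requestLevel slog.Level) bool {
	var buf bytes.Buffer
	logger := slog.New(slog.NewTextHandler(&buf, &slog.HandlerOptions{Level: processLevel}))
	ok := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	handler := requestLogger(logger, requestLevel)(ok)
	handler.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest(http.MethodGet, "/metrics", nil))
	return buf.Len() > 0
}

func main() {
	fmt.Println("process at info, request logged at trace:", logsRequest(levelInfo, levelTrace))
	fmt.Println("process at trace, request logged at trace:", logsRequest(levelTrace, levelTrace))
}
```

Under this reading, passing trace for the monitoring router keeps probe and metrics requests out of the logs unless the process itself is switched to trace.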

EnriqueL8 · 2025-12-01T10:57:53Z

pkg/ffapi/handler.go

+	LogLevel              *logrus.Level
 	DefaultRequestTimeout time.Duration
 	MaxTimeout            time.Duration
 	DefaultFilterLimit    uint64
 	MaxFilterSkip         uint64
 	MaxFilterLimit        uint64
 	HandleYAML            bool
 	PassthroughHeaders    []string
 	AlwaysPaginate        bool
 	SupportFieldRedaction bool
 	BasePath              string
 	BasePathParams        []*PathParam
+
+	logLevel logrus.Level


why do we have two?

onelapahead added 12 commits November 19, 2025 15:35

[log] logr.Logger and logr.LogSink Support for controller-runtime Com…

6b5ffb0

…patibility Signed-off-by: hfuss <hayden.fuss@kaleido.io>

logr stores an internal ctx for fields; withlogfields and withfields …

143bde6

…for logging Signed-off-by: hfuss <hayden.fuss@kaleido.io>

use new withlogfields internally; document inefficiencies of logr imp…

4004679

…lementation Signed-off-by: hfuss <hayden.fuss@kaleido.io>

fixed linter

4629a14

Signed-off-by: hfuss <hayden.fuss@kaleido.io>

gatherer for custom metrics exporting

f21e9a0

Signed-off-by: hfuss <hayden.fuss@kaleido.io>

eliminating monitoring server logs; supporting changing log level dyn…

139bc91

…amically via monitoring server; all dbsql logs to trace Signed-off-by: hfuss <hayden.fuss@kaleido.io>

fixed test; removed other noisy log from dbsql

864ecbb

Signed-off-by: hfuss <hayden.fuss@kaleido.io>

move http conn logs to trace

f656209

Signed-off-by: hfuss <hayden.fuss@kaleido.io>

more flexible dynamic logging API for future settings

0e811ee

Signed-off-by: hfuss <hayden.fuss@kaleido.io>

fix panic for klog.Object structs in controller-runtime

d73a8c5

Signed-off-by: hfuss <hayden.fuss@kaleido.io>

removing logr implementation

e77cfe7

Signed-off-by: hfuss <hayden.fuss@kaleido.io>

less debug logs

064c97e

Signed-off-by: hfuss <hayden.fuss@kaleido.io>

onelapahead requested a review from a team as a code owner November 30, 2025 23:55

fix lint

a1ff727

Signed-off-by: hfuss <hayden.fuss@kaleido.io>

EnriqueL8 reviewed Dec 1, 2025
