Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor!: refactor how we do defaults for runtime overrides #12448

Merged
merged 13 commits into from Apr 3, 2024
1 change: 1 addition & 0 deletions CHANGELOG.md
Expand Up @@ -88,6 +88,7 @@
* [11121](https://github.com/grafana/loki/pull/11121) **periklis** Ensure all lifecycler cfgs ref a valid IPv6 addr and port combination
* [10650](https://github.com/grafana/loki/pull/10650) **matthewpi** Ensure the frontend uses a valid IPv6 addr and port combination
* [11665](https://github.com/grafana/loki/pull/11665) **salvacorts** Deprecate and flip `-legacy-read-mode` flag to `false` by default.
* [12448](https://github.com/grafana/loki/pull/12448/files) **slim-bean** BREAKING CHANGE: refactor how we do defaults for runtime overrides

#### Promtail

Expand Down
2 changes: 2 additions & 0 deletions cmd/loki/main.go
Expand Up @@ -17,6 +17,7 @@ import (
"github.com/prometheus/common/version"

"github.com/grafana/loki/v3/pkg/loki"
loki_runtime "github.com/grafana/loki/v3/pkg/runtime"
"github.com/grafana/loki/v3/pkg/util"
_ "github.com/grafana/loki/v3/pkg/util/build"
"github.com/grafana/loki/v3/pkg/util/cfg"
Expand Down Expand Up @@ -49,6 +50,7 @@ func main() {
// call it atleast once, the defaults are set to an empty struct.
// We call it with the flag values so that the config file unmarshalling only overrides the values set in the config.
validation.SetDefaultLimitsForYAMLUnmarshalling(config.LimitsConfig)
loki_runtime.SetDefaultLimitsForYAMLUnmarshalling(config.OperationalConfig)

// Init the logger which will honor the log level set in config.Server
if reflect.DeepEqual(&config.Server.LogLevel, &log.Level{}) {
Expand Down
37 changes: 35 additions & 2 deletions docs/sources/configure/_index.md
Expand Up @@ -186,7 +186,9 @@ Pass the `-config.expand-env` flag at the command line to enable this way of set
# shards for performance.
[compactor: <compactor>]

# The limits_config block configures global and per-tenant limits in Loki.
# The limits_config block configures global and per-tenant limits in Loki. The
# values here can be overridden in the `overrides` section of the runtime_config
# file
[limits_config: <limits_config>]

# The frontend_worker configures the worker - running within the Loki querier -
Expand All @@ -208,6 +210,11 @@ Pass the `-config.expand-env` flag at the command line to enable this way of set
# configuration file.
[runtime_config: <runtime_config>]

# These are values which allow you to control aspects of Loki's operation, most
# commonly used for controlling types of higher verbosity logging, the values
# here can be overridden in the `configs` section of the `runtime_config` file.
[operational_config: <operational_config>]

# Configuration for tracing.
[tracing: <tracing>]

Expand Down Expand Up @@ -2747,7 +2754,7 @@ retention:

### limits_config

The `limits_config` block configures global and per-tenant limits in Loki.
The `limits_config` block configures global and per-tenant limits in Loki. The values here can be overridden in the `overrides` section of the runtime_config file

```yaml
# Whether the ingestion rate limit should be applied individually to each
Expand Down Expand Up @@ -3675,6 +3682,32 @@ Configuration for 'runtime config' module, responsible for reloading runtime con
[file: <string> | default = ""]
```

### operational_config

These are values which allow you to control aspects of Loki's operation, most commonly used for controlling types of higher verbosity logging, the values here can be overridden in the `configs` section of the `runtime_config` file.

```yaml
# Log every new stream created by a push request (very verbose, recommend to
# enable via runtime config only).
# CLI flag: -operation-config.log-stream-creation
[log_stream_creation: <boolean> | default = false]

# Log every push request (very verbose, recommend to enable via runtime config
# only).
# CLI flag: -operation-config.log-push-request
[log_push_request: <boolean> | default = false]

# Log every stream in a push request (very verbose, recommend to enable via
# runtime config only).
# CLI flag: -operation-config.log-push-request-streams
[log_push_request_streams: <boolean> | default = false]

# Log push errors with a rate limited logger, will show client push errors
# without overly spamming logs.
# CLI flag: -operation-config.limited-log-push-errors
[limited_log_push_errors: <boolean> | default = true]
```

### tracing

Configuration for `tracing`.
Expand Down
6 changes: 6 additions & 0 deletions docs/sources/setup/upgrade/_index.md
Expand Up @@ -38,6 +38,12 @@ The output is incredibly verbose as it shows the entire internal config struct u

### Loki

#### Removed the `default` section of the runtime overrides config file.

This was introduced in 2.9 and likely not widely used. This only affects you if you run Loki with a runtime config file AND you had populated the new `default` block added in 2.9.

The `default` block was removed and instead a top level config now exists in the standard Loki config called `operational_config`, you can set default values here for runtime configs.

#### Removed `shared_store` and `shared_store_key_prefix` from shipper configuration

The following CLI flags and the corresponding YAML settings to configure shared store for TSDB and BoltDB shippers are now removed:
Expand Down
8 changes: 5 additions & 3 deletions pkg/loki/loki.go
Expand Up @@ -99,9 +99,10 @@ type Config struct {
TableManager index.TableManagerConfig `yaml:"table_manager,omitempty"`
MemberlistKV memberlist.KVConfig `yaml:"memberlist"`

RuntimeConfig runtimeconfig.Config `yaml:"runtime_config,omitempty"`
Tracing tracing.Config `yaml:"tracing"`
Analytics analytics.Config `yaml:"analytics"`
RuntimeConfig runtimeconfig.Config `yaml:"runtime_config,omitempty"`
OperationalConfig runtime.Config `yaml:"operational_config,omitempty"`
Tracing tracing.Config `yaml:"tracing"`
Analytics analytics.Config `yaml:"analytics"`

LegacyReadTarget bool `yaml:"legacy_read_target,omitempty" doc:"hidden|deprecated"`

Expand Down Expand Up @@ -171,6 +172,7 @@ func (c *Config) RegisterFlags(f *flag.FlagSet) {
c.BloomCompactor.RegisterFlags(f)
c.QueryScheduler.RegisterFlags(f)
c.Analytics.RegisterFlags(f)
c.OperationalConfig.RegisterFlags(f)
}

func (c *Config) registerServerFlagsWithChangedDefaultValues(fs *flag.FlagSet) {
Expand Down
1 change: 1 addition & 0 deletions pkg/loki/modules.go
Expand Up @@ -260,6 +260,7 @@ func (t *Loki) initRuntimeConfig() (services.Service, error) {

// make sure to set default limits before we start loading configuration into memory
validation.SetDefaultLimitsForYAMLUnmarshalling(t.Cfg.LimitsConfig)
runtime.SetDefaultLimitsForYAMLUnmarshalling(t.Cfg.OperationalConfig)

var err error
t.runtimeConfig, err = runtimeconfig.New(t.Cfg.RuntimeConfig, "loki", prometheus.WrapRegistererWithPrefix("loki_", prometheus.DefaultRegisterer), util_log.Logger)
Expand Down
5 changes: 1 addition & 4 deletions pkg/loki/runtime_config.go
Expand Up @@ -21,9 +21,6 @@ type runtimeConfigValues struct {
TenantLimits map[string]*validation.Limits `yaml:"overrides"`
TenantConfig map[string]*runtime.Config `yaml:"configs"`

// DefaultConfig defines the default runtime configuration. Used if a tenant runtime configuration wasn't given.
DefaultConfig *runtime.Config `yaml:"default"`

Multi kv.MultiRuntimeConfig `yaml:"multi_kv_config"`
}

Expand Down Expand Up @@ -107,7 +104,7 @@ func (t *tenantConfigProvider) TenantConfig(userID string) *runtime.Config {
if tenantCfg, ok := cfg.TenantConfig[userID]; ok {
return tenantCfg
}
return cfg.DefaultConfig
return nil
}

func multiClientRuntimeConfigChannel(manager *runtimeconfig.Manager) func() <-chan kv.MultiRuntimeConfig {
Expand Down
23 changes: 11 additions & 12 deletions pkg/loki/runtime_config_test.go
Expand Up @@ -87,9 +87,6 @@ overrides:
func Test_DefaultConfig(t *testing.T) {
runtimeGetter := newTestRuntimeconfig(t,
`
default:
log_push_request: true
limited_log_push_errors: false
configs:
"1":
log_push_request: false
Expand All @@ -98,16 +95,15 @@ configs:
log_push_request: true
`)

user1 := runtimeGetter.TenantConfig("1")
user2 := runtimeGetter.TenantConfig("2")
user3 := runtimeGetter.TenantConfig("3")
tenantConfigs, err := runtime.NewTenantConfigs(runtimeGetter)
require.NoError(t, err)

require.Equal(t, false, user1.LogPushRequest)
require.Equal(t, false, user1.LimitedLogPushErrors)
require.Equal(t, false, user2.LimitedLogPushErrors)
require.Equal(t, true, user2.LogPushRequest)
require.Equal(t, false, user3.LimitedLogPushErrors)
require.Equal(t, true, user3.LogPushRequest)
require.Equal(t, false, tenantConfigs.LogPushRequest("1"))
require.Equal(t, false, tenantConfigs.LimitedLogPushErrors("1"))
require.Equal(t, false, tenantConfigs.LimitedLogPushErrors("2"))
require.Equal(t, true, tenantConfigs.LogPushRequest("2"))
require.Equal(t, true, tenantConfigs.LimitedLogPushErrors("3"))
require.Equal(t, false, tenantConfigs.LogPushRequest("3"))
}

func newTestRuntimeconfig(t *testing.T, yaml string) runtime.TenantConfigProvider {
Expand All @@ -126,7 +122,10 @@ func newTestRuntimeconfig(t *testing.T, yaml string) runtime.TenantConfigProvide
}
flagset := flag.NewFlagSet("", flag.PanicOnError)
var defaults validation.Limits
var operations runtime.Config
defaults.RegisterFlags(flagset)
operations.RegisterFlags(flagset)
runtime.SetDefaultLimitsForYAMLUnmarshalling(operations)
require.NoError(t, flagset.Parse(nil))

reg := prometheus.NewPedanticRegistry()
Expand Down
44 changes: 40 additions & 4 deletions pkg/runtime/config.go
@@ -1,5 +1,9 @@
package runtime

import (
"flag"
)

type Config struct {
LogStreamCreation bool `yaml:"log_stream_creation"`
LogPushRequest bool `yaml:"log_push_request"`
Expand All @@ -9,7 +13,26 @@ type Config struct {
LimitedLogPushErrors bool `yaml:"limited_log_push_errors"`
}

var EmptyConfig = &Config{}
// RegisterFlags adds the flags required to config this to the given FlagSet
func (cfg *Config) RegisterFlags(f *flag.FlagSet) {
f.BoolVar(&cfg.LogStreamCreation, "operation-config.log-stream-creation", false, "Log every new stream created by a push request (very verbose, recommend to enable via runtime config only).")
f.BoolVar(&cfg.LogPushRequest, "operation-config.log-push-request", false, "Log every push request (very verbose, recommend to enable via runtime config only).")
f.BoolVar(&cfg.LogPushRequestStreams, "operation-config.log-push-request-streams", false, "Log every stream in a push request (very verbose, recommend to enable via runtime config only).")
f.BoolVar(&cfg.LimitedLogPushErrors, "operation-config.limited-log-push-errors", true, "Log push errors with a rate limited logger, will show client push errors without overly spamming logs.")
}

// When we load YAML from disk, we want the various per-customer limits
// to default to any values specified on the command line, not default
// command line values. This global contains those values. I (Tom) cannot
// find a nicer way I'm afraid.
var defaultConfig *Config

// SetDefaultLimitsForYAMLUnmarshalling sets global default limits, used when loading
// Limits from YAML files. This is used to ensure per-tenant limits are defaulted to
// those values.
func SetDefaultLimitsForYAMLUnmarshalling(defaults Config) {
defaultConfig = &defaults
}

// TenantConfigProvider serves a tenant or default config.
type TenantConfigProvider interface {
Expand All @@ -23,13 +46,26 @@ type TenantConfigs struct {
}

// DefaultTenantConfigs creates and returns a new TenantConfigs with the defaults populated.
// Only useful for testing, the provider will ignore any tenants passed in.
func DefaultTenantConfigs() *TenantConfigs {
return &TenantConfigs{
TenantConfigProvider: nil,
TenantConfigProvider: &defaultsOnlyConfigProvider{},
}
}

type defaultsOnlyConfigProvider struct {
}

// TenantConfig implementation for defaultsOnlyConfigProvider, ignores the tenant input and only returns a default config
func (t *defaultsOnlyConfigProvider) TenantConfig(_ string) *Config {
if defaultConfig == nil {
defaultConfig = &Config{}
defaultConfig.RegisterFlags(flag.NewFlagSet("", flag.PanicOnError))
}
return defaultConfig
}

// NewTenantConfig makes a new TenantConfigs
// NewTenantConfigs makes a new TenantConfigs
func NewTenantConfigs(configProvider TenantConfigProvider) (*TenantConfigs, error) {
return &TenantConfigs{
TenantConfigProvider: configProvider,
Expand All @@ -43,7 +79,7 @@ func (o *TenantConfigs) getOverridesForUser(userID string) *Config {
return l
}
}
return EmptyConfig
return defaultConfig
}

func (o *TenantConfigs) LogStreamCreation(userID string) bool {
Expand Down
9 changes: 0 additions & 9 deletions production/ksonnet/loki/overrides.libsonnet
Expand Up @@ -25,14 +25,6 @@ local k = import 'ksonnet-util/kausal.libsonnet';
// log_push_request_streams: true,
// },
},

defaultRuntimeConfigs: {
// override the default runtime configs. Useful to change a behavior globally across multiple tenants. See pkg/loki/runtime_config.go
//
// log_stream_creation: true,
// log_push_request: true,
// log_push_request_streams: true,
},
},
local configMap = k.core.v1.configMap,

slim-bean marked this conversation as resolved.
Show resolved Hide resolved
Expand All @@ -43,7 +35,6 @@ local k = import 'ksonnet-util/kausal.libsonnet';
{
overrides: $._config.overrides,
configs: $._config.runtimeConfigs,
default: $._config.defaultRuntimeConfigs,
}
+ (if std.length($._config.multi_kv_config) > 0 then { multi_kv_config: $._config.multi_kv_config } else {}),
),
Expand Down
8 changes: 7 additions & 1 deletion tools/doc-generator/parse/root_blocks.go
Expand Up @@ -27,6 +27,7 @@ import (
"github.com/grafana/loki/v3/pkg/querier/queryrange"
querier_worker "github.com/grafana/loki/v3/pkg/querier/worker"
"github.com/grafana/loki/v3/pkg/ruler"
"github.com/grafana/loki/v3/pkg/runtime"
"github.com/grafana/loki/v3/pkg/scheduler"
"github.com/grafana/loki/v3/pkg/storage"
"github.com/grafana/loki/v3/pkg/storage/chunk/cache"
Expand Down Expand Up @@ -133,7 +134,7 @@ var (
{
Name: "limits_config",
StructType: []reflect.Type{reflect.TypeOf(validation.Limits{})},
Desc: "The limits_config block configures global and per-tenant limits in Loki.",
Desc: "The limits_config block configures global and per-tenant limits in Loki. The values here can be overridden in the `overrides` section of the runtime_config file",
},
{
Name: "frontend_worker",
Expand All @@ -151,6 +152,11 @@ var (
StructType: []reflect.Type{reflect.TypeOf(runtimeconfig.Config{})},
Desc: "Configuration for 'runtime config' module, responsible for reloading runtime configuration file.",
},
{
Name: "operational_config",
StructType: []reflect.Type{reflect.TypeOf(runtime.Config{})},
Desc: "These are values which allow you to control aspects of Loki's operation, most commonly used for controlling types of higher verbosity logging, the values here can be overridden in the `configs` section of the `runtime_config` file.",
},
{
Name: "tracing",
StructType: []reflect.Type{reflect.TypeOf(tracing.Config{})},
Expand Down