Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: make flux controller limits configurable #21108

Merged
merged 5 commits into from
Apr 1, 2021

Conversation

lesam
Copy link
Contributor

@lesam lesam commented Mar 30, 2021

Closes #17212

A sample of the new config:

[flux-controller]
  query-concurrency = 0
  query-initial-memory-bytes = 0
  query-max-memory-bytes = 0
  total-max-memory-bytes = 0
  query-queue-size = 0

Also use the prometheus metrics in debug/vars, here is a sample:

"query_control_all_active": {"name":"query_control_all_active","tags":null,"values":{"gauge":0}},
"query_control_all_duration_seconds": {"name":"query_control_all_duration_seconds","tags":null,"values":{"0.001":0,"0.005":0,"0.025":0,"0.125":0,"0.625":0,"15.625":2,"3.125":2,"count":2,"sum":2.9953034240000003}},
"query_control_compiling_active": {"name":"query_control_compiling_active","tags":null,"values":{"gauge":0}},
"query_control_compiling_duration_seconds": {"name":"query_control_compiling_duration_seconds","tags":null,"values":{"0.001":2,"0.005":2,"0.025":2,"0.125":2,"0.625":2,"15.625":2,"3.125":2,"count":2,"sum":0.0010411650000000001}},
"query_control_executing_active": {"name":"query_control_executing_active","tags":null,"values":{"gauge":0}},
"query_control_executing_duration_seconds": {"name":"query_control_executing_duration_seconds","tags":null,"values":{"0.001":0,"0.005":0,"0.025":0,"0.125":0,"0.625":0,"15.625":2,"3.125":2,"count":2,"sum":2.994032791}},
"query_control_memory_unused_bytes": {"name":"query_control_memory_unused_bytes","tags":null,"values":{"gauge":0}},
"query_control_queueing_active": {"name":"query_control_queueing_active","tags":null,"values":{"gauge":0}},
"query_control_queueing_duration_seconds": {"name":"query_control_queueing_duration_seconds","tags":null,"values":{"0.001":2,"0.005":2,"0.025":2,"0.125":2,"0.625":2,"15.625":2,"3.125":2,"count":2,"sum":0.000087963}},
"query_control_requests_total": {"name":"query_control_requests_total","tags":null,"values":{"counter":1}},
"query_control_requests_total:1": {"name":"query_control_requests_total","tags":null,"values":{"counter":1}}

A sample of the new config:

```
[flux-controller]
  query-concurrency = 0
  query-initial-memory-bytes = 0
  query-max-memory-bytes = 0
  total-max-memory-bytes = 0
  query-queue-size = 0
```

Also use the prometheus metrics in debug/vars, here is a sample:

```
"query_control_all_active": {"name":"query_control_all_active","tags":null,"values":{"gauge":0}},
"query_control_all_duration_seconds": {"name":"query_control_all_duration_seconds","tags":null,"values":{"0.001":0,"0.005":0,"0.025":0,"0.125":0,"0.625":0,"15.625":2,"3.125":2,"count":2,"sum":2.9953034240000003}},
"query_control_compiling_active": {"name":"query_control_compiling_active","tags":null,"values":{"gauge":0}},
"query_control_compiling_duration_seconds": {"name":"query_control_compiling_duration_seconds","tags":null,"values":{"0.001":2,"0.005":2,"0.025":2,"0.125":2,"0.625":2,"15.625":2,"3.125":2,"count":2,"sum":0.0010411650000000001}},
"query_control_executing_active": {"name":"query_control_executing_active","tags":null,"values":{"gauge":0}},
"query_control_executing_duration_seconds": {"name":"query_control_executing_duration_seconds","tags":null,"values":{"0.001":0,"0.005":0,"0.025":0,"0.125":0,"0.625":0,"15.625":2,"3.125":2,"count":2,"sum":2.994032791}},
"query_control_memory_unused_bytes": {"name":"query_control_memory_unused_bytes","tags":null,"values":{"gauge":0}},
"query_control_queueing_active": {"name":"query_control_queueing_active","tags":null,"values":{"gauge":0}},
"query_control_queueing_duration_seconds": {"name":"query_control_queueing_duration_seconds","tags":null,"values":{"0.001":2,"0.005":2,"0.025":2,"0.125":2,"0.625":2,"15.625":2,"3.125":2,"count":2,"sum":0.000087963}},
"query_control_requests_total": {"name":"query_control_requests_total","tags":null,"values":{"counter":1}},
"query_control_requests_total:1": {"name":"query_control_requests_total","tags":null,"values":{"counter":1}}
```
@lesam lesam requested a review from danxmoran March 30, 2021 22:44
cmd/influxd/run/config.go Show resolved Hide resolved
query/control/controller.go Show resolved Hide resolved
@@ -117,7 +116,7 @@ func validateUnusedMemory(t testing.TB, reg *prometheus.Registry, c control.Conf
}

func TestController_QuerySuccess(t *testing.T) {
ctrl, err := control.New(config)
ctrl, err := control.New(config, nil, executorDependencies)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of nil for these, you could use zaptest.Logger(t). That way we'll get the log info on failures.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

query/control/controller_test.go Show resolved Hide resolved
Copy link
Contributor Author

@lesam lesam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wrong PR - comment removed.

danxmoran
danxmoran previously approved these changes Mar 31, 2021
@lesam
Copy link
Contributor Author

lesam commented Mar 31, 2021

Note query_control in the metric names have been replaced to qc

@lesam
Copy link
Contributor Author

lesam commented Mar 31, 2021

Looks like the race detector failed on the control test...

c.shutdown = true
if len(c.queries) == 0 {
c.queriesMu.Unlock()
return nil
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was racy - even if all the queries are removed from the map, that doesn't mean that all the spawned goroutines are finished. When I added the zaptest logging I started seeing failures from the race detector because we would get a bit of logging after the end of the test, when the test logger was no longer valid.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the same problem in 2.x?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likely yes.

danxmoran
danxmoran previously approved these changes Mar 31, 2021
Previously some tests were failing due to logging after the end of the test.
@lesam lesam merged commit f1aa051 into influxdata:master-1.x Apr 1, 2021
lesam added a commit that referenced this pull request Apr 1, 2021
* feat: make flux controller limits configurable

A sample of the new config:

```
[flux-controller]
  query-concurrency = 0
  query-initial-memory-bytes = 0
  query-max-memory-bytes = 0
  total-max-memory-bytes = 0
  query-queue-size = 0
```

Also use the prometheus metrics in debug/vars, here is a sample:

```
"query_control_all_active": {"name":"query_control_all_active","tags":null,"values":{"gauge":0}},
"query_control_all_duration_seconds": {"name":"query_control_all_duration_seconds","tags":null,"values":{"0.001":0,"0.005":0,"0.025":0,"0.125":0,"0.625":0,"15.625":2,"3.125":2,"count":2,"sum":2.9953034240000003}},
"query_control_compiling_active": {"name":"query_control_compiling_active","tags":null,"values":{"gauge":0}},
"query_control_compiling_duration_seconds": {"name":"query_control_compiling_duration_seconds","tags":null,"values":{"0.001":2,"0.005":2,"0.025":2,"0.125":2,"0.625":2,"15.625":2,"3.125":2,"count":2,"sum":0.0010411650000000001}},
"query_control_executing_active": {"name":"query_control_executing_active","tags":null,"values":{"gauge":0}},
"query_control_executing_duration_seconds": {"name":"query_control_executing_duration_seconds","tags":null,"values":{"0.001":0,"0.005":0,"0.025":0,"0.125":0,"0.625":0,"15.625":2,"3.125":2,"count":2,"sum":2.994032791}},
"query_control_memory_unused_bytes": {"name":"query_control_memory_unused_bytes","tags":null,"values":{"gauge":0}},
"query_control_queueing_active": {"name":"query_control_queueing_active","tags":null,"values":{"gauge":0}},
"query_control_queueing_duration_seconds": {"name":"query_control_queueing_duration_seconds","tags":null,"values":{"0.001":2,"0.005":2,"0.025":2,"0.125":2,"0.625":2,"15.625":2,"3.125":2,"count":2,"sum":0.000087963}},
"query_control_requests_total": {"name":"query_control_requests_total","tags":null,"values":{"counter":1}},
"query_control_requests_total:1": {"name":"query_control_requests_total","tags":null,"values":{"counter":1}}
```

* chore: update changelog

* fix: shorten metric names for query control

* fix: zaptest logger and goimports

* fix: races in the query controller

Previously some tests were failing due to logging after the end of the test.
@@ -130,10 +131,23 @@ func (c *Config) complete() (Config, error) {
}

func (c *Config) validate(isComplete bool) error {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

validation of config - for docs team

ExecutorDependencies: []flux.Dependency{storageDep},
}
srv.Handler.Controller, err = control.New(config)
srv.Handler.Controller, err = control.New(s.config.FluxController, s.Logger.With(zap.String("service", "flux-controller")), []flux.Dependency{storageDep})
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Controller configuration

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants