Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

alllow for database connection, grpc worker thread pool, and k8s client level tuning of the watcher and api servers #744

Merged

Conversation

gabemontero
Copy link
Contributor

@gabemontero gabemontero commented Apr 1, 2024

Changes

pulls in #741 and now does
- add open and idle database connection pool settings to the config.env file
- but don't change golang's database/sql defaults
- add the ablity to increase GRPC worker thread poll count
- add the ability set set K8s client QPS and burst settings to the api sever via config.env and to the watcher via command line arguments

/kind feature

Submitter Checklist

These are the criteria that every PR should meet, please check them off as you review them:

  • Has Docs included if any changes are user facing
  • Has Tests included if any functionality added or changed
  • Tested your changes locally (if this is a code change)
  • Follows the commit message standard
  • Meets the Tekton contributor standards (including functionality, content, code)
  • Has a kind label. You can add a comment on this PR that contains /kind <type>. Valid types are bug, cleanup, design, documentation, feature, flake, misc, question, tep
  • Release notes block below has been updated with any user-facing changes (API changes, bug fixes, changes requiring upgrade notices or deprecation warnings)
  • Release notes contain the string "action required" if the change requires additional action from users switching to the new release

Release Notes

The number of idle and max database connections, grpc worker pool goroutines, k8s client qps and burst,  in the api server can be now configured using environment variables.
The k8s client qps and burst settings can be set via command line arguments on the watcher
No defaults from prior settings have been changed.

@tekton-robot tekton-robot added kind/feature Categorizes issue or PR as related to a new feature. release-note Denotes a PR that will be considered when it comes time to generate release notes. labels Apr 1, 2024
@tekton-robot tekton-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Apr 1, 2024
@gabemontero
Copy link
Contributor Author

@khrm @sayan-biswas @avinal @enarha PTAL

@tekton-robot
Copy link

The following is the coverage report on the affected files.
Say /test pull-tekton-results-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
cmd/api/main.go 11.4% 9.6% -1.8
pkg/api/server/config/config.go Do not exist 0.0%

@gabemontero
Copy link
Contributor Author

bump @khrm @sayan-biswas can I get a look at this please ??

also, in relation to the discussions in #741 about the db pool size, if it will expedite things, I'm fine at this point setting the defaults to 2 or whatever default size will expedite this getting merged

thanks

@khrm
Copy link
Contributor

khrm commented Apr 3, 2024

@gabemontero Sure. I will take a look tomorrow.

BTW, I wanted to test the effect of db maxPool to 10.

k8scfg := injection.ParseAndGetRESTConfigOrDie()

if qps != nil {
k8scfg.QPS = float32(*qps)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we multiply by the number of controllers we have? @vdemeester, we do something similar in pipelines.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So see here: https://github.com/knative/pkg/blob/5d4af76051e4daa445151321f4a53953d2ce7cbb/injection/sharedmain/main.go#L241
For default, we multiply by the number of controllers. So we should multiply for values provided also.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah interesting ... and yeah by comparison in core tekton there is this FIXME from @vdemeester :-) - https://github.com/tektoncd/pipeline/blob/main/cmd/controller/main.go#L72-L75

so we have 2 controller at the top of the watcher and my change here switches to MainWithConfig per

results/cmd/watcher/main.go

Lines 123 to 129 in 89b591d

sharedmain.MainWithConfig(injection.WithNamespaceScope(ctx, *namespace), "watcher", k8scfg,
func(ctx context.Context, cmw configmap.Watcher) *controller.Impl {
return pipelinerun.NewControllerWithConfig(ctx, results, cfg, cmw)
}, func(ctx context.Context, cmw configmap.Watcher) *controller.Impl {
return taskrun.NewControllerWithConfig(ctx, results, cfg, cmw)
},
)

and that there ^^ @khrm where we are now calling the MainWithConfig function you highlighted with your link https://github.com/knative/pkg/blob/5d4af76051e4daa445151321f4a53953d2ce7cbb/injection/sharedmain/main.go#L241 where we are already passing non-zero value for QPS and Burst means the code you highlighted, which I am now calling with this change, is doing the multiplication you are asking for.

So we are good here. Correct?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That code does multiplication for zero value. To have consistent behaviour across Tekton components, we should do the multiplication.

	if cfg.Burst == 0 {
		cfg.Burst = len(ctors) * rest.DefaultBurst
	}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally, knative/pkg itself should do the multiplication but I don't think that's coming anytime soon.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep looked too fast your right ... will update

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated @khrm in separate commit to ease review

@tekton-robot
Copy link

The following is the coverage report on the affected files.
Say /test pull-tekton-results-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
cmd/api/main.go 11.4% 9.6% -1.8
pkg/api/server/config/config.go Do not exist 0.0%

cmd/api/main.go Outdated
// Set DB connection limits
maxIdle := serverConfig.DB_MAX_IDLE_CONNECTIONS
maxOpen := serverConfig.DB_MAX_OPEN_CONNECTIONS
if maxOpen < 1 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's one more problem with this code is that elides standard library behavior. There's no way to get standard library behavior.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I follow you point entirely. Some clarification could help.

My best guess is that you don't want to have a default setting at all.

So that if they do not set the env vars, just do nothing, and leave the current behavior.

I'm fine with that change, if that is what you mean.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's what I meant. Or setting 0 means we are going to use the current behavior.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool thanks ... I've updated to PR to this effect

@khrm
Copy link
Contributor

khrm commented Apr 10, 2024

To verify this PR behavior vs default behavior, I install local DB in the cluster as we do in ./test/e2e/01-install.sh.

Then I ran tests once parallelly with 10:
Main branch results:
https://gist.github.com/khrm/62a6e24eb3b7926fd8678c7d257775e3
This PR results:
https://gist.github.com/khrm/58ba044b5dc9436a91ab5554a44ed020

I merged the main branch to this PR to avoid any discrepancy from earlier PRs. We can see that the former is more performant than the latter.

@gabemontero
Copy link
Contributor Author

To verify this PR behavior vs default behavior, I install local DB in the cluster as we do in ./test/e2e/01-install.sh.

Then I ran tests once parallelly with 10: Main branch results: https://gist.github.com/khrm/62a6e24eb3b7926fd8678c7d257775e3 This PR results: https://gist.github.com/khrm/58ba044b5dc9436a91ab5554a44ed020

I merged the main branch to this PR to avoid any discrepancy from earlier PRs. We can see that the former is more performant than the latter.

OK I will do the change I note in #744 (comment) so that the main branch behavior is not changed at all unless those env are set.

Not sure what precisely is different between your tests, my tests, and @stuartwdouglas tests but I don't think it matters at this point.

As long as we can change these settings and run tests in larger test clusters to see improvements, that is sufficient.

@gabemontero gabemontero force-pushed the stuarts-db-pool-plus-k8s-client-tuning branch from 55c6f62 to aba9aa9 Compare April 10, 2024 18:07
@tekton-robot
Copy link

The following is the coverage report on the affected files.
Say /test pull-tekton-results-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
cmd/api/main.go 11.4% 9.3% -2.1
pkg/api/server/config/config.go Do not exist 0.0%

@tekton-robot tekton-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 10, 2024
cmd/api/README.md Outdated Show resolved Hide resolved
@sayan-biswas
Copy link
Contributor

/approve

The README needs a minor change and commits squashed.

…nt level tuning of the watcher and api servers

- add open and idle database connection pool settings to the config.env file
- but don't change golang's database/sql defaults
- add the ablity to increase GRPC worker thread poll count
- add the ability set set K8s client QPS and burst settings to the api sever via config.env and to the watcher via command line arguments

rh-pre-commit.version: 2.2.0
rh-pre-commit.check-secrets: ENABLED
@gabemontero gabemontero changed the title Add ability to set DB connection pool limits, grpc worker pool size, k8s client qps/burst for both api server and watcher alllow for database connection, grpc worker thread pool, and k8s client level tuning of the watcher and api servers Apr 11, 2024
@gabemontero gabemontero force-pushed the stuarts-db-pool-plus-k8s-client-tuning branch from aba9aa9 to 033e897 Compare April 11, 2024 13:41
@tekton-robot tekton-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Apr 11, 2024
| DB_ENABLE_AUTO_MIGRATION | Auto-migrate the database on startup (create/update schemas). For further details, refer to <https://gorm.io/docs/migration.html> | true (default) |
| PROFILING | Enable profiling server | false (default) |
| PROFILING_PORT | Profiling Server Port | 6060 (default) |
| DB_MAX_IDLE_CONNECTIONS | The number of idle database connections to keep open | 2 (default for golang, but specific database drivers may have settings for this too) |
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the additional explanation widened the columns, and is why the git delta seem larger

use the w=1 parameter a la https://stackoverflow.com/questions/37007300/how-to-ignore-whitespace-in-github-when-comparing if you want to filter whitespace our

@gabemontero
Copy link
Contributor Author

README updated, commits squashed, PR title / description / release notes refreshed @sayan-biswas @khrm

I suggest viewing diffs with ignoring whitespace, i.e. https://github.com/tektoncd/results/pull/744/files?w=1 to filter out the widening of the READM md columns as I added more explanation about golang vs. DB driver defaults

@tekton-robot
Copy link

The following is the coverage report on the affected files.
Say /test pull-tekton-results-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
cmd/api/main.go 11.4% 9.3% -2.1
pkg/api/server/config/config.go Do not exist 0.0%

Copy link
Contributor

@khrm khrm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve

Looks good.

@tekton-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: khrm, sayan-biswas

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@khrm
Copy link
Contributor

khrm commented Apr 16, 2024

@sayan-biswas @avinal Can you take a cursory look? I think it looks good. We can give lgtm here.

@gabemontero
Copy link
Contributor Author

@sayan-biswas @avinal Can you take a cursory look? I think it looks good. We can give lgtm here.

@sayan-biswas did previously apply /approve with #744 (comment)

But of course if either @sayan-biswas or @avinal could take another look soon that would be great.

@sayan-biswas
Copy link
Contributor

/lgtm

@tekton-robot tekton-robot added the lgtm Indicates that a PR is ready to be merged. label Apr 16, 2024
@tekton-robot tekton-robot merged commit 714dfd2 into tektoncd:main Apr 16, 2024
5 of 6 checks passed
@gabemontero gabemontero deleted the stuarts-db-pool-plus-k8s-client-tuning branch April 16, 2024 20:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. kind/feature Categorizes issue or PR as related to a new feature. lgtm Indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants