NE-822 Don't scale route weight on single service routes #377

gcs278 · 2022-03-22T21:36:55Z

Change to not scale weight to 256 if there is only one service for the route. More information can be found here: NE-709 Impact of Server Weight on Memory Allocation

In a nutshell, scaling the weight to 256 is redundant for single-service routes: all servers in the HAProxy backend always have the same weight, so weight 1 is equivalent to weight 256. Using the smaller weight helps reduce the static memory allocation.
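To illustrate the idea in the description, here is a minimal sketch, not the router's actual code. The function `scaleWeights`, the constant `maxWeight`, and the service names are hypothetical, and the sketch ignores per-service endpoint counts, which the real calculation may take into account. It only shows the shortcut: with one service the relative weights cannot differ, so 1 is used instead of scaling up to 256.

```go
package main

import "fmt"

// maxWeight mirrors the 256 ceiling mentioned in the description.
const maxWeight = 256

// scaleWeights is a hypothetical, simplified stand-in for the weight
// calculation. For a single service it returns weight 1; otherwise it
// scales the weights so that the largest one becomes maxWeight.
func scaleWeights(weights map[string]int32) map[string]int32 {
	out := make(map[string]int32, len(weights))

	// Single-service route: every server shares the same weight, so the
	// smallest non-zero value is as good as 256.
	if len(weights) == 1 {
		for name := range weights {
			out[name] = 1
		}
		return out
	}

	// Find the largest configured weight.
	var max int32
	for _, w := range weights {
		if w > max {
			max = w
		}
	}

	// Scale every weight relative to the largest one.
	for name, w := range weights {
		if max == 0 {
			out[name] = 0
			continue
		}
		out[name] = w * maxWeight / max
	}
	return out
}

func main() {
	fmt.Println(scaleWeights(map[string]int32{"svc-a": 100}))
	// prints: map[svc-a:1]
	fmt.Println(scaleWeights(map[string]int32{"svc-a": 100, "svc-b": 50}))
	// prints: map[svc-a:256 svc-b:128]
}
```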

frobware · 2022-03-23T05:28:30Z

28 failures to create the sandbox

/retest

frobware · 2022-03-23T08:21:24Z

cluster bootstrap failed

/test e2e-upgrade


gcs278 · 2022-03-23T18:53:25Z

/retest

gcs278 · 2022-03-24T15:27:28Z

/retest

gcs278 · 2022-03-24T18:33:44Z

/retest

gcs278 · 2022-03-25T17:01:21Z

/retest

gcs278 · 2022-03-30T16:33:38Z

/hold

I think we should wait for perf&scale test.

gcs278 · 2022-03-31T13:26:11Z

/retest

gcs278 · 2022-04-05T13:57:15Z

/retest

Miciah · 2022-04-06T16:28:07Z

pkg/router/template/router.go

+		for key := range serviceUnits {
+			serviceUnitNames[key] = 1
+		}


What if the service unit has 0 endpoints? We should check for that case:

Suggested change

-		for key := range serviceUnits {
-			serviceUnitNames[key] = 1
-		}
+		for key := range serviceUnits {
+			if r.numberOfEndpoints(key) > 0 {
+				serviceUnitNames[key] = 1
+			}
+		}

Granted, omitting the r.numberOfEndpoints(key) > 0 check might not change the ultimate result from the config template:

	{{- range $serviceUnitName, $weight := $cfg.ServiceUnitNames }}
	  {{- if ge $weight 0 }}{{/* weight=0 is reasonable to keep existing connections to backends with cookies as we can see the HTTP headers */}}
	    {{- with $serviceUnit := index $.ServiceUnits $serviceUnitName }}
	      {{- range $idx, $endpoint := processEndpointsForAlias $cfg $serviceUnit (env "ROUTER_BACKEND_PROCESS_ENDPOINTS" "") }}
	        {{/* [actual HAProxy config stuff here] */}}
	      {{- end }}{{/* end range processEndpointsForAlias */}}
	    {{- end }}{{/* end get serviceUnit from its name */}}
	{{- end }}{{/* end range over serviceUnitNames */}}

In the template, the effect is the same whether range $serviceUnitName, $weight := $cfg.ServiceUnitNames iterates 0 times or whether range $idx, $endpoint := processEndpointsForAlias $cfg $serviceUnit (env "ROUTER_BACKEND_PROCESS_ENDPOINTS" "") iterates 0 times. However, it could cause problems for the dynamic config manager, which has logic like this:

	// As the endpoints have changed, recalculate the weights.
	newWeights := r.calculateServiceWeights(cfg.ServiceUnits)
	// Get the weight for this service unit.
	weight, ok := newWeights[id]
	if !ok {
		weight = 0
	}

Anyway, from a strict correctness perspective, I believe calculateServiceWeights needs the r.numberOfEndpoints(key) > 0 check.

Please add a unit test case to ensure we don't regress:

--- a/pkg/router/template/router_test.go
+++ b/pkg/router/template/router_test.go
@@ -873,6 +873,16 @@ func TestCalculateServiceWeights(t *testing.T) {
 		serviceWeights  map[ServiceUnitKey]int32
 		expectedWeights map[ServiceUnitKey]int32
 	}{
+		{
+			name: "service with no endpoint",
+			serviceUnits: map[ServiceUnitKey][]Endpoint{
+				suKey1: {},
+			},
+			serviceWeights: map[ServiceUnitKey]int32{
+				suKey1: 100,
+			},
+			expectedWeights: map[ServiceUnitKey]int32{},
+		},
 		{
 			name: "equally weighted services with same number of endpoints",
 			serviceUnits: map[ServiceUnitKey][]Endpoint{

Good point, I didn't think about a single service with no endpoints exposing that datapath. Agreed that it doesn't functionally change anything, but it keeps our code clean for the future.

Done.
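For readers following the thread, a minimal sketch of why the endpoint check matters to the lookup quoted from the dynamic config manager. This is not the router's code: `singleServiceWeights` and `weightFor` are hypothetical helpers, and endpoints are modeled as plain strings rather than the router's `Endpoint` type. With the check in place, a service with no endpoints is left out of the map, so the lookup falls back to weight 0 instead of reporting weight 1.

```go
package main

import "fmt"

// singleServiceWeights is a hypothetical simplification of the
// single-service branch discussed above: a service gets weight 1 only
// if it has at least one endpoint.
func singleServiceWeights(endpoints map[string][]string) map[string]int32 {
	weights := make(map[string]int32)
	for name, eps := range endpoints {
		if len(eps) > 0 {
			weights[name] = 1
		}
	}
	return weights
}

// weightFor follows the lookup pattern quoted in the review: a service
// missing from the map is treated as weight 0.
func weightFor(weights map[string]int32, id string) int32 {
	weight, ok := weights[id]
	if !ok {
		weight = 0
	}
	return weight
}

func main() {
	// Service with no endpoints: omitted from the map, so weight 0.
	empty := singleServiceWeights(map[string][]string{"svc-a": {}})
	fmt.Println(len(empty), weightFor(empty, "svc-a")) // prints: 0 0

	// Service with one endpoint: present with weight 1.
	ready := singleServiceWeights(map[string][]string{"svc-a": {"10.0.0.1:8080"}})
	fmt.Println(weightFor(ready, "svc-a")) // prints: 1
}
```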

gcs278 · 2022-04-07T13:50:01Z

/retest

gcs278 · 2022-04-11T14:22:14Z

/retest

gcs278 · 2022-04-12T22:40:14Z

/retest

Miciah · 2022-04-22T16:18:33Z

/lgtm
/hold cancel

openshift-bot · 2022-04-24T19:13:48Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2022-04-24T19:37:47Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2022-04-24T20:01:47Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2022-04-24T20:13:47Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2022-04-24T20:49:48Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2022-04-24T21:13:47Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2022-04-24T22:01:48Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2022-04-24T22:37:48Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2022-04-25T02:25:47Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2022-04-25T03:01:46Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2022-04-25T03:13:50Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2022-04-25T04:39:48Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2022-04-25T05:05:52Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2022-04-25T06:49:50Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

quarterpin · 2022-04-25T07:45:01Z

label /qe-approved

quarterpin · 2022-04-25T07:47:38Z

/label qe-approved

openshift-bot · 2022-04-25T08:46:48Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2022-04-25T11:43:52Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

gcs278 · 2022-04-25T14:59:59Z

/retest

openshift-bot · 2022-04-25T15:31:48Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2022-04-25T16:43:49Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2022-04-25T20:07:49Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

gcs278 · 2022-04-25T20:12:42Z

/skip

gcs278 · 2022-04-25T23:46:54Z

/label px-approved
Parent Epic NE-709 has px not needed.

openshift-bot · 2022-04-26T01:39:11Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2022-04-26T03:23:14Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

openshift-ci · 2022-04-26T05:08:55Z

@gcs278: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name	Commit	Details	Required	Rerun command
ci/prow/e2e-metal-ipi-ovn-ipv6	`9656da7`	link	false	`/test e2e-metal-ipi-ovn-ipv6`


openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 22, 2022

openshift-ci bot requested review from candita and Miciah March 22, 2022 21:39

openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 22, 2022

gcs278 force-pushed the NE822-Backend-Weight-Scale branch from 6b7042e to fc7e579 Compare March 22, 2022 21:39

frobware reviewed Mar 23, 2022


gcs278 force-pushed the NE822-Backend-Weight-Scale branch from fc7e579 to 76028a0 Compare March 23, 2022 13:40

gcs278 mentioned this pull request Mar 23, 2022

NE-825: Update router to use random balancing algorithm once again openshift/cluster-ingress-operator#727

Merged

gcs278 force-pushed the NE822-Backend-Weight-Scale branch from 76028a0 to e24ca3a Compare March 25, 2022 17:51

gcs278 changed the title ~~[WIP] NE-822 Don't scale route weight on single service routes~~ NE-822 Don't scale route weight on single service routes Mar 30, 2022

openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 30, 2022

openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 30, 2022

Miciah reviewed Apr 6, 2022


NE-822 Don't scale route weight on single service routes

9656da7

gcs278 force-pushed the NE822-Backend-Weight-Scale branch from e24ca3a to 9656da7 Compare April 6, 2022 22:04

openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 22, 2022

openshift-ci bot assigned Miciah Apr 22, 2022

openshift-ci bot added the qe-approved Signifies that QE has signed off on this PR label Apr 25, 2022

openshift-ci bot added the px-approved Signifies that Product Support has signed off on this PR label Apr 25, 2022

openshift-merge-robot merged commit 585d783 into openshift:master Apr 26, 2022
