Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add unsupported config override for maxconn #638

Conversation

Miciah
Copy link
Contributor

@Miciah Miciah commented Jul 22, 2021

This PR bundles (in logically separate commits) a fix to the reload interval override, a small fix to the E2E tests, and a new unsupported config override for ROUTER_MAX_CONNECTIONS.

waitForDeploymentComplete: Actually use timeout arg

  • test/e2e/operator_test.go (waitForDeploymentEnvVar): Use the provided timeout parameter instead of a hard-coded value of 1 minute.

Specify a time unit in RELOAD_INTERVAL

Specify a time unit suffix "s" for seconds in the value of the RELOAD_INTERVAL environment variable in router deployments. Omitting the time unit causes the following warning:

router "msg"="invalid interval, using default"  "default"=5 "interval"="5" "name"="RELOAD_INTERVAL"
  • pkg/operator/controller/ingress/deployment.go (desiredRouterDeployment): Add "s" time unit suffix to the reload interval.
  • pkg/operator/controller/ingress/deployment_test.go (TestDesiredRouterDeployment):
  • test/e2e/operator_test.go (TestReloadIntervalUnsupportedConfigOverride): Expect the time unit suffix.

Add unsupported config override for maxconn

Add an unsupported config override for setting HAProxy's global maxconn setting. If the config override is unspecified, the current default of 20000 is used. If the config override has the value "-1", the maxconn setting is omitted so that HAProxy automatically detects a reasonable value based on the value of ulimit -n.

  • pkg/operator/controller/ingress/deployment.go (RouterMaxConnectionsEnvName): New const.
    (desiredRouterDeployment): Add unsupported config override for ROUTER_MAX_CONNECTIONS.
  • pkg/operator/controller/ingress/deployment_test.go (TestDesiredRouterDeployment): Verify that desiredRouterDeployment sets ROUTER_MAX_CONNECTIONS as expected.
  • test/e2e/operator_test.go (TestMaxConnectionsUnsupportedConfigOverride): Verify that the operator sets ROUTER_MAX_CONNECTIONS to the appropriate value based on the value specified in the unsupported configuration override if an override is specified.

test/e2e: Get ingresscontroller before update

When updating an ingresscontroller, do a get immediately before doing the update.

This is a cleanup and should have no significant functional effect.

  • test/e2e/operator_test.go (TestProxyProtocolAPI, TestLoadBalancingAlgorithmUnsupportedConfigOverride, TestDynamicConfigManagerUnsupportedConfigOverride, TestReloadIntervalUnsupportedConfigOverride): Get ingresscontroller immediately before updating it.

test/e2e: Ignore some not-found errors on delete

Ignore not-found errors when deleting client pods. These pods run and then exit, and it is possible that they get garbage collected before the test code deletes them, in which case it is expected that they be not found.

  • test/e2e/canary_test.go (TestCanaryRoute):
  • test/e2e/client_tls_test.go (TestClientTLS):
  • test/e2e/http_header_buffer_test.go (TestHTTPHeaderBufferSize):
  • test/e2e/http_header_name_case_adjustment_test.go (TestHeaderNameCaseAdjustment):
  • test/e2e/operator_test.go (TestHTTPHeaderCapture, TestHTTPCookieCapture, TestUniqueIdHeader): Ignore not-found errors when deleting client pods.

Setting a config override of maxConnections: -1 requires openshift/router#304 to work properly.

* test/e2e/operator_test.go (waitForDeploymentEnvVar): Use the provided
timeout parameter instead of a hard-coded value of 1 minute.
Specify a time unit suffix "s" for seconds in the value of the
RELOAD_INTERVAL environment variable in router deployments.  Omitting the
time unit causes the following warning:

    router "msg"="invalid interval, using default"  "default"=5 "interval"="5" "name"="RELOAD_INTERVAL"

* pkg/operator/controller/ingress/deployment.go (desiredRouterDeployment):
Add "s" time unit suffix to the reload interval.
* pkg/operator/controller/ingress/deployment_test.go
(TestDesiredRouterDeployment):
* test/e2e/operator_test.go (TestReloadIntervalUnsupportedConfigOverride):
Expect the time unit suffix.
@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 22, 2021
@frobware
Copy link
Contributor

error getting cluster version: Get "https://api.ci-op-3v7xm0xs-2945d.origin-ci-int-aws.dev.rhcloud.com:6443/apis/config.openshift.io/v1/clusterversions/version": dial tcp 52.200.26.31:6443: i/o timeout
ClusterID:
ClusterVersion: Installing "" for :
error getting cluster operators: Get "https://api.ci-op-3v7xm0xs-2945d.origin-ci-int-aws.dev.rhcloud.com:6443/apis/config.openshift.io/v1/clusteroperators": dial tcp 52.200.26.31:6443: i/o timeout
ClusterOperators:
clusteroperators are missing

/retest

@frobware
Copy link
Contributor

Failure:

  • Operator degraded (APIServerDeployment_UnavailablePod::OAuthServerDeployment_UnavailablePod): APIServerDeploymentDegraded: 1 of 3 requested instances are unavailable for apiserver.openshift-oauth-apiserver (container is not ready in apiserver-86489765cf-jvf27 pod) OAuthServerDeploymentDegraded: 1 of 3 requested instances are unavailable for oauth-openshift.openshift-authentication (container is not ready in oauth-openshift-9b57696b7-4nnkj pod)

/retest

@frobware
Copy link
Contributor

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jul 23, 2021
@frobware
Copy link
Contributor

Looks like cluster is not stood up:

  • Operator unavailable (null): Cluster not available for 4.9.0-0.ci.test-2021-07-23-101951-ci-op-dqg56vm1-latest

/retest

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

1 similar comment
@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

Add an unsupported config override for setting HAProxy's global maxconn
setting.  If the config override is unspecified, the current default of
20000 is used.  If the config override has the value "-1", the maxconn
setting is omitted so that HAProxy automatically detects a reasonable value
based on the value of ulimit -n.

* pkg/operator/controller/ingress/deployment.go
(RouterMaxConnectionsEnvName): New const.
(desiredRouterDeployment): Add unsupported config override for
ROUTER_MAX_CONNECTIONS.
* pkg/operator/controller/ingress/deployment_test.go
(TestDesiredRouterDeployment): Verify that desiredRouterDeployment sets
ROUTER_MAX_CONNECTIONS as expected.
* test/e2e/operator_test.go (TestMaxConnectionsUnsupportedConfigOverride):
Verify that the operator sets ROUTER_MAX_CONNECTIONS to the appropriate
value based on the value specified in the unsupported configuration
override if an override is specified.
When updating an ingresscontroller, do a get immediately before doing the
update.

This is a cleanup and should have no significant functional effect.

* test/e2e/operator_test.go (TestProxyProtocolAPI)
(TestLoadBalancingAlgorithmUnsupportedConfigOverride)
(TestDynamicConfigManagerUnsupportedConfigOverride)
(TestReloadIntervalUnsupportedConfigOverride): Get ingresscontroller
immediately before updating it.
@Miciah
Copy link
Contributor Author

Miciah commented Jul 23, 2021

Working on a fix for this:

=== RUN   TestMaxConnectionsUnsupportedConfigOverride
    operator_test.go:2191: failed to update ingresscontroller: Operation cannot be fulfilled on ingresscontrollers.operator.openshift.io "max-connections": the object has been modified; please apply your changes to the latest version and try again
    panic.go:613: deleted ingresscontroller max-connections
--- FAIL: TestMaxConnectionsUnsupportedConfigOverride (37.07s)

Ignore not-found errors when deleting client pods.  These pods run and then
exit, and it is possible that they get garbage collected before the test
code deletes them, in which case it is expected that they be not found.

* test/e2e/canary_test.go (TestCanaryRoute):
* test/e2e/client_tls_test.go (TestClientTLS):
* test/e2e/http_header_buffer_test.go (TestHTTPHeaderBufferSize):
* test/e2e/http_header_name_case_adjustment_test.go
(TestHeaderNameCaseAdjustment):
* test/e2e/operator_test.go (TestHTTPHeaderCapture, TestHTTPCookieCapture)
(TestUniqueIdHeader): Ignore not-found errors when deleting client pods.
@Miciah Miciah force-pushed the add-unsupported-config-override-for-maxconn branch from e0519a0 to 8041f49 Compare July 23, 2021 14:54
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Jul 23, 2021
@frobware
Copy link
Contributor

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jul 23, 2021
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jul 23, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: frobware, Miciah

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-robot openshift-merge-robot merged commit e2cdf40 into openshift:master Jul 23, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants