Cloud Events delivery leaks open connections #3190

Closed
afrittoli opened this issue Sep 10, 2020 · 5 comments · Fixed by #3201

afrittoli commented Sep 10, 2020

Expected Behavior

The pipeline controller does not leak open connections.
Possible fixes (sketched below):

  • re-use the HTTP client across reconciles
  • do not keep idle connections open
  • a combination of the two
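
For the second and third options, a minimal sketch of the kind of transport settings involved, assuming the controller can own a single long-lived transport; names, values and the URL are illustrative, not the actual controller code:

    // Sketch only: one transport shared across reconciles, with a bounded idle
    // pool, instead of a fresh client (and connection pool) per reconcile.
    package main

    import (
        "net/http"
        "time"
    )

    // sharedTransport is created once and re-used; the idle settings cap how
    // many unused connections stay open, and for how long.
    var sharedTransport = &http.Transport{
        MaxIdleConns:        20,
        MaxIdleConnsPerHost: 5,
        IdleConnTimeout:     30 * time.Second,
    }

    // sharedClient re-uses sharedTransport, so repeated sends to the same sink
    // can re-use connections instead of opening new ones every time.
    var sharedClient = &http.Client{Transport: sharedTransport}

    func main() {
        resp, err := sharedClient.Get("https://example.com") // placeholder URL
        if err != nil {
            return
        }
        resp.Body.Close()
    }

Disabling keep-alives entirely (the route eventually taken further down) is the blunter version of the same idea: it trades connection re-use for a guarantee that nothing is left open.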

Actual Behavior

As described in tektoncd/triggers#687, the pipeline controller seems to create a new HTTP client on every reconcile. The transport associated with it keeps the connections open.
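
For context, every http.Transport in Go keeps its own connection pool, so the suspected pattern would look roughly like this; this is illustrative only, not the actual controller code, and the sink URL is a placeholder:

    // Illustrative anti-pattern: a new client and transport on every call means
    // a new connection pool each time, so connections are never re-used and,
    // with keep-alive enabled, each one lingers as ESTABLISHED after the send.
    package main

    import "net/http"

    func sendEvent(sink string) error {
        client := &http.Client{Transport: &http.Transport{}} // fresh pool per call
        resp, err := client.Get(sink)
        if err != nil {
            return err
        }
        return resp.Body.Close()
    }

    func main() {
        _ = sendEvent("http://el-tekton-events:8080") // placeholder sink URL
    }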

Steps to Reproduce the Problem

  1. This is reproducible in the upstream Tekton-based CI/CD (dogfooding) cluster
  2. Configure pipeline to send cloud events to an event listener sink
  3. Run many taskruns and pipelineruns

Additional Info

See tektoncd/triggers#687 for more details

  • Kubernetes version:
Server Version: version.Info{Major:"1", Minor:"16+", GitVersion:"v1.16.13-gke.1", GitCommit:"688c6543aa4b285355723f100302d80431e411cc", GitTreeState:"clean", BuildDate:"2020-07-21T02:37:26Z", GoVersion:"go1.13.9b4", Compiler:"gc", Platform:"linux/amd64"}
  • Tekton Pipeline version:
Pipeline version: v0.15.2
Triggers version: v0.7.0
@afrittoli afrittoli added the kind/bug Categorizes issue or PR as related to a bug. label Sep 10, 2020
@afrittoli afrittoli self-assigned this Sep 10, 2020
@afrittoli

The HTTP client is only created once, so re-using it across reconciles is not the fix.
I tried setting the target when the client is created, but that did not help. The number of established connections just keeps growing:

$ k exec -it pod/tekton-pipelines-controller-86c98684cc-5g2x5 -n tekton-pipelines -- sh
/home/nonroot $ while true; do CONNS=$(netstat -a | grep el-tekton-events | grep -c EST); printf "${CONNS} "; sleep 2; done
23 23 23 23 23 23 23 23 23 23 23 23 23 23 23 23 23 23 23 23 23 23 23 23 23 23 23 23 36 44 53 59 70 81 97 108 119 132 144 158 173 188 202 213 220 229 243 262 271 280 290 296 304 321 334 346 360 375 390 404 419 432 450 469 486 496 516 530 541 549 556 564 584 599 613 629 639 649 660 672 678 691 705 720 729 734 742 745 745 745 745 745 745 745 745 745 745 745 745 745 745 745 745 752 773 787 805 816 827 838 847 855 867 878 887 895 898 898 898 898 898 898 898 898 898 898 898 898 898 898 898 898 898 918 935 950 960 974 984 994 1003 1013 1022 1032 1036 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039 1039
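
For reference, "setting the target when the client is created" means something along these lines, assuming the cloudevents sdk-go v2 API; the sink URL and event type are placeholders, and this is a sketch rather than the actual controller code. It made no difference to the connection count:

    // Sketch only: fix the target once, when the protocol is created,
    // instead of passing it per send via cloudevents.ContextWithTarget.
    package main

    import (
        "context"
        "log"

        cloudevents "github.com/cloudevents/sdk-go/v2"
    )

    func main() {
        p, err := cloudevents.NewHTTP(cloudevents.WithTarget("http://el-tekton-events:8080")) // placeholder sink
        if err != nil {
            log.Fatal(err)
        }
        c, err := cloudevents.NewClient(p, cloudevents.WithTimeNow(), cloudevents.WithUUIDs())
        if err != nil {
            log.Fatal(err)
        }

        event := cloudevents.NewEvent()
        event.SetType("dev.tekton.event.taskrun.successful.v1") // illustrative event type
        event.SetSource("example/source")                       // illustrative source
        if result := c.Send(context.Background(), event); cloudevents.IsUndelivered(result) {
            log.Printf("failed to send: %v", result)
        }
    }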

@afrittoli

@n3wscott FYI

@n3wscott

Any way to write a test in the SDK to detect this? A test would help fix it.

@afrittoli

This small change works, although it also means connections are not re-used:

-       p, err := cloudevents.NewHTTP()
+       var useOnceTransport http.RoundTripper = &http.Transport{
+               DisableKeepAlives: true,
+       }
+
+       p, err := cloudevents.NewHTTP(cloudevents.WithRoundTripper(useOnceTransport))
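
Put together, the change above amounts to something like this; a sketch of the approach, assuming the cloudevents sdk-go v2 API, not the exact controller code:

    // Sketch only: a keep-alive-disabled RoundTripper handed to the
    // cloudevents HTTP protocol. Connections are closed as soon as a response
    // is received, so no idle connections accumulate; the trade-off is that
    // nothing is re-used.
    package main

    import (
        "log"
        "net/http"

        cloudevents "github.com/cloudevents/sdk-go/v2"
    )

    func main() {
        var useOnceTransport http.RoundTripper = &http.Transport{
            DisableKeepAlives: true,
        }

        p, err := cloudevents.NewHTTP(cloudevents.WithRoundTripper(useOnceTransport))
        if err != nil {
            log.Fatal(err)
        }
        c, err := cloudevents.NewClient(p, cloudevents.WithTimeNow(), cloudevents.WithUUIDs())
        if err != nil {
            log.Fatal(err)
        }
        _ = c // the client would then be used to send events as usual
    }

The cost is a new TCP (and TLS) handshake per delivery, which is why re-using connections with an idle timeout is still worth looking into afterwards.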

@afrittoli

I've noticed that the max idle connections setting in the transport seems to be ignored too, which makes me think that those connections are not considered idle for some reason.
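
One possible explanation worth checking (an assumption on my part, not something verified in the controller): Go only returns a connection to the transport's idle pool, where MaxIdleConns, MaxIdleConnsPerHost and IdleConnTimeout apply, after the response body has been read to EOF and closed; a body that is never read or closed keeps its connection open, so it never counts as idle. A minimal illustration, with a placeholder URL:

    // Illustration only: draining and closing the body is what lets the
    // transport return the connection to its idle pool (where the idle
    // limits apply); skipping this keeps the connection open.
    package main

    import (
        "io"
        "net/http"
    )

    func drainAndClose(resp *http.Response) {
        io.Copy(io.Discard, resp.Body) // read to EOF so the connection can be re-used
        resp.Body.Close()
    }

    func main() {
        resp, err := http.Get("https://example.com") // placeholder URL
        if err != nil {
            return
        }
        drainAndClose(resp)
    }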

afrittoli added a commit to afrittoli/pipeline that referenced this issue Sep 10, 2020
Disabling keep-alive forces the HTTP client to drop the connection
once a response is received. This avoids building up large numbers
of idle connections and fixes the immediate issue.

After this we may want to look at how to re-use connections
and also set an idle-connection timeout.

Fixes tektoncd#3190
tekton-robot pushed a commit that referenced this issue Sep 11, 2020
bobcatfish pushed a commit to bobcatfish/pipeline that referenced this issue Sep 11, 2020
(cherry picked from commit 7a9a85b)
tekton-robot pushed a commit that referenced this issue Sep 11, 2020