bugfix: event api connection may be recycled if keeps idle too long #2274

Merged

HusterWan
Contributor

Signed-off-by: Michael Wan zirenwan@gmail.com

If pouchd starts its HTTP server with the parameters below, the server will recycle a long-lived connection once it has been idle for too long.

			s := &http.Server{
				Handler:           router,
				ErrorLog:          log.New(stdFilterLogWriter, "", 0),
				ReadTimeout:       time.Minute * 10,
				ReadHeaderTimeout: time.Minute * 10,
				WriteTimeout:      time.Minute * 10,
				IdleTimeout:       time.Minute * 10,
			}

So I added a ticker in the events API to keep the connection alive.

Ⅰ. Describe what this PR did

Ⅱ. Does this pull request fix one issue?

none

Ⅲ. Why don't you add test cases (unit test/integration test)? (Do you really think no tests are needed?)

none

Ⅳ. Describe how to verify it

none

Ⅴ. Special notes for reviews

none

@pouchrobot pouchrobot added kind/bug This is bug report for project size/S labels Sep 25, 2018
@codecov
codecov bot commented Sep 25, 2018

Codecov Report

Merging #2274 into master will increase coverage by 0.12%.
The diff coverage is 54.54%.

@@            Coverage Diff             @@
##           master    #2274      +/-   ##
==========================================
+ Coverage    66.7%   66.82%   +0.12%     
==========================================
  Files         208      208              
  Lines       16934    16937       +3     
==========================================
+ Hits        11295    11318      +23     
+ Misses       4269     4262       -7     
+ Partials     1370     1357      -13
Flag                Coverage Δ
#criv1alpha1test    32.56% <54.54%> (+0.13%) ⬆️
#criv1alpha2test    36.05% <54.54%> (+0.07%) ⬆️
#integrationtest    39.45% <54.54%> (-0.07%) ⬇️
#nodee2etest        33.47% <54.54%> (+0.17%) ⬆️
#unittest           23.75% <0%> (-0.01%) ⬇️

Impacted Files                           Coverage Δ
apis/server/system_bridge.go             50% <0%> (-3.58%) ⬇️
apis/server/router.go                    85.62% <100%> (ø) ⬆️
ctrd/client.go                           57.07% <83.33%> (+0.2%) ⬆️
cri/stream/httpstream/spdy/upgrade.go    54.28% <0%> (-5.72%) ⬇️
apis/server/utils.go                     61.9% <0%> (-4.77%) ⬇️
daemon/containerio/cri_log_file.go       84.31% <0%> (-3.93%) ⬇️
ctrd/container.go                        58.8% <0%> (-0.96%) ⬇️
daemon/mgr/container.go                  57.39% <0%> (ø) ⬆️
cri/v1alpha2/cri_utils.go                90.62% <0%> (+0.29%) ⬆️
cri/v1alpha1/cri.go                      61.88% <0%> (+0.33%) ⬆️
... and 8 more

// Notes(ziren): The pouchd HTTP server sets some HTTP timeout parameters (see pouch/apis/server/server.go)
// that may cause long-idle connections to be recycled. So we create a ticker here to keep the
// events API alive. The ticker is set to 8s because the IdleTimeout of the HTTP server is 10s.
tickChan := time.NewTicker(time.Second * 10).C
Contributor

I think we should use time.NewTicker and defer tick.Stop() here, because we don't want to leave it active in the timer heap.

Contributor Author

I'm afraid we may not need the defer. I checked the NewTicker example: it adds defer tick.Stop() when keeping the ticker returned by NewTicker, but Stop does not seem to be needed when using NewTicker().C directly.

Contributor Author

OK, after reading the test cases in the time package, I realize I was wrong. We should call defer tick.Stop() whenever we create a ticker.

@@ -11,6 +11,7 @@ import (
"github.com/alibaba/pouch/apis/types"
"github.com/alibaba/pouch/pkg/httputils"
"github.com/alibaba/pouch/pkg/utils"
"github.com/sirupsen/logrus"
Contributor

Move it into the next import group.

Contributor Author

good catch

@HusterWan HusterWan force-pushed the zr/fix-events-connect branch 3 times, most recently from b266303 to 9ebdce1 September 25, 2018 14:46
@pouchrobot pouchrobot added size/M and removed size/S labels Sep 25, 2018
@@ -384,15 +384,15 @@ func (c *Client) collectContainerdEvents() {
}

switch e.Topic {
case ContainersDeleteEventTopic:
cDelEvent, ok := out.(*eventstypes.ContainerDelete)
case TaskExitEventTopic:
Contributor Author

We need to know the exit code when a container exits, so we read the task exit code from containerd's task/exit event.
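
A rough sketch of dispatching on the event topic and reading the exit code from a task exit event. The types `containerDelete` and `taskExit`, the topic constants, and `describeEvent` are hypothetical stand-ins defined locally so the example is self-contained. They are not the containerd or pouch definitions.

```go
package main

import (
	"errors"
	"fmt"
)

// Topic strings are assumptions for this sketch.
const (
	containersDeleteTopic = "/containers/delete"
	taskExitTopic         = "/tasks/exit"
)

// containerDelete is a hypothetical stand-in for a container delete payload.
type containerDelete struct{ ID string }

// taskExit is a hypothetical stand-in for a task exit payload.
type taskExit struct {
	ContainerID string
	ExitStatus  uint32
}

// describeEvent switches on the topic, asserts the payload type that the
// topic implies, and returns a human-readable summary.
func describeEvent(topic string, payload interface{}) (string, error) {
	switch topic {
	case containersDeleteTopic:
		ev, ok := payload.(*containerDelete)
		if !ok {
			return "", errors.New("unexpected payload for container delete event")
		}
		return fmt.Sprintf("container %s deleted", ev.ID), nil
	case taskExitTopic:
		ev, ok := payload.(*taskExit)
		if !ok {
			return "", errors.New("unexpected payload for task exit event")
		}
		// The exit code travels with the task exit event itself.
		return fmt.Sprintf("container %s exited with code %d", ev.ContainerID, ev.ExitStatus), nil
	default:
		return "", fmt.Errorf("unhandled topic %q", topic)
	}
}

func main() {
	msg, err := describeEvent(taskExitTopic, &taskExit{ContainerID: "c1", ExitStatus: 137})
	fmt.Println(msg, err)
}
```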

…ed if it keeps idle too long

Signed-off-by: Michael Wan <zirenwan@gmail.com>
@HusterWan
Contributor Author

The events API connection breaks because of the WriteTimeout parameter of the HTTP server. When I remove WriteTimeout, the connection never dies.

Refer: golang/go#24461

Contributor

@fuweid fuweid left a comment

LGTM

@fuweid fuweid merged commit 8ac6e9e into AliyunContainerService:master Sep 26, 2018