New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support system variable wait_timeout. (#8245) #8346

Merged
merged 3 commits into from Nov 28, 2018

Conversation

Projects
None yet
5 participants
@hhxcc
Contributor

hhxcc commented Nov 17, 2018

What problem does this PR solve?

close client connection when the server waited for a long time , support system variable wait_timeout to define the wait time. #8245

What is changed and how it works?

use TCPConn.SetReadDeadline() before readPacket()

Check List

Tests

  • Integeration test

Code changes

  • Has exported variable/fields change

Related changes

  • Need to be included in the release note

This change is Reviewable

@sre-bot

This comment has been minimized.

sre-bot commented Nov 17, 2018

Hi contributor, thanks for your PR.

This patch needs to be approved by someone of admins. They should reply with "/ok-to-test" to accept this PR for running test automatically.

@shenli shenli added the contribution label Nov 17, 2018

@shenli

This comment has been minimized.

Member

shenli commented Nov 17, 2018

@hhxcc Thanks! Please add a test case.

@shenli

This comment has been minimized.

Member

shenli commented Nov 17, 2018

/ok-to-test

@shenli

This comment has been minimized.

Member

shenli commented Nov 17, 2018

/run-all-tests

Show resolved Hide resolved server/conn.go Outdated
@hhxcc

This comment has been minimized.

Contributor

hhxcc commented Nov 18, 2018

@shenli I can only see a litter test case in conn_test.go, could you give some advise to write test case ?

@hhxcc hhxcc changed the title from WIP: support system variable wait_timeout. (#8245) to [WIP] support system variable wait_timeout. (#8245) Nov 18, 2018

@hhxcc hhxcc force-pushed the hhxcc:wait_timeout branch from b5012a6 to c61a92a Nov 19, 2018

@shenli

This comment has been minimized.

Member

shenli commented Nov 19, 2018

@lysu Please help @hhxcc to add a test case.

@hhxcc hhxcc force-pushed the hhxcc:wait_timeout branch 3 times, most recently from 978f6f7 to 557114f Nov 20, 2018

@hhxcc

This comment has been minimized.

Contributor

hhxcc commented Nov 20, 2018

@lysu PTAL . I have do some refactor, add one test case , bu still do not know how to test the function readPacketWithTimeout.

@lysu

This comment has been minimized.

Member

lysu commented Nov 20, 2018

@hhxcc I will try to do it in tomorrow tks~

@lysu

This comment has been minimized.

Member

lysu commented Nov 21, 2018

/run-all-tests tidb-test=pr/654

@lysu

This comment has been minimized.

Member

lysu commented Nov 21, 2018

func TestConnectionWaitTimeout(t *testing.T) {
 	// this test does not use sql.database, it uses the driver directly
 	if !available {
 		t.Skipf("MySQL server not running on %s", netAddr)
 	}

 	idleDB, err := sql.Open("mysql", dsn)
 	if err != nil {
 		t.Fatalf("error connecting: %s", err.Error())
 	}
 	defer idleDB.Close()
 	_, err = idleDB.Exec("create database if not exists waittimeouttest")
 	assertNoError(t, err)
 	_, err = idleDB.Exec("use waittimeouttest")
 	assertNoError(t, err)

 	checkDB, err := sql.Open("mysql", dsn)
 	if err != nil {
 		t.Fatalf("error connecting: %s", err.Error())
 	}
 	defer checkDB.Close()
 	r := checkDB.QueryRow("select count(1) from information_schema.processlist where db = 'waittimeouttest'")
 	var pc int
 	err = r.Scan(&pc)
 	assertNoError(t, err)
 	assertEqual(t, pc, 1)

 	_, err = idleDB.Exec("set @@wait_timeout=1")
 	assertNoError(t, err)

 	time.Sleep(2*time.Second)

 	r = checkDB.QueryRow("select count(1) from information_schema.processlist where db = 'waittimeouttest'")
 	err = r.Scan(&pc)
 	assertNoError(t, err)
 	assertEqual(t, pc, 0)
 }

@hhxcc I just add a simple testcase in integeration repo which will setup a tidbserver and test them via network.

@hhxcc

This comment has been minimized.

Contributor

hhxcc commented Nov 21, 2018

@lysu thanks a lot.

if !strings.Contains(errStack, "use of closed network connection") {
if strings.Contains(errStack, "i/o timeout") {
log.Infof("con:%d read packet timeout, close this connection.", cc.connectionID)
} else if !strings.Contains(errStack, "use of closed network connection") {

This comment has been minimized.

@lysu

lysu Nov 21, 2018

Member

It's better use this way to detect timeout error

		start := time.Now()
		data, err := cc.readPacketWithTimeout()
		idleTime := time.Now().Sub(start)
		if err != nil {
			if terror.ErrorNotEqual(err, io.EOF) {
				if netErr, isNetErr :=  errors.Cause(err).(net.Error); isNetErr && netErr.Timeout() {
					log.Infof("con:%d read packet timeout, close this connection, idle: %v, timeout: %v", cc.connectionID, idleTime, cc.getSessionVarsWaitTimeout())
				} else {
					errStack := errors.ErrorStack(err)
					if !strings.Contains(errStack, "use of closed network connection") {
						log.Errorf("con:%d read packet error, close this connection %s",
							cc.connectionID, errStack)
					}
				}
			}
			return
		}

detect close maybe be improve in furture too - - golang/go#4373

@lysu

This comment has been minimized.

Member

lysu commented Nov 21, 2018

@hhxcc here, maybe we also take care about a not easy question:

now, deadline is set on readPacket, but one readPacket maybe trigger several readOnePacket then go though the buffer and trigger or not more read() call on wire and deadline works on final read.

so for a corner case: someone make a big data input to tidb-server, then mysql protocol will transfer them in multiple packet, first packet arrive time is t1, last packet arrive time is t2, and t2 - t1 > wait_time, current PR will give a error.

but I think wait_timeout should be "the timeout that no data on the wire", we should close connection only when it's idle, no matter how, read deadline should be set for per unbuffed read(), just like https://github.com/go-sql-driver/mysql/blob/6be42e0ff99645d7d9626d779001a46e39c5f280/buffer.go#L63 does and it seems need change buffer- -?

at last, we should take care situation that one conn read with deadline and later read without deadline, once deadline be set, later read need be reset new deadline or 0(no timeout) or it will got unexpect timeout error.

@hhxcc

This comment has been minimized.

Contributor

hhxcc commented Nov 21, 2018

@lysu thank you for the explanation with so much details, I will add SetReadDeadline before each readOnePacket .
If there are any other problem , please let me know.
thanks again.

@lysu

This comment has been minimized.

Member

lysu commented Nov 22, 2018

@hhxcc I think set on each readOnePacket fix mainly question, although set on ech read is more better, I'm not confirm whelther bufferedConn will introduce other question in some corner case~?

@tiancaiamao @jackysp how do you think about this?

@hhxcc hhxcc force-pushed the hhxcc:wait_timeout branch 2 times, most recently from 0bb8283 to a929e18 Nov 22, 2018

@hhxcc

This comment has been minimized.

Contributor

hhxcc commented Nov 22, 2018

@lysu PTAL SetReadDeadline has been added before each io.ReadFull , and change the way to detect timeout error.

@hhxcc hhxcc changed the title from [WIP] support system variable wait_timeout. (#8245) to support system variable wait_timeout. (#8245) Nov 26, 2018

@lysu

This comment has been minimized.

Member

lysu commented Nov 26, 2018

/run-all-tests tidb-test=pr/654

@lysu

This comment has been minimized.

Member

lysu commented Nov 26, 2018

LGTM

&

@jackysp PTAL thx

@@ -535,7 +535,7 @@ var defaultSysVars = []*SysVar{
{ScopeGlobal, "innodb_buffer_pool_size", "134217728"},
{ScopeGlobal, "innodb_adaptive_flushing", "ON"},
{ScopeNone, "datadir", "/usr/local/mysql/data/"},
{ScopeGlobal | ScopeSession, "wait_timeout", "28800"},
{ScopeGlobal | ScopeSession, WaitTimeout, "28800"},

This comment has been minimized.

@jackysp

jackysp Nov 26, 2018

Member

It's better to define a const for 28800.

This comment has been minimized.

@hhxcc

hhxcc Nov 26, 2018

Contributor

fixed

@lysu lysu added the status/LGT1 label Nov 26, 2018

@lysu

This comment has been minimized.

Member

lysu commented Nov 26, 2018

/run-sqllogic-test

@hhxcc hhxcc force-pushed the hhxcc:wait_timeout branch from c2595f2 to 6679da1 Nov 26, 2018

// WaitTimeout is the name for 'wait_timeout' system variable.
WaitTimeout = "wait_timeout"
// WaitTimeoutDefaultValue is the default value of 'wait_timeout' system variable.
WaitTimeoutDefaultValue = "28800"

This comment has been minimized.

@jackysp

jackysp Nov 27, 2018

Member

Thanks for your quick fix, @hhxcc ! But we usually define the default values of system variable at

This comment has been minimized.

@hhxcc

hhxcc Nov 28, 2018

Contributor

ok , it has been moved to tidb_vars.go

hhxcc added some commits Nov 27, 2018

@jackysp

LGTM

@jackysp jackysp merged commit bb7bb14 into pingcap:master Nov 28, 2018

4 of 5 checks passed

code-review/reviewable 7 files, 4 discussions left (jackysp, lysu)
Details
ci/circleci Your tests passed on CircleCI!
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
jenkins-ci-tidb/build Jenkins job succeeded.
Details
license/cla Contributor License Agreement is signed.
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment