Skip to content

Commit

Permalink
packets: Check connection liveness before writing query
Browse files Browse the repository at this point in the history
This commit contains a potential fix to the issue reported in #657.

As a summary: when a MySQL server kills a connection on the server-side
(either because it is actively pruning connections, or because the
connection has hit the server-side timeout), the Go MySQL client does
not immediately become aware of the connection being dead.

Because of the way TCP works, the client cannot know that the connection
has received a RST packet from the server (i.e. the server-side has
closed) until it actually reads from it. This causes an unfortunate bug
wherein a MySQL idle connection is pulled from the connection pool, a
query packet is written to it without error, and then the query fails
with an "unexpected EOF" error when trying to read the response packet.

Since the initial write to the socket does not fail with an error, it is
generally not safe to return `driver.ErrBadConn` when the read fails,
because in theory the write could have arrived to the server and could
have been committed. Returning `ErrBadConn` could lead to duplicate
inserts on the database and data corruption because of the way the Go
SQL package performs retries.

In order to significantly reduce the circumstances where this
"unexpected EOF" error is returned for stale connections, this commit
performs a liveness check before writing a new query.

When do we check?
-----------------

This check is not performed for all writes. Go 1.10 introduced a new
`sql/driver` interface called `driver.SessionResetter`, which calls the
`ResetSession` method on any connections _when they are returned to the
connection pool_. Since performing the liveness check during
`ResetSession` is not particularly useful (the connection can spend a
long time in the pool before it's checked out again, and become stale),
we simply mark the connection with a `reset` flag instead.

This `reset` flag is then checked from `mysqlConn.writePacket` to
perform the liveness checks. This ensures that the liveness check will
only be performed for the first query on a connection that has been
checked out of the connection pool. These are pretty much the semantics
we want: a fresh connection from the pool is more likely to be stale,
and it has not performed any previous writes that could cause data
corruption. If a connection is being consistently used by the client
(i.e. through an open transaction), we do NOT perform liveness checks.
If MySQL Server kills such active connection, we want to bubble up the
error to the user because any silent retrying can and will lead to data
corruption.

Since the `ResetSession` interface is only available in Go 1.10+, the
liveness checks will only be performed starting with that Go version.

How do we check?
----------------

To perform the actual liveness test on the connection, we use the new
`syscall.Conn` interface which is available for all `net.Conn`s since Go
1.9. The `SyscallConn` method returns a `RawConn` that lets us read
directly from the connection's file descriptor using syscalls, and
skipping the default read pipeline of the Go runtime.

When reading directly from the file descriptor using `syscall.Read`, we
pass in a 1-length buffer, as passing a 0-length buffer will always
result in a 0-length read, and the 1-length buffer will never be filled
because we're not expecting any reads from MySQL before we have written
any request packets in a fresh connection.

All sockets created in the Go runtime are set to non-blocking
(O_NONBLOCK). Consequently, we can detect a socket that has been closed
on the server-side because the `read` syscall will return a 0-length read
_and_ no error.

We assume that any other errors returned from the `read` also mean the
connection is in a bad state, except for `EAGAIN`/`EWOULDBLOCK`, which is
the expected return for a healthy non-blocking socket in this circumstance.

Because of the dependency on `syscall.Conn`, liveness checks can only be
performed in Go 1.9+. This restriction however overlaps with the fact
that we only mark connections as having been reset in Go 1.10+, as
explained in the previous section.
  • Loading branch information
vmg committed Mar 12, 2019
1 parent 2c9d54f commit 9ec0a8f
Show file tree
Hide file tree
Showing 6 changed files with 110 additions and 0 deletions.
1 change: 1 addition & 0 deletions AUTHORS
Original file line number Diff line number Diff line change
Expand Up @@ -88,6 +88,7 @@ Zhenye Xie <xiezhenye at gmail.com>

Barracuda Networks, Inc.
Counting Ltd.
GitHub Inc.
Google Inc.
InfoSum Ltd.
Keybase Inc.
Expand Down
17 changes: 17 additions & 0 deletions conncheck.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
// Go MySQL Driver - A MySQL-Driver for Go's database/sql package
//
// Copyright 2012 The Go-MySQL-Driver Authors. All rights reserved.
//
// This Source Code Form is subject to the terms of the Mozilla Public
// License, v. 2.0. If a copy of the MPL was not distributed with this file,
// You can obtain one at http://mozilla.org/MPL/2.0/.

// +build !go1.9

package mysql

import "net"

func connCheck(c net.Conn) error {
return nil
}
53 changes: 53 additions & 0 deletions conncheck_go19.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,53 @@
// Go MySQL Driver - A MySQL-Driver for Go's database/sql package
//
// Copyright 2012 The Go-MySQL-Driver Authors. All rights reserved.
//
// This Source Code Form is subject to the terms of the Mozilla Public
// License, v. 2.0. If a copy of the MPL was not distributed with this file,
// You can obtain one at http://mozilla.org/MPL/2.0/.

// +build go1.9

package mysql

import (
"errors"
"io"
"net"
"syscall"
)

var errUnexpectedRead = errors.New("unexpected read from socket")

func connCheck(c net.Conn) error {
var (
n int
err error
buff [1]byte
)

sconn, ok := c.(syscall.Conn)
if !ok {
return nil
}
rc, err := sconn.SyscallConn()
if err != nil {
return err
}
rerr := rc.Read(func(fd uintptr) bool {
n, err = syscall.Read(int(fd), buff[:])
return true
})
switch {
case rerr != nil:
return rerr
case n == 0 && err == nil:
return io.EOF
case n > 0:
return errUnexpectedRead
case err == syscall.EAGAIN || err == syscall.EWOULDBLOCK:
return nil
default:
return err
}
}
3 changes: 3 additions & 0 deletions connection.go
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,7 @@ import (
type mysqlConn struct {
buf buffer
netConn net.Conn
rawConn net.Conn
affectedRows uint64
insertId uint64
cfg *Config
Expand All @@ -40,6 +41,7 @@ type mysqlConn struct {
finished chan<- struct{}
canceled atomicError // set non-nil if conn is canceled
closed atomicBool // set when conn is closed, before closech is closed
reset atomicBool // set when the Go SQL package calls ResetSession
}

// Handles parameters set in DSN after the connection is established
Expand Down Expand Up @@ -639,5 +641,6 @@ func (mc *mysqlConn) ResetSession(ctx context.Context) error {
if mc.closed.IsSet() {
return driver.ErrBadConn
}
mc.reset.Set(true)
return nil
}
22 changes: 22 additions & 0 deletions driver_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -2938,3 +2938,25 @@ func TestValuerWithValueReceiverGivenNilValue(t *testing.T) {
// This test will panic on the INSERT if ConvertValue() does not check for typed nil before calling Value()
})
}

func TestStaleConnectionChecks(t *testing.T) {
runTests(t, dsn, func (dbt *DBTest) {
dbt.mustExec("SET @@SESSION.wait_timeout = 2")

if err := dbt.db.Ping(); err != nil {
dbt.Fatal(err)
}

// wait for MySQL to close our connection
time.Sleep(3 * time.Second)

tx, err := dbt.db.Begin()
if err != nil {
dbt.Fatal(err)
}

if err := tx.Rollback(); err != nil {
dbt.Fatal(err)
}
})
}
14 changes: 14 additions & 0 deletions packets.go
Original file line number Diff line number Diff line change
Expand Up @@ -96,6 +96,19 @@ func (mc *mysqlConn) writePacket(data []byte) error {
return ErrPktTooLarge
}

if mc.reset.IsSet() {
mc.reset.Set(false)
conn := mc.netConn
if mc.rawConn != nil {
conn = mc.rawConn
}
if err := connCheck(conn); err != nil {
errLog.Print("closing bad idle connection: ", err)
mc.Close()
return driver.ErrBadConn
}
}

for {
var size int
if pktLen >= maxPacketSize {
Expand Down Expand Up @@ -332,6 +345,7 @@ func (mc *mysqlConn) writeHandshakeResponsePacket(authResp []byte, plugin string
if err := tlsConn.Handshake(); err != nil {
return err
}
mc.rawConn = mc.netConn
mc.netConn = tlsConn
mc.buf.nc = tlsConn
}
Expand Down

0 comments on commit 9ec0a8f

Please sign in to comment.