packets: Check connection liveness before writing query #934

Merged (1 commit) on Mar 29, 2019

Commits on Mar 20, 2019

  1. packets: Check connection liveness before writing query

    This commit contains a potential fix to the issue reported in go-sql-driver#657.
    
    As a summary: when a MySQL server kills a connection on the server-side
    (either because it is actively pruning connections, or because the
    connection has hit the server-side timeout), the Go MySQL client does
    not immediately become aware of the connection being dead.
    
    Because of the way TCP works, the client cannot know that the connection
    has received a FIN or RST packet from the server (i.e. the server side
    has closed) until it actually reads from it. This causes an unfortunate
    bug wherein an idle MySQL connection is pulled from the connection pool,
    a query packet is written to it without error, and then the query fails
    with an "unexpected EOF" error when trying to read the response packet.
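
The failure mode described above can be reproduced without MySQL. The following is a minimal sketch, assuming a loopback TCP listener stands in for the server; `staleWriteDemo` is a hypothetical helper, not part of the driver. On typical platforms the write is expected to succeed even though the peer has already closed, and only the subsequent read reports a problem.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// staleWriteDemo (hypothetical helper) opens a loopback TCP connection,
// closes the "server" side, and then writes to and reads from the client side.
func staleWriteDemo() (writeErr, readErr, setupErr error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return nil, nil, err
	}
	defer ln.Close()

	client, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		return nil, nil, err
	}
	defer client.Close()

	server, err := ln.Accept()
	if err != nil {
		return nil, nil, err
	}

	// The "server" kills the connection, as MySQL would on an idle timeout.
	server.Close()
	time.Sleep(50 * time.Millisecond) // give the close time to reach the client

	// The kernel accepts the write; the client has not yet observed the close.
	_, writeErr = client.Write([]byte("SELECT 1"))

	// Only the read reveals that the connection is dead.
	client.SetReadDeadline(time.Now().Add(2 * time.Second))
	_, readErr = client.Read(make([]byte, 1))
	return writeErr, readErr, nil
}

func main() {
	writeErr, readErr, err := staleWriteDemo()
	if err != nil {
		panic(err)
	}
	fmt.Println("write error:", writeErr)
	fmt.Println("read error: ", readErr)
}
```

The exact read error may differ between platforms (an EOF or a connection reset), which is why the sketch only prints it.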
    
    Since the initial write to the socket does not fail with an error, it is
    generally not safe to return `driver.ErrBadConn` when the read fails,
    because in theory the write could have reached the server and the query
    could have been committed. Returning `ErrBadConn` could lead to duplicate
    inserts on the database and data corruption because of the way the Go
    SQL package performs retries.
    
    In order to significantly reduce the circumstances where this
    "unexpected EOF" error is returned for stale connections, this commit
    performs a liveness check before writing a new query.
    
    When do we check?
    -----------------
    
    This check is not performed for all writes. Go 1.10 introduced a new
    `database/sql/driver` interface called `driver.SessionResetter`. The
    `database/sql` package calls its `ResetSession` method on connections
    _when they are returned to the connection pool_. Since performing the
    liveness check during `ResetSession` is not particularly useful (the
    connection can spend a long time in the pool before it's checked out
    again, and become stale), we simply mark the connection with a `reset`
    flag instead.
    
    This `reset` flag is then checked from `mysqlConn.writePacket` to
    perform the liveness checks. This ensures that the liveness check will
    only be performed for the first query on a connection that has been
    checked out of the connection pool. These are pretty much the semantics
    we want: a fresh connection from the pool is more likely to be stale,
    and it has not performed any previous writes that could cause data
    corruption. If a connection is being consistently used by the client
    (e.g. through an open transaction), we do NOT perform liveness checks.
    If MySQL Server kills such an active connection, we want to bubble up
    the error to the user because any silent retrying can and will lead to
    data corruption.
    
    Since the `driver.SessionResetter` interface is only available in Go
    1.10+, the liveness checks will only be performed starting with that Go
    version.
    
    How do we check?
    ----------------
    
    To perform the actual liveness test on the connection, we use the
    `syscall.Conn` interface, which the standard library's `net.Conn`
    implementations have provided since Go 1.9. The `SyscallConn` method
    returns a `RawConn` that lets us read directly from the connection's
    file descriptor using syscalls, skipping the default read pipeline of
    the Go runtime.
    
    When reading directly from the file descriptor using `syscall.Read`, we
    pass in a 1-length buffer. Passing a 0-length buffer would always result
    in a 0-length read, and the 1-length buffer will never be filled because
    we do not expect MySQL to send anything on a fresh connection before we
    have written a request packet.
    
    All sockets created in the Go runtime are set to non-blocking
    (O_NONBLOCK). Consequently, we can detect a socket that has been closed
    on the server-side because the `read` syscall will return a 0-length read
    _and_ no error.
    
    We assume that any other errors returned from the `read` also mean the
    connection is in a bad state, except for `EAGAIN`/`EWOULDBLOCK`, which is
    the expected return for a healthy non-blocking socket in this circumstance.
    
    Because of the dependency on `syscall.Conn`, liveness checks can only be
    performed in Go 1.9+. This restriction, however, is already covered by
    the fact that we only mark connections as having been reset in Go 1.10+,
    as explained in the previous section.
    vmg committed Mar 20, 2019
    307ca13