driver: Detect EOF in driverError #186
Conversation
Excellent, thanks!
@MathieuBordere once this is merged, could you please create a new release version and tag for this so we can update LXD's dependencies ready for LXD 5.1?
With this change you would be able to re-merge the dqlite change that was reverted, right?
Yes indeed.
But maybe I should test for EOF at the end of the other error handling? I'm just a bit confused by line 775 in 5724311.
I think that it is fine as it is, as it will find EOF anywhere in the error chain (it calls Unwrap internally).
@MathieuBordere can you fix up the test golint issues please (looks like just comment annotation issues)? https://github.com/canonical/go-dqlite/pull/186/files
Looks good to me.
It somehow reinforces the feeling that in the future we should also work on proper error propagation vs closing connections, but I'm happy that it solves the problem for now.
Done, the
You have a good point. Yes, I do think it's a good idea to place this at the end of the chain, in the default statement, and yes you should be fine with the
So hopefully something like:

```go
default:
	// When using a TLS connection, the underlying error might get
	// wrapped by the stdlib itself with the new errors wrapping
	// conventions available since go 1.13. In that case we check
	// the underlying error with Unwrap() instead of Cause().
	if root, ok := err.(unwrappable); ok {
		err = root.Unwrap()
	}
	switch err.(type) {
	case *net.OpError:
		log(client.LogDebug, "network connection lost: %v", err)
		return driver.ErrBadConn
	}
	// Note: io.EOF is a sentinel error value, not a type, so it cannot
	// be matched in a type switch; use errors.Is (or ==) instead.
	if errors.Is(err, io.EOF) {
		log(client.LogDebug, "network connection lost: %v", err)
		return driver.ErrBadConn
	}
```

should work.
One thing that concerns me is that I believe the proxy (both the LXD one and the
I suppose if the LXD tests pass OK then it should be fine.
suite is running now, will report back.
There are also tests in this repo that check that the result set iterators return io.EOF at the end of the rows, so if they pass that will be encouraging. But moving the check to the
I will mark this as draft for now and investigate further before quickly patching up the issue uncovered by that dqlite PR; I don't want to add to the mess. I will handle the dqlite leadership loss in a different way than closing the connection.
Which way are you thinking?
It will close the SQLite leader connection to the database, eliminating the (see line 752 in 5724311)
Looking again at the description of #354, I feel there might be a bit of confusion about what Essentially,

In that case the client can't make any assumption about whether the commit was (or will be) actually successful, because there's no way to know if a quorum of nodes had received the entry.

This is basically the situation described in section 6.3 (Implementing linearizable semantics) of the Raft dissertation, and the solution indicated there would be to implement client sessions and request IDs in the FSM. With that in place, the client would retry the commit request as soon as it finds a new leader; using sessions and request IDs, the new leader would know whether it is a brand new commit request or actually a duplicated one (since the original one got committed despite the client having received a

Now, after having described what
^^^ at this point what I believe should happen is:
And then, proceeding in the problematic sequence of events:
^^^ at this point the assertion shouldn't be triggered since the server had rolled back that pending transaction. What I described should result in relatively small changes in the
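The section 6.3 scheme mentioned above could be sketched roughly as follows (purely illustrative; the `fsm` and `applyOnce` names are hypothetical and nothing like this exists in dqlite today): the state machine remembers the last request ID applied per client session, so a commit retried against a new leader is detected as a duplicate rather than applied twice.

```go
package main

import "fmt"

// fsm is a toy replicated state machine that deduplicates client
// requests using (session ID, request ID) pairs, as suggested in
// section 6.3 of the Raft dissertation.
type fsm struct {
	counter int               // the actual state: a simple counter
	applied map[string]uint64 // highest request ID applied per session
}

// applyOnce applies the command unless this (session, reqID) pair was
// already applied, in which case the retry is a no-op.
func (f *fsm) applyOnce(session string, reqID uint64) bool {
	if reqID <= f.applied[session] {
		return false // duplicate: the original request already went through
	}
	f.applied[session] = reqID
	f.counter++
	return true
}

func main() {
	f := &fsm{applied: map[string]uint64{}}
	fmt.Println(f.applyOnce("client-1", 1)) // true: first attempt
	fmt.Println(f.applyOnce("client-1", 1)) // false: retry after lost reply
	fmt.Println(f.counter)                  // 1: applied exactly once
}
```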
I understand your reasoning but
while I thought that closing the sqlite connection would be a foolproof way to get rid of all the state introduced by the start of the transaction.
Yeah, I was going to follow up since I just realized that. Indeed In this case, instead of using
Closing the connection altogether probably works too, but we'd end up in a bit of a mixed state in the gateway, which feels potentially fragile, or will always need some care (i.e. we have to know that there are cases where the leader object or SQLite object is not there because it was closed elsewhere). Having a single path for releasing resources usually simplifies things.
Yeah, I thought about this too, but there is a case where this can lead to inconsistent data, I think.
Okay, thanks for pointing this out. I actually had also another idea, which I felt was actually better but a bit more subtle. Your point makes me feel that perhaps it's actually the best approach.

Basically we would not do anything in the Instead we should turn the

The only two things that I'm not totally sure about are:
Point 1. is kind of minor, but something that would be nice to have.
One possible way to solve point 1. would maybe be to use some sort of flag that gets set in the
I don't really feel too comfortable with that; it feels a bit brittle as I don't fully understand it. I'd rather have SQLite itself do the cleanup for us for whatever state it introduced. I think a case can be made that a connection to a leader node that has at some point lost leadership is "busted", shouldn't be reused, and should be cleaned up. I would propose to close the SQLite leader connection and add extra checks in functions that now assume that
Ok, I'll open a PR for that.
I'll give this another go; if the LXD test suite is happy, I'll probably merge it.
Thanks! :)
Possibly related, possibly to be implemented in conjunction with this change: https://pkg.go.dev/database/sql/driver
I've been looking at the mysql Go driver a bit. What they do is implement SessionResetter. This triggers a stale connection check (https://github.com/go-sql-driver/mysql/blob/081308f66228fdc51224614d1cf414c918cc1596/packets.go#L106) on the first write on a connection taken from the connection pool, in which they detect EOF with https://github.com/go-sql-driver/mysql/blob/081308f66228fdc51224614d1cf414c918cc1596/conncheck.go#L23. The connection is discarded when e.g. EOF is detected. We could do something similar in our case. Still investigating the effect of the original change proposed in this PR.
Specifically, I will check whether the change (and existing code) violates the rules for ErrBadConn laid out in the driver docs.
edit: e.g. I think we violate this; see go-dqlite/internal/protocol/protocol.go lines 66 to 72 in 407917c
Please would you recap what exactly the issue is? Like, what conditions lead to it? A reproducer or a unit test would be great too, although I guess that might be tricky. I've tried to read the comments, but I'm still a bit lost.
The original issue has a reproducer: #182
It's not as bad as it was since I added TCP user timeouts in LXD, but I still see these sorts of errors for a while when cluster members go down.
Basically go-dqlite keeps offering up a closed connection to be used.
I'll try to come up with a reproducer in the unit tests; I'll understand it better myself too then. The crux of the problem is that as long as a connection is not marked with The
Signed-off-by: Mathieu Borderé <mathieu.bordere@canonical.com>
Hi @tomponline, I'm having trouble running the LXD test suite on one of Stéphane's MAAS machines with this branch. It's probably me doing something wrong. Could you run the LXD test suite with the contents of this branch, please? It might be quicker.
Testing this now...
@MathieuBordere clean test run here using:
I'm okay to merge this if @cole-miller or @freeekanayaka don't have objections; we can always revert if something unexpected happens. This branch also ran fine with the Jepsen suite.
@@ -787,6 +787,12 @@

```go
type unwrappable interface {
	Unwrap() error
}

// TODO driver.ErrBadConn should not be returned when there's a possibility that
// the query has been executed. In our case there is a window in protocol.Call
```
It probably goes without saying, but just to be sure we are on the same page: this issue should be dealt with by implementing session support in the dqlite state machine, as per paragraph 6.3 of the Raft dissertation.
Until then, I'm fine converting io.EOF into driver.ErrBadConn if it turns out that it solves more problems than it creates (i.e. the chance of returning driver.ErrBadConn even though a query was executed is low, while the chance that converting io.EOF into driver.ErrBadConn solves actual real-world issues by retrying is high).
Still, I would have been happier if there was an end-to-end test for this change, showing that we at least precisely understand and can reproduce the exact sequence of events around the issue. With such a test in place it would be easier to avoid regressions later down the road, especially when we implement sessions as per 6.3.
I'll merge this tomorrow morning (CET) and make a new release then.
Fixes #182
Signed-off-by: Mathieu Borderé mathieu.bordere@canonical.com