Primary not being set on failover #76

Closed
tjheeta opened this issue Aug 25, 2022 · 13 comments


tjheeta commented Aug 25, 2022

While testing failover against two nodes (sea and mia), I found some odd behaviour.

  1. Assuming mia was the primary, on reboot it would occasionally re-acquire the lease. It advertised that the lease was acquired, but no ltx frames were being shipped to sea.
2022-08-25T02:30:13Z app[bfd1ef5d] mia [info]primary lease acquired, advertising as http://fdaa:0:2fff:a7b:2cc3:1:932e:2:20202

On sea, the .primary file contains:

root@199f22d0:/# cat /mnt/litefs/db/.primary 
fdaa:0:2fff:a7b:2dbb:1:932d:2

And the IP address of mia is:

root@bfd1ef5d:/# cat /etc/hosts|grep 6pn
fdaa:0:2fff:a7b:2cc3:1:932e:2   bfd1ef5d.vm.litefs-liveview.internal bfd1ef5d fly-local-6pn

It looks like sea is not picking up the new primary information from Consul?

  2. Now if we reboot mia and it does not re-acquire the lease:
2022-08-25T02:46:41Z app[bfd1ef5d] mia [info]existing primary found (fdaa:0:2fff:a7b:2cc3:1:932e:2), connecting as replica
root@199f22d0:/# env|grep REGI
FLY_REGION=sea
root@199f22d0:/# cat /etc/hosts|grep 6pn
fdaa:0:2fff:a7b:2dbb:1:932d:2   199f22d0.vm.litefs-liveview.internal 199f22d0 fly-local-6pn
root@199f22d0:/# cat /mnt/litefs/db/.primary 
fdaa:0:2fff:a7b:2dbb:1:932d:2

We do not see any lease acquisition logs from sea. Both hosts believe they are replicas. Trying to insert into the database manually gives:

root@199f22d0:/# sqlite3 /mnt/litefs/db/testdb.sqlite 
SQLite version 3.34.1 2021-01-20 14:10:07
Enter ".help" for usage hints.
sqlite> insert into counts values(2, "ams", 1);
Error: unable to open database file
2022-08-25T02:59:01Z app[199f22d0] sea [info]fuse: create(): cannot create journal: read only replica

root@bfd1ef5d:/# sqlite3 /mnt/litefs/db/testdb.sqlite 
SQLite version 3.34.1 2021-01-20 14:10:07
Enter ".help" for usage hints.
sqlite> insert into counts values(5, "ams", 1);
Error: unable to open database file
2022-08-25T03:01:53Z app[bfd1ef5d] mia [info]fuse: create(): cannot create journal: read only replica

Also, I built a library for Elixir assuming the .primary file would be deleted after the lease was acquired on the primary. Essentially, it connects to all the nodes and checks whether .primary exists. That way, regardless of platform, we don't have to parse the contents of the file to figure out the location of the primary. Can you confirm that is the expected behaviour? And if it isn't, can it be changed so that only the replicas have the .primary file?
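
For reference, here's a minimal sketch in Go of the check the Elixir library performs (it just tests whether .primary exists), assuming the database is mounted at /mnt/litefs/db:

package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// isPrimary reports whether this node currently holds the primary lease,
// assuming LiteFS only writes the .primary file on replicas.
func isPrimary(mountDir string) (bool, error) {
	_, err := os.Stat(filepath.Join(mountDir, ".primary"))
	if errors.Is(err, os.ErrNotExist) {
		return true, nil // no .primary file: this node is the primary
	}
	if err != nil {
		return false, err
	}
	return false, nil // .primary exists and names the primary host
}

func main() {
	primary, err := isPrimary("/mnt/litefs/db")
	if err != nil {
		panic(err)
	}
	fmt.Println("primary:", primary)
}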

Thanks.


tjheeta commented Aug 25, 2022

More info: because the IP address is used as the hostname, the node does not identify itself as the owner of the previous lease, so it considers itself a replica and connects to itself. It appears that store.go:monitorLease or store.go:acquireLeaseOrPrimaryInfo needs another check.

mia = [fdaa:0:2fff:a7b:2cc3:1:932e:2]
sea = [fdaa:0:2fff:a7b:2dbb:1:932d:2]

  1. sea connects to mia (fdaa:0:2fff:a7b:2cc3:1:932e:2). Everything works fine.

2022-08-25T05:56:30.588 app[a4585d78] sea [info] existing primary found (fdaa:0:2fff:a7b:2cc3:1:932e:2), connecting as replica
2022-08-25T05:56:30.707 app[b994b193] mia [info] stream connected 
2022-08-25T05:58:00Z app[b994b193] mia [info]send frame<ltx>: db=00000001 tx=(0000000000000058,0000000000000058) chksum=(b17c0b41bad826fd,d4f75e5965b7ed1d) size=8320
2022-08-25T05:58:00Z app[a4585d78] sea [info]recv frame<ltx>: db=1 tx=(88,88)
2022-08-25T05:58:00Z app[b994b193] mia [info]send frame<ltx>: db=00000001 tx=(0000000000000059,0000000000000059) chksum=(d4f75e5965b7ed1d,a31c373dbe16f997) size=8320
2022-08-25T05:58:00Z app[a4585d78] sea [info]recv frame<ltx>: db=1 tx=(89,89)
2022-08-25T05:58:05Z app[b994b193] mia [info]removing ltx file, per retention: db=testdb.sqlite file=000000000000004f-000000000000004f.ltx
2022-08-25T05:58:30Z app[a4585d78] sea [info]removing ltx file, per retention: db=testdb.sqlite file=0000000000000001-000000000000004f.ltx
2022-08-25T05:58:42Z runner[b994b193] mia [info]Shutting down virtual machine
2022-08-25T05:58:43Z app[b994b193] mia [info]Sending signal SIGTERM to main child process w/ PID 523
2022-08-25T05:58:43Z app[b994b193] mia [info]05:58:43.043 [notice] SIGTERM received - shutting down

  2. Restarted mia, but it connects to its own former address (fdaa:0:2fff:a7b:2cc3:1:932e:2)!
2022-08-25T05:58:50Z app[b994b193] mia [info]Using Consul to determine primary
2022-08-25T05:58:53Z app[b994b193] mia [info]existing primary found (fdaa:0:2fff:a7b:2cc3:1:932e:2), connecting as replica
2022-08-25T05:58:53Z app[b994b193] mia [info]stream connected
  3. Stream replication breaks for sea, and sea acquires the primary lease.
2022-08-25T05:59:01Z app[a4585d78] sea [info]replica disconnected, retrying: next frame: read tcp [fdaa:0:2fff:a7b:2dbb:1:932d:2]:55132->[fdaa:0:2fff:a7b:2cc3:1:932e:2]:20202: read: connection reset by peer
2022-08-25T05:59:03Z app[a4585d78] sea [info]cannot acquire lease or find primary, retrying: no primary
2022-08-25T05:59:04Z app[a4585d78] sea [info]cannot acquire lease or find primary, retrying: no primary
2022-08-25T05:59:06Z app[a4585d78] sea [info]cannot acquire lease or find primary, retrying: no primary
2022-08-25T05:59:09Z app[a4585d78] sea [info]primary lease acquired, advertising as http://fdaa:0:2fff:a7b:2dbb:1:932d:2:20202
  4. mia does not acquire the new lease, as it is now connected to itself after the reboot.
litefs   531 root    7u  IPv6   4460      0t0  TCP [fdaa:0:2fff:a7b:2cc3:1:932e:2]:59844->[fdaa:0:2fff:a7b:2cc3:1:932e:2]:20202 (ESTABLISHED)
litefs   531 root    9u  IPv6   4461      0t0  TCP [fdaa:0:2fff:a7b:2cc3:1:932e:2]:20202->[fdaa:0:2fff:a7b:2cc3:1:932e:2]:59844 (ESTABLISHED)
  5. If we kill litefs, it does reconnect to the correct primary.
2022-08-25T06:32:32Z app[b994b193] mia [info]signal received, litefs shutting down
2022-08-25T06:32:32Z app[b994b193] mia [info]stream disconnected
2022-08-25T06:32:32Z app[b994b193] mia [info]replica disconected, retrying: connect to primary: Post "http://fdaa:0:2fff:a7b:2cc3:1:932e:2:20202/stream": EOF ('http://fdaa:0:2fff:a7b:2cc3:1:932e:2:20202')
2022-08-25T06:32:33Z app[b994b193] mia [info]existing primary found (fdaa:0:2fff:a7b:2dbb:1:932d:2), connecting as replica
2022-08-25T06:32:33Z app[a4585d78] sea [info]stream connected
2022-08-25T06:32:33Z app[a4585d78] sea [info]send frame<ltx>: db=00000001 tx=(0000000000000001,0000000000000068) chksum=(0,fa8ce48fab8434e0) (snapshot)
2022-08-25T06:32:33Z app[b994b193] mia [info]recv frame<ltx>: db=1 tx=(1,104)
2022-08-25T06:32:38Z app[b994b193] mia [info]replica disconected, retrying: process ltx stream frame: apply ltx: invalidate db: bad file descriptor
2022-08-25T06:32:38Z app[b994b193] mia [info]fuse serve error: bad file descriptor
2022-08-25T06:32:38Z app[a4585d78] sea [info]stream disconnected
2022-08-25T06:32:39Z app[b994b193] mia [info]exit status 1: fusermount: failed to unmount /mnt/litefs/db: Device or resource busy

2022-08-25T06:35:36Z app[b994b193] mia [info]http server listening on: http://localhost:20202
2022-08-25T06:35:36Z app[b994b193] mia [info]existing primary found (fdaa:0:2fff:a7b:2dbb:1:932d:2), connecting as replica
2022-08-25T06:35:36Z app[a4585d78] sea [info]stream connected
2022-08-25T06:36:09Z app[a4585d78] sea [info]send frame<ltx>: db=00000001 tx=(0000000000000069,0000000000000069) chksum=(fa8ce48fab8434e0,88c78deb7025206a) size=8320
2022-08-25T06:36:09Z app[b994b193] mia [info]recv frame<ltx>: db=1 tx=(105,105)
2022-08-25T06:36:13Z app[a4585d78] sea [info]send frame<ltx>: db=00000001 tx=(000000000000006a,000000000000006a) chksum=(88c78deb7025206a,a8da36461cc61df5) size=8320
2022-08-25T06:36:13Z app[b994b193] mia [info]recv frame<ltx>: db=1 tx=(106,106)


tjheeta commented Aug 25, 2022

Another amusing one: after a fly deploy, both sea and mia are connected to themselves. I'm not even sure how this is possible, but it's one hell of a race condition.

root@454b2362:/# env|grep REG
FLY_REGION=mia

root@454b2362:/# cat /etc/hosts|grep 6pn
fdaa:0:2fff:a7b:2cc3:1:932e:2   454b2362.vm.litefs-liveview.internal 454b2362 fly-local-6pn
root@454b2362:/# cat /mnt/litefs/db/.primary 
fdaa:0:2fff:a7b:2cc3:1:932e:2

root@454b2362:/# lsof -ni|grep lite
litefs   534 root    3u  IPv6   4437      0t0  TCP *:20202 (LISTEN)
litefs   534 root    7u  IPv6   4466      0t0  TCP [fdaa:0:2fff:a7b:2cc3:1:932e:2]:50734->[fdaa:0:2fff:a7b:2cc3:1:932e:2]:20202 (ESTABLISHED)
litefs   534 root    9u  IPv6   4467      0t0  TCP [fdaa:0:2fff:a7b:2cc3:1:932e:2]:20202->[fdaa:0:2fff:a7b:2cc3:1:932e:2]:50734 (ESTABLISHED)

root@a0ed11e6:/# env |grep REG
FLY_REGION=sea

root@a0ed11e6:/# cat /etc/hosts|grep 6pn
fdaa:0:2fff:a7b:2dbb:1:932d:2   a0ed11e6.vm.litefs-liveview.internal a0ed11e6 fly-local-6pn
root@a0ed11e6:/# cat /mnt/litefs/db/.primary 
fdaa:0:2fff:a7b:2dbb:1:932d:2

root@a0ed11e6:/# lsof -ni|grep lite
litefs   533 root    3u  IPv6   4438      0t0  TCP *:20202 (LISTEN)
litefs   533 root    7u  IPv6   4467      0t0  TCP [fdaa:0:2fff:a7b:2dbb:1:932d:2]:38156->[fdaa:0:2fff:a7b:2dbb:1:932d:2]:20202 (ESTABLISHED)
litefs   533 root    9u  IPv6   4468      0t0  TCP [fdaa:0:2fff:a7b:2dbb:1:932d:2]:20202->[fdaa:0:2fff:a7b:2dbb:1:932d:2]:38156 (ESTABLISHED)


tjheeta commented Aug 25, 2022

The following patch seems to fix this particular issue. This is my first time writing Go, so I hope it's adequate.

$ git diff 
diff --git a/store.go b/store.go
index 2629a98..8947974 100644
--- a/store.go
+++ b/store.go
@@ -7,8 +7,10 @@ import (
        "fmt"
        "io"
        "log"
+       "net"
        "os"
        "path/filepath"
+       "strings"
        "sync"
        "time"
 
@@ -382,6 +384,29 @@ func (s *Store) monitorLease(ctx context.Context) error {
                        continue
                }
 
+               // If another primary already exists, double-check that we are not the primary
+               // This solves various race conditions on reboot / crash / etc.
+               is_primary := false
+               primary_addr, err := net.LookupHost(info.Hostname)
+               interface_addr, err := net.InterfaceAddrs()
+               for _, x := range primary_addr {
+                       for _, y := range interface_addr {
+                               z := y.String()
+                               z = strings.Split(z, "/")[0]
+                               log.Printf("LEASE comparing %s, %s", x, z)
+                               if x == z {
+                                       is_primary = true
+                               }
+                       }
+               }
+               if is_primary {
+                       log.Printf("primary lease acquired again, advertising as %s", s.Leaser.AdvertiseURL())
+                       if err := s.monitorLeaseAsPrimary(ctx, lease); err != nil {
+                               log.Printf("primary lease lost, retrying: %s", err)
+                       }
+                       continue
+               }
+
                // Monitor as replica if another primary already exists.
                log.Printf("existing primary found (%s), connecting as replica", info.Hostname)
                if err := s.monitorLeaseAsReplica(ctx, info); err != nil {


tjheeta commented Aug 25, 2022

After some more testing, it seems this patch has a problem I'm not sure how to resolve.

To summarize the issue: when a primary restarts, it connects to Consul and checks for a lease. The lease still exists, so by default it connects as a replica. Unfortunately, it connects to itself, which causes replication issues and transaction corruption.

So if we are the previous owner, I'm not sure how to get an actual instance of the lease.

Going down the function chain:

  1. If we call checkLeaseOwnership in store.go:monitorLease, lease, info, err := s.acquireLeaseOrPrimaryInfo(ctx) returns a nil lease, so calling monitorLeaseAsPrimary will crash on a nil dereference.
  2. If we call checkLeaseOwnership in store.go:acquireLeaseOrPrimaryInfo, lease, err := s.Leaser.Acquire(ctx) will still return nil, litefs.ErrPrimaryExists.
  3. consul.go:acquireExisting looks promising; however, it requires a leaseID, which I'm not sure we have stored anywhere.

A few options:

  1. Persisting a lease token to compare against the existing lease could be one way to solve this (sketched below).
  2. Sleep for some duration. By then the lease will have expired, another candidate will take over as primary, and this problem goes away. This may have unintended side effects.

Not sure where to go from here, but I'm hoping I missed an easy fix.
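
To illustrate option 1, here is a rough sketch against the Consul API directly; the key name, file path, and the idea of persisting the session ID are assumptions, not anything litefs does today:

package main

import (
	"fmt"
	"log"
	"os"
	"strings"

	"github.com/hashicorp/consul/api"
)

// ownsExistingLease compares a session ID persisted by a previous run against
// the session currently holding the Consul key. Key name and file path are
// examples only.
func ownsExistingLease(client *api.Client, key, idFile string) (bool, error) {
	buf, err := os.ReadFile(idFile)
	if err != nil {
		return false, nil // nothing persisted; we can't be the previous owner
	}
	prevSession := strings.TrimSpace(string(buf))

	pair, _, err := client.KV().Get(key, nil)
	if err != nil {
		return false, err
	}
	if pair == nil || pair.Session == "" {
		return false, nil // key missing or not locked; lease is up for grabs
	}
	return pair.Session == prevSession, nil
}

func main() {
	client, err := api.NewClient(api.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}
	owns, err := ownsExistingLease(client, "litefs/primary", "/var/lib/litefs/lease-id")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("previously held the lease:", owns)
}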


For reference, this function compares the primary hostname against all local interface addresses to determine ownership.

// checkLeaseOwnership reports whether the lease hostname resolves to one of
// this machine's own interface addresses.
func checkLeaseOwnership(hostname string) bool {
	primaryAddrs, err := net.LookupHost(hostname)
	if err != nil {
		return false
	}
	interfaceAddrs, err := net.InterfaceAddrs()
	if err != nil {
		return false
	}
	for _, primary := range primaryAddrs {
		for _, ifaceAddr := range interfaceAddrs {
			// InterfaceAddrs returns CIDR notation (e.g. "fdaa::2/112"),
			// so strip the prefix length before comparing.
			addr := strings.Split(ifaceAddr.String(), "/")[0]
			if primary == addr {
				return true
			}
		}
	}
	return false
}

benbjohnson (Collaborator) commented

I'm adding some fixes to prevent this:

  1. Add a "node ID" that's persisted to the data directory so that a node can identify itself. It'll pass this along to the HTTP server when connecting the stream and the server will return an error if it tries to connect to itself.

  2. Add a check to the HTTP server to not allow connections if it is not a primary.

  3. Add a context cancelation to the http.Request.Context() so that it will automatically disconnect requests when leadership changes.

I'd like to avoid checking the host/interface and rely on the API to determine if it should accept connections or not.
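
Roughly, the shape of those checks might look like the sketch below; the handler, header name, and store fields are illustrative stand-ins, not the actual LiteFS code:

package main

import (
	"context"
	"io"
	"log"
	"net/http"
)

// store is a stand-in for the real LiteFS store.
type store struct {
	id         string          // node ID persisted in the data directory
	primaryCtx context.Context // canceled when this node stops being primary
	isPrimary  func() bool
}

type server struct{ store *store }

// handleStream sketches the three checks: reject self-connections by node ID,
// reject connections when this node is not primary, and tie the stream to the
// primary context so replicas get disconnected when leadership changes.
func (s *server) handleStream(w http.ResponseWriter, r *http.Request) {
	if r.Header.Get("Litefs-Id") == s.store.id { // header name is illustrative
		http.Error(w, "cannot connect to self", http.StatusBadRequest)
		return
	}
	if !s.store.isPrimary() {
		http.Error(w, "not primary", http.StatusServiceUnavailable)
		return
	}

	ctx, cancel := context.WithCancel(r.Context())
	defer cancel()
	go func() {
		select {
		case <-s.store.primaryCtx.Done(): // leadership changed
			cancel()
		case <-ctx.Done():
		}
	}()

	io.WriteString(w, "ltx frames would stream here until ctx is done\n") // placeholder for the stream loop
	<-ctx.Done()
}

func main() {
	s := &server{store: &store{
		id:         "node-a",
		primaryCtx: context.Background(), // stand-in; never canceled in this sketch
		isPrimary:  func() bool { return true },
	}}
	http.HandleFunc("/stream", s.handleStream)
	log.Fatal(http.ListenAndServe(":20202", nil))
}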

@tjheeta
Copy link
Author

tjheeta commented Aug 25, 2022

Yes, this is way more elegant and it doesn't break the lease acquisition state machine. 🙂

The tradeoff of writes being dropped till lease expiry is well documented, so there are no surprises here.

As a side note, forcing a primary change might be the safest thing to do in general as we don't know why the original crashed in the first place.

benbjohnson (Collaborator) commented

A new primary should automatically be promoted once the key expires in Consul and the lock delay passes. We can't force a primary change in order to prevent split brain. The safest thing to do is rely on the TTL. It does introduce a little bit of downtime in the case of unclean exits but it's only about 15s. There are some tricks we can do in the future to help minimize that time as well.
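
For anyone following along, this is roughly how a Consul-backed lease with a TTL and lock delay behaves; the hashicorp/consul/api calls are real, but the TTL, lock delay, and key name are illustrative rather than LiteFS's defaults:

package main

import (
	"fmt"
	"log"
	"time"

	"github.com/hashicorp/consul/api"
)

func main() {
	client, err := api.NewClient(api.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}

	// The session carries a TTL and a lock delay. If the holder exits
	// uncleanly, the key stays locked until the TTL lapses and a new holder
	// must also wait out the lock delay, which is where the ~15s of write
	// downtime comes from. Values below are illustrative.
	entry := &api.SessionEntry{
		Name:      "litefs-primary",
		TTL:       "10s",
		LockDelay: 5 * time.Second,
		Behavior:  api.SessionBehaviorDelete, // delete the key when the session is invalidated
	}
	sessionID, _, err := client.Session().Create(entry, nil)
	if err != nil {
		log.Fatal(err)
	}

	// Try to take the lease by acquiring the key with the session.
	acquired, _, err := client.KV().Acquire(&api.KVPair{
		Key:     "litefs/primary",
		Value:   []byte("http://my-advertise-url:20202"),
		Session: sessionID,
	}, nil)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("lease acquired:", acquired)
}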


tjheeta commented Aug 25, 2022

Ah, I used some confusing language there: "forcing a primary change" -> "ensuring the primary will change" may be safer than trying to re-acquire the lease since we don't know why the primary crashed.

From my perspective, given the unclean exit, I do believe 15 seconds of no writes is quite acceptable.

benbjohnson (Collaborator) commented

I added the node ID here: #79

benbjohnson (Collaborator) commented

It also changes the disk layout a bit. It moves the database directories from under the root data directory (e.g. $DATADIR/00000001) to a subdirectory (e.g. $DATADIR/dbs/00000001). The node ID is persisted to $DATADIR/id.
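
A minimal sketch of what loading or generating a node ID at $DATADIR/id could look like; the ID format and generation method here are assumptions:

package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"os"
	"path/filepath"
)

// loadOrCreateNodeID returns the node ID stored at $DATADIR/id, generating and
// persisting a random one on first start. The ID format is an assumption; only
// the file location comes from the comment above.
func loadOrCreateNodeID(dataDir string) (string, error) {
	path := filepath.Join(dataDir, "id")
	if b, err := os.ReadFile(path); err == nil {
		return string(b), nil
	}
	buf := make([]byte, 16)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	id := hex.EncodeToString(buf)
	if err := os.WriteFile(path, []byte(id), 0o600); err != nil {
		return "", err
	}
	return id, nil
}

func main() {
	id, err := loadOrCreateNodeID("/var/lib/litefs") // example data directory
	if err != nil {
		panic(err)
	}
	fmt.Println("node id:", id)
}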

benbjohnson (Collaborator) commented

I added context wrapping so that it will get canceled when the primary status changes. The PR still needs some tests, but I figured I'd throw it up on GitHub. I'm AFK for a couple of days, but I'll add those tests on Monday. #80


tjheeta commented Aug 25, 2022

Wow, thanks for the incredibly quick fix. Manually tested against dd6e73e.

Both quickly restarting (fly vm restart) and destroying the VM and having it come back up (fly vm stop) seem to be working now; both are under 40 seconds end to end with replication re-established.

fly vm restart (which would simulate a reboot or OOM):

2022-08-25T20:42:12Z app[707e6198] mia [info]error umounting /mnt/litefs: EBUSY: Device or resource busy, retrying in a bit
2022-08-25T20:42:18Z app[707e6198] mia [info]Using Consul to determine primary
2022-08-25T20:42:19Z app[707e6198] mia [info]initializing consul: key= url=https://:8e69cbf3-788c-c318-c16a-da6b16a43547@consul-iad-2.fly-shared.net/litefs-liveview-pvxemq2m3579gz63/ hostname=707e6198.vm.litefs-liveview.internal advertise-url=http://707e6198.vm.litefs-liveview.internal:20202
2022-08-25T20:42:19Z app[707e6198] mia [info]LiteFS mounted to: /mnt/litefs/db
2022-08-25T20:42:19Z app[707e6198] mia [info]http server listening on: http://localhost:20202
2022-08-25T20:42:19Z app[707e6198] mia [info]existing primary found (707e6198.vm.litefs-liveview.internal), connecting as replica
2022-08-25T20:42:19Z app[707e6198] mia [info]http: error: cannot connect to self
2022-08-25T20:42:19Z app[707e6198] mia [info]replica disconected, retrying: connect to primary: invalid response: code=400 ('http://707e6198.vm.litefs-liveview.internal:20202')
2022-08-25T20:42:21Z app[707e6198] mia [info]existing primary found (707e6198.vm.litefs-liveview.internal), connecting as replica
2022-08-25T20:42:21Z app[707e6198] mia [info]http: error: cannot connect to self
2022-08-25T20:42:33Z app[707e6198] mia [info]cannot acquire lease or find primary, retrying: no primary
2022-08-25T20:42:35Z app[707e6198] mia [info]cannot acquire lease or find primary, retrying: no primary
2022-08-25T20:42:37Z app[707e6198] mia [info]primary lease acquired, advertising as http://707e6198.vm.litefs-liveview.internal:20202
2022-08-25T20:42:44Z app[7e7d96ac] sea [info]replica disconected, retrying: next frame: read tcp [fdaa:0:2fff:a7b:2dbb:1:932d:2]:36770->[fdaa:0:2fff:a7b:2cc3:1:932e:2]:20202: read: connec
2022-08-25T20:42:46Z app[7e7d96ac] sea [info]existing primary found (707e6198.vm.litefs-liveview.internal), connecting as replica
2022-08-25T20:42:46Z app[707e6198] mia [info]stream connected

fly vm stop (which destroys the VM completely) takes about 40 seconds to come back up end to end:

2022-08-25T20:38:02Z app[f5fb68e2] mia [info]Umounting /dev/vdc from /mnt/litefs
2022-08-25T20:38:02Z app[f5fb68e2] mia [info]error umounting /mnt/litefs: EBUSY: Device or resource busy, retrying in a bit
2022-08-25T20:38:12Z app[707e6198] mia [info]LiteFS mounted to: /mnt/litefs/db
2022-08-25T20:38:12Z app[707e6198] mia [info]http server listening on: http://localhost:20202
2022-08-25T20:38:12Z app[707e6198] mia [info]existing primary found (f5fb68e2.vm.litefs-liveview.internal), connecting as replica
2022-08-25T20:38:12Z app[707e6198] mia [info]replica disconected, retrying: connect to primary: Post "http://f5fb68e2.vm.litefs-liveview.internal:20202/stream": dial tcp: lookup f5fb68e2.vm.litefs-liveview.internal on [fdaa::3]:53: no such host ('http://f5fb68e2.vm.litefs-liveview.internal:20202')
2022-08-25T20:38:14Z app[707e6198] mia [info]existing primary found (f5fb68e2.vm.litefs-liveview.internal), connecting as replica
2022-08-25T20:38:14Z app[707e6198] mia [info]replica disconected, retrying: connect to primary: Post "http://f5fb68e2.vm.litefs-liveview.internal:20202/stream": dial tcp: lookup f5fb68e2.vm.litefs-liveview.internal on [fdaa::3]:53: no such host ('http://f5fb68e2.vm.litefs-liveview.internal:20202')
2022-08-25T20:38:16Z app[707e6198] mia [info]existing primary found (f5fb68e2.vm.litefs-liveview.internal), connecting as replica
2022-08-25T20:38:16Z app[707e6198] mia [info]replica disconected, retrying: connect to primary: Post "http://f5fb68e2.vm.litefs-liveview.internal:20202/stream": dial tcp: lookup f5fb68e2.vm.litefs-liveview.internal on [fdaa::3]:53: no such host ('http://f5fb68e2.vm.litefs-liveview.internal:20202')
2022-08-25T20:38:16Z app[7e7d96ac] sea [info]20:38:16.428 [warning] [libcluster:fly6pn] unable to connect to :"litefs-liveview@fdaa:0:2fff:a7b:2cc3:1:932e:2"
2022-08-25T20:38:17Z app[707e6198] mia [info]existing primary found (f5fb68e2.vm.litefs-liveview.internal), connecting as replica
2022-08-25T20:38:17Z app[707e6198] mia [info]replica disconected, retrying: connect to primary: Post "http://f5fb68e2.vm.litefs-liveview.internal:20202/stream": dial tcp: lookup f5fb68e2.vm.litefs-liveview.internal on [fdaa::3]:53: no such host ('http://f5fb68e2.vm.litefs-liveview.internal:20202')
2022-08-25T20:38:18Z app[707e6198] mia [info]existing primary found (f5fb68e2.vm.litefs-liveview.internal), connecting as replica
2022-08-25T20:38:18Z app[707e6198] mia [info]replica disconected, retrying: connect to primary: Post "http://f5fb68e2.vm.litefs-liveview.internal:20202/stream": dial tcp: lookup f5fb68e2.vm.litefs-liveview.internal on [fdaa::3]:53: no such host ('http://f5fb68e2.vm.litefs-liveview.internal:20202')
2022-08-25T20:38:19Z app[707e6198] mia [info]existing primary found (f5fb68e2.vm.litefs-liveview.internal), connecting as replica
2022-08-25T20:38:19Z app[707e6198] mia [info]replica disconected, retrying: connect to primary: Post "http://f5fb68e2.vm.litefs-liveview.internal:20202/stream": dial tcp: lookup f5fb68e2.vm.litefs-liveview.internal on [fdaa::3]:53: no such host ('http://f5fb68e2.vm.litefs-liveview.internal:20202')
2022-08-25T20:38:22Z app[707e6198] mia [info]cannot acquire lease or find primary, retrying: no primary
2022-08-25T20:38:24Z app[707e6198] mia [info]cannot acquire lease or find primary, retrying: no primary
2022-08-25T20:38:26Z app[707e6198] mia [info]primary lease acquired, advertising as http://707e6198.vm.litefs-liveview.internal:20202
2022-08-25T20:38:30Z app[7e7d96ac] sea [info]replica disconected, retrying: next frame: read tcp [fdaa:0:2fff:a7b:2dbb:1:932d:2]:36768->[fdaa:0:2fff:a7b:2cc3:1:932e:2]:20202: read: connection reset by peer
2022-08-25T20:38:32Z app[7e7d96ac] sea [info]existing primary found (707e6198.vm.litefs-liveview.internal), connecting as replica
2022-08-25T20:38:32Z app[707e6198] mia [info]stream connected
2022-08-25T20:38:38Z app[707e6198] mia [info]send frame<ltx>: db=00000001 tx=(000000000000013a,000000000000013a) chksum=(abfc65df1be210fb,8aa1de7727012d64) size=8320
2022-08-25T20:38:38Z app[7e7d96ac] sea [info]recv frame<ltx>: db=1 tx=(314,314)


benbjohnson (Collaborator) commented

I added tests and merged #80 so I'm closing this out. Thanks for reporting it!
