New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
client: support server TLS certificate change #1602
Conversation
a87930b
to
1b5370b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Working well, but, if you change the cert while dexc is offline, it will only say Disconnected
after you start up dexc and navigate to the settings screen, even if the cert is the problem.
While fixing the issue of the dex settings screen showing |
2053473
to
3d7ca20
Compare
Haven't tried this pr yet, but the reconnect loop starts if the server is unreachable on dexc startup. c782ffb I would expect it to do the same on server reconfigure, but I have to look into that |
The loop starts, but it is stopped immediately because in |
OK, I'll look at it. On master and 0.4, the reconnect loops until connection succeeds. I can start dexc with network down and it starts connection attempts, backing off, and when network comes up it connects fine. |
In this commit here: f222e12 Do you not see the "no longer keeping alive" message right after it fails to connect? For me, using master, it does not reconnect to the dex if it fails to connect at startup. |
Oh WTF. It's broken on master. |
Oh, it was this: e8fd8ac I'll try to figure out what needs to be done there. |
Let's consider this: #1616 |
It works well after rebasing. I've gotten rid of the last unnecessary commit. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess we don't need to do the reconnect loop if we failed for the certificate. We would expect it to continue failing until the certificate is changed. I don't know if this is a big change though. Working well.
@@ -416,7 +427,7 @@ func (conn *wsConn) Connect(ctx context.Context) (*sync.WaitGroup, error) { | |||
go func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To not start the reconnect loop when the initial connection attempt fails on account of a bad or missing cert, maybe the following:
func (conn *wsConn) Connect(ctx context.Context) (*sync.WaitGroup, error) {
var ctxInternal context.Context
ctxInternal, conn.cancel = context.WithCancel(ctx)
err := conn.connect(ctxInternal)
if err != nil {
// If the certificate is invalid or missing, do not start the reconnect
// loop, and return an error with no WaitGroup.
if errors.Is(err, ErrInvalidCert) || errors.Is(err, ErrCertRequired) {
conn.cancel()
conn.wg.Wait() // probably a no-op
close(conn.readCh)
return nil, err
}
// The read loop would normally trigger keepAlive, but it wasn't started
// on account of a connect error.
conn.log.Errorf("Initial connection failed, starting reconnect loop: %v", err)
time.AfterFunc(5*time.Second, func() {
conn.reconnectCh <- struct{}{}
})
}
conn.wg.Add(2)
go func() {
defer conn.wg.Done()
conn.keepAlive(ctxInternal)
}()
go func() {
defer conn.wg.Done()
<-ctxInternal.Done()
conn.setConnectionStatus(Disconnected)
conn.wsMtx.Lock()
if conn.ws != nil {
conn.log.Debug("Sending close 1000 (normal) message.")
conn.close()
}
conn.wsMtx.Unlock()
close(conn.readCh) // signal to MessageSource receivers that the wsConn is dead
}()
return &conn.wg, err
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In feeling out the above change, I think we have some unfortunate patterns and conventions we're fighting with.
Back in (*Core).connectDEX
, consider adding this comment at this location:
@@ -5630,6 +5631,14 @@ func (c *Core) connectDEX(acctInfo *db.AccountInfo, temporary ...bool) (*dexConn
conn, err := c.wsConstructor(&wsCfg)
if err != nil {
return nil, err
}
dc.WsConn = conn
dc.connMaster = dex.NewConnectionMaster(conn)
+ // At this point, we have a valid dexConnection object whether or not we can
+ // actually connect. In any return below, we return the dexConnection so it
+ // may be tracked in the c.conns map (and listed as a known DEX). TODO:
+ // split the above code into a dexConnection constructor, and the below into
+ // a startDexConnection function so we don't have the anti-pattern of
+ // returning a non-nil object with a non-nil error and requiring the caller
+ // to check both!
+
// Start listening for messages. The listener stops when core shuts down or
// the dexConnection's ConnectionMaster is shut down. This goroutine should
// be started as long as the reconnect loop is running. It only returns when
// the wsConn is stopped.
if listen {
c.wg.Add(1)
go c.listen(dc)
}
err = dc.connMaster.Connect(c.ctx)
I don't think we should act on that TODO now, but I wanted to call out this design issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And related, I believe there's more trouble with ConnectionMaster
:
diff --git a/dex/runner.go b/dex/runner.go
index e6ae29e7..4858f7e5 100644
--- a/dex/runner.go
+++ b/dex/runner.go
@@ -110,10 +110,14 @@ func (c *ConnectionMaster) Connect(ctx context.Context) (err error) {
c.init(ctx)
c.mtx.Lock()
c.wg, err = c.connector.Connect(c.ctx)
+ if c.wg == nil { // don't expect a waitgroup if Connect errored
+ c.wg = new(sync.WaitGroup) // don't let Disconnect or Wait panic
+ c.cancel() // we're not "On"
+ }
c.mtx.Unlock()
// NOTE: Even if err is non-nil, we can't cancel the internal context
- // because the connector may be attempting to reconnect. The caller should
- // decide to let it go or to call Disconnect.
+ // because the connector may be attempting to reconnect (if wg is non-nil).
+ // The caller should decide to let it go or to call Disconnect.
return err
}
Because the Disconnect and Wait methods would panic if the Connector.Connect call returned nil for the wg (like the asset.Wallet implementations do!). I will get back to #1474 soon because removing this wg pointer field was one thing I thought should be done.
And as an aside, in retrospect, I don't think we should have used ConnectionMaster
so broadly given it's quirks and tendency to hide the actual types and their statuses, but here we are.
In general, we should resist patterns that return non-nil objects when error is non-nil. We're stuck with it for now however.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
||
accountInfo.Cert = cert | ||
|
||
_, connected := c.connectAccount(accountInfo) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there's a dexConnection already in the map, it will be orphaned. You can observe this if a failing reconnect loop is going for the old connection as it never stops even though the conn was replaced with a working one with a new TLS cert.
I believe connectAccount
needs a check that stops any existing connection (prior to connecting the new one otherwise the frontend gets the notifications in the wrong order and it looks disconnected until you refresh the page):
diff --git a/client/core/core.go b/client/core/core.go
index 48b1002f..4dfcf333 100644
--- a/client/core/core.go
+++ b/client/core/core.go
@@ -4824,7 +4824,8 @@ func (c *Core) initialize() {
// connectAccount makes a connection to the DEX for the given account. If a
// non-nil dexConnection is returned, it was inserted into the conns map even if
// the initial connection attempt failed (connected == false), and the connect
-// retry / keepalive loop is active.
+// retry / keepalive loop is active. If there was already a dexConnection, it is
+// first stopped.
func (c *Core) connectAccount(acct *db.AccountInfo) (dc *dexConnection, connected bool) {
if !acct.Paid && len(acct.FeeCoin) == 0 {
// Register should have set this when creating the account that was
@@ -4839,6 +4840,14 @@ func (c *Core) connectAccount(acct *db.AccountInfo) (dc *dexConnection, connecte
c.log.Errorf("skipping loading of %s due to address parse error: %v", host, err)
return
}
+
+ c.connMtx.RLock()
+ if dc := c.conns[host]; dc != nil {
+ dc.connMaster.Disconnect()
+ dc.acct.lock()
+ } // leave it in the map so it remains listed if connectDEX fails
+ c.connMtx.RUnlock()
+
dc, err = c.connectDEX(acct)
if dc == nil {
c.log.Errorf("Cannot connect to DEX %s: %v", host, err)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should close the book feeds too, but haven't tested with that:
c.connMtx.RLock()
if dc := c.conns[host]; dc != nil {
dc.connMaster.Disconnect()
dc.acct.lock()
dc.booksMtx.Lock()
for m, b := range dc.books {
b.closeFeeds()
if b.closeTimer != nil {
b.closeTimer.Stop()
}
delete(dc.books, m)
}
dc.booksMtx.Unlock()
} // leave it in the map so it remains listed if connectDEX fails
c.connMtx.RUnlock()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With rebase and the above diff: https://github.com/chappjc/dcrdex/commits/tlsReconfig-chappjc (untested!)
As an aside, this enUS map creates a lot of churn. In rebasing, I just used a shorter, but equally good, key to spare modifying a ton of code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, the changes work well.
- `client/comms`: Instead of just keeping track of the connection being conneccted or not, a `ConnectionStatus` type is introduced which includes `InvalidCert` in addition to `Connected` and `Disconnected`. This is then passed to core and the UI through the `ConnEventNote`. - `client/core`: A new function `UpdateCert` is added which uses TLS certificate supplied by the user to connect to a server. If the connection is successful, the new certificate is stored in the db. - `client/webserver`: A new api request `/api/updatecert` is added to allow the user to update the TLS Certificate for a host. - `ui`: The settings page is updated. Instead of all the options for each server being on the settings page, there is just a link to a dex settings page specific to a server. This contains all the options that previously existed on the settings page in addition to the new option to update the TLS Certificate.
5213df4
to
98f2b25
Compare
AcctID string `json:"acctID"` | ||
Markets map[string]*Market `json:"markets"` | ||
Assets map[uint32]*dex.Asset `json:"assets"` | ||
ConnectionStatus comms.ConnectionStatus `json:"connectionStatus"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok because we are still at v0.x but note that when we reach v1, we're making a promise to Go consumers not to break the API. So doing something like this in the future would cause the module to become core/v2
.
However, if we left Connected bool
, it would not be a breaking change even if it appears redundant with ConnectionStatus
.
Just pointing it out so we can get used to stabilizing the API.
All still looking good to me with the latest update. LOL re the |
client/comms
: Instead of just keeping track of the connectionbeing conneccted or not, a
ConnectionStatus
type is introducedwhich includes
InvalidCert
in addition toConnected
andDisconnected
. This is then passed to core and the UI throughthe
ConnEventNote
.client/core
: A new functionUpdateCert
is added which usesTLS certificate supplied by the user to connect to a server. If
the connection is successful, the new certificate is stored in
the db.
client/webserver
: A new api request/api/updatecert
is addedto allow the user to update the TLS Certificate for a host.
ui
: The settings page is updated. Instead of all the options foreach server being on the settings page, there is just a link to a
dex settings page specific to a server. This contains all the options
that previously existed on the settings page in addition to the new
option to update the TLS Certificate.
Closes #1571