client/eth: Check provider header times. #2074

Merged
merged 1 commit into from Feb 20, 2023

JoeGruffins
Member

When fetching a new or cached header with a provider, do a basic check on the header's time to determine if the header, and so the provider, are up to date.

closes #2064

@@ -295,7 +295,7 @@ type ethFetcher interface {
 sendSignedTransaction(ctx context.Context, tx *types.Transaction) error
 sendTransaction(ctx context.Context, txOpts *bind.TransactOpts, to common.Address, data []byte) (*types.Transaction, error)
 signData(data []byte) (sig, pubKey []byte, err error)
-syncProgress(context.Context) (*ethereum.SyncProgress, error)
+syncProgress(context.Context) (progress *ethereum.SyncProgress, bestHeaderUNIXTime uint64, err error)
Member Author

The providers fetch the block header anyway, so this is an optimization that cuts off one header fetch for providers.

// Time in the header is in seconds.
timeDiff := time.Now().Unix() - int64(bh.Time)
if timeDiff > dexeth.MaxBlockInterval && eth.net != dex.Simnet {
Member Author

Removing the simnet check and mining some blocks in the harness tests. It seems reasonable to just mine some blocks when testing. May be a preference thing so I can revert.

Member

Prefer reverting.

Member Author

Well, ok. Also passing the net to the provider's bestHeader so it can also skip the check on simnet.

@JoeGruffins JoeGruffins marked this pull request as ready for review January 25, 2023 11:01
@JoeGruffins
Member Author

I guess it's possible for a provider to get out of sync and still not be found to have an old header if that provider randomly doesn't call best header but later is randomly asked to do something else. Need to look at this a little bit more.

@JoeGruffins JoeGruffins marked this pull request as draft January 25, 2023 14:10
@chappjc
Member

chappjc commented Jan 26, 2023

Heck I think we'll have to be able to work without even eth_syncing. With blastapi.io:

[ERR] MKT: Not starting dcr_usdc.eth market because of ChainsSynced error: error checking sync status for 60001: Method not found

EDIT: the above is server! But still...

@chappjc
Member

chappjc commented Jan 26, 2023

Wait a sec, we don't use eth_syncing at all, do we?

In fact, I think providerAnkr should be moved to the compliantProviders map because there's no eth_syncing use at all right?

Similarly, I think we can add providerBlast = "blastapi.io" to client's multirpc compliant providers.

@chappjc
Member

chappjc commented Jan 26, 2023

For server, is there an issue with this?

diff --git a/server/asset/eth/eth.go b/server/asset/eth/eth.go
index e541304ef..8e11349a7 100644
--- a/server/asset/eth/eth.go
+++ b/server/asset/eth/eth.go
@@ -565,10 +565,7 @@ func (eth *baseBackend) Synced() (bool, error) {
        // node.SyncProgress will return nil both before syncing has begun and
        // after it has finished. In order to discern when syncing has begun,
        // check that the best header came in under MaxBlockInterval.
-       sp, err := eth.node.syncProgress(eth.ctx)
-       if err != nil {
-               return false, err
-       }
+       sp, _ := eth.node.syncProgress(eth.ctx)
        if sp != nil {
                return false, nil
        }

If there's no eth_syncing support, fall back to header age. We should do that.

@JoeGruffins
Member Author

JoeGruffins commented Jan 26, 2023

> Wait a sec, we don't use eth_syncing at all, do we?

It seems pretty useless and confusing from the very beginning. We can just do the header time everywhere. That's what it will always come down to currently, anyway.

Server only recently (this week) became usable with providers, so some things we weren't thinking about will probably pop up.

Comment on lines +784 to +855
// succeeds or all have failed. The context used to run functions has a time
// limit equal to defaultRequestTimeout for all requests to return. If
// operations are expected to run longer than that the calling function should
// not use the altered context.
func (m *multiRPCClient) withOne(ctx context.Context, providers []*provider, f func(context.Context, *provider) error, acceptabilityFilters ...acceptabilityFilter) (superError error) {
Member Author

So, I needed a ctx to check the header time... and so I thought it might be better to set the context deadline in this function to ensure all the functions that use providers have a deadline. I think it is easier to keep up with than setting them all separately. You can see a couple of cases where this timeout is not used, but in the vast majority we are just running one RPC request with these, so I think this is cleaner. This isn't strictly needed to solve the issue, however.

Member

Can remove the timed contexts in currentFees, nextNonce, and transactionConfirmations. They all use withOne underneath.

Member Author

Ok. The extra ctx in currentFees and transactionConfirmations were just because two requests were being done with the same limit. But, they probably only should take milliseconds, so probably fine. I was thinking nextNonce was using the same context the whole time, but now that I look again it is using a new withPreferred every time so a new timeout. I was just confused it seems.

@JoeGruffins
Member Author

> In fact, I think providerAnkr should be moved to the compliantProviders map because there's no eth_syncing use at all right?
>
> Similarly, I think we can add providerBlast = "blastapi.io" to client's multirpc compliant providers.

Moved these two to the compliant providers.

@JoeGruffins JoeGruffins marked this pull request as ready for review January 26, 2023 06:04
@chappjc chappjc self-requested a review January 30, 2023 15:58
@chappjc chappjc added this to the 0.6 milestone Jan 30, 2023
Comment on lines 117 to 122
time.Since(p.tip.headerStamp) > stale ||
time.Now().Unix()-int64(p.tip.header.Time) > dexeth.MaxBlockInterval {
Member

@chappjc chappjc Jan 31, 2023

Can this final condition hit if the stale one does not? 180 sec is dexeth.MaxBlockInterval.
I suppose p.tip.headerStamp is when we received that tip, so I guess that could be much newer. OK.

Member Author

I got the impression on matrix that @buck54321 is going to do some things around here.

Member

We could probably get rid of the stale check now.

Member Author

Removing the stale check.

Member Author

Or, actually, with no stale check it only fetches a new header after dexeth.MaxBlockInterval... So maybe one block interval after that should fetch a new one.

Member

@buck54321 buck54321 Feb 8, 2023

After seeing it now, I think maybe we did need the stale check. It's the MaxBlockInterval check that could have been deleted, since stale was always substantially more restrictive. But we definitely don't want to check after 13 seconds when we have a websocket connection.

Member Author

Put it back. The header can still be 180 seconds old though. That's a few blocks.

tip, err := p.bestHeader(ctx, m.log)
if err != nil {
return err
}
bestHeaderUNIXTime = tip.Time
Member

Good call returning this instead of doing bestHeader again in the caller.


// Non-compliant providers
providerCloudflareETH = "cloudflare-eth.com" // "SuggestGasTipCap" error: Method not found
providerAnkr = "ankr.com" // "SyncProgress" error: the method eth_syncing does not exist/is not available
Member

I keep meaning to come back to this, but even before this diff, I had been using ankr and it was never treated as non-compliant, nor did it have to run a compliance check (I searched the logs, and I don't have a compliant-providers.json anywhere on my drive). Not sure how it was working. The host was https://rpc.ankr.com/eth_goerli

@JoeGruffins
Member Author

Just rebased.

@JoeGruffins
Member Author

I broke the multirpc live test by not calling m.Run in TestMain. Fixed now https://github.com/decred/dcrdex/compare/589fcfe6e187f26a46b18b7ccb20e36017c8a6bb..6f3d98ec538934215f917ad05aa846e16e2c69e1

But, tests are not passing. Looking into that and the compliant providers testing on testnet.

@JoeGruffins
Member Author

The test failures in the multirpc live test are solved by #2091 so nothing to do here about that.

@JoeGruffins
Member Author

JoeGruffins commented Feb 3, 2023

> I had been using ankr and it was never treated as non-compliant nor did it have to run a compliance check

It looks like nothing happens if a provider is on the non-compliant list atm, it only checks unknown providers. Should it just not use providers on the bad list, or maybe check them again?

If anything needs to change with those, I'd like to make a new pr.

Made #2102

@chappjc
Member

chappjc commented Feb 3, 2023

> Should it just not use providers on the bad list, or maybe check them again?

I don't think there's a point in the non-compliant providers really. Why not have it check all unknown? Ankr worked for a long time for me and it was on the non-compliant providers list. Providers can change and the list can become stale... or the RPCs we use can change (no eth_syncing anymore) and the list can become stale.

@JoeGruffins
Member Author

// newly fetched header is not too old. If it is too
// old that indicates the provider is not in sync and
// should not be used.
if _, err := p.bestHeader(ctx, m.log); err != nil {
Member

@chappjc chappjc Feb 15, 2023

So in (*multiRPCClient).bestHeader, it's going to do:

return hdr, m.withAny(ctx, func(ctx context.Context, p *provider) error {
	hdr, err = p.bestHeader(ctx, m.log)
	return err
}, allRPCErrorsAreFails)

which means that through withAny -> withOne, we're going to get here and do (*provider).bestHeader twice in a row?

I kinda wish something other than withOne were in charge of flagging providers as stale. (*ETHWallet).checkForNewBlocks is going to be calling bestHeader regularly anyway, so it seems like either (*provider).bestHeader or (*multiRPCClient).bestHeader would be a good place to set some flag, which withOne could then use in a similar manner to how it gets readyProviders with !p.failed().

Member Author

Can I stick the check in a goroutine? I'm afraid if you just put the check in a random call, it randomly might not get called while another random call is called and the stale provider is used.

Member Author

While adding the goroutine, it seems like the other goroutine subscribeHeaders will continue to run on old providers if you reconfigure, although it does not do anything, it's just waiting. Maybe should be stopped when the rpc client is closed.

Member

> While adding the goroutine, it seems like the other goroutine subscribeHeaders will continue to run on old providers if you reconfigure, although it does not do anything, it's just waiting. Maybe should be stopped when the rpc client is closed.

The docs for ethereum.Subscription.Err say

// Err returns the subscription error channel. The error channel receives
// a value if there is an issue with the subscription (e.g. the network connection
// delivering the events has been closed). Only one value will ever be sent.
// The error channel is closed by Unsubscribe.
Err() <-chan error

So it seems to me that closing the *ethclient.Client should give us something to work with in subscribeHeaders. Maybe not checking the right thing.

Member Author

Oh, maybe I was mistaken.

Member Author

It makes sense to wait for it to finish during shutdown?

Member Author

I tested some more and you are right, closing the client causes the 'ok' part of case err, ok := <-sub.Err(): to be true.

I still think the extra monitoring is sane. Can revert though.

@JoeGruffins JoeGruffins force-pushed the dontusenotsyncdethclient branch 2 times, most recently from f981b0e to 843c7ae Compare February 16, 2023 11:29
@JoeGruffins
Member Author

Just rebased.

@JoeGruffins
Member Author

// should not be used.
innerCtx, cancel := context.WithTimeout(ctx, defaultRequestTimeout)
if _, err := p.bestHeader(innerCtx, log); err != nil {
log.Warnf("Problem getting best header: %s.", err)
Member

Could say the provider in the message, otherwise it's not too helpful.

Member Author

Yeah not helpful. Adding the host.

log.Tracef("handling header refreshes for %q", p.host)
for {
select {
case <-time.After(headerCheckInterval):
Member

A ticker would avoid the goroutine churn, but if you think it's important to start the timer after bestHeader completes, that's alright I guess.

Member Author

With a time limit on the context, there should be no way to overlap. Will change to a ticker.

@chappjc
Member

chappjc commented Feb 17, 2023

Just worked as intended for me right now as blastio fell behind, got flagged as bad, and then caught up really fast:

2023-02-17 23:15:17.366 [TRC] CORE[eth][ETH][RPC]: Fetching fresh header from "eth-mainnet.blastapi.io"
2023-02-17 23:15:17.489 [WRN] CORE[eth][ETH][RPC]: Problem getting best header: time since last eth block (222 sec) exceeds 180 sec. Assuming provider eth-mainnet.blastapi.io is not in sync. Ensure your computer's system clock is correct..
2023-02-17 23:15:22.115 [TRC] CORE: New peer count for asset eth: 1
2023-02-17 23:15:22.115 [TRC] CORE: New peer count for asset usdc.eth: 1
2023-02-17 23:15:25.154 [TRC] CORE[eth][ETH][RPC]: "eth-mainnet.g.alchemy.com" reported new tip at height 16651611 (0x095e7b51572a9da8dd735dff1dcac429274c736e18713652d082934824598450)
2023-02-17 23:15:26.115 [DBG] CORE[eth][ETH]: tip change: 16651610 (0x209a203ea224de8e366261db80b92b4fb427c084520bf0489204c1fc20ea6d2e) => 16651611 (0x095e7b51572a9da8dd735dff1dcac429274c736e18713652d082934824598450)
2023-02-17 23:15:26.115 [TRC] CORE: Processing tip change for eth
2023-02-17 23:15:26.115 [TRC] CORE: Processing tip change for usdc.eth
2023-02-17 23:15:31.250 [TRC] CORE[eth][ETH][RPC]: "eth-mainnet.blastapi.io" reported new tip at height 16651593 (0x8e37c44fdca54686b245cad17e114d8807e359fd2944d1eb0967df6caf9b7610)
2023-02-17 23:15:31.539 [TRC] CORE: New peer count for asset btc: 7
2023-02-17 23:15:31.804 [TRC] CORE[eth][ETH][RPC]: "eth-mainnet.blastapi.io" reported new tip at height 16651594 (0x893fb6ec2d4dbc0e60c0e529076478167c8b7f3ca142817be6067e946169ceee)
2023-02-17 23:15:32.116 [TRC] CORE: New peer count for asset eth: 2
2023-02-17 23:15:32.116 [TRC] CORE: New peer count for asset usdc.eth: 2
2023-02-17 23:15:32.470 [TRC] CORE[eth][ETH][RPC]: "eth-mainnet.blastapi.io" reported new tip at height 16651595 (0x8eb5d0eb1a0779a8bffabf4bed760f1b691fc4d9456f12c0d53b2c9c598e4dbe)
2023-02-17 23:15:34.784 [TRC] CORE[eth][ETH][RPC]: "eth-mainnet.blastapi.io" reported new tip at height 16651596 (0xcd59b55834a73e887383e5c8a3b20e4b5c45846e0999f014c79ba02f04065c9d)
2023-02-17 23:15:35.504 [TRC] CORE[eth][ETH][RPC]: "eth-mainnet.blastapi.io" reported new tip at height 16651597 (0x6ab097173e9207edb7c79c692ad16d4b3680dcb8be0d99e62a9f6817622863c7)
2023-02-17 23:15:36.161 [TRC] CORE[eth][ETH][RPC]: "eth-mainnet.blastapi.io" reported new tip at height 16651598 (0x3eca4a799bf3563ae8ab44799a67a5c72352a589c9a0b8ca34398df459b2c6ec)
2023-02-17 23:15:36.538 [TRC] CORE: New peer count for asset btc: 8
2023-02-17 23:15:36.762 [TRC] CORE[eth][ETH][RPC]: "eth-mainnet.blastapi.io" reported new tip at height 16651599 (0xc5340788b2f1f146f075898e5ac37072a572d8b50207046cd822772ef6250e12)
2023-02-17 23:15:37.342 [TRC] CORE[eth][ETH][RPC]: "eth-mainnet.blastapi.io" reported new tip at height 16651600 (0xd79d53be055fee5221e04556d96b421c28ece5fc1bb7897487bee72f69a0d69f)
2023-02-17 23:15:38.166 [TRC] CORE[eth][ETH][RPC]: "eth-mainnet.blastapi.io" reported new tip at height 16651601 (0x796459da04c9c84b757df7cfda4fd2b2a76c50c795ca14225254e10b231f37ff)
2023-02-17 23:15:38.820 [TRC] CORE[eth][ETH][RPC]: "eth-mainnet.blastapi.io" reported new tip at height 16651602 (0x5abb38b8ff4b49c3ce3f0692799167bf485e3faa8122925bde7badd0faaa5676)
2023-02-17 23:15:39.018 [TRC] CORE[eth][ETH][RPC]: "eth-mainnet.g.alchemy.com" reported new tip at height 16651612 (0x59a491808f230dc8495ce5b05cd9aa77a54a7452c6272374d8be17d89b2ad8e8)
2023-02-17 23:15:39.116 [DBG] CORE[eth][ETH]: tip change: 16651611 (0x095e7b51572a9da8dd735dff1dcac429274c736e18713652d082934824598450) => 16651612 (0x59a491808f230dc8495ce5b05cd9aa77a54a7452c6272374d8be17d89b2ad8e8)
2023-02-17 23:15:39.116 [TRC] CORE: Processing tip change for usdc.eth
2023-02-17 23:15:39.116 [TRC] CORE: Processing tip change for eth
2023-02-17 23:15:39.485 [TRC] CORE[eth][ETH][RPC]: "eth-mainnet.blastapi.io" reported new tip at height 16651603 (0xc5859eb5a0df52ae0300b4494b8803229ff6df9112703b8dff945f97a0512b6b)
2023-02-17 23:15:40.112 [TRC] CORE[eth][ETH][RPC]: "eth-mainnet.blastapi.io" reported new tip at height 16651604 (0xddec1707767e2b271d44a6e0b5744ae902983c93ae2d2a85443d053a200487b3)
2023-02-17 23:15:40.860 [TRC] CORE[eth][ETH][RPC]: "eth-mainnet.blastapi.io" reported new tip at height 16651605 (0x518b45072927637cd51378308636102adb037e590da736014a8b4679ba4e8cba)
2023-02-17 23:15:41.539 [TRC] CORE: New peer count for asset btc: 9
2023-02-17 23:15:41.634 [TRC] CORE[eth][ETH][RPC]: "eth-mainnet.blastapi.io" reported new tip at height 16651606 (0x7b6e68242bc3fc531b0ac2769f426e391e80db65c7fdfa0f8efbd79d577eba8a)
2023-02-17 23:15:48.924 [TRC] CORE[eth][ETH][RPC]: "eth-mainnet.g.alchemy.com" reported new tip at height 16651613 (0x7dfd3052ffecf1b0413021d7d1c18b67b6300ce74290cc06c630147299fbae88)
2023-02-17 23:15:49.116 [DBG] CORE[eth][ETH]: tip change: 16651612 (0x59a491808f230dc8495ce5b05cd9aa77a54a7452c6272374d8be17d89b2ad8e8) => 16651613 (0x7dfd3052ffecf1b0413021d7d1c18b67b6300ce74290cc06c630147299fbae88)
2023-02-17 23:15:49.116 [TRC] CORE: Processing tip change for usdc.eth
2023-02-17 23:15:49.116 [TRC] CORE: Processing tip change for eth
2023-02-17 23:15:59.252 [TRC] CORE[eth][ETH][RPC]: "eth-mainnet.blastapi.io" reported new tip at height 16651607 (0x0238ee7ae410401bed6ad7f1cf5d1fd747b75d6f58a570467f7bb19f2d976a6d)
2023-02-17 23:16:00.013 [TRC] CORE[eth][ETH][RPC]: "eth-mainnet.blastapi.io" reported new tip at height 16651608 (0xfea6a20072a76aa9214d2a99cc74cde5fec3b572facbb3f7fb02d4ac5fc7bab9)
2023-02-17 23:16:01.930 [TRC] CORE[eth][ETH][RPC]: "eth-mainnet.g.alchemy.com" reported new tip at height 16651614 (0xf78cc3cfcbe757d8ce5bf34b8767a74aba77b111f505dc0369df3653fbf252e2)
2023-02-17 23:16:02.116 [DBG] CORE[eth][ETH]: tip change: 16651613 (0x7dfd3052ffecf1b0413021d7d1c18b67b6300ce74290cc06c630147299fbae88) => 16651614 (0xf78cc3cfcbe757d8ce5bf34b8767a74aba77b111f505dc0369df3653fbf252e2)
2023-02-17 23:16:02.116 [TRC] CORE: Processing tip change for usdc.eth
2023-02-17 23:16:02.116 [TRC] CORE: Processing tip change for eth
2023-02-17 23:16:03.317 [TRC] CORE[eth][ETH][RPC]: "eth-mainnet.blastapi.io" reported new tip at height 16651609 (0x9364d6592d7dc4a07984030f130a3c9cea05de314197357234345f79c463c185)
2023-02-17 23:16:04.119 [TRC] CORE[eth][ETH][RPC]: "eth-mainnet.blastapi.io" reported new tip at height 16651610 (0x209a203ea224de8e366261db80b92b4fb427c084520bf0489204c1fc20ea6d2e)
2023-02-17 23:16:04.691 [TRC] CORE[eth][ETH][RPC]: "eth-mainnet.blastapi.io" reported new tip at height 16651611 (0x095e7b51572a9da8dd735dff1dcac429274c736e18713652d082934824598450)
2023-02-17 23:16:05.313 [TRC] CORE[eth][ETH][RPC]: "eth-mainnet.blastapi.io" reported new tip at height 16651612 (0x59a491808f230dc8495ce5b05cd9aa77a54a7452c6272374d8be17d89b2ad8e8)
2023-02-17 23:16:06.210 [TRC] CORE[eth][ETH][RPC]: Using cached header from "eth-mainnet.g.alchemy.com"
2023-02-17 23:16:07.490 [TRC] CORE[eth][ETH][RPC]: Using cached header from "eth-mainnet.blastapi.io"
2023-02-17 23:16:12.934 [TRC] CORE[eth][ETH][RPC]: "eth-mainnet.g.alchemy.com" reported new tip at height 16651615 (0x07b2377dc57f72ee77f141839d9a54f67844835de03a7641809ddde8917ccfac)
2023-02-17 23:16:13.116 [DBG] CORE[eth][ETH]: tip change: 16651614 (0xf78cc3cfcbe757d8ce5bf34b8767a74aba77b111f505dc0369df3653fbf252e2) => 16651615 (0x07b2377dc57f72ee77f141839d9a54f67844835de03a7641809ddde8917ccfac

But, when and where did it "unfail" the provider?

@JoeGruffins
Member Author

Just rebased.

@JoeGruffins
Member Author

> But, when and where did it "unfail" the provider?

If it is working how I expect, it is not unfailed until a minute passes, so it is unusable until a minute after 2023-02-17 23:15:17.489. The logs are from new headers coming in. So... hmm. If it is caught up it should unfail, I guess? There are different reasons for failing currently though, so it would need a different stamp just for headers.

@JoeGruffins
Member Author

Applying the brickedFailCount to out-of-date headers may actually be a bit extreme. Probably should be separate.

@JoeGruffins
Member Author

JoeGruffins commented Feb 18, 2023

Or wait, setTip does reset the fail timer and the bricked count, so maybe just a log there.

@JoeGruffins
Member Author

Adding a log for when a provider was unusable, but is now usable. https://github.com/decred/dcrdex/compare/37ea7c0e8ef4c5de4cdca9be80dbd8843177ad4e..813fc242776c1ab3338fd6f598dcfc3bc5bfbfe8

Member

@buck54321 buck54321 left a comment

Seems to be working.

@@ -47,7 +49,8 @@ func mine(ctx context.Context) error {
 }

 func testEndpoint(endpoints []string, syncBlocks uint64, tFunc func(context.Context, *multiRPCClient)) error {
-	ctx, cancel := context.WithTimeout(context.Background(), time.Minute)
+	var cancel context.CancelFunc
Member

Can delete this declaration.

Comment on lines 108 to 133
func TestMain(m *testing.M) {
	var cancel context.CancelFunc
	ctx, cancel = context.WithCancel(context.Background())
	c := make(chan os.Signal, 1)
	signal.Notify(c, os.Interrupt)
	go func() {
		select {
		case <-c:
			cancel()
		case <-ctx.Done():
		}
	}()
	exit := func(exitCode int) {
		signal.Stop(c)
		cancel()
		os.Exit(exitCode)
	}
	time.Sleep(time.Second)
	err := mine(ctx)
	if err != nil {
		fmt.Println(err)
		exit(1)
	}
	time.Sleep(time.Second)
	exit(m.Run())
}
Member

Why need all this?

Member Author

I was mining two blocks before tests when I took out the Simnet check for old headers, but then put the Simnet check back. So.. can revert.

@buck54321
Member

Hmm. Looks like a ci error.

--- FAIL: TestTxWaiters (0.31s)
    swap_test.go:1689: error sending taker swap request: taker swap rpc error. code: 64, msg: already received a swap contract, search in

Seems unrelated though.

@chappjc
Member

chappjc commented Feb 19, 2023

Retriggered CI. Can merge with final iteration for nits in #2074 (review)

When fetching a new or cached header with a provider, do a basic check
on the header's time to determine if the header, and so the provider,
are up to date.
@JoeGruffins
Member Author

JoeGruffins commented Feb 20, 2023

Removed some needless mining in tests since we have a special case for Simnet https://github.com/decred/dcrdex/compare/324705a2d1192fa5f3d491a8df2274e3bb496244..0f21cb47fbd246ded08278fb8ecba435399675ba

@JoeGruffins
Member Author

I thought that listening for ctrl-c was necessary in tests, but it appears not. The tests stop and fail on ctrl-c. Maybe I was thinking about test frameworks like the loadbot.

^Csignal: interrupt
FAIL    decred.org/dcrdex/client/asset/eth      3.861s

@chappjc chappjc merged commit 90c05c1 into decred:master Feb 20, 2023
@chappjc chappjc added the ETH label Feb 20, 2023
Successfully merging this pull request may close these issues.

client/eth: Not synced geth nodes are used.
3 participants