Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

go/runtime/client: Recheck failed blocks and implement max tx age #3443

Merged
merged 1 commit into from
Oct 28, 2020

Conversation

ptrus
Copy link
Member

@ptrus ptrus commented Oct 23, 2020

Fixes: #3412

  • runtime client remembers failed roothash blocks and rechecks them on every new received block
  • additionally transactions that do not succeed in 1500 blocks are considered failed and client drops them

@ptrus ptrus force-pushed the ptrus/fix/runtime-client-txs branch from c714768 to a42e208 Compare October 23, 2020 12:23
@kostko
Copy link
Member

kostko commented Oct 23, 2020

should this be configurable?

Yeah probably.

as a future improvement should the client also support SubmitTxNoWait operation (similar to the one in consensus layer)?

Good idea, open an issue.

@codecov
Copy link

codecov bot commented Oct 23, 2020

Codecov Report

Merging #3443 into master will increase coverage by 0.20%.
The diff coverage is 71.25%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #3443      +/-   ##
==========================================
+ Coverage   66.34%   66.54%   +0.20%     
==========================================
  Files         371      371              
  Lines       33656    33710      +54     
==========================================
+ Hits        22328    22432     +104     
+ Misses       8067     8031      -36     
+ Partials     3261     3247      -14     
Impacted Files Coverage Δ
go/runtime/client/watcher.go 72.59% <68.65%> (+4.81%) ⬆️
go/runtime/client/client.go 66.66% <83.33%> (+0.31%) ⬆️
go/oasis-node/cmd/node/node.go 53.65% <100.00%> (+0.11%) ⬆️
go/storage/mkvs/checkpoint/checkpointer.go 64.75% <0.00%> (-15.58%) ⬇️
.../consensus/tendermint/apps/epochtime_mock/state.go 75.00% <0.00%> (-5.00%) ⬇️
go/storage/mkvs/lookup.go 72.63% <0.00%> (-2.11%) ⬇️
go/oasis-node/cmd/node/control.go 61.11% <0.00%> (-1.39%) ⬇️
go/runtime/registry/registry.go 71.97% <0.00%> (-1.28%) ⬇️
go/consensus/tendermint/full/full.go 62.29% <0.00%> (-0.94%) ⬇️
go/worker/common/committee/group.go 75.45% <0.00%> (-0.91%) ⬇️
... and 20 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b12fefb...8a38e35. Read the comment docs.

@ptrus ptrus force-pushed the ptrus/fix/runtime-client-txs branch 16 times, most recently from 168b924 to c06666b Compare October 26, 2020 12:04
@ptrus ptrus marked this pull request as ready for review October 26, 2020 12:10
go/runtime/client/client.go Outdated Show resolved Hide resolved
go/runtime/client/client.go Outdated Show resolved Hide resolved
go/runtime/client/watcher.go Outdated Show resolved Hide resolved
}
w.toBeChecked = failedBlocks
Copy link
Member

@kostko kostko Oct 26, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there be a (configurable) limit on the size of this list? The question is what should happen once that limit is reached.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure, if anything i think it might make more sense to have a limit on number of transactions being watched. I guess it wouldn't hurt at least logging a warning in case the failedBlocks is non-zero though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this doesn't have anything to do with the number of watched transactions, right? There could be a single transaction and 1000 blocks waiting to be re-checked? And yeah logging sounds like a good idea.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah it's not related. I just think that it's more realistic that the number of transaction will grow big, rather than number of blocks needing to be rechecked, since in case there's actually problem with the storage nodes (the only reason checking blocks can currently fail), there probably won't be any new roothash blocks anyway. Unless the storage nodes are specifically blocking client requests.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh one thing that would be useful is to clear this list in case there are no more pending transactions? Because it doesn't make sense to re-check blocks if all transactions have either completed or expired.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed. Hm wondering if instead should just make checkBlock first check if there's any pending transactions and in case there aren't any just successfully return. Since it also doesn't make sense checking new blocks in case there aren't any pending transactions.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I instead made checkBlock check if there's any pending transactions.

@ptrus ptrus force-pushed the ptrus/fix/runtime-client-txs branch 2 times, most recently from 082e5b5 to 62a3456 Compare October 27, 2020 15:46
@@ -25,9 +29,20 @@ import (
executor "github.com/oasisprotocol/oasis-core/go/worker/compute/executor/api"
)

const (
// CfgMaxTransactionAge is the number of consensus blocks after witch
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I appreciate the seasonal spirit, which.

@ptrus ptrus force-pushed the ptrus/fix/runtime-client-txs branch from 62a3456 to ea002b5 Compare October 28, 2020 13:30
@ptrus ptrus force-pushed the ptrus/fix/runtime-client-txs branch from ea002b5 to c6caac5 Compare October 28, 2020 14:31
@ptrus ptrus force-pushed the ptrus/fix/runtime-client-txs branch from c6caac5 to 8a38e35 Compare October 28, 2020 20:39
@ptrus ptrus merged commit 85c961f into master Oct 28, 2020
@ptrus ptrus deleted the ptrus/fix/runtime-client-txs branch October 28, 2020 21:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Runtime client should retry checking blocks
3 participants