Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Recover transaction pool on light client #3833

Merged
merged 48 commits into from Nov 28, 2019
Merged

Recover transaction pool on light client #3833

merged 48 commits into from Nov 28, 2019

Conversation

@svyatonik
Copy link
Contributor

svyatonik commented Oct 16, 2019

It has been broken since #3538

TLDR: this PR introduces 'generic' light tx pool, which should work on all chains. There's a way to override different parts of this 'generic' implementation with chain-specific implementations. The chain specific implementation could, sometimes, be better than generic.

PART 1. What needs to be implemented.

Basically, it is:
1.1) validation of transactions on light client;
1.2) removal of 'mined' transactions on light client;
1.3) removal of 'invalid' transactions on light client;
1.4) moving transactions between 'ready' and 'future' queues on light client.

PART 2. Validation of transactions on light client.

Options that I have considered:
2.1) do not validate transactions at all - just broadcast to full nodes
2.1.PROS) fastest option;
2.1.CONS) I assume the submitter wants to know if his transaction is valid asap + it is impossible to prune invalid transactions from the pool just by looking at the blocks (it will never be included);
2.2) require specific implementation for every chain:
2.1.PROS) chain-specific implementation could be simpler/faster than generic. Like we could use proof-of-read of account nonce instead of proof-of-call of fn TaggedTransactionQueue::validate_transaction(). Or we might have a chain where validation could be performed on full node, if all required information is available from block headers;
2.2.CONS) it isn't always possible to write better than generic validation + writing specific implementation is an optimization task + we might want to get slow, but functional light client in the beginning;
2.3) provide generic implementation, but allow overriding it by chain-specific implementations:
2.3.PROS) easy-to-start + might be optimized later;
2.3.CONS) requires proof-of-call from remote nodes

So I've chose the option 2.3. In the code it is in the LightChainApi. The specific implementation could be provided when transaction pool is created.

Part 3. Removal of 'mined' transactions on light client.

Options that I have considered:
3.1) chain-specific removal. E.g. we can fetch proof-of-read of account nonce from remote node and prune all transactions with nonce < than provided
3.1.PROS) may be faster if we could figure out whether tx has been included from the header
3.1.CONS) the same as in 2.2.CONS
3.2) generic removal by fetching proof-of-inclusion
3.2.PROS) -
3.2.CONS) should be fast enough for most of implementations

I've chose the option 3.2. But in this PR there's no proof-of-inclusion - instead we fetch bodies of all transactions for every block when txpool is not empty and prune these transactions from the pool. If PR will be accepted, I'll file an issue about this optimization.

Part 4 and 5. Moving transactions between 'future', 'ready' and 'invalid' states.

First of all, I've made a controversial decision that on light client we do not accept transactions to the 'future' queue. That's because it is impossible to know whether transaction should be moved to the 'ready' queue without revalidation. And I wanted to avoid much revalidaton calls, because they're quite heavy in 'generic' implementation. Also - 'ready'/'future' state depends on transactions that are in transaction pool. Like the same transaction could be moved to 'ready' queue if tags that it requires are satisfied by other transactions from the 'ready' queue. At the same time, on other node it could be moved to the 'future' queue if the 'satisfying' transaction is missing. That said, I'm not sure - whether this restriction makes sense, or not. So any opinion/advice is appreciated.

So after this restriction, we only have 'ready' and 'invalid' states. And the only possible transition is from 'ready' to 'invalid' state. Iiuc, on full client, this transition is detected when some of mined transactions are providing the same tags => then the 'ready' transaction is revalidated. On light client it isn't a good idea to revalidate every transaction from every block (just to receive tags that it provides). So the (not best, but functional) solution is to perform revalidation at some intervals. By default, all ready transactions are revalidated every 60 seconds or every 20 blocks (whatever happens first). These constants can be overrided. And the whole mechanism could be overrided by passing chain-specific 'TranasctionPoolMaintainer' implementation to fn ServiceBuilder::with_transaction_pool_maintainer().

Part 6, unrelated. Accounts RPC.

When I was testing this PR, I found that "Accounts RPC methods" from node are (now?) required to submit transaction to the pool. So I've also fixed it in this PR. The fix is localized in the node/rpc/src/* and additional argument is added to the fn ServiceBuilder::with_rpc_extensions(). But if required, I could extract this fix to the separate PR.

core/service/src/builder.rs Outdated Show resolved Hide resolved
core/service/src/builder.rs Outdated Show resolved Hide resolved
core/service/src/pool_maintainer.rs Outdated Show resolved Hide resolved
core/transaction-pool/graph/src/base_pool.rs Outdated Show resolved Hide resolved
core/transaction-pool/graph/src/base_pool.rs Outdated Show resolved Hide resolved
core/transaction-pool/graph/src/validated_pool.rs Outdated Show resolved Hide resolved
core/transaction-pool/graph/src/validated_pool.rs Outdated Show resolved Hide resolved
fetcher.remote_call(RemoteCallRequest {
block: best_hash,
header: best_header,
method: "AccountNonceApi_account_nonce".into(),

This comment has been minimized.

Copy link
@tomusdrw

tomusdrw Oct 21, 2019

Contributor

Hmm, that seems very fragile, can't we store that on the trait level as consts to avoid harcoding strings like this in the code?

This comment has been minimized.

Copy link
@tomusdrw

tomusdrw Oct 21, 2019

Contributor

We should at least add a test that this is the same as the runtime API.

This comment has been minimized.

Copy link
@svyatonik

svyatonik Oct 22, 2019

Author Contributor

I've made a separate issue for that - #3875

let remote_validation_request = self.fetcher.remote_call(RemoteCallRequest {
block,
header,
method: "TaggedTransactionQueue_validate_transaction".into(),

This comment has been minimized.

Copy link
@tomusdrw

tomusdrw Oct 21, 2019

Contributor

Same as with other grumble. We really need to derive these strings from the declaration/implementation.

core/transaction-pool/src/api.rs Outdated Show resolved Hide resolved
@bkchr bkchr closed this in #3866 Oct 22, 2019
@bkchr

This comment has been minimized.

Copy link
Contributor

bkchr commented Oct 22, 2019

(this pr was accidently linked in another one as being fixed)

@bkchr bkchr reopened this Oct 22, 2019
svyatonik and others added 7 commits Oct 22, 2019
Co-Authored-By: Tomasz Drwięga <tomusdrw@users.noreply.github.com>
@svyatonik

This comment has been minimized.

Copy link
Contributor Author

svyatonik commented Oct 23, 2019

So I've made a new TransactionPool trait. However, I haven't introduced two implementations - FullTransactionPool and LightTransactionPool. This is because I want to provide ability to customize PoolApi and Maintainer for (at least) light nodes (see notes about this in description). And when I made LightTransactionPool and FullTransactionPool generic over PoolApi and Maintaner, the implementation was exactly the same.

Main changes are - tx pool clients are now generic over TransactionPool instead of ChainApi. Also maintain procedure is now a part of the pool implementation itself.

I had to wrap all the futures that TransactionPool returns in the Box because otherwise too many impl details are to be exposed (also to avoid multi-multi-lines Futures declarations) and I'm unable to return impl Future from the trait, obviously. Also there's a bit more methods than simply submit(), ready() and maintain() - it appears that we need to subscribe to pool transactions (hence import_notification_stream), we need to report poll status to telemetry (hence status), ...

After doing this, I'm not sure that traitifying TransactionPool was a perfect idea - we still have a single implementation (BasicTransactionPool instead of txpool::Pool) that is generic over PoolApi and Maintainer. What do you think on that?

tomusdrw and others added 3 commits Nov 23, 2019
@svyatonik

This comment has been minimized.

Copy link
Contributor Author

svyatonik commented Nov 25, 2019

@tomusdrw Thanks for merging with master! I've resolved another bunch of conflicts - mostly caused by #4145 . The main change is that now offchain::TransactionPool is implemented for txpool_api::MaintainableTransactionPool (before it has been implemented for transaction_graph::Pool). It would be better to have blanket impl, but TransactionPool traits are in different crates && I can't move common trait TransactionPool to primitives, because it depends on sr_primitives.

(note that corresponding PR in polkadot isn't opened until this PR is approved)

tomusdrw added 10 commits Nov 25, 2019
Copy link
Contributor

tomusdrw left a comment

All issues should addressed now, it's ready for another review :)
(Pretty please 🙏 )

@gavofyork gavofyork added this to the polkadot-0.7.0 milestone Nov 27, 2019
@gavofyork gavofyork merged commit 2ffaf05 into master Nov 28, 2019
13 of 14 checks passed
13 of 14 checks passed
continuous-integration/gitlab-check_polkadot Build stage: build; status: failed
Details
continuous-integration/gitlab-cargo-check-benches Build stage: test; status: success
Details
continuous-integration/gitlab-cargo-check-subkey Build stage: test; status: success
Details
continuous-integration/gitlab-check-line-width Build stage: test; status: success
Details
continuous-integration/gitlab-check-runtime Build stage: test; status: success
Details
continuous-integration/gitlab-check-web-wasm Build stage: test; status: success
Details
continuous-integration/gitlab-check_warnings Build stage: build; status: success
Details
continuous-integration/gitlab-node-exits Build stage: test; status: success
Details
continuous-integration/gitlab-test-dependency-rules Build stage: test; status: success
Details
continuous-integration/gitlab-test-frame-staking Build stage: test; status: success
Details
continuous-integration/gitlab-test-full-crypto-feature Build stage: test; status: success
Details
continuous-integration/gitlab-test-linux-stable Build stage: test; status: success
Details
continuous-integration/gitlab-test-linux-stable-int Build stage: test; status: success
Details
continuous-integration/gitlab-test-wasmtime Build stage: test; status: success
Details
@gavofyork gavofyork deleted the light_tx_pool3 branch Nov 28, 2019
@svyatonik svyatonik mentioned this pull request Nov 28, 2019
nodebreaker0-0 added a commit to nodebreaker0-0/substrate that referenced this pull request Nov 28, 2019
# This is the 1st commit message:

# This is a combination of 3 commits.
# This is the 1st commit message:

# This is a combination of 2 commits.
# This is the 1st commit message:

# This is a combination of 4 commits.
# This is the 1st commit message:

Recover transaction pool on light client (paritytech#3833)

* recover tx pool on light client

* revert local tests fix

* removed import renamings

* futures03::Future -> std::future::Future

* Update core/transaction-pool/graph/src/error.rs

Co-Authored-By: Tomasz Drwięga <tomusdrw@users.noreply.github.com>

* replace remove_from_ready with remove_invalid

* avoid excess hashing

* debug -> warn

* TransactionPool + BasicTransactionPool

* pause future tx reject when resubmitting

* bump impl_version to make CI happy

* and revert back local test fixes

* alter doc to restart CI

* Transaction::clone() -> Transaction::duplicate()

* transactions -> updated_tranasctions

* remove explicit consensus-common ref

* ::std:: -> std::

* manual set/unset flag -> calling clusore with given flag value

* removed comments

* removed force argument

* BestIterator -> Box<Iterator>

* separate crate for TxPool + Maintainer trait

* long line fix

* pos-merge fix

* fix benches compilation

* Rename txpoolapi to txpool_api

* Clean up.

* Finalize merge.

* post-merge fix

* Move transaction pool api to primitives directly.

* Consistent naming for txpool-runtime-api

* Warn about missing docs.

* Move  abstraction for offchain calls to tx-pool-api.

* Merge RPC instantiation.

* Update cargo.lock

* Post merge fixes.

* Avoid depending on client.

* Fix build

edit option

environment block num - test

metrics fn push and finalized listing

best_block_num listing

base-code(rebased)

# This is the commit message #2:

README.md add

# This is the commit message #3:

README.md add

# This is the commit message #4:

README.md

# This is the commit message #2:

README.md

# This is the commit message #2:

README.md update

# This is the commit message #3:

README.md update

# This is the commit message #2:

README.md update

# This is the commit message #3:

README.md

# This is the commit message #4:

README.md

# This is the commit message paritytech#5:

README.md

# This is the commit message paritytech#6:

translation of introduction
@tomusdrw tomusdrw mentioned this pull request Dec 2, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants
You can’t perform that action at this time.