Connect P2p with TxPool #590
Conversation
Seems a lot of tests are failing now, do you know why yet @bvrooman? I am seeing a lot of timeouts.
It appears that the difficulty we are encountering is caused by the different configurations created by the combinations of feature flags. I'm mapping out the channels in a draw.io doc here. This map is not complete, and it may not be completely accurate thus far. From this map, we can observe that certain channels will have a different set of senders/receivers depending on which services are enabled. When running tests with all features enabled, including P2P and Relayer, tests pass. However, we can see that without P2P, some senders are removed from the system and some signals are never passed to receivers. This causes some channels to wait indefinitely, producing the timeouts. We need to make sure that tests pass for all combinations of the feature matrix.
Right now, the code is written as if all services are enabled, and tests fail when P2P is disabled. It is clear from reading the code and mapping out the connections that using channels at this volume introduces a lot of complexity and potential for error.
I will try to reconfigure so we can get all feature combos working, although realistically we should just disable tx gossiping if both features aren't enabled. Maybe we could do something like wrapping gossip under a meta feature?
FYI - I don't have permission to open that draw.io link. I think we should try to find the simplest way to enable this for now without coming up with a perfect solution to the channels issue, since we have a pending initiative to redesign our service architecture and reduce our reliance on channels where possible.
I propose a couple of options:
I am pushing a commit to implement the second option. It is quick and dirty, but it will unblock the tests.
… into controlc/p2p_tx
let node_two = FuelService::new_node(node_config).await.unwrap();
let client_two = FuelClient::from(node_two.bound_address);

tokio::time::sleep(Duration::new(3, 0)).await;
TODO: We can replace the sleep calls with polling for peers once that functionality is enabled. This will be tracked in a separate issue here: #649.
let tx = client_one.transaction(&result.0.to_string()).await.unwrap();
assert!(tx.is_some());

tokio::time::sleep(Duration::new(3, 0)).await;
TODO: Same as above; we will replace sleep with polling once it is available.
Brief review
tx: tx.clone(),
status: TxStatus::Submitted,
});
let _ = network_sender
The insert method below does the same thing except for this part. Can we move the common logic to a separate function and pass a closure that broadcasts in the p2p case?
Maybe rather than complicating things with a closure, this method could just call the one below, iterate over the Vec<Result<Arc<Transaction>>> return value, and then trigger a broadcast for each successful one.
> Maybe rather than complicating with a closure, this method could just call the one below, iterate over the Vec<Result<Arc<Transaction>>> return type and then trigger a broadcast for each successful one.
We originally proposed this but changed it in favour of a more "performant" approach, at the cost of the extra code. But I think this approach is the simplest and cleanest, and I prefer the cleaner approach to the performant one in this case.
The drawback is:
- Double the number of iterations
But the benefits are:
- No extra method or code duplication
- Easier to read and reason about
I'll opt for that instead and push a commit for that change.
We should probably at least leave a note for the future, though, in case future txpool optimizations need areas to improve.
Draft PR based on this branch: #670
match new_transaction.unwrap() {
    TransactionBroadcast::NewTransaction(tx) => {
        let txs = vec![Arc::new(tx)];
        TxPool::insert(txpool, db.as_ref().as_ref(), tx_status_sender, txs).await
You must also use insert_with_broadcast if p2p is enabled.
Also, it means that this case is not tested. Do we plan to test it in this PR or in a follow-up?
This shouldn't be done unless we change the current gossip logic to only broadcast to immediate peers (depth 1). Even with a depth of 1, we risk bouncing broadcasts back and forth in an infinite loop. Right now, any gossiped message is propagated to the whole network. We have other P2P tasks to investigate how to prevent nodes from re-gossiping invalid data.
Yes, it may not be clear from a cold read, but this branch is executed when new transactions are received from gossip. In this case, we do not want to echo the same transaction back to the network with a broadcast.
fuel-txpool/src/service.rs (outdated)
pub incoming_tx_receiver: broadcast::Receiver<TransactionBroadcast>,
#[cfg(feature = "p2p")]
pub network_sender: mpsc::Sender<P2pRequestEvent>,
Maybe we can already introduce the traits we discussed on the call in this PR? You can represent P2P as Gossiper and TransactionProvider traits. That would aggregate incoming_tx_receiver and network_sender into one field, making the code cleaner. It seems easy to do because the TxPool already knows nothing about P2P (based on the imports in Cargo.toml).
I'd prefer to move forward with the current PR, as it's been in progress for a long time and we should checkpoint this progress; we can do the channel/trait conversion in subsequent PRs.
}
for (ret, tx) in res.iter().zip(txs.into_iter()) {
    match ret {
        Ok(removed) => {
In the current code, we check that transactions were in the transaction pool, but we don't check that they have been committed in a block.
With the introduction of P2P gossiping, it seems we need that check; otherwise, we will gossip about already-committed transactions.
I think txpool cleanup should be deferred to the block importer/sync task. Without that in place, it doesn't make much sense to implement this yet.
* cleaned up feature flags
* fmt
* fix leak
* fmt again
LGTM!
The service setup with channels looks painful, but luckily we have a plan to address this soon.
The PR in draft can come after this one.
Connects P2p with TxPool, which should allow transactions inserted into the txpool of one node to be broadcast to the others. Done in collaboration with @bvrooman, but this work had gone stale, so I had to re-implement it over master since the merge conflicts and git history had grown too messy.
Closes #477
Closes #478