Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Verify incoming TransmissionResponses #3083

Closed
wants to merge 5 commits into from

Conversation

raychu86
Copy link
Contributor

Motivation

This PR performs transmission verification when processing transmissions from TransmissionResponses. Previously we were blindly accepting them into the batch, which could lead to possible halting cases, where malicious nodes inject faulty transactions.

This should fix:
#2990
#2991

@vicsn
Copy link
Collaborator

vicsn commented Feb 10, 2024

I believe this risks OOMing the validator. I know of two paths where we request transmissions:

  • from pings
  • from batch_proposals

We could receive many pings in parallel, and a single batch_proposals transmissions are retrieved in parallel as well. The maximum RAM to verify a 1M constraint program is 2GB. So 32-64 transmissions in the worst case and we're out.

For pings I'd imagine it would be better to follow the path of UnconfirmedTransaction messages, which will add them to the mempool queue. The mempool queue currently serially processes transactions by a single worker so should not OOM.

For batch_proposals, pushing them to the back of the mempool queue might cause significant delays, so I was thinking we should add a second separate simple mempool queue for batch_proposals.

@@ -363,7 +363,7 @@ impl<N: Network> Worker<N> {
self.spawn(async move {
while let Some((peer_ip, transmission_response)) = rx_transmission_response.recv().await {
// Process the transmission response.
self_.finish_transmission_request(peer_ip, transmission_response);
self_.finish_transmission_request(peer_ip, transmission_response).await;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to process the transmission response entries in the order they arrive? if not, we could spawn tasks that performs this action, in order to be able to perform multiple of them concurrently

node/bft/src/worker.rs Outdated Show resolved Hide resolved
@@ -182,16 +182,13 @@ impl<N: Network, C: ConsensusStorage<N>> LedgerService<N> for CoreLedgerService<
(TransmissionID::Transaction(expected_transaction_id), Transmission::Transaction(transaction_data)) => {
match transaction_data.clone().deserialize_blocking() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's not ideal that we do blocking deserialization in an async context; we should either do it before calling check_transaction_basic (with a deserialized transaction), or use a blocking task here

@@ -201,16 +198,13 @@ impl<N: Network, C: ConsensusStorage<N>> LedgerService<N> for CoreLedgerService<
(TransmissionID::Solution(expected_commitment), Transmission::Solution(solution_data)) => {
match solution_data.clone().deserialize_blocking() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Copy link
Collaborator

@ljedrz ljedrz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few comments; LGTM in general, but there are some performance considerations.

Copy link
Contributor

@randomsleep randomsleep left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

check_transmission_basic is too strict for all TransmissionResponse. It will cause liveness issues. For example, it's common that a valid Certificate contains a transaction that tries to consume a serial number that is already consumed. With this patch, the validator will fail to sync the transaction and thus will fail to fetch the Certificate.

The validator should perform some more basic checks (e.g. exclude Fee transaction, invalid format) before signing BatchPropose. It should not reject the transaction if already included in a Certificate.

raychu86 and others added 2 commits February 13, 2024 10:29
Co-authored-by: ljedrz <ljedrz@users.noreply.github.com>
Signed-off-by: Raymond Chu <14917648+raychu86@users.noreply.github.com>
@vicsn
Copy link
Collaborator

vicsn commented Feb 13, 2024

@randomsleep can you elaborate?

For example, it's common that a valid Certificate contains a transaction that tries to consume a serial number that is already consumed.

This sounds like something which should be invalid.

The validator should perform some more basic checks (e.g. exclude Fee transaction, invalid format) before signing BatchPropose. It should not reject the transaction if already included in a Certificate.

This PR introduces calling check_transaction_basic on every transaction before signing batch proposals, so if all goes well we will only include valid transactions as Certificate and they won't be at risk of further rejection perse.

@raychu86
Copy link
Contributor Author

I believe @randomsleep might be correct that we may not need to verify the transmissions in the BatchCertificate, because it may prevent us from syncing certificates that are included in a subsequent subdag. Perhaps these transactions/double-spends were going to be aborted in the block; instead, we forcefully ignore ignore transmissions which prevents us from syncing up with the certificate. The validator would then get stuck until is starts syncing via blocks. Definitely needs more thought.

However, we definitely do need to verify the transmissions from the proposals.

@randomsleep
Copy link
Contributor

Yes, @raychu86 have explained.

However, while checking 'BatchPropose', we should also allow duplicated transactions. This is because at the time some validarors may have committed some transactions but others haven't.

@randomsleep
Copy link
Contributor

My thoughts on transaction checking rules:

  1. Before proposing my own 'BatchPropose': do all kinds of necessary check. Prevent attack from users.
  2. While checking incoming 'BatchPropose': only do basic check. The check should be independent from latest ledger state because validators may have different state. Prevent attack from malicious validators.
  3. While adding 'Certificate': no check.
  4. While commiting 'Subdag': Make sure bad transactions won't block executing.

@raychu86
Copy link
Contributor Author

raychu86 commented Mar 1, 2024

This should be replaced by AleoNet/snarkVM#2376 and #3104.

@raychu86 raychu86 marked this pull request as draft March 1, 2024 06:00
@raychu86 raychu86 closed this Mar 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants