
Adding Dandelion++ support to public networks #6314

Open. Wants to merge 1 commit into base: master.
Conversation

vtnerd (Contributor) commented Jan 31, 2020

  • New flag in NOTIFY_NEW_TRANSACTION to indicate stem mode
  • Stem loops detected in tx_pool.cpp
  • Embargo timeout to defend against a blackhole attack during the stem phase
@@ -160,21 +160,25 @@ struct txpool_tx_meta_t
   uint64_t max_used_block_height;
   uint64_t last_failed_height;
   uint64_t receive_time;
-  uint64_t last_relayed_time;
+  uint64_t last_relayed_time; //!< If Dandelion++ stem, randomized embargo timestamp. Otherwise, last relayed timestamp.

vtnerd (Author, Contributor) commented Jan 31, 2020

Reviewers may want to take notice: I hijacked last_relayed_time in mempool to mean "embargo timeout timestamp" when the tx is marked as Dandelion++ stem. There's some padding space that could be used instead, if preferred.

  if (!private_req.txs.empty())
-    get_protocol()->relay_transactions(private_req, source, epee::net_utils::zone::invalid);
+    get_protocol()->relay_transactions(private_req, source, epee::net_utils::zone::invalid, relay_method::local);
  }
  return true;
}

moneromooo-monero (Contributor) commented Feb 1, 2020

That seems to be saying "relay stem txes in fluff mode on a timer". That seems wrong?

vtnerd (Author, Contributor) commented Feb 2, 2020

The protocol says to relay in fluff mode when the embargo timer expires. I don't see any difficulties with mixing other older fluff txes, but maybe I need to think about it more. I do need to add a "sort" before sending fluff txes to hide order received and stem embargo vs fluff. I want to do this in another PR so that it can be discussed separately.

moneromooo-monero (Contributor) commented Feb 2, 2020

I do not see any test for that embargo timer though.

vtnerd (Author, Contributor) commented Feb 4, 2020

The embargo timer is leveraging existing logic in get_relayable_transactions. The filtering logic in that function now checks for the dandelionpp_stem flag in tx_meta, checks the embargo timer, etc. Previously it was filtering on a longer, non-randomized timeout.

The name of the function is poor, because it gives the impression that it will return all transactions in the pool not marked as do_not_relay. But it's actually filtering out newish transactions that don't need to be broadcast again.

another stem node in that situation, a loop over the public
network hasn't been hit yet. */
if (tx_relay == relay_method::stem && meta.dandelionpp_stem)
tx_relay = relay_method::fluff;

moneromooo-monero (Contributor) commented Feb 1, 2020

I read this as "if we get a new tx in stem mode, and we already had it (also in stem mode), then broadcast as fluff". Is that correct? Should it not do the normal thing it does when receiving a new stem tx, whether or not it already has it?

vtnerd (Author, Contributor) commented Feb 2, 2020

Yes, you understood this correctly. The Dandelion++ paper states in a footnote, when describing their analysis implementation: "Our implementation enters fluff mode if there is a loop in the stem". Grin's behavior is identical. Unfortunately the paper isn't specific on what to do when this occurs (it's only mentioned in the footnote), but when calculating the probabilities, one of the scenarios was "the stem loops back to v, which terminates the stem". So I think their analysis already takes this into account.

The detection here is the simplest possible: if we sent a tx as a Dandelion++ stem once, and it's being re-added as a Dandelion++ stem, then mark it as a loop.

Some thoughts I have:

  • Record incoming connection uuid, and only claim loop if the recorded uuid differs? This would allow for re-attempting a stem - since each node has an independent epoch, the path may not be identical on the second attempt and might work the second time.
  • This may not be the most appropriate place (from a purely architecture view) to detect and alter stem loop behavior, but it is the easiest.

moneromooo-monero (Contributor) commented Feb 2, 2020

If it's in the original paper then I guess it's fine. And thinking about it, it's a difference between "Eve can force Alice to switch to fluff early" and "Eve can either sit on the tx and delay it, or fluff it herself", so it doesn't seem too annoying.

tx_relay = relay_method::fluff;
}
else
meta.set_relay_method(relay_method::none);

moneromooo-monero (Contributor) commented Feb 1, 2020

That else is for the "if (existing_tx)", right? It seems indented off.

vtnerd (Author, Contributor) commented Feb 1, 2020

Yikes, a tab probably slipped in here. Will fix.

if (meta.upgrade_relay_method(tx_relay) || !existing_tx) // synchronize with embargo timer or stem/fluff out-of-order messages
{
//update transactions container
meta.last_relayed_time = std::numeric_limits<decltype(meta.last_relayed_time)>::max();

moneromooo-monero (Contributor) commented Feb 1, 2020

Since last_relayed_time is the embargo timer, this seems to say you're changing the embargo timer to "never", instead of what it was the first time you received it. If this is correct, then theoretically the first relay in a stem series can tell whether a node was the originator of a tx by never relaying stem txes except once back to the sender. If the tx is not seen again within a few minutes, then that node is likely the one that originated it.

vtnerd (Author, Contributor) commented Feb 2, 2020

If sending back to the prior node in the stem, the stem loop detection will take precedence and immediately fluff. The last_relayed_time value is only overwritten if the relay_method change goes up in the chain of none->local->stem->fluff->block. The value will be reset in on_transactions_relayed, assuming the transaction makes it to the relay logic code. The only time this is prevented is when relay_method::none, which is indicating this value is invalid anyway.

I probably need to add a filter in cryptonote_protocol_handler.inl that drops transactions in the stem phase on outgoing connections (only incoming connections should be sending stem transactions). This strengthens the protocol a bit (though I'm not sure of any good attacks this prevents).

vtnerd (Author, Contributor) commented Feb 2, 2020

Also, as per your attack, honest stem nodes should behave identically to the originator. So nothing should be gained by sending back. If this is incorrect somewhere, then it definitely needs to be fixed.

{
tx_relay = relay_method::stem;
fluff_txs.reserve(arg.txs.size());
}

moneromooo-monero (Contributor) commented Feb 1, 2020

Are the reserved vectors swapped w.r.t. arg.dandelionpp_fluff?

vtnerd (Author, Contributor) commented Feb 1, 2020

Nope, I botched this. Initially the field was called dandelionpp_stem, but I flipped the logic so that zero initialization of the struct would default to stem mode. And since this can't be easily observed through testing ... anyway will change.

case relay_method::block:
return false;
case relay_method::stem:
tx_relay = relay_method::fluff; // don't set stempool embargo when skipping to fluff

moneromooo-monero (Contributor) commented Feb 1, 2020

Why does this do so without checking the embargo timer?

vtnerd (Author, Contributor) commented Feb 1, 2020

Looks like another tab snuck in here too or something. Will fix that.

This might require a bigger comment to explain. This is purely defensive programming. If someone requests stem mode over i2p/tor, the request cannot be done correctly because on_transactions_relayed (line 799) will put the tx in the stempool, making it eligible for broadcasting over ipv4/6 after an embargo timeout. So this was a quick hack so that "if I2P/Tor zone without white noise":

Requested Mode | New TxPool State
local          | local
stem           | fluff
fluff          | fluff

Perhaps a return false; is better in this situation, indicating it can't work?

Also, if the zone is ipv4/6, both local and stem requests map to Dandelion++ mode.

moneromooo-monero (Contributor) commented Feb 1, 2020

Why would it be a problem to place an incoming tx in stem mode from i2p/tor in the txpool and fluff it after the embargo if not seen again ?

vtnerd (Author, Contributor) commented Feb 6, 2020

A transaction received over i2p/tor is forwarded over public ipv4/6 in stem mode. The issue is that the public interface allows a request to send transaction(s) over i2p/tor in stem mode, which should arguably be rejected (return false). Dandelion++ over i2p/tor requires the mempool to track the "zone" a transaction was received in for the embargo timeout. I'm not aware of anything else blocking Dandelion++ over i2p/tor, but it's probably worth thinking about this mode separately. A sybil actor should already have difficulties when tracing origin IP over i2p/tor. However, Dandelion++ over i2p/tor can get interesting with white noise, since it makes advanced ISP spying more difficult too.

moneromooo-monero (Contributor) commented Feb 6, 2020

I don't understand, and I don't even know which part I don't understand here, but I'm happy enough given this is on purpose and not a mistake :)

rbrunner7 (Contributor) commented Feb 23, 2020

A question: How best to test that "from the outside"? I.e., as someone who runs a Dandelion++ enabled node and wants to see from log output whether some Dandelion++ enabled transactions arrive and are either passed on or are finally accepted into the mempool and re-broadcast (if I understand the protocol correctly)?

I looked over the code but did not find something like a new Dandelion++ related log category. Maybe something like that could be very useful? I imagine only a few log lines at strategic points would already help a great deal in watching a daemon "doing its thing".

vtnerd (Author, Contributor) commented Feb 24, 2020

There is a log of every outgoing p2p message (net.p2p.traffic) which includes the recipient. Command 2002 refers to transaction notifications. A stem should have one send, and a fluff one send for every connection minus 1 (or every connection on a re-send). Logs can be added that partially duplicate those other messages; it might clarify what's happening when multiple transactions are being processed within a short time window (the command # / recipient is already fairly telling).

erciccione (Contributor) commented Feb 24, 2020

@vtnerd I'm going to make a reddit post later today asking people to build this PR and run a testnet node with it, to test Dandelion++. I see this branch is 58 commits behind upstream master, could you rebase?

It shouldn't be an issue anyway, but since people are going to run a testnet node with this branch, it's better to have them run current master + this PR, if possible.

rbrunner7 (Contributor) commented Feb 24, 2020

There is log of every outgoing p2p message (net.p2p.traffic) which includes the recipient. Command 2002 refers to transaction notifications.

I could not find net.p2p.traffic as a log category, but net.p2p.msg which indeed outputs a message NOTIFY_NEW_TRANSACTIONS which has a number of 2002.

On mainnet that leads to quite a large number of messages, where it may be quite difficult to watch out for the communication patterns that you describe. On testnet, most of the time nothing happens regarding transactions, so there it's probably feasible.

@vtnerd vtnerd force-pushed the vtnerd:feature/dandelionpp branch from 7e21376 to 0de790f Feb 25, 2020
vtnerd (Author, Contributor) commented Feb 25, 2020

Rebased and added log statements in the net.p2p.tx category.

erciccione (Contributor) commented Feb 25, 2020

Call for testers: https://www.reddit.com/r/Monero/comments/f9cksh/help_us_test_dandelion_on_testnet_instructions/

(At the moment the post is waiting for manual approval, as it was erroneously considered a request for support by automod. Will be unlocked soon)

rbrunner7 (Contributor) commented Feb 25, 2020

I just tried with the new code, and somehow I could not get it to work: everything worked transaction-wise, i.e. transactions were pooled, broadcast and mined, but I never saw any additional log message using the new category net.p2p.tx.

But my daemon now gives me the following line four times, whether the new log category is set or not:

 2020-02-25 19:45:39.607 E Failed to get tx meta from txpool

Any idea what I might do wrong, or what might be the problem here?

selsta (Contributor) commented Feb 25, 2020

gh-actions functional tests failed.
