Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

I2P/Tor White Noise #5793

Open
wants to merge 2 commits into
base: master
from

Conversation

@vtnerd
Copy link
Contributor

commented Aug 4, 2019

This adds the "white noise" proposed in part 2 of my CCS. This also moves monerod closer to Dandelion++ (there is some code that can be shared between the features). Unfortunately, this does not complete part 2 of the CCS, one of the proposed goals was a separate mempool (also needed for Dandelion++). That will be in a hopefully soon-to-be-released patch - hacking the mempool is worthy of its own review. Also, the timestamp in the p2p protocol hasn't been cleared up (not strictly necessary for the CCS).

Completing Dandelion++ will now require:

  • A separate mempool (in the works for my CCS)
  • Loop detection (basically completed with the separate mempool)
  • Support for sending txs in a stem
  • Moving from a "flood" to a "fluff" implementation (see whitepaper and/or Bitcoin implementation).

The last two should be vastly easier to implement after this patch. The stem portion in particular is basically implemented and unit tested - only plumbing between parts is now needed. Anyone wishing to pick this up, see tests/unit_tests/levin.cpp, the test framework should make it easy to test stem selection, etc.

byte_slice

I also introduced a golang-like "slice" concept for bytes of data so that the "white noise" could be reference-count bumped instead of constantly copied for each connection. It also made it possible to remove a lock for every p2p message being sent. I haven't measured it, buts its all but certain to improve efficiency in tx relaying and eventually block relaying (each p2p message is generated once instead of copied repeatedly).

Changes from CCS

This patch also differs from the proposal in the CCS and my Monerokon talk. Another markdown file may be needed but -

  • Two outbound I2P/Tor connections are selected for white noise every 5 minutes
  • A 3 KiB dummy message or tx fragment is sent every 10-15 seconds over these links
  • An I2P/Tor hidden service that receives a tx immediately relays it over the public network instead of a delay. This is because Dandelion++ is likely preferable instead of an arbitrary delay (Dandelion++ has a stem phase then a fluff phase which has some waiting already).

I think these changes are defensible, the last one is the only controversial one. I changed it because once it was obvious that Dandelion++ would be incorporated into monerod, it was easier to work "within" that design instead of doing something different. I leave it to the CCS overloads on whether funds should be released as a result - it may mean that I have to complete Dandelion++ to complete the task :/

## Header
This header is sent for every Monero p2p message.

> 0 1 2 3

This comment has been minimized.

Copy link
@jtgrassie

jtgrassie Aug 5, 2019

Contributor

Formatting looks off. Recommend formatting as a code block as opposed to quote.

This comment has been minimized.

Copy link
@vtnerd

vtnerd Aug 5, 2019

Author Contributor

Looks great in vim! Will change to code block.

This comment has been minimized.

Copy link
@jtgrassie

jtgrassie Aug 5, 2019

Contributor

Everything look great in vim ;)

This comment has been minimized.

Copy link
@jtgrassie

jtgrassie Aug 5, 2019

Contributor

Note you might want to double check this whole table (e.g. looks like 10 bits per byte here and a couple of the field widths look off, such as command is 4 bytes not 2).

This comment has been minimized.

Copy link
@vtnerd

vtnerd Aug 6, 2019

Author Contributor

Changed to graphical table to 8-bits (that was embarrassing! whoops!).

The Command field was intentionally listed as 2-bytes and the Version field was intentionally listed as 1-byte despite both being 4-bytes in the code. The fields are little-endian with the upper portions unused, so I marked them as reserved. I think that was inappropriate because there isn't a way to use them in a backwards compatible way. An older p2p parser would be interpreting things differently.

Both are being changed shortly.

@@ -49,10 +49,12 @@
#include <boost/asio/ssl.hpp>
#include <boost/array.hpp>
#include <boost/noncopyable.hpp>
#include <boost/shared_ptr.hpp>
#include <boost/shared_ptr.hpp> //! \TODO Convert to std::shared_ptr

This comment has been minimized.

Copy link
@jtgrassie

jtgrassie Aug 5, 2019

Contributor

Is this still needed? code looks to be using std::shared_ptr.

This comment has been minimized.

Copy link
@vtnerd

vtnerd Aug 5, 2019

Author Contributor

There are still some boost::shared_ptr's in this file, so I added this comment.

This comment has been minimized.

Copy link
@jtgrassie

jtgrassie Aug 5, 2019

Contributor

Gotcha

#define CRYPTONOTE_NOISE_EPOCH_RANGE 30 // seconds
#define CRYPTONOTE_NOISE_MIN_DELAY 10 // seconds
#define CRYPTONOTE_NOISE_DELAY_RANGE 5 // seconds
#define CRYPTONOTE_NOISE_BYTES 1*1024 // 3 KiB

This comment has been minimized.

Copy link
@jtgrassie

jtgrassie Aug 5, 2019

Contributor

Should this be 3*1024 ?

This comment has been minimized.

Copy link
@vtnerd

vtnerd Aug 5, 2019

Author Contributor

Yup, forgot to revert this after testing. Will revert.

@fuwa0529

This comment has been minimized.

Copy link
Contributor

commented Aug 6, 2019

Can a hidden service operator just ignore the white noise?

@fuwa0529

This comment has been minimized.

Copy link
Contributor

commented Aug 6, 2019

Ah, nvm. It's the ISP that this white noise is trying to obfuscate.

@fuwa0529

This comment has been minimized.

Copy link
Contributor

commented Aug 6, 2019

I wonder if I can achieve a similar white noise effect, by setting up a hidden service that accepts everything, and piping 3KB of spaces in a shell script to that service, in a while loop.

@vtnerd

This comment has been minimized.

Copy link
Contributor Author

commented Aug 6, 2019

Can a hidden service operator just ignore the white noise?

The white noise is ignored by the hidden service, do you mean refuse inbound connections that are sending white noise? Not with this patch, but is something I have thought about. Runtime or compile time options can be added for this purpose, but this could make it more difficult on outbound users of the feature.

monerod already has rate limiting features for p2p (which may need improving). AFAIK nothing "new" is needed from that perspective.

I wonder if I can achieve a similar white noise effect, by setting up a hidden service that accepts everything, and piping 3KB of spaces in a shell script to that service, in a while loop.

Probably not. The ISP can monitor the Tor cell count being sent (or the I2P equivalent). In many (all?) cases the shell script would have to stop sending at exactly the moment monerod was sending out data so that Tor didn't alter its cell count usage in a given time frame.

@vtnerd vtnerd force-pushed the vtnerd:feature/tor_white_noise branch from 411417a to 3eb8281 Aug 6, 2019

@fuwa0529

This comment has been minimized.

Copy link
Contributor

commented Aug 6, 2019

The question on the ignoring of white noise was a misunderstanding on my part, but thanks for explaining.

On the shell scripting, the way I understand white noise used here is that it only has to obfuscate outgoing transactions made by the user, everything else can be analysed and need not be masked, like block sync, peer table, etc, those won't affect privacy.

But txs are small and rare, so a stream of random packet, the length of which follows the distribution of other txs is enough to mask it.


const auto send = [&txs, &source, pad_txs] (std::pair<const enet::zone, network_zone>& network)
{
if (network.second.m_notifier.send_txs(std::move(txs), source, pad_txs))

This comment has been minimized.

Copy link
@vtnerd

vtnerd Aug 6, 2019

Author Contributor

This should be pad_txs || network.first != enet::zone::public_ to match existing behavior. That way, if the noise feature is disabled on the network, padding is still automatically used.

Will push a change later (or tomorrow).

@vtnerd

This comment has been minimized.

Copy link
Contributor Author

commented Aug 6, 2019

The question on the ignoring of white noise was a misunderstanding on my part, but thanks for explaining.

It can be disabled on the sender side, --proxy tor,127.0.0.1:9050,disable_noise or --proxy tor,127.0.0.1:9050,5,disable_noise - the latter limits outbound connections over Tor to 5 and disables white noise whereas the first uses the default number of max outbound connections.

A CLI method for enabling/disabling the noise on a running monerod may also be useful for knowledgeable users to handle their own anonymity set/timeframe. Something to add in a future patch.

On the shell scripting, the way I understand white noise used here is that it only has to obfuscate outgoing transactions made by the user, everything else can be analysed and need not be masked, like block sync, peer table, etc, those won't affect privacy.

But txs are small and rare, so a stream of random packet, the length of which follows the distribution of other txs is enough to mask it.

I'm not sure what you are saying here. I think you might be referring to the padding feature which pads transactions to the next 1024 byte boundary to make bandwidth analysis harder. If the noise feature is disabled, the padding feature will be enabled (see my own code comment/review). Is that what you mean?

Or do you mean just pumping in noise when you are about to send? See reply above if thats what you meant.

@fuwa0529

This comment has been minimized.

Copy link
Contributor

commented Aug 6, 2019

Thanks @vtnerd. I meant pumping in noise continually, in an external tool. Now that you mentioned padding, with noise, built in is probably better.

return false;
}

temp = std::move(m_fragment_buffer);

This comment has been minimized.

Copy link
@moneromooo-monero

moneromooo-monero Aug 19, 2019

Contributor

I think the standard says the moved object will be destructible-valid, but does not mandate anything else. This means m_fragment_buffer's contents are undefined no ? I expect you intend it to be empty.

This comment has been minimized.

Copy link
@vtnerd

vtnerd Aug 20, 2019

Author Contributor

"A valid but unspecified state". The standard never specifies behavior on moves, so it could copy the data leaving the source unchanged (small string optimization apparently might do this).

Anyway - if the wire format is "valid" then m_fragment_buffer is always cleared when the next fragment begins (which puts the object into a specific state from the unspecified state). Or so I thought - the size() checks above are not valid - the fragment buffer should be empty for that query might it might not be. I will add a clear after this move which will usually be a NOP. Good catch.

{
temp.clear();
m_fragment_buffer = std::move(temp);
}

This comment has been minimized.

Copy link
@moneromooo-monero

moneromooo-monero Aug 19, 2019

Contributor

Is this to save an allocation ? clear never frees ?

This comment has been minimized.

Copy link
@vtnerd

vtnerd Aug 20, 2019

Author Contributor

It is up to the implementation, but every implementation implements the std::vector requirements (never change capacity). The reason is probably to maintain predictable behavior. Always freeing is less performant in many use cases, and sometimes freeing makes it harder for the programmer to optimize around reserve/clear, etc.

It might be worth dropping since its not actually guaranteed. The code is also minimal at least, and all existing implementations currently re-use the memory.

@@ -128,7 +128,7 @@ connection_basic_pimpl::connection_basic_pimpl(const std::string &name) : m_thro
int connection_basic_pimpl::m_default_tos;

// methods:
connection_basic::connection_basic(boost::asio::ip::tcp::socket&& sock, boost::shared_ptr<connection_basic_shared_state> state, ssl_support_t ssl_support)
connection_basic::connection_basic(boost::asio::ip::tcp::socket&& sock, std::shared_ptr<connection_basic_shared_state> state, ssl_support_t ssl_support)

This comment has been minimized.

Copy link
@moneromooo-monero

moneromooo-monero Aug 19, 2019

Contributor

Why did you switch all these boost shared_ptr to std ?

return;

noise_channel& channel = zone_->channels.at(destination_);
assert(channel.strand.running_in_this_thread());

This comment has been minimized.

Copy link
@moneromooo-monero

moneromooo-monero Aug 19, 2019

Contributor

That'd hard kill the deamon, which may not be the best response. Would throwing instead be better ?

This comment has been minimized.

Copy link
@vtnerd

vtnerd Aug 20, 2019

Author Contributor

These are documenting asserts. Its not possible for them to fail, unless the code is altered. These are also never checked in release builds, so it wouldn't abort in released binaries. The idea is that if a change is made, these should fail in the unit tests (provided someone runs them in debug mode).

std::vector<boost::uuids::uuid> out_connections_;

//! \pre Called within `zone->strand`.
static void post(std::shared_ptr<detail::zone> zone)

This comment has been minimized.

Copy link
@moneromooo-monero

moneromooo-monero Aug 19, 2019

Contributor

Since you like micro opts, why are you passing a shared_ptr and not a const shared_ptr&, which saves the atomic inc/dec/test (which is probably not really relevant for speed, but maybe for code size) ? Any advantage doing it this way ?

This comment has been minimized.

Copy link
@vtnerd

vtnerd Aug 20, 2019

Author Contributor

The shared_ptrs are std::moved in both calls to this function, which skips the atomic inc/dec and prevents an indirection (reference/pointer to the internal shared_ptr pointer value). In many cases using const shared_ptr<...>& for a function parameter is probably preferred because moving is often not possible. This is all "internal" cpp code too, which helps because the number of call points are limited.

);

// padding is not useful when using noise mode
const std::string& payload = make_tx_payload(std::move(txs), false);

This comment has been minimized.

Copy link
@moneromooo-monero

moneromooo-monero Aug 19, 2019

Contributor

Reference to a temporary. I have some vague recollection it's a new C++ change that makes this OK. Can you confirm (it might be you who told me about it in the first place) ?

This comment has been minimized.

Copy link
@vtnerd

vtnerd Aug 20, 2019

Author Contributor

Yeah its an obscure C++ rule, C++ temporary binding. Sometimes I do it because if the function is updated to const & then it still compiles+works and no copy is made. It does feel awkward though for some reason. And there is no point right now (and likely ever since its generating a string here), so I will drop it.


void notify::run_epoch()
{
if (!zone_)

This comment has been minimized.

Copy link
@moneromooo-monero

moneromooo-monero Aug 19, 2019

Contributor

Lots of functions check for this. The ctor dereferences without a check. I do not see anything in this class that NULLs the shared_ptr. Is this abundance of caution, or can it actually be NULLed somewhere I did not see ?

This comment has been minimized.

Copy link
@vtnerd

vtnerd Aug 20, 2019

Author Contributor

It can't be NULL'ed anywhere unless ASIO botches a move call internally or invokes the same functor twice. I think ASIO documents that it will never invoke a callback twice, but I cannot find that right statement now. Too pessimistic?

This comment has been minimized.

Copy link
@moneromooo-monero

moneromooo-monero Aug 20, 2019

Contributor

No, it's fine, I was wondering if I'd missed some clever implicit thing.

static_assert(unsigned(enet::zone::invalid) == 0, "invalid expected to be 0");
static_assert(unsigned(enet::zone::public_) == 1, "public_ expected to be 1");
static_assert(unsigned(enet::zone::i2p) == 2, "i2p expected to be 1");
static_assert(unsigned(enet::zone::tor) == 3, "tor expected to be 2");

This comment has been minimized.

Copy link
@moneromooo-monero

moneromooo-monero Aug 19, 2019

Contributor

Those two do not match

This comment has been minimized.

Copy link
@vtnerd

vtnerd Aug 20, 2019

Author Contributor

Yup, going to fix these.

static_assert(unsigned(enet::zone::tor) == 3, "tor expected to be 2");

// check for anonymity networks with noise and connections
for (auto network = ++m_network_zones.begin(); network != m_network_zones.end(); ++network)

This comment has been minimized.

Copy link
@moneromooo-monero

moneromooo-monero Aug 19, 2019

Contributor

This implicitely skips over the public zone, correct ? That seems fairly vulnerable to adding another public zone (if we, say, add GSM based connections somewhere, not that we'd do that but for the sake of example).

This comment has been minimized.

Copy link
@vtnerd

vtnerd Aug 20, 2019

Author Contributor

Ok, so to rephrase so that it makes more sense to me - if someone added a public type network with an enum value of 4, then it would be treated incorrectly as an anonymity network. I will add a break in here if the zone "exceeds" tor since there I can't think of a static_assert that would work. Anyone adding an anonymity network should see that its not using it during testing and adjust.

This comment has been minimized.

Copy link
@moneromooo-monero

moneromooo-monero Aug 20, 2019

Contributor

Maybe some templatey C++y thing like...

template<address_type t> struct is_public { static const bool value = false; }
template<> struct is_public<public_> { static const bool value = true; }

Or maybe even don't have a default so additions have to specify one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.