-
Notifications
You must be signed in to change notification settings - Fork 77
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for swapping through Tor. #438
Conversation
e5aacee
to
00f1330
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work!
My comments are a bit all over the place, so I'll summarize things here:
I think we should move the creation of the ephemeral services into the Transport
implementation. This makes everything much more self-contained.
A more self-contained implementation is easier to test and hence allows us to drop the e2e test again (we already have too many IMO). I'd rather not add another one that is identical apart from the transport protocol.
I am not fussed about whether the implementation goes into its own crate (in this workspace) or is just a single module within swap
. What is important to me is that we can test the Tor transport with a trivial protocol like ping
instead of running a full swap.
swap/src/bin/asb.rs
Outdated
let (tor_socks5_port, _ac) = match uac.assert_tor_running().await { | ||
Ok(_) => { | ||
tracing::info!("Tor found. Setting up hidden service. "); | ||
let ac = register_tor_services(config.network.clone().listen, uac).await?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At this point in the program's lifecycle, you can't actually know whether or not you will be listening on these addresses. If the ports f.e. are already taken, we will not be listening on this port and you will be forwarding the traffic to whichever application has the port.
This snippet of code should either run:
- Here in the eventloop
- As part of a new
NetworkBehaviour
through theinject_new_listen_addr
callback - Or as part of the events that are emitted from your custom transport through the
Listener
Like we discussed the other day, the latter would be the cleanest because it makes the whole Tor transport module self-contained.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At this point in the program's lifecycle, you can't actually know whether or not you will be listening on these addresses.
I don't understand why not. You configure your service to listen on these addresses, why would it change?
This snippet of code should either run:
Here in the eventloop
The eventloop
could work as well. However, SwarmEvent::NewListenAddr(addr)
will be fired for each local listening address. We should only care about localhost
/127.0.0.1
.
Like we discussed the other day, the latter would be the cleanest because it makes the whole Tor transport module self-contained.
I played with this thought but disagree with it. I think it would be weird if a network behavior has the side effect of interacting with Tor and adding hidden service.
Personally I think the cleanest is to add hidden services from the outside as we have more information about the listening addresses we care about.
Not everyone might be comfortable with handing over the control port to the application and with this approach, a user can configure the hidden service from the outside of the application.
I used this approach successfully while testing and only added the feature of adding hidden Tor services later on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand why not. You configure your service to listen on these addresses, why would it change?
It is not about whether or not it changes (although things can change: the user could disable their network interface f.e.). It is about whether or not the intent to listen on these addresses actually succeeds. Swarm::listen_on
is asynchronous. Calling the function by itself doesn't do anything other than modifying some state. That state is picked up on the next poll
of the Swarm and the transport is instructed to try and listen on the given multi-address.
You are not actually listening on the interface until the underlying transport emits ListenerEvent::NewAddress
.
However, SwarmEvent::NewListenAddr(addr) will be fired for each local listening address. We should only care about localhost/127.0.0.1.
There is is_loopback()
: https://doc.rust-lang.org/std/net/struct.Ipv4Addr.html#method.is_loopback
I played with this thought but disagree with it. I think it would be weird if a network behavior has the side effect of interacting with Tor and adding hidden service.
I agree that a NetworkBehaviour
would be weird. I am talking about the Transport
though. A Transport
's responsibility is to dial and listen using a certain transport protocol. If you have a Tor-Transport
, it seems perfectly reasonable that the implementation somehow achieves that it is reachable via Tor. One way of doing this is setting up an ephemeral service. There are other ways as well so we will probably need some configuration at some point.
On the point of side-effects: A NetworkBehaviour
as well as a Transport
need to be explicitly included by the user in the configuration. It doesn't appear out of nowhere. As part of choosing to include such a module, I would hope that they read the documentation and are aware of what provided functionality does.
Personally I think the cleanest is to add hidden services from the outside as we have more information about the listening addresses we care about.
There seems to be a misunderstanding here. Like I explained above, you actually have less information about the listen addresses on the outside. Already in-use ports are a problem for example. Another one is wildcard addresses like 0.0.0.0
that only get resolved to concrete interfaces by the kernel once you start listening. A third one might be unsupported multi-addresses like trying to listen on /ip4/0.0.0.0/udp/3000
although you don't have a Udp transport configured.
To summarize, the Transport
implementation has absolutely the most knowledge about what is going on in terms of which interfaces are being listened on etc. For example, the Swarm
can only emit a SwarmEvent::NewListenAddr
because the Transport
emitted a ListenerEvent::NewAddr
event.
Not everyone might be comfortable with handing over the control port to the application and with this approach, a user can configure the hidden service from the outside of the application.
Happy to make it configurable then. From our discussion, I see at least three possible options:
- Don't listen on Tor at all, i.e. just pass through to the underlying TCP transport
- "Listen" on an existing hidden service. This one is a bit weird because effectively, the only thing you can do is emit a
ListenerEvent::NewAddr
and inform the rest of the application that we are listening on a particular onion address. I guess we could do some validation with the Tor node that the user set up Tor correctly to route to the port that we are actually listening on. - Automatically set up one or more hidden service(s) for the addresses we are listening on. Can probably make this configurable to use random keys or existing ones.
In summary, having this code inside the Transport
achieves multiple things:
- It is more correct. Unless you react to events of the
Swarm
about successful / failed listening, you don't know whether or not you are actually listening on an interface. - It makes the functionality reusable because adding the transport is basically plug-and-play.
- Conceptually, a
Transport
is about listening and dialling. Having an implementation do only one of the two is a code-smell. The code-smell occurs again in the code insidemain
that sets up the hidden service that just sits there and doesn't know where it wants to belong. - It makes the functionality independently testable. We can make a swarm that automatically sets up a hidden service, emits the correct listen address and we have another swarm that connects to it. That is how (to my knowledge) all
Transport
s are tested in rust-libp2p.
Hope this clears things up!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I thought about this in more detail and had some insights.
I came around to the view that a Transport
shouldn't just magically do things. It should only do something once the user gave a clear instruction via swarm.listen_on
. For example, a TcpTransport
doesn't do anything unless the user says:
swarm.listen_on(/ip4/0.0.0.0/**tcp**/8080)
Hence, I think we should design the TorTransport
in a way that it only does something if the user calls:
swarm.listen_on(/onion3/vww6ybal4bd7szmgncyruucpgfkqahzddi37ktceo3ah7ngmcopnpyyd:1234")
To make things work in the above case, the Transport
would have to:
- assume that the hidden service is already set up
- use the control port to read information about the hidden service and figure out what port it forwards to locally
- listen on the local tcp port to make sure we receive the traffic coming in at this onion address
This will cover the usecase where the user wants to set up Tor outside of the application.
To make use of ephemeral services, we could take some inspiration from "wildcard" addresses: 0.0.0.0
is not actually a valid IP address and you can't listen on it. It is merely a placeholder that just shares the same syntax.
To convey the intent that the TorTransport
should set up an ephemeral service, we could react to a special onion address in the form of:
swarm.listen_on(/onion3/00000000000000000000000000000000000000000000000000000000:1234")
Given a multi-address like this, the TorTransport
would:
- listen on a TCP port locally
- generate a new key and set up an ephemeral service that forwards to the local TCP port
- emit a
NewListenAddr
event with the generated onion address to inform the rest of the application about it
There are a few more corner cases in regards to choosing ports etc but I think overall, this would be a much better design:
- We clearly separate ourselves from the
TcpTransport
, i.e. aTorTransport
should not react to a/ip4/0.0.0.0/tcp/0
multiaddress being passed toswarm.listen_on
. - We leave it completely up to the user to manage the hidden service themselves or having the application set up a hidden service.
- We can support dialling via Tor without having to listen on Tor.
- We don't have to filter addresses for localhost only: In the case of an ephemeral service, we can explicitly listen on
127.0.0.1
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for writing this up so clearly. Very Interesting idea.
Some thoughts and questions from my side:
[...]
swarm.listen_on(/onion3/vww6ybal4bd7szmgncyruucpgfkqahzddi37ktceo3ah7ngmcopnpyyd:1234")
- assume that the hidden service is already set up
- use the control port to read information about the hidden service and figure out what port it forwards to locally
- listen on the local tcp port to make sure we receive the traffic coming in at this onion address
From my current understand of Tor this feels a bit weird because the application is not listening on this onion address. It's your local Tor instance which does that for you.
The other thing is that I don't see a UX improvement with telling the application to listen on /onion3/vww6ybal4bd7szmgncyruucpgfkqahzddi37ktceo3ah7ngmcopnpyyd:1234
vs telling it to listen on /ipv/127.0.0.1/tcp/1234
.
The user has to configure Tor in the first place knowing that the there is a service listening already or that he will start a service which listens on this address. To be precise, the user will need to add the following to their torrc file:
HiddenServiceDir /usr/local/etc/tor/hidden_service
HiddenServicePort 3000 127.0.0.1:3000
- use the control port to read information about the hidden service and figure out what port it forwards to locally
I would try to avoid any deep interaction with the control port. the control port spec are very verbose and if things fail it's not clear why.
To give you a feeling, checkout the control port spec Section 3.9. GETINFO
from here https://gitweb.torproject.org/torspec.git/tree/control-spec.txt
In particular, these two requests seem to be relevant:
"hs/client/desc/id/<ADDR>"
Prints the content of the hidden service descriptor corresponding to
the given <ADDR> which is an onion address without the ".onion" part.
The client's cache is queried to find the descriptor. The format of
the descriptor is described in section 1.3 of the rend-spec.txt
document.
If <ADDR> is unrecognized or if not found in the cache, a 551 error is
returned.
[New in Tor 0.2.7.1-alpha]
[HS v3 support added 0.3.3.1-alpha]
"hs/service/desc/id/<ADDR>"
Prints the content of the hidden service descriptor corresponding to
the given <ADDR> which is an onion address without the ".onion" part.
The service's local descriptor cache is queried to find the descriptor.
The format of the descriptor is described in section 1.3 of the
rend-spec.txt document.
If <ADDR> is unrecognized or if not found in the cache, a 551 error is
returned.
[New in Tor 0.2.7.2-alpha]
[HS v3 support added 0.3.3.1-alpha]
I'm not even sure what the responds will look like 😅
To make use of ephemeral services, we could take some inspiration from "wildcard" addresses: 0.0.0.0 is not actually a valid IP address and you can't listen on it. It is merely a placeholder that just shares the same syntax.
To convey the intent that the TorTransport should set up an ephemeral service, we could react to a special onion address in the form of:
swarm.listen_on(/onion3/00000000000000000000000000000000000000000000000000000000:1234")
That's an interesting idea. Unfortunately parsing /onion3/00000000000000000000000000000000000000000000000000000000
to a multi address will fail because it is not a valid multi-address format. It expects the onion address to be in base32 format, e.g. oarchy4tamydxcitaki6bc2v4leza6v35iezmu2chg2bap63sv6f2did
.
I understand that you would like to just have a wild_card address here but I wouldn't know how a nice wild_card address could look like.
Imho a better UX would result from asking the user to explicitly state if she wants to have an hidden service registered for her or not.
#[serde(deny_unknown_fields)] | ||
pub struct TorConf { | ||
pub control_port: u16, | ||
pub socks5_port: u16, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't we get this information by talking to the control_port
? Feels redundant to have the user specify this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think we can but it didn't work out of the box. Maybe I misconfigured Tor or I need more rights to read the conf.
It should be as simple as: let socks_port = auc.get_conf("SocksPort").await?;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I undid the latest commit. The initial approach resulted in a better UX.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The initial approach resulted in a better UX.
I am curious, why is that? I had a look at the commit and - as long as it works - it looks like we can successfully reduce the configuration surface?
b5e4fe0
to
0a8c913
Compare
22d44ff
to
03986a7
Compare
131048d
to
ee50da4
Compare
swap/Cargo.toml
Outdated
[dependencies.torut] | ||
branch = "feature-flag-tor-secret-keys" | ||
default-features = false | ||
features = ["v3", "control"] | ||
git = "https://github.com/bonomat/torut/" | ||
version = "0.1" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We temporarily depend on this fork until the PR is merged: teawithsand/torut#10
This allows us to depend on Torut without v2 support and hence without the need for openssl.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might be interested in learning about the [patch]
section inside Cargo.toml
files. Useful to do things like this without having to modify the original manifest.
The good thing is that [patch]
is applied across the whole dependency tree and hence also affects transitive dependencies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cool stuff :)
54e8a2a
to
5d24710
Compare
This PR does a few things. * It adds a TorTransport which either dials through Tor's socks5 proxy or via clearnet. * It enables ASB to register hidden services for each network it is listening on. We assume that we only care about different ports and re-use the same onion-address for all of them. The ASB requires to have access to Tor's control port. * It adds support to dial through a local Tor socks5 proxy. We assume that Tor is always available on localhost. Swap cli only requires Tor to be running so that it can send messages via Tor's socks5 proxy. * It adds a new e2e test which swaps through Tor. For this we assume that Tor is currently running on localhost. All other tests are running via clear net.
5d24710
to
779834d
Compare
779834d
to
d8d072c
Compare
OnionV2 addresses are being deprecated and will be fully phased out on 15.10.2021: https://blog.torproject.org/v2-deprecation-timeline
d8d072c
to
c8e6db2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for putting this up! Good to go in from my side, I don't see any blockers that cannot be fixed in a follow up PR. We might want to adapt some of the messages though as they might be confusing for users.
swap/src/network/tor_transport.rs
Outdated
} | ||
_ => { | ||
tracing::warn!( | ||
"Address {} could not be formatted. Dialling via clear net", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Address {} could not be formatted. Dialling via clear net", | |
"Address {} could not be formatted as onion address. Dialling via clear net", |
This message might be somewhat confusing for a user. Not sure if it should be on warn
level - but given that someone might want to stay anonymous it is better to warn them when going clearnet in case they dial dnsaddr
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding your change: to_address_string
is not formatting it to an onion address but formats it to a string of the format address:port
so that we can dial through Tor.
I agree that this might not be confusing to some and was deciding between failing, i.e. not dialing at all, or just falling back to clear net.
I opted for dialing via clear net in the case we cannot format that address string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good.
The other day we discussed (and to my knowledge agreed) that a design where the user primarily uses the Swarm::listen_on
interface to control Tor would be better.
Do you still plan on doing this?
I would strongly prefer it because it serves as the much better foundation to iterate on as things are more contained. Like we discussed, in an MVP implementation, we don't need to include all the possibilities like user-configured Tor services if what we initially ship always uses ephemeral services anyway.
As such I think it shouldn't be a lot of work. The only thing we need to figure out is how we represent such a "wildcard" onion address.
swap/src/network/tor_transport.rs
Outdated
// Deal with non-onion addresses | ||
let protocols = multi.iter().collect::<Vec<_>>(); | ||
let address_string = protocols | ||
.iter() | ||
.filter_map(|protocol| match protocol { | ||
Protocol::Ip4(addr) => Some(format!("{}", addr)), | ||
Protocol::Ip6(addr) => Some(format!("{}", addr)), | ||
Protocol::Dns(addr) => Some(format!("{}", addr)), | ||
Protocol::Dns4(addr) => Some(format!("{}", addr)), | ||
_ => None, | ||
}) | ||
.collect::<Vec<_>>(); | ||
if address_string.is_empty() { | ||
tracing::warn!( | ||
"Could not format address {}. Please consider reporting this issue. ", | ||
multi | ||
); | ||
return None; | ||
} | ||
|
||
let address_string = address_string | ||
.get(0) | ||
.expect("Valid multiaddr consist out of max 1 address") | ||
.clone(); | ||
|
||
// check for port | ||
let port = protocols | ||
.iter() | ||
.filter_map(|protocol| match protocol { | ||
Protocol::Tcp(port) => Some(format!("{}", port)), | ||
Protocol::Udp(port) => Some(format!("{}", port)), | ||
_ => None, | ||
}) | ||
.collect::<Vec<_>>(); | ||
|
||
match port.len().cmp(&1) { | ||
Ordering::Greater => { | ||
tracing::warn!( | ||
"Did not expect more than 1 port in address {}. Please consider reporting this issue.", | ||
multi | ||
); | ||
Some(address_string) | ||
} | ||
Ordering::Less => Some(address_string), | ||
Ordering::Equal => Some(format!( | ||
"{}:{}", | ||
address_string, | ||
port.get(0) | ||
.expect("Already verified the length of the vec.") | ||
)), | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That looks quite complicated. Any reason you can't just step through the iterator as needed?
let mut protocols = addr.iter();
let address = match protocols.next() {
// extract ip4 etc, fail if non-matching
// do proper error handling on `None`
};
let port = match protocols.next() {
// extract port for tcp / udp etc, fail if non-matching
// do proper error handling on `None`
};
format!("{}:{}", address, port)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, a Result
seems more appropriate if we fail to extract a usable address.
This comment was marked as outdated.
This comment was marked as outdated.
Sorry, something went wrong.
This comment was marked as outdated.
This comment was marked as outdated.
Sorry, something went wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check out the last commit ea6e2f8
(#438) :)
let tor_socks5_port = get_port() | ||
.expect("We don't care about Tor in the tests so we get a free port to disable it."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does it need to be a free port? Is 0
not sufficient?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not try 0
. I did not want by accident chose a port which exists because we try to connect to this port and fail if it is not reachable.
Looking at the base32 alphabet, this should be a valid onion3 address:
I chose |
More options if you don't like
|
Co-authored-by: Daniel Karzel <daniel.karzel@coblox.tech>
…ap into tor-integration
@thomaseizinger : sorry, missed your comment yesterday.
There might have been a misunderstanding then 😅. I thought we agreed that this would be a nice-to-have and an also elegant solution is to treat an application which is only listening (i.e. the Alice's transport) in a way that it does not know about Tor. This allows us to achieve the scenario where the user does not want the application to create a hidden service without needing "wildcard" addresses. In regards to the wildcard address, I don't think it adds that much value atm because the only thing which would change is how the user would configure the ASB.
With that being said, I'd rather move on and focus on what is more urgent, e.g. dynamic transaction fees. |
But from the user's perspective, we do create a hidden service if Tor is available? The user is not going to care about the internal architecture (i.e. whether or not the transport is agnostic to Tor or not), so from their perspective, the application will "make Tor work" if it is there. I see two paths forward:
As an added benefit, the discussed design also results in a much more contained module which I would argue is a sign that this is the way to go.
I think it is important to strike a balance between providing features and adding complexity and in the current state of this PR, I see an imbalance here not only in complexity for the user but also in the form of technical debt which results in complexity that we need to deal with when working with the codebase on a daily basis.
Okay. How do we proceed with this PR then? Close? Move to draft? Someone else taking over? |
This PR is feature complete from both sides, the ASB and the swap cli. It's been reviewed extensively and the feedback and improvements were implemented. The main thing which is still being debated here is where the hidden services should be created. |
bors r+ |
Resolves: #447
This PR does a few things.
Note: It is expected that the new test will fail on CI because we do not have Tor running. That's why it wasn't added yet. I'll play around in my own fork to not use precious CI resources :)
The rest is ready for review.
// edit:
Swapping through Tor on one machine seems to be working:
Asb:
Swap: