feat: remove derp regions in favor of direct urls #1831

Merged
merged 28 commits into from
Dec 18, 2023

Conversation

Contributor

@dignifiedquire dignifiedquire commented Nov 23, 2023

Breaking Change: This change removes the notion of derp regions, and simplifies it to derpers, which are identified by their URL.
Currently the URL is considered fully, including all path and query params. This might change later depending on usage.

Contributor

@flub flub left a comment

I'm a bit surprised at the interpretation of a derp URL here. If I understand things correctly this currently uses the URL to contact the DERP server, so it only ever uses the hostname part of the URL. I guess I was expecting something more like a defined URL format:

https://hostname_or_ip?stun_only=bool&stun_port=u16

In that case the DerpNode would store a hostname_or_ip field and have a ::from_url() constructor.

Currently it seems like the URL, which could carry all sorts of metadata, is only used as a hostname and all the other fields of its type are ignored.

I also wonder if a lot of the code that currently takes a region_id, or in this PR derp_url, should take an Arc<DerpNode> instead. Though that's probably a question of only secondary importance.
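
A minimal sketch of the constructor suggested in this comment, assuming the `https://hostname_or_ip?stun_only=bool&stun_port=u16` format proposed above. The names `hostname_or_ip`, `stun_only`, `stun_port` and `from_url` come from the comment; `DEFAULT_STUN_PORT`, the `String` error type and the hand-rolled parsing are assumptions made to keep the example dependency-free. Real code would more likely start from an already parsed `Url`, as this PR does.

```rust
/// Hypothetical stand-in for the PR's `DerpNode`; the field names follow the
/// review comment, not the merged code.
#[derive(Debug, PartialEq)]
struct DerpNode {
    hostname_or_ip: String,
    stun_only: bool,
    stun_port: u16,
}

/// Assumed default: 3478 is the standard STUN port.
const DEFAULT_STUN_PORT: u16 = 3478;

impl DerpNode {
    /// Parses `https://hostname_or_ip?stun_only=bool&stun_port=u16`.
    /// Everything between the scheme and the `?` is treated as the host.
    fn from_url(url: &str) -> Result<Self, String> {
        let rest = url
            .strip_prefix("https://")
            .ok_or_else(|| format!("unsupported scheme in {url}"))?;
        let (host, query) = match rest.split_once('?') {
            Some((host, query)) => (host, query),
            None => (rest, ""),
        };
        let host = host.trim_end_matches('/');
        if host.is_empty() {
            return Err("missing host".to_string());
        }
        let mut node = DerpNode {
            hostname_or_ip: host.to_string(),
            stun_only: false,
            stun_port: DEFAULT_STUN_PORT,
        };
        // Unknown parameters are rejected so that typos do not pass silently.
        for pair in query.split('&').filter(|p| !p.is_empty()) {
            let (key, value) = pair
                .split_once('=')
                .ok_or_else(|| format!("malformed parameter {pair}"))?;
            match key {
                "stun_only" => {
                    node.stun_only = value
                        .parse::<bool>()
                        .map_err(|e| format!("stun_only: {e}"))?;
                }
                "stun_port" => {
                    node.stun_port = value
                        .parse::<u16>()
                        .map_err(|e| format!("stun_port: {e}"))?;
                }
                other => return Err(format!("unknown parameter {other}")),
            }
        }
        Ok(node)
    }
}
```

With this shape, `DerpNode::from_url("https://derp.example.com?stun_port=3479")` would carry the non-default STUN port in the URL itself, while a bare `https://derp.example.com` falls back to the assumed defaults.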

Contributor

link2xt commented Nov 26, 2023

Authentication tokens that are not passed around in tickets but are used to authenticate with your DERP server could also become part of the URL later, so it's nice that this is extensible (see the TURN REST API and the README at https://github.com/coturn/coturn).

Contributor

flub commented Nov 29, 2023

I'm rather stuck on what the problem with the pingers is. The relevant log is this:

2023-11-29T12:02:03.926900Z DEBUG netcheck.actor: iroh_net::netcheck: netcheck actor starting
2023-11-29T12:02:03.927049Z DEBUG netcheck.actor:stun_udp_listener{local_addr="0.0.0.0:46578"}: iroh_net::netcheck: udp stun socket listener started
2023-11-29T12:02:03.927090Z DEBUG netcheck.actor:stun_udp_listener{local_addr="[::]:32794"}: iroh_net::netcheck: udp stun socket listener started
2023-11-29T12:02:03.927968Z DEBUG netcheck.actor:reportgen.actor: iroh_net::netcheck::reportgen: reportstate actor starting port_mapper=false skip_external_network=false
2023-11-29T12:02:03.961709Z DEBUG run_probe{probe=Ipv4 after 0ns to http://127.0.0.1:42703/}: iroh_net::netcheck::reportgen: starting probe
2023-11-29T12:02:03.961971Z DEBUG run_probe{probe=Ipv4 after 0ns to http://127.0.0.1:42703/}: iroh_net::netcheck::reportgen: sending probe StunIpv4 derp_addr=127.0.0.1:42703 send_res=Ok(28) txid=transaction id (0xFB4DF9BB9CFDDD84812F3606)
2023-11-29T12:02:04.063210Z DEBUG run_probe{probe=Ipv4 after 100ms to http://127.0.0.1:42703/}: iroh_net::netcheck::reportgen: starting probe
2023-11-29T12:02:04.063388Z DEBUG run_probe{probe=Ipv4 after 100ms to http://127.0.0.1:42703/}: iroh_net::netcheck::reportgen: sending probe StunIpv4 derp_addr=127.0.0.1:42703 send_res=Ok(28) txid=transaction id (0xFAA6C59C3713CD6AFB95B29E)
2023-11-29T12:02:04.163043Z DEBUG netcheck.actor:reportgen.actor:captive-portal: reqwest::connect: starting new connection: http://127.0.0.1/
2023-11-29T12:02:04.163111Z DEBUG netcheck.actor:reportgen.actor:captive-portal: hyper::client::connect::http: connecting to 127.0.0.1:80
2023-11-29T12:02:04.163248Z DEBUG run_probe{probe=Ipv4 after 200ms to http://127.0.0.1:42703/}: iroh_net::netcheck::reportgen: starting probe
2023-11-29T12:02:04.163450Z DEBUG run_probe{probe=Ipv4 after 200ms to http://127.0.0.1:42703/}: iroh_net::netcheck::reportgen: sending probe StunIpv4 derp_addr=127.0.0.1:42703 send_res=Ok(28) txid=transaction id (0xFC30650F62D8BACB7BE17CEB)
2023-11-29T12:02:04.866379Z  WARN netcheck.actor:reportgen.actor: iroh_net::netcheck::reportgen: check_captive_portal error: error sending request for url (http://127.0.0.1/generate_204): error trying to connect: tcp connect error: Connection refused (os error 111)
2023-11-29T12:02:05.301265Z DEBUG run_probe{probe=Https after 300ms to http://127.0.0.1:42703/}: iroh_net::netcheck::reportgen: starting probe
2023-11-29T12:02:05.301319Z DEBUG run_probe{probe=Icmp after 300ms to http://127.0.0.1:42703/}: iroh_net::netcheck::reportgen: starting probe
2023-11-29T12:02:05.301346Z DEBUG run_probe{probe=Https after 400ms to http://127.0.0.1:42703/}: iroh_net::netcheck::reportgen: starting probe
2023-11-29T12:02:05.301369Z DEBUG run_probe{probe=Icmp after 400ms to http://127.0.0.1:42703/}: iroh_net::netcheck::reportgen: starting probe
2023-11-29T12:02:05.301391Z DEBUG run_probe{probe=Https after 500ms to http://127.0.0.1:42703/}: iroh_net::netcheck::reportgen: starting probe
2023-11-29T12:02:05.301413Z DEBUG run_probe{probe=Icmp after 500ms to http://127.0.0.1:42703/}: iroh_net::netcheck::reportgen: starting probe
2023-11-29T12:02:05.301962Z DEBUG iroh_net::netcheck::reportgen: probe set aborted: no derp node addr: not implemented probe=Https { delay: 300ms, node: DerpNode { url: http://127.0.0.1:42703/, stun_only: true, stun_port: 42703 } }
2023-11-29T12:02:05.302090Z DEBUG run_probe{probe=Icmp after 300ms to http://127.0.0.1:42703/}: iroh_net::netcheck::reportgen: ICMP ping start to 127.0.0.1:42703 with payload len 23 - derp http://127.0.0.1:42703/
2023-11-29T12:02:05.302115Z DEBUG run_probe{probe=Icmp after 300ms to http://127.0.0.1:42703/}: iroh_net::ping: Creating pinger addr=127.0.0.1 ident=39972
2023-11-29T12:02:05.302218Z DEBUG run_probe{probe=Icmp after 400ms to http://127.0.0.1:42703/}: iroh_net::netcheck::reportgen: ICMP ping start to 127.0.0.1:42703 with payload len 23 - derp http://127.0.0.1:42703/
2023-11-29T12:02:05.302240Z DEBUG run_probe{probe=Icmp after 400ms to http://127.0.0.1:42703/}: iroh_net::ping: Creating pinger addr=127.0.0.1 ident=18599
2023-11-29T12:02:05.302636Z  WARN run_probe{probe=Icmp after 400ms to http://127.0.0.1:42703/}: iroh_net::netcheck::reportgen: icmp latency measurement failed: Multiple identical request
2023-11-29T12:02:05.302696Z DEBUG run_probe{probe=Icmp after 500ms to http://127.0.0.1:42703/}: iroh_net::netcheck::reportgen: ICMP ping start to 127.0.0.1:42703 with payload len 23 - derp http://127.0.0.1:42703/
2023-11-29T12:02:05.302710Z DEBUG run_probe{probe=Icmp after 500ms to http://127.0.0.1:42703/}: iroh_net::ping: Creating pinger addr=127.0.0.1 ident=15237
2023-11-29T12:02:05.303343Z  WARN run_probe{probe=Icmp after 500ms to http://127.0.0.1:42703/}: iroh_net::netcheck::reportgen: icmp latency measurement failed: Multiple identical request
2023-11-29T12:02:05.303631Z  WARN run_probe{probe=Icmp after 300ms to http://127.0.0.1:42703/}: iroh_net::netcheck::reportgen: icmp latency measurement failed: Network error.
2023-11-29T12:02:05.303698Z DEBUG netcheck.actor:reportgen.actor: iroh_net::netcheck::reportgen: finished probe: ProbeReport { ipv4_can_send: false, ipv6_can_send: false, icmpv4: false, delay: None, probe: Icmp { delay: 400ms, node: DerpNode { url: http://127.0.0.1:42703/, stun_only: true, stun_port: 42703 } }, addr: None }
2023-11-29T12:02:05.303768Z DEBUG surge_ping::client: no one is waiting for ICMP packet (V4(Icmpv4Packet { source: 127.0.0.1, destination: 0.0.0.0, ttl: None, icmp_type: IcmpType(0), icmp_code: IcmpCode(0), size: 31, real_dest: 127.0.0.1, identifier: PingIdentifier(33011), sequence: PingSequence(0) }))
2023-11-29T12:02:06.963097Z  WARN netcheck.actor:reportgen.actor: iroh_net::netcheck::reportgen: tick: probes timed out
2023-11-29T12:02:06.963136Z DEBUG netcheck.actor:reportgen.actor: iroh_net::netcheck::reportgen: all tasks done
2023-11-29T12:02:06.963152Z DEBUG netcheck.actor:reportgen.actor: iroh_net::netcheck::reportgen: aborting 1 probe sets, already have enough reports
2023-11-29T12:02:06.963172Z DEBUG netcheck.actor:reportgen.actor: iroh_net::netcheck::reportgen: Sending report to netcheck actor
2023-11-29T12:02:06.963200Z DEBUG netcheck.actor:reportgen.actor: iroh_net::netcheck::reportgen: reportgen actor finished

A few things:

  • All but the first ICMP probe fail to instantiate the pinger, reporting "Multiple identical request". Yet they all have a different ident.
  • Somehow a ping response arrives for an ident we never created (PingIdentifier(33011)). I don't think we do any other pinging so... eh?
  • The first ping fails with just "Network error". As far as I can trace the surge_ping code, this means the oneshot sender inserted in the client's ReplyMap was dropped. It is a mystery to me how this could happen: surge_ping's recv loop only removes the sender when it finds a matching entry in the map, in which case it sends a successful reply, or when the pinger is dropped. I don't think either of these is happening.
  • Why is this happening now? This should have failed before; this PR shouldn't be changing this behaviour.
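
The three symptoms above can be reproduced in a toy model of a reply map. This is only a sketch of the mechanism described in the bullets, not surge_ping's actual code: `ReplyMap` here is a hypothetical struct keyed by a bare `u16` ident, and a `std::sync::mpsc` channel stands in for the oneshot channel.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender};

/// Toy reply map keyed by ping identifier (hypothetical, for illustration).
#[derive(Default)]
struct ReplyMap {
    waiters: HashMap<u16, Sender<&'static str>>,
}

impl ReplyMap {
    /// Registers a waiter; a second registration for the same key is rejected.
    fn new_waiter(&mut self, ident: u16) -> Result<Receiver<&'static str>, String> {
        if self.waiters.contains_key(&ident) {
            return Err("multiple identical requests".to_string());
        }
        let (tx, rx) = channel();
        self.waiters.insert(ident, tx);
        Ok(rx)
    }

    /// Delivers a reply if a waiter exists for `ident`; returns false when
    /// nobody is waiting for that packet.
    fn deliver(&mut self, ident: u16) -> bool {
        match self.waiters.remove(&ident) {
            Some(tx) => tx.send("pong").is_ok(),
            None => false,
        }
    }

    /// Drops the sender without replying, as happens when a pinger goes away.
    fn remove(&mut self, ident: u16) {
        self.waiters.remove(&ident);
    }
}
```

In this model a waiter whose sender is removed without a reply sees an error from `recv()`, a reply for an unregistered ident is simply not delivered, and a duplicate key is refused at registration. Which key the real library uses, and why keys would collide here, is exactly the open question in this comment.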

@Arqu any chance the runner environment has changed? Or the permissions it has to ping on a raw socket?

Contributor

flub commented Nov 29, 2023

Oh, and the nightly Rust job succeeds. It is being run on the same runner.

@dignifiedquire dignifiedquire changed the title [WIP] feat: remove derp regions feat: remove derp regions in favor of direct urls Dec 1, 2023
Collaborator

Arqu commented Dec 1, 2023

/netsim branch arqu/derp_drop_region

region_id = 1
host_name = "foo.bar"
[[derp_nodes]]
url = "https://foo.bar"
Contributor

Should we make a follow-up issue to be able to set the stun_port in the URL?

Contributor Author

The URL already sets the derp port; the STUN port is a separate port.

Contributor

Yes, that is what I mean. Currently, if the derp server is configured with a non-default STUN port, you have to write the config manually. We could include the stun_port as a query parameter on the URL, and then you'd only need the URL to use the derp server.

fn send_addr_to_vec(addr: &SendAddr) -> Vec<u8> {
match addr {
SendAddr::Derp(url) => {
let mut out = vec![1u8];
Contributor

maybe this is micro-optimising, but can't this get the vector size exactly right by creating the byteified URL up-front?

out
}
SendAddr::Udp(ip) => {
let mut out = vec![0u8];
Contributor

this can also know the vec size up-front
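
A sketch of what sizing the vector up-front could look like for both arms. Only the discriminant bytes (`1` for DERP, `0` for UDP) are taken from the diff above; the `SendAddr` definition (with a `String` standing in for the URL), the helper `encode_udp`, and the UDP byte layout (IP octets followed by a little-endian port) are assumptions for illustration.

```rust
use std::net::{IpAddr, SocketAddr};

/// Hypothetical stand-in for the PR's `SendAddr`; the URL is kept as a string.
enum SendAddr {
    Udp(SocketAddr),
    Derp(String),
}

fn send_addr_to_vec(addr: &SendAddr) -> Vec<u8> {
    match addr {
        SendAddr::Derp(url) => {
            // Byteify the URL first so the final length is known.
            let url_bytes = url.as_bytes();
            let mut out = Vec::with_capacity(1 + url_bytes.len());
            out.push(1u8);
            out.extend_from_slice(url_bytes);
            out
        }
        SendAddr::Udp(sock) => match sock.ip() {
            IpAddr::V4(ip) => encode_udp(&ip.octets(), sock.port()),
            IpAddr::V6(ip) => encode_udp(&ip.octets(), sock.port()),
        },
    }
}

/// Assumed layout: discriminant, IP octets (4 or 16), little-endian port.
fn encode_udp(ip: &[u8], port: u16) -> Vec<u8> {
    let mut out = Vec::with_capacity(1 + ip.len() + 2);
    out.push(0u8);
    out.extend_from_slice(ip);
    out.extend_from_slice(&port.to_le_bytes());
    out
}
```

`Vec::with_capacity` reserves at least the requested capacity, so neither arm should need to reallocate while it is filled.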

/// Nearest DERP region ID; 0 means none/unknown.
my_derp: AtomicU16,
/// Nearest DERP node ID; 0 means none/unknown.
my_derp: std::sync::RwLock<Option<Url>>,
Contributor

😭

Contributor

URL is actually a pretty complex thing. Could we make this an Arc?

also, do we really need a full URL or just a hostname?
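
A sketch of the `Arc` variant asked about here, under the assumption that the field keeps its `RwLock<Option<...>>` shape. `Inner`, `set_my_derp` and the use of `Arc<str>` in place of a real `Url` are hypothetical; the point is only that readers clone a pointer instead of the whole URL.

```rust
use std::sync::{Arc, RwLock};

/// Hypothetical holder for the preferred DERP URL; `Arc<str>` stands in for
/// an `Arc<Url>` so the sketch needs no external crate.
struct Inner {
    /// Nearest DERP node URL; `None` means none/unknown.
    my_derp: RwLock<Option<Arc<str>>>,
}

impl Inner {
    fn new() -> Self {
        Inner { my_derp: RwLock::new(None) }
    }

    /// Returns a cheap clone of the current URL (a reference-count bump).
    fn my_derp(&self) -> Option<Arc<str>> {
        self.my_derp.read().expect("lock poisoned").clone()
    }

    /// Replaces the current URL and returns the previous one.
    fn set_my_derp(&self, url: Option<Arc<str>>) -> Option<Arc<str>> {
        std::mem::replace(&mut *self.my_derp.write().expect("lock poisoned"), url)
    }
}
```

Compared with the old `AtomicU16` this still takes a lock on every read, but the critical section is limited to cloning an `Arc`.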

// We will fall back to sending ICMP pings. These should succeed when we have a
// working pinger.
icmpv4: have_pinger,
icmpv4,
Contributor

can we leave a TODO here?

Contributor Author

there is, a couple of lines above

@@ -992,16 +983,20 @@ mod tests {

#[tokio::test(flavor = "current_thread", start_paused = true)]
async fn test_add_report_history_set_preferred_derp() -> Result<()> {
fn derp_url(i: u16) -> Url {
format!("http://{i}.com").parse().unwrap()
}
Contributor

I guess this could have been done as const DERP_URL_1 = format!(...) but it doesn't matter much.

let delay = DEFAULT_INITIAL_RETRANSMIT * attempt as u32;

if if_state.have_v4 && derp_node.ipv4.is_enabled() {
if if_state.have_v4 {
Contributor

I wonder how relevant this is? I guess you could configure the Node to be only IPv4 or IPv6 in case you have some network trouble with one of them. But maybe we just won't need this.

I don't think it would be very hard to add though. We could add ipv4_enabled and ipv6_enabled fields to DerpNode, and allow setting them from query parameters in the URL. Is this worth creating an issue for, to keep track of it and do it as a follow-up?
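
A sketch of how the probe planning might look with the suggested per-node flags restored. The `ipv4_enabled` and `ipv6_enabled` fields come from this comment; `IfState`, `Probe`, `plan_stun_probes` and the 100ms value of `DEFAULT_INITIAL_RETRANSMIT` are assumptions (the value is inferred from the 0ns/100ms/200ms probe delays in the log earlier in this thread).

```rust
use std::time::Duration;

/// Assumed value, inferred from the probe delays visible in the log.
const DEFAULT_INITIAL_RETRANSMIT: Duration = Duration::from_millis(100);

/// Hypothetical view of the local interface state.
struct IfState {
    have_v4: bool,
    have_v6: bool,
}

/// Hypothetical `DerpNode` carrying the per-family switches.
struct DerpNode {
    ipv4_enabled: bool,
    ipv6_enabled: bool,
}

#[derive(Debug, PartialEq)]
enum Probe {
    Ipv4 { delay: Duration },
    Ipv6 { delay: Duration },
}

/// Plans `attempts` STUN probes per address family, each delayed by
/// `DEFAULT_INITIAL_RETRANSMIT * attempt`, skipping families that are either
/// unavailable locally or disabled on the node.
fn plan_stun_probes(if_state: &IfState, node: &DerpNode, attempts: u32) -> Vec<Probe> {
    let mut plan = Vec::new();
    for attempt in 0..attempts {
        let delay = DEFAULT_INITIAL_RETRANSMIT * attempt;
        if if_state.have_v4 && node.ipv4_enabled {
            plan.push(Probe::Ipv4 { delay });
        }
        if if_state.have_v6 && node.ipv6_enabled {
            plan.push(Probe::Ipv6 { delay });
        }
    }
    plan
}
```

With both flags defaulting to `true` this reduces to the `if if_state.have_v4` check in the new code, so the fields could be added later without changing the default behaviour.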

iroh-net/docs/derp_nodes.md Outdated Show resolved Hide resolved
iroh-net/src/derp/map.rs Outdated Show resolved Hide resolved
iroh-net/src/disco.rs Outdated Show resolved Hide resolved
@dignifiedquire
Contributor Author

/netsim branch feat-no-derp-url

feat-no-derp-region.6924f761e81b67238edbfc04434c95b5bd5e142f
Perf report:

| test | case | throughput_gbps | throughput_transfer |
| --- | --- | --- | --- |
| iroh | 1_to_1 | 0.81 | 1.41 |
| iroh | 1_to_3 | 2.71 | 4.69 |
| iroh | 1_to_5 | 3.65 | 5.34 |
| iroh | 1_to_10 | 4.58 | 5.54 |
| iroh | 2_to_2 | 1.96 | 3.56 |
| iroh | 2_to_4 | 3.75 | 6.56 |
| iroh | 2_to_6 | 4.96 | 7.79 |
| iroh | 2_to_10 | 6.07 | 8.48 |
| iroh_latency_200ms | 1_to_1 | 0.86 | 1.59 |
| iroh_latency_200ms | 1_to_3 | 2.68 | 4.85 |
| iroh_latency_200ms | 1_to_5 | 3.60 | 5.19 |
| iroh_latency_200ms | 1_to_10 | 4.46 | 5.66 |
| iroh_latency_200ms | 2_to_2 | 1.82 | 2.99 |
| iroh_latency_200ms | 2_to_4 | 3.44 | 6.10 |
| iroh_latency_200ms | 2_to_6 | 4.80 | 7.30 |
| iroh_latency_200ms | 2_to_10 | 5.75 | 7.64 |
| iroh_latency_20ms | 1_to_1 | 0.83 | 1.43 |
| iroh_latency_20ms | 1_to_3 | 2.75 | 5.10 |
| iroh_latency_20ms | 1_to_5 | 3.61 | 5.11 |
| iroh_latency_20ms | 1_to_10 | 4.36 | 5.13 |
| iroh_latency_20ms | 2_to_2 | 1.77 | 3.20 |
| iroh_latency_20ms | 2_to_4 | 3.81 | 6.59 |
| iroh_latency_20ms | 2_to_6 | 4.55 | 6.66 |
| iroh_latency_20ms | 2_to_10 | 5.74 | 7.81 |

@dignifiedquire dignifiedquire force-pushed the feat-no-derp-region branch 6 times, most recently from eb8b8a9 to d874af0 Compare December 15, 2023 20:24
@dignifiedquire
Contributor Author

/netsim branch feat-no-derp-url

@dignifiedquire dignifiedquire added this pull request to the merge queue Dec 18, 2023
Merged via the queue into main with commit 4002c46 Dec 18, 2023
26 checks passed
@dignifiedquire dignifiedquire deleted the feat-no-derp-region branch December 18, 2023 14:20
fubuloubu pushed a commit to ApeWorX/iroh that referenced this pull request Feb 21, 2024
Breaking Change: This change removes the notion of derp regions, and
simplifies it to derpers, which are identified by their URL.
Currently the URL is considered fully, including all path and query
params. This might change later depending on usage.




- [x] Needs n0-computer/chuck#49

---------

Co-authored-by: Floris Bruynooghe <flub@n0.computer>
Co-authored-by: Kasey <kasey@n0.computer>
Co-authored-by: Ruediger Klaehn <rklaehn@protonmail.com>