This is part of a four-issue tunnel creation story
A newly created tunnel takes up to ~14 minutes before it reliably routes traffic. There are two distinct delays, each with an operator-side and a client-side component:
- Delay 1 (~3-4 min): creation → toggle turns green
- Delay 2 (~0-10 min): toggle green → traffic flows
- UX consequence: app#160 (green toggle shown before tunnel is usable)
Summary
This issue and #160 describe the same degraded experience: a tunnel takes minutes to become usable, and in the meantime the app shows it as active even though it is not yet routing traffic.
The connector heartbeat interval is computed as lease_duration_seconds / 2 + jitter (app/lib/src/heartbeat.rs:566-573). For the observed connector datum-connect-jttwh, the Lease has lease_duration_seconds ≈ 1200, yielding a base heartbeat interval of ~10 minutes (600s), plus up to 120s of jitter.
Since the network-services-operator re-reconciles downstream tunnel routes on each connector heartbeat (see network-services-operator#167), this interval directly controls how long a newly created tunnel remains non-functional after the toggle turns green. A user who creates a tunnel immediately after a heartbeat fires will wait the full ~10 minutes before routing is confirmed live.
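To illustrate why the creation time relative to the last heartbeat matters, here is a minimal sketch of the wait. It assumes routing goes live exactly at the next heartbeat, which is a simplification of the behaviour described above. `wait_for_next_heartbeat` is a hypothetical helper, not code from the app.

```rust
use std::time::Duration;

/// Hypothetical model: time a freshly created tunnel waits until the next
/// connector heartbeat, assuming routing goes live at that heartbeat.
/// `since_last_heartbeat` is how long ago the previous heartbeat fired.
fn wait_for_next_heartbeat(interval: Duration, since_last_heartbeat: Duration) -> Duration {
    // saturating_sub clamps to zero if the heartbeat is already overdue.
    interval.saturating_sub(since_last_heartbeat)
}
```

With a 600s interval, a tunnel created right after a heartbeat waits the full 600s, while one created 590s after the last heartbeat waits only 10s.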
Relevant code
// app/lib/src/heartbeat.rs
const DEFAULT_LEASE_DURATION_SECS: i32 = 30;

fn renewal_interval(lease_duration_seconds: i32) -> Duration {
    let base = Duration::from_secs((lease_duration_seconds / 2).max(1));
    let jitter_max = (base.as_secs() / 5).max(1);
    let mut rng = rand::rng();
    let jitter = rng.random_range(0..=jitter_max);
    base + Duration::from_secs(jitter)
}
DEFAULT_LEASE_DURATION_SECS is 30s, but the actual Lease resource in the cluster has a much longer duration (~1200s), overriding this default.
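To make the arithmetic concrete, the following sketch computes the smallest and largest interval the excerpt above can produce for a given lease duration. `renewal_interval_bounds` is a hypothetical helper written for this issue. It mirrors the excerpt's formula without the random draw, and adds an explicit `as u64` cast as an assumption so that it compiles standalone.

```rust
use std::time::Duration;

/// Hypothetical helper: (minimum, maximum) heartbeat interval for a lease,
/// following the formula in the excerpt: base = lease / 2 (at least 1s),
/// jitter drawn from 0..=base / 5 (upper bound at least 1s).
fn renewal_interval_bounds(lease_duration_seconds: i32) -> (Duration, Duration) {
    // Assumption: cast to u64 after clamping, since Duration::from_secs takes u64.
    let base_secs = (lease_duration_seconds / 2).max(1) as u64;
    let jitter_max = (base_secs / 5).max(1);
    (
        Duration::from_secs(base_secs),
        Duration::from_secs(base_secs + jitter_max),
    )
}
```

For the observed Lease, `renewal_interval_bounds(1200)` gives 600s to 720s. A 60s lease gives 30s to 36s, and the 30s default gives 15s to 18s.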
Expected
Either:
- (Preferred) Decouple tunnel routing confirmation from the heartbeat entirely — fix in network-services-operator#167 makes the NSO push addressing proactively, making the heartbeat interval irrelevant for Delay 2.
- (Short-term) Reduce the Lease duration in the cluster to something appropriate for interactive use (e.g. 60s, which gives a ~30s heartbeat interval), so Delay 2 is bounded to <60s even without the NSO fix.
Impact
There is a window of up to ~10 minutes in which the tunnel toggle is green but the tunnel does not route traffic. The wait is longest when the tunnel is created right after a heartbeat. See also network-services-operator#166 for Delay 1 (creation → toggle green).