feat(discovery): mDNS-SD primary-IP hints for service-order-aware dialing #423
jondkinney wants to merge 52 commits into feschber:main
Similar to my comment on #422: is this really necessary? From my testing (I will have to check again), at least on Linux, if I connect an Ethernet cable, the listener on the Wi-Fi interface becomes unreachable (because it replies via the Ethernet port), so the service order is automatically correct.
@feschber It was necessary in my testing because, regardless of service order, my Mac was choosing Wi-Fi over Ethernet. I could tell immediately because input was choppy and slow on Wi-Fi and smooth on Ethernet. I then confirmed it by poking at things while they were running on both machines.
d03f5fe to f053971
Adds a host-side fallback that releases capture when the user
sweeps the cursor against the host-adjacent edge of the guest
and keeps pushing past a configurable threshold. Solves the
"two locked screens" case where the peer's capture backend
can't fire CaptureBegin (and therefore can't send Leave back),
leaving the host stuck capturing indefinitely until the
release-bind chord is pressed.
Algorithm lives in InputCapture::poll_next so every backend
(macOS, libei, layer-shell, x11, windows, dummy) gets it for
free — they only need to emit standard motion events through
the existing Stream interface, which they already do. The
wrapper tracks:
virtual_pos: signed position along the entry axis, clamped at
0 from below. No upper clamp — the wrapper can't know the
guest's far-edge extent without protocol-level cooperation,
and any proxy is wrong for some user's setup.
wall_pressure: motion that overshoots the host-adjacent edge
and would have driven virtual_pos negative. Fires
CaptureEvent::AutoRelease when the threshold is reached;
the capture loop then runs the same teardown path as the
release-bind chord.
State resets on Begin (entry to capture), AutoRelease (we
self-released), and external release (chord, peer Leave,
connection error, EnterOnly fallback).
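A minimal sketch of the bookkeeping described above, not the code from this branch. The type and method names (`WallPress`, `on_motion`, `WallPressOutcome`) are hypothetical, and clearing `wall_pressure` when the cursor moves back into the interior is an assumption made for the illustration.

```rust
/// Outcome of feeding one motion event to the tracker (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum WallPressOutcome {
    Continue,
    AutoRelease,
}

/// Hypothetical per-session tracker for the wall-press model.
#[derive(Debug, Default)]
pub struct WallPress {
    /// Position along the entry axis; 0 is the host-adjacent edge.
    pub virtual_pos: f64,
    /// Accumulated motion that overshot the host-adjacent edge.
    pub wall_pressure: f64,
}

impl WallPress {
    /// Called on Begin, after AutoRelease, and on any external release.
    pub fn reset(&mut self) {
        *self = Self::default();
    }

    /// `delta` is motion along the entry axis (positive = deeper into the guest).
    /// A `threshold_px` of 0 means the feature is off.
    pub fn on_motion(&mut self, delta: f64, threshold_px: f64) -> WallPressOutcome {
        let proposed = self.virtual_pos + delta;
        if proposed < 0.0 {
            // The overshoot past the edge becomes pressure; position clamps at 0.
            self.wall_pressure += -proposed;
            self.virtual_pos = 0.0;
        } else {
            // No upper clamp here, matching the description above.
            self.virtual_pos = proposed;
            // Assumption: leaving the wall discards accumulated pressure.
            self.wall_pressure = 0.0;
        }
        if threshold_px > 0.0 && self.wall_pressure >= threshold_px {
            self.reset();
            return WallPressOutcome::AutoRelease;
        }
        WallPressOutcome::Continue
    }
}
```

With a 100 px threshold, moving 50 px in and then pushing 120 px back leaves 70 px of pressure; a further 30 px push reaches the threshold and yields `AutoRelease`.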
Surface:
- New FrontendRequest::SetReleaseThreshold + FrontendEvent::
ReleaseThreshold IPC pair.
- New release_threshold_px field on the daemon config (0 = off,
serialized to config.toml).
- New AdwPreferencesGroup with a 0–500px slider in the GTK
window. Default 0 (disabled) so existing users see no
behavior change until they opt in.
- New CaptureEvent::AutoRelease variant + handling in
src/capture.rs's handle_capture_event (short-circuit to
release_capture, which already synthesizes key-ups and sends
Leave to the peer).
Known limitation: the wrapper has no way to know where the
guest's cursor actually is (the guest doesn't tell us). On
re-entry into a peer mid-session, virtual_pos resets to 0 but
the guest's cursor may still be in the middle of its screen
from the prior session, causing the threshold to fire from
the wrong reference point. A protocol-level Bounds event +
cursor-warp on Enter is needed for full correctness.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add a new ProtoEvent variant carrying the receiving device's display geometry (in pixels). Sent by the emulation side right after acknowledging an Enter so the capturing peer can model the guest cursor's position along the entry axis. Wire format: 1-byte EventType discriminator (Bounds = 11) followed by big-endian u32 width and big-endian u32 height — 9 bytes total, well under MAX_EVENT_SIZE (21). This commit only adds the protocol wiring. Senders and the host-side cache come in subsequent commits. Old peers that don't recognize EventType=11 will skip the datagram per the forward-compat fix in the previous commit, so deployment is incremental: the emulation side can start sending Bounds without breaking older capturing peers. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
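The 9-byte layout above can be sketched as a standalone encoder/decoder. The function names are hypothetical and this is not the `lan-mouse-proto` codec itself; only the discriminator value (11), the field order, and the big-endian encoding come from the commit message.

```rust
/// Discriminator stated in the commit message above.
const EVENT_TYPE_BOUNDS: u8 = 11;
/// 1 byte of type + 4 bytes of width + 4 bytes of height.
const BOUNDS_LEN: usize = 9;

/// Hypothetical encoder for the Bounds datagram.
pub fn encode_bounds(width: u32, height: u32) -> [u8; BOUNDS_LEN] {
    let mut buf = [0u8; BOUNDS_LEN];
    buf[0] = EVENT_TYPE_BOUNDS;
    buf[1..5].copy_from_slice(&width.to_be_bytes());
    buf[5..9].copy_from_slice(&height.to_be_bytes());
    buf
}

/// Hypothetical decoder; returns None for short datagrams or other event types.
pub fn decode_bounds(datagram: &[u8]) -> Option<(u32, u32)> {
    if datagram.len() < BOUNDS_LEN || datagram[0] != EVENT_TYPE_BOUNDS {
        return None;
    }
    let width = u32::from_be_bytes([datagram[1], datagram[2], datagram[3], datagram[4]]);
    let height = u32::from_be_bytes([datagram[5], datagram[6], datagram[7], datagram[8]]);
    Some((width, height))
}
```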
Add `display_bounds(pos)` and `warp_cursor(pos)` to the
InputEmulation trait and implement them across every backend:
- macOS: CGDisplay APIs for bounds, CGWarpMouseCursorPosition for warp
- x11: RandR for bounds, XWarpPointer for warp
- wlroots: wl_output extents + virtual_pointer.motion_absolute
- libei: region walking + ei_pointer.emit_motion_absolute
- Windows: GetSystemMetrics + SetCursorPos
- xdg_desktop_portal: no-op fallback (the protocol exposes neither
bounds nor a warp primitive)
These are the prerequisites for the protocol-based wall-press
auto-release: emulation hosts now have a common API to report their
display extents to peers and to warp the cursor on Enter so the
host's modeled virtual_pos = 0 matches the guest's actual cursor.
Wire the new emulation-side capabilities into the daemon's
listener task. When a peer's Enter arrives:
1. Reply Ack (existing behavior).
2. Reply Bounds(width, height) using the cached display
geometry from the active emulation backend.
3. Warp the local cursor to the entry edge of the displayed
position (0 for Left, width-1 for Right, etc., centered
along the orthogonal axis).
The warp is the structural fix for the "cursor jumps back to
where it was" symptom: previously, on re-entry into a peer
mid-session, the cursor stayed wherever the prior capture
session left it, breaking the host's wall-press model
(virtual_pos=0 in the host's mind didn't match the guest's
actual cursor column). With the warp, the host's model is
synchronized with the guest's reality on every Enter.
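As a rough illustration of step 3, the entry-edge point could be computed as below. `Position` and `entry_edge_point` are hypothetical names, and the sketch assumes the edge is expressed in the guest's own frame (the side of the guest display the cursor enters through).

```rust
/// Hypothetical edge identifier, in the guest's own frame.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Position {
    Left,
    Right,
    Top,
    Bottom,
}

/// Pixel on the given edge, centered along the orthogonal axis.
pub fn entry_edge_point(edge: Position, width: u32, height: u32) -> (i32, i32) {
    let (w, h) = (width as i32, height as i32);
    // max(0) keeps the result sane if a backend ever reports a zero dimension.
    let (last_x, last_y) = ((w - 1).max(0), (h - 1).max(0));
    match edge {
        Position::Left => (0, h / 2),
        Position::Right => (last_x, h / 2),
        Position::Top => (w / 2, 0),
        Position::Bottom => (w / 2, last_y),
    }
}
```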
EmulationProxy gains:
- Cached display_bounds (Rc<Cell<Option<(u32, u32)>>>),
refreshed each time the underlying InputEmulation is
(re)created. Read by the listener task.
- warp_cursor(x, y) fire-and-forget. Drops if emulation
isn't currently active (no live backend to receive it).
ProxyRequest::Warp(x, y) carries the request to EmulationTask,
which dispatches to InputEmulation::warp_cursor.
If the active backend doesn't implement display_bounds — every
non-macOS backend right now — the listener skips the Bounds
reply and the warp call. The capturing peer falls back to its
existing "no upper clamp / virtual_pos = 0 on Begin" heuristic,
which is degraded but functional. Adding display_bounds /
warp_cursor to other backends unlocks correct behavior
incrementally.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
InputCapture now keeps a per-position HashMap of peer display
geometry, populated when ProtoEvent::Bounds arrives from the
peer (handled in src/capture.rs's recv arm). track_wall_press
uses the cached entry-axis extent as the upper clamp for
virtual_pos:
self.virtual_pos = proposed.clamp(0.0, peer_extent);
Eliminates the runaway-virtual_pos bug from the heuristic
fallback: when the user obliviously over-pushes their physical
mouse past the guest's actual far edge, the modeled position
clamps at the real width instead of climbing fictionally to
infinity. Now the user's "walk back" cost is bounded by the
guest's actual screen width.
When the peer hasn't sent Bounds yet (older peer running
without the protocol extension, or in the brief pre-Ack
window of a fresh connection), peer_extent returns INFINITY
and the model degrades to the prior heuristic.
Cache lifecycle:
- Insert on ProtoEvent::Bounds.
- Drop on CaptureRequest::Destroy(handle) so re-adding the
same peer later starts fresh.
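A sketch of the cache and the clamp, under the assumption that the entry-axis extent is the peer's width for Left/Right positions and its height for Top/Bottom. `PeerBoundsCache` and its methods are hypothetical names; only the `clamp(0.0, peer_extent)` call and the INFINITY fallback are taken from the description above.

```rust
use std::collections::HashMap;

/// Hypothetical key: where the peer sits relative to the host.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub enum Position {
    Left,
    Right,
    Top,
    Bottom,
}

/// Hypothetical per-position cache of peer display geometry (width, height).
#[derive(Default)]
pub struct PeerBoundsCache {
    bounds: HashMap<Position, (u32, u32)>,
}

impl PeerBoundsCache {
    /// Insert on ProtoEvent::Bounds.
    pub fn insert(&mut self, pos: Position, width: u32, height: u32) {
        self.bounds.insert(pos, (width, height));
    }

    /// Drop on CaptureRequest::Destroy so a re-added peer starts fresh.
    pub fn remove(&mut self, pos: Position) {
        self.bounds.remove(&pos);
    }

    /// Extent along the entry axis, or INFINITY when no Bounds was received.
    pub fn peer_extent(&self, pos: Position) -> f64 {
        match self.bounds.get(&pos) {
            Some(&(w, h)) => match pos {
                Position::Left | Position::Right => w as f64,
                Position::Top | Position::Bottom => h as f64,
            },
            None => f64::INFINITY,
        }
    }

    /// Same clamp as the line quoted in the commit message.
    pub fn clamp_virtual_pos(&self, pos: Position, proposed: f64) -> f64 {
        proposed.clamp(0.0, self.peer_extent(pos))
    }
}
```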
Combined with the previous commit (emulation warps cursor on
Enter), the host's virtual_pos = 0 at Begin now matches the
guest's actual cursor at column 0 (or width-1, etc.) on every
re-entry. The "cursor was in the middle, 200px back fires
release prematurely" bug is fixed structurally rather than
papered over.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The label "Auto-Release" reads as a global app preference; the description's "forwarded mouse capture" was ambiguous about which machine does the forwarding. Rename the group to "Outgoing Auto-Release" so the scope mirrors the surrounding "Outgoing Connections" / "Incoming Connections" groups, and lead the description with "When this machine is capturing input for a peer …" so a user scanning the window can tell at a glance that this setting only matters when the local machine is the host.
GtkScale's default behavior treats a vertical scroll event as +/- increment, which means the threshold creeps any time the user is scrolling the window and the cursor passes over the slider — easy to do given the slider sits in the middle of the preferences pane. Add an EventControllerScroll to the slider in CAPTURE phase that returns Propagation::Stop unconditionally. The scale's own scroll controller never sees the event, so the value doesn't change. Trade-off: scrolling doesn't pass through to the parent GtkScrolledWindow while the cursor is on the slider — the wheel becomes inert there. Acceptable: prior behavior was actively destructive (silent state corruption); this is just "no scroll in this small region." If users start complaining about the gap, the next step is to forward dy to the ancestor scrolled window's vadjustment manually before returning Stop. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Old text described the mechanism ("releases capture automatically once
the cursor pushes past the host-adjacent edge") without explaining
when the user would actually need it. With the new peer-Leave deadline
gate (34605a7), wall-press only fires when the peer can't deliver a
Leave — i.e. when the peer's screen is locked or its capture backend
is otherwise suppressed. New text leads with that framing and trims
the description to two sentences.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Capture-phase scroll handler used to return Propagation::Stop to suppress GtkScale's default scroll-to-adjust behavior, but Stop also killed propagation to the parent — so the main window wouldn't scroll when the cursor was over the slider. Frustrating because the slider sits in the middle of the preferences pane and "I just want to scroll past this" is the common interaction. Same capture-phase handler now walks up to the ancestor ScrolledWindow and bumps its vadjustment by `dy * step_increment` (or 40px when step_increment is unset). Mimics what native scroll passthrough would have done — slider value stays fixed, parent scrolls smoothly. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Windows clippy flags `loop { let Some(...) = get_msg() else { break } }`
as while-let-loop. Rewrite to `while let Some(msg) = get_msg() { … }`.
The inner `break` for `RequestType::Exit` still breaks the surrounding
while-let, so semantics are unchanged.
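To illustrate why the rewrite preserves behavior, here are both shapes side by side on a toy queue. `get_msg`, `RequestType::Work`, and the two `drain_*` functions are hypothetical stand-ins, not the Windows backend's code.

```rust
use std::collections::VecDeque;

#[derive(Debug, PartialEq)]
pub enum RequestType {
    Work(u32),
    Exit,
}

/// Hypothetical stand-in for the backend's message source.
pub fn get_msg(queue: &mut VecDeque<RequestType>) -> Option<RequestType> {
    queue.pop_front()
}

/// The shape clippy flags as `while_let_loop`.
pub fn drain_with_loop(queue: &mut VecDeque<RequestType>) -> u32 {
    let mut sum = 0;
    loop {
        let Some(msg) = get_msg(queue) else { break };
        match msg {
            RequestType::Exit => break,
            RequestType::Work(n) => sum += n,
        }
    }
    sum
}

/// The rewritten shape; the inner `break` still exits the enclosing loop.
pub fn drain_with_while_let(queue: &mut VecDeque<RequestType>) -> u32 {
    let mut sum = 0;
    while let Some(msg) = get_msg(queue) {
        match msg {
            RequestType::Exit => break,
            RequestType::Work(n) => sum += n,
        }
    }
    sum
}
```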
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds the Proxy trait import needed by the wlroots backend's `output.id()` call (introduced when the emulation side started binding wl_output for display_bounds), and applies cargo fmt for this split's own files.
Before: when crossing machines, the guest's cursor jumped to the
midpoint of the entry edge — a ~100 px Y-jump on typical
displays — because the guest snapped to a hardcoded
(0, h/2) / (w/2, 0) point on Enter. Visually discontinuous and
hard to follow when the user is mid-task.
After: the host's capture backend snapshots the screen-space cursor
position at the instant of the edge crossing (CGEvent.location()
on macOS — the only backend that can report this today; others
emit None and the guest falls back to the prior midpoint warp).
The capture loop scales those host coords against the cached peer
geometry and sends them as a new ProtoEvent::MotionAbsolute right
after Enter. The guest handles MotionAbsolute by warping the
cursor to (x, y), overriding the entry-edge midpoint so the user
sees visual continuity across the boundary.
Layered choices:
- New ProtoEvent::MotionAbsolute { x, y } primitive rather than
bolting an offset onto Enter — gives a reusable
position-setting building block for future features (snap to
point on app launch, multi-monitor handoff, follow-host-cursor
modes) without inventing more event variants.
- Pixel coordinates in the receiver's screen space, not normalized
floats — host already caches peer bounds (Bounds proto event)
for the wall-press upper clamp, so it can do the scaling and
the guest just calls warp_cursor directly. Guest's
warp_cursor primitive already takes pixels.
- Backwards compatibility: peers running the previous protocol
don't recognize MotionAbsolute and skip it via the forward-
compat decode-tolerance fix from earlier in this branch. Old
hosts paired with new guests fall through to the entry-edge
midpoint (current behavior); new hosts paired with old guests
ignore MotionAbsolute and the cursor stays at the edge midpoint
too — neither pair regresses.
Capture backend coverage in this commit: macOS only (the
CGEventTap callback has cg_ev.location() at the moment of edge
crossing). Other backends (libei, x11, layer_shell, windows,
dummy) emit Begin { cursor: None } and don't send MotionAbsolute,
so the guest falls back to the midpoint warp on Enter. Adding
cursor-position reporting to those backends is a per-backend
follow-up.
InputCapture trait grew display_bounds() (default impl returns
None; macOS implements via CGDisplay::active_displays) and a
peer_warp_target(pos, cursor) helper that combines the host's
own bounds, the cached peer bounds, and the cursor position into
a target point on the peer's screen. peer_warp_target returns
None when either bounds is unavailable, in which case the capture
loop just doesn't emit MotionAbsolute.
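One plausible shape for `peer_warp_target`, offered as a sketch only. The signature, the proportional cross-axis scaling, and pinning the on-axis coordinate to the peer's entry edge are assumptions; the source states only that the helper combines host bounds, peer bounds, and the cursor, and returns None when either bounds is missing.

```rust
/// Hypothetical: where the peer sits relative to the host.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Position {
    Left,
    Right,
    Top,
    Bottom,
}

/// Map a host cursor position to a pixel on the peer's screen.
pub fn peer_warp_target(
    pos: Position,
    cursor: (f64, f64),
    host_bounds: Option<(u32, u32)>,
    peer_bounds: Option<(u32, u32)>,
) -> Option<(i32, i32)> {
    let (hw, hh) = host_bounds?;
    let (pw, ph) = peer_bounds?;
    if hw == 0 || hh == 0 || pw == 0 || ph == 0 {
        return None;
    }
    // Scale a host coordinate proportionally onto the peer's pixel range.
    let scale = |v: f64, from: u32, to: u32| -> i32 {
        let frac = (v / from as f64).clamp(0.0, 1.0);
        (frac * (to - 1) as f64).round() as i32
    };
    Some(match pos {
        // Peer on the host's right: the cursor enters at the peer's left edge.
        Position::Right => (0, scale(cursor.1, hh, ph)),
        Position::Left => (pw as i32 - 1, scale(cursor.1, hh, ph)),
        Position::Bottom => (scale(cursor.0, hw, pw), 0),
        Position::Top => (scale(cursor.0, hw, pw), ph as i32 - 1),
    })
}
```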
The cross-axis cursor preservation introduced in 6c1bd88 was macOS-only; the layer-shell capture backend (Wayland/Hyprland and similar wlroots compositors) emitted Begin { cursor: None }, so transitions where Linux was the host fell back to the entry-edge midpoint warp on the guest — the same 300–400 px Y-jump the macOS path was fixed to avoid. Read surface_x / surface_y from wl_pointer::Enter and translate to compositor screen-space using the layer-surface's anchor edge: surfaces here are 1 px on the on-axis dimension and span the cross-axis, so the surface-local cross-axis coord is the screen offset directly. To support multi-output setups, store the output's compositor position+size on the Window when it's created, and add a display_bounds() override that returns the union rectangle of all active outputs (mirrors the macOS impl so MotionAbsolute scaling stays consistent). Effect: Linux→peer transitions where Linux is the source now preserve cross-axis cursor position the same way macOS→peer transitions already do. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Counterpart to 6c1bd88's Enter-time cross-axis preservation. When the host releases capture (release-bind chord, auto-release threshold, peer destroyed), the visible cursor reappears at whatever point capture started — typically the entry-edge midpoint or wherever the guest chose to warp to. The user perceives this as a 100–400 px Y-jump even though Mac→Linux→Mac round-trip "should" feel continuous, because nothing in the release path tells the host where the guest's cursor visually was at the moment of release. Track a virtual_cursor (f64, f64) in the wrapper that mirrors the guest's screen-space cursor: seeded on Begin from the peer_warp_target / entry-edge midpoint (whatever the guest will actually do on Enter), accumulated against every Motion event we forward, clamped to peer bounds. On release, project it back to host screen-space with host_warp_target_on_release — symmetric inverse of peer_warp_target — and pass that as a new Option<(i32, i32)> parameter on the Capture::release trait method. macOS threads the target through ProducerEvent::Release and warps before show_cursor() so the visible cursor reappears at the matching host point. Other backends ignore the parameter (they don't hide/manage the system cursor on the way out). This is a no-op when peer_bounds or display_bounds is unavailable — fallback is the previous behavior. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Self-sufficient counterpart to MotionAbsolute. Carries the host's cursor as a normalized fraction (0..1) of the host's own screen plus the entry side from the receiver's frame. The receiver scales nx/ny against its own display bounds and pins the on-axis dimension to the matching edge. The point: MotionAbsolute requires the host to know the peer's geometry (cached via a prior `Bounds` event), which doesn't exist on the very first crossing — `Bounds` is only sent in response to `Enter`, so the host can't include MotionAbsolute on the same crossing that asks for the bounds it needs. CursorPos sidesteps the round-trip dependency entirely; the receiver does the scaling locally with its own bounds. Wire format adds f32 codec impl alongside existing u8/u32/i32/f64. Old peers don't know the new EventType tag and skip the event via the proto forward-compat decode-tolerance path; they continue to warp to the entry-edge midpoint as before. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Companion to peer_warp_target for the bounds-free CursorPos path. Normalizes the host's screen-space cursor against the host's own display bounds — no peer geometry consulted, so a return value of Some is independent of whether the peer has sent Bounds yet. The capture loop will emit this fraction as ProtoEvent::CursorPos right after Enter so the guest can warp on the very first crossing instead of falling through to the entry-edge midpoint. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
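The two halves of the CursorPos path could look roughly like this: the host normalizes against its own bounds, and the receiver scales against its own bounds and pins the on-axis coordinate to the entry edge. Function names and signatures here are hypothetical, and the rounding choice is an assumption.

```rust
/// Hypothetical: entry side in the receiver's frame.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Position {
    Left,
    Right,
    Top,
    Bottom,
}

/// Host side: cursor as a 0..1 fraction of the host's own screen.
/// Needs no peer geometry, so it works on the very first crossing.
pub fn normalize_cursor(cursor: (f64, f64), host_bounds: Option<(u32, u32)>) -> Option<(f32, f32)> {
    let (w, h) = host_bounds?;
    if w == 0 || h == 0 {
        return None;
    }
    let nx = (cursor.0 / w as f64).clamp(0.0, 1.0) as f32;
    let ny = (cursor.1 / h as f64).clamp(0.0, 1.0) as f32;
    Some((nx, ny))
}

/// Receiver side: scale the fraction locally and pin the on-axis
/// coordinate to the edge the cursor entered through.
pub fn landing_point(entry: Position, nx: f32, ny: f32, bounds: (u32, u32)) -> (i32, i32) {
    let (w, h) = (bounds.0.max(1) as f32, bounds.1.max(1) as f32);
    let x = (nx.clamp(0.0, 1.0) * (w - 1.0)).round() as i32;
    let y = (ny.clamp(0.0, 1.0) * (h - 1.0)).round() as i32;
    match entry {
        Position::Left => (0, y),
        Position::Right => (w as i32 - 1, y),
        Position::Top => (x, 0),
        Position::Bottom => (x, h as i32 - 1),
    }
}
```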
When the peer takes over (sends Enter+CursorPos), the host was also releasing capture and warping its local cursor based on the last-known peer virtual_cursor. The two warps fired on the same shared cursor and raced — the host's stale warp frequently won, clobbering the peer's authoritative proportional landing and making the cursor appear at whatever position the host *thought* the peer cursor was, regardless of where the user actually crossed. Split the release path: ReleaseForHandover skips the host warp_target so CursorPos is the only warp on remote-takeover. The release-bind chord and backend auto-release still go through the original release_capture path that computes a host warp. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ReleaseNotify wasn't the only source of host warp races. When the peer's local capture begins, it sends ProtoEvent::Leave to every incoming connection (service.rs:357), which the recipient's capture loop handles by calling release_capture — computing a host warp from stale virtual_cursor and racing against the peer's upcoming CursorPos warp on the shared cursor. Route peer-Leave release through release_capture_handover so the proportional CursorPos warp lands without competition. The rare case where the peer released without taking over (no Enter/ CursorPos follows) just leaves our cursor where it was — fine, since nothing else is moving it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Enter handler unconditionally warped the local cursor to the
midpoint of the entry edge, intending to seat virtual_pos=0 at
column 0 before the host's stream of relative motion arrived.
But the host now sends CursorPos right after Enter, which carries
the proportional landing point AND pins the on-axis dimension to
the matching edge — making the midpoint warp redundant.
Worse, the midpoint warp races against fast handovers: when the
user crosses, then crosses back within ~100ms, the local
CGEventTap (or layer-shell equivalent) reads the cursor's
location field at the new crossing while the cursor is still
sitting at the midpoint from the previous Enter — never
advancing to the proportional CursorPos warp that would have
followed. The opposite-direction CursorPos then encodes
ny=0.500 ("middle of source") and the receiver dutifully warps
its cursor to its own middle, producing the persistent
"always lands in the middle" symptom even after suppressing the
host-warp races on both sides.
Trust the host: if it can compute a proportional point (which it
can in every case where Begin.cursor was populated), CursorPos
seats the cursor correctly. If it can't, the cursor stays where
it was — preferable to a forced midpoint that masquerades as a
mid-screen crossing on subsequent re-crosses.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The wall-press fallback previously fired the moment the cursor pressed the host-adjacent edge of the peer for `release_threshold_px` worth of unabsorbed motion, racing the peer's layer-shell `Leave` (the authoritative handover signal) on every normal cross. In practice the network round-trip beats 200px of physical motion easily, so layer-shell won the race and wall-press only visibly fired on the lock screen where the peer has no layer-shell. The right outcome, by accident.
Make it explicit. When wall_pressure crosses the threshold, set `wall_press_pending_at` and arm a 150ms timer instead of firing. `release_no_host_warp` (the path peer-Leave already routes through) clears the pending flag via `reset_wall_press_state`, so a healthy handover cancels the deferred AutoRelease before it can fire. The timer itself is polled in `poll_next` so the deadline elapses even when the user pinned the cursor at the wall and stopped moving.
Result:
- Normal operation: peer Leave arrives in <50ms → wall-press cancelled, no race against the proportional CursorPos warp the handover path uses to position the host's cursor.
- Lock screen / dead peer / network down: no Leave arrives → 150ms past threshold → fire AutoRelease as the original fallback intended.
Costs +150ms of latency to the genuine fallback case (lock screen), which is imperceptible on top of the 200px of cursor "stickiness" the user already sees while the threshold accumulates.
Also, retreating into the interior now cancels a pending fire: a brief bump against the wall followed by motion deeper into the guest no longer leaves a primed timer waiting to misfire.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
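The arm/cancel/poll cycle described in this commit can be sketched without an async runtime by passing the current time in explicitly. `DeferredRelease` and its methods are hypothetical names, and using `std::time::Instant` with a stored deadline is a simplification for illustration rather than the timer the branch uses.

```rust
use std::time::{Duration, Instant};

/// Grace period for the peer's Leave to arrive, per the commit message.
const LEAVE_GRACE: Duration = Duration::from_millis(150);

/// Hypothetical holder for the deferred AutoRelease deadline.
#[derive(Default)]
pub struct DeferredRelease {
    deadline: Option<Instant>,
}

impl DeferredRelease {
    /// Threshold crossed: arm the timer unless one is already pending.
    pub fn arm(&mut self, now: Instant) {
        if self.deadline.is_none() {
            self.deadline = Some(now + LEAVE_GRACE);
        }
    }

    /// Peer Leave arrived, or the cursor retreated into the interior.
    pub fn cancel(&mut self) {
        self.deadline = None;
    }

    /// Polled regularly; returns true exactly once when the deadline passes.
    pub fn poll(&mut self, now: Instant) -> bool {
        match self.deadline {
            Some(d) if now >= d => {
                self.deadline = None;
                true
            }
            _ => false,
        }
    }
}
```

A healthy handover calls `cancel` well inside the 150 ms window, so `poll` never reports a fire; with no Leave, the first `poll` at or after the deadline does.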
The Instant value was stored but never read — only `is_some()` / `is_none()` / `take()`. `tokio::time::Instant::now()` already gives us the deadline base for the timer reset, so the std::time import drops too. No behavior change.
Wayland's compositor revokes input on layer-shell surfaces while the screen is locked, so Linux-as-host gets this behavior for free. macOS and Windows do not: CGEventTap and WH_MOUSE_LL hooks both keep firing under the lock screen, leaving a half-broken state where the mouse can move to the peer but the keyboard can't follow (the lock screen consumes keys before any tap/hook sees them). Match Wayland's behavior on the other two platforms by detecting lock state and gating barrier crossings on it.
macOS:
- Register CFNotificationCenter distributed-notification observers for `com.apple.screenIsLocked` / `com.apple.screenIsUnlocked` on the same CFRunLoop thread that hosts the event tap.
- Add `host_locked: bool` to InputCaptureState; the lock callback flips it via blocking_lock and synthesizes `AutoRelease` upward via the event channel if a capture was already in flight.
- Gate the cross-detection branch in event_tap_callback on `!state.host_locked`. The mutex serializes against the callback so events delivered after the lock-state flip see the new value.
Windows:
- Add `Win32_System_RemoteDesktop` to the `windows` crate features for `WTSRegisterSessionNotification` / `WTSUnRegisterSessionNotification`.
- Register the existing message-only window for `NOTIFY_FOR_THIS_SESSION` so it receives `WM_WTSSESSION_CHANGE`.
- Add `HOST_LOCKED: Cell<bool>` thread-local; window_proc updates it on `WTS_SESSION_LOCK` / `WTS_SESSION_UNLOCK` and synthesizes `AutoRelease` via the event channel if a capture was active.
- Gate the cross-detection in `check_client_activation` on `!HOST_LOCKED.get()`.
Linux X11 backend is currently `NotImplemented` so there's nothing to gate; whoever wires up the X11 capture path can add the same check using their preferred lock-state source (D-Bus org.freedesktop.ScreenSaver, xss XScreenSaverQueryInfo, etc).
Known limitation: distributed notifications / WM_WTSSESSION_CHANGE fire only on transitions, so if the daemon starts while the host is already locked, host_locked stays false until the next lock cycle. Acceptable for now since the daemon normally starts before lock.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous attempt to gate cursor crossings while the host's screen is locked tried `CFNotificationCenterGetDistributedCenter` for `com.apple.screenIsLocked` / `Unlocked`. Empirically, the callback never fires when the daemon is non-Cocoa: the distnoted mach port is attached to the main thread's CFRunLoop regardless of which thread called AddObserver, and lan-mouse's main thread runs the GLib main loop instead of a CFRunLoop, so the port is never serviced. A dedicated worker thread with its own CFRunLoop doesn't help (port still attaches to main). `notify_register_check` against the same names is also a dead end — `loginwindow` doesn't post on notify(3) for these keys (verified with `notifyutil -1`). Replace the entire observer machinery with a direct poll of `CGSessionCopyCurrentDictionary["CGSSessionScreenIsLocked"]` on each `MouseMoved` event in the tap callback. ~10-50us per call (XPC to WindowServer); negligible at typical mouse rates. On the unlocked → locked transition, synthesize an `AutoRelease` so the cursor returns to the host. On Sequoia 15+ the key is absent (not `kCFBooleanFalse`) when unlocked — treat missing-or-nil as unlocked. Verified: with macOS as host and Linux as guest, locking the Mac prevents the cursor from crossing to the Linux peer. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Previously polled CGSession on every MouseMoved tap callback (~1000Hz worst case = 1-5% CPU on the XPC). The only forwarding decision that actually consults the lock state is the cross-detection commit point inside `state.crossed(cg_ev)` returning Some — fire-once-per-cross, not fire-once-per-twitch. Move the `is_screen_locked()` call there and drop the per-event polling, the `host_locked` cached field, and the transition-detection logic. Tradeoff: mid-capture lock (cursor on peer when Mac auto-locks via idle timeout) no longer auto-releases the cursor back to the Mac. The user can release-bind (Ctrl+Shift+Cmd+Alt) to bring the cursor back. Acceptable: cursor stuck on peer while screen locked is mildly annoying, not dangerous; auto-lock-during-capture is rare in practice. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Apply rustfmt to the host-lock suppression code (input-capture macOS + Windows event_thread).
The DTLS recv loops in src/listen.rs and src/connect.rs each read one full datagram per call. A failed `try_into::<ProtoEvent>()` means the datagram's leading EventType byte didn't match any known variant — a misalignment is impossible because DTLS is message-framed, not stream-framed. Previously, src/listen.rs would `break` out of the loop on parse failure (tearing down the connection) and src/connect.rs would silently swallow the error with no log. Both are wrong as forward-compat behavior: any future protocol addition (e.g. a new event variant) would force every existing peer to disconnect rather than gracefully ignoring the unknown event. Skip-and-continue on both sides, with a debug-level log so the behavior is observable. Pre-requisite for any future ProtoEvent variant to land without forcing a coordinated upgrade across every peer in a deployment. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
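The skip-and-continue behavior might be sketched as follows. The `Event` enum, its numeric IDs, and `receive_all` are hypothetical; `eprintln!` stands in for the debug-level log mentioned above. The point is only that an unknown leading byte yields an error for that one datagram and the loop carries on.

```rust
#[derive(Debug, PartialEq)]
pub enum Event {
    Ping,
    Pong,
}

#[derive(Debug, PartialEq)]
pub enum ParseError {
    Empty,
    InvalidEventId(u8),
}

/// Hypothetical parser: the first byte selects the variant (IDs are made up).
pub fn parse_event(datagram: &[u8]) -> Result<Event, ParseError> {
    match datagram.first().copied() {
        None => Err(ParseError::Empty),
        Some(1) => Ok(Event::Ping),
        Some(2) => Ok(Event::Pong),
        Some(other) => Err(ParseError::InvalidEventId(other)),
    }
}

/// Processes each datagram independently; a bad one is logged and skipped
/// instead of tearing down the connection.
pub fn receive_all(datagrams: &[Vec<u8>]) -> (Vec<Event>, usize) {
    let mut events = Vec::new();
    let mut skipped = 0;
    for datagram in datagrams {
        match parse_event(datagram) {
            Ok(event) => events.push(event),
            Err(err) => {
                eprintln!("skipping unparseable datagram: {:?}", err);
                skipped += 1;
                continue;
            }
        }
    }
    (events, skipped)
}
```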
Adds a one-shot Hello message to the lan-mouse wire protocol so each
peer can display the other end's build commit hash and warn on
version mismatch. Soft-warn only — mismatched versions never refuse
traffic.
Wire change (lan-mouse-proto)
* `ProtoEvent::Hello { commit: [u8; 8] }` carries the 8-byte ASCII
short commit from shadow_rs's `SHORT_COMMIT`. Encoded/decoded
alongside the existing event variants.
* `EventType::Hello` is appended to the enum so existing IDs are
untouched. Old peers receive the event, hit `InvalidEventId`, and
silently skip it via the forward-compat handler in
`connect.rs::receive_loop` — the connection is unaffected.
Daemon
* Connect side sends one Hello immediately after the DTLS handshake
authenticates and before the ping_pong loop starts. Best-effort,
fire-and-forget — `log::debug!` on send error.
* Listen side mirrors the peer's Hello with its own (same shape as
the existing Ping → Pong reply), so the peer's connect-side
receive_loop populates `ClientState::peer_commit` for that
handle.
* The disconnect path clears `peer_commit` so a stale hash isn't
shown after the connection drops.
IPC
* `ClientState::peer_commit: Option<[u8; 8]>`. `None` means the
peer hasn't sent Hello yet — either fresh connection or older
build that predates the event.
GTK
* `ClientObject` exposes `peer-commit` as an `Option<String>`
property; `peer_commit_to_string` converts the wire `[u8; 8]` to
the displayable hex.
* `lan_mouse_gtk::run` now takes the local commit and stashes it in
a `OnceLock` so per-row UI can compare against each peer's hash.
* `ClientRow::refresh_version_status` re-renders the collapsed
subtitle with Pango markup whenever the property changes:
- matched → green "peer version: <hex> · matched"
- mismatch → orange "peer version: <hex> · ours: <hex>"
- unknown → orange "peer version: unknown · ours: <hex>"
* Window invokes `refresh_version_status` from
`update_client_state` after writing the new property, and
`bind` calls it once on row construction so the initial
subtitle isn't blank.
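A plain-text sketch of the three subtitle states listed above, leaving out the Pango color markup. `version_subtitle` is a hypothetical helper; `peer_commit_to_string` is named in the commit message, but its body here (a lossy UTF-8 conversion of the 8 ASCII bytes) is an assumption.

```rust
/// Convert the wire `[u8; 8]` to a displayable string.
/// Assumption: the bytes are ASCII, so a lossy UTF-8 conversion suffices.
pub fn peer_commit_to_string(commit: &[u8; 8]) -> String {
    String::from_utf8_lossy(commit).into_owned()
}

/// Hypothetical helper producing the collapsed-row subtitle text.
pub fn version_subtitle(local: &str, peer: Option<&[u8; 8]>) -> String {
    match peer {
        // No Hello received yet: fresh connection or an older build.
        None => format!("peer version: unknown · ours: {}", local),
        Some(commit) => {
            let peer_hex = peer_commit_to_string(commit);
            if peer_hex == local {
                format!("peer version: {} · matched", peer_hex)
            } else {
                format!("peer version: {} · ours: {}", peer_hex, local)
            }
        }
    }
}
```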
Known limitation: state-change broadcasts from the network side
(set_alive / set_active_addr / set_peer_commit) don't currently
trigger a `FrontendEvent::State` directly; the UI picks up the
latest values on the next user-driven broadcast. Same pre-existing
behavior as the alive/active_addr fields.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
These are user-visible labels in the version-status subtitle, so sentence-case reads better than the lowercase first-pass. "matched" stays lowercase since it's a status descriptor, not a label. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Previously the Hello handler in `ListenTask` echoed our local commit
back but deliberately threw away the peer's, on the assumption that
the outgoing connect-side path (`connect.rs:278-279` →
`set_peer_commit`) would always populate the visible state for any
bidirectionally-configured peer.
That assumption breaks any time the *outgoing* TCP/DTLS direction is
broken even though the inbound direction is fine — happened just now
when the peer Mac's daemon stopped listening on 4242 (DHCP-renewed
IP, daemon crashed, asymmetric NAT, …). Mac was still happily
connecting in the other direction and sending events, including the
initial Hello, but Linux silently displayed "peer version unknown"
because the listen side dropped Mac's commit on the floor.
Add a `PeerHello { addr, commit }` EmulationEvent variant fired from
the listen-side Hello handler. The service maps `addr → ClientHandle`
via `client_manager.get_client(addr)` and calls `set_peer_commit` +
`broadcast_client` exactly like the connect path does. The connect
path remains the primary source for symmetric setups; this is the
defensive fallback so version visibility doesn't depend on outbound
reachability.
Skips silently when no outgoing client is configured for the peer's
addr (incoming-only setup) — there's no UI row to update in that
case anyway.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Lan Mouse window previously couldn't scroll its preference groups when the window height was reduced below the natural content height — content was simply clipped, with no way to reach the lower groups. AdwStatusPage doesn't include built-in scrolling. Wrap the AdwStatusPage in a GtkScrolledWindow inside the existing AdwToastOverlay, with vertical scroll on demand and horizontal scroll disabled (we use AdwClamp for horizontal sizing). propagate-natural-height keeps the window's preferred size identical when content fits, so existing layout behavior on tall windows is unchanged. Effect: when the user resizes the window shorter than the natural content height (or has a small display), all preference groups remain reachable via vertical scroll. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
hickory_resolver's TokioResolver only consults /etc/resolv.conf and queries upstream DNS servers, which means it can't see /etc/hosts, mDNS (Avahi/Bonjour), NetBIOS, or anything else in the system's full name-resolution stack. On a typical home LAN there's no DNS server that knows about peer machine names, so users had to fall back to typing IP addresses, which broke the moment they moved their setup to a different network.
Swap to tokio::net::lookup_host, which calls getaddrinfo (or GetAddrInfoEx on Windows). That walks /etc/nsswitch.conf on Linux (picking up Avahi-resolved .local names, /etc/hosts, and DNS), uses Bonjour for .local on macOS, and the full Windows resolver on Windows. A Bonjour hostname like "JKMBP-M4-Max.local" now resolves on every modern network without explicit configuration; the user can carry their two machines between LANs and the connection still finds them.
Drop the hickory-resolver dependency entirely; it's no longer needed. ServiceError::Dns also goes away; lookup failures surface as io::Error, which is already covered by ServiceError::Io.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
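For illustration, a small sketch of system-resolver lookup using the standard library's blocking `ToSocketAddrs`, which goes through the same OS name-resolution path as the async `tokio::net::lookup_host` named in the commit. The helper name `resolve_peer` is hypothetical, and this is not the project's actual code.

```rust
use std::io;
use std::net::{SocketAddr, ToSocketAddrs};

/// Resolves `host:port` through the operating system's resolver.
/// IP literals are parsed directly; names go through the system stack
/// (hosts file, mDNS where the OS supports it, DNS).
fn resolve_peer(host: &str, port: u16) -> io::Result<Vec<SocketAddr>> {
    Ok((host, port).to_socket_addrs()?.collect())
}
```

A lookup failure comes back as a plain `io::Error`, which matches the commit's note that a dedicated DNS error variant is no longer needed.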
When a host has two interfaces on the same subnet (e.g. macOS with Wi-Fi en0 and a USB-C dock en7 both on 192.168.1.0/24), a single 0.0.0.0:port DTLS listener silently breaks for peers that dial the non-routed IP: the kernel sources its reply from the routing table's preferred interface, so the reply's src-IP doesn't match the 4-tuple the peer expects, and webrtc-dtls drops the packet.
Replace the single 0.0.0.0 bind with one Listener per local IPv4 address (loopback + link-local skipped). Each listener's reply socket is bound to a specific IP, so the kernel uses *that* IP as source: symmetric replies are guaranteed regardless of the routing table.
A supervisor task watches if-watch (Network.framework on macOS, netlink on Linux) for interface up/down events and adds/drops listener slots dynamically: plugging a dock or toggling Wi-Fi no longer requires a lan-mouse restart. Port-change rebuilds all slots together.
Falls back to a single 0.0.0.0 bind only if interface enumeration or every per-IP bind fails, which preserves single-NIC behavior and ensures we never silently fail to listen.
Removes the previous user-facing workaround of forcing `ips = ["192.168.1.88"]` on the peer; with this change `ips = []` + hostname resolution Just Works on multi-homed hosts.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
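A rough sketch of the bind-planning step described above, assuming the local IPv4 addresses have already been enumerated. `is_listenable` and `plan_binds` are hypothetical names, and the sketch only covers the "nothing usable was enumerated" half of the fallback, not the case where every per-IP bind fails at runtime.

```rust
use std::net::{Ipv4Addr, SocketAddrV4};

/// Loopback and link-local addresses are skipped, as in the commit message.
fn is_listenable(ip: Ipv4Addr) -> bool {
    !ip.is_loopback() && !ip.is_link_local()
}

/// Returns one bind address per usable local IPv4 address, or a single
/// wildcard (0.0.0.0) bind when no usable address is available.
fn plan_binds(local_ips: &[Ipv4Addr], port: u16) -> Vec<SocketAddrV4> {
    let mut binds: Vec<SocketAddrV4> = local_ips
        .iter()
        .cloned()
        .filter(|ip| is_listenable(*ip))
        .map(|ip| SocketAddrV4::new(ip, port))
        .collect();
    if binds.is_empty() {
        // Fallback: keep listening on the wildcard address rather than not at all.
        binds.push(SocketAddrV4::new(Ipv4Addr::UNSPECIFIED, port));
    }
    binds
}
```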
The if-watch supervisor added in 2c7ce2e already had a Down-event handler, but `if_watch` on macOS uses Network.framework, which doesn't reliably fire `IfEvent::Down` when an interface is administratively disabled (e.g. the user toggles Wi-Fi off in System Settings). The Up event is reliable; the Down event is not.
Result: when the user toggled Wi-Fi off mid-session, the Wi-Fi IP's listener slot stayed live in the HashMap, bound to a vanished IP. Harmless in isolation (no traffic can reach an unbound IP), but it defeats 2c7ce2e's "no restart needed when interface state changes" promise: the user has to restart lan-mouse to clean up.
Add a 30-second polling reconciliation arm to the supervisor's select! loop:
- Enumerate currently-present IPv4 addresses (same logic as startup via `enumerate_listenable_ipv4`).
- Diff against the listeners HashMap. Drop slots whose IP is no longer present (catches missed Down events). Add slots for new IPs that appeared without an Up event (defensive, symmetric).
Polling cost is negligible (one `getifaddrs` call per tick) and the 30-second cadence is fast enough that "I just toggled Wi-Fi" feels prompt without spamming.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
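The diff step of the reconciliation could look roughly like this. `reconcile_plan` is a hypothetical helper, the listener type is left generic, and the sorting is only there to make the result deterministic in this sketch.

```rust
use std::collections::{HashMap, HashSet};
use std::net::Ipv4Addr;

/// Compares the live listener slots with the addresses currently present.
/// Returns `(to_drop, to_add)`:
/// - `to_drop`: slots whose IP has vanished (a missed Down event),
/// - `to_add`: present IPs with no slot yet (a missed Up event).
fn reconcile_plan<L>(
    listeners: &HashMap<Ipv4Addr, L>,
    present: &HashSet<Ipv4Addr>,
) -> (Vec<Ipv4Addr>, Vec<Ipv4Addr>) {
    let mut to_drop: Vec<Ipv4Addr> = listeners
        .keys()
        .cloned()
        .filter(|ip| !present.contains(ip))
        .collect();
    let mut to_add: Vec<Ipv4Addr> = present
        .iter()
        .cloned()
        .filter(|ip| !listeners.contains_key(ip))
        .collect();
    to_drop.sort();
    to_add.sort();
    (to_drop, to_add)
}
```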
`if !listeners.contains_key(&ip) { ... insert(ip, ...) }` plus
`#[allow(clippy::map_entry)]` works but is uglier than just using
the `Entry::Vacant` slot up front. The Vacant arm handles both the
existence check and the subsequent insert in a single hash lookup —
which is the exact rewrite clippy was suggesting, just expressed
without forcing `or_insert_with` (which doesn't fit because
`try_bind_listener` is async + fallible).
Brings combined branch in line with the equivalent fix on the
split stack so both express the same behavior in the same shape.
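As a sketch of the `Entry::Vacant` shape this commit describes: the `Listener` struct and the synchronous `try_bind_listener` below are hypothetical stand-ins (the real function is async and fallible), and the loopback rejection exists only to give the sketch a failure path.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;
use std::net::Ipv4Addr;

/// Hypothetical listener handle.
#[derive(Debug)]
struct Listener {
    ip: Ipv4Addr,
}

/// Hypothetical synchronous stand-in for the fallible bind.
fn try_bind_listener(ip: Ipv4Addr) -> Result<Listener, String> {
    if ip.is_loopback() {
        Err(format!("refusing to bind {}", ip))
    } else {
        Ok(Listener { ip })
    }
}

/// Adds a slot for `ip` if none exists. Returns `Ok(true)` when a slot was
/// inserted and `Ok(false)` when one was already present. The existence check
/// and the insert share a single hash lookup via the entry API.
fn add_slot(listeners: &mut HashMap<Ipv4Addr, Listener>, ip: Ipv4Addr) -> Result<bool, String> {
    match listeners.entry(ip) {
        Entry::Occupied(_) => Ok(false),
        Entry::Vacant(slot) => {
            // `or_insert_with` cannot propagate the error, hence the explicit arm.
            slot.insert(try_bind_listener(ip)?);
            Ok(true)
        }
    }
}
```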
Two small cleanups on the reconcile loop:
- The `if listeners.remove(&ip).is_some()` check was redundant: `to_drop` is collected from `listeners.keys()` and we run single-threaded, so `remove()` is guaranteed to return Some.
- `reconcile_tick` (30s) now uses `MissedTickBehavior::Skip`. The default `Burst` would replay backlog ticks back-to-back when resuming from a long suspend (laptop closed for hours), each triggering a redundant interface enumeration. `Skip` collapses the backlog to a single fire on resume.
…ling
Adds Bonjour service registration + browsing under `_lan-mouse._udp.local.`. Each instance's TXT record carries a `primary=<ipv4>` field whose value is the IP of the interface that owns the default route, which on macOS reflects the user's service-order ranking, on Linux the lowest-metric default route, and on Windows the route GetBestRoute2 selects.
The dialer reads peer announcements via a continuous browse and caches `peer_hostname → primary_ipv4` in a `Rc<RefCell<HashMap>>` shared with `LanMouseConnection`. On each connection attempt, `connect_to_handle` looks up the peer's hostname and (when found) hands the resulting `SocketAddr` to `connect_any` as a "preferred" address that gets a 200ms head start over the rest of the candidate list, modeled on RFC 8305 happy-eyeballs. A healthy preferred path virtually always wins; a broken one only delays connect by 200ms before the rest of the IPs join the race.
Subsystem is gated by a new config flag `mdns_discovery` (default true). Toggling off unregisters our service, aborts the browse task, and shuts the daemon, but preserves the `primary_cache` so any already-known hints stay queryable until overwritten on re-enable. Useful on networks where mDNS multicast (224.0.0.251) is firewalled.
GUI exposes the toggle as a new "Network Discovery" preferences group with an mDNS Discovery switch row, mirroring the existing natural-scroll switch's plumbing (block/unblock signal handler when the daemon pushes the value via Sync, etc).
Service-side wiring:
- Service owns Discovery; clones the shared cache into LanMouseConnection on construction.
- A 30-second `tokio::time::interval` calls `Discovery::refresh()` so the TXT record stays accurate when the OS-preferred interface changes (e.g. user toggles Wi-Fi off and Ethernet takes over).
- Port-change events forward through `Discovery::set_port` so the re-published TXT/SRV records reflect the new listen port.
New deps: `mdns-sd = "0.19"` (cross-platform mDNS responder, doesn't piggyback on system Avahi/Bonjour), `netdev = "0.43"` for default-route lookup, `hostname = "0.4"` for the local hostname.
Falls back gracefully when:
- `ServiceDaemon::new` fails (multicast group locked / no perms)
- No interface owns the default route
- Peer isn't announcing (old version or discovery disabled there)
…in all cases the dialer just returns `preferred = None` and the existing connect_any race runs unchanged.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
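To make the head-start behavior concrete, here is a sketch that only computes when each candidate would start dialing. `dial_schedule` and `HEAD_START` are hypothetical names, and dropping a duplicate of the preferred address from the candidate list is an assumption. The actual `connect_any` races DTLS handshakes and is not shown.

```rust
use std::net::SocketAddr;
use std::time::Duration;

/// Head start granted to the preferred address (200ms, per the commit message).
const HEAD_START: Duration = Duration::from_millis(200);

/// Returns `(address, start_delay)` pairs. The preferred address, if any,
/// starts immediately; every other candidate joins after `HEAD_START`.
/// Without a preferred address, all candidates start at once.
fn dial_schedule(
    preferred: Option<SocketAddr>,
    candidates: &[SocketAddr],
) -> Vec<(SocketAddr, Duration)> {
    let mut plan = Vec::new();
    let delay = match preferred {
        Some(p) => {
            plan.push((p, Duration::ZERO));
            HEAD_START
        }
        None => Duration::ZERO,
    };
    for &addr in candidates {
        // Skip the preferred address if it also appears in the candidate list.
        if Some(addr) != preferred {
            plan.push((addr, delay));
        }
    }
    plan
}
```

With `preferred = None` the schedule degenerates to the plain parallel race, which is the fallback the commit describes.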
…re bypass
Previously each call into LanMouseConnection::send spawned a fresh
connect_to_handle, even when the prior attempt had failed because the
peer was unreachable. With ips=[] in the client config and the peer
offline, this produced dozens of attempts per second — every mouse
event near the boundary triggered another DNS lookup and another round
of "client (N) connecting ... (ips: [], preferred: None)" log spam.
Hours of that may have contributed to mDNS state corruption observed
when the peer eventually returned to the LAN.
Gate the spawn at the call site:
- Per-handle RetryState tracks next_attempt_at + a doubling backoff
capped at 30s.
- signature_of(ips, primary_hint) hashes the candidate set; when the
signature changes between attempts (mDNS browse populates a primary,
DNS resolves new IPs) the gate is bypassed and the next send tries
immediately.
- Successful connect drops the retry entry; failure (or empty
candidate set) records a new backoff floor.
Net effect: no retry storm during outages, and a peer reappearing via
mDNS reconnects on the very next mouse event without waiting on the
backoff to expire.
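A simplified sketch of the gate described above. The 1-second initial backoff, the use of `DefaultHasher`, and the exact shape of `RetryState` are assumptions; only the doubling, the 30-second cap, and the signature bypass come from the commit message.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::net::IpAddr;
use std::time::{Duration, Instant};

const INITIAL_BACKOFF: Duration = Duration::from_secs(1); // assumed starting value
const MAX_BACKOFF: Duration = Duration::from_secs(30); // cap from the commit message

/// Hashes the candidate set so a change in IPs or in the mDNS hint is detectable.
/// IPs are sorted first so that ordering alone does not change the signature.
fn signature_of(ips: &[IpAddr], primary_hint: Option<IpAddr>) -> u64 {
    let mut sorted = ips.to_vec();
    sorted.sort();
    let mut hasher = DefaultHasher::new();
    sorted.hash(&mut hasher);
    primary_hint.hash(&mut hasher);
    hasher.finish()
}

/// Per-handle retry bookkeeping.
struct RetryState {
    next_attempt_at: Instant,
    backoff: Duration,
    signature: u64,
}

impl RetryState {
    /// Records a failed attempt: doubles the previous backoff, capped at 30s.
    fn after_failure(prev: Option<&RetryState>, now: Instant, signature: u64) -> RetryState {
        let backoff = match prev {
            Some(p) => (p.backoff * 2).min(MAX_BACKOFF),
            None => INITIAL_BACKOFF,
        };
        RetryState { next_attempt_at: now + backoff, backoff, signature }
    }

    /// True when the backoff has expired or the candidate set has changed.
    fn should_attempt(&self, now: Instant, signature: u64) -> bool {
        signature != self.signature || now >= self.next_attempt_at
    }
}
```

On a successful connect the caller would simply remove the handle's `RetryState`, so the next failure starts again from the initial backoff.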
The dialer's `primary_hints` lookup keys on the configured `hostname`
("JKMBP-M4-Max.local"), but the cache was being populated with the
SRV target hostname returned by `ServiceInfo::get_hostname()`. macOS
will sometimes appear in mDNS-SD with a suffixed system hostname
("JKMBP-M4-Max-2.local") for the SRV record while the service-instance
label keeps the user-visible identifier ("JKMBP-M4-Max.local") — those
two names are advertised together but mdns-sd resolves only one
SRV target into the event, so the cache key drifted to a name the
config never references and `preferred` came back None.
Switch the cache key to the service-instance label, parsed off the
fullname's `.<SERVICE_TYPE>` suffix. The label is what users put in
their config (the announcer derives it from the same `local_hostname()`
on registration) and it's stable across SRV-target variations.
Log line now shows both fields so future hostname/target mismatches
are visible without a packet capture:
mdns: peer instance=jkmbp-m4-max.local (target=jkmbp-m4-max-2.local) ...
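Extracting the service-instance label from a fullname might look like the sketch below. `instance_label` is a hypothetical helper, and the fullname layout (`<label>.<SERVICE_TYPE>`) is assumed from the description above rather than taken from the diff.

```rust
/// Service type used for registration and browsing.
const SERVICE_TYPE: &str = "_lan-mouse._udp.local.";

/// Strips the `.<SERVICE_TYPE>` suffix from an mDNS-SD fullname and returns
/// the service-instance label, or `None` if the name has a different shape.
fn instance_label(fullname: &str) -> Option<&str> {
    let label = fullname.strip_suffix(SERVICE_TYPE)?.strip_suffix('.')?;
    if label.is_empty() {
        None
    } else {
        Some(label)
    }
}
```

Because the label is taken from the fullname rather than the SRV target, a suffixed target such as `jkmbp-m4-max-2.local` never becomes the cache key.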
Discovery now caches by service-instance label, but the announcer's choice of label is platform-dependent: macOS's `hostname::get()` returns the FQDN (`Foo.local`) while Linux's returns the short name (`omarchy`). Without normalization this works asymmetrically: a config of `omarchy.local` for a Linux peer wouldn't match the cached `omarchy` key.
Add `normalize_mdns_name` (lower-case, drop trailing `.`, drop `.local` suffix) and apply it on both insert (start_browse) and lookup (`peer_primary_ip`, `should_attempt`, `connect_to_handle`). The `.local` domain is implied for everything mDNS-SD touches, so collapsing it on both sides is lossless and matches how `dns-sd` and Bonjour APIs treat instance labels in their wire form.
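One possible shape for `normalize_mdns_name`, following the three steps listed in the commit message. The real implementation is not shown in this excerpt, so treat the details (for example ASCII-only lower-casing) as assumptions.

```rust
/// Normalizes an mDNS name for use as a cache key:
/// lower-case, drop a trailing `.`, then drop a `.local` suffix.
fn normalize_mdns_name(name: &str) -> String {
    let lower = name.to_ascii_lowercase();
    let trimmed = lower.strip_suffix('.').unwrap_or(lower.as_str());
    let bare = trimmed.strip_suffix(".local").unwrap_or(trimmed);
    bare.to_string()
}
```

Applying the same function on insert and on lookup is what makes a config entry of `omarchy.local` match an announcement labeled `omarchy`.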
- src/connect.rs: insert blank line in `should_attempt` doc-comment before the "Otherwise returns false" continuation. Clippy's `doc_lazy_continuation` (Rust 1.94+) treats text immediately after a list item without a blank line or indent as continuing the last bullet, which it isn't.
- src/discovery.rs: cargo fmt collapsed a wrapped `let instance = …` line onto one line at column-fit.
No behavior change.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- discovery.rs: drop the unused `peer_primary_ip` method. It was kept as a "canonical lookup entry point" with `#[allow(dead_code)]`, but the dialer reads `primary_cache` directly via the shared `Rc<RefCell>` in `connect.rs` and nothing else calls it.
- discovery.rs: fix the `refresh()` doc: it said "call from the if-watch supervisor" but it's actually driven by the service's periodic tick.
- service.rs: set `MissedTickBehavior::Skip` on `discovery_refresh_tick` (30s). The default `Burst` would replay backlog ticks back-to-back when resuming from a long suspend, each triggering a redundant interface enumeration and TXT republish.
|
Superseded by #433 (same content; the branch was renumbered as part of restructuring the stack from 7 PRs into 9, which auto-closed this cross-fork PR). |
Review-only focused diff (just this PR's commits, vs. `split/03-network`): jondkinney/lan-mouse@split/03-network...split/04-mdns
Summary
mDNS-SD service-order discovery: even with a multi-homed listener, the dialer still has to choose which of the peer's IPs to dial first, and plain hostname resolution returns every interface's IP without ranking. `connect_any`'s parallel race picks whichever DTLS handshake completes first, which is RTT-roughly-correct but not always what the user wanted. The classic symptom: Wi-Fi wins the race even when the user has Ethernet ranked higher in macOS's service order, leading to a stuttery session over Wi-Fi while a healthy wired path sits idle.
Each lan-mouse instance now publishes a `_lan-mouse._udp.local.` Bonjour service whose TXT record carries `primary=<ipv4>`, where `<ipv4>` is the IP of the interface that owns the default route, which on macOS reflects service order, on Linux the lowest-metric default route, and on Windows whatever `GetBestRoute2` selects. The dialer continuously browses the same service type and caches `peer_hostname → primary_ipv4` in a `Rc<RefCell<HashMap>>` shared with `LanMouseConnection`.
`connect_any` extended with happy-eyeballs head-start: if a preferred address is known, dial it alone for 200ms before joining the rest of the candidate list to the race. A healthy preferred path virtually always wins; a broken one only delays connect by 200ms before fallbacks kick in. (Cf. RFC 8305 IPv6→IPv4 fallback delay.)
Subsystem gated by a new `mdns_discovery` config flag (default true) and a corresponding GUI switch under a new "Network Discovery" preferences group. Toggling off unregisters the service, aborts the browse task, and shuts the daemon, but preserves the `primary_cache` so already-known hints stay queryable until overwritten. This is useful on networks where mDNS multicast (224.0.0.251) is firewalled. A 30-second `discovery_refresh_tick` re-publishes the TXT record so it stays accurate when the OS-preferred interface changes (e.g. user toggles Wi-Fi off and Ethernet takes over).
New deps: `mdns-sd` (cross-platform mDNS responder, doesn't piggyback on system Avahi/Bonjour), `netdev` (default-route lookup), `hostname` (local hostname for the service instance name).
Falls back gracefully when `ServiceDaemon::new` fails (multicast group locked / no perms), no interface owns the default route, or the peer isn't announcing (old version or discovery disabled there): the dialer just sees `preferred = None` and the existing `connect_any` race runs unchanged.
Test plan
`_lan-mouse._udp.local.` advertised): dialer sees `preferred = None` and falls back to existing race behavior
Split out from #418, the umbrella PR collecting ~10 independent feature areas. This PR is the mDNS-SD discovery subset and depends on the multi-homed-listener PR. See #418 for the full picture.
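As an illustration of the `primary=<ipv4>` TXT field from the summary, the sketch below pulls the hint out of a list of `key=value` strings. How `mdns-sd` actually exposes TXT properties is not shown in this PR text, so the slice-of-strings input and the name `parse_primary` are assumptions.

```rust
use std::net::Ipv4Addr;

/// Returns the peer's advertised primary IPv4 address, if a well-formed
/// `primary=<ipv4>` entry is present. A missing or malformed entry yields
/// `None`, which corresponds to the `preferred = None` fallback path.
fn parse_primary(txt_entries: &[&str]) -> Option<Ipv4Addr> {
    txt_entries
        .iter()
        .find_map(|entry| entry.strip_prefix("primary=")?.parse().ok())
}
```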
Stack overview
These PRs are split out from #418 and stack in this order:
Each PR's branch builds on the previous one, so until earlier PRs are merged the cumulative diff against `main` includes all preceding work. Reviewing in order is easiest.