Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FFI refactoring to support client-detected network changes #3343

Closed
2 tasks
jamilbk opened this issue Jan 22, 2024 · 7 comments
Closed
2 tasks

FFI refactoring to support client-detected network changes #3343

jamilbk opened this issue Jan 22, 2024 · 7 comments
Assignees
Labels
area/android_client Issues related to the Android client area/apple_client Issues related to the Apple client area/connlib Firezone's core connectivity library area/windows_client Issues related to the Windows client kind/UX Changes related specifically to overall user experience and product quality
Milestone

Comments

@jamilbk
Copy link
Member

jamilbk commented Jan 22, 2024

Chatting with @thomaseizinger to optimize reconnects, we can improve reliability and simplify state management in the clients with some refactoring around how we handle state:

  • Add a new command (in addition to Session.connect and Session.disconnect) the clients can call from the app side: update(dns: [String], internet?: boolean) that is called whenever the OS detects a network interface change
  • Collapse all of the onTunnelReady onSetInterfaceConfig etc states into a single onTunnelStateChange(state: ...) callback that passes the tunnel state, including whether it's "connecting", "reconnecting", or "disconnected".

With the changes above, connlib now manages tunnel state. When it receives a "command" (connect/disconnect/update) it reacts accordingly and then passes the state object back to the app. The app can then update UI and its state accordingly.

  • Session.connect(...): Long
  • Session.update(dns: [String]): Long
  • Session.disconnect(): Long
  • onTunnelStateChange(state: struct): Int?

Scenarios to handle (can happen in any order at any frequency)

  • Peer network partition
  • Portal network partition
  • Client network changes (DNS, interface address, etc)
@jamilbk jamilbk added area/connlib Firezone's core connectivity library kind/UX Changes related specifically to overall user experience and product quality labels Jan 22, 2024
@jamilbk jamilbk changed the title Determine when to get new DNS servers FFI improvements for reconnect Jan 23, 2024
@jamilbk jamilbk changed the title FFI improvements for reconnect FFI improvements for client-detected network changes Jan 23, 2024
@jamilbk jamilbk changed the title FFI improvements for client-detected network changes FFI refactoring to support client-detected network changes Jan 23, 2024
@jamilbk jamilbk added this to the 1.0 GA milestone Jan 23, 2024
@thomaseizinger
Copy link
Member

With the changes above, connlib now manages tunnel state. When it receives a "command" (connect/disconnect/update) it reacts accordingly and then passes the state object back to the app. The app can then update UI and its state accordingly.

Disclaimer: I've not looked at the client code much so take this with a grain of salt 😊

What I'd like to add here is that the clients should code their UI as a pure function of the state returned by connlib very much like MVC with the following "mapping":

  • model == connlib
  • controller == FFI-handle to connlib
  • view == state returned from connlib

This would have the advantage that the clients don't need to keep additional state but can "just" render something based on what connlib returns. Consequences of this design are:

  • connlib becomes more like a kind of daemon or server that runs in the background.
  • connlib will have to "debounce" incoming commands and not be too eager / naive in how it applies them. For example, connlib should take the message from the clients in regards to network changes as mere "hints" because they can be quite spammy. @jamilbk and I tested this on the MacOS client and it flips back and forth between "connected" and "disconnected" as MacOS is switching over the traffic from e.g. Ethernet to WiFi. Tearing down the entire tunnel because of this would not be a good user experience.
  • connlib will have to start sharing more state updates as it is doing things. Instead of just saying "connected" and "disconnected", we should also send updates such as "we have connectivity issues" when we e.g. detect that keep-alives are not longer coming through or when we perform an ICE restart. Otherwise the clients cannot render an appropriate UI.

@thomaseizinger
Copy link
Member

I thought some more about this and what impact it may have. I realized that using this pattern, we can also solve a lot of awkwardness around device initialization on the various platforms:

  • The callbacks for on_add_route and on_remove_route will go away with the design mentioned in this issue. Instead, issuing a state update to the clients will have the list of routes that we want. It is a client's responsibility to "make that happen".
  • With those callbacks removed, the Android client will not be able to return a new file descriptor from those functions. Instead, we can add a new FFI command: set_device_file_descriptor. Similar to disconnect, that is a function that the Android client can call on the Session.
  • Whenever the Android client receives a state update and detects that the list of routes has changed, it can invoke Session::set_device_file_descriptor to set the new FD.

This greatly simplifies the control-flow in connlib because we only need to initialize a new device based on an FD. In fact, that API can be cfg(android)1 because it will only be called from the crate that bridges the Android client code with connlib.

Footnotes

  1. Pseudo-code.

@ReactorScram
Copy link
Collaborator

ReactorScram commented Jan 30, 2024

list of routes does sound nice. Then the only diffing is in the client, or they could blow up the routing and re-do it, as a simpler implementation. Adding 1 route takes less than 1 ms on my laptop

@jamilbk
Copy link
Member Author

jamilbk commented Feb 19, 2024

We can table the major refactoring for later, but the primary point of this issue is to handle client-detected network changes and propagate this to connlib to react appropriately.

@jamilbk
Copy link
Member Author

jamilbk commented Feb 22, 2024

After chatting more with @conectado just now, we devised the minimum set of changes required to get network updates working reliably across all clients:

  • Add a Session.update(dnsServers: [String]) command call. This will only be called during an active connlib Session. This is called when:
    • A client's network settings has changed, AND
    • The new network settings have different DNS servers than the old settings
  • Remove the get_system_default_resolvers() callback on all clients. This is no longer needed with the change above.
  • On the Linux client, where there is no API to receive network change events, we can just poll every 5 seconds for DNS default resolver changes, and issue the Session.update then.
  • Each client should be able to handle multiple onSetInterfaceConfig and then onTunnelReady calls during the lifetime of the Session. These will be called by connlib in response to the Session.update command, similar to how it already responds to Session.connect. This means that the period of time between Session.update and onTunnelReady, the client's UI state should show the tunnel in Connecting or Reconnecting state.

I'll take a stab at knocking out the Apple and Android side of this. @ReactorScram -- when that's done, it can be used as an example implementation to replicate for the Windows and Linux clients.

@conectado will be starting on the connlib side to handle this within a few days.

@jamilbk
Copy link
Member Author

jamilbk commented Mar 13, 2024

Closing due to discussion being settled on 3/12

@jamilbk jamilbk closed this as completed Mar 13, 2024
@jamilbk
Copy link
Member Author

jamilbk commented Mar 13, 2024

#3429

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/android_client Issues related to the Android client area/apple_client Issues related to the Apple client area/connlib Firezone's core connectivity library area/windows_client Issues related to the Windows client kind/UX Changes related specifically to overall user experience and product quality
Projects
None yet
Development

No branches or pull requests

4 participants