Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

prioritise thunderbolt over WIFI #41 #47

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

itsknk
Copy link
Contributor

@itsknk itsknk commented Jul 20, 2024

For issue #41, I figured the solution could be quite simple since it already supports the Thunderbolt discovery and needs to be prioritized.

I've updated the grpc_discovery.py file to get this done.

Changes made:

  • Import netifaces and platform.
  • Create a get_network_interfaces method that prioritizes Thunderbolt interfaces over WiFi on macOS.
  • Make the __init__ method to call get_network_interfaces and store the result.
  • Update task_broadcast_presence to broadcast on all available interfaces, prioritizing Thunderbolt on macOS.
  • Make task_listen_for_peers to listen on all available interfaces.

@AlexCheema
Copy link
Contributor

This is awesome! First proper PR, congrats!

Code looks great, only one very minor thing for consistency across the codebase, instead of platform use psutil. It's the same thing but just more consistent across the codebase.

@itsknk
Copy link
Contributor Author

itsknk commented Jul 20, 2024

Will change.

@AlexCheema
Copy link
Contributor

Also don't we need to add netifaces as a dependency in setup.py?

@itsknk
Copy link
Contributor Author

itsknk commented Jul 20, 2024

My bad! We do! I just installed it locally and forgot about setup.py! Will update the whole thing and give you a better one!


def get_network_interfaces(self):
interfaces = netifaces.interfaces()
if psutil.system() == "Darwin": # macOS
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make this if psutil.MACOS: and remove the comment

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Beat me to it!

@AlexCheema
Copy link
Contributor

Just added an integration test for discovery. Already coming in useful, we'll see if it still passes after your changes.

@itsknk
Copy link
Contributor Author

itsknk commented Jul 20, 2024

I guess setting up netifaces dependency failed?!

@itsknk itsknk force-pushed the main branch 3 times, most recently from f554ae0 to 0222b2e Compare July 20, 2024 14:40
@AlexCheema
Copy link
Contributor

AlexCheema commented Jul 20, 2024

I just quickly tested this myself locally. For me, the interface name does not begin with thunderbolt.

# before thunderbolt connected
>>> netifaces.interfaces()
['lo0', 'gif0', 'stf0', 'anpi0', 'anpi1', 'anpi2', 'en4', 'en5', 'en6', 'en1', 'en2', 'en3', 'bridge0', 'ap1', 'en0', 'awdl0', 'llw0', 'utun0', 'utun1', 'utun2', 'utun3', 'utun4', 'utun5', 'utun6', 'utun7', 'utun8', 'utun9']
# after thunderbolt connected
>>> netifaces.interfaces()
['lo0', 'gif0', 'stf0', 'anpi0', 'anpi1', 'anpi2', 'en4', 'en5', 'en6', 'en1', 'en2', 'en3', 'bridge0', 'ap1', 'en0', 'awdl0', 'llw0', 'utun0', 'utun1', 'utun2', 'utun3', 'utun4', 'utun5', 'utun6', 'utun7', 'utun8', 'utun9', 'en11', 'en16']

As you can see, en11 and en16 were added after connecting thunderbolt.

@AlexCheema
Copy link
Contributor

Eventually we should support automatically upgrading to the best interface available (in this case thunderbolt) - see #51. We can implement this later though, for now this is great.

@itsknk
Copy link
Contributor Author

itsknk commented Jul 20, 2024

Eventually we should support automatically upgrading to the best interface available (in this case thunderbolt) - see #51. We can implement this later though, for now this is great.

This makes better sense, @AlexCheema.

@AlexCheema
Copy link
Contributor

Just to clarify, this still needs to be fixed / tested: #47 (comment)

@itsknk
Copy link
Contributor Author

itsknk commented Jul 21, 2024

Update:

  • Look for Thunderbolt in both the hardware and type fields, as Thunderbolt interfaces might be identified in either.
  • Identify WiFi interfaces by checking if the type is 'wi-fi' (case-insensitive).
  • Categories all other interfaces as 'other'.

@AlexCheema, I wrote a test script and it returned:

Screenshot 2024-07-20 at 11 42 16 PM

Please review this.

print(f"Other interfaces: {other_interfaces}")

# Prioritize Thunderbolt, then WiFi, then others
return thunderbolt_interfaces + wifi_interfaces + other_interfaces
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Beautiful code. love this

@AlexCheema
Copy link
Contributor

Have you got a thunderbolt cable to test this with? I don't think this works in its current state. This is what system_profiler SPNetworkDataType -json gives me for my MacBook Pro's Thunderbolt interface:

{
      "_name": "Thunderbolt Bridge",
      "Ethernet": {
        "MediaOptions": [],
        "MediaSubType": "autoselect"
      },
      "hardware": "Ethernet",
      "interface": "bridge0",
      "ip_address": [
        "169.254.13.209"
      ],
      "IPv4": {
        "AdditionalRoutes": [
          {
            "DestinationAddress": "169.254.13.209",
            "SubnetMask": "255.255.255.255"
          }
        ],
        "Addresses": [
          "169.254.13.209"
        ],
        "ConfigMethod": "DHCP",
        "ConfirmedInterfaceName": "bridge0",
        "InterfaceName": "bridge0",
        "SubnetMasks": [
          "255.255.0.0"
        ]
      },
      "IPv6": {
        "ConfigMethod": "Automatic"
      },
      "Proxies": {
        "ExceptionsList": [
          "*.local",
          "169.254/16"
        ],
        "FTPPassive": "yes"
      },
      "spnetwork_service_order": 4,
      "type": "Ethernet"
    }

@AlexCheema
Copy link
Contributor

I had another look and generally I don't think this is doing what we want it to do.
I'm not even sure the behaviour is different to before the changes. We're broadcasting on all interfaces (was already the case as far as I can tell) and listening on all interfaces (was already the case).

I think what we need to do here:

  • In the broadcast message, add a priority field which indicates the device's preference to receive connections on this particular interface
  • the listener should keep track of the current priority for each peer (can be in the same dict as known_peers)
  • if the listener receives a broadcast with a higher priority than it currently has for a connected peer, it should disconnect on the old interface and reconnect on the new interface

@itsknk
Copy link
Contributor Author

itsknk commented Jul 21, 2024

Did:

  • A priority field is added to the broadcast message.
  • Modified known_peers to include the priority info.
  • Implemented reconnection logic in on_listen_message when a higher priority interface is detected.
  • get_network_interfaces returns interfaces with their priorities.
  • A reconnect_peer method is added to handle the reconnection.

@AlexCheema, wrote a small test script that simulates receiving a broadcast from a peer. And to see whether it keeps track of priority and change when a better one comes. Here is the output from the test:

Screenshot 2024-07-21 at 4 09 55 PM

@AlexCheema
Copy link
Contributor

AlexCheema commented Jul 21, 2024

Did:

  • A priority field is added to the broadcast message.
  • Modified known_peers to include the priority info.
  • Implemented reconnection logic in on_listen_message when a higher priority interface is detected.
  • get_network_interfaces returns interfaces with their priorities.
  • A reconnect_peer method is added to handle the reconnection.

@AlexCheema, wrote a small test script that simulates receiving a broadcast from a peer. And to see whether it keeps track of priority and change when a better one comes. Here is the output from the test:

Screenshot 2024-07-21 at 4 09 55 PM

@itsknk can you commit the test too? and add some assertions so it's a proper test would be great

@itsknk
Copy link
Contributor Author

itsknk commented Jul 21, 2024

Did:

  • A priority field is added to the broadcast message.
  • Modified known_peers to include the priority info.
  • Implemented reconnection logic in on_listen_message when a higher priority interface is detected.
  • get_network_interfaces returns interfaces with their priorities.
  • A reconnect_peer method is added to handle the reconnection.

@AlexCheema, wrote a small test script that simulates receiving a broadcast from a peer. And to see whether it keeps track of priority and change when a better one comes. Here is the output from the test:
Screenshot 2024-07-21 at 4 09 55 PM

@itsknk can you commit the test too? and add some assertions so it's a proper test would be great

Done @AlexCheema

@@ -30,11 +32,42 @@ def __init__(self, node_id: str, node_port: int, listen_port: int, broadcast_por
self.listen_port = listen_port
self.broadcast_port = broadcast_port if broadcast_port is not None else listen_port
self.broadcast_interval = broadcast_interval
self.known_peers: Dict[str, Tuple[GRPCPeerHandle, float, float]] = {}
self.known_peers: Dict[str, Tuple[GRPCPeerHandle, float, float, int]] = {} # Added priority
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is there a comment here?

if not interface_name:
continue

priority = self.get_interface_priority(interface_name, interface_type, hardware)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a pure function. Move it outside of the class

import json
import psutil
import time

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once you move this off the class, get rid of the duplicate definition here

self.broadcast_task = None
self.listen_task = None
self.cleanup_task = None

def get_network_interfaces(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, this is a pure function so move it outside of the class

add netifaces dependency

use psutil instead of platform

use psutil instead of platform

fix typo

prioritise thunderbolt over WIFI

optimize network discovery and dynamic interface handling

optimize network discovery and dynamic interface handling

implement network interface detection and prioritization

implement network interface detection and prioritization

prioritize interfaces from device preference

prioritize interfaces from device preference

implemented assertions

implemented assertions

Implement pure functions outside the class. Write better test

prioritize interfaces from device preference
from typing import List, Dict, Callable, Tuple, Coroutine
from ..discovery import Discovery
from ..peer_handle import PeerHandle
from .grpc_peer_handle import GRPCPeerHandle
from exo.topology.device_capabilities import DeviceCapabilities, device_capabilities, UNKNOWN_DEVICE_CAPABILITIES
from exo import DEBUG_DISCOVERY

def get_interface_priority(interface_name, interface_type='', hardware=''):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

types please

else:
return 0

def get_network_interfaces():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return type

print(f"Error in broadcast presence: {e}")
import traceback
print(traceback.format_exc())
if DEBUG_DISCOVERY >= 2: print(f"Error updating or accessing interfaces: {e}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why did the message change?

if peer_id in self.known_peers:
_, _, _, current_priority = self.known_peers[peer_id]
if new_priority > current_priority:
await self.reconnect_peer(peer_id, peer_host, peer_port, new_priority, device_capabilities)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

discovery is not responsible for connecting to peers, it just discovers them. connecting should happen in the standard_node.py

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this requires a bit of a refactor to do properly, I'm focusing on linux fixes right now but if you have any questions async, leave a comment on this PR and tag me

@itsknk
Copy link
Contributor Author

itsknk commented Jul 22, 2024

Hey @AlexCheema, yeah it does look a little complex. After some understanding, I thought of some steps. Please do verify these, and let me know if I'm on the right track.

  • Let GRPCDiscovery class to handle discovering peers and maintaining their information, without any connection logic.
  • The StandardNode class will handle the connection logic, including priority-based reconnections.
  • The update_peers method in StandardNode can check for new peers or higher priority connections and manage the connections accordingly.
  • The periodic topology collection remains, ensuring the node regularly updates its peer connections based on the latest discovery information.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants