Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support libp2p relays for NAT traversal #186

Merged
merged 11 commits into from
Jan 9, 2023
Merged

Support libp2p relays for NAT traversal #186

merged 11 commits into from
Jan 9, 2023

Conversation

Vahe1994
Copy link
Collaborator

@Vahe1994 Vahe1994 commented Jan 8, 2023

  • Added relay options to servers
  • Enabled relay options by default
  • Changed hivemind version to 1.1.5
  • Moved reachability check to be performed after blocks are loaded

2. Enabled relay options by default
3. Changed hivemind version to 1.1.5
@Vahe1994 Vahe1994 marked this pull request as ready for review January 8, 2023 11:45
@Vahe1994 Vahe1994 requested a review from mryab January 8, 2023 11:47
@@ -127,7 +127,10 @@ def main():
parser.add_argument("--mean_balance_check_period", type=float, default=60,
help="Check the swarm's balance every N seconds (and rebalance it if necessary)")

parser.add_argument("--auto_relay", action='store_true', help="Enabling relay for NAT traversal")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
parser.add_argument("--auto_relay", action='store_true', help="Enabling relay for NAT traversal")
parser.add_argument("--auto_relay", action='store_true', help="Enable relay for NAT traversal")

@@ -78,6 +78,8 @@ def __init__(
load_in_8bit: Optional[bool] = None,
tensor_parallel_devices: Optional[Sequence[torch.device]] = None,
skip_reachability_check: bool = False,
use_relay: bool = True,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's best to either remove default argument values of to remove these arguments completely: we might forget to change defaults here in the future, and required values will be passed to kwargs anyway

Copy link
Collaborator

@borzunov borzunov Jan 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see your point but I think we should keep it, since the convention in Petals is that all Server defaults match to the defaults of run_server.py (in turn, the hivemind default for use_auto_relay is different).

But all you said would have applied if the defaults here matched with hivemind.

Copy link
Collaborator Author

@Vahe1994 Vahe1994 Jan 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will agree with @borzunov here. Here are my arguments:

  1. use_relay will not be passed from run_server and we want it by default to be True
  2. it is nice to see in the arguments all parameters that matters
  3. usually , it is not a good idea to be dependent on default argument from another library . They could be changed without notice and can lead to strange behavior

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see your point about explicitly indicating arguments for creation of Server, though it somewhat contradicts the existence of **kwargs in init. My primary concern is that we should strive to have consistent defaults across different locations: one way to do this in an error-proof way would be to declare a common constant with the default value and use it in both locations. Besides, petals-cli is a part of Petals, so these files belong to the same library

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We would need to create constants for all defaults in this case (tens of them). I think this is a more general problem that should be addressed outside of this PR (maybe we should use smth like reflection).

@borzunov borzunov changed the title Support circuit relay v2 Support libp2p relays for NAT traversal Jan 8, 2023
Copy link
Collaborator

@borzunov borzunov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR!

I have questions about relays, I need to talk to @justheuristic before we ship this code.

logger = get_logger(__file__)


def check_reachability(peer_id, wait_time: float = 7 * 60, retry_delay: float = 15) -> None:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved from server.py

Copy link
Collaborator

@borzunov borzunov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All urgent issues have been resolved.

@borzunov borzunov merged commit 93bed7d into main Jan 9, 2023
@borzunov borzunov deleted the relay_auto branch January 9, 2023 16:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants