Skip to content

Update EndpointConfig interfaces#651

Merged
chhwang merged 3 commits intomainfrom
chhwang/endpoint-config
Oct 22, 2025
Merged

Update EndpointConfig interfaces#651
chhwang merged 3 commits intomainfrom
chhwang/endpoint-config

Conversation

@chhwang
Copy link
Contributor

@chhwang chhwang commented Oct 20, 2025

  • Separate IB-specific options into a nested struct
  • Enable connect() by an Endpoint, not only by EndpointConfig
  • Other minor changes

@Binyang2014 Binyang2014 requested a review from Copilot October 21, 2025 16:49
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR refactors the EndpointConfig interface by consolidating InfiniBand-specific configuration options into a nested Ib struct and enables connection creation directly from an Endpoint object. Key changes include storing the full EndpointConfig in Endpoint::Impl instead of individual fields, improving transport validation logic, and updating Python bindings accordingly.

Key Changes

  • Refactored EndpointConfig to nest IB-specific parameters (maxCqSize, maxCqPollNum, maxSendWr, maxWrPerSend) into an EndpointConfig::Ib struct
  • Modified Endpoint::Impl to store config_ instead of separate transport_, device_, and maxWriteQueueSize_ fields
  • Added Communicator::connect() overload accepting const Endpoint& as the primary method, with EndpointConfig overload as a wrapper

Reviewed Changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
include/mscclpp/core.hpp Refactored EndpointConfig to nest IB options in Ib struct; added Endpoint::config() method; updated Communicator::connect() signature
src/include/endpoint.hpp Changed Endpoint::Impl to store config_ instead of individual transport/device fields
src/endpoint.cc Updated constructor and methods to use config_ member; added deserialization validation
src/context.cc Improved transport validation with unified error message for mismatched transports
src/connection.cc Removed redundant default maxWriteQueueSize initialization
src/communicator.cc Refactored connect() to accept Endpoint directly; removed unnecessary std::move() calls
src/include/communicator.hpp Removed unnecessary std::move() from future initialization
python/mscclpp/core_py.cpp Updated Python bindings to expose nested Ib struct and modified connect() signatures

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Copy link
Contributor

@Binyang2014 Binyang2014 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@chhwang chhwang enabled auto-merge (squash) October 22, 2025 03:04
@chhwang chhwang disabled auto-merge October 22, 2025 17:39
@chhwang chhwang merged commit 200cdf9 into main Oct 22, 2025
14 checks passed
@chhwang chhwang deleted the chhwang/endpoint-config branch October 22, 2025 17:39
chhwang added a commit that referenced this pull request Oct 23, 2025
Copilot AI mentioned this pull request Oct 24, 2025
chhwang added a commit that referenced this pull request Oct 24, 2025
* Python cannot distinguish `Communicator::connect(const Endpoint&,
...)` from `Communicator::connect(const EndpointConfig&, ...)`.
Temporarily removed the former one.
* A few other fixes in Python bindings.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants