-
Notifications
You must be signed in to change notification settings - Fork 194
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[REVIEW] Hiding implementation details for comms #409
Changes from all commits
07644ec
f3e1f09
e038644
3fa9c94
fc132ef
85572d0
b19a4b6
6c74570
14785ff
25b62e6
76edf36
92851b3
8e266ef
9d535fb
7c990b1
3a20c64
83e3aa6
2773edc
3ff1c4d
cbb6f2d
713a872
ceb4d8b
1d3ffab
b48e393
dc076ae
e824384
078e33c
30dafb6
aedc6d7
245ea3d
2b99f1a
bb4b7c7
e798ab1
2d5898d
2cb3c3a
23912f5
fc306f6
61c263a
3b8cdf7
9e80074
a14e034
8b62532
ddfcee9
21a2c88
2bf24cb
41bd51b
4a95f2e
082eea9
ce8a005
c39685a
af46a09
1e9b9bd
7fc712c
a85b085
260a727
bb4a7f3
8c02d2c
ebb7c1d
c1e9c2d
fef8446
52e8635
293302f
61e5b3c
3701daf
60c81a7
55301e7
d20e5ef
1594f7f
1c74204
0bc750b
a3c0370
ea209c1
7093ad5
f613e39
9a5aa77
97a3d95
32bcbb6
b33b46a
b2e0b01
ad90d9b
e31b04c
902657c
d7f0074
c7666ad
8446e2f
4de51ed
9d1bdc5
4f2d14b
ea5e0a4
234c825
93d9a57
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,168 @@ | ||
/* | ||
* Copyright (c) 2021, NVIDIA CORPORATION. | ||
* | ||
* Licensed under the Apache License, Version 2.0 (the "License"); | ||
* you may not use this file except in compliance with the License. | ||
* You may obtain a copy of the License at | ||
* | ||
* http://www.apache.org/licenses/LICENSE-2.0 | ||
* | ||
* Unless required by applicable law or agreed to in writing, software | ||
* distributed under the License is distributed on an "AS IS" BASIS, | ||
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
* See the License for the specific language governing permissions and | ||
* limitations under the License. | ||
*/ | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should we really place a test file under the main include directory? (i.e. include/raft) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @seunghwak, that's a great point and I'm open to suggestions on this one. In short, even though the file contains test functions, I've exposed them through the public API primarily for the hand-off to the cython layer. I guess on one hand it could be argued that the functions are internal to RAFT so maybe they could be treated as implementation details. On the other hand, they are public to the C/C++ API of the comms package so we are trying to make explicit the points which are exposed outside of the Maybe it would be more explicit if we put them in a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
If this is the case, I am not opposed to placing these files in the If the test files are purely for internal testing, I prefer something like |
||
|
||
#pragma once | ||
|
||
#include <raft/comms/comms.hpp> | ||
#include <raft/comms/detail/test.hpp> | ||
|
||
#include <raft/handle.hpp> | ||
|
||
namespace raft { | ||
namespace comms { | ||
|
||
/** | ||
* @brief A simple sanity check that NCCL is able to perform a collective operation | ||
* | ||
* @param[in] handle the raft handle to use. This is expected to already have an | ||
* initialized comms instance. | ||
* @param[in] root the root rank id | ||
*/ | ||
bool test_collective_allreduce(const handle_t& handle, int root) | ||
{ | ||
return detail::test_collective_allreduce(handle, root); | ||
} | ||
|
||
/** | ||
* @brief A simple sanity check that NCCL is able to perform a collective operation | ||
* | ||
* @param[in] handle the raft handle to use. This is expected to already have an | ||
* initialized comms instance. | ||
* @param[in] root the root rank id | ||
*/ | ||
bool test_collective_broadcast(const handle_t& handle, int root) | ||
{ | ||
return detail::test_collective_broadcast(handle, root); | ||
} | ||
|
||
/** | ||
* @brief A simple sanity check that NCCL is able to perform a collective reduce | ||
* | ||
* @param[in] handle the raft handle to use. This is expected to already have an | ||
* initialized comms instance. | ||
* @param[in] root the root rank id | ||
*/ | ||
bool test_collective_reduce(const handle_t& handle, int root) | ||
{ | ||
return detail::test_collective_reduce(handle, root); | ||
} | ||
|
||
/** | ||
* @brief A simple sanity check that NCCL is able to perform a collective allgather | ||
* | ||
* @param[in] handle the raft handle to use. This is expected to already have an | ||
* initialized comms instance. | ||
* @param[in] root the root rank id | ||
*/ | ||
bool test_collective_allgather(const handle_t& handle, int root) | ||
{ | ||
return detail::test_collective_allgather(handle, root); | ||
} | ||
|
||
/** | ||
* @brief A simple sanity check that NCCL is able to perform a collective gather | ||
* | ||
* @param[in] handle the raft handle to use. This is expected to already have an | ||
* initialized comms instance. | ||
* @param[in] root the root rank id | ||
*/ | ||
bool test_collective_gather(const handle_t& handle, int root) | ||
{ | ||
return detail::test_collective_gather(handle, root); | ||
} | ||
|
||
/** | ||
* @brief A simple sanity check that NCCL is able to perform a collective gatherv | ||
* | ||
* @param[in] handle the raft handle to use. This is expected to already have an | ||
* initialized comms instance. | ||
* @param[in] root the root rank id | ||
*/ | ||
bool test_collective_gatherv(const handle_t& handle, int root) | ||
{ | ||
return detail::test_collective_gatherv(handle, root); | ||
} | ||
|
||
/** | ||
* @brief A simple sanity check that NCCL is able to perform a collective reducescatter | ||
* | ||
* @param[in] handle the raft handle to use. This is expected to already have an | ||
* initialized comms instance. | ||
* @param[in] root the root rank id | ||
*/ | ||
bool test_collective_reducescatter(const handle_t& handle, int root) | ||
{ | ||
return detail::test_collective_reducescatter(handle, root); | ||
} | ||
|
||
/** | ||
* A simple sanity check that UCX is able to send messages between all ranks | ||
* | ||
* @param[in] h the raft handle to use. This is expected to already have an | ||
* initialized comms instance. | ||
* @param[in] numTrials number of iterations of all-to-all messaging to perform | ||
*/ | ||
bool test_pointToPoint_simple_send_recv(const handle_t& h, int numTrials) | ||
{ | ||
return detail::test_pointToPoint_simple_send_recv(h, numTrials); | ||
} | ||
|
||
/** | ||
* A simple sanity check that device is able to send OR receive. | ||
* | ||
* @param h the raft handle to use. This is expected to already have an | ||
* initialized comms instance. | ||
* @param numTrials number of iterations of send or receive messaging to perform | ||
*/ | ||
bool test_pointToPoint_device_send_or_recv(const handle_t& h, int numTrials) | ||
{ | ||
return detail::test_pointToPoint_device_send_or_recv(h, numTrials); | ||
} | ||
|
||
/** | ||
* A simple sanity check that device is able to send and receive at the same time. | ||
* | ||
* @param h the raft handle to use. This is expected to already have an | ||
* initialized comms instance. | ||
* @param numTrials number of iterations of send or receive messaging to perform | ||
*/ | ||
bool test_pointToPoint_device_sendrecv(const handle_t& h, int numTrials) | ||
{ | ||
return detail::test_pointToPoint_device_sendrecv(h, numTrials); | ||
} | ||
|
||
/** | ||
* A simple sanity check that device is able to perform multiple concurrent sends and receives. | ||
* | ||
* @param h the raft handle to use. This is expected to already have an | ||
* initialized comms instance. | ||
* @param numTrials number of iterations of send or receive messaging to perform | ||
*/ | ||
bool test_pointToPoint_device_multicast_sendrecv(const handle_t& h, int numTrials) | ||
{ | ||
return detail::test_pointToPoint_device_multicast_sendrecv(h, numTrials); | ||
} | ||
|
||
/** | ||
* A simple test that the comms can be split into 2 separate subcommunicators | ||
* | ||
* @param h the raft handle to use. This is expected to already have an | ||
* initialized comms instance. | ||
* @param n_colors number of different colors to test | ||
*/ | ||
bool test_commsplit(const handle_t& h, int n_colors) { return detail::test_commsplit(h, n_colors); } | ||
} // namespace comms | ||
}; // namespace raft |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did you put additional space here? Or is this clang-format's job?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't add any additional spaces. This must have been clang-format.