-
Notifications
You must be signed in to change notification settings - Fork 10.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[libc] Export the RPC interface from libc
#71432
base: main
Are you sure you want to change the base?
Conversation
@llvm/pr-subscribers-libc Author: Joseph Huber (jhuber6) ChangesSummary: Obviously, this approach is far less safe than the carefully crafted C++ I was unsure whether or not to use function pointers to give a This somewhat stretches the concept of Patch is 21.41 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/71432.diff 20 Files Affected:
diff --git a/libc/config/gpu/entrypoints.txt b/libc/config/gpu/entrypoints.txt
index 5dda6aa71d36f81..47bcb36db49d55b 100644
--- a/libc/config/gpu/entrypoints.txt
+++ b/libc/config/gpu/entrypoints.txt
@@ -139,6 +139,11 @@ set(TARGET_LIBC_ENTRYPOINTS
# gpu/rpc.h entrypoints
libc.src.gpu.rpc_host_call
+ libc.src.gpu.rpc_open_port
+ libc.src.gpu.rpc_get_buffer
+ libc.src.gpu.rpc_send
+ libc.src.gpu.rpc_recv
+ libc.src.gpu.rpc_close_port
)
set(TARGET_LIBM_ENTRYPOINTS
diff --git a/libc/include/CMakeLists.txt b/libc/include/CMakeLists.txt
index 9d170603ffa45cd..0f044ff84f28983 100644
--- a/libc/include/CMakeLists.txt
+++ b/libc/include/CMakeLists.txt
@@ -515,6 +515,7 @@ if(LIBC_TARGET_ARCHITECTURE_IS_GPU)
DEPENDS
.llvm_libc_common_h
.llvm-libc-types.rpc_opcodes_t
+ .llvm-libc-types.rpc_port_t
)
endif()
diff --git a/libc/include/gpu/rpc.h.def b/libc/include/gpu/rpc.h.def
index 0438cd65e7be22e..f178007ee7083f5 100644
--- a/libc/include/gpu/rpc.h.def
+++ b/libc/include/gpu/rpc.h.def
@@ -11,7 +11,10 @@
#include <__llvm-libc-common.h>
+#include <llvm-libc-types/size_t.h>
+
#include <llvm-libc-types/rpc_opcodes_t.h>
+#include <llvm-libc-types/rpc_port_t.h>
%%public_api()
diff --git a/libc/include/llvm-libc-types/CMakeLists.txt b/libc/include/llvm-libc-types/CMakeLists.txt
index 3c0cc7bbc71dacb..8e0bf6a50c22bdd 100644
--- a/libc/include/llvm-libc-types/CMakeLists.txt
+++ b/libc/include/llvm-libc-types/CMakeLists.txt
@@ -91,3 +91,4 @@ add_header(wint_t HDR wint_t.h)
add_header(sa_family_t HDR sa_family_t.h)
add_header(struct_sockaddr HDR struct_sockaddr.h)
add_header(rpc_opcodes_t HDR rpc_opcodes_t.h)
+add_header(rpc_port_t HDR rpc_port_t.h)
diff --git a/libc/include/llvm-libc-types/rpc_port_t.h b/libc/include/llvm-libc-types/rpc_port_t.h
new file mode 100644
index 000000000000000..b0e6669a9f88a53
--- /dev/null
+++ b/libc/include/llvm-libc-types/rpc_port_t.h
@@ -0,0 +1,21 @@
+//===-- Definition of type rpc_port_t -------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef __LLVM_LIBC_TYPES_RPC_PORT_T_H__
+#define __LLVM_LIBC_TYPES_RPC_PORT_T_H__
+
+typedef struct {
+ __UINT8_TYPE__ reserved[32];
+} rpc_port_t;
+
+typedef struct {
+ __UINT64_TYPE__ data[8];
+} rpc_buffer_t;
+
+
+#endif // __LLVM_LIBC_TYPES_RPC_PORT_T_H__
diff --git a/libc/spec/gpu_ext.td b/libc/spec/gpu_ext.td
index dce81ff7786203b..56094d4d50d3d16 100644
--- a/libc/spec/gpu_ext.td
+++ b/libc/spec/gpu_ext.td
@@ -1,8 +1,13 @@
+def RPCPortT : NamedType<"rpc_port_t">;
+def RPCPortPtrT : PtrType<RPCPortT>;
+def RPCBufferT : NamedType<"rpc_buffer_t">;
+def RPCBufferPtrT : PtrType<RPCBufferT>;
+
def GPUExtensions : StandardSpec<"GPUExtensions"> {
HeaderSpec RPC = HeaderSpec<
"gpu/rpc.h",
[], // Macros
- [], // Types
+ [RPCPortT, RPCBufferT, SizeTType], // Types
[], // Enumerations
[
FunctionSpec<
@@ -10,6 +15,31 @@ def GPUExtensions : StandardSpec<"GPUExtensions"> {
RetValSpec<VoidType>,
[ArgSpec<VoidPtr>, ArgSpec<VoidPtr>, ArgSpec<SizeTType>]
>,
+ FunctionSpec<
+ "rpc_open_port",
+ RetValSpec<RPCPortT>,
+ [ArgSpec<UnsignedIntType>]
+ >,
+ FunctionSpec<
+ "rpc_get_buffer",
+ RetValSpec<RPCBufferPtrT>,
+ [ArgSpec<RPCPortPtrT>]
+ >,
+ FunctionSpec<
+ "rpc_send",
+ RetValSpec<VoidType>,
+ [ArgSpec<RPCPortPtrT>]
+ >,
+ FunctionSpec<
+ "rpc_recv",
+ RetValSpec<VoidType>,
+ [ArgSpec<RPCPortPtrT>]
+ >,
+ FunctionSpec<
+ "rpc_close_port",
+ RetValSpec<VoidType>,
+ [ArgSpec<RPCPortPtrT>]
+ >,
]
>;
let Headers = [
diff --git a/libc/src/__support/RPC/rpc.h b/libc/src/__support/RPC/rpc.h
index 08c1dfd10d6d7f3..66c9fb297767780 100644
--- a/libc/src/__support/RPC/rpc.h
+++ b/libc/src/__support/RPC/rpc.h
@@ -319,6 +319,14 @@ template <bool T, typename S> struct Port {
LIBC_INLINE uint16_t get_index() const { return index; }
+ /// Waits until this port owns the buffer and returns a pointer to it.
+ LIBC_INLINE Buffer *get_buffer() const {
+ uint32_t in = owns_buffer ? out ^ T : process.load_inbox(lane_mask, index);
+ process.wait_for_ownership(lane_mask, index, out, in);
+
+ return &process.packet[index].payload.slot[gpu::get_lane_id()];
+ }
+
LIBC_INLINE void close() {
// The server is passive, if it own the buffer when it closes we need to
// give ownership back to the client.
@@ -347,7 +355,9 @@ struct Client {
: process(port_count, buffer) {}
using Port = rpc::Port<false, Packet<gpu::LANE_SIZE>>;
- template <uint16_t opcode> LIBC_INLINE Port open();
+
+ template <uint16_t opcode> LIBC_INLINE Port open() { return open(opcode); }
+ LIBC_INLINE Port open(const uint16_t opcode);
private:
Process<false, Packet<gpu::LANE_SIZE>> process;
@@ -510,8 +520,10 @@ LIBC_INLINE void Port<T, S>::recv_n(void **dst, uint64_t *size, A &&alloc) {
/// only open a port if we find an index that is in a valid sending state. That
/// is, there are send operations pending that haven't been serviced on this
/// port. Each port instance uses an associated \p opcode to tell the server
-/// what to do.
-template <uint16_t opcode> LIBC_INLINE Client::Port Client::open() {
+/// what to do. It is very important that the \p opcode value is identical
+/// across each lane. Use the template version of this unless absolutely
+/// necessary.
+LIBC_INLINE Client::Port Client::open(const uint16_t opcode) {
// Repeatedly perform a naive linear scan for a port that can be opened to
// send data.
for (uint32_t index = 0;; ++index) {
diff --git a/libc/src/gpu/CMakeLists.txt b/libc/src/gpu/CMakeLists.txt
index e20228516b5112d..822b7ba2f89fe62 100644
--- a/libc/src/gpu/CMakeLists.txt
+++ b/libc/src/gpu/CMakeLists.txt
@@ -8,3 +8,58 @@ add_entrypoint_object(
libc.src.__support.RPC.rpc_client
libc.src.__support.GPU.utils
)
+
+add_entrypoint_object(
+ rpc_open_port
+ SRCS
+ rpc_open_port.cpp
+ HDRS
+ rpc_open_port.h
+ DEPENDS
+ libc.src.__support.RPC.rpc_client
+ libc.src.__support.GPU.utils
+)
+
+add_entrypoint_object(
+ rpc_close_port
+ SRCS
+ rpc_close_port.cpp
+ HDRS
+ rpc_close_port.h
+ DEPENDS
+ libc.src.__support.RPC.rpc_client
+ libc.src.__support.GPU.utils
+)
+
+add_entrypoint_object(
+ rpc_get_buffer
+ SRCS
+ rpc_get_buffer.cpp
+ HDRS
+ rpc_get_buffer.h
+ DEPENDS
+ libc.src.__support.RPC.rpc_client
+ libc.src.__support.GPU.utils
+)
+
+add_entrypoint_object(
+ rpc_send
+ SRCS
+ rpc_send.cpp
+ HDRS
+ rpc_send.h
+ DEPENDS
+ libc.src.__support.RPC.rpc_client
+ libc.src.__support.GPU.utils
+)
+
+add_entrypoint_object(
+ rpc_recv
+ SRCS
+ rpc_recv.cpp
+ HDRS
+ rpc_recv.h
+ DEPENDS
+ libc.src.__support.RPC.rpc_client
+ libc.src.__support.GPU.utils
+)
diff --git a/libc/src/gpu/rpc_close_port.cpp b/libc/src/gpu/rpc_close_port.cpp
new file mode 100644
index 000000000000000..f4d9f4b194a4b7b
--- /dev/null
+++ b/libc/src/gpu/rpc_close_port.cpp
@@ -0,0 +1,25 @@
+//===---------- GPU implementation of the external RPC port interface -----===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "src/gpu/rpc_close_port.h"
+
+#include "src/__support/GPU/utils.h"
+#include "src/__support/RPC/rpc_client.h"
+#include "src/__support/common.h"
+
+namespace LIBC_NAMESPACE {
+
+static_assert(sizeof(rpc_port_t) == sizeof(rpc::Client::Port), "ABI mismatch");
+static_assert(sizeof(rpc_buffer_t) == sizeof(rpc::Buffer), "ABI mismatch");
+
+LLVM_LIBC_FUNCTION(void, rpc_close_port, (rpc_port_t * handle)) {
+ rpc::Client::Port *port = reinterpret_cast<rpc::Client::Port *>(handle);
+ port->close();
+}
+
+} // namespace LIBC_NAMESPACE
diff --git a/libc/src/gpu/rpc_close_port.h b/libc/src/gpu/rpc_close_port.h
new file mode 100644
index 000000000000000..2be39ce3698a384
--- /dev/null
+++ b/libc/src/gpu/rpc_close_port.h
@@ -0,0 +1,20 @@
+//===-- Implementation header for RPC functions -----------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_LIBC_SRC_GPU_RPC_CLOSE_PORT_H
+#define LLVM_LIBC_SRC_GPU_RPC_CLOSE_PORT_H
+
+#include <gpu/rpc.h>
+
+namespace LIBC_NAMESPACE {
+
+void rpc_close_port(rpc_port_t *handle);
+
+} // namespace LIBC_NAMESPACE
+
+#endif // LLVM_LIBC_SRC_GPU_RPC_CLOSE_PORT_H
diff --git a/libc/src/gpu/rpc_get_buffer.cpp b/libc/src/gpu/rpc_get_buffer.cpp
new file mode 100644
index 000000000000000..10097b404c81c46
--- /dev/null
+++ b/libc/src/gpu/rpc_get_buffer.cpp
@@ -0,0 +1,25 @@
+//===---------- GPU implementation of the external RPC port interface -----===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "src/gpu/rpc_get_buffer.h"
+
+#include "src/__support/GPU/utils.h"
+#include "src/__support/RPC/rpc_client.h"
+#include "src/__support/common.h"
+
+namespace LIBC_NAMESPACE {
+
+static_assert(sizeof(rpc_port_t) == sizeof(rpc::Client::Port), "ABI mismatch");
+static_assert(sizeof(rpc_buffer_t) == sizeof(rpc::Buffer), "ABI mismatch");
+
+LLVM_LIBC_FUNCTION(rpc_buffer_t *, rpc_get_buffer, (rpc_port_t * handle)) {
+ rpc::Client::Port *port = reinterpret_cast<rpc::Client::Port *>(handle);
+ return reinterpret_cast<rpc_buffer_t *>(port->get_buffer());
+}
+
+} // namespace LIBC_NAMESPACE
diff --git a/libc/src/gpu/rpc_get_buffer.h b/libc/src/gpu/rpc_get_buffer.h
new file mode 100644
index 000000000000000..5a07db1a5470085
--- /dev/null
+++ b/libc/src/gpu/rpc_get_buffer.h
@@ -0,0 +1,20 @@
+//===-- Implementation header for RPC functions -----------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_LIBC_SRC_GPU_RPC_GET_BUFFER_H
+#define LLVM_LIBC_SRC_GPU_RPC_GET_BUFFER_H
+
+#include <gpu/rpc.h>
+
+namespace LIBC_NAMESPACE {
+
+rpc_buffer_t *rpc_get_buffer(rpc_port_t *handle);
+
+} // namespace LIBC_NAMESPACE
+
+#endif // LLVM_LIBC_SRC_GPU_RPC_GET_BUFFER_H
diff --git a/libc/src/gpu/rpc_open_port.cpp b/libc/src/gpu/rpc_open_port.cpp
new file mode 100644
index 000000000000000..73dab5066fa1bfb
--- /dev/null
+++ b/libc/src/gpu/rpc_open_port.cpp
@@ -0,0 +1,26 @@
+//===---------- GPU implementation of the external RPC port interface -----===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "src/gpu/rpc_open_port.h"
+
+#include "src/__support/CPP/bit.h"
+#include "src/__support/GPU/utils.h"
+#include "src/__support/RPC/rpc_client.h"
+#include "src/__support/common.h"
+
+namespace LIBC_NAMESPACE {
+
+static_assert(sizeof(rpc_port_t) == sizeof(rpc::Client::Port), "ABI mismatch");
+static_assert(sizeof(rpc_buffer_t) == sizeof(rpc::Buffer), "ABI mismatch");
+
+LLVM_LIBC_FUNCTION(rpc_port_t, rpc_open_port, (unsigned opcode)) {
+ rpc::Client::Port port = rpc::client.open(static_cast<uint16_t>(opcode));
+ return cpp::bit_cast<rpc_port_t>(port);
+}
+
+} // namespace LIBC_NAMESPACE
diff --git a/libc/src/gpu/rpc_open_port.h b/libc/src/gpu/rpc_open_port.h
new file mode 100644
index 000000000000000..553fa59120aaf68
--- /dev/null
+++ b/libc/src/gpu/rpc_open_port.h
@@ -0,0 +1,20 @@
+//===-- Implementation header for RPC functions -----------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_LIBC_SRC_GPU_RPC_OPEN_PORT_H
+#define LLVM_LIBC_SRC_GPU_RPC_OPEN_PORT_H
+
+#include <gpu/rpc.h>
+
+namespace LIBC_NAMESPACE {
+
+rpc_port_t rpc_open_port(unsigned opcode);
+
+} // namespace LIBC_NAMESPACE
+
+#endif // LLVM_LIBC_SRC_GPU_RPC_OPEN_PORT_H
diff --git a/libc/src/gpu/rpc_recv.cpp b/libc/src/gpu/rpc_recv.cpp
new file mode 100644
index 000000000000000..96086dc2b17a345
--- /dev/null
+++ b/libc/src/gpu/rpc_recv.cpp
@@ -0,0 +1,25 @@
+//===---------- GPU implementation of the external RPC port interface -----===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "src/gpu/rpc_recv.h"
+
+#include "src/__support/GPU/utils.h"
+#include "src/__support/RPC/rpc_client.h"
+#include "src/__support/common.h"
+
+namespace LIBC_NAMESPACE {
+
+static_assert(sizeof(rpc_port_t) == sizeof(rpc::Client::Port), "ABI mismatch");
+static_assert(sizeof(rpc_buffer_t) == sizeof(rpc::Buffer), "ABI mismatch");
+
+LLVM_LIBC_FUNCTION(void, rpc_recv, (rpc_port_t * handle)) {
+ rpc::Client::Port *port = reinterpret_cast<rpc::Client::Port *>(handle);
+ port->recv([](rpc::Buffer *) { /* Filled by rpc_get_buffer() */ });
+}
+
+} // namespace LIBC_NAMESPACE
diff --git a/libc/src/gpu/rpc_recv.h b/libc/src/gpu/rpc_recv.h
new file mode 100644
index 000000000000000..926565c3b0c16a0
--- /dev/null
+++ b/libc/src/gpu/rpc_recv.h
@@ -0,0 +1,20 @@
+//===-- Implementation header for RPC functions -----------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_LIBC_SRC_GPU_RPC_RECV_H
+#define LLVM_LIBC_SRC_GPU_RPC_RECV_H
+
+#include <gpu/rpc.h>
+
+namespace LIBC_NAMESPACE {
+
+void rpc_recv(rpc_port_t *handle);
+
+} // namespace LIBC_NAMESPACE
+
+#endif // LLVM_LIBC_SRC_GPU_RPC_RECV_H
diff --git a/libc/src/gpu/rpc_send.cpp b/libc/src/gpu/rpc_send.cpp
new file mode 100644
index 000000000000000..4f4f5bcc2a5eb80
--- /dev/null
+++ b/libc/src/gpu/rpc_send.cpp
@@ -0,0 +1,25 @@
+//===---------- GPU implementation of the external RPC port interface -----===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "src/gpu/rpc_send.h"
+
+#include "src/__support/GPU/utils.h"
+#include "src/__support/RPC/rpc_client.h"
+#include "src/__support/common.h"
+
+namespace LIBC_NAMESPACE {
+
+static_assert(sizeof(rpc_port_t) == sizeof(rpc::Client::Port), "ABI mismatch");
+static_assert(sizeof(rpc_buffer_t) == sizeof(rpc::Buffer), "ABI mismatch");
+
+LLVM_LIBC_FUNCTION(void, rpc_send, (rpc_port_t * handle)) {
+ rpc::Client::Port *port = reinterpret_cast<rpc::Client::Port *>(handle);
+ port->send([](rpc::Buffer *) { /* Filled by rpc_get_buffer() */ });
+}
+
+} // namespace LIBC_NAMESPACE
diff --git a/libc/src/gpu/rpc_send.h b/libc/src/gpu/rpc_send.h
new file mode 100644
index 000000000000000..e367514181f2ba8
--- /dev/null
+++ b/libc/src/gpu/rpc_send.h
@@ -0,0 +1,20 @@
+//===-- Implementation header for RPC functions -----------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_LIBC_SRC_GPU_RPC_SEND_H
+#define LLVM_LIBC_SRC_GPU_RPC_SEND_H
+
+#include <gpu/rpc.h>
+
+namespace LIBC_NAMESPACE {
+
+void rpc_send(rpc_port_t *handle);
+
+} // namespace LIBC_NAMESPACE
+
+#endif // LLVM_LIBC_SRC_GPU_RPC_SEND_H
diff --git a/libc/test/integration/startup/gpu/CMakeLists.txt b/libc/test/integration/startup/gpu/CMakeLists.txt
index 7555986b16df452..18764ad2bcaf0b0 100644
--- a/libc/test/integration/startup/gpu/CMakeLists.txt
+++ b/libc/test/integration/startup/gpu/CMakeLists.txt
@@ -44,6 +44,19 @@ add_integration_test(
rpc_interface_test.cpp
)
+add_integration_test(
+ startup_rpc_external_interface_test
+ SUITE libc-startup-tests
+ SRCS
+ rpc_external_interface_test.cpp
+ DEPENDS
+ libc.src.gpu.rpc_open_port
+ libc.src.gpu.rpc_get_buffer
+ libc.src.gpu.rpc_send
+ libc.src.gpu.rpc_recv
+ libc.src.gpu.rpc_close_port
+)
+
add_integration_test(
startup_rpc_stream_test
SUITE libc-startup-tests
diff --git a/libc/test/integration/startup/gpu/rpc_external_interface_test.cpp b/libc/test/integration/startup/gpu/rpc_external_interface_test.cpp
new file mode 100644
index 000000000000000..ae5221f58fc901d
--- /dev/null
+++ b/libc/test/integration/startup/gpu/rpc_external_interface_test.cpp
@@ -0,0 +1,99 @@
+//===-- Loader test to check the external RPC interface with the loader ---===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include <gpu/rpc.h>
+
+#include "src/gpu/rpc_close_port.h"
+#include "src/gpu/rpc_get_buffer.h"
+#include "src/gpu/rpc_open_port.h"
+#include "src/gpu/rpc_recv.h"
+#include "src/gpu/rpc_send.h"
+
+#include "include/llvm-libc-types/test_rpc_opcodes_t.h"
+#include "src/__support/GPU/utils.h"
+#include "src/__support/RPC/rpc_client.h"
+#include "test/IntegrationTest/test.h"
+
+using namespace LIBC_NAMESPACE;
+
+// Test to ensure that we can use aribtrary combinations of sends and recieves
+// as long as they are mirrored using the external interface.
+static void test_interface(bool end_with_send) {
+ uint64_t cnt = 0;
+ // rpc::Client::Port port = rpc::client.open<RPC_TEST_INTERFACE>();
+ rpc_port_t port = LIBC_NAMESPACE::rpc_open_port(RPC_TEST_INTERFACE);
+
+ // port.send([&](rpc::Buffer *buffer) { buffer->data[0] = end_with_send; });
+ rpc_buffer_t *buffer = LIBC_NAMESPACE::rpc_get_buffer(&port);
+ buffer->data[0] = end_with_send;
+ LIBC_NAMESPACE::rpc_send(&port);
+
+ // port.send([&](rpc::Buffer *buffer) { buffer->data[0] = cnt = cnt + 1; });
+ buffer = LIBC_NAMESPACE::rpc_get_buffer(&port);
+ buffer->data[0] = cnt = cnt + 1;
+ LIBC_NAMESPACE::rpc_send(&port);
+
+ // port.recv([&](rpc::Buffer *buffer) { cnt = buffer->data[0]; });
+ LIBC_NAMESPACE::rpc_recv(&port);
+ buffer = LIBC_NAMESPACE::rpc_get_buffer(&port);
+ cnt = buffer->data[0];
+
+ // port.send([&](rpc::Buffer *buffer) { buffer->data[0] = cnt = cnt + 1; });
+ buffer = LIBC_NAMESPACE::rpc_get_buffer(&port);
+ buffer->data[0] = cnt = cnt + 1;
+ LIBC_NAMESPACE::rpc_send(&port);
+
+ // port.recv([&](rpc::Buffer *buffer) { cnt = buffer->data[0]; });
+ LIBC_NAMESPACE::rpc_recv(&port);
+ buffer = LIBC_NAMESPACE::rpc_get_buffer(&port);
+ cnt = buffer->data[0];
+
+ // port.send([&](rpc::Buff...
[truncated]
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
libc/src/__support/RPC/rpc.h
Outdated
/// what to do. | ||
template <uint16_t opcode> LIBC_INLINE Client::Port Client::open() { | ||
/// what to do. It is very important that the \p opcode value is identical | ||
/// across each lane. Use the template version of this unless absolutely |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really see how the template version is different than the version with the argument, other than the opcode being locked in at compile time. Could you explain a bit more?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this is some GPU specific magic. Basically GPUs use so-called "SIMT" execution where we have execution units that are 32 or 64 lanes wide that all execute at once. These are called "warps" or "wavefronts" depending on the platform. This mean that some "threads" are effectively held hostage by other threads if they are in the same warp. So, the unit of least parallelism is actually a warp / wavefront here, which is why the RPC interface handles a whole warp / wavefront at a time. So, given a size of 32 and some code like,
int value = thread_id % 2 ? 0 : 1;
some_fn(value);
Will result in a warp with alternating values, but the whole warp / wavefront is still "convergent" that means that all "threads" are still active. If we apply this logic to opening a port, we could have the situation where a single warp tries to handle two separate opcodes, which is a break of the asusmption that each RPC call handles a single warp / wavefront. We can avoid this by forcing it to be a compile time constant via the template.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense, thank you for explaining
@@ -0,0 +1,26 @@ | |||
//===---------- GPU implementation of the external RPC port interface -----===// |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tiny nit: Your cpp files have the title too far to the right in the first line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing that out.
|
||
namespace LIBC_NAMESPACE { | ||
|
||
static_assert(sizeof(rpc_port_t) == sizeof(rpc::Client::Port), "ABI mismatch"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would it make sense to have these checks in a more centralized place? Maybe in rpc_client.h
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The point is that we can't export the original C++ definitions, this was something Siva was adamant about because we didn't want details from libc
leaking from the headers. So basically rpc_client.h
is exported, but in the implementation .cpp
file we make sure that the definitions are compatible with what we expect. That ensures that if we change the interface this doens't silently break.
>, | ||
FunctionSpec< | ||
"rpc_send", | ||
RetValSpec<VoidType>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
design question: Would it make sense to give send, recv, and close return values of some sort? Probably an int so it can return various error codes or 0 on success to match other libc functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that's necessary, there shouldn't be any failure modes once a port has been opened. Because we are executing in a unified address space, we don't need to worry about network effects like gRPC or similar. Internally we pretty much just spin until memory happens and then continue. We could potentially expand the internface to allow the lambda we're copying to fail, but I don't think there's any utility in that. If it's for consistently I could make it return an int and just hardwire it to zero.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No error codes. It is fundamentally crucial that these functions do not and can not fail.
This permits implementing arbitrary functions that execute across the two machines with exactly the interface they have on one machine. No extra integer value to mean "network failed", no replacing double return types with a pair of double and boolean, no extra global variable named rpcerrno that people are supposed to check.
This trickery relative to the usual RPC problem of messing up the interfaces works because communication on a shared memory machine only fails when the whole machine has fallen over, which is non-recoverable, and each function does not dynamically allocate a resource that can fail.
At least by design - in practice we may currently be undersizing the ports structure to guarantee progress in the worst case scheduling, but if so that's fixable by making N large enough for a given machine. The cost is order of a few megabytes of memory which in the scheme of things doesn't matter for a singleton.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function pointers are a bad thing. We should be exporting the C++ interface as C++. not doing this dance through a C shim in exchange for widening the scope for programmer error and performance cost, plus further rolling the die with the compiler correctness.
The implementation has reinterpret cast all over the place and shouldn't. Declare the port as something like
struct port
{
union {
uint64_t d;
void* p;
};
};
or just as struct {uint64_t}, and generally if people decide to read or write that integer themselves they get to keep all the pieces.
@@ -17,4 +17,7 @@ typedef struct { | |||
__UINT64_TYPE__ data[8]; | |||
} rpc_buffer_t; | |||
|
|||
typedef void (*rpc_callback_t)(rpc_buffer_t *buffer, void *data, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we can't call these things uint32_t, we probably can't call them rpc_buffer_t either. Presumably __LIBC_RPC_BUFFER_T or whatever.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those are macros provided by GCC and clang that pretty much expand to the same thing, don't think we need to follow the naming convention.
libc/src/gpu/rpc_recv.cpp
Outdated
@@ -17,9 +17,12 @@ namespace LIBC_NAMESPACE { | |||
static_assert(sizeof(rpc_port_t) == sizeof(rpc::Client::Port), "ABI mismatch"); | |||
static_assert(sizeof(rpc_buffer_t) == sizeof(rpc::Buffer), "ABI mismatch"); | |||
|
|||
LLVM_LIBC_FUNCTION(void, rpc_recv, (rpc_port_t * handle)) { | |||
LLVM_LIBC_FUNCTION(void, rpc_recv, | |||
(rpc_port_t * handle, rpc_callback_t callback, void *data)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would these be reinterpret casts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The rpc_port_t
is more or less a type-erased buffer. We convert it back to the real type since we know what it is internally. To the user, it's just like 32 bytes.
LIBC_NAMESPACE::rpc_recv( | ||
&port, | ||
[](rpc_buffer_t *buffer, void *data, uint32_t) { | ||
*reinterpret_cast<uint64_t *>(data) = buffer->data[0]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
likewise, why is there type punning happening here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the input data for a function pointer interface needs to be void and the implementation defines what it is. Same story with the numerous HSA callbacks or pthreads.
There's a fundamental layering violation in what we've done here which is probably the root cause of the API grief. Libc depends on syscall, libc does not contain syscall. We should aspire to put the RPC machinery somewhere else, leave the implementation of fopen and similar in libc, but have the templated state machine exist independent of the exported API constructs of libc. It's a good candidate for some interpretations of llvm/offload. Suggest we take the bits of rpc that would be nice to export in type safe, non-function-pointery fashion and make those usable in a header only sense, not wire it up into libc's interface, and #include them from within openmp or whatever with total disregard to the exported C interface until such time as they have a new home. |
Exporting an opaque handle is a pretty common idiom. The interface is completely hidden from the user, they just need to know it's data. Doing bit casts should be legal here.
Somewhat agree in the long term. The general issue is then who's running the show in the case of a separate implementation. Maybe we could just make the actual client
This whole interface could be exported as |
I have updated this to only export the |
7ab42ef
to
6820c7a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reinterpret_casts I cautioned against in November are still here, this time with clear cut undefined behaviour.
rpc::Client::Port *port = reinterpret_cast<rpc::Client::Port *>(handle);
port-> bang
Handle is 32 bytes. It's not an instance of port. Totally invalid C++, you need to handle aliasing with far more paranoia than this to be correct. Even if you're going with "UB is fine, clang isn't currently breaking me", that's a misaligned load and I doubt GPUs support that gracefully.
Could you explain a bit further into what's undefined here? Basically we are doing this
The sketchy part may be the original bitcast, because the base type is not trivially constructable. But because we already created it, shouldn't it be valid to do an in-memory bitcast? After that it's pretty much just typical type erasure since it's referring to the same memory. |
369b63b
to
09abb45
Compare
Summary: This patch adds new extensions that allow us to export the RPC interface from `libc` to other programs. This should allow external users of the GPU `libc` to interface with the RPC client (which more or less behaves like syscalls in this context). This is done by wrapping the interface into a C-style function call. Obviously, this approach is far less safe than the carefully crafted C++ interface. For example, we now expose the internal packet buffer, and it is theoretically possible to open a single port with conflicting opcodes and break the whole interface. So, extra care will be needed when interacting with this. However, the usage is very similar as shown by the new test. This somewhat stretches the concept of `libc` just doing `libc` things, but I think this is important enough to justify it. It is difficult to split this out, as we do not want to have the situation where we have multiple RPC clients running at one time, so for now it makes sense to just leave it in `libc`.
template <typename To, typename From> | ||
LIBC_INLINE constexpr cpp::enable_if_t< | ||
(sizeof(To) == sizeof(From)) && | ||
// Implementation of bit_cast that cannot use the compiler builtin must be | ||
// trivially-constructible To, to avoid UB in the implementation. | ||
#if !LIBC_HAS_BUILTIN(__builtin_bit_cast) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see why this would be the case - even if we had to DIY a bitcast from scratch, wouldn't default constructible suffice?
Also aren't these things all trivially constructible anyway, in which case I don't know why there's an #if
around this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't write this code, but it's likely going off of what the documentation says https://en.cppreference.com/w/cpp/numeric/bit_cast.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Port
is not trivially constructable because it uses a reference instead of a pointer.
/// port using the platform's returned value. | ||
template <uint16_t opcode> | ||
[[clang::convergent]] LIBC_INLINE Client::Port Client::open() { | ||
/// port using the platform's returned value. It is required that \p opcode is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This loss of invariant is not great - what used to be a language sema enforced invariant is now a comment.
It seems relatively likely that someone would try to call a different function on different lanes of the wave by passing a variable for opcode, since it's an internal implementation quirk that we do not support that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it's not the best solution but not strictly required for correctness. This code isn't nullable, so there's no way for it to fail. Otherwise we could just return some null value if the ballot isn't all equal to the first lane or something. Regardless, it's not ideal.
LLVM_LIBC_FUNCTION(void, rpc_close_port, (rpc_port_t * handle)) { | ||
rpc::Client::Port port = cpp::bit_cast<rpc::Client::Port>(*handle); | ||
port.close(); | ||
*handle = cpp::bit_cast<rpc_port_t>(port); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be returning the port after it has been closed? Presumably the caller can do nothing with that value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's the same restriction the C++ interface has, we just require that it be closed by the user. I remember toying around with destructors but it was just too much overhead and pain for what we should just require as a part of using it.
|
||
namespace LIBC_NAMESPACE { | ||
|
||
void rpc_close_port(rpc_port_t *handle); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be passing the struct around by value instead?
LLVM_LIBC_FUNCTION(void, rpc_recv_n, | ||
(rpc_port_t * handle, void *dst, size_t *size)) { | ||
rpc::Client::Port port = cpp::bit_cast<rpc::Client::Port>(*handle); | ||
port.recv_n(dst, reinterpret_cast<uint64_t *>(size)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we just pick one of size_t and uint64_t and use it consistently?
I think there are 32 bit nvptx things out there based on needing to install gcc multilib to use it and there's a credible risk size_t is 32 bit on 32 bit platforms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
32-bit NVPTX is officially removed since CUDA 11.2 I believe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left various comments inline.
I think exporting RPC from libc by wrapping the C++ in a layer of C is fundamentally the wrong thing to be doing and the signal to noise ratio of this patch somewhat supports that.
The RPC layer benefits from being in C++. We'd like it to be easily usable outside of libc, e.g. from the openmp runtime. The llvm libc is implemented in C++.
Instead of doing this force-through-C dance with the associated unsafety and complexity, and given that the libc project doesn't want to export C++ interfaces, we should move RPC to some other subproject. The new offload one seems an obvious fit. Then libc uses the C++ version directly from that subproject.
The analogy of libc depending on syscall, and rpc being an implementation of syscall, seems self evident. It's a historical quirk that we didn't have anywhere better to put the text files in the repo.
I more or less agree with the fact that this should eventually be moved somewhere else. What I'm imagining is that all the GPU specific utils, like wrappers around builtins and stuff, and this should all be in a header, likely in the |
Summary:
This patch adds new extensions that allow us to export the RPC interface
from
libc
to other programs. This should allow external users of theGPU
libc
to interface with the RPC client (which more or less behaveslike syscalls in this context). This is done by wrapping the interface
into a C-style function call.
Obviously, this approach is far less safe than the carefully crafted C++
interface. For example, we now expose the internal packet buffer, and it
is theoretically possible to open a single port with conflicting opcodes
and break the whole interface. So, extra care will be needed when
interacting with this. However, the usage is very similar as shown by
the new test.
This somewhat stretches the concept of
libc
just doinglibc
things,but I think this is important enough to justify it. It is difficult to
split this out, as we do not want to have the situation where we have
multiple RPC clients running at one time, so for now it makes sense to
just leave it in
libc
.