Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Memory not registered for rdma. Is this iobuf allocated before calling GlobalRdmaInitializeOrDie? Or just forget to call RegisterMemoryForRdma for your own buffer? #1995

Closed
372046933 opened this issue Nov 15, 2022 · 19 comments
Labels
bug the code does not work as expected

Comments

@372046933
Copy link
Contributor

Describe the bug (描述bug)
IOBuf cannot append_user_data with address starts after RegisterMemoryForRdma

W20221115 18:15:32.455838 79345 rdma_endpoint.cpp:727] Memory not registered for rdma. Is this iobuf allocated before calling GlobalRdmaInitializeOrDie? Or just forget to call RegisterMemoryForRdma for your own buffer?                                                                                                                                
W20221115 18:15:32.455842 79345 socket.cpp:1705] Fail to keep-write into Socket{id=2 fd=1034 addr=0.0.0.0:8002:53496} (0x55b33af574d0): Unknown error 3002 [3002]
E20221115 18:15:32.455933 79247 server2.cpp:95] RPC call failed: [E3002]Fail to keep-write into Socket{id=2 fd=1034 addr=0.0.0.0:8002:53496} (0x0x55b33af574d0): Memory not registered for RDMA

To Reproduce (复现方法)

// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements.  See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership.  The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License.  You may obtain a copy of the License at
//
//   http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied.  See the License for the
// specific language governing permissions and limitations
// under the License.

#include <gflags/gflags.h>
#include "butil/atomicops.h"
#include "butil/logging.h"
#include "butil/time.h"
#include "brpc/channel.h"
#include "brpc/rdma/rdma_helper.h"
#include "brpc/server.h"
#include "bvar/variable.h"
#include "test.pb.h"

DEFINE_int32(port, 8002, "TCP Port of this server");
DEFINE_bool(use_rdma, true, "Use RDMA or not");
namespace brpc{
namespace rdma {
    DECLARE_bool(rdma_trace_verbose);
}
}  // namespace brpc

butil::atomic<uint64_t> g_last_time(0);

namespace test {
class PerfTestServiceImpl : public PerfTestService {
public:
    PerfTestServiceImpl() {}
    ~PerfTestServiceImpl() {};

    void Test(google::protobuf::RpcController* cntl_base,
              const PerfTestRequest* request,
              PerfTestResponse* response,
              google::protobuf::Closure* done) {
        brpc::ClosureGuard done_guard(done);
        uint64_t last = g_last_time.load(butil::memory_order_relaxed);
        uint64_t now = butil::monotonic_time_us();
        if (now > last && now - last > 100000) {
            if (g_last_time.exchange(now, butil::memory_order_relaxed) == last) {
                response->set_cpu_usage(bvar::Variable::describe_exposed("process_cpu_usage"));
            } else {
                response->set_cpu_usage("");
            }
        } else {
            response->set_cpu_usage("");
        }
        brpc::Controller* cntl = static_cast<brpc::Controller*>(cntl_base);
        if (request->echo_attachment()) {
            cntl->response_attachment().append(cntl->request_attachment());
        }
        if (!cntl->request_attachment().empty()) {
          std::string tmp;
          cntl->request_attachment().cutn(&tmp, 5);
          LOG(INFO) << tmp;
        }
        
    }
};
}

void run_client() {
  brpc::rdma::GlobalRdmaInitializeOrDie();
  brpc::ChannelOptions options;
  options.use_rdma = FLAGS_use_rdma;
  options.protocol = "baidu_std";
  options.connection_type = "single";
  options.timeout_ms = 10000;
  options.max_retry = 0;
  std::string server = "0.0.0.0:8002";
  brpc::Channel* _channel = new brpc::Channel();
  if (_channel->Init(server.c_str(), &options) != 0) {
    LOG(ERROR) << "Fail to initialize channel";
    return;
  }
  brpc::Controller cntl;
  test::PerfTestResponse response;
  test::PerfTestRequest request;
  char buffer[16] = {'a', 'b', 'c'};
  brpc::rdma::RegisterMemoryForRdma(buffer, 16);
  cntl.request_attachment().append_user_data(buffer + 1, 3, [](void*) {});
  request.set_echo_attachment(false);
  test::PerfTestService_Stub stub(_channel);
  stub.Test(&cntl, &request, &response, NULL);
  if (cntl.Failed()) {
    LOG(ERROR) << "RPC call failed: " << cntl.ErrorText();
  } else {
    LOG(INFO) << "RPC call success";
  }

}

int main(int argc, char* argv[]) {
    brpc::rdma::fLB::FLAGS_rdma_trace_verbose = true;
    GFLAGS_NS::ParseCommandLineFlags(&argc, &argv, true);
    if (FLAGS_use_rdma) {
        LOG(INFO) << "Call GlobalRdmaInitializeOrDie";
        brpc::rdma::GlobalRdmaInitializeOrDie();
    }

    brpc::Server server;
    test::PerfTestServiceImpl perf_test_service_impl;

    if (server.AddService(&perf_test_service_impl, 
                          brpc::SERVER_DOESNT_OWN_SERVICE) != 0) {
        LOG(ERROR) << "Fail to add service";
        return -1;
    }
    g_last_time.store(0, butil::memory_order_relaxed);

    brpc::ServerOptions options;
    options.use_rdma = FLAGS_use_rdma;
    if (server.Start(FLAGS_port, &options) != 0) {
        LOG(ERROR) << "Fail to start EchoServer";
        return -1;
    }
    run_client();

    server.RunUntilAskedToQuit();
    return 0;
}

Expected behavior (期望行为)
IOBuf can append_user_data with data in RegisterMemoryForRdma

Versions (各种版本)
OS: Ubuntu 22.04
Compiler: GCC 11.2.0
brpc: 3d90221
protobuf:

Additional context/screenshots (更多上下文/截图)

@Tuvie
Copy link
Contributor

Tuvie commented Nov 15, 2022

Thanks. This is a bug. The buf to attach is not the exactly same with the base address registered. I will commit a patch to fix this bug later.

@372046933
Copy link
Contributor Author

@Tuvie How much work is needed? It's grateful if you can shed some light on the relation between user_data and memory registered by ibv_reg_mr

@Tuvie
Copy link
Contributor

Tuvie commented Nov 16, 2022

@Tuvie How much work is needed? It's grateful if you can shed some light on the relation between user_data and memory registered by ibv_reg_mr

A PR is committed: #1999
Pls try again and refer to the update on docs.

By default, if you use memory managed by IOBuf, you don't need to register memory by yourself. If you manage memory outside the IOBuf, you should register memory explicitly as you did. However, since it is not efficient to lookup the lkey for this kind of user_data, we suggest you maintain your own memory pool and register all the pool once.

@372046933
Copy link
Contributor Author

Would it be better if MAX_USER_MRS be defined by gflags?

@372046933
Copy link
Contributor Author

Since linear scan is not efficient. Would interval tree be used to query and update in $\mathcal{O}(\log{}n)$ ?

@372046933
Copy link
Contributor Author

In my scenario, memory pool is not applicable, because the memory are allocated by allocator in the library out of control. I have tested RegisterMemoryForRdma with size from KBs to MBs and found that RegisterMemoryForRdma always faster than memcpy.

@Tuvie
Copy link
Contributor

Tuvie commented Nov 17, 2022

Since linear scan is not efficient. Would interval tree be used to query and update in O(log⁡n) ?

Currently, it is a tradeoff to avoid using lock for lkey lookup (more frequent operation during data transmission). To design a better solution, here is a question for your scenario: how do you use append_user_data in you application? Since you don't manage memory by yourself, why don't you use IOBuf instead?

@372046933
Copy link
Contributor Author

@Tuvie We use append_user_data in tensorflow's AsyncOpKernel::ComputeAsync. The memory comes from tensor in TensorFlow OP. It's allocated by the framework on each run.
We could use IOBuf::append, but append introduces an extra copy.

@Tuvie
Copy link
Contributor

Tuvie commented Nov 17, 2022

@Tuvie We use append_user_data in tensorflow's AsyncOpKernel::ComputeAsync. The memory comes from tensor in TensorFlow OP. It's allocated by the framework on each run. We could use IOBuf::append, but append introduces an extra copy.

So what do you think if we add a new parameter for append_user_data function to specify the lkey directly? In this case we don't need to lookup the lkey in an inefficient way. But you have to manage the lkey as a corresponding buffer id by yourself.

@372046933
Copy link
Contributor Author

Yes, we can manage LKey. Does adding parameter to append_user_data breaks backward compatibility?

@Tuvie
Copy link
Contributor

Tuvie commented Nov 17, 2022

Yes, we can manage LKey. Does adding parameter to append_user_data breaks backward compatibility?

Yes. Must try to keep the ABI compatibility. But I think it is a better way to solve this problem. Because application has more information than brpc so that the overall efficiency will be higher.

@372046933
Copy link
Contributor Author

Maybe another interface append_user_data_ib :)

@Tuvie
Copy link
Contributor

Tuvie commented Nov 20, 2022

@372046933 Pls try this PR: #2005

A new function named append_user_data_with_meta is provided. You should give the lkey returned by RegisterMemoryForRdma as the meta. Pls note that now RegisterMemoryForRdma returns 0 when failed, just contrast to the previous code.

@372046933
Copy link
Contributor Author

Thanks a million. Is append_user_data_with_meta in the PR? I couldn't find the definition in 40a24ab

@Tuvie
Copy link
Contributor

Tuvie commented Nov 21, 2022

Thanks a million. Is append_user_data_with_meta in the PR? I couldn't find the definition in 40a24ab

sorry, I repush the PR. Pls try again.

@372046933
Copy link
Contributor Author

@Tuvie
If IOBuf::append_user_data with size 0. Could we do early return here?
https://github.com/apache/incubator-brpc/blob/4308819cc4cf52fa52027ab9b10a2dbcaa84d3c4/src/butil/iobuf.cpp#L1209-L1213

@Tuvie
Copy link
Contributor

Tuvie commented Nov 22, 2022

If IOBuf::append_user_data with size 0. Could we do early return here?

That's OK.

@wwbmmm wwbmmm added the bug the code does not work as expected label Nov 23, 2022
@372046933
Copy link
Contributor Author

@Tuvie I have tested append_user_data_with_meta. It works as expected. As for the return value of append_user_data when size is 0. I added a pre-check in application and skip the call when size equals 0.

@Tuvie
Copy link
Contributor

Tuvie commented Nov 24, 2022

@372046933 Thanks for your feedback. About the return value problem, maybe @wwbmmm can share your opinion.

@wwbmmm wwbmmm closed this as completed Dec 12, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug the code does not work as expected
Projects
None yet
Development

No branches or pull requests

3 participants