erdma: Elastic RDMA Adapter (ERDMA) userspace provider driver #1126

Merged
merged 4 commits into from Sep 21, 2022


hz-cheng
Contributor

@hz-cheng hz-cheng commented Jan 17, 2022

Hello all,

This PR introduces the Elastic RDMA Adapter (ERDMA) userspace provider driver. The patchset has already been sent to the linux-rdma mailing list for review [1]. For the ERDMA kernel driver, see [2].

The main features of the ERDMA userspace provider are: RC QP support; RDMA Write, Send, RDMA Read, and Immediate opcodes in post_send; post_recv support; and CQs in both polling mode and event mode. SRQ is not supported yet.

This PR also already addresses the review suggestions, including:

  • Fix coding style issues.
  • Remove unnecessary memset() calls.
  • Remove some magic numbers from the code.

Thanks,
Cheng Xu

[1] https://lore.kernel.org/all/20211224065522.29734-1-chengyou@linux.alibaba.com/
[2] https://lore.kernel.org/all/20211221024858.25938-1-chengyou@linux.alibaba.com/

@hz-cheng
Contributor Author

hz-cheng commented Apr 8, 2022

Hi,

I just updated the PR. The changes are:

  1. Rebase the code onto the latest rdma-core.
  2. Format the code with clang-format, because the userspace provider had the same formatting issues as the earlier kernel patch. Both are now fixed.
  3. Fix a double-free bug in erdma_create_cq.
  4. Zero the CQ buffer with memset in erdma_create_cq.

Thanks,
Cheng Xu

@hz-cheng
Contributor Author

hz-cheng commented Aug 1, 2022

I just updated the PR. The changes include:

  1. Sync the kernel headers using the kernel-headers/update script.
  2. Sort the function implementations in erdma_context_ops.
  3. Arrange the initialization statements in functions in inverted-triangle order.
  4. Fix bugs in the check of mmap's return value.
  5. Refactor the doorbell record allocation, since many newer APIs (linked list, bitmap, etc.) can be applied.
  6. Refactor erdma_poll_cq in a way similar to the kernel's poll_cq code.
  7. Remove support for PCIe device ID 5007.
  8. Use the proper types (int -> uint32_t, int -> size_t, and so on) in some definitions.
  9. Make some small changes, following the review suggestions for the kernel code.

The changes in patch format are available here.

@hz-cheng hz-cheng force-pushed the master branch 2 times, most recently from 4d9920f to cdda3f8 Compare August 4, 2022 03:00
@hz-cheng
Contributor Author

hz-cheng commented Aug 4, 2022

Some more issues needed to be fixed, so I pushed again. The changes are:

  1. Put verbs_provider_erdma in the right place in libibverbs/verbs.h.
  2. Fix a memory leak in erdma_destroy_qp.
  3. Shorten some statements in erdma_create_qp.
  4. Update the erdma WC mapping table and the related code in erdma_poll_cq.
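
To illustrate what a WC mapping table as in item 4 typically looks like, here is a small sketch that translates a hardware completion opcode into a verbs-level one through a lookup table. All names and values (demo_hw_opcode, demo_wc_opcode, demo_wc_map, demo_map_opcode) are hypothetical; the real hardware opcodes live in the erdma headers and the real verbs opcodes are enum ibv_wc_opcode.

```c
#include <stddef.h>

/* Hypothetical hardware completion opcodes; NOT the real erdma values. */
enum demo_hw_opcode {
	DEMO_HW_OP_WRITE = 0,
	DEMO_HW_OP_READ = 1,
	DEMO_HW_OP_SEND = 2,
	DEMO_HW_OP_RECV = 3,
	DEMO_HW_OP_MAX,
};

/* Stand-in for the verbs-level opcodes (compare enum ibv_wc_opcode). */
enum demo_wc_opcode {
	DEMO_WC_SEND,
	DEMO_WC_RDMA_WRITE,
	DEMO_WC_RDMA_READ,
	DEMO_WC_RECV,
};

/* Lookup table indexed by hardware opcode. */
static const enum demo_wc_opcode demo_wc_map[DEMO_HW_OP_MAX] = {
	[DEMO_HW_OP_WRITE] = DEMO_WC_RDMA_WRITE,
	[DEMO_HW_OP_READ] = DEMO_WC_RDMA_READ,
	[DEMO_HW_OP_SEND] = DEMO_WC_SEND,
	[DEMO_HW_OP_RECV] = DEMO_WC_RECV,
};

/*
 * Translate a hardware opcode. Returns 0 and stores the result in *out,
 * or -1 if hw_op is outside the table.
 */
static int demo_map_opcode(unsigned int hw_op, enum demo_wc_opcode *out)
{
	if (hw_op >= DEMO_HW_OP_MAX)
		return -1;

	*out = demo_wc_map[hw_op];
	return 0;
}
```

A table with designated initializers keeps the mapping in one place, so the poll path only needs a bounds check and an array read.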

The changes in patch format are available here.

@hz-cheng hz-cheng force-pushed the master branch 4 times, most recently from 5cf6f6c to d959e14 Compare August 5, 2022 09:29
@hz-cheng
Contributor Author

hz-cheng commented Aug 5, 2022

I reviewed the files changed in mana_ib and found that erdma was missing the modification needed for RDMA_STATIC_PREFIX, so I pushed this version.

It also adds the necessary information to the Debian package build.

The changes in patch format are available here.

Add the userspace verbs implementation related header files: 'erdma_hw.h'
for hardware interface definitions, 'erdma_verbs.h' for verbs related
definitions and 'erdma_db.h' for doorbell records related definitions.

Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com>

Implementation of erdma's 'struct verbs_context_ops' interface.
Because doorbells may be dropped by hardware in some situations, such as
a hardware hot-upgrade, the driver keeps the latest doorbell value of each
QP and CQ. We therefore introduce doorbell records to store the latest
doorbell values as well.

Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com>
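
The doorbell record idea in the commit message above can be sketched roughly as follows: each doorbell write is mirrored into a record in ordinary memory, so the latest value survives even if the doorbell itself is lost. This is only an illustration under stated assumptions, not the provider's code: demo_queue, demo_ring_doorbell and demo_latest_doorbell are hypothetical names, the doorbell register is simulated with a plain variable, and how the hardware actually consumes the record is not described in this PR.

```c
#include <stdint.h>

/* Hypothetical queue state; field names are illustrative only. */
struct demo_queue {
	volatile uint64_t *db;	/* doorbell register (MMIO in a real driver) */
	uint64_t *db_record;	/* record holding the latest doorbell value */
};

/*
 * Ring the doorbell and remember the value in the doorbell record.
 * The record is updated first, so it already holds the newest value
 * when the doorbell is written.
 */
static void demo_ring_doorbell(struct demo_queue *q, uint64_t value)
{
	*q->db_record = value;
	/*
	 * Assumption: a real provider would need a write barrier here so
	 * the record is visible before the MMIO write reaches the device.
	 */
	*q->db = value;
}

/* Return the latest doorbell value, e.g. after a doorbell was dropped. */
static uint64_t demo_latest_doorbell(const struct demo_queue *q)
{
	return *q->db_record;
}
```

For example, if the register write with value 5 is lost (simulated by resetting the variable behind db), demo_latest_doorbell() still reports 5.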

Add the definitions of the erdma provider driver, and add the application
interface to the core so that the core can recognize the erdma provider.

Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com>

Enable the build system to build the provider, and add erdma to the Red Hat
and Debian package build environments.

Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com>
@hz-cheng
Contributor Author

hz-cheng commented Aug 22, 2022

I saw that Leon had synced the kernel headers to master. To avoid potential conflicts (though the CI did not report any), I rebased the code onto the latest master. This push also includes one change: using struct erdma_sge sgl[] instead of struct erdma_sge sgl[0] in the definitions in erdma_hw.h.
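
As background on the sgl[] change: a trailing sgl[0] is a zero-length array (a GNU extension), while sgl[] is the standard C99 flexible array member. A minimal sketch of declaring and allocating such a struct follows; demo_sge, demo_wqe and demo_wqe_alloc are hypothetical, and the field layout is made up for illustration and is not the real erdma_hw.h layout.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical SGE layout; the real struct erdma_sge is in erdma_hw.h. */
struct demo_sge {
	uint64_t addr;
	uint32_t length;
	uint32_t key;
};

/* Hypothetical work request with a trailing scatter/gather list. */
struct demo_wqe {
	uint32_t num_sge;
	struct demo_sge sgl[];	/* C99 flexible array member, not sgl[0] */
};

/*
 * Allocate a zeroed demo_wqe with room for num_sge entries.
 * Returns NULL on allocation failure; the caller frees with free().
 */
static struct demo_wqe *demo_wqe_alloc(uint32_t num_sge)
{
	struct demo_wqe *wqe;

	wqe = calloc(1, sizeof(*wqe) + (size_t)num_sge * sizeof(wqe->sgl[0]));
	if (!wqe)
		return NULL;

	wqe->num_sge = num_sge;
	return wqe;
}
```

The flexible array member must be the last field, and sizeof(struct demo_wqe) does not include any sgl entries, which is why the allocation adds their size explicitly.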

@rleon rleon merged commit c4ff6e9 into linux-rdma:master Sep 21, 2022