msg/async/rdma: use wr_id address to check valid chunk #36908

Merged
merged 1 commit into from
Sep 3, 2020


rosinL
Member

@rosinL rosinL commented Aug 31, 2020

A CQE's wr_id can be one of:
1) BEACON_WRID
2) &RDMAConnectedSocketImpl::qp
3) The address of a Chunk, starting from Cluster::chunk_base
When a qp is treated as a Chunk through the CQE's wr_id, it is possible to
misjudge &(qp->ib_physical_port) as lying in Cluster::[base, end), because
the alignment requirement of the structure member leaves 4 bytes of random
data in the upper 4 bytes of the address read around ib_physical_port.
Fix this case by checking whether the wr_id value is in the allocated Chunk space.
Fixes: https://tracker.ceph.com/issues/44346

Signed-off-by: Chunsong Feng <fengchunsong@huawei.com>
Signed-off-by: luo rixin <luorixin@huawei.com>
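
For readers unfamiliar with the code, the idea of the fix can be sketched roughly as below. This is only a minimal illustration, not the actual Ceph change: `Pool`, `FakeQueuePair`, `is_valid_chunk` and `demo` are hypothetical stand-ins for the Cluster, the QueuePair and the new check. The point is that wr_id is compared against the address range of the allocated Chunk objects themselves, rather than being cast to a Chunk and having its buffer pointer compared against [base, end).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a Chunk: metadata plus a pointer into the data area.
struct Chunk {
  std::uint32_t bytes = 0;
  char* buffer = nullptr;
};

// Hypothetical stand-in for the Cluster that owns the chunks and their buffers.
struct Pool {
  std::vector<char> data;     // plays the role of [base, end)
  std::vector<Chunk> chunks;  // plays the role of the array at chunk_base

  Pool(std::size_t n, std::size_t chunk_size) : data(n * chunk_size), chunks(n) {
    for (std::size_t i = 0; i < n; ++i) {
      chunks[i].buffer = data.data() + i * chunk_size;
    }
  }

  // True only when wr_id lies inside the allocated Chunk array.
  // Nothing is dereferenced, so a wr_id that is really a qp address is safe.
  bool is_valid_chunk(std::uint64_t wr_id) const {
    const auto first = reinterpret_cast<std::uintptr_t>(chunks.data());
    const auto last =
        reinterpret_cast<std::uintptr_t>(chunks.data() + chunks.size());
    return wr_id >= first && wr_id < last;
  }
};

// Hypothetical stand-in for a queue pair: a 4-byte member followed by padding.
struct FakeQueuePair {
  int ib_physical_port = 1;
  void* other = nullptr;
};

// Returns true when a real chunk address is accepted and a qp address is rejected.
bool demo() {
  Pool pool(4, 64);
  FakeQueuePair qp;
  const std::uint64_t chunk_id = reinterpret_cast<std::uintptr_t>(&pool.chunks[2]);
  const std::uint64_t qp_id = reinterpret_cast<std::uintptr_t>(&qp);
  return pool.is_valid_chunk(chunk_id) && !pool.is_valid_chunk(qp_id);
}
```

With a check of this shape, the completion handler never has to interpret the bytes around ib_physical_port as a buffer pointer, so the random padding bytes cannot influence the decision.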

Checklist

  • References tracker ticket
  • Updates documentation if necessary
  • Includes tests for new functionality or reproducer for bug


@batrick added the messenger and needs-review labels Aug 31, 2020
@fengchunsong
Contributor

@tchaikov @liupengs @changchengx please help review, thanks.

@changchengx
Contributor

The functionality of this PR is correct.

Suggestions:
The commit message isn't clear, especially about how the address at offsetof(chunk->buffer) maps onto the Infiniband::QueuePair address.
How about the commit message below?

A CQE's wr_id can be one of:
  1) BEACON_WRID
  2) &RDMAConnectedSocketImpl::qp
  3) The address of a Chunk, starting from Cluster::chunk_base
When a qp is treated as a Chunk through the CQE's wr_id, it is possible to
misjudge &(qp->ib_physical_port) as lying in Cluster::[base, end), because
the alignment requirement of the structure member leaves 4 bytes of random
data in the upper 4 bytes of the address read around ib_physical_port.
Fix this case by checking whether the wr_id value is in the allocated Chunk space.

@fengchunsong
Contributor

Good suggestion. Adding a description of the wr_id usage scenarios and describing the root cause in terms of the failure scenario helps readers better understand the patch. Thanks.

A CQE's wr_id can be one of:
  1) BEACON_WRID
  2) &RDMAConnectedSocketImpl::qp
  3) The address of a Chunk, starting from Cluster::chunk_base
When a qp is treated as a Chunk through the CQE's wr_id, it is possible to
misjudge &(qp->ib_physical_port) as lying in Cluster::[base, end), because
the alignment requirement of the structure member leaves 4 bytes of random
data in the upper 4 bytes of the address read around ib_physical_port.
Fix this case by checking whether the wr_id value is in the allocated Chunk space.

Signed-off-by: Chunsong Feng <fengchunsong@huawei.com>
Signed-off-by: luo rixin <luorixin@huawei.com>
@fengchunsong
Contributor

@changchengx If there are no problems, please help approve. Thanks.

@changchengx
Contributor

@tchaikov GO
I've confirmed that the functionality in this PR is correct.

@tchaikov tchaikov merged commit dbda132 into ceph:master Sep 3, 2020