
Add support for cephfs clone limit error #3996

Closed
Rakshith-R opened this issue Jul 13, 2023 · 8 comments · Fixed by #4276
Labels
bug (Something isn't working) · component/cephfs (Issues related to CephFS) · dependency/ceph (depends on core Ceph functionality)

Comments

Rakshith-R (Contributor) commented Jul 13, 2023

Describe the bug

We need to check for the error returned when CephFS clones are rejected because the cloner threads are busy,
and handle it by sending an appropriate CSI-compliant error code to the external provisioner.

https://tracker.ceph.com/issues/59714

Steps to reproduce

  1. Create many cephfs clones (or snapshot+restore) of a PVC with data in it.

Actual results

Most of the clones/restores will stay in the Pending state for a long time.

Expected behavior

Ceph should reject clone creation requests with an error when all cloner threads are busy, and Ceph-CSI should handle this error by returning an appropriate CSI-compliant error code to the external provisioner.
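
As a rough illustration of the requested behavior, here is a minimal Go sketch of the error mapping; `errCloneLimitReached`, `createClonedVolume`, and the simplified `CreateVolume` signature are hypothetical names for this sketch, not Ceph-CSI's actual code:

```go
package main

import (
	"errors"
	"fmt"

	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

// errCloneLimitReached is a hypothetical sentinel the driver would set after
// recognizing Ceph's clone-rejection error.
var errCloneLimitReached = errors.New("cephfs: all clone threads are busy")

// createClonedVolume stands in for the driver's clone-from-snapshot path.
func createClonedVolume() error { return errCloneLimitReached }

// CreateVolume shows only the error mapping; a real CSI CreateVolume has
// different request/response types.
func CreateVolume() error {
	if err := createClonedVolume(); err != nil {
		if errors.Is(err, errCloneLimitReached) {
			// A retryable gRPC code makes the external provisioner
			// back off and retry instead of treating this as fatal.
			return status.Error(codes.ResourceExhausted, err.Error())
		}
		return status.Error(codes.Internal, fmt.Sprintf("clone failed: %v", err))
	}
	return nil
}

func main() {
	fmt.Println(CreateVolume())
}
```

`codes.ResourceExhausted` is one reasonable choice of code; the point is that the external provisioner sees a non-final gRPC status and retries with backoff rather than leaving the PVC Pending indefinitely.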

Rakshith-R added the bug, component/cephfs, and dependency/ceph labels Jul 13, 2023
Rakshith-R added this to the release-v3.10.0 milestone Jul 27, 2023
Madhu-1 (Collaborator) commented Aug 9, 2023

@karthik-us PTAL

github-actions bot commented Sep 8, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in a week if no further activity occurs. Thank you for your contributions.

github-actions bot added the wontfix label Sep 8, 2023
github-actions bot commented Sep 15, 2023

This issue has been automatically closed due to inactivity. Please re-open if this still requires investigation.

github-actions bot closed this as not planned Sep 15, 2023
Madhu-1 reopened this Sep 18, 2023
Madhu-1 removed the wontfix label Sep 18, 2023
github-actions bot commented Oct 18, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in a week if no further activity occurs. Thank you for your contributions.

github-actions bot added the wontfix label Oct 18, 2023
karthik-us (Collaborator) commented

Not stale. This will be worked on once the tracking ceph issue is fixed.

riya-singhal31 removed the wontfix label Oct 19, 2023
Rakshith-R (Contributor, Author) commented

> Expected behavior
> Ceph should reject clone creation requests with an error when all cloner threads are busy, and Ceph-CSI should handle this error by returning an appropriate CSI-compliant error code to the external provisioner.

@karthik-us
Can you check with the Ceph team for the new error message they are sending to reject clones in PR ceph/ceph#52670 (comment)
and add that check to cephcsi?

We decided in the last standup not to wait for the PR to be merged, and to pre-emptively add this check and CSI error code in cephcsi.

karthik-us (Collaborator) commented

Sure @Rakshith-R, will do that.

Madhu-1 (Collaborator) commented Nov 15, 2023

> the clone request errors out with EAGAIN.

@karthik-us EAGAIN will be sent if the clone thread is busy.
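
A minimal sketch of what that check could look like, assuming the rejection surfaces as the raw EAGAIN errno (possibly wrapped, go-ceph style); `isCloneThreadsBusy` is an illustrative name, not an actual Ceph-CSI helper:

```go
package main

import (
	"errors"
	"fmt"

	"golang.org/x/sys/unix"
)

// isCloneThreadsBusy reports whether err is (or wraps) EAGAIN, the errno
// Ceph returns when every cloner thread is occupied.
func isCloneThreadsBusy(err error) bool {
	return errors.Is(err, unix.EAGAIN)
}

func main() {
	// Errors from lower layers typically wrap the raw errno, and
	// errors.Is still matches through the wrapping.
	wrapped := fmt.Errorf("clone subvolume: %w", unix.EAGAIN)
	fmt.Println(isCloneThreadsBusy(wrapped)) // true
}
```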

karthik-us added a commit to karthik-us/ceph-csi that referenced this issue Nov 22, 2023

This pre-emptively adds a check for the EAGAIN error returned from
Ceph as part of ceph/ceph#52670 when all the
clone threads are busy, and returns a CSI-compatible error.

Fixes: ceph#3996
Signed-off-by: karthik-us <ksubrahm@redhat.com>
karthik-us added a commit to karthik-us/ceph-csi that referenced this issue Nov 23, 2023
nixpanic pushed a commit to karthik-us/ceph-csi that referenced this issue Nov 23, 2023
Rakshith-R pushed a commit to karthik-us/ceph-csi that referenced this issue Nov 24, 2023
mergify bot closed this as completed in #4276 Nov 24, 2023
mergify bot pushed a commit that referenced this issue Nov 24, 2023
iPraveenParihar pushed a commit to iPraveenParihar/ceph-csi that referenced this issue Dec 18, 2023