ec: IO failure when shrinking dispersed volume during io running #2948
IO failures are seen when a shrink operation is performed on a distributed-dispersed volume while IO is in progress.

RCA: When the layout has changed during a rebalance operation, dht_create_cbk retries the create operation under lock in a second attempt. It makes that decision based on the error set by posix_create in xdata in the first attempt. ec (ec_manager_create) does not pass xdata to the upper xlator, so dht_create cannot decide to reattempt the create fop when the layout has changed, and it throws an EIO error.

Solution: Pass the xdata to the upper xlator so that dht_create can make that decision.

Fixes: gluster#2947
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
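The dependency described in the RCA can be illustrated with a minimal sketch. It is a simplified model, not the actual DHT code: the `sketch_*` names and the `layout_changed` field are hypothetical stand-ins for the real xdata dictionary and the error hint that posix_create stores in it. It only shows why a NULL xdata leaves the upper layer with nothing to base a retry on.

```c
#include <stddef.h>

/* Hypothetical stand-in for the xdata dictionary; the real code uses a
 * dict_t, and the field name here is an assumption for illustration. */
typedef struct {
    int layout_changed; /* non-zero if the lower layer flagged a layout change */
} sketch_xdata_t;

typedef enum {
    SKETCH_RETRY_CREATE, /* reattempt the create under lock */
    SKETCH_FAIL_EIO      /* give up and report an IO error */
} sketch_action_t;

/* Models the decision taken by the upper layer after a failed create.
 * Without xdata there is no hint to inspect, so the only option is to fail. */
static sketch_action_t
sketch_decide_after_failed_create(const sketch_xdata_t *xdata)
{
    if (xdata != NULL && xdata->layout_changed)
        return SKETCH_RETRY_CREATE;

    return SKETCH_FAIL_EIO;
}
```

In this model, dropping xdata on the failure path has the same effect as the hint never having been set, which corresponds to the EIO seen during the shrink.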
/run regression
@@ -228,11 +228,14 @@ ec_manager_create(ec_fop_data_t *fop, int32_t state)
         case -EC_STATE_DISPATCH:
         case -EC_STATE_PREPARE_ANSWER:
         case -EC_STATE_REPORT:
+            cbk = fop->answer;
+
+            GF_ASSERT(cbk != NULL);
That's not guaranteed. cbk may be NULL if there are not enough consistent bricks or the operation has failed early for some other reason.
             GF_ASSERT(fop->error != 0);
 
             if (fop->cbks.create != NULL) {
                 fop->cbks.create(fop->req_frame, fop, fop->xl, -1, fop->error,
-                                 NULL, NULL, NULL, NULL, NULL, NULL);
+                                 NULL, NULL, NULL, NULL, NULL, cbk->xdata);
-                                 NULL, NULL, NULL, NULL, NULL, cbk->xdata);
+                                 NULL, NULL, NULL, NULL, NULL, cbk == NULL ? NULL : cbk->xdata);
done
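For reference, the guarded pass-through discussed in this thread can be sketched in isolation. This is a minimal model under stated assumptions: the `sketch_*` types are hypothetical stand-ins that carry only the fields needed here, not the real ec_fop_data_t, ec_cbk_data_t, or dict_t definitions, and plain `assert` stands in for GF_ASSERT.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real GlusterFS types; only the fields
 * needed for this illustration are modelled. */
typedef struct { int layout_changed; } sketch_xdata_t;  /* stands in for dict_t */
typedef struct { sketch_xdata_t *xdata; } sketch_cbk_t; /* stands in for the answer */
typedef struct {
    int error;            /* errno-style failure code, non-zero on failure */
    sketch_cbk_t *answer; /* NULL when no consistent answer was collected */
} sketch_fop_t;

/* Returns the xdata to hand to the upper layer on the failure path:
 * the answer's xdata when an answer exists, NULL otherwise. */
static sketch_xdata_t *
sketch_failure_xdata(const sketch_fop_t *fop)
{
    const sketch_cbk_t *cbk = fop->answer;

    assert(fop->error != 0); /* this path is only reached on failure */

    /* cbk may legitimately be NULL, so it is checked instead of asserted. */
    return cbk == NULL ? NULL : cbk->xdata;
}
```

The conditional keeps the early-failure case (no answer at all) behaving as it did before the change, while still forwarding xdata whenever an answer is available.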
Fixes: gluster#2947 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
/run regression
1 test(s) failed, 0 test(s) generated core, 5 test(s) needed retry
/run regression
…ster#2948)
* ec: IO failure when shrinking dispersed volume during io running
IO failures are seen when a shrink operation is performed on a distributed-dispersed volume while IO is in progress.
RCA: When the layout has changed during a rebalance operation, dht_create_cbk retries the create operation under lock in a second attempt. It makes that decision based on the error set by posix_create in xdata in the first attempt. ec (ec_manager_create) does not pass xdata to the upper xlator, so dht_create cannot decide to reattempt the create fop when the layout has changed, and it throws an EIO error.
Solution: Pass the xdata to the upper xlator so that dht_create can make that decision.
> Fixes: gluster#2947
> Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
> (Cherry picked from commit f81bf52)
> (Reviewed on upstream link gluster#2948)
Fixes: gluster#2947
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>

…) (#2951)
* ec: IO failure when shrinking dispersed volume during io running
IO failures are seen when a shrink operation is performed on a distributed-dispersed volume while IO is in progress.
RCA: When the layout has changed during a rebalance operation, dht_create_cbk retries the create operation under lock in a second attempt. It makes that decision based on the error set by posix_create in xdata in the first attempt. ec (ec_manager_create) does not pass xdata to the upper xlator, so dht_create cannot decide to reattempt the create fop when the layout has changed, and it throws an EIO error.
Solution: Pass the xdata to the upper xlator so that dht_create can make that decision.
> Fixes: #2947
> Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
> (Cherry picked from commit f81bf52)
> (Reviewed on upstream link #2948)
Fixes: #2947
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>