New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rbd-mirror: image map policy test #19320
rbd-mirror: image map policy test #19320
Conversation
936a641
to
873eb6c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
ACTION_TYPE_REMOVE, STATE_ASSOCIATED, Transition(STATE_DISASSOCIATED, false), | ||
ACTION_TYPE_REMOVE, STATE_DISASSOCIATED, Transition(STATE_REMOVE_MAPPING, true), | ||
|
||
ACTION_TYPE_SHUFFLE, STATE_UNASSIGNED, Transition(STATE_DISASSOCIATED, false), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can probably jump to STATE_UPDATE_MAPPING
here. The reason why I chose STATE_DISASSOCIATED
was to remove the image from the in-memory map. But that would mean that releasing an image from a blacklisted peer needs to be treated as a success case.
What can be done to rectify this is to unmap the image from the in-memory map in pre_execute_state_callback()
and jump to STATE_UPDATE_MAPPING
for this state transition. This would be harmless for other transitions from STATE_UNASSIGNED
as we handle negative lookup in unmap()
.
Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's UNASSIGNED
, I guess I don't see how it would be already mapped in-memory to another instance. Shouldn't it be in UPDATE_MAPPING
when it's been assigned and is waiting for the on-disk map to update?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's because we map an image to an instance in-memory during STATE_UNASSIGNED -> STATE_UPDATE_MAPPING
transition. Later if the transition STATE_UPDATE_MAPPING -> STATE_ASSOCIATED
fails (dead peer), the in-memory map for this image is not reverted (image not removed from the map).
I'm thinking a better fix would be to handle errors during transition a bit differently how we do now -- something like post_execute_state_callback_err()
, which could take action (revert/no-op) based on which state transition failed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK -- so am I correct in assuming that the in-memory map is updated and the state is transitioned from UNASSIGNED
to UPDATE_MAPPING
in one step? To me, it doesn't make sense that an image in UNASSIGNED
is actually assigned in-memory but just waiting to transition to UPDATE_MAPPING
.
Speaking more generally, I think I had proposed changing the state table from to include an "error path" transition state that would allow you to build clean-up states (if needed).
i.e.:
<action>, <current state>, <success state>, <error state>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK -- so am I correct in assuming that the in-memory map is updated and the state is transitioned from UNASSIGNED to UPDATE_MAPPING in one step? To me, it doesn't make sense that an image in UNASSIGNED is actually assigned in-memory but just waiting to transition to UPDATE_MAPPING.
That's right. However, we weren't rolling back the map on failed transactions, but kind of recovering from it later based on the current
state of the image in the map (we do reset the current
state to the last idle state).
Speaking more generally, I think I had proposed changing the state table from to include an "error path" transition state that would allow you to build clean-up states (if needed).
i.e.:
, , ,
Correct. The way error_state
(or rather error_transition
) is implemented is if a particular state transition is retry-able or not. The current state
is set back to the last idle state when a state transition is aborted.
I updated the pr which includes cleaning up in-memory map state on transaction failure. Would we also want to clean up on-disk state too (say is the transition from STATE_UPDATE_MAPPING
to STATE_ASSOCIATED
failed)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated the pr which includes cleaning up in-memory map state on transaction failure. Would we also want to clean up on-disk state too (say is the transition from STATE_UPDATE_MAPPING to STATE_ASSOCIATED failed)?
Yeah, that would make sense to me to "rollback" on error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. I'll include this and repost the pr.
a4659b5
to
1c95a4d
Compare
changed to "DNM" -- will send a separate pr which implements state transition rollback. |
1c95a4d
to
4aae1ef
Compare
depends on #19577 |
4aae1ef
to
1f67524
Compare
Signed-off-by: Venky Shankar <vshankar@redhat.com>
1f67524
to
e2f5335
Compare
rebased |
No description provided.