Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

do not upgrade config to backend storage in api/config #214

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

kevinzs2048
Copy link
Contributor

Fix #213

@kevinzs2048
Copy link
Contributor Author

@dillaman ping for review, thanks!

rbd-target-api.py Outdated Show resolved Hide resolved
Signed-off-by: Kevin Zhao <kevin.zhao@linaro.org>
Copy link

@dillaman dillaman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the config is expected to be in a "upgraded" format, you now might hit issues in the API method since it's expecting certain fields. Is the issue just that it updates the config file even if it didn't need to upgrade?

@kevinzs2048
Copy link
Contributor Author

kevinzs2048 commented Aug 27, 2020

The situation is: when we run update target/client with multiple threads, at the same time there are api/config calls.

The metadata update at api/config side will overide the metadata that need to be changed, but not write to DB.

The scenario I find is:
When we unexport + unregister the disk AA, remove it from gw, and remote owner from it. The next step is to delete the volume. Concurrently we have assigned more disks to the client and trigger the update DB, update target/client.
When we delete it, the metadata has shown that the volume is still mapped to a special client. That means that the metadata update has conflicts.

@kevinzs2048
Copy link
Contributor Author

@dillaman is there a method to slow down the frequency of api/config calls? If so I think the situation will be better.

@dillaman
Copy link

Hmm. It's almost sounding like any API methods that will update the config should first grab the RADOS lock on the config via a with clause to ensure only one GW can update the config concurrently. I think that's a much larger change but it would resolve any race conditions.

@kevinzs2048
Copy link
Contributor Author

@dillaman Sorry for late response. I wonder why api/config method need to refresh the backend DB?
Looks the method is not a "GET" and just get config from DB and then commit the config to DB.

@dillaman
Copy link

@dillaman Sorry for late response. I wonder why api/config method need to refresh the backend DB?
Looks the method is not a "GET" and just get config from DB and then commit the config to DB.

I think it's because another GW API might have updated something so it can race w/ the true state. We just need to do a better job w/ the locking of the config object.

@kevinzs2048
Copy link
Contributor Author

@dillaman Yeah it sounds like the problem. Do you mean every changing of the "config object " and write to backend need to be locked?

@dillaman
Copy link

The Config class already has support for globally locking the on-RADOS config object via [1], but the updating of the config is currently a piecemeal operation (i.e. update parts and then commit). Instead, I would recommend adding __enter__/__exit__ methods to lock/unlock the config and make sure any reads or writes the config are only down via context manager with ... blocks.

[1] https://github.com/ceph/ceph-iscsi/blob/master/ceph_iscsi_config/common.py#L445

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Metadata conflicts due to each api/config API will refresh the metadata storage
2 participants