Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add sync status #146

Merged
merged 8 commits into from
Jun 6, 2020
Merged

Add sync status #146

merged 8 commits into from
Jun 6, 2020

Conversation

ThisIsClark
Copy link
Collaborator

@ThisIsClark ThisIsClark commented Jun 3, 2020

What this PR does / why we need it:
Add sync status to avoid two sync tasks for the same device are running at the same time.
For a storage device, add a int(named sync_status) to indicate the sync status, each bit of this int indicates a resource.
When a sync task for a resource is running for a device, set the special bit of sync_status of this device to 1, and when sync is done, set the special bit back to 0.
When received request to sync resource of a device, check if all the bit of sync_status is 0(means no sync task is running for this device), if not, this request would not be handled.

Which issue this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close that issue when PR gets merged): fixes #

Special notes for your reviewer:

Release note:

For 'sync' api, if sync_status is not 0, raise a exception.
For 'sync_all' api, if sync_status of a device is not 0, skip this device and handle the next device.

Test case:

|-----------------------------------------------------------------------------------------------------|-----------------------------------------------|
| Test case                                     | Expected behavior for sync                          | Expected behavior for sync_all                |
|-----------------------------------------------------------------------------------------------------|-----------------------------------------------|
| No sync task is running                       | New sync task should run normally                   | New sync tasks should run normally            |
|-----------------------------------------------------------------------------------------------------------------------------------------------------|
| One of the sync task is running for a device  | New sync task should not run and raise an exception | New sync task of that device should not run,  |
|                                               |                                                     | but other devices' tasks should run normally  |
|-----------------------------------------------------------------------------------------------------------------------------------------------------|
| All of the sync task is running for a device  | New sync task should not run and raise an exception | New sync task of that device should not run,  |
|                                               |                                                     | but other devices' tasks should run normally  |
|-----------------------------------------------------------------------------------------------------------------------------------------------------|

@codecov-commenter
Copy link

codecov-commenter commented Jun 3, 2020

Codecov Report

Merging #146 into master will increase coverage by 0.60%.
The diff coverage is 58.66%.

@@            Coverage Diff             @@
##           master     #146      +/-   ##
==========================================
+ Coverage   28.54%   29.14%   +0.60%     
==========================================
  Files          61       61              
  Lines        3714     3785      +71     
  Branches      420      425       +5     
==========================================
+ Hits         1060     1103      +43     
- Misses       2640     2668      +28     
  Partials       14       14              
Impacted Files Coverage Δ
dolphin/task_manager/manager.py 0.00% <0.00%> (ø)
dolphin/api/v1/storages.py 41.44% <18.18%> (-4.72%) ⬇️
dolphin/api/views/storages.py 36.36% <25.00%> (-6.50%) ⬇️
dolphin/task_manager/tasks/task.py 28.31% <59.09%> (+6.57%) ⬆️
dolphin/common/constants.py 100.00% <100.00%> (ø)
dolphin/db/sqlalchemy/models.py 94.73% <100.00%> (+0.11%) ⬆️
dolphin/utils.py 30.74% <100.00%> (+3.14%) ⬆️

def sync_task(resource_type):

@decorator.decorator
def _sync_task(f, *a, **k):
Copy link
Collaborator

@sfzeng sfzeng Jun 3, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest to set bits for all resources from the producer side. Producer set all bits at once, and consumer for a specific resource just reset the bit for it's own.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, please check again

VOLUME = 2
ACCESS_INFO = 3
DISK = 4
ALERT_SOURCE = 5
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACCESS_INFO and ALERT_SOURCE are not part of task manager .How do we check if there is some activty happening for these resources like ( update access-info , or remove alert source ) ??

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed all of the rest except STORAGE_DEVICE, POOL and VOLUME.

0,
len(constants.ResourceType) - 1,
constants.SyncStatus.SYNCING)
db.storage_update(ctxt, storage['id'], storage)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think about the scenario that two sync_all rquest arived at the same time, let's say request_A and request_B, handle steps as below:

  1. request_A get storage info first, and then request_B get storage info too,
  2. request_A check sync_status is not syncing so update it to be syncing,
  3. request_B check sync_status is not syncing so it will update the sync status again, which is not correct.
    So read and update sync_status need lock.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@sfzeng
Copy link
Collaborator

sfzeng commented Jun 4, 2020

@ThisIsClark , we also need to think about the dead lock scenario, say one consumer crashed and the coresponding bit will neber be reset.

Copy link
Collaborator

@wisererik wisererik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to covert sync_status to a friendly status before return in GET /v1/storages and GET /v1/storages/id

dolphin/api/v1/storages.py Outdated Show resolved Hide resolved
dolphin/api/v1/storages.py Outdated Show resolved Hide resolved
wisererik
wisererik previously approved these changes Jun 5, 2020
NajmudheenCT
NajmudheenCT previously approved these changes Jun 5, 2020
Copy link
Member

@NajmudheenCT NajmudheenCT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ThisIsClark ThisIsClark dismissed stale reviews from NajmudheenCT and wisererik via 549e0e5 June 6, 2020 06:44
@ThisIsClark
Copy link
Collaborator Author

@ThisIsClark , we also need to think about the dead lock scenario, say one consumer crashed and the coresponding bit will neber be reset.

Raised a issue: #165

Copy link
Collaborator

@sfzeng sfzeng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants