Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[TC57] Unit test to verify deletion of rebuild_snap once rebuild finished #200

Merged
merged 5 commits into from
Feb 19, 2019

Conversation

mynktl
Copy link
Member

@mynktl mynktl commented Feb 16, 2019

Changes:

  1. Unit-test to verify deletion of rebuild_snap once dataset becomes healthy.

  2. Handling of rebuild_snap deletion in timer_thread.

    • To track the stale clone during runtime, I have introduced a new variable stale_clone_exist in zvol_rebuild_info_t. stale_clone_exist is protected by rebuild_mtx of zvol_state_t.

    • Once uzfs_zvol_rebuild_scanner completes the rebuilding process fully, it will try to delete the stale clone. If this deletion fails then it will set stale_clone_exist to 1.

    • timer_thread will periodically check for the stale clone (if zv->rebuild_info.stale_clone_exists != 0). If stale clone found on any dataset then timer_thread will try to delete the rebuild_clone and rebuild_snap. Upon successful removal of rebuild_clone and rebuild_snap, timer_thread will update stale_clone_exist variable with value 0.

Signed-off-by: mayank mayank.patel@cloudbyte.com

…shed

Signed-off-by: mayank <mayank.patel@cloudbyte.com>
if (ret != 0) {
LOG_ERR("Rebuild_clone snap destroy failed on:%s"
" with err:%d", clone_zv->zv_name, ret);
" with err:%d", zv->zv_name, ret);
}

uzfs_zvol_release_internal_clone(zv, snap_zv, clone_zv);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if clone_zv is NULL, there can be panic in this release_internal_clone.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@@ -545,10 +545,10 @@ uzfs_zvol_destroy_snapshot_clone(zvol_state_t *zv, zvol_state_t *snap_zv,
clone_zv->zv_name, clonename, zv->zv_name);

/* Destroy clone's snapshot */
ret = uzfs_destroy_internal_all_snap(clone_zv);
ret = uzfs_destroy_all_internal_snapshots(clone_zv);
if (ret != 0) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should break if ret is nonzero. Any reason why we are not breaking from here?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant - return if ret is nonzero

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes. it should return to caller if uzfs_destroy_all_internal_snapshots return non-zero.

@@ -545,10 +545,10 @@ uzfs_zvol_destroy_snapshot_clone(zvol_state_t *zv, zvol_state_t *snap_zv,
clone_zv->zv_name, clonename, zv->zv_name);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here as well, we are using clone_zv and snap_zv.. so, these two can't be NULL

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes. In this function, there are no checks regarding non-null of clone and snap zv. Error checking should be added to this function. I will update this function with error checking of all zv.

REBUILD_SNAPSHOT_CLONENAME);

ret = uzfs_open_dataset(zv->zv_spa, clone_subname, &l_clone_zv);
if (ret == 0) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should handle error here. Any reason why we are not handling error?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. I will update this.

* cloned volume and backing snapshot.
*/
int
uzfs_zinfo_destroy_stale_clone(zvol_info_t *zinfo)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if open_dataset or hold_dataset of clone fails, its better to retry next time by releasing snap_zv and returning error.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok.

if (ret != 0) {
LOG_ERR("Rebuild_clone snap destroy failed on:%s"
" with err:%d", clone_zv->zv_name, ret);
" with err:%d", zv->zv_name, ret);
}

uzfs_zvol_release_internal_clone(zv, snap_zv, clone_zv);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all the callers of uzfs_zvol_release_internal_clone are setting snapshot_zv and clone_zv to NULL. So, release need to happen. Lets add this as comment for later usage.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

even if all snapshots deletion fails, we need to release and exit.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

Signed-off-by: mayank <mayank.patel@cloudbyte.com>
REBUILD_SNAPSHOT_CLONENAME);

ret = uzfs_open_dataset(zv->zv_spa, clone_subname, &l_clone_zv);
if (ret == 0) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'else' of this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

mayank added 2 commits February 18, 2019 16:11
Signed-off-by: mayank <mayank.patel@cloudbyte.com>
Signed-off-by: mayank <mayank.patel@cloudbyte.com>
#endif

if (wquiesce) {
if (uzfs_zinfo_destroy_internal_clone(zinfo)) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in else, better to set the stale_clone_exist to 0

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok.

vishnuitta
vishnuitta previously approved these changes Feb 18, 2019
Copy link

@vishnuitta vishnuitta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changes are good

@vishnuitta
Copy link

one minor comment to reset the flag to zero on successful deletion of snap/clone in dw_replica

Signed-off-by: mayank <mayank.patel@cloudbyte.com>
@codecov-io
Copy link

Codecov Report

Merging #200 into zfs-0.7-release will increase coverage by 0.23%.
The diff coverage is 77.5%.

Impacted file tree graph

@@                 Coverage Diff                 @@
##           zfs-0.7-release     #200      +/-   ##
===================================================
+ Coverage            52.09%   52.33%   +0.23%     
===================================================
  Files                  240      240              
  Lines                78204    78236      +32     
===================================================
+ Hits                 40742    40946     +204     
+ Misses               37462    37290     -172

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c5ee05b...8bfe56d. Read the comment docs.

Copy link

@vishnuitta vishnuitta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changes are good

@vishnuitta vishnuitta merged commit 10610ef into mayadata-io:zfs-0.7-release Feb 19, 2019
mynktl added a commit to mynktl/cstor that referenced this pull request Feb 19, 2019
… cases (mayadata-io#200)

Signed-off-by: mayank <mayank.patel@cloudbyte.com>
vishnuitta pushed a commit that referenced this pull request Feb 19, 2019
* [DE249]fix(zrepl): destroying all internal snapshots of rebuild clone dataset once rebuild completes (#199)
* [TC57] fix(stale clone): delete stale volume in timer fn and its test cases (#200)
* fix(memleak): freeing string in error case in uzfs_zinfo_destroy_stale_clone (#201)

Signed-off-by: mayank <mayank.patel@cloudbyte.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants