sheep: avoid diskfull caused by recovery process #185
Conversation
TODO: another mode that simply skips recovery and keeps the cluster running. However, it allows a low-redundancy situation and is a bit dangerous.
@mitake thanks, but I'd prefer the second suggestion, keeping the cluster running, because a node may already have data and provide redundancy for some VDIs.
@vtolstov sure, we'll provide the option later.
@sirio81 on second thought, shutdown -> reboot will cause the same diskfull problem. I'll update this patch to just skip recovery. Thanks a lot for your comment.
I tried it, but why not do this by default (avoid diskfull and don't shut down sheep)?
The new behavior of avoiding dangerous recovery must be shared by every sheep process in a cluster, so I moved the new -F option to the dog cluster format command.
@sirio81 yes, this PR is waiting for testing.
 * about space consumption by metadata objects
 * e.g. inode, ledger
 */
if (is_data_obj(oids[j]))
I think this should be if (!is_data_obj(oids[j])).
sheep can corrupt its cluster by diskfull during the recovery process. To avoid this problem, this patch adds a new option -F to dog cluster format. If this option is passed during cluster formatting, every sheep process of the cluster skips recovery when there is a possibility of diskfull during recovery.

Fixes #59

Signed-off-by: Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp>
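The core decision the patch describes can be sketched as below. This is a hypothetical simplification, not the actual sheepdog code: the real implementation walks the oids to recover per node, while this sketch only mirrors the final comparison, using sheepdog's fixed 4 MB data object size and deliberately not counting sparse metadata objects.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Sheepdog data objects have a fixed 4 MB size. */
#define SD_DATA_OBJ_SIZE (UINT64_C(4) << 20)

/*
 * Hypothetical sketch of the check: recovery is considered dangerous
 * when the space needed to recover nr_data_objs fixed-size data
 * objects exceeds the node's free space.  Sparse metadata objects
 * (inode, ledger) are not counted, matching the TODO in the patch.
 */
static bool check_diskfull_possibility(uint64_t nr_data_objs,
                                       uint64_t free_space)
{
	return nr_data_objs * SD_DATA_OBJ_SIZE > free_space;
}
```

With -F set, a true result from a check like this makes the sheep skip (or, in the final version, stop on) the recovery instead of filling the disk.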
@tmenjo thanks for your review. I resolved the invalid condition in the object size calculation and the double counting. Could you review it (not tested yet...)?
	free(key);
	continue;
}
rb_insert(&seen_objects, key, node, seen_object_cmp);
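The rb_insert() into seen_objects above is what prevents the double counting: an oid already in the tree is skipped (free the key and continue) so each object contributes to the required space only once. A minimal stand-in for that logic, using a linear scan instead of sheepdog's rbtree purely for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Simplified stand-in for the seen_objects set: count each oid only
 * once, skipping duplicates the way the free(key); continue; path
 * does in the patch.  A real implementation would use an rbtree.
 */
static size_t count_unique_oids(const uint64_t *oids, size_t nr)
{
	size_t unique = 0;

	for (size_t i = 0; i < nr; i++) {
		bool seen = false;

		for (size_t j = 0; j < i; j++) {
			if (oids[j] == oids[i]) {
				seen = true;	/* duplicate: don't count again */
				break;
			}
		}
		if (!seen)
			unique++;
	}
	return unique;
}
```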
LGTM 👍
LGTM. I'll ask Matsumura, my colleague, to test this.
 * TODO: current calculation doesn't consider
 * about space consumption by metadata objects
 * e.g. inode, ledger
 */
I think it's better to use get_objsize() to accumulate required_space_per_node[node_idx] when !is_data_obj(oids[j]) is false.
https://github.com/sheepdog/sheepdog/blob/master/include/sheepdog_proto.h#L532
Non-data objects are sparse, and get_objsize() cannot account for that; their actual size cannot be computed without fetching them. So currently I'm simply ignoring them.
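As background for the sparseness point: a file's nominal size (st_size) and its allocated size (st_blocks * 512) can differ widely for sparse files, which is why a nominal get_objsize()-style value would overestimate. A small POSIX sketch, unrelated to sheepdog's own code, for measuring actual allocation:

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Returns the bytes actually allocated on disk for path, or -1 on
 * error.  For a sparse file this can be far smaller than st_size.
 */
static long long allocated_bytes(const char *path)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return -1;
	return (long long)st.st_blocks * 512;
}
```

For example, a file ftruncate()d to 4 MB with no data written reports a 4 MB st_size while allocated_bytes() typically reports (close to) zero on filesystems that support sparse files.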
I see. If there are many deleted VDIs (i.e. sparse inode objects), required_space_per_node[node_idx] is overestimated, and fetching them is costly. Focusing on data objects seems a moderate and good idea.
I think it is better to document when the new feature is or is NOT effective (or maybe harmful), because this patch does not count ...
Each of them could lead to an unexpected disk full, and counting them in code seems difficult or costly, so documentation would be helpful. I think both of the two conditions below should be satisfied to use the new feature properly:
May I have your opinion, @mitake?
@tmenjo I agree, it is still difficult to account for the disk space consumption of the recovery process. Could you add the doc to the wiki or to files under
I verified this PR with the modified patch. Here is the procedure I followed:

# sheep --port 7000 --log dir=/root/sheeptest/log/sheep1,level=debug --cluster zookeeper:localhost:2181 /root/sheeptest/meta/sheep1,/mnt/sheepdisk1/obj -z 1
# dog cluster format --copies=2 -F
# dog vdi create testvdi1 1G
# dog node info
Total virtual image size 1.4 GB
# dog vdi write testvdi1 < /root/sheeptest/data/test0.9G.raw
# dog cluster info -v
Epoch Time Version [Host:Port:V-Nodes,,,]
# dog node info
Total virtual image size 1.4 GB
# dog node kill 2
# dog cluster info -v
Cluster status: running, auto-recovery enabled
Epoch Time Version [Host:Port:V-Nodes,,,]
# dog node info

sheep.log:
Apr 26 18:48:14 EMERG [rw 3218] check_diskfull_possibility(1247) node IPv4 ip:10.36.4.8 port:7000 will cause disk full, stopping whole cluster
@taiyo33 thanks for testing, I'm merging this PR. |
@mitake Sure. I'll open another issue about the documentation.
sheep can corrupt its cluster by diskfull during the recovery process. To avoid this problem, this patch adds a new option -F to sheep. If this option is passed to the sheep process, every sheep process of the cluster stops itself when there is a possibility of diskfull during recovery.

Fixes #59

Signed-off-by: Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp>