New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
osd: replace hb_out and hb_in with a single hb_peers #12178
Conversation
@liewegas , Please help take a look. Thanks. |
@@ -759,8 +759,8 @@ void OSDService::update_osd_stat(vector<int>& hb_peers) | |||
{ | |||
Mutex::Locker lock(stat_lock); | |||
|
|||
osd_stat.hb_in.swap(hb_peers); | |||
osd_stat.hb_out.clear(); | |||
osd_stat.hb_out.swap(hb_peers); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@liupan1111 i don't follow you. the only caller of OSDService::update_osd_stat()
is OSD::heartbeat()
, where OSD::hb_peers
is passed in. where hb_peers
is the peer osds which are UP. why would you want to store it in "hb_out" instead?
also, i found that there is a tracker ticket related to "hb_in" and "hb_out": http://tracker.ceph.com/issues/2896.
if "hb_out" is vestigal, maybe we can simply remove it from osd_stat_t
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tchaikov. First, it is correct, update_osd_stat() is only called by OSD::heartbeat, and hb_peers is an input parameter for lupdate_osd_stat.
Let's take a look how is "hb_peers" used in OSD::heartbeat:
"hb_peers" is got from heartbeat_peers, and heart beat will be sent to these peers one by one from current osd by using send_message.
In this way, these hb_peers are hb_out, not hb_in. That is what I want to fix in this PR.
Then for the issue 2896. As I described above, "hb_out" is what we want to display, and "hb_in" is vestigal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oic, the hb_out
stands for "peers for outgoing heartbeat messages".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we take this chance to remove hb_in
in a separated commit in this PR?
@liewegas what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:) yes. do you agree to remove hb_in? I could give a commit to do this.
Yeah, I'd rather remove it (and possiblye rename the remaining field hb_peers or something).
|
sure, i will remove hb_in and rename hb_out in seperate commits in this PR. |
@@ -335,8 +331,7 @@ void osd_stat_t::encode(bufferlist &bl) const | |||
::encode(kb_avail, bl); | |||
::encode(snap_trim_queue_len, bl); | |||
::encode(num_snap_trimming, bl); | |||
::encode(hb_in, bl); | |||
::encode(hb_out, bl); | |||
::encode(hb_peers, bl); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can't drop a field or else we break compatibility. hb_peers replaces hb_in. in place of hb_out, ::encode((uint32_t)0, bl);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also, update the encoding version (ENCODE_START(5, 2, bl)) just in case we need to distinguish between the encodings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
change the 4 to 5 in the DECODE_START_.. macro in decode() too
@@ -350,8 +345,7 @@ void osd_stat_t::decode(bufferlist::iterator &bl) | |||
::decode(kb_avail, bl); | |||
::decode(snap_trim_queue_len, bl); | |||
::decode(num_snap_trimming, bl); | |||
::decode(hb_in, bl); | |||
::decode(hb_out, bl); | |||
::decode(hb_peers, bl); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
uint32_t num_hb_out;
::decode(num_hb_out, bl);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The encode/decode issues aside, this looks fine. Can you squash it down into a single commit?
f->open_array_section("hb_out"); | ||
for (vector<int>::const_iterator p = hb_out.begin(); p != hb_out.end(); ++p) | ||
f->open_array_section("hb_peers"); | ||
for (vector<int>::const_iterator p = hb_peers.begin(); p != hb_peers.end(); ++p) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit, could use range-based loop, like
for (auto peer : hb_peers) {
f->dump_int("osd", p);
}
::encode(op_queue_age_hist, bl); | ||
::encode(fs_perf_stat, bl); | ||
ENCODE_FINISH(bl); | ||
} | ||
|
||
void osd_stat_t::decode(bufferlist::iterator &bl) | ||
{ | ||
DECODE_START_LEGACY_COMPAT_LEN(4, 2, 2, bl); | ||
uint32_t num_hb_out; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit, could just move this line to where it is firstly used (line:352)
Signed-off-by: Pan Liu <pan.liu@istuary.com>
This PR was part of this tests http://pulpito.ceph.com/yuriw-2016-11-28_20:50:22-rados-wip-yuri-testing_2016_11_28-distro-basic-smithi/ @liewegas - pls merge |
Signed-off-by: Pan Liu pan.liu@istuary.com
http://tracker.ceph.com/issues/18057