HBASE-21405 [DOC] Add Details about Output of "status 'replication'" #1894
🎊 +1 overall
This message was automatically generated.
Left two minor points. Noticed that we still have "TimeStampXXX" here. I remember that we started to move it to "TimestampXXX" in a few places, but it seems this wasn't finished everywhere.
I think we changed most of the metric variable names, but in this case this is hardcoded text printed by the related Ruby files (admin.rb, I guess). Nevertheless, this doc describes the output of the command as it is now, so we should document how it actually prints the info.
💔 -1 overall
This message was automatically generated.
LGTM. Agree that we should have the docs as is. Just noticed that there are probably still places where we need to unify our spelling.
+1
SOURCE: PeerID=1
Normal Queue: 1
AgeOfLastShippedOp=0, TimeStampOfLastShippedOp=Fri Jun 12 18:49:23 BST 2020, SizeOfLogQueue=1, EditsReadFromLogQueue=1, OpsShippedToTarget=1, TimeStampOfNextToReplicate=Fri Jun 12 18:49:23 BST 2020, Replication Lag=0
SINK: TimeStampStarted=1591983663458, AgeOfLastAppliedOp=0, TimeStampsOfLastAppliedOp=Fri Jun 12 18:57:18 BST 2020
Just a question: for active-active replication, to get the peerId of both clusters (each defined in the other), we need to run status 'replication' on both clusters, right?
Getting ageOfLastShipped and similar metric values from the remote cluster is also not that easy, even if we wanted to display them here.
Just a question: for active-active replication, to get the peerId of both clusters (each defined in the other), we need to run status 'replication' on both clusters, right?
Yes. The command only shows the context of an individual cluster, listing overall stats about that cluster's source queues and sink threads.
Getting ageOfLastShipped and similar metric values from the remote cluster is also not that easy, even if we wanted to display them here.
This "ageOfLastShipped" metric belongs to the source cluster. On the source, the ReplicationSourceShipper thread reads entries from the WAL and makes synchronous RPC calls to ReplicationSink on the target. If the call succeeds, ReplicationSourceShipper takes that time, subtracts the edit entry's time from it, and records the result as "ageOfLastShipped". So "ageOfLastShipped" is how long a given edit took from entering the source cluster until the source cluster considered it successfully replicated.
@virajjasani do you think this metric description should be improved? It looks like the current text is not that clear.
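To make the description above concrete, here is a minimal sketch of the arithmetic only. It is not the actual ReplicationSourceShipper code: the function name, parameter names, and timestamps are hypothetical, chosen purely for illustration.

```python
def age_of_last_shipped_op(edit_write_time_ms, ship_success_time_ms):
    """Illustrative only: time between an edit entering the source cluster
    and the source seeing its synchronous ship call to the sink succeed."""
    return ship_success_time_ms - edit_write_time_ms


# Hypothetical timestamps in milliseconds since the epoch.
edit_written_at = 1591983663000       # when the edit was written on the source
ship_succeeded_at = 1591983663250     # when the RPC to the sink returned OK

print(age_of_last_shipped_op(edit_written_at, ship_succeeded_at))  # 250
```

Both timestamps are taken on the source side, which is why the value says nothing about what the target cluster itself measures.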
Oh no, my bad. I know the ageOfLastShipped metric; I just used it as an example to say that we can't get similar metrics from the destination cluster anyway.
My question was: do we really have any metric (from a replication viewpoint) available to us from the destination cluster? (I don't know of any such metric so far, I hope we can't know.) For example, cluster A knows its own ageOfLastShipped and ageOfLastApplied, but does it know cluster B's ageOfLastShipped and ageOfLastApplied if both clusters are each other's peers?
And looking at this command output, I was wondering whether we could display both clusters' metrics together (in the active-active case), but that might not be possible (and might not even be worth spending time on).
Something like this would be really fancy, though by no means necessary (hbase01.home belongs to cluster A and hbase02.home belongs to cluster B; this applies only when two-way replication is set up):
hbase01.home:
SOURCE: PeerID=1
Normal Queue: 1
AgeOfLastShippedOp=0, TimeStampOfLastShippedOp=Fri Jun 12 18:49:23 BST 2020, SizeOfLogQueue=1, EditsReadFromLogQueue=1, OpsShippedToTarget=1, TimeStampOfNextToReplicate=Fri Jun 12 18:49:23 BST 2020, Replication Lag=0
SINK: TimeStampStarted=1591983663458, AgeOfLastAppliedOp=0, TimeStampsOfLastAppliedOp=Fri Jun 12 18:57:18 BST 2020
hbase02.home:
SOURCE: PeerID=1
Normal Queue: 1
AgeOfLastShippedOp=0, TimeStampOfLastShippedOp=Fri Jun 12 18:49:23 BST 2020, SizeOfLogQueue=1, EditsReadFromLogQueue=1, OpsShippedToTarget=1, TimeStampOfNextToReplicate=Fri Jun 12 18:49:23 BST 2020, Replication Lag=0
SINK: TimeStampStarted=1591983663458, AgeOfLastAppliedOp=0, TimeStampsOfLastAppliedOp=Fri Jun 12 18:57:18 BST 2020
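Since each cluster only reports its own view, a combined display like the one above would have to be assembled outside the shell from output captured on each cluster. The following is a rough sketch of parsing that text, assuming the layout is exactly as in the sample in this thread; `parse_metrics` and `parse_status` are hypothetical helper names, not part of HBase, and the sketch assumes a single peer per server (several peers would overwrite each other's source metrics).

```python
def parse_metrics(line):
    """Split 'Key=Value, Key=Value' pairs into a dict of strings."""
    pairs = (item.split("=", 1) for item in line.split(", ") if "=" in item)
    return {key.strip(): value.strip() for key, value in pairs}


def parse_status(text):
    """Group SOURCE and SINK metrics under each 'hostname:' header line."""
    servers, current = {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if line.endswith(":"):
            # A bare 'hostname:' line starts a new server section.
            current = servers.setdefault(line[:-1], {"source": {}, "sink": {}})
        elif current is None:
            continue  # ignore anything before the first hostname header
        elif line.startswith("SOURCE:"):
            current["source"].update(parse_metrics(line[len("SOURCE:"):]))
        elif line.startswith("SINK:"):
            current["sink"].update(parse_metrics(line[len("SINK:"):]))
        elif "=" in line:
            # Metric lines after 'Normal Queue' belong to the source section.
            current["source"].update(parse_metrics(line))
    return servers


sample = """\
hbase01.home:
SOURCE: PeerID=1
Normal Queue: 1
AgeOfLastShippedOp=0, SizeOfLogQueue=1, Replication Lag=0
SINK: TimeStampStarted=1591983663458, AgeOfLastAppliedOp=0
"""

status = parse_status(sample)
print(status["hbase01.home"]["source"]["Replication Lag"])  # 0
```

Feeding the text captured from cluster A and from cluster B through `parse_status` separately and merging the two dictionaries would give the side-by-side view without the shell needing any knowledge of the remote cluster.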
Anyway, even if one cluster could know the other's metrics, that is not related to this PR; it is good to merge :)
Moreover, the description looks good to me.
My question was: do we really have any metric (from a replication viewpoint) available to us from the destination cluster? (I don't know of any such metric so far, I hope we can't know.) For example, cluster A knows its own ageOfLastShipped and ageOfLastApplied, but does it know cluster B's ageOfLastShipped and ageOfLastApplied if both clusters are each other's peers?
Ah, yeah, metrics from the remote cluster are not available at the source, and vice versa.
And looking at this command output, I was wondering whether we could display both clusters' metrics together (in the active-active case), but that might not be possible (and might not even be worth spending time on).
Maybe too much for a shell command, but it looks like a great idea for the replication stats page in the UI. The main problem I see is that it's not so easy to identify whether a given cluster is actually a target, as it always exposes the sink RPC interface. I will dig around further and see if I can come up with something not too complex.
I agree, it is too much for the shell, and determining whether the target cluster is also actively pushing WAL edits to the current cluster is not that straightforward either.
Anyway, this was just a thought for some time in the future. Who knows, we might have this status display feature one day :)
…pache#1894) Signed-off-by: Jan Hentschel <jan.hentschel@ultratendency.com> Signed-off-by: Viraj Jasani <vjasani@apache.org> (cherry picked from commit 3ac99ad)