HDDS-5794. The misleading "No available thread in pool for past * second" log message in DN StateContext #2693
Will this be threadPoolNotAvailableCount instead?
Yes, thanks for pointing it out. @ayushtkn, could you help take another look at the change?
I am not able to understand this calculation of unavailableTime.
Say lastHeartbeatSent was at 5.
The thread pool goes unavailable at 8, so the value we set is 5 - 8 = -3, negative 3?
In the next iteration, say it is now 11 and the thread pool still isn't available, so we set -3 - 11 = -14, negative 14?
The pool has been unavailable from 5 to 11, so it should be 6, right?
@ChenSammi, can you explain a bit more about the logic here?
Sorry, it should be "System.currentTimeMillis() - lastHeartbeatSent.get()".
long unavailableTime = threadPoolNotAvailableTimeSum.addAndGet(
    System.currentTimeMillis() - lastHeartbeatSent.get());
Trying to decode the calculation after the change:
Say:
Last Heartbeat sent at : 5
First Unavailable at 8: So here unavailableTime = 8 - 5 = 3
Second Unavailable at 11: So here unavailableTime = 11 - 5 + 3 = 9
Shouldn't this be 6? The last heartbeat was at 5 and we are at 11, so the
unavailable time should be 11 - 5 = 6.
Can you help clarify?
So basically there is no need to sum up the unavailable time. "System.currentTimeMillis() - lastHeartbeatSent.get()" is enough. I will update it accordingly.
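The difference between the two calculations can be reproduced with the numbers from the comments above. This is only a minimal sketch: the class `UnavailableTimeSketch` and its two methods are hypothetical names, the fields merely mirror the identifiers quoted in the diff, and timestamps are passed in explicitly so the arithmetic is deterministic.

```java
import java.util.concurrent.atomic.AtomicLong;

public class UnavailableTimeSketch {
    // Hypothetical stand-ins for the fields quoted in the review.
    private final AtomicLong lastHeartbeatSent = new AtomicLong();
    private final AtomicLong threadPoolNotAvailableTimeSum = new AtomicLong();

    public UnavailableTimeSketch(long lastHeartbeat) {
        lastHeartbeatSent.set(lastHeartbeat);
    }

    // Accumulating variant: every call adds the full elapsed time again,
    // so the interval since the last heartbeat is counted repeatedly.
    public long cumulativeUnavailableTime(long now) {
        return threadPoolNotAvailableTimeSum.addAndGet(now - lastHeartbeatSent.get());
    }

    // Direct variant: the elapsed time since the last heartbeat, nothing more.
    public long directUnavailableTime(long now) {
        return now - lastHeartbeatSent.get();
    }

    public static void main(String[] args) {
        UnavailableTimeSketch sketch = new UnavailableTimeSketch(5);
        System.out.println(sketch.cumulativeUnavailableTime(8));   // 3
        System.out.println(sketch.cumulativeUnavailableTime(11));  // 9
        System.out.println(sketch.directUnavailableTime(8));       // 3
        System.out.println(sketch.directUnavailableTime(11));      // 6
    }
}
```

With the last heartbeat at 5, the accumulating variant reports 9 at time 11, while the direct difference reports the expected 6.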
ayushtkn left a comment
Looks almost cool to me. @adoroszlai, you touched this code last. Would you be able to help double-check this once?
threadPoolNotAvailableCount.set(0);
task.execute(service);
lastHeartbeatSent.set(System.currentTimeMillis());
Any reason for not using Time.monotonicNow()? That should be a safer bet IMO?
Yes, I see that Time.monotonicNow() is used in some places, but it involves a division operation. What's the particular benefit of using Time.monotonicNow() over System.currentTimeMillis()? I don't know the story behind it. Would you like to shed some light on it?
I think that is to be used if we want to calculate the difference between two times. The first bit of info I got is from the Javadoc:
https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/Time.java#L51
In Hadoop they changed it everywhere: https://issues.apache.org/jira/browse/HDFS-6841
Will try to find some more descriptive doc and share.
I wrote a small test to observe System.currentTimeMillis(). Between the two runs, I changed the Linux system time. System.currentTimeMillis() really is affected by the Linux system time. That's the meaning of "because it will be broken by settimeofday". I will change it to Time.monotonicNow(). Thanks @ayushtkn for the info.
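For readers unfamiliar with the distinction, here is a minimal sketch of measuring elapsed time against a monotonic source. It does not use Hadoop's Time class; `monotonicNowMillis()` and the class name are hypothetical, and the helper assumes (as the division remark above suggests) that a monotonic millisecond clock can be derived from System.nanoTime(), which the JDK documents as suitable only for measuring elapsed time and unrelated to wall-clock time.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class MonotonicElapsedSketch {
    // Hypothetical field mirroring lastHeartbeatSent from the diff.
    private final AtomicLong lastHeartbeatSent = new AtomicLong(monotonicNowMillis());

    // Hypothetical helper: milliseconds derived from System.nanoTime().
    // The value is only meaningful as a difference between two readings
    // and does not jump when the system wall clock is changed.
    public static long monotonicNowMillis() {
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime());
    }

    public void markHeartbeatSent() {
        lastHeartbeatSent.set(monotonicNowMillis());
    }

    public long millisSinceLastHeartbeat() {
        return monotonicNowMillis() - lastHeartbeatSent.get();
    }

    public static void main(String[] args) throws InterruptedException {
        MonotonicElapsedSketch sketch = new MonotonicElapsedSketch();
        sketch.markHeartbeatSent();
        Thread.sleep(50);
        // Prints a small non-negative number, even if the wall clock was adjusted meanwhile.
        System.out.println("ms since last heartbeat: " + sketch.millisSinceLastHeartbeat());
    }
}
```

A difference of two System.currentTimeMillis() readings, by contrast, can come out negative or far too large if the system time is set between them, which is the failure the test described above reproduced.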
Thanks @ayushtkn for the code review.
https://issues.apache.org/jira/browse/HDDS-5794