Feat gh project 3 heartbeat calculation improvement #89


JaydenLiang
Contributor

@JaydenLiang JaydenLiang commented Sep 24, 2021

This PR targets issue #84.

resolve #84 #85

@@ -242,6 +251,9 @@ export class ConstantIntervalHeartbeatSyncStrategy implements HeartbeatSyncStrat
this.result = HealthCheckResult.OnTime;
// no old next heartbeat time for the first heartbeat, use the current arrival time.
oldNextHeartbeatTime = heartbeatArriveTime;
// no old record for reference, use the seq provided by the device. If the seq is
// NaN, it means no sequence provided by the device, then use 0.
oldSeq = (isNaN(deviceSyncInfo.sequence) && 0) || deviceSyncInfo.sequence;
Collaborator

(isNaN(NaN) && 0) || NaN -> NaN (and likewise for undefined): the && yields 0, which is falsy, so || falls through to the original value.

You should use a ternary operator instead:

isNaN(deviceSyncInfo.sequence) ? 0 : deviceSyncInfo.sequence

Contributor Author

fixed
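
To make the pitfall flagged in this thread concrete, here is a minimal sketch comparing the two forms. The helper names seqWithAndOr and seqWithTernary are hypothetical and exist only for illustration; they are not part of the PR.

```typescript
// Hypothetical helpers that isolate the two expressions under discussion.
function seqWithAndOr(sequence: number): number {
    // `isNaN(sequence) && 0` evaluates to either `false` or `0`, both falsy,
    // so `||` always falls through to `sequence` and the 0 is never returned.
    return (isNaN(sequence) && 0) || sequence;
}

function seqWithTernary(sequence: number): number {
    // The ternary returns 0 when no usable sequence was provided.
    return isNaN(sequence) ? 0 : sequence;
}

console.log(seqWithAndOr(NaN)); // NaN
console.log(seqWithTernary(NaN)); // 0
console.log(seqWithTernary(7)); // 7
```

The `(cond && a) || b` idiom only behaves like a ternary when `a` is truthy, which is why a fallback of 0 can never come out of it.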

@@ -252,9 +264,17 @@ export class ConstantIntervalHeartbeatSyncStrategy implements HeartbeatSyncStrat
nextHeartbeatTime: nextHeartbeatTime,
syncState: HeartbeatSyncState.InSync,
syncRecoveryCount: 0, // sync recovery count = 0 means no recovery needed
seq: 1, // set to 1 because it is the first heartbeat
// use the device sequence if it exists or 1 as the initial sequence
seq: (!isNaN(deviceSyncInfo.sequence) && deviceSyncInfo.sequence) || 1,
Collaborator

Ternary please. This one is probably fine, but the ternary is clearer.

!isNaN(deviceSyncInfo.sequence) ? deviceSyncInfo.sequence : 1

Contributor Author

fixed
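
One detail worth noting about the two forms discussed in this thread: they agree for NaN and for any non-zero sequence, but differ when the device reports a sequence of 0. Whether 0 is a valid device sequence is not stated in the thread, so treat the sketch below as an illustration only. The helper names are hypothetical.

```typescript
// Hypothetical helpers wrapping the two expressions from the review.
function initialSeqAndOr(sequence: number): number {
    // For sequence === 0, `!isNaN(0) && 0` is 0 (falsy), so the result is 1.
    return (!isNaN(sequence) && sequence) || 1;
}

function initialSeqTernary(sequence: number): number {
    // For sequence === 0, the ternary keeps the device-provided 0.
    return !isNaN(sequence) ? sequence : 1;
}

console.log(initialSeqAndOr(NaN), initialSeqTernary(NaN)); // 1 1
console.log(initialSeqAndOr(5), initialSeqTernary(5)); // 5 5
console.log(initialSeqAndOr(0), initialSeqTernary(0)); // 1 0
```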

if (useDeviceSyncInfo) {
delayCalculationMethod = 'by device send time';
// check if the sequence is in an incremental order compared to the data in the db
// if not, the heartbea should be marked as outdated and to be dropped
Collaborator

heartbeat

Contributor Author

fixed

// in this situation, there are race conditions happening between some
// heartbeat requests. The reason is one autoscale handler is taking much
// longer to process another heartbeat and unable to complete before this
// heartbeat arrives at the handler (by a parallel cloud function thread).
Collaborator

technically: 'parallel cloud function process' (maybe even on a separate machine)

Contributor Author

fixed
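
The code comments in the surrounding hunks distinguish three situations when a heartbeat's sequence is compared with the one stored in the db. The sketch below is one possible reading of those comments, not the PR's implementation: the type SeqOrder and the function classifySequence are hypothetical, and treating an equal sequence as outdated is an assumption based on the "incremental order" wording.

```typescript
// Hypothetical classification of an incoming heartbeat by its sequence.
type SeqOrder = 'outdated' | 'consecutive' | 'gap';

function classifySequence(oldSeq: number, newSeq: number): SeqOrder {
    // Not in incremental order (assumption: equal counts as not incremental):
    // a newer heartbeat was already processed, so this one should be dropped.
    if (newSeq <= oldSeq) {
        return 'outdated';
    }
    // Directly follows the stored heartbeat: a delay can be calculated.
    if (newSeq === oldSeq + 1) {
        return 'consecutive';
    }
    // new seq > old seq + 1: the delay cannot be calculated from the stored record.
    return 'gap';
}

console.log(classifySequence(10, 9)); // outdated
console.log(classifySequence(10, 11)); // consecutive
console.log(classifySequence(10, 13)); // gap
```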

// for the other hb (new seq > old seq + 1), the delay cannot be calculated
// thus discarding the delay calculation, and trust it is an on-time hb
else {
delay = -1; // on-time hb must have a negative delay
Collaborator

It seems like it would be cleaner to just use null and then handle the null case later. Otherwise a -1 ms delay will be shown in the logs?

To properly account for this situation, I guess you would need to store the last 3 heartbeat sequence numbers in the database so you could retroactively count the missing sequence numbers?

10 -> 7 8 9 (+10): OK
12 -> 8 9 10 (+12): 1 missing...
11 -> 9 10 12 (+11): OK again

This is a simplified diagram for explanation. The final solution would need one record per heartbeat. It could also store the timestamp so that it can be (re)considered when the missing entries appear later (and afterward delete records whose sequence number is lower than seq - 10 or so?).

This would allow you to calculate the loss count at each heartbeat based on multiple records from the db, rather than storing one record and overwriting each time and having issues with race conditions.

Contributor Author

This is a good idea! Since we now have the sequence from the sender, I believe we can also make good use of it. Let me think it over again.
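
The multi-record idea proposed in this thread could be sketched roughly as follows. This is an in-memory illustration under stated assumptions, not the project's design: HeartbeatEntry, WINDOW_SIZE and recordHeartbeat are hypothetical names, the window of 10 comes from the "seq - 10 or so" remark, and the database is replaced by a plain array.

```typescript
// Hypothetical shape of one stored heartbeat record.
interface HeartbeatEntry {
    seq: number;
    arriveTime: number; // epoch milliseconds, kept so late arrivals can be reconsidered
}

// Assumed retention window: entries with seq <= maxSeq - WINDOW_SIZE are pruned.
const WINDOW_SIZE = 10;

function recordHeartbeat(
    stored: HeartbeatEntry[],
    incoming: HeartbeatEntry
): { entries: HeartbeatEntry[]; missing: number } {
    // Keep one entry per sequence number; the incoming one replaces a duplicate.
    const merged = stored.filter(e => e.seq !== incoming.seq).concat([incoming]);
    const maxSeq = merged.reduce((m, e) => Math.max(m, e.seq), incoming.seq);
    // Drop entries that fell out of the retention window, then sort by sequence.
    const entries = merged
        .filter(e => e.seq > maxSeq - WINDOW_SIZE)
        .sort((a, b) => a.seq - b.seq);
    const minSeq = entries.reduce((m, e) => Math.min(m, e.seq), maxSeq);
    // Missing = sequence numbers in [minSeq, maxSeq] that have no record yet.
    const missing = maxSeq - minSeq + 1 - entries.length;
    return { entries, missing };
}

const at = (seq: number): HeartbeatEntry => ({ seq, arriveTime: seq * 1000 });
console.log(recordHeartbeat([at(7), at(8), at(9)], at(10)).missing); // 0
console.log(recordHeartbeat([at(8), at(9), at(10)], at(12)).missing); // 1
console.log(recordHeartbeat([at(9), at(10), at(12)], at(11)).missing); // 0
```

The three calls reproduce the three rows of the diagram in the review comment: the loss count rises to 1 when 12 arrives before 11 and returns to 0 once 11 shows up. Gaps older than the window are no longer visible after pruning.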

// calculate delay using the arrive time (classic method)
else {
// NOTE:
// heartbeatArriveTime: the starting time of the function execution, considerred as
Collaborator

considerred -> considered

Contributor Author

fixed

snapshot
);
// NOTE: strictly update the record when the sequence to update is equal to or greater
// than the seq in the db ton ensure data not to fall back to old value in race conditions
Collaborator

ton ensure -> to ensure

Contributor Author

fixed.
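
The NOTE in the hunk above describes a guarded update: write only when the sequence to save is equal to or greater than the one already stored. A minimal in-memory sketch of that rule follows. SeqRecord, InMemoryRecordStore and saveIfNotOlder are hypothetical names, and a real database would need to evaluate the condition atomically as part of the write for the guard to hold across parallel handlers.

```typescript
// Hypothetical, trimmed-down record used only for this illustration.
interface SeqRecord {
    vmId: string;
    seq: number;
    nextHeartbeatTime: number;
}

class InMemoryRecordStore {
    private records: { [vmId: string]: SeqRecord } = {};

    // Saves the record unless a record with a greater seq is already stored.
    // Returns true when the write happened, false when it was rejected.
    saveIfNotOlder(incoming: SeqRecord): boolean {
        const current = this.records[incoming.vmId];
        if (current !== undefined && incoming.seq < current.seq) {
            return false; // stale write from a slower handler: keep the newer data
        }
        this.records[incoming.vmId] = incoming;
        return true;
    }

    get(vmId: string): SeqRecord | undefined {
        return this.records[vmId];
    }
}

const store = new InMemoryRecordStore();
console.log(store.saveIfNotOlder({ vmId: 'vm-1', seq: 5, nextHeartbeatTime: 5000 })); // true
console.log(store.saveIfNotOlder({ vmId: 'vm-1', seq: 4, nextHeartbeatTime: 4000 })); // false
console.log(store.saveIfNotOlder({ vmId: 'vm-1', seq: 5, nextHeartbeatTime: 5500 })); // true
```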

},
snapshot
);
// NOTE: strictly update the record when the sequence to update is equal to or greater
Collaborator

keeping multiple heartbeat records instead of overwriting them might improve this situation.

Contributor Author

good idea! The current heartbeat monitoring has been improved. Will take this idea into consideration next time.

@JaydenLiang
Contributor Author

latest QA/DEV release created: 3.3.3-dev.1

// the action for updating primary record
let action: 'save' | 'delete' | 'noop' = 'noop';
let redoElection = false;
let reloadPrimarRecord = false;
Collaborator

reloadPrimaryRecord ?

Contributor Author

fixed

Comment on lines 372 to 377
points += this.weightMethod1(rec);
points += this.weightMethod2(rec, allHealthCheckRecords);
points += this.weightMethod3(rec, allHealthCheckRecords);
points += this.weightMethod4(rec);
points += this.weightMethod5(rec, allHealthCheckRecords);
points += this.weightMethod6(rec, allHealthCheckRecords);
Collaborator

I think the weight methods should have descriptive names instead of 1, 2, 3, 4, 5, 6:

haveChecksumWeight() + sharedChecksumWeight() + groupedChecksumWeight() + ... etc

Contributor Author

Some weight methods have mixed, complicated conditions that are hard to capture in a self-descriptive name, so I gave up naming them that way. I prefer to keep their naming uniform and give each one a better JSDoc, which also does the job: any good IDE can display the information properly.

@jamie-pate Let me know whether self-descriptive naming is still strongly desired in this case.

Contributor Author

fixed
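
As a rough illustration of the naming suggestion in this thread, the weights could be kept in a labelled list so that both the code and any log output carry a descriptive name. Everything here is a sketch under assumptions: ChecksumRecord is a trimmed-down stand-in for HealthCheckRecord, only the first rule mirrors code visible in the PR (weightMethod1), and the body of sharedChecksumWeight is invented for the example.

```typescript
// Hypothetical stand-in: only deviceChecksum is visible in the PR.
interface ChecksumRecord {
    deviceChecksum: string | null;
}

type WeightFn = (rec: ChecksumRecord, all: ChecksumRecord[]) => number;

const namedWeights: { label: string; weigh: WeightFn }[] = [
    {
        // Mirrors weightMethod1 from the PR: one point for having a checksum.
        label: 'haveChecksumWeight',
        weigh: rec => (rec.deviceChecksum !== null ? 1 : 0),
    },
    {
        // Invented rule for illustration: one point if another record
        // reports the same checksum.
        label: 'sharedChecksumWeight',
        weigh: (rec, all) =>
            rec.deviceChecksum !== null &&
            all.some(o => o !== rec && o.deviceChecksum === rec.deviceChecksum)
                ? 1
                : 0,
    },
];

function totalPoints(rec: ChecksumRecord, all: ChecksumRecord[]): number {
    let points = 0;
    for (const w of namedWeights) {
        points += w.weigh(rec, all);
    }
    return points;
}

const a: ChecksumRecord = { deviceChecksum: 'abc' };
const b: ChecksumRecord = { deviceChecksum: 'abc' };
const c: ChecksumRecord = { deviceChecksum: null };
console.log(totalPoints(a, [a, b, c])); // 2
console.log(totalPoints(c, [a, b, c])); // 0
```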

* @returns {number} point
*/
weightMethod1(rec: HealthCheckRecord): number {
return (rec.deviceChecksum !== null && 1) || 0;
Collaborator

Prefer rec.deviceChecksum !== null ? 1 : 0

Misusing boolean operators this way is confusing.

Contributor Author

fixed

@JaydenLiang JaydenLiang force-pushed the feat_gh_project_3_heartbeat_calculation_improvement branch from 432a578 to 971b741 Compare December 7, 2021 22:28
@JaydenLiang JaydenLiang merged commit 207e7ff into staging_project_3_heartbeat_calculation_improvement Dec 7, 2021
heartbeat calculation improvement automation moved this from In progress to Done Dec 7, 2021
3 participants