Integrate failure detector into client #21

rosalogia · 2022-03-30T21:05:24Z

This PR integrates the failure detection component introduced in #5 by @gsebil08 into the client, using the systems introduced in #8. It adds a failure detector as a field of the client and periodically runs its message handlers and auto-probing functions as part of the client's server. Unfortunately, a cyclic dependency issue required moving all of the failure detection code into the client module. I would love to explore solutions to this, as it makes the client module feel quite messy. The failure detector integration has not been tested yet, but the existing tests still pass.

* Moves failure detector module into client module to deal with cyclic dependency issue
* Adds failure detection inbox to default initialization
* Having 0 active peers no longer causes the program to crash
* Updates routers to allow non request/response messages to pass
* Adds optional argument to client init for initial peer list
* Calls failure detector functions asynchronously in server
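
The PR does not say what caused the crash with 0 active peers. One common source of this kind of failure is choosing a random probe target from an empty list, so the sketch below shows a selection helper that returns an option instead of raising. The name `pick_random_peer` is hypothetical and is not taken from the PR.

```ocaml
(* Hypothetical helper, not the PR's code: choose a random probe target.
   Returns None for an empty peer list instead of raising, so the caller
   can simply skip the probe round when no peers are known. *)
let pick_random_peer (peers : 'a list) : 'a option =
  match peers with
  | [] -> None
  | _ ->
    (* Random.int requires a strictly positive bound, which the
       non-empty match arm guarantees. *)
    Some (List.nth peers (Random.int (List.length peers)))
```

A caller can then match on the result and skip the round on `None`, which keeps the empty case explicit at the call site.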

Gau-thier

This was not easy to review (re-organization + rework), but...
Nice job! We are really close to getting something great!
I think it also solves (or at least is linked to) the sequence_number issue #18.
I took the liberty of running `esy b dune build @fmt --auto-promote` to fix some needless trouble on the CI side.

lib/client.ml

Gau-thier · 2022-03-31T13:12:44Z

+    let new_seq_no = next_seq_no t in
+    let _ = send_ping_to client peer_to_update in
+    match%lwt wait_ack_timeout t new_seq_no t.config.round_trip_time with
+    | Ok _ -> Lwt.return ()


Are we missing an action here?
I think we should update the status of the Ack sender (the recipient of the Ping message) to Alive if it replies. Maybe this peer was Suspicious or Faulty before, but since it now replies, that is no longer the case.
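
A minimal sketch of the suggested behaviour, written as a pure function so it does not depend on Lwt or on the project's real types. The `status` and `probe_outcome` types and the `status_after_probe` name are assumptions made for illustration; only the status names Alive, Suspicious and Faulty come from this thread.

```ocaml
(* Hypothetical status type; the project's own definition may differ. *)
type status = Alive | Suspicious | Faulty

(* Outcome of one probe round, modelled as a plain value rather than
   the Lwt result used in the actual code. *)
type probe_outcome = Acked | Timed_out

(* An Ack restores the peer to Alive whatever its previous status was.
   A timeout leaves the status unchanged in this sketch; how to escalate
   on a missing Ack is a separate decision. *)
let status_after_probe (previous : status) (outcome : probe_outcome) : status =
  match outcome with
  | Acked -> Alive
  | Timed_out -> previous
```

In the hunk above, the `Ok _` branch would be the natural place to apply such a transition before returning.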

lib/client.ml

rosalogia · 2022-03-31T13:54:01Z

+           This is correct in the basic SWIM protocol, but it is a very heavy penalty.
+           When there is no ACK (direct or indirect) the peer must be set to `Suspicious`.
+           See section 4.2 from https://www.cs.cornell.edu/projects/Quicksilver/public_pdfs/SWIM.pdf *)
+        let _ = update_neighbor_status peer_of_client peer_to_update Faulty in


This is wrong now that I think about it. Operating on the peer_of_client has no effect whatsoever.
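
One plausible reading of this comment, sketched under assumptions: if `peer_of_client` builds a fresh value instead of returning the entry held in the client's peer list, then updating that value never reaches the stored state. The `peer` record, the `peers` table and both function names below are hypothetical and only illustrate the difference between updating a detached copy and updating through the store.

```ocaml
type status = Alive | Suspicious | Faulty

(* Hypothetical immutable peer record: "updating" it yields a new value. *)
type peer = { address : string; status : status }

(* Hypothetical peer store keyed by address. *)
type peers = (string, peer) Hashtbl.t

(* Updates a detached copy. The store is not touched, so the change is
   lost unless the caller writes the result back. *)
let update_copy (p : peer) (s : status) : peer = { p with status = s }

(* Updates through the store, replacing the stored entry.
   Unknown addresses are ignored. *)
let update_in_store (store : peers) (address : string) (s : status) : unit =
  match Hashtbl.find_opt store address with
  | Some p -> Hashtbl.replace store address { p with status = s }
  | None -> ()
```

With this shape, marking a silent peer as `Suspicious`, as the code comment in the hunk suggests, only takes effect when it goes through something like `update_in_store`.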

lib/client.ml

rosalogia · 2022-03-31T13:57:09Z

+    let new_seq_no = next_seq_no t in
+    let _ = send_ping_to client peer_to_update in
+    match%lwt wait_ack_timeout t new_seq_no t.config.round_trip_time with
+    | Ok _ -> Lwt.return ()


rosalogia added 3 commits March 28, 2022 11:52

Modified failure detector to use new send function

7f8488f

Added peers field to client

21e1a9c

rosalogia added the enhancement New feature or request label Mar 30, 2022

rosalogia added this to the Working Gossip Protocol milestone Mar 30, 2022

rosalogia requested a review from Gau-thier March 30, 2022 21:05

Gau-thier force-pushed the @rosalogia/integrate-failure-detector branch 2 times, most recently from 0cecb08 to 97640ab Compare March 31, 2022 07:48

merge @rosalgia/routing + cleanup

8e54d89

Gau-thier force-pushed the @rosalogia/integrate-failure-detector branch from 97640ab to 8e54d89 Compare March 31, 2022 07:51

Gau-thier requested changes Mar 31, 2022

rosalogia commented Mar 31, 2022

This was referenced Mar 31, 2022

Rework client.ml #22

Closed

Fix server intercepting responses to client requests #8

Merged

rosalogia and others added 3 commits March 31, 2022 20:29

Changed peers_to_ping to helpers_size

0e46f73

Syntax fix in Client.init

114bc1d

last review comments

e729f76

Gau-thier approved these changes Apr 4, 2022

Gau-thier merged commit 6a92251 into @rosalogia/routing Apr 4, 2022

rosalogia pushed a commit that referenced this pull request Apr 4, 2022

failure_detector into Client (#23)

e358251

* feat(failureDetector): Integrate failure detector into client (#21)

This was referenced Apr 4, 2022

Failure detector does not change the state of the client's peer list #24

Closed

Failure detector is stuffed into client module #26

Closed

Test the failure detector #27

Open

rosalogia deleted the @rosalogia/integrate-failure-detector branch April 19, 2022 18:16
