"distributed system introspection" - thoughts #29

Open
stolsvik opened this issue Apr 5, 2020 · 0 comments
Open

"distributed system introspection" - thoughts #29

stolsvik opened this issue Apr 5, 2020 · 0 comments
Labels
thoughts Issues describing some thoughts around a subject

Comments

@stolsvik
Copy link
Contributor

stolsvik commented Apr 5, 2020

It would be nice to have a central place to ask questions like:

  • What are all the apps in this Mats fabric?
  • What are all the nodes in this Mats fabric?
  • What are all the queues/topics in this Mats fabric?
  • .. with stages
  • .. concurrency per stage - active threads
  • What is the usage? Requests over the last 1, 2, 5, 10, 30 minutes and 1, 2, 4, 8, 24 hours?
  • .. implement with a 1440-element array, where each minute of a 24 hour window gets its own slot. Both read and write requests first resolve which slot the current timestamp falls in, and compare against the last-accessed timestamp: if it is a while since last access, the intermediate (stale) slots are zeroed out before the current slot is read or updated. (A sketch of this follows below.)
  • "% duty cycle" for each thread.

However, centralization isn't exactly Mats' strong side.. ;-)
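
A minimal sketch of that 1440-slot counter as a plain Java class - the class and method names here are illustrative only, nothing of this exists in Mats:

```java
public class MinuteRequestCounter {
    private static final int SLOTS = 24 * 60; // 1440 one-minute slots covering 24 hours

    private final long[] _counts = new long[SLOTS];
    private long _lastAccessedMinute; // epoch minute of the last read or write

    /** Called for each processed request. */
    public synchronized void registerRequest() {
        long nowMinute = System.currentTimeMillis() / 60_000;
        zeroStaleSlots(nowMinute);
        _counts[(int) (nowMinute % SLOTS)]++;
    }

    /** Requests over the last 'minutes' minutes, e.g. 1, 2, 5, 10, 30, 60 ... 1440. */
    public synchronized long requestsLastMinutes(int minutes) {
        long nowMinute = System.currentTimeMillis() / 60_000;
        zeroStaleSlots(nowMinute);
        long sum = 0;
        for (int i = 0; i < Math.min(minutes, SLOTS); i++) {
            sum += _counts[(int) ((nowMinute - i) % SLOTS)];
        }
        return sum;
    }

    /** The "evaluate last timestamp vs. this timestamp" logic: zero out slots that have gone stale. */
    private void zeroStaleSlots(long nowMinute) {
        long idleMinutes = nowMinute - _lastAccessedMinute;
        if (idleMinutes > 0) {
            // If idle for 24h or more, every slot is stale; otherwise only the skipped-over ones.
            long slotsToClear = Math.min(idleMinutes, SLOTS);
            for (long m = _lastAccessedMinute + 1; m <= _lastAccessedMinute + slotsToClear; m++) {
                _counts[(int) (m % SLOTS)] = 0;
            }
        }
        _lastAccessedMinute = nowMinute;
    }
}
```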

Implement a registry of nodes (a sketch of the receiving side follows the list):

  • Each time a node "wakes up", it broadcasts a "born" message.
  • .. if queues are subscribed to (new endpoints/terminators/subscriptionTerminators are created) after the initial "wakeup", a new "born" (or "still here!") message is broadcast.
  • Each time a node "goes down", it broadcasts a "dead" message.
  • Every 15 minutes +/- 5 minutes, all live nodes broadcast a "still here!" message.
  • The "born" and "still here!" messages contain the node's app name, version and all queues/topics it listens to.
  • For every "born" message, every other node sends a unicast "still here!"-style message to that node.
  • If any node records an inconsistency wrt. which app listens to a particular queue (not topic), it "taints" this queue - and goes into status WARN.
  • If it realizes that a Mats stage tries to send a message to a tainted queue, it will DLQ its incoming message. If this is an initiation, it will throw an exception.
  • If it is the owner of a tainted queue, it will start issuing "still here!" broadcasts every minute. (The hope is that the owners of the system are in a transition period, where the other app will soon "release" its claim of the queue. As both services would realize that they own the tainted queue, they should both go into per-minute broadcasting, so that when the situation is resolved, it only takes a minute before all services again have a consistent view of queue ownership.)
  • For each received "born" and "still here!", the receiver should evaluate the timestamp in the message against its own clock. If it is off by more than a few seconds, the node goes into status WARN.
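
A minimal sketch of the receiving side of such a registry, assuming an illustrative NodeInfo payload for the "born"/"still here!"/"dead" broadcasts. None of these classes or names exist in Mats, and the transport (broadcast topic, unicast replies, DLQ-ing) is left out:

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class NodeRegistry {

    /** Payload of the "born" / "still here!" / "dead" broadcasts. */
    public record NodeInfo(String appName, String appVersion, String nodeName,
                           long timestampMillis, List<String> queues, List<String> topics) {}

    private static final long MAX_CLOCK_SKEW_MILLIS = 5_000;

    private final Map<String, String> _queueToApp = new ConcurrentHashMap<>();
    private final Set<String> _taintedQueues = ConcurrentHashMap.newKeySet();
    private volatile boolean _warn;

    /** Invoked for every received "born" or "still here!" broadcast. */
    public void onNodeAnnouncement(NodeInfo info) {
        // Clock-skew check: a timestamp more than a few seconds off means status WARN.
        if (Math.abs(System.currentTimeMillis() - info.timestampMillis()) > MAX_CLOCK_SKEW_MILLIS) {
            _warn = true;
        }
        // Queue-ownership consistency: two different apps claiming the same queue taints it.
        for (String queue : info.queues()) {
            String existingApp = _queueToApp.putIfAbsent(queue, info.appName());
            if (existingApp != null && !existingApp.equals(info.appName())) {
                _taintedQueues.add(queue);
                _warn = true;
            }
        }
    }

    /** A stage or initiation would consult this before sending: tainted means DLQ / throw. */
    public boolean isTainted(String queue) {
        return _taintedQueues.contains(queue);
    }

    public boolean isWarn() {
        return _warn;
    }
}
```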

Implement statistics gathering (a sketch of the slot/ping mechanics follows the list):

  • The timeline is divided into 30-second "stats slots".
  • Each node targets a random point within the first 5 seconds of this slot - the "ping slot".
  • At that point it evaluates whether it has already gotten a ping from any other node.
  • If not, it sends a broadcast ping.
  • .. In this ping, it records its current timestamp (which implicitly also defines which "ping slot" it relates to) and a random number.
  • .. also here, the received ping's timestamp is evaluated against the node's own timestamp. They should not be too far off, otherwise WARN.
  • 10 seconds into the "stats slot" (that is, 5 seconds after the "ping slot"), "on the hour", all nodes independently evaluate all the pings they have gotten in this slot - there should ideally be just one. If there is more than one, the one with the lowest random number wins.
  • All nodes send a unicast "pong" to the node with the winning ping - the pong contains all the statistics.
  • All nodes accept pongs for 10 seconds.
  • .. and if they got any pongs, they send a broadcast "collated results" package
  • .. Note: all these pongs should now go to the same node.
  • .. but if they do not all end up at the same node (because some node chose a different winner - which could happen due to pressure on the MQ, or more likely, time skews), each node will get several "collated results" packages. This is also a WARN.
  • The nodes use the last 10 seconds to collate the collations - and when this time is up, each node "internally publishes" the final statistics result.
  • Each node evaluates whether this result contains stats from all the nodes it knows about from the registry.
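
A minimal sketch of the slot timing and winner selection, with illustrative names (nothing here exists in Mats); the actual broadcasting, pong collection and collation is left out:

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.ThreadLocalRandom;

public class StatsSlots {
    static final long SLOT_MILLIS = 30_000;       // the "stats slot"
    static final long PING_WINDOW_MILLIS = 5_000; // the "ping slot" at the start of it

    /** A broadcast ping: its timestamp implicitly identifies the slot it belongs to. */
    public record Ping(String nodeName, long timestampMillis, long randomNumber) {}

    /** Which stats slot a timestamp falls into. */
    public static long slotOf(long timestampMillis) {
        return timestampMillis / SLOT_MILLIS;
    }

    /** Random delay from the start of a slot into its 5 second ping window. */
    public static long randomPingOffsetMillis() {
        return ThreadLocalRandom.current().nextLong(PING_WINDOW_MILLIS);
    }

    /** New ping for this node, carrying the random number used for winner selection. */
    public static Ping newPing(String nodeName) {
        return new Ping(nodeName, System.currentTimeMillis(),
                ThreadLocalRandom.current().nextLong(Long.MAX_VALUE));
    }

    /** Of all pings received for a slot, the one with the lowest random number wins. */
    public static Optional<Ping> winningPing(List<Ping> pingsInSlot) {
        return pingsInSlot.stream().min(Comparator.comparingLong(Ping::randomNumber));
    }
}
```

The random number is what lets all nodes pick the same winner independently, as long as they saw the same set of pings for the slot - which is exactly what the time-skew caveat below is about.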

It would be easy to hijack this statistics gathering: Just always be the one with the earliest ping. Is this a problem?

This algorithm is severely hampered by time skews.

stolsvik transferred this issue from another repository on Sep 27, 2021
stolsvik added the "thoughts" label on Sep 27, 2021