It would be nice to have a central place to ask questions like:
What are all the apps in this Mats fabric?
What are all the nodes in this Mats fabric?
What are all the queues/topics in this Mats fabric?
.. with stages
.. concurrency per stage - active threads
How is the usage? Requests over the last 1, 2, 5, 10, 30 minutes, and 1, 2, 4, 8, 24 hours?
.. implement with a 1440-element array, where each minute of the 24-hour day gets its own slot. Resolve which slot the current timestamp falls in, taking the last-accessed timestamp into account: if it is a while since the last access, the intermediate slots are zeroed out. Both read requests and write requests perform this "evaluate last timestamp vs. this timestamp" logic (sketched after this list).
"% duty cycle" for each thread.
However, centralization isn't exactly Mats' strong side.. ;-)
Implement a registry of nodes:
Each time a node "wakes up", it broadcast a "born" message
.. if queues are subscribed to (new endpoints/terminators/subscriptionTerminators are created) after initial "wakeup", a new "born" (or "still here!") message is broadcast.
Each time a node "goes down", it broadcasts a "dead" message
Every 15 minute +/- 5 minutes, all live nodes broadcast a "still here!" message.
The "born" and "still here!" messages contains its app name, version and all queues/topics it listens to.
For every "born" message, every other node sends a unicast "still here!"-style message to that node.
If any node records an inconsistency wrt. which app listens to a particular queue (not topic), then it "taints" this queue and goes into status WARN.
If it realizes that a Mats stage tries to send a message to a tainted queue, it will DLQ its incoming message. If this is an initiation, it will throw an exception.
If it is the owner of a tainted queue, it will start issuing "still here!" broadcasts every minute. (The hope is that the owners of the system are in a transition period, where the other app will soon "release" its claim on the queue. Since both services would realize that they are owners of the tainted queue, they should both go into every-minute broadcasting, so that when the situation is resolved, it only takes a minute before all services again have a consistent view of queue ownership.)
For each received "born" and "still here!", the receivers should evaluate the timestamp in the message towards its own time. If this is off by more than a few seconds, it would go into status WARN.
Implement statistics gathering:
The timeline is divided into 30-second "stats slots" (the slot arithmetic is sketched after this list).
Each node randomly targets a point within the first 5 seconds of this slot - the "ping slot".
At that point, it evaluates whether it has already gotten a ping from any other node.
If not, it sends a broadcast ping.
.. In this ping, it records its current timestamp (which implicitly also defines which "ping slot" it relates to) and a random number.
.. here too, the received ping's timestamp is evaluated against the node's own clock. They should not be too far off; otherwise, status WARN.
10 seconds into the "stats slot" (that is, 5 seconds after the "ping slot" ends), "on the hour", all nodes independently evaluate all the pings they have gotten in this slot - there should ideally be just one. If there is more than one, the one with the lowest random number wins.
All nodes send a unicast "pong" to the node behind the winning ping. The pong contains all the statistics.
All nodes accept pongs for 10 seconds.
.. and if they got any pongs, they send a broadcast "collated results" package
.. Note: all these pongs should now go to the same node.
.. but if they do not all go to the same node (because some node ended up choosing a different winner, which could happen due to pressure on the MQ or, more likely, time skew), each node will get several "collated results" packages. This is also a WARN.
The nodes use the last 10 seconds to collate the collations - and when this time is up, each node "internally publishes" the final statistics result.
It evaluates whether this result contains stats from all the nodes it knows about from the registry.
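A minimal sketch of the slot arithmetic and ping-winner selection, assuming slots are aligned to the epoch; all names are hypothetical:

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical helper for the 30-second "stats slot" protocol sketched above. */
public class StatsSlots {
    static final long SLOT_MILLIS = 30_000;       // one "stats slot"
    static final long PING_WINDOW_MILLIS = 5_000; // the "ping slot" at its start

    /** Start of the stats slot that 'nowMillis' falls into. */
    static long slotStart(long nowMillis) {
        return nowMillis - (nowMillis % SLOT_MILLIS);
    }

    /** A random instant within the "ping slot" of the next stats slot. */
    static long nextPingTarget(long nowMillis) {
        long nextSlotStart = slotStart(nowMillis) + SLOT_MILLIS;
        return nextSlotStart + ThreadLocalRandom.current().nextLong(PING_WINDOW_MILLIS);
    }

    /** A received ping: the sender's timestamp plus its random tiebreaker. */
    record Ping(String nodeName, long timestamp, long randomNumber) { }

    /**
     * Winner evaluation, 10 seconds into the slot: keep only pings whose
     * timestamp falls within this slot, then pick the lowest random number.
     */
    static Optional<Ping> winner(List<Ping> pings, long slotStartMillis) {
        return pings.stream()
                .filter(p -> slotStart(p.timestamp) == slotStartMillis)
                .min(Comparator.comparingLong(Ping::randomNumber));
    }
}
```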
It would be easy to hijack this statistics gathering: Just always be the one with the earliest ping. Is this a problem?
This algorithm is severely hampered by time skews.