Replies: 18 comments 40 replies
-
|
Perhaps, we should start thinking about the framework's modeling approach from the most abstract level: the distributed system as a whole. At that level of abstraction, the distributed system can be represented as a single entity. Obviously, it doesn't make sense to consider it in total isolation, for it must exist within and somehow interact with the rest of the world. Here it already starts getting interesting and we can ponder on the following questions:
|
Beta Was this translation helpful? Give feedback.
-
What could be considered a part of the external world and what could belong to the internals of the distributed system?Internals of the Distributed System
External World
|
Beta Was this translation helpful? Give feedback.
-
|
I suggest to think of whole distributed systems as something purposeful, sufficient and complete. In a functional model, a distributed system must be a part of a greater system, playing a certain role within the encompassing system, performing a certain function therein, i.e. have a purpose. From this perspective, at the most abstract level, everything that is used to fulfill the purpose of the distributed system, and nothing more, can be considered internal to it, making it sufficient and complete. |
Beta Was this translation helpful? Give feedback.
-
|
Time is closely related to the notion of change. Intuitively, we can start by considering time as a dimension of change, a continuum for representing the dynamics of the system. Then a dynamically changing system can be represented with a function of time, where the image of the system at each point in time corresponds to the state of the system at that moment. Assuming there are countably many system states, the time-function effectively defines a sequence of system states or, equally, a sequence of state transitions. Time makes each state transition distinct from any other state transition in the time-sequence, even if the corresponding changes in the system state are equivalent. So we can think of time as ordering of changes. We can take a part of the system, a subsystem, and obtain that subsystem's time-function by projecting the result of the original time-function onto the corresponding partial system state. Partitioning the original time-sequence into contiguous intervals according to the partial state projection, we can relate the subsystem's state transitions to the state transitions of the original system corresponding to the transitions between those intervals. This will preserve the ordering according to the precedence relation. For any pair of state transitions in the same time-sequence we can tell which one precedes the other in time, thus defining a precedence relation. If we take two arbitrary subsystems and try to relate their state transitions in terms of precedence according to the encompassing system, there might be some state transitions that correspond to the same state transition of the encompassing system. We can say that such state transitions are simultaneous. The precedence relation should be undefined for simultaneous state transitions and thus become a partial relation. Time is also closely related to the notion of causality. We can consider state transitions as events, where one event may causally depend on another event. An event casually depends on another event if a modification of the latter can affect the former. Casual dependency must be an asymmetric and transitive relation. There might be events such that are neither casually depends on the other, so the casual dependency relation can be partial, as well. An event can only casually depend on an event that precedes it in time. Time must ensure casual consistency by ordering events such that this rule is obeyed. So we can think of time as causally consistent partial ordering of changes. |
Beta Was this translation helpful? Give feedback.
-
|
Time and non-determinism are closely related. If a system is fully determined by the initial assumptions then it's absolutely static, and time is irrelevant to it. Distributed systems are not like that. Non-determinism primarily manifests itself as uncertainty of changes happening in the system over time. On the other hand, deterministic intermediate transformations within the system should be irrelevant in terms of time. Building upon previous considerations, we can think of time as causally consistent ordering of non-deterministic changes. The ultimate source of non-determinism must be outside of the system because a system is determined by everything that is a part of it. Thus we can think of non-determinism as external dependencies of the system. If we take a certain state transition of a system then everything it depends on must also be certain. From that perspective, the past is fully deterministic; thus non-determinism belongs to the future. We can say that a dynamic system consumes (resolves?) non-determinism from outside as it progresses in time. |
Beta Was this translation helpful? Give feedback.
-
|
Building upon previous considerations about time and non-determinism, we can think of system state as consistent image of the system or information that is necessary and sufficient for preserving the system's causal consistency as it progresses in time. In other words, the system state represents the past of the system, the result of causally consistent partial ordering of changes that possible future changes may causally depend on. The system state does not, and often cannot, represent the history of the system in full detail; instead, it contains just enough information about the past to preserve the consistency of the system in future, thus connecting the past and the future. Since the system state may lose unnecessary information about the past, and it must do this if the lifetime of the system is indefinite, it may be uncertain about the exact prior history of the system, as if partially retaining the non-determinism absorbed from outside in the past. |
Beta Was this translation helpful? Give feedback.
-
|
Interaction is a dynamic process, something that happens over time, and thus it inherently involves non-determinism (see previous considerations about time and non-determinism). We can think of interaction as causal dependencies between the interacting systems. In unidirectional interaction only one system depends on the other; in bidirectional interaction both systems depend on each other. A subsystem interacts with the encompassing system by interacting with other subsystems within that encompassing system. The system that acts in interaction as an external dependency of the other represents a source of non-determinism for the latter. If we consider the interacting systems as one then we shall see that the non-determinism of their interaction must ultimately originate from the outside of both systems. Thus we can also think of interaction as redistribution of non-determinism from outside. |
Beta Was this translation helpful? Give feedback.
-
|
A correct system must fulfill its purpose given the assumptions hold. However, when the assumptions are violated, the system may fail to fulfill its purpose. In other words, a system as a whole may fail if the amount of uncertainty (non-determinism) imposed on the system exceeds what it was supposed to handle. In that case, the system will leak the excessive non-determinism imposed on it. |
Beta Was this translation helpful? Give feedback.
-
|
In computer science, there are metaphoric concepts of "angelic" and "demonic" non-determinism. Angelic non-determinism aims to favor desired results, whereas demonic non-determinism allows for unpredictable and potentially unfavorable outcomes. We might also want to distinguish those types of non-determinism in our system model. |
Beta Was this translation helpful? Give feedback.
-
|
As noted earlier, we can think of interaction between (sub)systems as establishing new causal relationships between events (changes) in those systems or, more precisely, occurrence of events in one system that causally depend on some events from the other system. We can also think of such interaction as communicating information about causal dependencies between past events. The information thus communicated can be of different fidelity, i.e. preserving different amount of details about transitive causal dependencies between past events. The amount of detail communicated determines what causal dependencies the new event can establish with the past events. The amount of information that can be communicated in interaction is limited; therefore, new causal dependencies may only refer directly to a limited part of the actual causal dependency graph. At the same time, each act of interaction effectively resolves some amount of non-determinism imposed on the recipient system from the outside by determining which from all possible communication events occurs. On the other hand, since only limited amount of information can be communicated, some uncertainty, i.e. non-determinism, may be re-introduced in communication from the perspective of the system where it originates from. |
Beta Was this translation helpful? Give feedback.
-
|
We cannot allow absolutely arbitrary interaction, i.e. establishing new casual relationships, between (sub)systems, or we end up with chaos instead of a purposeful system. Therefore, there must be something that determines which interactions are possible and which are not. Absolutely isolated, unrelated systems should not be able to interact; in order for interaction to happen between systems, they must already have some relationship. Thus interactions are enabled by already established relationships. We can think of those prior relationships enabling interactions as links or connections between systems. Such connections can be established dynamically as a result of prior interaction, but ultimately, there must be some set of relationships determined by the way the system was initially prepared. Those initial relationships effectively characterize, define the system and its possible behavior. |
Beta Was this translation helpful? Give feedback.
-
|
Imagine a (possibly infinite) closed, isolated universe. Nothing can be added to or removed from the universe, nothing can interfere with it. We can think of the universe as absolutely static and having total knowledge of everything. Now consider a part of the universe as an open system existing within that universe. The boundary of the system separates it from its environment, which is the rest of the universe. The system as a whole has complete knowledge of itself but absolute uncertainty about its environment, and the other way around. The system may acquire some new information and so expand its boundary; it may also discard some information that was part of it and so contract its boundary. Because nothing can be added to or removed from the universe, the total information must be conserved. So whatever the system learns or forgets must come from and return back to its environment. Thus we can think of evolution of an open system as a process of acquiring information from and returning information back to its environment. Note that the whole process is in principle reversible. Also note that the process of just adding or discarding information alone is monotonic and enjoys the confluence property. The state of the system can be defined as the information the system was initially prepared with plus the information acquired minus the information discarded by the system to the point. Real systems have physical limitations and therefore cannot hold unbounded amounts of information. On the other hand, the system should not discard any information that is still required to fulfill its purpose. So there must be some constraints specifying what information the system must acquire before it is allowed to discard some specific information and what information it must discard before it is allowed to acquire some new information. In simple words: the system may be required to forget in order to learn, and learn in order to forget. Some information that the system acquires from its environment may come from other systems of interest (e.g. input), whereas some information may come from ambient sources (e.g. timing, randomness). Some information that the system holds may be acquired by other systems of interest (e.g. output), whereas some information may be simply dissipated into the ambient environment. What if we define distributed and concurrent systems and their components with acquiring and discarding of information made explicit, so that system's evolution is reversible? How much can we benefit from the associated monotonicity and confluence properties? |
Beta Was this translation helpful? Give feedback.
-
|
Let's try to make the idea described in the previous comment a bit more concrete. We can model concurrent systems made out of components that are deterministic machines, each with a number of data inputs and the same number of outputs. The communication of a component with the environment happens only by passing plain data items through those inputs/outputs. Each input has a complementary output of the same type, so the data items entering the component through the input and exiting through the complementary output have the same size in binary representation. Inputs and outputs are in general concurrent, i.e. data items can enter and exit the component through individual inputs/outputs at any time, unless there is an internal data dependency required by the component's logic. The output data must only depend on the input data and nothing else. In particular, it does not matter in which order data items are supplied/returned through different inputs/outputs, only the content matters, although the order of data items supplied/returned through the same input/output matters. As data items pass through the component, they would exchange information according to the component's logic, thus creating causal dependencies. Note that the type and the size of data items remains unchanged. Essentially, it's the same piece of memory that enters the component through the input and exits it through the complementary output, but the content can change and/or affect the other inputs/outputs. Causal dependencies may require several inputs to become available and remain within the component until the information exchange between data items can happen, before the can exit the component. We may also require that the whole process is reversible, i.e. that we could supply the output data back to the component in the reverse direction in order to reconstruct the input. An example of such component, a generalized controlled-swap gate, was described in this comment. Multiple components can be combined together by connecting intputs to outputs of the same data type. Each input/output can be connected at most once. The inputs/outputs that remain unconnected constitute the interface of the combined system. As suggested by @pompon0 in a private discussion, one can think of this model as a collection of deterministic multi-tape multi-head Turing machines with the restriction that tapes are allowed to move only in one direction. |
Beta Was this translation helpful? Give feedback.
-
|
Now that we have some ideas at the most abstract level, we should consider modeling of individual nodes within a distributed system and their interaction. How should we think about distributed nodes and means of communication between them? |
Beta Was this translation helpful? Give feedback.
-
|
Physically, a distributed system is a collection of largely independent computing machines, called nodes, that only interact by communicating data over some kind of network. Generally, communication over the network is not reliable, i.e. data packets may be dropped, delivered multiple times or out of order. Therefore, there are usually some higher-level mechanisms built on top of the unreliable low-level network in order to ensure more reliable communication, e.g. ordered and reliable (under certain assumptions) delivery of data without duplication. Communication between nodes generally happens concurrently and asynchronously; therefore, nodes and the communication system connecting them have to deal with concurrency. So we can think of individual nodes as concurrent systems interacting with a common communication system, which is also concurrent. On the other hand, logically, one may want to distinguish pieces of functionality within a distributed system that span multiple physical nodes. In fact, the communication system is an example of this. So we can conceptually think of the overall distributed system as a collection of interacting concurrent systems possibly spanning multiple physical nodes. Different nodes may play different roles depending on their function withing the distributed (sub-)system, e.g. client, leader, follower, etc. A single node may even play multiple roles at once or dynamically change its role (de-/activate different roles). |
Beta Was this translation helpful? Give feedback.
-
|
We can adopt a uniform approach to modeling of distributed and local concurrent interaction, in which the difference between local (intra-node) and remote (inter-node, distributed) interaction is not treated as fundamental but emergent from the properties of communication components. |
Beta Was this translation helpful? Give feedback.
-
|
Dynamism and adaptivity can be expressed in a principally static description by capturing regular patterns as potential structure (i.e. any shape that the system might take), whereas the actual structure representing emergent dynamics is determined through activation (e.g. data-dependent) of a bounded subset of the potential structure under certain conditions. |
Beta Was this translation helpful? Give feedback.
-
|
We can think of coordination as correlation (entanglement) and fault tolerance as coarse-graining with error correction, both manifesting in redundant symmetry, whereby failures can be thought of as symmetry-breaking noise. |
Beta Was this translation helpful? Give feedback.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
-
This is a discussion for gathering initial ideas concerning the framework's modeling approach as described in #47.
Beta Was this translation helpful? Give feedback.
All reactions