<a href="https://colab.research.google.com/github/duplys/cca/blob/main/Component_Coupling_Analysis.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Stable Dependencies Principle
The Stable Dependencies Principle (SDP) states that components must _depend in the direction of stability_. SDP ensures that modules that are intended to be easy to change are not depended on by modules that are harder to change.

Some volatility is necessary for the maintenance of a code base. Thus, some software components are _expected_ to change and are therefore _designed_ to be volatile. Components expected to be volatile must not be depended on by components that are difficult to change (otherwise, the volatile component will be de facto difficult to change).

One sure way (besides size, complexity, clarity, etc.) to make a software component difficult to change is to make lots of other software components depend on it. A component with lots of incoming dependencies is very stable because it requires a great deal of work to reconcile any changes with all the dependent components.

### Stability Metrics
Counting dependencies that enter and leave a component allows us to calculate the _positional_ stability of that component.

* _Fan-in_: Incoming dependencies, i.e., number of classes outside the component that depend on classes within the component.
* _Fan-out_: Outgoing dependencies, i.e., number of classes inside a component that depend on classes outside the component.
* _I_: Instability. $I = \textrm{Fan-out} \div (\textrm{Fan-in} + \textrm{Fan-out})$. This metric has the range $[0,1]$. $I = 0$ indicates a maximally stable component, $I=1$ indicates a maximally unstable component.
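
The counting step described above can be sketched as follows. This is a minimal illustration, not part of the original analysis: the class names, the `class_to_component` mapping, the `class_dependencies` edge list, and the helper `count_fan_in_out` are all hypothetical. It counts distinct classes, following the Fan-in and Fan-out definitions above.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical input: the component that owns each class.
class_to_component = {
    'A1': 'componentA', 'A2': 'componentA',
    'B1': 'componentB',
    'C1': 'componentC', 'C2': 'componentC',
}

# Hypothetical class-level dependencies: (source, target) means source depends on target.
class_dependencies = [
    ('A1', 'B1'), ('A2', 'B1'),
    ('C1', 'A1'), ('C2', 'A1'), ('C1', 'B1'),
    ('A1', 'A2'),  # internal to componentA, so it is ignored
]


def count_fan_in_out(owner: Dict[str, str],
                     edges: Iterable[Tuple[str, str]]) -> Dict[str, Dict[str, int]]:
    incoming = defaultdict(set)  # component -> outside classes that depend on it
    outgoing = defaultdict(set)  # component -> inside classes that depend on the outside

    for src, dst in edges:
        src_comp, dst_comp = owner[src], owner[dst]
        if src_comp == dst_comp:
            continue  # dependencies inside one component do not count
        incoming[dst_comp].add(src)
        outgoing[src_comp].add(src)

    return {comp: {'fan_in': len(incoming[comp]), 'fan_out': len(outgoing[comp])}
            for comp in sorted(set(owner.values()))}


counts = count_fan_in_out(class_to_component, class_dependencies)
print(counts)
```

The resulting dictionary has the same shape as the hand-written `dependencies` dictionary used in the next cell, so it could be passed to the same instability computation.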

In [None]:
from typing import Dict
import logging


# create logger
logger = logging.getLogger('simple')
logger.setLevel(logging.DEBUG)

# create console handler and set level to debug
ch = logging.StreamHandler()
ch.setLevel(logging.DEBUG)

# add ch to logger
logger.addHandler(ch)


dependencies = {'componentA': {'fan_in': 4, 'fan_out': 3},
                'componentB': {'fan_in': 6, 'fan_out': 0},
                'componentC': {'fan_in': 0, 'fan_out': 10}}



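def compute_instability(dependencies) -> Dict:
  """Return I = fan_out / (fan_in + fan_out) per component, rounded to 2 places."""
  instability = {}

  for c in dependencies:
    fan_in = dependencies[c]['fan_in']
    fan_out = dependencies[c]['fan_out']
    total = fan_in + fan_out
    # I is undefined for a component with no dependencies at all;
    # treating it as 0.0 here is an assumption, not part of the metric.
    i = fan_out / total if total else 0.0
    # Uncomment to trace the intermediate values:
    #logger.debug("fan_in {}, fan_out {}, i {}".format(fan_in, fan_out, i))
    instability[c] = round(i,2)

  return instability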

inst = compute_instability(dependencies)

print("=== Instability ===")
print(inst)


=== Instability ===
{'componentA': 0.43, 'componentB': 0.0, 'componentC': 1.0}
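
Once $I$ is known for each component, the Stable Dependencies Principle can be checked mechanically: a component should only depend on components whose $I$ is lower than or equal to its own. The sketch below illustrates this under stated assumptions. The component names, their $I$ values, the edge list, and the helper `find_sdp_violations` are hypothetical and are not taken from the data above.

```python
from typing import Dict, List, Tuple

# Hypothetical instability values (made up for illustration).
instability_by_component = {'ui': 0.9, 'services': 0.5, 'core': 0.1}

# Hypothetical component-level dependencies: (depender, dependee).
component_edges = [
    ('ui', 'services'),
    ('services', 'core'),
    ('core', 'ui'),  # a stable component depending on a volatile one
]


def find_sdp_violations(instability: Dict[str, float],
                        edges: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return the edges whose source is more stable (lower I) than its target."""
    return [(src, dst) for src, dst in edges
            if instability[src] < instability[dst]]


violations = find_sdp_violations(instability_by_component, component_edges)
print(violations)  # [('core', 'ui')]
```

Edges between components with equal $I$ are not reported here. Whether to flag them is a design choice.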


## Stable Abstractions Principle
The Stable Abstractions Principle (SAP) states that a _component should be as abstract as it is stable_, i.e., it sets up a relationship between stability and abstractness. It says that a stable component should also be abstract so that its stability does not prevent it from being extended. On the other hand, it says that an unstable component should be concrete since its instability allows the concrete code within it to be easily changed.

Thus, if a component is _stable_, it should consist of interfaces and abstract classes so that it can be extended. Stable components that are extensible are flexible and do not overly constrain the architecture.

### Measuring Abstraction
The measure of abstractness is simply the ratio of interfaces and abstract classes in a component to the total number of classes in the component.
* $Nc$: The number of classes in the component
* $Na$: The number of abstract classes and interfaces in the component
* $A$: Abstractness. $A = Na \div Nc$

Metric $A$ ranges from 0 to 1. A value of 0 implies that the component has no abstract classes or interfaces at all. A value of 1 implies that the component contains nothing but abstract classes and interfaces.

In [None]:
components = {'componentA': {'abcif': 3, 'num_classes': 4},
              'componentB': {'abcif': 6, 'num_classes': 6},
              'componentC': {'abcif': 0, 'num_classes': 10}}

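def compute_abstractness(components) -> Dict:
  """Return A = Na / Nc per component, rounded to 2 places.

  'abcif' holds Na (abstract classes and interfaces), 'num_classes' holds Nc.
  """
  abstractness = {}

  for c in components:
    ac = components[c]['abcif']
    tc = components[c]['num_classes']
    # A is undefined for an empty component; using 0.0 here is an assumption.
    a = ac / tc if tc else 0.0
    abstractness[c] = round(a,2)

  return abstractness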

abstr = compute_abstractness(components)

print("=== Abstractness ===")
print(abstr)

=== Abstractness ===
{'componentA': 0.75, 'componentB': 1.0, 'componentC': 0.0}
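
The counts above are entered by hand. For Python code, one possible way to obtain $Na$ and $Nc$ is the standard library function `inspect.isabstract`, as sketched below. The classes `Shape`, `Circle`, and `Square` and the helper `abstractness_of` are hypothetical examples. Treating "abstract" as "has at least one unimplemented abstract method" is an assumption of this sketch.

```python
import inspect
from abc import ABC, abstractmethod
from typing import Iterable


class Shape(ABC):
    @abstractmethod
    def area(self) -> float:
        ...


class Circle(Shape):
    def __init__(self, radius: float) -> None:
        self.radius = radius

    def area(self) -> float:
        return 3.14159 * self.radius ** 2


class Square(Shape):
    def __init__(self, side: float) -> None:
        self.side = side

    def area(self) -> float:
        return self.side ** 2


def abstractness_of(classes: Iterable[type]) -> float:
    """Return Na / Nc for the given classes (0.0 for an empty component)."""
    classes = list(classes)
    if not classes:
        return 0.0
    # inspect.isabstract is True only for classes with unimplemented abstract methods
    n_abstract = sum(1 for cls in classes if inspect.isabstract(cls))
    return n_abstract / len(classes)


a_value = abstractness_of([Shape, Circle, Square])
print(round(a_value, 2))  # 0.33
```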


## The Main Sequence
To define the desired relationship between instability ($I$) and abstractness ($A$), we create a graph with $A$ on the vertical axis and $I$ on the horizontal axis. Writing points as $(I, A)$, components that are maximally stable and abstract are at the upper left at (0,1) and components that are maximally unstable and concrete are at the lower right at (1,0).

Most components should be kept on the Main Sequence, a line that connects (0,1) and (1,0), because components that sit on the Main Sequence are not "too abstract" for their stability, nor "too stable" for their abstractness.

The lower left area near (0,0), the _zone of pain_, is not desirable because components there are highly stable and concrete. Components in this zone are rigid: they cannot be extended because they are not abstract, and they are very difficult to change because of their stability.

The upper right area near (1,1), the _zone of uselessness_, is also undesirable because components there are maximally abstract, yet have no dependents. Such components are useless because they consist of abstract classes that no one ever implemented.

As a result, components have the best characteristics if they are on, or _close to_, the Main Sequence.
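
The zones described above can be sketched as a small classifier. It assumes the Main Sequence is the straight line from (0,1) to (1,0), i.e. $A + I = 1$. The function name `classify_zone` and the `tolerance` of 0.3 are hypothetical choices made for illustration and are not prescribed by the metric.

```python
def classify_zone(instability: float, abstractness: float,
                  tolerance: float = 0.3) -> str:
    """Classify a component by its position relative to the line A + I = 1."""
    offset = abstractness + instability - 1
    if abs(offset) <= tolerance:
        return 'near main sequence'
    if offset < 0:
        # below the line, toward (0,0): stable and concrete
        return 'zone of pain'
    # above the line, toward (1,1): unstable and abstract
    return 'zone of uselessness'


print(classify_zone(0.0, 0.0))    # zone of pain
print(classify_zone(1.0, 1.0))    # zone of uselessness
print(classify_zone(0.43, 0.75))  # near main sequence
```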

## Distance from the Main Sequence
This metric measures how far away a component is from the Main Sequence.

* $D$: Distance. $D = | A + I - 1|$. The range of this metric is $[0,1]$. A value of 0 indicates that the component sits directly on the Main Sequence. A value of 1 indicates that the component is as far away from the Main Sequence as possible.
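
As a worked example, plugging the $I$ and $A$ values computed earlier in this notebook into the formula gives:

$$D_{\textrm{componentA}} = |0.75 + 0.43 - 1| = 0.18$$

$$D_{\textrm{componentB}} = |1.0 + 0.0 - 1| = 0$$

$$D_{\textrm{componentC}} = |0.0 + 1.0 - 1| = 0$$

For contrast, a hypothetical component (not part of the data above) with $A = 0.1$ and $I = 0.2$ would give $D = |0.1 + 0.2 - 1| = 0.7$, which places it deep in the zone of pain.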

Given these metrics, a code base can be analyzed for its overall conformance to the Main Sequence. Any component with a $D$ value that is not near zero should be reexamined and refactored.

Statistical analysis of a design is also possible. We can calculate the mean and variance of all the $D$ metrics for the components within a single code base. We would expect a conforming design to have a mean and variance that are close to zero. The variance can be used to establish "control limits" so as to identify components that are "exceptional" in comparison to all the others. So the idea is to produce a scatterplot for how the individual components are distributed around the Main Sequence.
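
The statistical check described above might look like the following sketch, which uses only the standard library `statistics` module. The $D$ values, the helper `find_outliers`, and the `control_factor` of 1.5 standard deviations are assumptions made for illustration. A real analysis would choose its own control limits.

```python
import statistics
from typing import Dict, List, Tuple

# Hypothetical D values for five components (made up for illustration).
distance_by_component = {
    'componentA': 0.18,
    'componentB': 0.0,
    'componentC': 0.0,
    'componentD': 0.05,
    'componentE': 0.9,
}


def find_outliers(distances: Dict[str, float],
                  control_factor: float = 1.5) -> Tuple[float, float, List[str]]:
    """Return the mean, the population variance, and the components above the limit."""
    values = list(distances.values())
    mean_d = statistics.mean(values)
    variance_d = statistics.pvariance(values)
    # upper control limit: mean plus control_factor standard deviations
    upper_limit = mean_d + control_factor * variance_d ** 0.5
    outliers = [name for name, d in distances.items() if d > upper_limit]
    return mean_d, variance_d, outliers


mean_d, variance_d, outliers = find_outliers(distance_by_component)
print("mean {:.3f}, variance {:.3f}, outliers {}".format(mean_d, variance_d, outliers))
```

With these made-up values only `componentE` lies above the control limit and would be the first candidate for review.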

Another way to use metric $D$ is to plot it for each component over time. This makes it possible to see, e.g., that strange dependencies creep in over time. It also makes it possible to introduce a control threshold.
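
Tracking $D$ over time against a threshold could be sketched like this. The per-release history, the `THRESHOLD` value of 0.3, and the helper `first_breach` are hypothetical and serve only to illustrate the idea.

```python
from typing import Dict, List, Optional

# Hypothetical D values per component, one entry per release (oldest first).
d_history = {
    'componentA': [0.10, 0.12, 0.18, 0.35],
    'componentB': [0.00, 0.00, 0.05, 0.05],
}

THRESHOLD = 0.3  # assumed control threshold


def first_breach(series: List[float], threshold: float) -> Optional[int]:
    """Return the index of the first release where D exceeds the threshold, or None."""
    for index, value in enumerate(series):
        if value > threshold:
            return index
    return None


breaches: Dict[str, Optional[int]] = {
    name: first_breach(series, THRESHOLD) for name, series in d_history.items()
}
print(breaches)  # {'componentA': 3, 'componentB': None}
```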



In [None]:
#print(inst)
#print(abstr)

def compute_distance(instability, abstractness) -> Dict:
  """Return D = |A + I - 1| per component, rounded to 2 places."""
  distance = {}

  for c in instability:
    # abs() keeps D in [0, 1] for components below the Main Sequence too
    d = abs(abstractness[c] + instability[c] - 1)
    distance[c] = round(d,2)

  return distance

dist = compute_distance(inst, abstr)

print("=== Distance ===")
print(dist)



=== Distance ===
{'componentA': 0.18, 'componentB': 0.0, 'componentC': 0.0}
