# Introduction

In Chapter 1 we devised a signature for a graph and saw that it works for graphs up to order 3.

However, **Signature v1** proved too simplistic to handle the complexity of graphs beyond order 3.

In this chapter we are going to write an algorithm that aims at higher orders.



# Definitions

Now that we are warming up, let's add a few more definitions that we did not cover in the previous chapter.

| Term        | Definition   |
| :---------- | :----------- |
| **Edge** | An edge (or link) is a connection between two nodes in a graph. |
| **Connected Graph** | A **connected graph** is one where you can travel from any node to any other node by following the edges. There are no isolated "islands" of nodes. |
| **Disconnected Graph** | A **disconnected graph** is the opposite of a connected graph. It is made up of two or more separate 'components' that are not linked by any edges. |
| **Isolated Node** | An **isolated node** (or isolated vertex) is a node that has no edges connected to it. It has zero neighbours. A graph where all nodes are isolated is sometimes called a "null graph" or "empty graph". |
| **Complete Graph** | A **complete graph** is a graph where every single node is directly connected to every other single node. They are "all connected" to each other in the most direct way possible. |
| **Clique** | A **clique** in a graph is a group of nodes where every node in that group is directly connected to every other node in that same group. It's like a "mini complete graph" within a larger graph. |
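
To make these definitions concrete, here is a minimal sketch in Python. The adjacency-map representation, the sample graph, and the helper names (`is_isolated`, `is_clique`, `is_complete`, `is_connected`) are assumptions made for illustration; they are not part of the algorithm described in this chapter.

```python
from collections import deque

# Hypothetical example: a triangle a-b-c plus one isolated node d.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": set(),
}

def is_isolated(graph, node):
    """An isolated node has no neighbours."""
    return not graph[node]

def is_clique(graph, nodes):
    """Every node in `nodes` is adjacent to every other node in `nodes`."""
    nodes = set(nodes)
    return all(nodes - {n} <= graph[n] for n in nodes)

def is_complete(graph):
    """A complete graph is a clique made of all of its nodes."""
    return is_clique(graph, graph)

def is_connected(graph):
    """Breadth-first search from any node must reach every node (non-empty graph assumed)."""
    start = next(iter(graph))
    seen, queue = {start}, deque([start])
    while queue:
        for neighbour in graph[queue.popleft()]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return len(seen) == len(graph)
```

In this sample graph, `{a, b, c}` is a clique, `d` is an isolated node, and the graph as a whole is neither complete nor connected.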

# Signature v2

Our next step, **Signature v2**, builds directly on what we learned from **Signature v1**.

The goal of this new algorithm is to generate a deterministic and unique signature for each node within a graph. This signature, reflecting the node's topological role, is invaluable for tasks like graph isomorphism testing or identifying structurally equivalent nodes.

To achieve this, the process is iterative. It starts with a simple characterization of each node (its number of neighbours) and recursively refines the signatures of ambiguous nodes until all possible distinctions have been made, giving us a much richer picture of the local connections around each node.

## Signature v2 Main Algorithm

![fig1: Signature v2 main flow](images/ch3-main-diagram.png)


This image depicts the main flow of the "Signature v2" algorithm.

1. **Init**: The process starts with an initialization step that creates one signature per node.
2. **Sort Signatures**: The algorithm then sorts the node signatures.
3. **Decision Point 1**:
    * If all signatures are unique, the algorithm is complete.
    * If ambiguous signatures remain, the process moves to expansion.
4. **Expand**: All ambiguous signatures are expanded recursively.
5. **Decision Point 2** (after expansion):
    * If nothing could be expanded ("No expansion (Symmetry detected)"), the algorithm is complete.
    * If the expansion succeeded, the process moves to an intermediate state: "Pass complete, ambiguous nodes remain".
6. **Loop**: From that intermediate state, the algorithm goes back to **Sort Signatures** for the next pass.

Essentially, it's an iterative process of sorting and expanding signatures until all are unique or a stable symmetry is detected.

## The Node Signature
The algorithm operates on a ***Node Signature*** for each node in the graph and its neighbours.

This is the primary object representing a node during computation. A signature is considered `collapsed` initially. It becomes `expanded` when its ***neighbours*** array is populated. It is `finalized` once its ***finalIndex*** is assigned.

![fig2: Node Signature state diagram](images/ch3-node-signature-state-diagram.png)


### Signature Lifecycle Stages

**Initial State**: A signature begins in the Collapsed state.

**First Evaluation**: From the Collapsed state, there are two paths:

- If found to be unique, it transitions directly to the final Unique state.
- If it's ambiguous, it transitions to the Expanded state for further processing.

**Processing Loop**: Once in the Expanded state, the signature is re-evaluated in each subsequent "next pass" of the algorithm, remaining in the Expanded state if it's still ambiguous.

**Final States**: From the Expanded state, the lifecycle can end in two ways:

- If the signature is resolved and becomes unique, it moves to the Unique state.
- If the main algorithm terminates while the signature is still ambiguous, it moves to the final Symmetric state, indicating it's part of an unresolvable symmetry.

A ***Node Signature*** has the following properties:

* ***label***: string - The node's unique identifier (e.g., 'A'). Used for tracking.
* ***neighbourCount***: number - The degree of the node. This is the primary sorting criterion.
* ***finalIndex***: number (optional) - The node's unique, zero-based rank in the sorted list. A signature is considered `finalized` once this is assigned.
* ***neighbours***: array (optional) - If a signature is ambiguous, this array is populated with `Neighbour Representation` objects to resolve the ambiguity.
* ***cycleDistance***: number (optional) - A marker for a loop: the number of steps back to the ancestor at which a recursive expansion re-enters its own path.
* ***resolutionStep***: number (optional) - The pass number in which the ***finalIndex*** or ***cycleDistance*** was assigned.


When a ***Node Signature*** is expanded, its ***neighbours*** array is populated with the same node signature structure:

* **For a finalized neighbour**: `{ finalIndex: number, resolutionStep: number }`
    * A reference to a neighbour that is already unambiguous.
* **For a non-finalized neighbour**: `{ neighbourCount: number }`
    * A reference to a neighbour that is still ambiguous.
* **For a cycle**: `{ cycleDistance: number, resolutionStep: number }`
    * A marker used to terminate a recursive expansion when it loops back to an ancestor in the current expansion path. ***cycleDistance*** is the number of steps back to the first occurrence.
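
The properties above can be captured as a typed dictionary. The following is a minimal sketch, assuming Python and plain dictionaries that mirror the JSON listings used later in this chapter; the `state` helper is a hypothetical name that derives the `collapsed` / `expanded` / `finalized` status from which optional fields are present.

```python
from typing import TypedDict

class NodeSignature(TypedDict, total=False):
    """All fields are optional here so the same shape also covers neighbour representations."""
    label: str                         # tracking only, never compared
    neighbourCount: int                # degree of the node
    finalIndex: int                    # assigned once the signature is unique
    neighbours: list["NodeSignature"]  # populated when the signature is expanded
    cycleDistance: int                 # steps back to an ancestor in the expansion path
    resolutionStep: int                # pass in which finalIndex or cycleDistance was set

def state(sig: NodeSignature) -> str:
    """Derive the status of a signature from its optional fields (hypothetical helper)."""
    if "finalIndex" in sig:
        return "finalized"
    if "neighbours" in sig:
        return "expanded"
    return "collapsed"

sig: NodeSignature = {"label": "D", "neighbourCount": 2}
print(state(sig))  # collapsed
```

Using one shape for both node signatures and neighbour representations keeps the comparison logic uniform, since a neighbour representation is simply a signature with fewer fields set.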



### Core Sorting Logic

The comparison between any two signatures, `sigA` and `sigB`, follows a strict hierarchy of rules. This ensures a stable and deterministic order, meaning the relative order of two unequal signatures will never change in subsequent passes.

The comparison rules are applied in this exact order:

**1. By *neighbourCount* (descending)**
* The signature with the higher ***neighbourCount*** comes first.
* If ***neighbourCount*** values are different, the comparison stops here.

**2. By *resolutionStep* (ascending)**
* A signature with a ***resolutionStep*** comes before one that does not.
* If both have a ***resolutionStep***, the one with the lower value comes first.
* If ***resolutionStep*** values are different, the comparison stops here.

**3. By *cycleDistance* (ascending)**
* A signature with a ***cycleDistance*** comes before one that does not.
* If both have a ***cycleDistance***, the one with the lower value comes first.
* If ***cycleDistance*** values are different, the comparison stops here.

**4. By *finalIndex* (ascending)**
* A signature that has a ***finalIndex*** comes before one that does not.
* If both have a ***finalIndex***, the one with the lower value comes first.
* If ***finalIndex*** values are different, the comparison stops here.

**5. By *neighbours* array (recursive)**
* This rule is applied only if the signatures are still tied.
* If one signature is `expanded` (has a ***neighbours*** array) and the other is `collapsed` (does not), the `expanded` one comes first.
* If both are `expanded`, sort by recursively comparing their ***neighbours*** arrays.
* If both are `collapsed` (i.e., not yet expanded in the current pass), they are considered **equal for now**.
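
The five rules can be sketched as a single comparison function. This is an illustrative Python sketch, not a reference implementation: signatures are plain dictionaries, `compare` is a hypothetical name, and one point the rules leave open is filled with an explicit assumption, namely that rule 1 is skipped when either side has no ***neighbourCount*** (as is the case for finalized-neighbour and cycle representations).

```python
from functools import cmp_to_key

def compare(a, b):
    """Return a negative number if a sorts before b, positive if after, 0 if tied."""
    # Rule 1: neighbourCount, descending.
    # Assumption: the rule is skipped unless both signatures carry a neighbourCount.
    na, nb = a.get("neighbourCount"), b.get("neighbourCount")
    if na is not None and nb is not None and na != nb:
        return nb - na

    # Rules 2-4: a present value sorts before an absent one; otherwise lower first.
    for key in ("resolutionStep", "cycleDistance", "finalIndex"):
        x, y = a.get(key), b.get(key)
        if x != y:
            if x is None:
                return 1
            if y is None:
                return -1
            return x - y

    # Rule 5: expanded before collapsed, then element-wise recursive comparison.
    xa, xb = a.get("neighbours"), b.get("neighbours")
    if xa is None or xb is None:
        return (xa is None) - (xb is None)   # 0 when both are collapsed
    for x, y in zip(xa, xb):
        result = compare(x, y)
        if result:
            return result
    return 0  # labels are never compared, so structurally equal signatures tie

# Usage with hypothetical signatures: the highest neighbourCount comes first.
signatures = [
    {"label": "x", "neighbourCount": 1},
    {"label": "y", "neighbourCount": 3},
    {"label": "z", "neighbourCount": 2},
]
ordered = sorted(signatures, key=cmp_to_key(compare))
print([s["label"] for s in ordered])  # ['y', 'z', 'x']
```

Because Python's sort is stable, tied signatures keep their previous relative order, which fits the requirement that the order of unequal signatures never changes between passes.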

## The Process

The algorithm proceeds in passes, attempting to finalize signatures at each step.

### Initialization

1.  For each node in the graph, create a ***Node Signature*** object in the `collapsed` state, populating its ***label*** and ***neighbourCount***.
2.  Sort the entire list of signatures using the `Core Sorting Logic`.
3.  Scan the sorted list. Any signature that is unique is considered unambiguous. Assign it a ***finalIndex*** (its current position in the array) and a ***resolutionStep*** (which is 1).

### Main Loop (Subsequent Passes)

If, after a pass, there are still ambiguous signatures (those without a ***finalIndex***), a new pass is required.

1.  For each ambiguous ***Node Signature***, we expand it by populating its ***neighbours*** array.
2.  For each neighbour of the corresponding node in the graph, we create a `Neighbour Representation` and add it to the array:
    * If the neighbour's signature is already `finalized`, add a `{ finalIndex: ... }` representation.
    * If the neighbour's signature is not finalized, add a `{ neighbourCount: ... }` representation.
3.  Once populated, each new ***neighbours*** array is sorted using the same `Core Sorting Logic`.
4.  After all ambiguous signatures are expanded, the entire list is re-sorted globally.
5.  The algorithm then attempts to assign a ***finalIndex*** and ***resolutionStep*** to any signature that has now become unique.

### Cycle Detection

During the recursive expansion of a signature (e.g., E -> F -> ...), the algorithm must track the current expansion path. If it attempts to expand a neighbour that is already an ancestor in this path (e.g., expanding F's neighbours and re-encountering E), a cycle is detected.

* The expansion of that branch is terminated.
* The neighbour is represented with a `{ cycleDistance: number, resolutionStep: number }` object, marking the loop.
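
As a small illustration of the distance calculation, the sketch below assumes the expansion path is kept as a list of labels from the root to the node currently being expanded. The function name `cycle_distance` is hypothetical.

```python
def cycle_distance(path, neighbour):
    """Steps back from the position after the end of `path` to the earlier
    occurrence of `neighbour`, or None if `neighbour` is not an ancestor."""
    if neighbour not in path:
        return None
    return len(path) - path.index(neighbour)

# Expanding F while the path is E -> F and re-encountering E: E -> F -> E.
print(cycle_distance(["E", "F"], "E"))  # 2
print(cycle_distance(["E", "F"], "B"))  # None
```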

### Termination

The algorithm terminates when either:
* All signatures have been assigned a ***finalIndex***.
* A full pass completes with no new signatures being finalized. This occurs when remaining ambiguities are due to perfect structural symmetry, which the algorithm has correctly identified.
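
The two termination conditions translate into a short control loop. The sketch below shows only the control flow: `run_passes` is a hypothetical name, and the expansion and sort-and-finalize steps are passed in as functions (`expand` and `sort_and_finalize`, also hypothetical) so the loop itself stays independent of how they are implemented.

```python
def run_passes(signatures, expand, sort_and_finalize):
    """Repeat passes until every signature is finalized or a pass finalizes nothing.

    `expand(signatures, step)` expands the ambiguous signatures in place.
    `sort_and_finalize(signatures, step)` re-sorts the list and returns how many
    signatures it finalized in this pass.
    """
    step = 1  # pass 1 is the initialization, assumed to have run already
    while any("finalIndex" not in sig for sig in signatures):
        step += 1
        expand(signatures, step)
        if sort_and_finalize(signatures, step) == 0:
            return ("symmetry", step)   # nothing new was resolved in this pass
    return ("unique", step)             # every signature has a finalIndex

# Usage with stub steps: a pass that never finalizes anything ends as "symmetry".
stuck = [{"label": "E", "neighbourCount": 2}, {"label": "F", "neighbourCount": 2}]
print(run_passes(stuck, lambda sigs, step: None, lambda sigs, step: 0))  # ('symmetry', 2)
```

Each iteration either finalizes at least one signature or returns, so the loop cannot run for more passes than there are nodes.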

### Final Signature Generation

Once the algorithm terminates, the list of ***Node Signature*** objects can be converted into a cleaner, final format for output. This typically involves removing working properties like ***label*** to leave a purely structural, canonical signature.

# Running the **Signature v2** algorithm

## The Example Graph G1

Throughout this document, we will use the following 6-node graph as a reference.

![fig3: Simple Graph G1](images/simple-graph_G1.png)

| Node | Neighbour count | Neighbours |
| :--- | :-------------- | :--------- |
| A    | 3               | B C D      |
| B    | 4               | A D E F    |
| C    | 1               | A          |
| D    | 2               | A B        |
| E    | 2               | B F        |
| F    | 2               | B E        |


Now we are going to run the algorithm on this graph.
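
For readers who want to follow along in code, the table can be written as an adjacency map. This is a sketch in Python; the name `G1` and the `edges` helper are assumptions made for illustration, and the checks only confirm that the table is internally consistent.

```python
# Adjacency map for G1, copied from the table above.
G1 = {
    "A": ["B", "C", "D"],
    "B": ["A", "D", "E", "F"],
    "C": ["A"],
    "D": ["A", "B"],
    "E": ["B", "F"],
    "F": ["B", "E"],
}

def edges(graph):
    """Collect each undirected edge once, as a frozenset of its two endpoints."""
    return {frozenset((node, other)) for node, nbrs in graph.items() for other in nbrs}

# Every edge must be listed from both of its endpoints.
symmetric = all(node in G1[other] for node, nbrs in G1.items() for other in nbrs)
degrees = {node: len(nbrs) for node, nbrs in G1.items()}

print(symmetric)       # True
print(len(edges(G1)))  # 7
print(degrees)         # {'A': 3, 'B': 4, 'C': 1, 'D': 2, 'E': 2, 'F': 2}
```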


## Pass 1: Initialization

This first pass establishes the baseline signatures and resolves any nodes that are immediately unique based on their number of neighbours.

---
### 1. Create Initial Signatures
First, we create a ***Node Signature*** object for each node, containing only its ***label*** and ***neighbourCount***.
```json
[
    { "label": "A", "neighbourCount": 3 },
    { "label": "B", "neighbourCount": 4 },
    { "label": "C", "neighbourCount": 1 },
    { "label": "D", "neighbourCount": 2 },
    { "label": "E", "neighbourCount": 2 },
    { "label": "F", "neighbourCount": 2 }
]
```

---
### 2. Sort the List
Next, we sort this list using the `Core Sorting Logic`. At this stage, only the first rule applies: sort by ***neighbourCount*** in **descending** order.

```jsonc
[
    { "label": "B", "neighbourCount": 4 }, // index 0
    { "label": "A", "neighbourCount": 3 }, // index 1
    { "label": "D", "neighbourCount": 2 }, // index 2
    { "label": "E", "neighbourCount": 2 }, // index 3
    { "label": "F", "neighbourCount": 2 }, // index 4
    { "label": "C", "neighbourCount": 1 }  // index 5
]
```

---
### 3. Finalize Unambiguous Signatures
Now, we scan the sorted list to find signatures that are unique. A signature is unique if no other signature in the list has the same properties (at this point, just the ***neighbourCount***).

* **Node B** { neighbourCount: 4 } is **unique**.
* **Node A** { neighbourCount: 3 } is **unique**.
* **Nodes D, E, F** { neighbourCount: 2 } are **not unique**. They are tied with each other.
* **Node C** { neighbourCount: 1 } is **unique**.

We assign a ***finalIndex*** (its current position in the array) and `resolutionStep: 1` to each unique signature.

---
### Result at the End of Pass 1
The list of signatures is now in the following state. Nodes **B**, **A**, and **C** are considered `finalized`. Nodes D, E, and F remain ambiguous.

```json
[
    { "label": "B", "neighbourCount": 4, "finalIndex": 0, "resolutionStep": 1 },
    { "label": "A", "neighbourCount": 3, "finalIndex": 1, "resolutionStep": 1 },
    { "label": "D", "neighbourCount": 2 },
    { "label": "E", "neighbourCount": 2 },
    { "label": "F", "neighbourCount": 2 },
    { "label": "C", "neighbourCount": 1, "finalIndex": 5, "resolutionStep": 1 }
]
```

The algorithm must proceed to a second pass to resolve the ambiguity between D, E, and F.
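
Pass 1 is simple enough to reproduce in a few lines. The following Python sketch is one possible reading of the three steps above, with `initialise` as a hypothetical name. Only rule 1 of the sorting logic is needed at this stage, so a plain key sort on ***neighbourCount*** stands in for the full comparison.

```python
from collections import Counter

G1 = {
    "A": ["B", "C", "D"],
    "B": ["A", "D", "E", "F"],
    "C": ["A"],
    "D": ["A", "B"],
    "E": ["B", "F"],
    "F": ["B", "E"],
}

def initialise(graph):
    """Pass 1: create collapsed signatures, sort them, finalize the unique ones."""
    signatures = [
        {"label": label, "neighbourCount": len(nbrs)} for label, nbrs in graph.items()
    ]
    # Rule 1 only: neighbourCount descending (the sort is stable, so ties keep their order).
    signatures.sort(key=lambda sig: -sig["neighbourCount"])
    # A signature is unique in this pass if no other node shares its neighbourCount.
    tally = Counter(sig["neighbourCount"] for sig in signatures)
    for index, sig in enumerate(signatures):
        if tally[sig["neighbourCount"]] == 1:
            sig["finalIndex"] = index
            sig["resolutionStep"] = 1
    return signatures

for sig in initialise(G1):
    print(sig)
```

The printed list matches the state shown above: B, A, and C carry a ***finalIndex*** of 0, 1, and 5, while D, E, and F remain without one.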

## Pass 2: Resolving Ambiguities

At the start of this pass, signatures for D, E, and F are ambiguous as they share the same ***neighbourCount***. We will now expand these three signatures.

---
### 1. Expand Ambiguous Signatures

We populate the ***neighbours*** array for each ambiguous signature (D, E, and F) using the `Neighbour Representation` rules. The listings below keep the ***label*** for clarity and show only the ***finalIndex*** of finalized neighbours.

* **For Node D (Neighbours: A, B):**
    * Neighbour 'A' is `finalized` with `finalIndex: 1`.
    * Neighbour 'B' is `finalized` with `finalIndex: 0`.
    * D's ***neighbours*** array becomes: `[ { "label": "A", "finalIndex": 1 }, { "label": "B", "finalIndex": 0 } ]`

* **For Node E (Neighbours: B, F):**
    * Neighbour 'B' is `finalized` with `finalIndex: 0`.
    * Neighbour 'F' is **not** finalized. Its ***neighbourCount*** is 2.
    * E's ***neighbours*** array becomes: `[ { "label": "B", "finalIndex": 0 }, { "label": "F", "neighbourCount": 2 } ]`

* **For Node F (Neighbours: B, E):**
    * Neighbour 'B' is `finalized` with `finalIndex: 0`.
    * Neighbour 'E' is **not** finalized. Its ***neighbourCount*** is 2.
    * F's ***neighbours*** array becomes: `[ { "label": "B", "finalIndex": 0 }, { "label": "E", "neighbourCount": 2 } ]`

---
### 2. Sort Internal ***neighbours*** Arrays

Next, we sort each of the newly created ***neighbours*** arrays using the `Core Sorting Logic`. The ***label*** is not used for sorting.

* D's sorted ***neighbours***: `[ { "label": "B", "finalIndex": 0 }, { "label": "A", "finalIndex": 1 } ]`
* E's sorted ***neighbours***: `[ { "label": "B", "finalIndex": 0 }, { "label": "F", "neighbourCount": 2 } ]` (already sorted)
* F's sorted ***neighbours***: `[ { "label": "B", "finalIndex": 0 }, { "label": "E", "neighbourCount": 2 } ]` (already sorted)

---
### 3. Re-sort the Global List & Finalize

With the ambiguous signatures now expanded, we re-sort the entire list of six signatures. The sorting logic compares D, E, and F based on their new ***neighbours*** arrays.

* The signature for **D** is now unique.
* The signatures for **E** and **F** are identical to each other from a sorting perspective (since ***label*** is ignored in the comparison), so they remain ambiguous.

Node D's unique signature earns it a ***finalIndex*** of 2 (its new stable position in the sorted list) and a ***resolutionStep*** of 2.
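
The whole of Pass 2 can be sketched in Python as follows. This is an illustrative reading of the steps, not a validated implementation. It starts from the state at the end of Pass 1, repeats a condensed comparison function so that the block runs on its own, and makes two assumptions: rule 1 is skipped when either side lacks a ***neighbourCount***, and finalized-neighbour references carry their ***resolutionStep*** as described in the definition of the neighbour representations. The names `compare`, `expand_one_level`, and `sort_and_finalize` are hypothetical.

```python
from functools import cmp_to_key

G1 = {"A": ["B", "C", "D"], "B": ["A", "D", "E", "F"], "C": ["A"],
      "D": ["A", "B"], "E": ["B", "F"], "F": ["B", "E"]}

# State at the end of Pass 1.
sigs = [
    {"label": "B", "neighbourCount": 4, "finalIndex": 0, "resolutionStep": 1},
    {"label": "A", "neighbourCount": 3, "finalIndex": 1, "resolutionStep": 1},
    {"label": "D", "neighbourCount": 2},
    {"label": "E", "neighbourCount": 2},
    {"label": "F", "neighbourCount": 2},
    {"label": "C", "neighbourCount": 1, "finalIndex": 5, "resolutionStep": 1},
]

def compare(a, b):
    """Condensed Core Sorting Logic; labels are never compared."""
    na, nb = a.get("neighbourCount"), b.get("neighbourCount")
    if na is not None and nb is not None and na != nb:   # rule 1 (assumed skipped if absent)
        return nb - na
    for key in ("resolutionStep", "cycleDistance", "finalIndex"):   # rules 2-4
        x, y = a.get(key), b.get(key)
        if x != y:
            return 1 if x is None else -1 if y is None else x - y
    xa, xb = a.get("neighbours"), b.get("neighbours")    # rule 5
    if xa is None or xb is None:
        return (xa is None) - (xb is None)
    return next((c for c in map(compare, xa, xb) if c), 0)

def expand_one_level(sig, by_label):
    """Populate and sort the neighbours array of one ambiguous signature."""
    reps = []
    for name in G1[sig["label"]]:
        other = by_label[name]
        if "finalIndex" in other:
            reps.append({"label": name, "finalIndex": other["finalIndex"],
                         "resolutionStep": other["resolutionStep"]})
        else:
            reps.append({"label": name, "neighbourCount": other["neighbourCount"]})
    sig["neighbours"] = sorted(reps, key=cmp_to_key(compare))

def sort_and_finalize(sigs, step):
    """Re-sort globally, then finalize every signature that ties with no other."""
    sigs.sort(key=cmp_to_key(compare))
    unique = [i for i, s in enumerate(sigs) if "finalIndex" not in s
              and all(compare(s, o) != 0 for o in sigs if o is not s)]
    for i in unique:
        sigs[i]["finalIndex"] = i
        sigs[i]["resolutionStep"] = step
    return len(unique)

by_label = {s["label"]: s for s in sigs}
for sig in [s for s in sigs if "finalIndex" not in s]:
    expand_one_level(sig, by_label)
finalized = sort_and_finalize(sigs, step=2)

print(finalized)                   # 1
print(compare(sigs[3], sigs[4]))   # 0: E and F still tie
```

Under these assumptions the sketch finalizes exactly one signature, D, at index 2, and leaves E and F tied.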

---
### Result at the End of Pass 2

The list of signatures is now in the following state. Node **D** is now `finalized`. Nodes E and F remain ambiguous.

```json
[
  { "label": "B", "neighbourCount": 4, "finalIndex": 0, "resolutionStep": 1 },
  { "label": "A", "neighbourCount": 3, "finalIndex": 1, "resolutionStep": 1 },
  { "label": "D", "neighbourCount": 2, "neighbours": [ { "label": "B", "finalIndex": 0 }, { "label": "A", "finalIndex": 1 } ], "finalIndex": 2, "resolutionStep": 2 },
  { "label": "E", "neighbourCount": 2, "neighbours": [ { "label": "B", "finalIndex": 0 }, { "label": "F", "neighbourCount": 2 } ] },
  { "label": "F", "neighbourCount": 2, "neighbours": [ { "label": "B", "finalIndex": 0 }, { "label": "E", "neighbourCount": 2 } ] },
  { "label": "C", "neighbourCount": 1, "finalIndex": 5, "resolutionStep": 1 }
]
```

## Pass 3: Deep Expansion and Termination

At the start of this pass, nodes E and F are still ambiguous, with identical signatures. We must expand their signatures further by looking inside their ***neighbours*** arrays for unresolved parts.

---
### 1. Recursive Expansion of Signatures

The current signature for both E and F contains a `Neighbour Representation` of an unresolved node: `{ "label": "F", "neighbourCount": 2 }` for E, and `{ "label": "E", "neighbourCount": 2 }` for F. We will now expand this part for each.

* **Expanding E's Signature:**
    * We expand the representation of node **F**.
    * The expansion path is now **E -> F**.
    * We look at F's actual neighbours: **B** and **E**.
    * Neighbour 'B' is `finalized`. Its representation is `{ "label": "B", "finalIndex": 0 }`.
    * Neighbour 'E' is the root of our current expansion path, creating a **cycle**. The path is `E -> F -> E` (distance 2). Since this is Pass 3, the representation is `{ "label": "E", "cycleDistance": 2, "resolutionStep": 3 }`.
    * The fully expanded representation for the neighbour F becomes `{ "label": "F", "neighbourCount": 2, "neighbours": [ { "label": "B", "finalIndex": 0 }, { "label": "E", "cycleDistance": 2, "resolutionStep": 3 } ] }`.
    * E's full signature is updated accordingly.

* **Expanding F's Signature:**
    * The process is perfectly symmetrical. We expand its representation of node **E**.
    * The expansion path is **F -> E**.
    * E's neighbours are **B** and **F**.
    * Neighbour 'B' is `finalized` (`{ "label": "B", "finalIndex": 0 }`).
    * Neighbour 'F' is the root of this path (**F -> E -> F**), creating a cycle. The representation is `{ "label": "F", "cycleDistance": 2, "resolutionStep": 3 }`.
    * The fully expanded representation for the neighbour E becomes `{ "label": "E", "neighbourCount": 2, "neighbours": [ { "label": "B", "finalIndex": 0 }, { "label": "F", "cycleDistance": 2, "resolutionStep": 3 } ] }`.
    * F's full signature is updated.
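
The recursive expansion with cycle detection can be sketched as below. It is a Python illustration under stated assumptions: the table `FINALIZED` hard-codes the nodes resolved by the end of Pass 2, the `depth` parameter (two levels for Pass 3) is a hypothetical way to bound the recursion, finalized-neighbour references carry their ***resolutionStep***, and the per-array sort is left out because the adjacency lists of G1 already happen to be in sorted order.

```python
G1 = {"A": ["B", "C", "D"], "B": ["A", "D", "E", "F"], "C": ["A"],
      "D": ["A", "B"], "E": ["B", "F"], "F": ["B", "E"]}

# Nodes finalized by the end of Pass 2: label -> (finalIndex, resolutionStep).
FINALIZED = {"B": (0, 1), "A": (1, 1), "D": (2, 2), "C": (5, 1)}

def expand(label, path, step, depth):
    """Expand `label`, reached through the ancestors in `path`, at most `depth` levels deep."""
    rep = {"label": label, "neighbourCount": len(G1[label])}
    if depth == 0:
        return rep                       # stays collapsed
    here = path + [label]                # the expansion path including this node
    rep["neighbours"] = []
    for name in G1[label]:
        if name in FINALIZED:
            index, resolved = FINALIZED[name]
            child = {"label": name, "finalIndex": index, "resolutionStep": resolved}
        elif name in here:
            # Loop back to an ancestor: stop here and record the steps back to it.
            child = {"label": name,
                     "cycleDistance": len(here) - here.index(name),
                     "resolutionStep": step}
        else:
            child = expand(name, here, step, depth - 1)
        rep["neighbours"].append(child)
    return rep

e = expand("E", [], step=3, depth=2)
f = expand("F", [], step=3, depth=2)
print(e["neighbours"][1]["neighbours"][1])
# {'label': 'E', 'cycleDistance': 2, 'resolutionStep': 3}
print(f["neighbours"][1]["neighbours"][1])
# {'label': 'F', 'cycleDistance': 2, 'resolutionStep': 3}
```

Apart from the labels, the two expanded structures are identical, which is the symmetry the next step relies on.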

---
### 2. Final Comparison and Termination

After the deep expansion, we compare the new, more deeply nested signatures of E and F. The labels 'E' and 'F' inside their cycle representations are not used for comparison, so the signatures are **still considered identical by the sorter**.

Because no new signatures were finalized in this pass, the algorithm's termination condition is met. The process is complete.

---
### Result at the End of Pass 3 (Final Working Signatures)
The final state of the working signatures before the cleanup step is:

```json
[
  { "label": "B", "neighbourCount": 4, "finalIndex": 0, "resolutionStep": 1 },
  { "label": "A", "neighbourCount": 3, "finalIndex": 1, "resolutionStep": 1 },
  { "label": "D", "neighbourCount": 2, "neighbours": [ { "label": "B", "finalIndex": 0 }, { "label": "A", "finalIndex": 1 } ], "finalIndex": 2, "resolutionStep": 2 },
  { "label": "E", "neighbourCount": 2, "neighbours": [ { "label": "B", "finalIndex": 0 }, { "label": "F", "neighbourCount": 2, "neighbours": [ { "label": "B", "finalIndex": 0 }, { "label": "E", "cycleDistance": 2, "resolutionStep": 3 } ] } ] },
  { "label": "F", "neighbourCount": 2, "neighbours": [ { "label": "B", "finalIndex": 0 }, { "label": "E", "neighbourCount": 2, "neighbours": [ { "label": "B", "finalIndex": 0 }, { "label": "F", "cycleDistance": 2, "resolutionStep": 3 } ] } ] },
  { "label": "C", "neighbourCount": 1, "finalIndex": 5, "resolutionStep": 1 }
]
```

# Final Step: Remove Labels

The last step consists only of removing the labels.

```json
[
  { "neighbourCount": 4, "finalIndex": 0, "resolutionStep": 1 },
  { "neighbourCount": 3, "finalIndex": 1, "resolutionStep": 1 },
  { "neighbourCount": 2, "neighbours": [ { "finalIndex": 0 }, { "finalIndex": 1 } ], "finalIndex": 2, "resolutionStep": 2 },
  { "neighbourCount": 2, "neighbours": [ { "finalIndex": 0 }, { "neighbourCount": 2, "neighbours": [ { "finalIndex": 0 }, { "cycleDistance": 2, "resolutionStep": 3 } ] } ] },
  { "neighbourCount": 2, "neighbours": [ { "finalIndex": 0 }, { "neighbourCount": 2, "neighbours": [ { "finalIndex": 0 }, { "cycleDistance": 2, "resolutionStep": 3 } ] } ] },
  { "neighbourCount": 1, "finalIndex": 5, "resolutionStep": 1 }
]
```
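
Label removal has to reach into nested ***neighbours*** arrays as well. A minimal recursive sketch in Python, with `strip_labels` as a hypothetical name:

```python
def strip_labels(sig):
    """Return a copy of `sig` without `label`, applied recursively to `neighbours`."""
    clean = {}
    for key, value in sig.items():
        if key == "label":
            continue
        if key == "neighbours":
            clean[key] = [strip_labels(child) for child in value]
        else:
            clean[key] = value
    return clean

# Node D's working signature from the end of Pass 3.
d = {"label": "D", "neighbourCount": 2,
     "neighbours": [{"label": "B", "finalIndex": 0}, {"label": "A", "finalIndex": 1}],
     "finalIndex": 2, "resolutionStep": 2}
print(strip_labels(d))
# {'neighbourCount': 2, 'neighbours': [{'finalIndex': 0}, {'finalIndex': 1}], 'finalIndex': 2, 'resolutionStep': 2}
```

The function builds a new structure rather than deleting keys in place, so the labelled working signatures remain available for debugging.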


# Interpreting the Signature Results

## What the Signatures Reveal
This final array is the canonical "fingerprint" of the graph's topology.

### 1. Structural Equivalence (Symmetry)

The most significant conclusion is that **nodes E and F have perfectly identical signatures.** This means the algorithm has proven that from a topological standpoint, nodes E and F are indistinguishable and occupy symmetrical, interchangeable positions in the graph. They belong to the same "orbit" of the graph's automorphism group.

### 2. Unique Structural Roles

The algorithm successfully assigned a unique signature and ***finalIndex*** to four of the six nodes: **A, B, C, and D**. This demonstrates that these four nodes each have a distinct structural role within the graph that can be differentiated by their connectivity patterns.

### 3. A Hierarchy of Complexity

The ***resolutionStep*** value in each signature reveals how "difficult" it was for the algorithm to resolve each node's role, creating a structural hierarchy:

* **Simplest Roles (`resolutionStep: 1`):** Nodes **B, A, and C** were identified immediately, defined purely by their number of neighbours.
* **Intermediate Role (`resolutionStep: 2`):** Node **D** required looking at its immediate neighbours to be resolved.
* **Most Complex Roles (`resolutionStep: 3`):** Nodes **E and F** are the most complex. They were never finalized; the cycle markers inside their signatures carry `resolutionStep: 3`, reflecting a recursive, cyclical relationship that required a 3-pass analysis.

### 4. A Canonical Graph Signature

This final, sorted array of label-less signatures is intended as a **canonical representation of the entire graph**. This output can be used for comparison: if another graph produces the exact same final array, the two graphs are taken to be **isomorphic** (structurally identical), subject to the validation caveat discussed under Limitations and Considerations.
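
Comparing two final arrays amounts to a structural equality check. One hedged way to do this in Python is to serialize each array to a normalized JSON string; `canonical_key` is a hypothetical name, and the sample arrays are shortened fragments chosen for illustration. Equal keys mean equal arrays; what that equality implies about the graphs is the claim discussed in the text, not something this code establishes.

```python
import json

def canonical_key(signatures):
    """Serialize a label-less signature array so that equal arrays give equal strings.

    sort_keys removes any dependence on the order of properties inside each object;
    the order of the array itself is preserved, because it is part of the signature.
    """
    return json.dumps(signatures, sort_keys=True, separators=(",", ":"))

first = [{"neighbourCount": 1, "finalIndex": 5, "resolutionStep": 1}]
same = [{"resolutionStep": 1, "finalIndex": 5, "neighbourCount": 1}]   # same content, other key order
other = [{"neighbourCount": 1, "finalIndex": 4, "resolutionStep": 1}]

print(canonical_key(first) == canonical_key(same))   # True
print(canonical_key(first) == canonical_key(other))  # False
```

A string key of this kind can also be hashed or stored, which is convenient when many graphs have to be grouped by signature.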

## Limitations and Considerations

While this algorithm produces a highly detailed and deterministic signature, it's important to acknowledge two significant trade-offs: its computational complexity and the potential size of the generated signatures.

### A Note on Empirical Validation

It is important to state that the algorithm detailed here is a formal specification. While its logic was worked through and refined by hand on our 6-node example, it has **not yet been implemented in code or benchmarked** against a wide variety of complex graphs.

Therefore, its real-world performance characteristics, memory consumption, and correctness on edge cases are still theoretical. The logical next step would be a robust implementation to validate these results empirically and confirm its practical effectiveness.

### Computational Complexity

The performance of this algorithm is not its primary strength. The complexity can be considerable, especially for large or densely connected graphs. The main drivers of this complexity are:

1.  **Sorting in Each Pass:** The need to re-sort the entire list of node signatures after each expansion pass is computationally intensive.
2.  **Recursive Comparison:** The core comparison logic, which may involve deep, lexicographical comparisons of nested ***neighbours*** arrays, can be costly for nodes in complex regions of the graph.

This algorithm is therefore better suited for offline, in-depth structural analysis where descriptive accuracy is paramount, rather than for high-performance or real-time applications.

### Signature Size

The canonical signatures themselves, while rich in information, can become very large and deeply nested. This is because the signature of a node effectively encodes the structure of its surrounding neighbourhood, potentially to a great depth, to resolve ambiguities.

Consequently, the storage requirements for the final set of signatures for a large graph can be substantial.

Ultimately, this algorithm trades performance and conciseness for an extremely high degree of structural detail and accuracy.

# Conclusion

Now that the blueprint of the algorithm is complete, the ultimate question remains: does this theoretical power survive the harsh realities of implementation?

**Signature v1** was an 'order 3' algorithm. Now, with a recursive engine built to handle arbitrary depth, we must embark on the next phase to measure the true, and perhaps unlimited, order of our **Signature v2**.