---
title: 9.4 Graph Theory - Consensus
subject:  Symmetric Matrices
subtitle: convergence of agents' behavior
short_title: 9.4 Graph Theory - Consensus
authors:
  - name: Nikolai Matni
    affiliations:
      - Dept. of Electrical and Systems Engineering
      - University of Pennsylvania
    email: nmatni@seas.upenn.edu
license: CC-BY-4.0
keywords: 
math:
  '\vv': '\mathbf{#1}'
  '\bm': '\begin{bmatrix}'
  '\em': '\end{bmatrix}'
  '\R': '\mathbb{R}'
---

[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/nikolaimatni/ese-2030/HEAD?labpath=/08_Ch_9_Symmetric_Matrices/105-Graph-consensus.ipynb)

{doc}`Lecture notes <../lecture_notes/Lecture 17 - Introduction to Graph Theory and Consensus Protocols.pdf>`

## Reading

Material related to this page can be found in [](https://murray.cds.caltech.edu/images/murray.cds/1/1e/Eeci-sp09_L4_graphtheory.pdf)

## Learning Objectives

By the end of this page, you should know:
- what a consensus protocol is
- the consensus theorem
- how the eigenvalues/eigenvectors of the Laplacian matrix relate to consensus in a graph

**TO DO**: NumPy examples 

## Consensus Protocols 

Consider a collection of $N$ agents that communicate along a set of undirected links described by a graph $G$. Each agent has state $x_i(t) \in \mathbb{R}$, with initial value $x_i(0)$, and together, they wish to determine the average of the initial states $\text{avg}(\vv x(0)) = \frac{1}{N} \sum_{i=1}^N x_i(0)$.

The agents implement the following _consensus protocol_:

$$
\dot{x}_i = \sum_{j \in N_i} (x_j - x_i) = -|N_i| (x_i - \text{avg}(x_{N_i})),
$$

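where $\text{avg}(x_{N_i}) = \frac{1}{|N_i|} \sum_{j \in N_i} x_j$ is the average of the states of the neighbors $N_i$ of agent $i$. Stacking these equations for all $N$ agents into the state vector $\vv x = (x_1, \ldots, x_N)$, and writing $L$ for the graph Laplacian of $G$, this is equivalent to the first-order homogeneous linear ordinary differential equation: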

\begin{equation}
\label{avg}
\dot{\vv x} = -L\vv x. \quad (\text{AVG})
\end{equation}
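
To see the equivalence concretely, the following sketch compares the agent-by-agent update with the stacked form $-L\vv x$. The path graph on four nodes and the state values are arbitrary choices made for illustration; they do not come from the lecture.

```python
import numpy as np

# Hypothetical example graph: a path 0 - 1 - 2 - 3 (chosen for illustration)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
Delta = np.diag(A.sum(axis=1))  # degree matrix
L = Delta - A                   # graph Laplacian

x = np.array([1.0, 3.0, -2.0, 6.0])  # arbitrary agent states

# Agent-by-agent protocol: xdot_i = sum over neighbors j of (x_j - x_i)
xdot_agents = np.array([
    sum(x[j] - x[i] for j in np.flatnonzero(A[i]))
    for i in range(len(x))
])

# Stacked form: xdot = -L x
xdot_matrix = -L @ x

print(xdot_agents)
print(xdot_matrix)
print(np.allclose(xdot_agents, xdot_matrix))
```

Both computations give the same vector, and its entries sum to zero, which already hints that the average of the states is preserved along trajectories.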

Based on our [previous analysis](../05_Ch_6_Eigenvalues_and_Eigenvectors/078-linear_odes.ipynb#odes-thm1) of such systems, we know that the solution to [(AVG)](#avg) is given by

\begin{equation}
\label{sol}
\vv x(t) = c_1 e^{\lambda_1 t} \vv v_1 + \cdots + c_n e^{\lambda_n t} \vv v_n, \quad \vv x(0) = \bm \vv v_1 & \cdots & \vv v_n \em \vv c. \quad (\text{SOL})
\end{equation}

where $(\lambda_i, \vv v_i)$, $i=1,\ldots,n$, are the eigenvalue/eigenvector pairs of the negative graph Laplacian $-L$ (here $n = N$, the number of agents). Thus, the behavior of the consensus system [(AVG)](#avg) is determined by the spectrum of $L$. We will spend the rest of this lecture on understanding the following theorem:

:::{prf:theorem} Consensus Theorem
:label: consensus-thm

If the graph $G$ defining the consensus system ([AVG](#avg)) is connected, then the state of the agents converges to $x^* = \text{avg}(\vv x(0))$ exponentially quickly.

:::
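
Before looking at why the theorem holds, it can help to watch it happen numerically. The sketch below integrates $\dot{\vv x} = -L\vv x$ with forward Euler on a small connected graph. The graph, initial states, step size `h`, and number of steps are all illustrative assumptions, and forward Euler is only one simple way to approximate the ODE.

```python
import numpy as np

# Hypothetical connected graph: a path 0 - 1 - 2 - 3
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A

x0 = np.array([1.0, 3.0, -2.0, 6.0])  # arbitrary initial states
target = x0.mean()                    # avg(x(0))

h = 0.01          # Euler step size (assumed small enough for stability)
num_steps = 5000  # integrates up to t = 50
x = x0.copy()
errors = []
for _ in range(num_steps):
    x = x - h * (L @ x)                        # x_{k+1} = x_k + h * (-L x_k)
    errors.append(np.linalg.norm(x - target))  # distance from consensus

print(target)
print(x)
print(errors[0], errors[-1])
```

Every agent ends up at the average of the initial states, and the distance to consensus shrinks by many orders of magnitude over the run.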

[This result](#consensus-thm) is extremely intuitive! It says that so long as the information at one node can eventually reach every other node in the graph, then we can achieve consensus via the protocol [(AVG)](#avg). Let's try to understand why. As in the previous lecture, we order the eigenvalues of $-L$ in decreasing order: $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n$.

Our first observation is that $\lambda = 0$, $\vv v = \mathbf{1}$ is an eigenvalue/eigenvector pair for $-L$. This follows from the fact that each row of $L$ sums to 0, and so:

$$
-L \mathbf{1} = \mathbf{0} = 0 \mathbf{1}.
$$

We will show later that the eigenvalues of $L$ are all nonnegative, so the eigenvalues of $-L$ satisfy $\lambda_i \leq 0$. As such, $\lambda = 0$ is the largest eigenvalue of $-L$, and we label this pair $\lambda_1 = 0$, $\vv v_1 = \mathbf{1}$.

Next, we recall that for an undirected graph, the Laplacian $L$ is symmetric, and hence $-L$ is diagonalized by an orthonormal eigenbasis: $-L = Q \Lambda Q^T$, where $Q = \bm \vv u_1 & \cdots & \vv u_n\em $ is an orthogonal matrix composed of orthonormal eigenvectors of $L$, and $\Lambda = \text{diag}(\lambda_1, \ldots, \lambda_n)$. Although we do not know $\vv u_2, \ldots, \vv u_n$, we know that $\vv u_1 = \frac{\vv v_1}{\|\vv v_1\|} = \frac{1}{\sqrt{N}} \mathbf{1}$.

We can therefore rewrite [(SOL)](#sol) as:

\begin{equation}
\label{sol_re}
\begin{aligned}
\vv x(t) &= c_1 e^{0t} \frac{1}{\sqrt{N}} \mathbf{1} + c_2 e^{\lambda_2 t} \vv u_2 + \cdots + c_n e^{\lambda_n t} \vv u_n \\
&= c_1 \frac{1}{\sqrt{N}} \mathbf{1} + c_2 e^{\lambda_2 t} \vv u_2 + \cdots + c_n e^{\lambda_n t} \vv u_n
\end{aligned}
\end{equation}

where now we can compute $\vv c$ by solving $\vv x(0) = Q\vv c \Rightarrow \vv c = Q^T \vv x(0)$, as $Q$ is an orthogonal matrix.

Let's focus on computing $c_1$:

$$
c_1 = \vv u_1^T \vv x(0) = \frac{1}{\sqrt{N}} \mathbf{1}^T \vv x(0) = \frac{1}{\sqrt{N}} \sum_{i=1}^N x_i(0).
$$

Plugging this back into [](#sol_re), we get:

\begin{equation}
\label{sol_re2}
\begin{aligned}
\vv x(t) &= \frac{1}{N} \sum_{i=1}^N x_i(0) \cdot \mathbf{1} + c_2 e^{\lambda_2 t} \vv u_2 + \cdots + c_n e^{\lambda_n t} \vv u_n \\
&= \text{avg}(\vv x(0)) \mathbf{1} + c_2 e^{\lambda_2 t} \vv u_2 + \cdots + c_n e^{\lambda_n t} \vv u_n.
\end{aligned}
\end{equation}

This is very exciting! We have shown that the solution $\vv x(t)$ to [(AVG)](#avg) is composed of a sum of the final consensus state $\vv x^* = \text{avg}(\vv x(0)) \mathbf{1}$ and exponential functions $c_i e^{\lambda_i t} \vv u_i$, $i=2,\ldots,n$, evolving in the subspace $\vv u_1^\perp$ orthogonal to the consensus direction $\frac{1}{\sqrt{N}} \mathbf{1}$. Thus, if we can show that $\lambda_2, \ldots, \lambda_n < 0$, we will have established our result.
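
The decomposition above can be checked numerically. The following sketch uses `np.linalg.eigh` to diagonalize $-L$, computes $\vv c = Q^T \vv x(0)$, and evaluates the closed-form solution. The graph and initial states are hypothetical choices, and the helper `x_of_t` is a name introduced here for illustration.

```python
import numpy as np

# Hypothetical connected graph: a path 0 - 1 - 2 - 3
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A
x0 = np.array([1.0, 3.0, -2.0, 6.0])  # arbitrary initial states

# eigh returns eigenvalues in ascending order; reverse them so that
# lambda_1 >= lambda_2 >= ... >= lambda_n, matching the text
lam, Q = np.linalg.eigh(-L)
lam, Q = lam[::-1], Q[:, ::-1]

c = Q.T @ x0  # coefficients, using that Q is orthogonal

def x_of_t(t):
    """Closed-form solution x(t) = sum_i c_i exp(lambda_i t) u_i."""
    return Q @ (np.exp(lam * t) * c)

# The lambda_1 = 0 term; the sign ambiguity of u_1 cancels in c_1 * u_1
consensus_part = c[0] * Q[:, 0]

print(lam)
print(consensus_part)
print(x_of_t(50.0))
```

The first term equals $\text{avg}(\vv x(0)) \mathbf{1}$ regardless of which sign the solver picks for $\vv u_1$, and for large $t$ the remaining terms have decayed so that $\vv x(t)$ is numerically indistinguishable from it.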

To establish this result, we start by stating a widely used theorem for localizing eigenvalues.

:::{prf:theorem} Gershgorin's Disk Theorem
:label: gershgorin-thm

Let $A \in \mathbb{R}^{n \times n}$ and define the radius
$$
r_i = \sum_{j=1, j \neq i}^n |a_{ij}|
$$
as the absolute row sum with entry $a_{ii}$ deleted. Then all eigenvalues of $A$ are located in the union of $n$ disks:
$$
G(A) = \bigcup_{i=1}^n G_i(A), \quad G_i(A) = \{z \in \mathbb{C} \mid |z - a_{ii}| \leq r_i\}
$$

In the case of symmetric matrices, we can restrict the $G_i(A)$ to the real line:
$$
G_i(A) = \{\lambda \in \mathbb{R} \mid |\lambda - a_{ii}| \leq r_i\}
$$
:::

:::{prf:example}
:label: graphs-ex4

Consider $A = \begin{bmatrix} 3 & 1 \\ 1 & 3 \end{bmatrix}$. [Gershgorin's disk theorem](#gershgorin-thm) tells us that the eigenvalues $\lambda_1$ and $\lambda_2$ are contained within the set
$$
G(A) = \{\lambda \in \mathbb{R} \mid |\lambda - 3| \leq 1\},
$$
or equivalently that $2 \leq \lambda_2 \leq \lambda_1 \leq 4$. As we've computed in previous examples, $\lambda_1 = 4$ and $\lambda_2 = 2$, which indeed do lie within $G(A)$.

:::
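
The example above is easy to reproduce in NumPy. The helper `gershgorin_intervals` below is a hypothetical name introduced here; it is a minimal sketch that assumes a real symmetric input, so that the disks reduce to intervals on the real line.

```python
import numpy as np

def gershgorin_intervals(A):
    """Return one row [a_ii - r_i, a_ii + r_i] per row of a real symmetric A."""
    A = np.asarray(A, dtype=float)
    centers = np.diag(A)
    # Absolute row sums with the diagonal entry removed
    radii = np.abs(A).sum(axis=1) - np.abs(centers)
    return np.column_stack([centers - radii, centers + radii])

A = np.array([[3.0, 1.0],
              [1.0, 3.0]])

intervals = gershgorin_intervals(A)
eigvals = np.linalg.eigvalsh(A)  # ascending order

print(intervals)  # both rows give the interval [2, 4]
print(eigvals)    # eigenvalues 2 and 4
```

Here the eigenvalues sit exactly at the endpoints of the interval, so the Gershgorin bound is tight for this matrix.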

Let's apply [this theorem](#gershgorin-thm) to a graph Laplacian $L$. The diagonal elements of $L = \Delta - A$ are given by $\Delta_{ii} = \text{out}(v_i)$, the out-degree of node $i$. Further, the radii satisfy $r_i = \text{out}(v_i)$ as well: the off-diagonal entries in row $i$ of $L$ are $-a_{ij}$, where $a_{ij} = 1$ if node $i$ is connected to node $j$ and 0 otherwise, so their absolute values sum to the number of neighbors of node $i$. Therefore, for row $i$, we have the following Gershgorin intervals:
$$
G_i(L) = \{\lambda \in \mathbb{R} \mid |\lambda - \text{out}(v_i)| \leq \text{out}(v_i)\}.
$$

These are intervals of the form $[0, 2\text{out}(v_i)]$, and therefore the union $G(L) = \bigcup_{i=1}^n G_i(L) = [0, 2d_{\text{max}}]$, where $d_{\text{max}} = \max_i \text{out}(v_i)$ is the maximal out degree of a node in the graph. Taking the negative of everything, we conclude that $G(-L) = [-2d_{\text{max}}, 0]$.
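
As a quick sanity check of this bound, the sketch below builds the Laplacian of a small graph and confirms that its eigenvalues lie in $[0, 2d_{\text{max}}]$. The five-node graph and its edge list are assumptions made for illustration.

```python
import numpy as np

# Hypothetical undirected graph on 5 nodes: node 0 is a hub
edges = [(0, 1), (0, 2), (0, 3), (3, 4)]
n = 5
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0   # undirected edge

L = np.diag(A.sum(axis=1)) - A
d_max = int(np.diag(L).max())  # maximal degree

eigs_L = np.linalg.eigvalsh(L)      # eigenvalues of L, ascending
eigs_negL = np.linalg.eigvalsh(-L)  # eigenvalues of -L, ascending

tol = 1e-10  # tolerance for floating-point round-off
print(d_max)
print(eigs_L)
print(eigs_L.min() >= -tol and eigs_L.max() <= 2 * d_max + tol)
print(eigs_negL.max() <= tol and eigs_negL.min() >= -2 * d_max - tol)
```

The bound is usually loose at the upper end, but the lower endpoint $0$ is always attained because $L\mathbf{1} = \mathbf{0}$.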

This tells us that $\lambda_i \leq 0$ for $i=1,2,\ldots,n$ for the eigenvalues of $-L$. This is almost what we wanted. We still need to show that only $\lambda_1 = 0$ and that $\lambda_n \leq \cdots \leq \lambda_2 < 0$. To close this gap, we rely on the following proposition:

:::{prf:proposition} 
:label: connected-components-prop

The algebraic multiplicity of the 0 eigenvalue of a graph Laplacian $L$ is equal to the number of connected components in the graph. In particular, if the graph $G$ is connected, then only $\lambda_1 = 0$, and $\lambda_n \leq \cdots \leq \lambda_2 < \lambda_1 = 0$.

:::

Unfortunately, proving [this result](#connected-components-prop) would take us too far astray. Instead, we highlight the intuitive nature of the result in terms of the consensus system [(AVG)](#avg). This proposition tells us that if the communication graph $G$ is connected, i.e., if everyone's information eventually reaches everyone, then $\vv x(t) \to \vv x^* = \text{avg}(\vv x(0))\mathbf{1}$ at a rate governed by the slowest decaying mode $e^{\lambda_2 t}$, where $\lambda_2 < 0$ is the largest nonzero eigenvalue of $-L$.

:::{note} Algebraic Connectivity
The quantity $-\lambda_2 = |\lambda_2|$, i.e., the second-smallest eigenvalue of $L$, dictates the convergence rate of consensus and is called the _algebraic connectivity_ of the graph.
:::
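
Both the proposition and the algebraic connectivity can be explored numerically. The sketch below counts the (numerically) zero eigenvalues of $L$ for a connected and a disconnected graph, and reads off the second-smallest eigenvalue of $L$ for the connected one. The helpers `laplacian_from_edges` and `num_zero_eigenvalues`, the example graphs, and the tolerance are all assumptions introduced for illustration.

```python
import numpy as np

def laplacian_from_edges(n, edges):
    """Laplacian L = Delta - A of an undirected graph on nodes 0..n-1."""
    A = np.zeros((n, n))
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    return np.diag(A.sum(axis=1)) - A

def num_zero_eigenvalues(L, tol=1e-9):
    """Count eigenvalues of L that are zero up to a numerical tolerance."""
    return int(np.sum(np.abs(np.linalg.eigvalsh(L)) < tol))

# Connected: path 0 - 1 - 2 - 3.  Disconnected: two separate edges.
L_connected = laplacian_from_edges(4, [(0, 1), (1, 2), (2, 3)])
L_split = laplacian_from_edges(4, [(0, 1), (2, 3)])

print(num_zero_eigenvalues(L_connected))  # one connected component
print(num_zero_eigenvalues(L_split))      # two connected components

# Second-smallest eigenvalue of L (eigvalsh sorts in ascending order)
algebraic_connectivity = np.linalg.eigvalsh(L_connected)[1]
print(algebraic_connectivity)
```

For the four-node path graph the second-smallest eigenvalue is $2 - \sqrt{2} \approx 0.586$, so the slowest mode of the consensus dynamics on this graph decays like $e^{-0.586 t}$.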

In contrast, suppose the graph $G$ is disconnected, and consists of the disjoint union of two connected graphs $G_1 = (\mathcal{V}_1, \mathcal{E}_1)$ and $G_2 = (\mathcal{V}_2, \mathcal{E}_2)$, i.e., $G = (\mathcal{V}_1 \cup \mathcal{V}_2, \mathcal{E}_1 \cup \mathcal{E}_2)$ with $\mathcal{V}_1 \cap \mathcal{V}_2 = \emptyset$ and $\mathcal{E}_1 \cap \mathcal{E}_2 = \emptyset$. Then if we run the consensus protocol [(AVG)](#avg) on $G$, the system effectively decouples into two parallel systems, each evolving on its own graph and blissfully unaware of the other:

\begin{equation}
\label{decouple}
\dot{\vv x}_1 = -L_1 \vv x_1 \quad \text{and} \quad \dot{\vv x}_2 = -L_2 \vv x_2.
\end{equation}

Here we use $\vv x_1$ to denote the state of agents in $G_1$, with Laplacian $L_1$, and similarly for $\vv x_2$. By the above discussion, if $G_1$ and $G_2$ are both connected, then $\vv x_i(t) \to \text{avg}(\vv x_i(0)) \mathbf{1}$ for $i = 1, 2$, and $\lambda = 0, \vv v = \mathbf{1}$ is an eigenvalue/vector pair for each graph.

If we now consider the joint graph $G$ composed of the two disjoint graphs $G_1$ and $G_2$, we can immediately see that nothing changes in our consensus protocol:
each subsystem $\dot{\vv x}_i = -L_i\vv x_i$, $i = 1, 2$, will evolve as it did before.

To see how this manifests in the algebraic multiplicity of the 0 eigenvalue of $L = \begin{bmatrix} L_1 & \\ & L_2 \end{bmatrix}$, note that for the composite system with state $\vv x = \begin{bmatrix} \vv x_1 \\ \vv x_2 \end{bmatrix}$, we have the consensus dynamics:

$$
\begin{bmatrix} \dot{\vv x}_1 \\ \dot{\vv x}_2 \end{bmatrix} = \begin{bmatrix} -L_1 & \\ & -L_2 \end{bmatrix} \begin{bmatrix} \vv x_1 \\ \vv x_2 \end{bmatrix},
$$

which has $\lambda_1 = 0$ with $\vv v_1 = \begin{bmatrix} \vv 1 \\ \vv 0 \end{bmatrix}$ and $\lambda_2 = 0$ with $\vv v_2 = \begin{bmatrix} \vv 0 \\ \vv 1 \end{bmatrix}$ so that:

$$
\begin{bmatrix} \vv x_1^* \\ \vv x_2^* \end{bmatrix} = \begin{bmatrix} \vv 1 \\ \vv 0 \end{bmatrix} \text{avg}(\vv x_1(0)) + \begin{bmatrix} \vv 0 \\ \vv 1 \end{bmatrix} \text{avg}(\vv x_2(0)).
$$

This is of course expected, as all we have done is rewrite [](#decouple) using block vectors and matrices --- we have not changed anything about the consensus protocol.
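
The decoupled behavior can also be observed numerically. The sketch below stacks two small Laplacians into a block-diagonal $L$ and evaluates the solution of $\dot{\vv x} = -L\vv x$ at a large time via the eigendecomposition of $-L$. The two component graphs, the initial states, and the helper name `x_of_t` are illustrative assumptions.

```python
import numpy as np

# Hypothetical components: G_1 is a single edge, G_2 is a path on 3 nodes
L1 = np.array([[ 1.0, -1.0],
               [-1.0,  1.0]])
L2 = np.array([[ 1.0, -1.0,  0.0],
               [-1.0,  2.0, -1.0],
               [ 0.0, -1.0,  1.0]])

# Block-diagonal Laplacian of the disjoint union
L = np.block([[L1, np.zeros((2, 3))],
              [np.zeros((3, 2)), L2]])

x0 = np.array([0.0, 4.0, 1.0, 2.0, 6.0])  # first 2 entries: G_1, last 3: G_2

lam, Q = np.linalg.eigh(-L)  # -L is symmetric, so Q is orthogonal

def x_of_t(t):
    """Solution x(t) = Q exp(Lambda t) Q^T x(0)."""
    return Q @ (np.exp(lam * t) * (Q.T @ x0))

x_final = x_of_t(60.0)

print(x_final)        # each block settles at its own average
print(x0[:2].mean())  # average over G_1
print(x0[2:].mean())  # average over G_2
print(x0.mean())      # global average, which is not reached
```

Each group of agents agrees on its own average, while the global average is never reached because no information crosses between the two components.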
