# Connected components

[Run notebook in Google Colab](https://colab.research.google.com/github/pathpy/pathpy/blob/master/doc/tutorial/components.ipynb)  
[Download notebook](https://github.com/pathpy/pathpy/raw/master/doc/tutorial/components.ipynb)

A key characteristic of a network is whether it is connected, i.e. whether every pair of nodes is joined by a path. For disconnected networks, where this is not the case, we can compute the so-called connected components, i.e. the maximal connected subgraphs. In this second unit we implement an algorithm to compute connected components and apply it to empirical data sets.

In [None]:
pip install git+https://github.com/pathpy/pathpy.git

In [None]:
import numpy as np
import pathpy as pp
import scipy as sp

Again, we start our tutorial with two simple networks.
The first is an undirected, disconnected network with two connected components.

In [None]:
n_undirected = pp.Network(directed=False)
n_undirected.add_edge('a', 'b')
n_undirected.add_edge('b', 'c')
n_undirected.add_edge('a', 'c')
n_undirected.add_edge('d', 'f')
n_undirected.add_edge('d', 'g')
n_undirected.add_edge('d', 'e')
n_undirected.add_edge('e', 'f')
n_undirected.add_edge('f', 'g')
n_undirected.plot()

The second network is directed and only weakly connected: from the nodes `a` and `b` we can reach `c` and `d`, but there is no path from `c` or `d` back to `a` or `b`.

In [None]:
n_directed = pp.Network(directed=True)
n_directed.add_edge('a', 'b')
n_directed.add_edge('b', 'a')
n_directed.add_edge('a', 'c')
n_directed.add_edge('b', 'c')
n_directed.add_edge('c', 'd')
n_directed.add_edge('d', 'c')
n_directed.plot()
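
To make the notion of a connected component concrete, here is a minimal sketch that finds the components of the undirected example with a breadth-first search. It is not `pathpy`'s implementation: it works on a plain edge list, and the helper name `connected_components` and the adjacency dictionary are assumptions made for illustration.

```python
from collections import defaultdict, deque

# Edge list of the undirected example network
edges = [('a', 'b'), ('b', 'c'), ('a', 'c'),
         ('d', 'f'), ('d', 'g'), ('d', 'e'), ('e', 'f'), ('f', 'g')]


def connected_components(edges):
    """Return a list of node sets, one per connected component (hypothetical helper)."""
    # Build an undirected adjacency structure: each edge is stored in both directions
    neighbors = defaultdict(set)
    for v, w in edges:
        neighbors[v].add(w)
        neighbors[w].add(v)

    visited = set()
    components = []
    for start in sorted(neighbors):
        if start in visited:
            continue
        # Breadth-first search collects every node reachable from `start`
        component = {start}
        visited.add(start)
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for w in neighbors[v]:
                if w not in visited:
                    visited.add(w)
                    component.add(w)
                    queue.append(w)
        components.append(component)
    return components


components = connected_components(edges)
print(components)
```

Each node and each edge is handled a constant number of times, so the running time grows linearly with the size of the network.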

## Computing connected components in `pathpy`

The `find_connected_components` function in `pathpy` returns a dictionary of connected components:

In [None]:
pp.algorithms.components.find_connected_components(n_undirected)

In [None]:
pp.algorithms.components.find_connected_components(n_directed)
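
For directed networks it matters whether edge directions are respected. The following sketch, which is independent of `pathpy`, computes strongly connected components of the directed example by checking mutual reachability. The helper names `reachable` and `strongly_connected_components` are hypothetical, and this quadratic approach is only meant for small examples; linear-time algorithms such as Tarjan's or Kosaraju's are the usual choice for large networks.

```python
from collections import defaultdict

# Directed edge list of the second example network
edges = [('a', 'b'), ('b', 'a'), ('a', 'c'), ('b', 'c'), ('c', 'd'), ('d', 'c')]


def reachable(start, successors):
    """Return the set of nodes reachable from `start` along directed edges."""
    seen = {start}
    stack = [start]
    while stack:
        v = stack.pop()
        for w in successors.get(v, ()):
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen


def strongly_connected_components(edges):
    """Group nodes that can reach each other in both directions (hypothetical helper)."""
    successors = defaultdict(set)
    nodes = set()
    for v, w in edges:
        successors[v].add(w)
        nodes.update((v, w))

    # Reachability set of every node, following edge directions
    reach = {v: reachable(v, successors) for v in nodes}

    components = []
    assigned = set()
    for v in sorted(nodes):
        if v in assigned:
            continue
        # w belongs to v's component if v reaches w and w reaches v
        component = {w for w in reach[v] if v in reach[w]}
        assigned |= component
        components.append(component)
    return components


sccs = strongly_connected_components(edges)
print(sccs)
```

Ignoring edge directions, all four nodes form a single weakly connected component; respecting them splits the network into the two groups `a`, `b` and `c`, `d`.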

We can use the function `largest_connected_component` to extract the largest connected component and return it as a new network object:

In [None]:
lcc = pp.algorithms.components.largest_connected_component(n_undirected)
lcc.plot()

To compute only the size of the largest connected component, i.e. its number of nodes, we can use a dedicated function:

In [None]:
pp.algorithms.components.largest_component_size(n_undirected)
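
Conceptually, the size of the largest component is just the maximum over the sizes of all components. The short sketch below illustrates this on a hand-written dictionary; the assumed shape (component index mapped to a set of node identifiers) is chosen for illustration and may differ from what `pathpy` returns.

```python
# Assumed shape: component index -> set of node identifiers
components = {0: {'a', 'b', 'c'}, 1: {'d', 'e', 'f', 'g'}}

# The largest component is the node set with the most members
largest = max(components.values(), key=len)
largest_size = len(largest)
print(largest_size)  # 4
```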

The largest connected component can be extracted from the directed network in the same way:

In [None]:
lcc = pp.algorithms.components.largest_connected_component(n_directed)
lcc.plot()