# The NetworkX Module

NetworkX is a Python module. To start exploring NetworkX, we simply need to start a Python session (like the IPython session you are in now!) and type:

In [None]:
import networkx

All of NetworkX's data structures and functions can then be accessed using the syntax `networkx.[Object]`, where `[Object]` is the function or data structure you need. For example, to make a graph, we'd write:

In [None]:
G = networkx.Graph()

Usually, to save ourselves some keystrokes, we'll import NetworkX under a shorter name:

In [None]:
import networkx as nx

# Basic Graph Data Structures

One of the main strengths of NetworkX is its flexible graph data structures. There are four of them:
 - `Graph`: Undirected Graphs
 - `DiGraph`: Directed Graphs
 - `MultiGraph`: Undirected multigraphs, i.e. graphs which allow for multiple edges between the same pair of nodes
 - `MultiDiGraph`: Directed Multigraphs
 
Each of these has the same basic structure, attributes and features, with a few minor differences.

# Creating Graphs

Creating Graphs is as simple as calling the appropriate constructor.

In [None]:
G = nx.Graph()
D = nx.DiGraph()
M = nx.MultiGraph()
MD = nx.MultiDiGraph()

You can also add attributes to a graph during creation, either by passing keyword arguments directly or by unpacking a dictionary into keyword arguments.

In [None]:
import datetime as dt

G1 = nx.Graph(date_created=dt.date.today(), name="Example graph")

In [None]:
graph_dict = { "version":0.1, "created_by":"John Smith", "tags":{"social", "community", "network"} }

G2 = nx.Graph(**graph_dict)

In [None]:
G1.graph

In [None]:
G2.graph

The graph attribute is just a dictionary and can be treated as one, so you can add and delete more information from it.

In [None]:
G2.graph["validated"] = False # add new graph attribute

G2.graph["version"] = 0.2 # modify existing graph attribute

del G2.graph["created_by"] # delete graph attribute

In [None]:
G2.graph

## Nodes

Next we'll cover how to add and remove nodes, how to check for their existence in a graph, and how to attach attributes to them.

### Adding Nodes

There are two main functions for adding nodes: `add_node` and `add_nodes_from`. The former takes a single value, and the latter takes any iterable (list, set, iterator, generator). Nodes can be of any _hashable_ type (other than `None`), which in practice means immutable values: numbers (ints, floats, complex), strings, bytes, tuples, or frozensets. They cannot be _mutable_ containers such as lists, dictionaries, or sets. Nodes in the same graph do not have to be of the same type.

In [None]:
G = nx.Graph()

# Adding single nodes of various types
G.add_node(0)
G.add_node("A")
G.add_node(("PI", 3.141592))
G.add_node(frozenset(["apples", "oranges", "grapes", "onions"]))

In [None]:
# Adding collections of nodes
G.add_nodes_from([2, 4, 6, 8, 10])
G.add_nodes_from([1/x for x in range(1, 5)])
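
The hashability requirement can be checked directly. The following is a minimal sketch on a fresh graph (the name `H` is ours, chosen so the `G` used in this section is left alone): adding a list fails, while a tuple with the same contents works.

```python
import networkx as nx

H = nx.Graph()

# A list is mutable and therefore unhashable, so it cannot be a node.
try:
    H.add_node([1, 2, 3])
    list_rejected = False
except TypeError:
    list_rejected = True

# A tuple with the same contents is hashable and is accepted.
H.add_node((1, 2, 3))

print(list_rejected)
print(list(H.nodes))
```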

### Listing Nodes

Accessing nodes is done using the `nodes` property which is a member of the `Graph` object.

In [None]:
G.nodes

The `nx.Graph.nodes` property in NetworkX returns a `NodeView` object. This object has some interesting properties:


* **View of Nodes:** It provides a *view* of the nodes in the graph. This means it's a dynamic representation; if you add or remove nodes from the graph, the `NodeView` object will reflect those changes.
* **Iterable:** You can iterate over the `NodeView` to access each node in the graph (e.g., using a `for` loop or converting it to a list).
* **Supports Membership Testing:** You can use the `in` operator to check if a specific node exists in the graph (e.g., `'A' in G.nodes`).
* **Supports Length:** You can use the `len()` function to get the number of nodes in the graph.
* **Node Attributes Access:** If nodes have attributes associated with them, you can access these attributes through the `NodeView`. For example, `G.nodes['A']` will return a dictionary of attributes for node `'A'`. You can also access all node attributes using `G.nodes.data()`.
* **Ordered (in Python 3.7+):** `NodeView` wraps the graph's internal node dictionary, and since Python 3.7 dictionaries maintain insertion order. So, iterating over `G.nodes` will yield nodes in the order they were added to the graph (though you shouldn't rely on a specific order unless it's explicitly important for your algorithm).

In essence, `nx.Graph.nodes` gives you an efficient and dynamic way to interact with the set of nodes in your NetworkX graph, allowing you to iterate, check for existence, get the count, and access node-specific information.
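
As a small illustration of the "dynamic view" behaviour described above, here is a sketch on a throwaway graph (`H` and `view` are names we made up for the example). The view is obtained once, the graph is changed afterwards, and the same view object reflects the change.

```python
import networkx as nx

H = nx.Graph()
H.add_nodes_from(["A", "B"])

view = H.nodes            # grab the view once
H.add_node("C")           # mutate the graph afterwards

nodes_seen = list(view)   # the existing view now includes the new node
has_c = "C" in view       # membership test
count = len(view)         # number of nodes

print(nodes_seen, has_c, count)
```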

Sometimes, to save memory, we want to visit nodes one at a time instead of building a full list. Iterating directly over `G.nodes` (or `G.nodes()`) does exactly that, which is especially useful in long-running loops.

In [None]:
for n in G.nodes():
    # Report strings by name and everything else by its type
    if isinstance(n, str):
        print(f"{n} is a string")
    else:
        print(f"{n} is {type(n)}")

### Checking whether nodes are in a Graph

We can also check whether a graph has a node in several different ways. The easiest is just using the `in` keyword in Python, but there is also the `has_node` function.

In [None]:
2 in G

In [None]:
3 in G

In [None]:
G.has_node("A")

In [None]:
G.has_node("B")

### Node attributes

You can also add attributes to nodes, which is handy for storing information about nodes within the graph object. Attributes can be set when you create new nodes by passing keyword arguments to the `add_node` and `add_nodes_from` functions.

In [None]:
G = nx.Graph()

G.add_node("sprite", company="Coca-Cola Co.", food="soft drink")

When using `add_nodes_from`, you provide tuples whose first element is the node and whose second element is a dictionary of attributes for that node. You can also use keyword arguments to add attributes that are applied to every node added in the call.

In [None]:
G.add_nodes_from(
    [
        ("chunky monkey", {"company": "Ben & Jerrys", "food":"ice cream"}),
        ("oreos", {"company": "Mondelēz International"}),
        ("heineken", {"firma": "Heineken N.V."})
    ], allergens=None)
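
When a per-node dictionary and a keyword argument set the same attribute, the per-node value takes precedence. The sketch below uses made-up nodes and a hypothetical `caffeine` attribute on a separate graph `H` to show this.

```python
import networkx as nx

H = nx.Graph()
H.add_nodes_from(
    [
        ("tea", {"caffeine": True}),  # per-node value
        ("water", {}),                # no per-node value
    ],
    caffeine=False,                   # applied to every node added in this call
)

tea_caffeine = H.nodes["tea"]["caffeine"]      # per-node value wins
water_caffeine = H.nodes["water"]["caffeine"]  # falls back to the keyword value

print(tea_caffeine, water_caffeine)
```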


To list node attributes, pass the `data=True` keyword when calling `G.nodes` (or, equivalently, call `G.nodes.data()`).

In [None]:
G.nodes(data=True)

`NodeView` (obtained via `G.nodes`) provides an iterable view of only the node labels within a graph, allowing for simple iteration and membership testing of nodes. `NodeDataView` (obtained via `G.nodes.data()`) offers an iterable view of the nodes paired with their attribute dictionaries, enabling direct access to node properties during iteration as tuples of (node, attribute dictionary). Essentially, `NodeView` focuses on the existence and identity of nodes, while `NodeDataView` extends this by providing immediate access to the associated data for each node.
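
The difference between the two views can be seen in a short sketch. The graph `H` and the `"unknown"` default are assumptions made for the example. Besides `data=True`, `G.nodes.data` also accepts an attribute name, in which case it yields `(node, value)` pairs.

```python
import networkx as nx

H = nx.Graph()
H.add_node("sprite", company="Coca-Cola Co.")
H.add_node("tap water")  # no attributes at all

# data=True pairs each node with its full attribute dictionary
full = list(H.nodes(data=True))

# Passing an attribute name yields (node, value) pairs instead;
# `default` fills in for nodes that lack the attribute.
companies = {n: c for n, c in H.nodes.data("company", default="unknown")}

print(full)
print(companies)
```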

Attributes are stored in a dictionary-like structure on the graph called `nodes`. You can access, edit, and remove attributes there.

In [None]:
# list all nodes
G.nodes

In [None]:
# lookup a single node's attributes
G.nodes["sprite"]

In [None]:
G.nodes["sprite"]["available"] = False # add new attribute to a node

G.nodes["sprite"]["food"] = "soda" # modify existing attribute of a node

del G.nodes["sprite"]["company"] # delete an attribute of a node

In [None]:
G.nodes["sprite"]

Similarly, you can remove nodes with the `remove_node` and `remove_nodes_from` functions.

In [None]:
G = nx.Graph()

G.add_nodes_from(range(1, 10))

G.remove_node(9)
G.remove_nodes_from([1, 2, 3, 5, 7])

In [None]:
G.nodes()

### Exercises

#### Repeated Nodes

1. What happens when you add nodes to a graph that already exist?
2. What happens when you add nodes to the graph that already exist but have new attributes?
3. What happens when you add nodes to a graph with attributes different from existing nodes?
4. Try removing a node that doesn't exist. What happens?

In [None]:
...

#### The FizzBuzz Graph

Make a new graph, `FizzBuzz`. Add nodes labeled 0 to 100 to the graph. Each node should have the attributes `fizz` and `buzz`. If the node's label is divisible by 3, set `fizz=True`; if it is divisible by 5, set `buzz=True`. Otherwise the attribute is `False`.

In [None]:
...

## Edges

Adding edges is similar to adding nodes. They can be added using either `add_edge` or `add_edges_from`, and they can have attributes in the same way nodes can. If you add an edge that includes a node that doesn't exist yet, the node is created for you.

In [None]:
G1 = nx.Graph()

G1.add_edge("bacon", "eggs", breakfast=True)
G1.add_edge("orange juice", "coffee", breakfast=True)
G1.add_edge("soup", "salad", breakfast=False)

G1.nodes, G1.edges

A simple way to create many edges at once is to use a list comprehension.

In [None]:
G2 = nx.Graph()

G2.add_edges_from(
    [
        (i, i+2) 
        for i in range(2, 20, 2)
    ]
)
G2.edges

Similarly to nodes, graph edges can be accessed either by `Graph.edges` or `Graph.edges()`. Both produce an object of the type `EdgeView`. `EdgeView` provides a dynamic view of the edges in a graph, represented as tuples `(u, v)` of the connected nodes. For undirected graphs the order of `u` and `v` carries no meaning, while for directed graphs the edge points from `u` to `v`. This view is iterable, allowing you to easily loop through all the edges in the graph, and it dynamically reflects any additions or removals of edges. For graphs with edge attributes, you can access these attributes by iterating through `G.edges(data=True)`, which yields an object of the type `EdgeDataView` whose items are tuples of `(u, v, attributes_dictionary)`.

In [None]:
G2.edges()

In [None]:
G1.edges, G1.edges(data=True)
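
To make the edge views more concrete, here is a brief sketch on a separate graph `H` (our name, so `G1` is left untouched). It filters edges by an attribute and also requests a single attribute by name through the `data` argument.

```python
import networkx as nx

H = nx.Graph()
H.add_edge("bacon", "eggs", breakfast=True)
H.add_edge("soup", "salad", breakfast=False)

# With data=True each item is a (u, v, attribute_dict) tuple
breakfast_pairs = [(u, v) for u, v, attrs in H.edges(data=True) if attrs["breakfast"]]

# Requesting one attribute by name yields (u, v, value) tuples
flags = list(H.edges(data="breakfast", default=None))

# In an undirected graph, membership does not depend on endpoint order
either_order = ("eggs", "bacon") in H.edges and ("bacon", "eggs") in H.edges

print(breakfast_pairs, flags, either_order)
```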

If you want to limit the list of edges to those touching a subset of nodes, pass this subset as an argument to `G.edges()`. The result contains every edge incident to at least one node in the subset.

In [None]:
G2.edges(range(10))

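Removing edges is accomplished with the `remove_edge` or `remove_edges_from` functions. Edge attributes can be read, changed, or removed by indexing into the graph: `G[u]` gives the neighbors of `u`, and `G[u][v]` gives the attribute dictionary of the edge between `u` and `v`.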

In [None]:
G1["eggs"]

In [None]:
G1["eggs"]["bacon"]

In [None]:
G1["bacon"]["eggs"]

In [None]:
del G1["eggs"]["bacon"]["breakfast"] # delete edge attribute

G1.edges(data=True)

In [None]:
G1.remove_edge("eggs", "bacon") # delete an edge

In [None]:
G1.edges(data=True)
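
`remove_edges_from` takes an iterable of edges and skips any that are not in the graph. A minimal sketch with made-up integer nodes on a separate graph `H` also shows that removing edges leaves the endpoint nodes in place.

```python
import networkx as nx

H = nx.Graph()
H.add_edges_from([(1, 2), (2, 3), (3, 4)])

# (7, 8) is not in the graph; remove_edges_from skips it without raising
H.remove_edges_from([(1, 2), (7, 8)])

remaining_edges = list(H.edges)
remaining_nodes = list(H.nodes)  # node 1 is still present, just isolated

print(remaining_edges, remaining_nodes)
```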

You can check for the existence of edges with `has_edge`, or with the `in` keyword on `G.edges`.

In [None]:
("eggs", "bacon") in G1.edges, ("orange juice", "coffee") in G1.edges, G1.has_edge("soup", "salad")

For directed graphs, ordering matters: `add_edge(u, v)` adds an edge from `u` to `v`, but not from `v` to `u`.

In [None]:
D = nx.DiGraph()

D.add_nodes_from(range(10))
D.add_edges_from([(i,i+2) for i in range(8)])

D.edges()

In [None]:
D.has_edge(0, 2), D.has_edge(2, 0)

### Exercises

For the `FizzBuzz` graph, add an edge between two nodes `u` and `v` if they are both divisible by 2 or both divisible by 7. Each edge should include the attributes `div2` and `div7`, which are true if `u` and `v` are both divisible by 2 and by 7, respectively. Exclude self loops.

In [None]:
...

## Multigraphs

Multigraphs can have multiple edges between any two nodes. Parallel edges between the same pair of nodes are told apart by a key.

In [None]:
M = nx.MultiGraph()

M.add_edge(0,1)
M.add_edge(0,1)

In [None]:
M.edges()

The keys of the edges can be accessed by using the keyword `keys=True`. This gives tuples of the form `(u, v, k)`, where `u` and `v` are the endpoints of the edge and `k` is its key.

In [None]:
M.edges(keys=True)

`MultiGraph` and `MultiDiGraph` objects are similar to `Graph` and `DiGraph` objects in most respects.
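
Edge keys are easiest to understand by example. In the sketch below (the graph `H`, the `"ferry"` key, and the `km` attribute are all made up for illustration), `add_edge` returns the key it assigned. That key can later be used to look up or remove one specific parallel edge.

```python
import networkx as nx

H = nx.MultiGraph()

k0 = H.add_edge("a", "b")                      # add_edge returns the key it assigned
k1 = H.add_edge("a", "b")                      # a second parallel edge gets a new key
k2 = H.add_edge("a", "b", key="ferry", km=12)  # keys can also be chosen explicitly

parallel = H.number_of_edges("a", "b")         # parallel edges between a and b
ferry_attrs = dict(H["a"]["b"]["ferry"])       # attributes of one specific edge

H.remove_edge("a", "b", key=k0)                # remove exactly one of the parallel edges
keys_left = set(H["a"]["b"])

print(k0, k1, k2, parallel, ferry_attrs, keys_left)
```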

## Adding Graph Motifs

In addition to adding nodes and edges one at a time, `networkx` has some convenient functions for adding whole structures such as cycles, stars, and paths in one call.

In [None]:
Gc = nx.Graph()
Gs = nx.Graph()
Gp = nx.Graph()

nx.add_cycle(Gc, range(10))
nx.add_star(Gs, range(10))
nx.add_path(Gp, range(10))

print(f"Cycle: {Gc.edges()}")
print(f"Star: {Gs.edges()}")
print(f"Path: {Gp.edges()}")

# Basic Graph Properties

Basic graph properties are available as methods of the `Graph` class itself. We'll explore different metrics in part III.

## Node and Edge Counts

The _order_ of a graph is the number of nodes. It can be accessed by calling `G.order()` or by using the builtin length function: `len(G)`.

In [None]:
G = nx.Graph()
nx.add_star(G, range(10))

G.order(), len(G)

The number of edges is usually referred to as the _size_ of the graph, and can be accessed by `G.size()`. You could also find out by calling `len(G.edges())`, but this may be slower.

In [None]:
G.size(), len(G.edges)

The number of nodes and edges can also be obtained by explicitly calling the functions `number_of_nodes` and `number_of_edges`.

In [None]:
G.number_of_nodes(), G.number_of_edges()

For multigraphs, `size()` counts the number of edges including multiplicity.

In [None]:
M.size()

## Node Neighbors

Node neighbors can be accessed via the `neighbors` function, which returns an iterator.

In [None]:
G.edges(0)

In [None]:
list(G.neighbors(0)), list(G.neighbors(1))

In the case of directed graphs, `neighbors` reports only successors, i.e. the nodes at the far end of edges that originate at the given node.

In [None]:
list(range(1, 10, 2)), list(range(2, 10, 2))

In [None]:
D = nx.DiGraph()

D.add_edges_from([(0, i) for i in range(1, 10, 2)])
D.add_edges_from([(i, 0) for i in range(2, 10, 2)])

list(D.neighbors(0)), list(D.neighbors(1)), list(D.neighbors(2))
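
To get the neighbors in the other direction, `DiGraph` also provides `predecessors`, alongside `successors`. The following sketch uses a small made-up graph `H` to contrast the two.

```python
import networkx as nx

H = nx.DiGraph()
H.add_edges_from([(0, 1), (0, 3), (2, 0), (4, 0)])

out_nbrs = sorted(H.successors(0))    # targets of edges leaving node 0
in_nbrs = sorted(H.predecessors(0))   # sources of edges entering node 0

# For a DiGraph, neighbors() reports the same nodes as successors()
same = sorted(H.neighbors(0)) == out_nbrs

print(out_nbrs, in_nbrs, same)
```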

For multigraphs, neighbors are only reported once.

In [None]:
list(M.neighbors(0))

## Degree

`DegreeView` (obtained via `G.degree`) provides a dynamic view of the degree of each node in the graph. For each node, it presents a pair consisting of the node label and its corresponding degree (the number of edges connected to that node). This view is iterable, allowing you to efficiently iterate through the degree of all nodes, and it updates automatically as the graph structure changes. For directed graphs, you can access in-degree and out-degree using `G.in_degree` and `G.out_degree`, respectively, which return analogous view objects (`InDegreeView` and `OutDegreeView`).

You can use these as properties (`G.degree`) or call them like functions (`G.degree()`).

In [None]:
G.degree

In [None]:
D.in_degree

In [None]:
D.out_degree

Each of these views can be called with a single node or a subset of nodes if not all degrees are needed.

In [None]:
D.in_degree(5), D.out_degree([0, 1, 2])
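
Because a degree view yields `(node, degree)` pairs, it converts easily into a dictionary or can be passed to functions like `max`. Here is a minimal sketch on a small star graph (`H` is a name chosen for the example):

```python
import networkx as nx

H = nx.Graph()
nx.add_star(H, range(5))  # node 0 is the hub, nodes 1-4 are leaves

degrees = dict(H.degree)  # {node: degree}

# Find the node with the largest degree by comparing the second item of each pair
hub, hub_degree = max(H.degree, key=lambda pair: pair[1])

print(degrees, hub, hub_degree)
```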

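You can also calculate the weighted degree of a node, which is the sum of the weights of its incident edges. To do this, each edge needs an attribute that holds its weight, and you pass the name of that attribute as `weight`.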

In [None]:
WG = nx.Graph()

nx.add_star(WG, range(10))

for (u, v) in WG.edges:
    WG[u][v]["weight"] = (u + v) / 2

In [None]:
WG.edges(data=True)

In [None]:
WG.degree(weight="weight")
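
The weighted degree of a node is the sum of the weights on its incident edges. The sketch below rebuilds the same star on a separate graph `H` and checks the hub's weighted degree against a sum computed by hand. Every edge is `(0, v)` with weight `(0 + v) / 2`, so the hub's total is `(1 + 2 + ... + 9) / 2`.

```python
import networkx as nx

H = nx.Graph()
nx.add_star(H, range(10))
for u, v in H.edges:
    H[u][v]["weight"] = (u + v) / 2

# Weighted degree of the hub as reported by NetworkX
hub_weighted = H.degree(0, weight="weight")

# The same quantity computed directly from the edge weights
by_hand = sum((0 + v) / 2 for v in range(1, 10))

# A leaf has a single edge, so its weighted degree is that edge's weight
leaf_weighted = H.degree(4, weight="weight")

print(hub_weighted, by_hand, leaf_weighted)
```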

# Exercises

## Exercise: Social Network Analysis

* **Description:**
    * Create an undirected graph called `social_network` to represent friendships in a small community.
    * Add nodes representing people: "Alice", "Bob", "Charlie", "David", "Emily".
    * Add edges to represent friendships: Alice is friends with Bob and Charlie, Bob is friends with Charlie and David, Charlie is friends with David and Emily, and David is friends with Emily.
    * Add an attribute "age" to each person with the following values: Alice (25), Bob (30), Charlie (28), David (22), Emily (27).

* **Tasks:**
    1.  Print the number of people (nodes) in the network.
    2.  List all of Alice's friends (neighbors).
    3.  Calculate and print the average age of the people in the network.

## Exercise: Task Dependency Graph

* **Description:**
    * Create a directed graph called `task_dependency_graph` to represent tasks in a project and their dependencies.
    * Add nodes representing tasks: "Task A", "Task B", "Task C", "Task D".
    * Add directed edges to represent dependencies: Task A must be completed before Task B, Task B before Task C, and Task A before Task D.
    * Add an attribute "time_estimate" (in days) to each task: Task A (5 days), Task B (3 days), Task C (4 days), Task D (6 days).

* **Tasks:**
    1.  Print the total number of tasks (nodes) and dependencies (edges).
    2.  Determine and print how many tasks need to be completed before "Task C" can start.
    3.  Calculate and print the total estimated time for all tasks in the project.

## Exercise: Analyzing a Small Network

* **Description:**
    * Create an undirected graph called `small_network` and add the following edges: (1, 2), (2, 3), (3, 4), (4, 1), (1, 5), (2, 6).


* **Tasks:**
    1.  Draw the graph on paper.
    2.  Determine the number of nodes and edges in the graph programmatically.
    3.  Calculate the degree of node 1 and node 2 programmatically.