# Network Analysis with Python 2

In this part you'll build the conceptual and practical skills to analyze time series of evolving networks. You'll learn about bipartite graphs and how to use them in product recommendation systems, about graph projections and why they are so useful in data science, and about the best ways to store graph data in files and load it back.

In [1]:
import matplotlib.pyplot as plt
%matplotlib inline
import pickle
import networkx as nx

path = 'data/dc24/'

## Definitions and Recap

<img src="images/graphs201.png" alt="" style="width: 400px;"/>

<img src="images/graphs202.png" alt="" style="width: 400px;"/>


In [None]:
# Add the degree centrality score of each node to their metadata dictionary
dcs = nx.degree_centrality(G)
for n in G.nodes():
    G.nodes[n]['centrality'] = dcs[n]  # G.node was removed in NetworkX 2.4; use G.nodes
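
As a small illustration of the cell above, here is a sketch on a toy star graph. The graph and the name `star` are assumptions made for the example, not the course data. Degree centrality is a node's degree divided by the number of other nodes, so the hub of a 4-node star scores 1.0 and each leaf scores 1/3.

```python
import networkx as nx

# Toy graph for illustration only: node 0 is the hub, nodes 1-3 are leaves
star = nx.star_graph(3)

# Degree centrality = degree / (number of nodes - 1)
dcs = nx.degree_centrality(star)

# Store each score in the node's metadata dictionary
for n in star.nodes():
    star.nodes[n]['centrality'] = dcs[n]

print(star.nodes[0]['centrality'])  # hub: 3 neighbours out of 3 other nodes
print(star.nodes[1]['centrality'])  # leaf: 1 neighbour out of 3 other nodes
```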

## Bipartite graphs

<img src="images/graphs203.png" alt="" style="width: 400px;"/>

<img src="images/graphs204.png" alt="" style="width: 400px;"/>

<img src="images/graphs205.png" alt="" style="width: 400px;"/>

<img src="images/graphs206.png" alt="" style="width: 400px;"/>


In [None]:
# https://networkx.github.io/documentation/stable/reference/readwrite/yaml.html
# Note: write_yaml is not available in recent NetworkX releases
# nx.write_yaml(G, 'path_for_yaml_file')

# https://networkx.github.io/documentation/stable/reference/readwrite/gexf.html
# nx.write_gexf(G, 'file_name')

In [11]:
G = nx.read_gexf(path+'G.xml')
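
To see what a GEXF round trip keeps, the sketch below writes a tiny graph to a temporary file and reads it back. The nodes `'u1'` and `'p1'` and the file name `toy.gexf` are hypothetical and stand in for the course file, which is not reproduced here.

```python
import os
import tempfile

import networkx as nx

# Toy bipartite graph (hypothetical nodes, not the course data)
B = nx.Graph()
B.add_node('u1', bipartite='users')
B.add_node('p1', bipartite='projects')
B.add_edge('u1', 'p1')

# Write to a temporary GEXF file, then read it back
with tempfile.TemporaryDirectory() as tmp:
    gexf_path = os.path.join(tmp, 'toy.gexf')
    nx.write_gexf(B, gexf_path)
    B2 = nx.read_gexf(gexf_path)

# Nodes, edges, and the 'bipartite' keyword should come back from the file
print(sorted(B2.nodes()))
print(B2.nodes['u1']['bipartite'])
print(B2.has_edge('u1', 'p1'))
```

The reader may attach extra keys such as a node label, so comparing individual attributes is safer than comparing whole metadata dictionaries.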

In [13]:
type(G.edges())

networkx.classes.reportviews.EdgeView

## The bipartite keyword

The `'bipartite'` keyword is part of a node's metadata dictionary and can be assigned either when you add the node or afterwards. Remember, though, that by definition a node in a **bipartite graph** cannot be connected to another node in the same partition.
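
A minimal sketch of both ways of assigning the keyword, using hypothetical node names (`'user1'`, `'project1'`) rather than the real GitHub data. The final check is one way to confirm that no edge stays inside a single partition.

```python
import networkx as nx

B = nx.Graph()

# Option 1: assign the partition while adding the node
B.add_node('user1', bipartite='users')

# Option 2: add the node first, then set the keyword in its metadata dictionary
B.add_node('project1')
B.nodes['project1']['bipartite'] = 'projects'

B.add_edge('user1', 'project1')

# Every edge should join two nodes from different partitions
crosses = all(
    B.nodes[u]['bipartite'] != B.nodes[v]['bipartite']
    for u, v in B.edges()
)
print(crosses)
```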

Here, you're going to write a function that returns the nodes from a given partition in a **bipartite graph**. In this case, the relevant partitions of the GitHub bipartite graph you'll be working with are `'projects'` and `'users'`.

- Write a function called `get_nodes_from_partition()` which accepts two arguments, a bipartite graph `G` and a `partition` of `G`, and returns just the nodes from that partition.
    - Iterate over all the nodes of `G` (not including the metadata) using a `for` loop.
    - Access the `'bipartite'` keyword of the current node's metadata dictionary. If it equals `partition`, append the current node to the list `nodes`.
- Use your `get_nodes_from_partition()` function together with the `len()` function to:
    - Print the number of nodes in the `'projects'` partition of `G`.
    - Print the number of nodes in the `'users'` partition of `G`.

In [14]:
# Define get_nodes_from_partition()
def get_nodes_from_partition(G, partition):
    # Initialize an empty list for nodes to be returned
    nodes = []
    # Iterate over each node in the graph G
    for n in G.nodes():
        # Check that the node belongs to the particular partition
        if G.nodes[n]['bipartite'] == partition:
            # If so, append it to the list of nodes
            nodes.append(n)
    return nodes

# Print the number of nodes in the 'projects' partition
print(len(get_nodes_from_partition(G, 'projects')))

# Print the number of nodes in the 'users' partition
print(len(get_nodes_from_partition(G, 'users')))

11774
10677
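
The function above assumes every node carries a `'bipartite'` keyword. As a hedged alternative, the sketch below iterates over `G.nodes(data=True)` and uses `dict.get()`, so a node without the keyword is skipped instead of raising a `KeyError`. The function name `get_nodes_from_partition_safe` and the toy graph are assumptions for illustration only.

```python
import networkx as nx

def get_nodes_from_partition_safe(G, partition):
    """Return the nodes whose 'bipartite' keyword equals `partition`."""
    # d is each node's metadata dictionary; .get() returns None if the key is absent
    return [n for n, d in G.nodes(data=True) if d.get('bipartite') == partition]

# Toy graph (hypothetical nodes, not the course data)
B = nx.Graph()
B.add_nodes_from(['u1', 'u2'], bipartite='users')
B.add_node('p1', bipartite='projects')
B.add_node('orphan')  # deliberately has no 'bipartite' keyword
B.add_edges_from([('u1', 'p1'), ('u2', 'p1')])

users = get_nodes_from_partition_safe(B, 'users')
projects = get_nodes_from_partition_safe(B, 'projects')
print(len(users), len(projects))
```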
