## Basic drawing of a network using NetworkX

NetworkX provides some basic drawing functionality that works for small graphs. We have selected a subset of nodes from the graph for you to practice using NetworkX's drawing facilities. It has been pre-loaded as T_sub.

### Instructions
    - Import matplotlib.pyplot as plt and networkx as nx.
    - Draw T_sub to the screen by using the nx.draw() function, and don't forget to also use plt.show() to display it.

In [None]:
# Import necessary modules
import matplotlib.pyplot as plt
import networkx as nx

# Draw the graph to screen
nx.draw(T_sub)
plt.show()

## Queries on a graph

Now that you know some basic properties of the graph and have practiced using NetworkX's drawing facilities to visualize components of it, it's time to explore how you can query it for nodes and edges. Specifically, you're going to look for "nodes of interest" and "edges of interest". To achieve this, you'll make use of the .nodes() and .edges() methods that Eric went over in the video. The .nodes() method returns a NodeView iterable, while the .edges() method returns an EdgeView iterable of tuples, each holding the two nodes that the edge connects. Recall that passing the keyword argument data=True to these methods also retrieves the metadata associated with the nodes and edges.
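
As a small illustration of what these methods return, the sketch below builds a toy graph and lists its nodes and edges with and without metadata. The graph G and its attribute values are made up for this example; they are not part of the Twitter network T.

```python
from datetime import date

import networkx as nx

# Hypothetical toy graph, used only for illustration
G = nx.Graph()
G.add_node(1, occupation="scientist")
G.add_node(2, occupation="politician")
G.add_edge(1, 2, date=date(2009, 5, 1))

# Without data=True, only the node labels and node pairs are returned
nodes_plain = list(G.nodes())
edges_plain = list(G.edges())

# With data=True, each item also carries its metadata dictionary
nodes_with_data = list(G.nodes(data=True))
edges_with_data = list(G.edges(data=True))

print(nodes_plain)      # [1, 2]
print(edges_plain)      # [(1, 2)]
print(nodes_with_data)  # [(1, {'occupation': 'scientist'}), (2, {'occupation': 'politician'})]
print(edges_with_data)
```

Each node item unpacks as (n, d) and each edge item as (u, v, d), which is why the exercise uses n, d and u, v, d as iterator variables.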

You'll write list comprehensions to effectively build these queries in one line. For a refresher on list comprehensions, refer to Part 2 of DataCamp's Python Data Science Toolbox course. Here's the recipe for a list comprehension:

[ output expression for iterator variable in iterable if predicate expression ].

You have to fill in the _iterable_ and the _predicate expression_. Feel free to prototype your answer by exploring the graph in the IPython Shell before submitting your solution.
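
To make the recipe concrete, here is a minimal sketch in plain Python. The lists numbers and records are invented for this example; records only mimics the (node, metadata) pairs that a graph query would yield.

```python
# Output expression: n ** 2, iterator variable: n,
# iterable: numbers, predicate expression: n % 2 == 0
numbers = [1, 2, 3, 4, 5, 6]
even_squares = [n ** 2 for n in numbers if n % 2 == 0]
print(even_squares)  # [4, 16, 36]

# The same pattern with two iterator variables, unpacking (name, metadata) pairs
records = [
    ("alice", {"occupation": "scientist"}),
    ("bob", {"occupation": "celebrity"}),
    ("carol", {"occupation": "scientist"}),
]
scientists = [name for name, d in records if d["occupation"] == "scientist"]
print(scientists)  # ['alice', 'carol']
```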

### Instructions
    - Use a list comprehension to get a list of nodes from the graph T that have the 'occupation' label of 'scientist'.
        - The output expression n has been specified for you, along with the iterator variables n and d. Your task is to fill in the iterable and the conditional expression.
        - Use the .nodes() method of T to access its nodes, and be sure to specify data=True to obtain the metadata for the nodes.
        - The iterator variable d is a dictionary. The key of interest here is 'occupation' and value of interest is 'scientist'.
    - Use a list comprehension to get a list of edges from the graph T that have existed for at least 6 years, i.e., that were formed before 1 Jan 2010.
        - Your task once again is to fill in the iterable and conditional expression.
        - Use the .edges() method of T to access its edges. Be sure to obtain the metadata for the edges as well.
        - The dates are stored as datetime.date objects in the metadata dictionary d, under the key 'date'. To access the date 1 Jan 2009, for example, the dictionary value would be date(2009, 1, 1).
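
The date comparison used in the edge query can be tried out on its own. The sketch below uses a hand-written list of (u, v, d) tuples in place of a real graph; the node numbers and dates are invented for illustration.

```python
from datetime import date

# Hypothetical edges in the same (u, v, metadata) shape that .edges(data=True) yields
edges = [
    (1, 2, {"date": date(2008, 3, 14)}),
    (2, 3, {"date": date(2012, 7, 1)}),
    (1, 3, {"date": date(2010, 1, 1)}),
]

cutoff = date(2010, 1, 1)

# datetime.date objects support <, so "before the cutoff" is a plain comparison
old_edges = [(u, v) for u, v, d in edges if d["date"] < cutoff]
print(old_edges)  # [(1, 2)]
```

Because the comparison is strict, an edge dated exactly 1 Jan 2010 is not selected.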

In [None]:
# date is needed for the edge query below
from datetime import date

# Use a list comprehension to get the nodes of interest: noi
noi = [n for n, d in T.nodes(data=True) if d['occupation'] == 'scientist']

# Use a list comprehension to get the edges of interest: eoi
eoi = [(u, v) for u, v, d in T.edges(data=True) if d['date'] < date(2010, 1, 1)]

## Specifying a weight on edges

Weights can be added to edges in a graph, typically indicating the "strength" of an edge. In NetworkX, the weight is indicated by the 'weight' key in the metadata dictionary.

Before attempting the exercise, use the IPython Shell to access the dictionary metadata of T and explore it, for instance by running the commands T.edges[1, 10] and then T.edges[10, 1]. Note how there's only one field, and now you're going to add another field, called 'weight'.
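
The sketch below shows this kind of exploration on a toy graph. It assumes an undirected nx.Graph, where both node orders refer to the same edge; in a directed graph, looking up the reverse direction raises a KeyError unless that edge also exists. The graph G and its date value are invented for this example.

```python
from datetime import date

import networkx as nx

# Hypothetical undirected graph with a single edge carrying one metadata field
G = nx.Graph()
G.add_edge(1, 10, date=date(2012, 9, 8))

# Both lookups return the same metadata dictionary for an undirected edge
print(G.edges[1, 10])
print(G.edges[10, 1])

# Adding a second field through one lookup is visible through the other
G.edges[1, 10]["weight"] = 2
print(G.edges[10, 1]["weight"])  # 2
```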

### Instructions
    - Set the 'weight' attribute of the edge between node 1 and 10 of T to be equal to 2. Refer to the following template to set an attribute of an edge: network_name.edges[node1, node2]['attribute'] = value. Here, the 'attribute' is 'weight'.
    - Set the weight of every edge involving node 293 to be equal to 1.1. To do this:
        - Using a for loop, iterate over all the edges of T, including the metadata.
        - If 293 is involved in the list of nodes [u, v]:
            - Set the weight of the edge between u and v to be 1.1.

In [None]:
# Set the weight of the edge
T.edges[1, 10]['weight'] = 2

# Iterate over all the edges (with metadata)
for u, v, d in T.edges(data=True):

    # Check if node 293 is involved
    if 293 in [u, v]:

        # Set the weight to 1.1
        d['weight'] = 1.1

## Checking whether there are self-loops in the graph

As Eric discussed, NetworkX also allows edges that begin and end on the same node. While this would be non-intuitive for a social network graph, it is useful for modeling data such as trip networks, in which a trip can begin and end at the same location.

It is useful to check for self-loops before proceeding with further analyses, and NetworkX provides a function for this purpose: nx.number_of_selfloops(G).

In this exercise as well as later ones, you'll find the assert statement useful. An assertion checks whether the expression placed after it evaluates to True; if it does not, an AssertionError is raised.
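
A minimal sketch of how assert behaves, using an invented list called values:

```python
values = [1, 2, 3]

# A true condition passes silently and execution continues
assert len(values) == 3

# A false condition raises AssertionError; an optional message can follow the comma
try:
    assert len(values) == 4, "expected four values"
except AssertionError as err:
    message = str(err)
    print(message)  # expected four values
```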

To begin, call the nx.number_of_selfloops() function in the IPython Shell, passing in T, to get the number of edges that begin and end on the same node. A number of self-loops have been synthetically added to the graph. Your job in this exercise is to write a function that returns these edges.
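
For a sense of what to expect, the sketch below counts self-loops on a toy graph. The graph G is invented for this example. nx.selfloop_edges() is NetworkX's built-in way to list such edges and is shown only for comparison with the function you will write.

```python
import networkx as nx

# Hypothetical graph with two ordinary edges and two self-loops
G = nx.Graph()
G.add_edges_from([(1, 2), (2, 3), (3, 3), (4, 4)])

# Count the edges whose two endpoints are the same node
n_loops = nx.number_of_selfloops(G)
print(n_loops)  # 2

# List those edges explicitly
loop_edges = sorted(nx.selfloop_edges(G))
print(loop_edges)  # [(3, 3), (4, 4)]
```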

### Instructions
    - Define a function called find_selfloop_nodes() which takes one argument: G.
        - Using a for loop, iterate over all the edges in G (excluding the metadata).
        - If node u is equal to node v:
            - Append u to the list nodes_in_selfloops.
        - Return the list nodes_in_selfloops.
    - Check that the number of self-loops in the graph equals the number of nodes in self-loops. This has been done for you, so hit 'Submit Answer' to see the result!

In [None]:
# Define find_selfloop_nodes()
def find_selfloop_nodes(G):
    """
    Finds all nodes that have self-loops in the graph G.
    """
    nodes_in_selfloops = []

    # Iterate over all the edges of G
    for u, v in G.edges(data=False):

        # Check if node u and node v are the same
        if u == v:

            # Append node u to nodes_in_selfloops
            nodes_in_selfloops.append(u)

    return nodes_in_selfloops

# Check whether number of self loops equals the number of nodes in self loops
assert nx.number_of_selfloops(T) == len(find_selfloop_nodes(T))

## Visualizing using Matrix plots

It is time to try your first "fancy" graph visualization method: a matrix plot. To do this, nxviz provides a matrix() function. This function, like all of nxviz's top-level API functions, will return a matplotlib axes object that can be displayed with plt.show().

nxviz is a package for visualizing graphs in a rational fashion. Under the hood, the matrix function utilizes nx.to_numpy_matrix(G), which returns the matrix form of the graph. Here, each node corresponds to one row and one column, and an edge between two nodes is indicated by a nonzero entry (1 for an edge without a weight). In doing so, however, only the weight metadata is preserved; all other metadata is lost, as you'll verify using an assert statement.

A corresponding nx.from_numpy_matrix(A) allows one to quickly create a graph from a NumPy matrix. The default graph type is Graph(); if you want to make it a DiGraph(), that has to be specified using the create_using keyword argument, e.g. nx.from_numpy_matrix(A, create_using=nx.DiGraph).
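
The round trip can be sketched on a toy graph. This example uses nx.to_numpy_array() and nx.from_numpy_array(), which replace the matrix-based functions in NetworkX 3.0 and later; on older versions the functions named above behave analogously. The graph G, its node names and its 'category' values are invented for illustration.

```python
import networkx as nx

# Hypothetical graph: nodes carry a 'category' field, one edge carries a weight
G = nx.Graph()
G.add_node("a", category="I")
G.add_node("b", category="D")
G.add_node("c", category="P")
G.add_edge("a", "b", weight=2.0)
G.add_edge("b", "c")  # no weight set, so it appears as 1 in the matrix

# Rows and columns follow the node order a, b, c
A = nx.to_numpy_array(G)
print(A)

# Convert back as a directed graph; nodes are relabeled 0, 1, 2
G_conv = nx.from_numpy_array(A, create_using=nx.DiGraph)

# Edge weights survive the round trip, node metadata does not
print(G_conv[0][1]["weight"])  # 2.0
for n, d in G_conv.nodes(data=True):
    assert "category" not in d
```

Each undirected edge becomes two directed edges after the conversion, because the matrix of an undirected graph is symmetric.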

One final note: matplotlib.pyplot and networkx have already been imported as plt and nx, respectively, and the graph T has been pre-loaded. For simplicity and speed, we have sub-sampled only 100 edges from the network.

### Instructions
    - Import matrix from nxviz.
    - Plot the graph T as a matrix plot. To do this:
        - Create the matrix plot called m using the matrix() function with T passed in as an argument.
        - Display the plot using plt.show().
    - Convert the graph to a matrix format, and then convert it back to the NetworkX form from the matrix as a directed graph. This has been done for you.
    - Check that the category metadata field is lost from each node. This has also been done for you, so hit 'Submit Answer' to see the results!

In [None]:
# Import nxviz
from nxviz import matrix

# Create the matrix plot: m
m = matrix(T)

# Display the plot
plt.show()

# Convert T to a matrix format: A
A = nx.to_numpy_matrix(T)

# Convert A back to the NetworkX form as a directed graph: T_conv
T_conv = nx.from_numpy_matrix(A, create_using=nx.DiGraph())

# Check that the `category` metadata field is lost from each node
for n, d in T_conv.nodes(data=True):
    assert 'category' not in d.keys()

## Visualizing using Circos plots

Circos plots are a rational, non-cluttered way of visualizing graph data: nodes are ordered around the circumference of a circle in some fashion, and the edges are drawn inside that circle, giving a visualization of the network's structure that is both beautiful and informative.

In this exercise, you'll continue getting practice with the nxviz API, this time with the circos plot. matplotlib.pyplot has been imported for you as plt.

### Instructions
    - Import circos from nxviz.
    - Plot the Twitter network T as a Circos plot without any styling. Use the circos() function to do this. Don't forget to display it using plt.show().

In [15]:
## Correct Answer (for submitting)
# Import necessary modules
#import matplotlib.pyplot as plt
#from nxviz import circos

# Create the circos plot: c
#c = circos(T)

# Display the plot
#plt.show()

In [25]:
import pandas as pd
import numpy as np
import networkx as nx

df = pd.read_csv("../../data/twitter-followers.csv")
#T = nx.from_pandas_dataframe(df, 'FOLLOWER', 'FOLLOWEE')
T = nx.from_pandas_edgelist(df, 'FOLLOWER', 'FOLLOWEE')

In [None]:
# Import necessary modules
import matplotlib.pyplot as plt
from nxviz import circos

# Create the circos plot: c
c = circos(T)

# Display the plot
plt.show()

## Visualizing using Arc plots

Building on what you've learned about the nxviz API, now try making an Arc plot of the network. Two keyword arguments that you will try here are sort_by='keyX' and node_color_by='keyX', which name the key in the node metadata dictionary by which to order and color the nodes, respectively.

matplotlib.pyplot has been imported for you as plt.

### Instructions
    - Import arc from nxviz.
    - Create an un-customized Arc plot of T. To do this, use the arc() function with just T as the argument.
    - Create another Arc plot of T in which the nodes are ordered and colored by the 'category' key. You'll have to specify the sort_by and node_color_by parameters to do this. For both plots, be sure to display them with plt.show().

In [None]:
# Import necessary modules
import matplotlib.pyplot as plt
from nxviz import arc

# Create the un-customized Arc plot: a
a = arc(T)

# Display the plot
plt.show()

# Create the customized Arc plot: a2
a2 = arc(T, sort_by='category', node_color_by='category')

# Display the plot
plt.show()