# Ch7. Working with Network Data

<div id="toc"></div>

## Unit38_Dissecting Graphs

### Graph Elements, Types, and Density
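
As a minimal sketch of density, assuming an undirected simple graph: density is the number of existing edges divided by the number of possible edges, which is n(n-1)/2 for n nodes. The graph `toy` and its node labels are hypothetical and are not part of the borders data.

```python
import networkx as nx

# Hypothetical toy graph: a triangle A-B-C plus a pendant node D attached to A.
toy = nx.Graph([("A", "B"), ("A", "C"), ("B", "C"), ("A", "D")])

n = toy.number_of_nodes()
m = toy.number_of_edges()

# Undirected simple graph: existing edges / possible edges.
manual_density = 2 * m / (n * (n - 1))

# NetworkX computes the same quantity.
library_density = nx.density(toy)
```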

### Graph Structure

### Centralities

* Degree  
* Closeness  
* Betweenness  
* Eigenvector  
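
The four measures listed above can be compared on a small graph. The sketch below assumes a hypothetical four-node graph (a triangle with one pendant node); the labels are illustrative only. Node `A` touches every other node, so it should rank first on all four measures.

```python
import networkx as nx

# Hypothetical toy graph: triangle A-B-C plus pendant node D attached to A.
toy = nx.Graph([("A", "B"), ("A", "C"), ("B", "C"), ("A", "D")])

degree = nx.degree_centrality(toy)            # share of other nodes each node touches
closeness = nx.closeness_centrality(toy)      # inverse of the mean distance to other nodes
betweenness = nx.betweenness_centrality(toy)  # share of shortest paths passing through a node
eigenvector = nx.eigenvector_centrality(toy)  # high when neighbors are themselves central

# The top-ranked node under each measure.
top = {name: max(scores, key=scores.get)
       for name, scores in [("degree", degree), ("closeness", closeness),
                            ("betweenness", betweenness), ("eigenvector", eigenvector)]}
```

On larger networks the four rankings usually diverge, which is why it helps to compute more than one.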

## Unit39_Network Analysis Sequence

## Unit40_Harnessing NetworkX

### Building and Fixing a Network

In [None]:
import networkx as nx
borders = nx.Graph()
not_borders1 = nx.DiGraph() # Just for our reference
not_borders2 = nx.MultiGraph() # Just for our reference

You can modify an existing network graph by adding or removing individual
nodes or edges, or groups of nodes or edges. When you remove a node, all
incident edges are removed, too. When you add an edge, its end nodes are
added, too, unless they already existed in the graph. You can label nodes
with either numbers or strings:

In [None]:
borders.add_node( "Zimbabwe" )
borders.add_nodes_from([ "Lugandon" , "Zambia" , "Portugal" , "Kuwait" ,
                        "Colombia" ])
borders.remove_node( "Lugandon" )
borders.add_edge( "Zambia" , "Zimbabwe" )
borders.add_edges_from([( "Uganda" , "Rwanda" ), ( "Uganda" , "Kenya" ),
                        ( "Uganda" , "South Sudan" ), ( "Uganda" , "Tanzania" ),
                        ( "Uganda" , "Democratic Republic of the Congo" )])
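
The two side effects described above (adding an edge creates its end nodes, removing a node removes its incident edges) can be checked on a separate scratch graph. This is a small sketch; the name `demo` is hypothetical and is used so that `borders` stays untouched.

```python
import networkx as nx

demo = nx.Graph()

# Neither end node exists yet, so add_edge creates them implicitly.
demo.add_edge("Zambia", "Zimbabwe")
demo.add_edge("Zambia", "Tanzania")
nodes_after_adding = sorted(demo.nodes())

# Removing Zambia also removes both edges incident to it.
demo.remove_node("Zambia")
nodes_after_removal = sorted(demo.nodes())
edges_after_removal = list(demo.edges())
```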

* http://en.wikipedia.org/wiki/List_of_countries_and_territories_by_land_borders

### Exploring and Analyzing a Network

In [None]:
len(borders)

In [None]:
borders.nodes()

In [None]:
borders.nodes # The older alias borders.node is gone from current NetworkX

In [None]:
borders.adj # Adjacency dictionary; replaces borders.edge from NetworkX 1.x

In [None]:
list(borders.edges())[:5] # Edge views cannot be sliced directly

In [None]:
list(borders.neighbors( "Germany" )) # neighbors() returns an iterator

In [None]:
borders.degree( "Poland" )

In [None]:
borders.degree()

In [None]:
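import pandas
# dict() accepts both the degree view (NetworkX 2.x+) and the older dictionary
degrees = pandas.DataFrame(list(dict(borders.degree()).items()),
                           columns=( "country" , "degree" )).set_index( "country" )
degrees.sort_values( "degree" ).tail(4)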

In [None]:
nx.clustering(not_borders1) # Fails for a directed network on older NetworkX releases
nx.clustering(nx.Graph(not_borders1)) # Works on any release: converted to undirected
nx.clustering(borders)

In [None]:
nx.clustering(borders, "Lithuania" )

In [None]:
list(nx.weakly_connected_components(borders)) # Doesn't work: directed graphs only
list(nx.connected_components(borders)) # Works!

In [None]:
[len(x) for x in nx.connected_components(borders)]
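
`nx.connected_component_subgraphs` was removed in NetworkX 2.4. The sketch below shows one way to get the same information from `nx.connected_components`, which yields one set of nodes per component. The graph `toy` is a hypothetical stand-in for `borders`.

```python
import networkx as nx

# Hypothetical stand-in for the borders graph: two components and one isolate.
toy = nx.Graph()
toy.add_edges_from([("Spain", "Portugal"), ("Spain", "France"),
                    ("Zambia", "Zimbabwe")])
toy.add_node("Iceland")  # no land borders, so no edges

# Each component is a set of node labels.
components = list(nx.connected_components(toy))
sizes = sorted((len(c) for c in components), reverse=True)

# Build an independent subgraph for the largest component.
largest = toy.subgraph(max(components, key=len)).copy()
```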

In [None]:
nx.degree_centrality(borders) # Top-ranked: People's Republic of China
nx.in_degree_centrality(borders) # Defined for directed graphs only
nx.out_degree_centrality(borders) # Defined for directed graphs only
nx.closeness_centrality(borders) # Top-ranked: France
nx.betweenness_centrality(borders) # Top-ranked: France
nx.eigenvector_centrality(borders) # Top-ranked: Russia

* http://networkit.iti.kit.edu

* http://gephi.org

### Managing Attributes

In [None]:
# Edge attribute
borders[ "Germany" ][ "Poland" ][ "weight" ] = 456.0
# Node attribute
borders.nodes[ "Germany" ][ "area" ] = 357168
borders.add_node( "Penguinia" , area=14000000)

In [None]:
borders.nodes(data=True)

In [None]:
borders.edges(data=True)
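
Attributes can also be collected into plain dictionaries. The following sketch assumes NetworkX 2.x or later (for `G.nodes[...]`) and uses a hypothetical scratch graph `toy` with the same attribute values as the cells above.

```python
import networkx as nx

toy = nx.Graph()
toy.add_node("Germany", area=357168)
toy.add_node("Penguinia", area=14000000)
toy.add_edge("Germany", "Poland", weight=456.0)  # Poland is created without attributes

# Map each node (or edge) that has the attribute to its value.
areas = nx.get_node_attributes(toy, "area")
weights = nx.get_edge_attributes(toy, "weight")

# Nodes lacking an attribute are simply absent from the mapping.
poland_area = toy.nodes["Poland"].get("area")
```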

### Cliques and Community Structure

In [None]:
nx.find_cliques(not_borders1) # Not implemented for digraphs!
nx.find_cliques(nx.Graph(not_borders1)) # Would work!
list(nx.find_cliques(borders))

In [None]:
list(nx.isolates(borders)) # isolates() returns an iterator

In [None]:
import community # Provided by the python-louvain package
partition = community.best_partition(borders)

In [None]:
community.modularity(partition, borders)
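
To see what a modularity score measures, here is a sketch that swaps in NetworkX's own `modularity` function in place of python-louvain. The assumptions: the graph `toy` and both partitions are hypothetical, and the NetworkX function takes a list of node sets, while python-louvain takes a node-to-community dictionary.

```python
import networkx as nx
from networkx.algorithms import community as nx_comm  # NetworkX's module, not python-louvain

# Hypothetical toy graph: two triangles joined by a single bridge edge (2, 3).
toy = nx.Graph([(0, 1), (0, 2), (1, 2),
                (3, 4), (3, 5), (4, 5),
                (2, 3)])

# A partition that follows the two triangles...
natural = [{0, 1, 2}, {3, 4, 5}]
# ...and one that cuts across them.
scrambled = [{0, 1, 3}, {2, 4, 5}]

q_natural = nx_comm.modularity(toy, natural)
q_scrambled = nx_comm.modularity(toy, scrambled)
```

The partition that keeps densely connected nodes together scores higher, which is the quantity that `community.best_partition` tries to maximize.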

### Input and Output

In [None]:
# Pajek files conventionally use the .net extension
with open( "borders.net" , "wb" ) as netfile:
    nx.write_pajek(borders, netfile)
with open( "borders.net" , "rb" ) as netfile:
    borders = nx.read_pajek(netfile) # Returns a multigraph
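
A round trip through the Pajek format can be sketched as follows. The graph `toy`, the temporary directory, and the file name `borders_demo.net` are assumptions made for illustration. Because `read_pajek` returns a multigraph, the result is converted back with `nx.Graph`.

```python
import os
import tempfile

import networkx as nx

toy = nx.Graph()
toy.add_edge("Zambia", "Zimbabwe")
toy.add_edge("Uganda", "Kenya")

# Write to a throwaway location; write_pajek accepts a path as well as a file object.
path = os.path.join(tempfile.mkdtemp(), "borders_demo.net")
nx.write_pajek(toy, path)

# read_pajek returns a multigraph, so convert back to a simple graph.
loaded = nx.Graph(nx.read_pajek(path))
```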

* http://gephi.org/users/supported-graph-formats/

## Your Turn

* http://www.slideshare.net/DmitryZinoviev/desdemona-52994413  
* http://snap.stanford.edu/data/soc-Epinions1.html  
* http://shakespeare.mit.edu  