# Analyzing the Wikipedia voting network

Download the Wikipedia voting network wiki-Vote.txt.gz from http://snap.stanford.edu/data/wiki-Vote.html.

Using one of the network analysis tools above, load the Wikipedia voting network. Note that the Wikipedia voting network is directed. Formally, we consider it as a directed graph G = (V, E), with node set V and edge set E ⊂ V × V (edges are ordered pairs of nodes). An edge (a, b) ∈ E means that user a voted on user b.

To make our questions clearer, we will use the following small graph as a running example: Gsmall = (Vsmall, Esmall), where Vsmall = {1, 2, 3} and Esmall = {(1, 2), (2, 1), (1, 3), (1, 1)}.

Compute and print out the following statistics for the wiki-Vote network: 

1. The number of nodes in the network. (Gsmall has 3 nodes.)

2. The number of nodes with a self-edge (self-loop), i.e., the number of nodes a ∈ V where (a, a) ∈ E. (Gsmall has 1 self-edge.)

3. The number of directed edges in the network, i.e., the number of ordered pairs (a, b) ∈ E for which a ≠ b. (Gsmall has 3 directed edges.)

4. The number of undirected edges in the network, i.e., the number of unique unordered pairs (a, b), a ≠ b, for which (a, b) ∈ E or (b, a) ∈ E (or both). If both (a, b) and (b, a) are edges, this counts as a single undirected edge. (Gsmall has 2 undirected edges.)

5. The number of reciprocated edges in the network, i.e., the number of unique unordered pairs of nodes (a, b), a ≠ b, for which (a, b) ∈ E and (b, a) ∈ E. (Gsmall has 1 reciprocated edge.)

6. The number of nodes of zero out-degree. (Gsmall has 1 node with zero out-degree.)

7. The number of nodes of zero in-degree. (Gsmall has 0 nodes with zero in-degree.)

8. The number of nodes with more than 10 outgoing edges (out-degree > 10).

9. The number of nodes with fewer than 10 incoming edges (in-degree < 10).

Each sub-question is worth 3 points.
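
As a sanity check before running on the full dataset, the nine statistics can be computed for Gsmall with plain Python sets. This is only a minimal sketch, not the SNAP-based solution: the names `graph_stats`, `small_nodes`, and `small_edges` are hypothetical, and the sketch assumes that a self-loop counts toward both the in-degree and the out-degree of its node.

```python
def graph_stats(nodes, edges, threshold=10):
    """Compute the nine statistics for a directed graph given as sets."""
    self_loops = {(a, b) for (a, b) in edges if a == b}
    directed = {(a, b) for (a, b) in edges if a != b}
    # frozenset collapses (a, b) and (b, a) into one unordered pair.
    undirected = {frozenset(e) for e in directed}
    reciprocated = {
        frozenset((a, b)) for (a, b) in directed if (b, a) in directed
    }

    # Assumption: degrees count every edge in E, including self-loops.
    out_deg = {n: 0 for n in nodes}
    in_deg = {n: 0 for n in nodes}
    for a, b in edges:
        out_deg[a] += 1
        in_deg[b] += 1

    return {
        "nodes": len(nodes),
        "self_loops": len(self_loops),
        "directed_edges": len(directed),
        "undirected_edges": len(undirected),
        "reciprocated_edges": len(reciprocated),
        "zero_out_degree": sum(1 for d in out_deg.values() if d == 0),
        "zero_in_degree": sum(1 for d in in_deg.values() if d == 0),
        "out_degree_gt": sum(1 for d in out_deg.values() if d > threshold),
        "in_degree_lt": sum(1 for d in in_deg.values() if d < threshold),
    }


# The running example from the problem statement.
small_nodes = {1, 2, 3}
small_edges = {(1, 2), (2, 1), (1, 3), (1, 1)}
small_stats = graph_stats(small_nodes, small_edges)
for name, value in small_stats.items():
    print(name, value)
```

The first seven values should match the Gsmall answers quoted in the list above (3, 1, 3, 2, 1, 1, 0).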

In [10]:
import snap
import numpy as np

In [2]:
from google.colab import drive

In [3]:
drive.mount('/content/drive/')

Drive already mounted at /content/drive/; to attempt to forcibly remount, call drive.mount("/content/drive/", force_remount=True).


In [5]:
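# Load the edge list as a directed graph; columns 0 and 1 hold the source and destination node IDs.
G = snap.LoadEdgeList(snap.PNGraph, "/content/drive/MyDrive/Colab Notebooks/Wiki-Vote.txt", 0, 1)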
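
For readers without SNAP at hand, an edge list of this kind can also be read with plain Python. The sketch below assumes the file layout is a few comment lines starting with `#`, followed by one whitespace-separated source/destination pair of integer node IDs per line. The function name `parse_edge_list` and the inline sample (which reuses the Gsmall edges, not real wiki-Vote data) are hypothetical.

```python
import io


def parse_edge_list(lines):
    """Return (nodes, edges) parsed from an iterable of text lines."""
    nodes, edges = set(), set()
    for line in lines:
        line = line.strip()
        # Skip blank lines and '#' comment lines (assumed header format).
        if not line or line.startswith("#"):
            continue
        src, dst = map(int, line.split()[:2])
        nodes.update((src, dst))
        edges.add((src, dst))
    return nodes, edges


# Hypothetical sample standing in for the real file contents.
sample = io.StringIO("# FromNodeId\tToNodeId\n1\t2\n2\t1\n1\t3\n1\t1\n")
nodes, edges = parse_edge_list(sample)
print(len(nodes), len(edges))
```

For the real dataset, an open file handle on the decompressed wiki-Vote.txt could be passed in place of `sample`.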

In [9]:
# Problem (1)

G.GetNodes()

7115

In [11]:
# Problem (2)

snap.CntSelfEdges(G)

0

In [13]:
# Problem (3)

snap.CntUniqDirEdges(G)

103689

In [15]:
# Problem (4)

snap.CntUniqUndirEdges(G)

100762

In [14]:
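# Problem (5)

# A reciprocated pair adds two directed edges but only one undirected edge,
# so the difference of the two counts is the number of reciprocated pairs.
snap.CntUniqDirEdges(G) - snap.CntUniqUndirEdges(G)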

2927
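
The subtraction used for Problem (5) rests on a short counting argument, sketched here. The symbols are introduced only for this note: D and U are the directed and undirected edge counts from Problems (3) and (4), R is the number of reciprocated pairs, and O is the number of pairs connected in one direction only. The argument assumes self-loops are excluded from both D and U, as the problem definitions require.

```latex
\begin{aligned}
D &= 2R + O, \\
U &= R + O, \\
D - U &= R, \\
103689 - 100762 &= 2927 .
\end{aligned}
```

On Gsmall the same identity gives 3 − 2 = 1, matching the stated answer of one reciprocated edge.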

In [16]:
# Problem (6)

snap.CntOutDegNodes(G,0)

1005

In [19]:
# Problem (7)

snap.CntInDegNodes(G,0)

4734

In [21]:
# Problems (8) and (9)

# Count nodes with out-degree > 10 and nodes with in-degree < 10 in a single pass.
node_ten_out = 0
node_ten_in = 0
for i in G.Nodes():
  if i.GetOutDeg() > 10:
    node_ten_out += 1
  if i.GetInDeg() < 10:
    node_ten_in += 1
print("nodes with > 10 outgoing edges ", node_ten_out) 
print("nodes with < 10 incoming edges ", node_ten_in)


nodes with > 10 outgoing edges  1612
nodes with < 10 incoming edges  5165
