# HW1

* Robert Hensley
* [Snap Reference Manual](https://snap.stanford.edu/snappy/doc/reference/index-ref.html)

In [1]:
import snap
import gzip

## Importing Data

* [reading gzip data](https://www.tutorialspoint.com/python-support-for-gzip-files-gzip)
* [how to load the data from a file of bytes](https://snap.stanford.edu/snappy/doc/reference/LoadEdgeListStr.html)

In [9]:
# read the gzip file
with gzip.open("amazon0601.txt.gz", "rb") as f:
    data = f.read()

# write to txt as bytes
with open("amazon0601.txt", "wb") as f:
    f.write(data)
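The cell above reads the whole archive into memory before writing it back out. As an alternative, here is a minimal sketch that copies the decompressed bytes in chunks using only the standard library. The helper name `gunzip_file` and the 1 MiB chunk size are illustrative assumptions, not part of the original notebook.

```python
import gzip
import shutil


def gunzip_file(src_path, dst_path, chunk_size=1024 * 1024):
    """Decompress a gzip file to dst_path without holding it all in memory."""
    # gunzip_file and chunk_size are hypothetical names chosen for this sketch.
    with gzip.open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        # copyfileobj reads and writes chunk_size bytes at a time
        shutil.copyfileobj(src, dst, chunk_size)


# Example call with the file names used in this notebook:
# gunzip_file("amazon0601.txt.gz", "amazon0601.txt")
```

For a file of this size either approach should work; the chunked copy mainly matters when the archive is larger than the available memory.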

In [11]:
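# load the edge list as a directed graph (PNGraph);
# column 0 holds the source node and column 1 the destination node
G = snap.LoadEdgeListStr(snap.PNGraph, "amazon0601.txt", 0, 1)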

In [12]:
G

<snap.PNGraph; proxy of <Swig Object of type 'PNGraph *' at 0x00000214E76CBE70> >

## Objectives 

* Number of nodes in the graph 

In [13]:
print("Number of Nodes: %d" % G.GetNodes())

Number of Nodes: 403394


* Number of directed edges in the graph. This is the number of edges (a,b) ∈ E, where a ≠ b. 

In [15]:
Count = snap.CntUniqDirEdges(G)
print("Directed Graph: Count of unique directed edges: %d" % Count)

Directed Graph: Count of unique directed edges: 3387388


* Number of undirected edges in the graph. Same as above, except that if both (a,b) and (b,a) are in E, they count once. 

In [17]:
Count = snap.CntUniqUndirEdges(G)
print("Directed Graph: Count of unique undirected edges: %d" % Count)

Directed Graph: Count of unique undirected edges: 2443408
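To make the two edge definitions concrete, here is a small pure-Python sketch that applies them to a toy edge list. It follows the definitions as stated in the objectives (ignore self-loops, count (a,b) and (b,a) once in the undirected case). The function name `count_unique_edges` and the toy data are assumptions for illustration, and the sketch is not meant to reproduce SNAP's internals.

```python
def count_unique_edges(edges):
    """Return (directed, undirected) unique edge counts for (a, b) pairs."""
    # Directed: distinct ordered pairs (a, b) with a != b
    directed = {(a, b) for a, b in edges if a != b}
    # Undirected: (a, b) and (b, a) collapse to one sorted pair
    undirected = {(min(a, b), max(a, b)) for a, b in directed}
    return len(directed), len(undirected)


# Toy data (hypothetical): one reciprocal pair, one self-loop, one duplicate
toy_edges = [(1, 2), (2, 1), (2, 3), (3, 3), (1, 2)]
print(count_unique_edges(toy_edges))  # (3, 2)
```

Under these definitions, the difference between the two counts is the number of node pairs linked in both directions. For the counts reported above that is 3387388 - 2443408 = 943980.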


* Number of nodes with zero out-degree 

In [19]:
Count = snap.CntOutDegNodes(G, 0)
print("Directed Graph: Count of nodes with out-degree 0: %d" % Count)

Directed Graph: Count of nodes with out-degree 0: 955


* Number of nodes with zero in-degree 

In [20]:
Count = snap.CntInDegNodes(G, 0)
print("Directed Graph: Count of nodes with in-degree 0: %d" % Count)

Directed Graph: Count of nodes with in-degree 0: 82


* Number of nodes with more incoming edges than outgoing edges
    * Observation: the nodes inspected all seem to have the same out-degree (10)

In [31]:
# count nodes whose in-degree exceeds their out-degree, without building a list
Count = sum(1 for NI in G.Nodes() if NI.GetInDeg() > NI.GetOutDeg())
print("Count of nodes with more in-degrees than out-degrees: %d" % Count)

Count of nodes with more in-degrees than out-degrees: 96986
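The three degree-based counts above can be cross-checked on a small example without SNAP. The sketch below derives in-degree and out-degree from a plain edge list with `collections.Counter`. The function name `degree_summary` and the toy edges are assumptions for illustration, and the sketch only considers nodes that appear in at least one edge.

```python
from collections import Counter


def degree_summary(edges):
    """Return (zero_out, zero_in, more_in_than_out) node counts."""
    unique = set(edges)  # a simple directed graph keeps each (a, b) once
    out_deg = Counter(a for a, _ in unique)
    in_deg = Counter(b for _, b in unique)
    nodes = set(out_deg) | set(in_deg)
    # Counter returns 0 for nodes that never appear on that side of an edge
    zero_out = sum(1 for n in nodes if out_deg[n] == 0)
    zero_in = sum(1 for n in nodes if in_deg[n] == 0)
    more_in = sum(1 for n in nodes if in_deg[n] > out_deg[n])
    return zero_out, zero_in, more_in


# Toy data (hypothetical): node 3 only receives edges, node 4 only sends one
toy_edges = [(1, 2), (1, 3), (2, 3), (4, 1)]
print(degree_summary(toy_edges))  # (1, 1, 1)
```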


## Final Program

In [32]:
import snap
import gzip

# read the gzip file
with gzip.open("amazon0601.txt.gz", "rb") as f:
    data = f.read()

# write to txt as bytes
with open("amazon0601.txt", "wb") as f:
    f.write(data)

# Create A Graph Object
G = snap.LoadEdgeListStr(snap.PNGraph, "amazon0601.txt", 0, 1)

# Number of nodes in the graph 
print("Number of Nodes: %d" % G.GetNodes())

# Number of directed edges in the graph
Count = snap.CntUniqDirEdges(G)
print("Directed Graph: Count of unique directed edges: %d" % Count)

# Number of undirected edges in the graph
Count = snap.CntUniqUndirEdges(G)
print("Directed Graph: Count of unique undirected edges: %d" % Count)

# Number of nodes with zero out-degree 
Count = snap.CntOutDegNodes(G, 0)
print("Directed Graph: Count of nodes with out-degree 0: %d" % Count)

# Number of nodes with zero in-degree
Count = snap.CntInDegNodes(G, 0)
print("Directed Graph: Count of nodes with in-degree 0: %d" % Count)

# Number of nodes with more incoming edges than outgoing edges
Count = sum(1 for NI in G.Nodes() if NI.GetInDeg() > NI.GetOutDeg())
print("Count of nodes with more in-degrees than out-degrees: %d" % Count)

Number of Nodes: 403394
Directed Graph: Count of unique directed edges: 3387388
Directed Graph: Count of unique undirected edges: 2443408
Directed Graph: Count of nodes with out-degree 0: 955
Directed Graph: Count of nodes with in-degree 0: 82
Count of nodes with more in-degrees than out-degrees: 96986
