# Colaboratory Assignment 7.1

**Instructions**. Below you will find several text cells with short programming problems. You can create as many code cells as you need to answer them.

There are four problems, but you will only need to solve two. You **must** choose at least one of the problems with the title in <font color='#006633'>green</font>.


**BEFORE YOU START**

Make sure to run the code cell below to fix the adjacency matrix problem. Also, remember that the next code cell should be the first thing you evaluate. Otherwise, you will have to restart your runtime and reimport `networkx`.

In [None]:
!pip uninstall -y scipy networkx
!pip install scipy==1.8
!pip install networkx==2.7

In [None]:
import networkx as nx
nx.__version__

In [None]:
from google.colab import drive
drive.mount('/content/drive')
import matplotlib.pyplot as plt
import sys
sys.path.append('/content/drive/MyDrive/ColabNotebooks')
from readlist import readlist

## 0. Preliminary Work (Mandatory)

Some of the problems in this assignment require a network with around 500 nodes and 1500 links. With more than that, the functions we will use to count triangles will take too long to run. With fewer, the calculation will be too fast to notice a difference between methods.

Therefore, we will use a `networkx` function to generate a random network with a specific number of nodes and links. What's interesting about this function is that we can control $n$ and $m$, but the actual wiring is random, so each one of you will likely get a different network. The function is `nx.gnm_random_graph(n, m)`, and you can find the documentation [here](https://networkx.org/documentation/stable/reference/generated/networkx.generators.random_graphs.gnm_random_graph.html#networkx.generators.random_graphs.gnm_random_graph).

Using this function, create network $G$ with $n = 550$ and $m = 1800$.

In [None]:
G = nx.gnm_random_graph(550, 1800)
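
As a quick sanity check, the sketch below builds a network of the same size and confirms its node and link counts. The name `G_check` and the `seed` argument are assumptions added here for reproducibility; the assignment itself does not require a seed.

```python
import networkx as nx

# seed=42 is an arbitrary choice for this sketch; omit it to get a different wiring on every run
G_check = nx.gnm_random_graph(550, 1800, seed=42)

print(G_check.number_of_nodes())  # 550
print(G_check.number_of_edges())  # 1800

# The nodes are labeled 0 .. n-1, which matters when looping over labels by hand
print(min(G_check.nodes()), max(G_check.nodes()))  # 0 549
```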

## 1. Counting Triangles Inefficiently

In the slides, you saw several ways to count triangles. Some of them are faster than others, which becomes extremely important as the number of nodes grows.

We will try to make a count using a function defined in the videos.

1. Modify the function `triangle_present` to return 1 when the triangle is present, and 0 otherwise.
2. Use this modified function to obtain the number of global triangles in the random network $G$ (previously created).
3. Save the time it took to run. You can use the methods shown in previous Colaboratory assignments (i.e. using the `datetime` module), or manually save the time shown in Colaboratory.

In [None]:
import time

In [None]:
#1 Modify the function triangle_present to return 1 when the triangle is present, and 0 otherwise.
def a(G, i, j):
    if G.has_edge(i, j):
        return 1
    else:
        return 0

def triangle_present(G, i, j, q):
    if a(G, i, j) * a(G, j, q) * a(G, q, i) == 1:
        return 1
    else:
        return 0

#2 Use this modified function to obtain the number of global triangles in the random network G (previously created).
triangles = 0
# gnm_random_graph labels its nodes 0 .. n-1, so index into the node list instead of assuming labels 1 .. n
nodes = list(G.nodes())
n = len(nodes)
start = time.time()
for i in range(n - 2):
    for j in range(i + 1, n - 1):
        for q in range(j + 1, n):
            if triangle_present(G, nodes[i], nodes[j], nodes[q]):
                triangles += 1
end = time.time()


#3 Save the time it took to run. You can use the methods shown in previous colaboratory assignments (i.e. using the datetime module); or manually save the time shown in colaboratory.
time_elapsed = end - start
print(time_elapsed, triangles)
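
One way to gain confidence in a brute-force count is to compare it against `nx.triangles` on a small network. The sketch below does that, assuming a hypothetical 60-node test network (`small`) and a helper named `count_triangles_brute_force`; neither comes from the assignment. It uses `itertools.combinations` to visit each unordered triple of nodes exactly once, and `time.perf_counter` as one possible timer.

```python
import time
from itertools import combinations

import networkx as nx


def triangle_present(G, i, j, q):
    """Return 1 if nodes i, j and q are mutually connected, 0 otherwise."""
    return int(G.has_edge(i, j) and G.has_edge(j, q) and G.has_edge(q, i))


def count_triangles_brute_force(G):
    """Check every unordered triple of nodes exactly once."""
    return sum(triangle_present(G, i, j, q) for i, j, q in combinations(G.nodes(), 3))


# Small hypothetical test network so that the sketch runs quickly
small = nx.gnm_random_graph(60, 200, seed=1)

start = time.perf_counter()
brute = count_triangles_brute_force(small)
elapsed = time.perf_counter() - start

# nx.triangles counts each triangle once per corner, so the sum is divided by 3
reference = sum(nx.triangles(small).values()) // 3
print(brute, reference, elapsed)
```

If the two counts disagree, the usual suspects are a loop range that skips a node or one that refers to a label that does not exist in the network.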

## 2. Improving Efficiency

Although the procedure used in the previous problem does not seem like a slow method, its three-level nested `for` loop makes it slow compared to the alternatives. It checks every one of the $\binom{n}{3}$ triples of nodes, so the running time grows as $n^3$.

We should always be mindful of the computational time it takes to count triangles in a network, or at least identify which method is more efficient.

We could try counting triangles using the adjacency matrix. To do this, repeat the count for the local and global triangles in the random network $G$ using the adjacency matrix. Make sure to also time your execution, for comparison purposes.


In [None]:
def TriangCheck(G, i, h, q):
    if G.has_edge(i, h) and G.has_edge(h, q) and G.has_edge(q, i):
        return 1
    else:
        return 0

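def T(G):
    # Brute-force count: test every unordered triple of nodes once
    NL = list(G.nodes())
    n = G.number_of_nodes()
    count = 0  # separate name so the local counter does not shadow the function T
    for c1 in range(n - 2):
        for c2 in range(c1 + 1, n - 1):
            for c3 in range(c2 + 1, n):
                i = NL[c1]
                h = NL[c2]
                q = NL[c3]
                count += TriangCheck(G, i, h, q)
    return count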

T(G)
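
For comparison, here is a minimal sketch of the adjacency-matrix approach. It relies on the fact that entry $(i, i)$ of $A^3$ counts the closed walks of length 3 that start and end at node $i$, and each triangle through $i$ is walked in two directions. The local count is therefore $(A^3)_{ii} / 2$ and the global count is $\mathrm{tr}(A^3) / 6$. The names `G_demo`, `local_triangles` and `global_triangles` are assumptions, and the sketch uses `nx.to_numpy_array` to get a dense matrix, which is one choice among several.

```python
import time

import networkx as nx
import numpy as np

# Hypothetical stand-in for the random network G of the assignment
G_demo = nx.gnm_random_graph(550, 1800, seed=7)

start = time.perf_counter()
# Dense 0/1 adjacency matrix; rows and columns follow the order of G_demo.nodes()
A = nx.to_numpy_array(G_demo, dtype=np.int64)
A3 = A @ A @ A

# Each triangle through node i contributes two closed 3-walks at i
local_triangles = np.diag(A3) // 2
# Each triangle is counted at its three corners, twice each
global_triangles = int(np.trace(A3)) // 6
elapsed = time.perf_counter() - start

print(global_triangles, elapsed)
# Local counts keyed by node label
local_by_node = dict(zip(G_demo.nodes(), local_triangles.tolist()))
```

A dense matrix is affordable for a few hundred nodes, but it needs $n^2$ entries, so a sparse representation may be preferable for much larger networks.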

## <font color='#006633'>3. Simulate triangles</font>

For this problem, you have to define a `Python` function that takes as input `T`, the number of triangles (global); and returns a network with `T` triangles. To make things easier, create the network to make each triangle a cluster. In practice, this means that every triangle is going to be *isolated* from the rest of the network.

To test your function, create a network with 7 triangles and draw it. Then, use any of the methods in the slides or the videos to count the number of triangles. You should obtain 7. **Hint**: To label the nodes, the easiest way is to multiply the triangle's index by fixed integers. For example, if you use powers of 10, your first triangle will contain the nodes $(1, 10, 100)$, your second triangle will contain the nodes $(2, 20, 200)$, etc. The labels do not collide here, since we will only create 7 triangles.

In [None]:
def make_triangle(T):
  tri_net = nx.Graph()
  for i in range(1,T+1):
    tri_net.add_edges_from([(i,i*10),(i*10,i*100),(i*100,i)])
  return tri_net

nx.draw_networkx(make_triangle(7))
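
The count can be checked with `nx.triangles`, as in the sketch below. It uses a hypothetical variant, `make_isolated_triangles`, that labels triangle $k$ with the consecutive integers $3k$, $3k+1$, $3k+2$ instead of powers of 10. This is an alternative labeling assumed for illustration; it avoids label collisions when `T` is 10 or more.

```python
import networkx as nx


def make_isolated_triangles(T):
    """Hypothetical variant: triangle k uses the nodes 3k, 3k+1 and 3k+2."""
    net = nx.Graph()
    for k in range(T):
        a, b, c = 3 * k, 3 * k + 1, 3 * k + 2
        net.add_edges_from([(a, b), (b, c), (c, a)])
    return net


net = make_isolated_triangles(7)

# nx.triangles counts each triangle once per corner, so the sum is divided by 3
n_triangles = sum(nx.triangles(net).values()) // 3
print(n_triangles)  # 7

# Each triangle is its own connected component, i.e. isolated from the rest
print(nx.number_connected_components(net))  # 7
```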

## <font color='#006633'>4. Count triangles and show the results</font>

Even though this is not covered completely in the slides, this problem involves a **triangle histogram**. The logic behind it is the same as in previously created histograms (like degree and shortest path). Using a loop, we go through every node and count how many triangles that node belongs to. Then, we update the **frequency** of that number of triangles in the network. The steps in detail are:

1. Using `networkx`'s function `triangles()`, create a dictionary that contains the information on how many triangles each node participates in. For example, if you want to use the function in the network `G`, you simply write `nx.triangles(G)`
2. Create an empty dictionary `H` that will contain the histogram data
3. Loop through the nodes in the dictionary created in the first step. For each one of them, save the number of triangles as `t`
4. Update `H` with the information of every node. You want to do something like `H[t] = H.get(t, 0) + 1`

Use the steps above to show the frequencies of the number of triangles in the random network $G$. Plot your results.
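
The steps above can be sketched as follows. The network `G_hist` is a hypothetical stand-in for the random network $G$, and the seed is an arbitrary choice made for this sketch.

```python
import networkx as nx

# Hypothetical stand-in for the random network G of the assignment
G_hist = nx.gnm_random_graph(550, 1800, seed=3)

# Step 1: number of triangles each node participates in
triangles_per_node = nx.triangles(G_hist)

# Step 2: empty histogram, mapping "number of triangles" to "number of nodes"
H = {}

# Steps 3 and 4: read t for every node and update its frequency
for node in triangles_per_node:
    t = triangles_per_node[node]
    H[t] = H.get(t, 0) + 1

# Print the histogram in increasing order of t
for t in sorted(H):
    print(t, H[t])
```

The frequencies add up to the number of nodes, which makes for an easy check. With `matplotlib` imported as `plt`, a call such as `plt.bar(list(H.keys()), list(H.values()))` is one way to plot the result.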

In [None]:
G = nx.Graph()
n = 51

for i in range(1,n):
  G.add_edge(i,i+1)
  G.add_edge(1,i+1)

nx.draw_networkx(G)

In [None]:
def TriangCheck(G, i, h, q):
    if G.has_edge(i, h) and G.has_edge(h, q) and G.has_edge(q, i):
        return 1
    else:
        return 0

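def T(G):
    # Brute-force count: test every unordered triple of nodes once
    NL = list(G.nodes())
    n = G.number_of_nodes()
    count = 0  # separate name so the local counter does not shadow the function T
    for c1 in range(n - 2):
        for c2 in range(c1 + 1, n - 1):
            for c3 in range(c2 + 1, n):
                i = NL[c1]
                h = NL[c2]
                q = NL[c3]
                count += TriangCheck(G, i, h, q)
    return count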

T(G)