# GNN: Graph Neural Network

A GNN learns vector representations (embeddings) of the nodes of a graph.

The representations learned from a graph can encode properties of the graph's structure.

Uses insights from:
- past orders
- food items connected to past orders
- similar users

Objective: find a vector representation such that nodes that are structurally similar in the graph have similar representations.

**Why GNNs?**
A GNN uses a neural network to compute the representation of a node by recursively aggregating the representations of its neighboring nodes, up to a fixed depth.
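
The recursive aggregation can be written compactly. The formula below is a sketch that mirrors the toy function `h(k, v)` defined later in these notes (mean aggregation over neighbors, with separate weights $W_k$ and $B_k$). The symbols $N(v)$ for the neighbors of $v$, $x_v$ for its input features, and $K$ for the maximum depth are notation assumed here, not taken from the source.

$$
h_v^{(0)} = x_v, \qquad
h_v^{(k)} = \sigma\!\left( W_k \cdot \frac{1}{|N(v)|} \sum_{u \in N(v)} h_u^{(k-1)} + B_k \, h_v^{(k-1)} \right), \quad k = 1, \dots, K
$$

The final representation of node $v$ is $z_v = h_v^{(K)}$.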

Advantages:
- Learning scales to large graphs: when computing the representation of a node, its neighbors are sampled down to a fixed number.
- A representation can be induced for a newly added node from its basic features and its connections.
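
The fixed-size neighbor sampling mentioned in the first point can be sketched as follows. This is a minimal illustration, not the method of any particular library: the function name `sample_neighbors`, the toy adjacency list, and the choice to sample with replacement when a node has too few neighbors are all assumptions.

```python
import random

def sample_neighbors(adjacency, node, num_samples, rng=random):
    """Return exactly `num_samples` neighbors of `node`.

    Samples without replacement when the node has enough neighbors,
    and with replacement otherwise, so the result has a fixed size.
    """
    neighbors = adjacency[node]
    if not neighbors:
        return []
    if len(neighbors) >= num_samples:
        return rng.sample(neighbors, num_samples)
    return [rng.choice(neighbors) for _ in range(num_samples)]

# Hypothetical toy adjacency list (not from the source).
adjacency = {"u1": ["d1", "d2", "d3", "d4"], "u2": ["d1"]}
sampled_u1 = sample_neighbors(adjacency, "u1", 2)  # two distinct dishes
sampled_u2 = sample_neighbors(adjacency, "u2", 2)  # the single dish, repeated
```

Because every node contributes the same number of neighbors, the cost of computing one representation depends on the sample size and the depth rather than on the size of the graph.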

The Uber Eats recommendation system can be broken down into two phases:
- **candidate generation**: pre-filtering of dishes and restaurants, based on factors such as geographical location
- **personalized ranking**: an ML model ranks the pre-filtered dish and restaurant candidates using additional contextual information (for example, ordering certain types of food on specific days of the week, or different types of dishes for lunch and dinner)
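
The two phases above can be sketched as a filter followed by a sort. Everything in this example is hypothetical: the `Candidate` fields, the distance threshold, and the precomputed `score` (which stands in for the output of the ranking model) are illustrative assumptions, not details of the actual system.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    distance_km: float   # distance from the user (hypothetical feature)
    score: float         # stand-in for the ranking model's output

def generate_candidates(candidates, max_distance_km):
    """Phase 1: cheap pre-filter, here on geographical distance only."""
    return [c for c in candidates if c.distance_km <= max_distance_km]

def rank(candidates, top_n):
    """Phase 2: order the remaining candidates by score, highest first."""
    return sorted(candidates, key=lambda c: c.score, reverse=True)[:top_n]

catalog = [
    Candidate("ramen", 1.2, 0.70),
    Candidate("pizza", 0.8, 0.90),
    Candidate("sushi", 9.5, 0.95),  # best score, but too far away
    Candidate("salad", 2.0, 0.40),
]
shortlist = generate_candidates(catalog, max_distance_km=5.0)
top = rank(shortlist, top_n=2)
print([c.name for c in top])  # ['pizza', 'ramen']
```

The split keeps the expensive model away from the full catalog: only candidates that pass the cheap filter are scored.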

Two bipartite graphs:
- nodes: users and dishes; edges weighted by the number of times a user ordered a specific dish
- nodes: users and restaurants; edges weighted by the number of times a user ordered from a specific restaurant
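
A weighted bipartite graph of this kind can be built directly from an order log. The sketch below assumes a hypothetical log with one `(user, dish)` pair per order; the identifiers and the dict-of-dicts layout are illustrative choices, not taken from the source.

```python
from collections import Counter, defaultdict

# Hypothetical order log: one (user, dish) pair per order.
orders = [("u1", "d1"), ("u1", "d1"), ("u1", "d2"), ("u2", "d2")]

# Edge weight = number of times the user ordered the dish.
edge_weights = Counter(orders)

# Adjacency in both directions, since the graph is undirected.
user_to_dishes = defaultdict(dict)
dish_to_users = defaultdict(dict)
for (user, dish), count in edge_weights.items():
    user_to_dishes[user][dish] = count
    dish_to_users[dish][user] = count

print(dict(user_to_dishes))  # {'u1': {'d1': 2, 'd2': 1}, 'u2': {'d2': 1}}
```

The user and restaurant graph follows the same pattern with restaurant ids in place of dish ids.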

https://github.com/Quantyca/demo-ateam-ai-misc/blob/master/recommendation/CollabMovielens.ipynb

https://github.com/williamleif/GraphSAGE/blob/master/README.md

In [172]:
import numpy as np

In [180]:
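# Each row is [u, v, r]: user id, item id, and a value r (for example a
# rating or an order count) that loss() unpacks but does not use
data = [[1,2,4],
        [1,3,2],
        [2,1,3],
        [2,2,2],
        [3,1,5],
        [3,3,3]]

# Users 4-7 appear in no row of `data` and have no entry in `graph`
users = set([1,2,3,4,5,6,7])
items = set([1,2,3])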

In [181]:
# Using a Python dictionary to act as an adjacency list
graph = {
    1 : [2,3],
    2 : [1,2],
    3 : [1,3]}

In [182]:
users_feat = {
    1 : np.array([1,1,1]),
    2 : np.array([1,1,1]),
    3 : np.array([1,1,1])}

items_feat = {
    1 : np.array([2,2,2]),
    2 : np.array([2,2,2]),
    3 : np.array([2,2,2])}

In [183]:
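def W(k):
    # Weight applied to the aggregated neighbor representation at depth k.
    # Toy choice: the identity matrix, the same for every k.
    w = np.array([[1,0,0],
                  [0,1,0],
                  [0,0,1]])
    return w

def B(k):
    # Weight applied to the node's own previous representation at depth k.
    # Toy choice: the identity matrix, the same for every k.
    b = np.array([[1,0,0],
                  [0,1,0],
                  [0,0,1]])
    return b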

In [184]:
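# NOTE: `sigma` is a plain scaling factor here; it stands in for the
# nonlinearity (e.g. ReLU) used in a real GNN layer.
sigma = 0.01

# `nodes_feat` is not defined in the cells above. The values below are an
# assumption: they are the features that reproduce the printed output.
nodes_feat = {
    1 : np.array([1,1,1]),
    2 : np.array([2,2,2]),
    3 : np.array([3,3,3])}

def h(k,v):
    """Representation of node v after k rounds of neighbor aggregation."""
    if k == 0:
        # Depth 0: the representation is the node's raw feature vector
        return nodes_feat[v]
    else:
        N = graph[v]
        # Mean of the neighbors' representations from the previous depth
        agg = sum([h(k-1,u)/len(N) for u in N])
        # Combine the aggregated neighborhood with the node's own representation
        return sigma*(np.dot(W(k),agg) + np.dot(B(k),h(k-1,v)))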

In [185]:
print('Root : ' + str(1))
print(h(1,1))
print(h(1,2))
print(h(1,3))

Root : 1
[0.035 0.035 0.035]
[0.035 0.035 0.035]
[0.05 0.05 0.05]


In [186]:
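import random
delta = 1  # margin of the max-margin loss

def loss(data, k):
    """Max-margin loss over the [u, v, r] rows of `data`, using depth-k embeddings."""
    l = 0
    for line in data:
        u, v, r = line[0],line[1],line[2]
        N = graph[u]

        # Nodes not connected to u. Build a new list instead of mutating the
        # global `users` set, and keep only nodes that h() can embed.
        not_N = [n for n in users if n in graph and n not in N and n != u]
        if not not_N:
            continue  # no negative sample available for u

        # Select random node from not_N
        n = random.choice(not_N)

        # Get node embedding
        z_u = h(k,u)
        z_v = h(k,v)
        z_n = h(k,n)

        # Hinge term: zero once the positive pair (u, v) outscores the
        # negative pair (u, n) by at least delta
        l += max(0, delta - np.dot(z_u,z_v) + np.dot(z_u,z_n))
    return l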
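
Written out, the max-margin objective that `loss` is meant to compute has the standard hinge form below. This is stated as an assumption about the intended objective: $z_u$ and $z_v$ are the embeddings of a connected pair, $z_n$ is the embedding of a randomly sampled node not connected to $u$, $\Delta$ is the margin (`delta` in the code), and $E$ denotes the observed pairs (the rows of `data`).

$$
\mathcal{L} = \sum_{(u, v) \in E} \max\bigl(0,\; \Delta - z_u^\top z_v + z_u^\top z_n\bigr)
$$

Each term is zero once the positive pair scores at least $\Delta$ higher than the negative pair, so minimizing $\mathcal{L}$ pushes connected nodes toward larger dot products than unconnected ones.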

In [187]:
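def similarity(u,i,k=1):
    """Euclidean distance between the depth-k embeddings of u and i.

    Despite the name, smaller values mean more similar nodes.
    The default k=1 is an assumption; the original relied on a global k.
    """
    z_u = h(k,u)
    z_v = h(k,i)
    return np.linalg.norm(z_u - z_v)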

In [None]:
def GNN():
    pass  # placeholder: the body was left empty in the notebook

### DFS

In [None]:
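visited = set() # Set to keep track of visited nodes.

def dfs(visited, graph, node, level):
    """Depth-limited traversal: print `node`, then expand its neighbours
    until `level` levels have been printed."""
    print(str(node) + ' (level ' + str(level) + ')')
    visited.add(node)
    level -= 1
    # `level > 0` (rather than `!= 0`) guarantees termination on cyclic graphs
    if level > 0:
        for neighbour in graph[node]:
            # Already-visited nodes are not skipped here, so the output is
            # the full neighbourhood tree down to the given depth
            dfs(visited, graph, neighbour, level)

#dfs(visited, graph, 1, 3)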
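
When only the set of nodes within a given number of hops is needed, rather than the full traversal tree, a level-by-level expansion avoids revisiting nodes. The sketch below is an illustrative alternative, not part of the original notebook; the function name `k_hop_nodes` is hypothetical, and the adjacency list is the toy `graph` from the cells above, repeated so the example is self-contained.

```python
def k_hop_nodes(graph, root, depth):
    """Return the set of nodes reachable from `root` in at most `depth` hops."""
    frontier = {root}
    reached = {root}
    for _ in range(depth):
        # Expand every node on the current frontier by one hop,
        # keeping only nodes that have not been reached before.
        frontier = {n for node in frontier for n in graph.get(node, [])} - reached
        reached |= frontier
    return reached

graph = {1: [2, 3], 2: [1, 2], 3: [1, 3]}
print(k_hop_nodes(graph, 2, 1))  # {1, 2}
print(k_hop_nodes(graph, 2, 2))  # {1, 2, 3}
```

These are the nodes whose features can influence the depth-`k` representation of the root.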