# Hedonic Clustering
> _Finding Clusters with Cooperative Game Theory._<br/>
> A research experiment of _Daniel Sadoc_$^\ast$ and _Lucas Lopes_$^\ast$.<br/>
> $^\ast$Federal University of Rio de Janeiro.<br/>
> December, 2018.

## Parameters
For convenience, the parameters are placed at the top.<br/>
_Verbose_ prints the algorithm's steps. To understand the other parameters, click: [dataset](), [initial]() and [alpha]().

In [1]:
#samples/sample_2.csv
#conference/conference.csv
#terrorist/terrorists.csv
dataset = "datasets/samples/sample_2.csv"
initial = ['e']
verbose = True
alpha   = 0.95

## Main Function

The hedonic function tells how well a node 'fits' in a given network.
It takes:

1. An $\alpha$ value, where $\alpha \in [0,1]$ (_closer to 1 is broader_);
2. The number of vertices in a particular graph;
3. The number of connections that a node (in this graph) has.

And returns the _"score"_ of that node.

In [2]:
def hedonic(alpha, num_vert, my_connections):
    a = (1 - alpha) *             my_connections
    b =      alpha  * (num_vert - my_connections - 1)
    return a - b
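
As a quick sanity check, the sketch below recomputes the two scores reported for node 1 in the first line of the verbose log further down, assuming `alpha = 0.95` and the 8-node sample graph. The function body is copied from the cell above only so that the snippet runs on its own.

```python
def hedonic(alpha, num_vert, my_connections):
    a = (1 - alpha) * my_connections
    b = alpha * (num_vert - my_connections - 1)
    return a - b

# Node 1 in the full 8-node sample graph has a single neighbour (node 3):
# reward 0.05 * 1, penalty 0.95 * (8 - 1 - 1).
stay_score = hedonic(0.95, 8, 1)

# Alone in an otherwise empty cluster it has no neighbours and no non-neighbours.
move_score = hedonic(0.95, 1, 0)

print('stay: {:.2f} | move: {:.2f}'.format(stay_score, move_score))
# prints: stay: -5.65 | move: 0.00
```

Each connection adds $(1 - \alpha)$ to the score and each missing connection subtracts $\alpha$, so with $\alpha = 0.95$ a node is penalised heavily for sharing a graph with vertices it is not linked to.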

## Helper Functions

### Converting a CSV file to a Python Dictionary

Since the graph is represented as a dictionary$^\ast$, and a dataset that stores a network is usually in _.csv_ format, it is helpful to have a function that converts a dataset into a form the algorithm can read and operate on.

To do so, the _.csv_ file must have the following format:

| From Node | To Node  |
| :-------: | :------: |
| _number_  | _number_ |

$^\ast$Each **key** is a vertex, and its **value** is the list of that vertex's connections.

In [3]:
import csv

def join(d, a, b):
    if a not in d:
        d[a] = [b]
    elif b not in d[a]:
        d[a].append(b)
    return d

def csv_2_dict(file):
    d = {}
    with open(file, 'r') as f:
        table = csv.reader(f)
        for row in table:
            a = int(row[0])
            b = int(row[1])
            d = join(d, a, b)
            d = join(d, b, a)
    return d
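
To make the expected input and output concrete, here is a minimal sketch that runs the same conversion on an in-memory file. The helper name `edges_to_dict` and the three-edge sample are assumptions made for illustration. Note that every row is passed through `int()`, so this sketch assumes the file has no header row.

```python
import csv
import io

def edges_to_dict(rows):
    """Build an undirected adjacency dictionary from (from, to) rows."""
    d = {}
    for row in rows:
        a, b = int(row[0]), int(row[1])
        # Register the edge in both directions, skipping duplicates.
        for u, v in ((a, b), (b, a)):
            neighbours = d.setdefault(u, [])
            if v not in neighbours:
                neighbours.append(v)
    return d

# Hypothetical sample: one edge per line, the last one repeats an earlier edge.
sample = io.StringIO("1,3\n3,4\n4,3\n")
adjacency = edges_to_dict(csv.reader(sample))
print(adjacency)  # {1: [3], 3: [1, 4], 4: [3]}
```

The repeated edge `4,3` leaves the result unchanged, which mirrors the duplicate check in `join`.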

### Doing graph operations on a Dictionary

Several operations can be performed on a graph, such as:

- Add a new vertex;
- Remove an old one;
- Move a node from a graph to another.

Here are helper functions for the operations above. Before that, we need to import the _copy_ library$^\ast$.

$^\ast$The `dict.copy()` method makes a [shallow copy](https://docs.python.org/2/library/stdtypes.html#dict.copy). We need a [deep copy](https://docs.python.org/2/library/copy.html#copy.deepcopy) to avoid [this problem](https://stackoverflow.com/questions/3975376/understanding-dict-copy-shallow-or-deep).
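
The difference matters here because the dictionary values are lists. The small sketch below, using a made-up two-node graph, shows how a shallow copy shares those lists with the original while a deep copy does not.

```python
import copy

graph = {1: [3], 3: [1]}

shallow = graph.copy()        # new dict, but the same list objects
deep = copy.deepcopy(graph)   # new dict and new lists

shallow[1].append(99)         # also changes graph[1]
deep[3].append(42)            # leaves graph[3] untouched

print(graph)  # {1: [3, 99], 3: [1]}
print(deep)   # {1: [3], 3: [1, 42]}
```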

In [4]:
import copy

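def add(other, node):
    # Relies on the global `graph` (the full network, loaded later in the
    # notebook) to find which vertices of `other` are linked to `node`.
    g = copy.deepcopy(other)
    g[node] = []
    for key in g:
        if node in graph[key]:
            g[key].append(node)
            g[node].append(key)
    return g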

def remove(original, node):
    g = copy.deepcopy(original)
    for v in g[node]:
        g[v].remove(node)
    g.pop(node)
    return g

def move(values, refer, other):
    if type(values) is list:
        for v in values:
            other = add(other, v)
            refer = remove(refer, v)
    else:
        other = add(other, values)
        refer = remove(refer, values)
    return refer, other
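
The following sketch walks through two moves on a tiny graph. It is a variant, not the notebook's code: the full network is passed as an explicit `full_graph` argument instead of being read from a global, the names `add_node` and `remove_node` are hypothetical, and the three-node graph is made up for illustration.

```python
import copy

def add_node(full_graph, subgraph, node):
    """Return a copy of `subgraph` with `node` added, wired using `full_graph`."""
    g = copy.deepcopy(subgraph)
    g[node] = []
    for key in g:
        # Skip the node itself; link it to every neighbour already present.
        if key != node and node in full_graph[key]:
            g[key].append(node)
            g[node].append(key)
    return g

def remove_node(subgraph, node):
    """Return a copy of `subgraph` without `node` and without its edges."""
    g = copy.deepcopy(subgraph)
    for v in g[node]:
        g[v].remove(node)
    g.pop(node)
    return g

full_graph = {1: [3], 3: [1, 4], 4: [3]}
remain, cluster = full_graph, {}
for node in (3, 4):
    cluster = add_node(full_graph, cluster, node)
    remain = remove_node(remain, node)

print(remain)   # {1: []}
print(cluster)  # {3: [4], 4: [3]}
```

The edge between 1 and 3 disappears from both subgraphs because its endpoints now sit on opposite sides, while `full_graph` itself is never modified.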

### Creating the initial conditions

In [5]:
import random

def rand(amount):
    v = []
    i = 0
    if amount < 0:
        amount += len(graph)
    while i < amount:
        r = random.choice(list(graph.keys()))
        if r not in v:
            v.append(r)
            i += 1
    return v

In [6]:
def init(option):
    
    if option[0] == 's':
        return option[1]
    
    if option[0] == 'r':
        return rand(option[1])
    
    if option[0] == 'e':
        return []
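
Judging from the code above, the `initial` parameter accepts three shapes. The sketch below illustrates them under some assumptions: the function name `initial_nodes` is hypothetical, the graph is passed explicitly rather than read from a global, the mode labels in the comments are guesses at what the letters stand for, and `random.sample` is swapped in for the rejection loop used in `rand`.

```python
import random

def initial_nodes(option, graph):
    """Sketch of the three `initial` modes, with the graph passed explicitly."""
    mode = option[0]
    if mode == 's':                  # presumably "selected": use the given list
        return option[1]
    if mode == 'r':                  # presumably "random": draw distinct nodes
        amount = option[1]
        if amount < 0:               # a negative amount is offset by the graph size
            amount += len(graph)
        return random.sample(list(graph), amount)
    if mode == 'e':                  # presumably "empty": start with no nodes
        return []
    raise ValueError('unknown option: {!r}'.format(mode))

graph = {1: [3], 3: [1, 4], 4: [3]}
print(initial_nodes(['e'], graph))             # []
print(initial_nodes(['s', [1, 3]], graph))     # [1, 3]
print(len(initial_nodes(['r', 2], graph)))     # 2
print(len(initial_nodes(['r', -1], graph)))    # 2, i.e. all nodes but one
```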

### Printing steps information

In [7]:
def print_desire(n, m, s):
    print('node:', n, '| move: {:.2f} | stay: {:.2f}'.format(m, s))

def print_node(w):
    print('\n-> will move:', w[0], '| its alpha: {:.2f}'.format(w[1]))
    
separator  = '\n%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%\n'
graph_diff = r'----- /remain\ ---- \cluster/ -----'

## Learning Functions

### Tells whether a node wants to move and how much

In [8]:
def want_to_move(node, here, other):
    stay_desire = hedonic(alpha, len(here), len(here[node]))
    there       = add(other, node)
    move_desire = hedonic(alpha, len(there), len(there[node]))
    
    if verbose: print_desire(node, move_desire, stay_desire)
    
    if move_desire >= stay_desire:
        return move_desire
    else:
        return float("-inf")

### Which node wants to move?

In [9]:
def who_want_move(chosen_node, my_graph, other_graph):

    for key in my_graph:
        node_desire = want_to_move(key, my_graph, other_graph)
        
        if chosen_node[1] < node_desire:
            chosen_node = [key, node_desire]
    
    return chosen_node

### Training Loop

In [10]:
def local_cluster(remain, cluster):
    
    wanna_move = True
    while wanna_move:

        if verbose:   print(separator)
        chosen_node = [None, float("-inf")]
        chosen_node = who_want_move(chosen_node, remain, cluster)
        if verbose:   print(graph_diff)
        chosen_node = who_want_move(chosen_node, cluster, remain)
        if verbose:   print_node(chosen_node)
        
        if chosen_node[0] is not None:  # a node id of 0 is falsy but valid
            if chosen_node[0] in remain:
                remain, cluster = move(chosen_node[0], remain, cluster)
            else:
                remain, cluster = move(chosen_node[0], cluster, remain)
        else:
            wanna_move = False

    return remain, cluster

### Create the Remain and Cluster graphs and update them

In [11]:
graph = csv_2_dict(dataset)
initial_nodes = init(initial)  # new name, so the init() function is not overwritten

In [12]:
from datetime import datetime
begin = datetime.now()

In [13]:
remain, cluster = move(initial_nodes, graph, {})
remain, cluster = local_cluster(remain, cluster)


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

node: 1 | move: 0.00 | stay: -5.65
node: 3 | move: 0.00 | stay: -2.65
node: 2 | move: 0.00 | stay: -5.65
node: 4 | move: 0.00 | stay: -2.65
node: 5 | move: 0.00 | stay: -2.65
node: 6 | move: 0.00 | stay: -2.65
node: 7 | move: 0.00 | stay: -4.65
node: 8 | move: 0.00 | stay: -4.65
----- /remain\ ---- \cluster/ -----

-> will move: 1 | its alpha: 0.00

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

node: 3 | move: 0.05 | stay: -2.70
node: 2 | move: -0.95 | stay: -4.70
node: 4 | move: -0.95 | stay: -1.70
node: 5 | move: -0.95 | stay: -1.70
node: 6 | move: -0.95 | stay: -1.70
node: 7 | move: -0.95 | stay: -3.70
node: 8 | move: -0.95 | stay: -3.70
----- /remain\ ---- \cluster/ -----
node: 1 | move: -5.65 | stay: 0.00

-> will move: 3 | its alpha: 0.05

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

node: 2 | move: -1.90 | stay: -3.75
node: 4 | move: -0.90 | stay: -1.75
node: 5 | move: -0.90 | stay: -1.75
node: 6 | move: -0.90 | stay: -1.75
node: 7 | move: -1.90 | stay: -2.7

In [14]:
done = datetime.now()
print('Finish in:', done - begin)

Finish in: 0:00:00.042851


## Results

In [15]:
if verbose:
    print(graph)
    print(cluster)
    print(remain)

{1: [3], 3: [1, 4, 5, 6], 2: [4], 4: [2, 3, 5, 6], 5: [3, 4, 6, 7], 6: [3, 4, 5, 8], 7: [5, 8], 8: [6, 7]}
{1: [3], 3: [1, 4, 5, 6], 4: [3, 5, 6], 5: [3, 4, 6], 6: [3, 4, 5]}
{2: [], 7: [8], 8: [7]}
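
One way to read these three dictionaries is to check that the two subgraphs split the vertex set and to list the edges that were cut. The sketch below does this with the values printed above, copied in by hand.

```python
graph = {1: [3], 3: [1, 4, 5, 6], 2: [4], 4: [2, 3, 5, 6],
         5: [3, 4, 6, 7], 6: [3, 4, 5, 8], 7: [5, 8], 8: [6, 7]}
cluster = {1: [3], 3: [1, 4, 5, 6], 4: [3, 5, 6], 5: [3, 4, 6], 6: [3, 4, 5]}
remain = {2: [], 7: [8], 8: [7]}

# Every node ends up in exactly one of the two subgraphs.
disjoint = set(cluster).isdisjoint(remain)
complete = (set(cluster) | set(remain)) == set(graph)

# Edges of the full graph whose endpoints landed on opposite sides.
cut_edges = sorted((u, v) for u in cluster for v in graph[u] if v in remain)

print(disjoint, complete, cut_edges)
# prints: True True [(4, 2), (5, 7), (6, 8)]
```

The cluster keeps the densely connected vertices 1, 3, 4, 5 and 6, and only three edges of the original graph cross the boundary.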


In [16]:
import json
 
# The context manager closes the file even if writing fails.
with open("cluster.json", "w") as f:
    json.dump(cluster, f)
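
When the cluster is read back, keep in mind that JSON object keys are always strings, so the integer node ids do not survive the round trip unchanged. A minimal sketch, using an in-memory string and a made-up two-node cluster instead of the file:

```python
import json

cluster = {1: [3], 3: [1]}

# Integer keys are written as strings and come back as strings.
restored = json.loads(json.dumps(cluster))
print(restored)  # {'1': [3], '3': [1]}

# Convert the keys back to integers to recover the original dictionary.
restored = {int(node): links for node, links in restored.items()}
assert restored == cluster
```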