## Performing box covering with _boxes_

This notebook walks through the recommended workflow for performing box covering with the __boxes__ package.

We will mostly rely on the package's built-in functions, but because the computations are memory-intensive, we also use the standard-library __resource__ and __gc__ modules to cap and free memory.

In [1]:
import os
import networkx as nx
import copy
import resource
import boxes
import gc
import time
import random
import numpy as np

In [2]:
import matplotlib.pyplot as plt


In [3]:
resource.setrlimit(resource.RLIMIT_AS, (int(4e9), int(4e9)))  # cap the address space: (soft, hard) limits in bytes
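
A slightly more defensive variant of the cell above, as a minimal sketch: it lowers only the soft limit and returns the previous limits so that they can be restored afterwards. The helper name `cap_address_space` is made up for illustration and is not part of __boxes__; the `resource` module is only available on Unix-like systems.

```python
import resource

def cap_address_space(max_bytes):
    """Set the soft RLIMIT_AS to max_bytes (clamped to the hard limit).

    Returns the previous (soft, hard) pair so the caller can restore it.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    if hard != resource.RLIM_INFINITY:
        # The soft limit may never exceed the hard limit.
        max_bytes = min(max_bytes, hard)
    resource.setrlimit(resource.RLIMIT_AS, (max_bytes, hard))
    return soft, hard
```

Typical use would be `previous = cap_address_space(int(4e9))` before a heavy run and `resource.setrlimit(resource.RLIMIT_AS, previous)` afterwards. Allocations beyond the cap usually surface in Python as a `MemoryError`, although code inside C extensions may fail differently.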

#### Unfortunately, there are many parameters; we treat them as keyword arguments
This is a cheat sheet of the signatures of every implemented algorithm.

As a rule of thumb, _boxing=True_ means that only the number of boxes is returned.

Most of the implemented algorithms work on undirected, connected graphs, so we only accept undirected networks and extract their largest connected component.

+ __greedy_coloring__: greedy_coloring(network, lb, boxing=False, pso_position=False, strategy='random_sequential')

Set _boxing=True_; the default keyword arguments are otherwise fine.
+ __cbb__: cbb(network, lb, boxing=False)


+ __differential_evolution__: differential_evolution(network, lb, num_p=15, big_f=0.9, cr=0.85, gn=15, boxing=False, dual_new=False)
Looks OK for unconnected graphs.

Meaning of the __simulated_annealing__ parameters (its signature is listed further down): "$k_1$ gives the approximate number of nodes moved in (i), $k_2$ is the number of maximally created new clusters (made up of one node), $k_3$ is the number of outer cycles - in every iteration, the temperature is decreased as specified by _cc_."
These are only rough meanings; refer to the docs for more.
+ __mcwr__: mcwr(network, rb, p=1, boxing=False)

May work on unconnected graphs!
_p_ denotes the probability of choosing the MEMB branch instead of random centres.
+ __memb__: memb(network, rb, boxing=False)

may work on unconnected graphs too!
+ __merge_algorithm__: merge_algorithm(network, lb_max,return_for_sa=False, boxing=False, measure_time=True)


Recommended: _boxing=True_, every other flag _False_.
+ __random_sequential__: random_sequential(network, rb, boxing=False)

works with unconnected graphs
+ __remcc__: remcc(network, rb, return_centres=True)


+ __simulated_annealing__: simulated_annealing(network, lb, k1=20, k2=2, k3=15, temp=0.6, cc=0.995)

+ __pso__: pso(network, lb, gmax=5, pop=5, c1=1.494, c2=1.494, boxing=False)

+ __obca__: obca(network, lb, boxing=False)

+ __fuzzy__: fuzzy(network, lb, boxing=True)
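
The signatures above fall into two groups: algorithms that take a box size `lb` (or `lb_max`) and algorithms that take a box radius `rb`. As a small sketch, the lookup below simply transcribes that split from the cheat sheet so that a script can choose between the `lb` and `rb` ranges; the names `SIZE_ARGUMENT` and `uses_radius` are illustrative and not part of __boxes__.

```python
# Name of the size argument of each algorithm, transcribed from the cheat sheet.
SIZE_ARGUMENT = {
    'greedy_coloring': 'lb',
    'cbb': 'lb',
    'differential_evolution': 'lb',
    'mcwr': 'rb',
    'memb': 'rb',
    'merge_algorithm': 'lb_max',
    'random_sequential': 'rb',
    'remcc': 'rb',
    'simulated_annealing': 'lb',
    'pso': 'lb',
    'obca': 'lb',
    'fuzzy': 'lb',
}

def uses_radius(algorithm_name):
    """Return True if the algorithm expects a box radius (rb)."""
    return SIZE_ARGUMENT[algorithm_name] == 'rb'

radius_based = sorted(name for name in SIZE_ARGUMENT if uses_radius(name))
print(radius_based)
```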



### Sample boxing with MEMB

#### UV flower

See the docs if you are interested in the details.

__UV_22, gen=5__

Generate graph and create a network object from it

In [4]:
uv225=boxes.network(boxes.generators.uv_flower(2,2,5)) # instantiate boxes.network object from generator output

In [5]:
uv225.graph.number_of_nodes()

684

In [6]:
uv225.graph.number_of_edges()

1024
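
The node and edge counts above can be reproduced without the package. The sketch below assumes the standard (u, v)-flower construction, where generation 1 is a cycle of u + v nodes and every later generation replaces each edge by two parallel paths of lengths u and v. This is not necessarily how `boxes.generators.uv_flower` is implemented, and the function name `uv_flower_edges` is illustrative.

```python
def uv_flower_edges(u, v, gen):
    """Return (edge list, number of nodes) of a (u, v)-flower.

    Assumption: generation 1 is a cycle of u + v nodes.
    """
    w = u + v
    edges = [(i, (i + 1) % w) for i in range(w)]
    next_node = w  # label of the next node to be created
    for _generation in range(gen - 1):
        new_edges = []
        for a, b in edges:
            # Replace the edge (a, b) by two parallel paths of lengths u and v.
            for length in (u, v):
                prev = a
                for _step in range(length - 1):
                    new_edges.append((prev, next_node))
                    prev = next_node
                    next_node += 1
                new_edges.append((prev, b))
        edges = new_edges
    return edges, next_node

edges, n_nodes = uv_flower_edges(2, 2, 5)
print(n_nodes, len(edges))  # 684 1024
```

The intermediate generations of this construction have 4, 12, 44 and 172 nodes.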

In [7]:
lb=range(1,32,4)
rb=range(1,32,2)

In [8]:
help(boxes.memb)

Help on function memb in module boxes.memb:

memb(network, rb, boxing=False)



Run MEMB for a given box size

In [9]:
boxes.memb(uv225,5,boxing=True)

computing shortest path data


12

In [12]:
os.makedirs('sample_results', exist_ok=True)  # does not fail if the notebook is re-run

In [13]:
logpath='sample_results/'

Now for multiple sizes at the same time!

In [11]:
help(boxes.io_.run_boxing)

Help on function run_boxing in module boxes.io_:

run_boxing(names, time_offset, network, box_sizes, algorithm, merge_alg=False, **kwargs)
    assuming that necessary preprocessing (e.g. get shortest path data) has been perfomed
    its time passed in time_offset
    
    names: dicitonary for log naming, names['path'],names['net'],names['alg']



In a real application, the time offset (the overhead of computing the shortest path data) should be measured separately and passed to __run_boxing__.
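
One way to obtain such an offset is to time the preprocessing step explicitly. The sketch below is generic: `timed` is an illustrative helper and `preprocess` is a stand-in workload, since the call that precomputes the shortest path data is not shown in this notebook.

```python
import time

def timed(func, *args, **kwargs):
    """Call func and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start

def preprocess(n):
    """Hypothetical stand-in for the real preprocessing step."""
    return sum(i * i for i in range(n))

_, offset = timed(preprocess, 100_000)
# `offset` is the value that would be passed as time_offset to run_boxing.
```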

In [15]:
boxes.io_.run_boxing({'path': logpath, 'net': 'uv225', 'alg': 'memb'}, time_offset=0, network=uv225,
                     box_sizes=rb, algorithm=boxes.memb, merge_alg=False, boxing=True)

In [16]:
os.listdir(logpath)

['uv225_memb.txt']

Read back the raw logfile

In [18]:
exec_time, readout=boxes.io_.read_logfile(logpath+'uv225_memb.txt')

In [19]:
exec_time # includes the placeholder time_offset of 0

2.070331573486328

In [20]:
readout # (rb, Nb) pairs: box radius and number of boxes

[(1, 172.0),
 (3, 44.0),
 (5, 12.0),
 (7, 12.0),
 (9, 4.0),
 (11, 4.0),
 (13, 4.0),
 (15, 4.0),
 (17, 2.0),
 (19, 2.0),
 (21, 2.0),
 (23, 2.0),
 (25, 2.0),
 (27, 2.0),
 (29, 2.0),
 (31, 2.0)]

Convert box sizes to the canonical lb values

In [21]:
exec_time, canonized_readout=boxes.io_.canonized_lb(logpath+'uv225_memb.txt','memb')

In [22]:
exec_time

2.070331573486328

In [23]:
canonized_readout # rb converted to lb, in the sense of the literature

[(3, 172.0),
 (7, 44.0),
 (11, 12.0),
 (15, 12.0),
 (19, 4.0),
 (23, 4.0),
 (27, 4.0),
 (31, 4.0),
 (35, 2.0),
 (39, 2.0),
 (43, 2.0),
 (47, 2.0),
 (51, 2.0),
 (55, 2.0),
 (59, 2.0),
 (63, 2.0)]
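
Comparing the two readouts, the conversion for MEMB appears to be lb = 2·rb + 1. The sketch below reproduces the first rows under that assumption, which is inferred from the output above rather than taken from the package source. `canonized_lb` takes the algorithm name as an argument, so other algorithms may be converted differently.

```python
def rb_to_lb(rb):
    """Map a box radius to a box size, assuming lb = 2 * rb + 1."""
    return 2 * rb + 1

raw = [(1, 172.0), (3, 44.0), (5, 12.0)]  # first rows of the raw readout
canonized = [(rb_to_lb(rb), nb) for rb, nb in raw]
print(canonized)  # [(3, 172.0), (7, 44.0), (11, 12.0)]
```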

Do not forget to free memory!

In [24]:
del uv225
gc.collect()

26847

__There are more wrappers for benchmarking, running several algorithms on the same network, etc.__

__I strongly recommend reading through the documentation for these (it is not that long). You may also want to look into the source code.__