# 1 - Exploring Network Generation I

The *NetSim* package includes two key modules:
- `generate` (network generator). This module includes several functions aimed at generating different network configurations. This module is the subject of this notebook.
- `simulate` (network simulation). This is the subject of tutorial 3. 

The *NetSim* simulation uses information stored in an attribute table to set up the simulation. The table must contain a series of columns with specific header names (see below), and the values within these columns must be restricted to specific ranges.
- *id*: exclusive identifier for each location. 
- *group*: identifies a location as being part of a specific group. Groups can be of any size. Groups of size 1 will be automatically mixed (see below) with the following groups. If this column does not exist, it is created with a single group affiliation. *Default*: 1.
- *seq*: identifies rank/ordering of location within a group. There are two possible options:
  - *No ordering/ranking (default)*. If you do not want to specify any ranking or ordering within a group, use a **single** value for all locations in that group. Depending on the number of locations in the group, *NetSim* will either generate all possible permutations or a fixed number, *num_samples*, of randomized samples (with repetition).
  - *Ordering/ranking*. To specify the order in which locations are going to join a network, use any sequence of numbers (with no repetitions) for all locations in a group.

If any of these columns are not present, the simulation will generate them and populate them with default values.
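
The default-filling behaviour described above can be pictured with a minimal sketch. It uses plain dictionaries rather than a DataFrame, and `fill_defaults` is a hypothetical helper, not part of *NetSim*. The default of 1 for *group* comes from the text; the default of 1 for *seq* matches the message that `check()` prints later in this notebook.

```python
# Hypothetical helper: illustrates default filling, not NetSim's own code.
DEFAULTS = {"group": 1, "seq": 1}

def fill_defaults(rows, defaults=DEFAULTS):
    """Return copies of the rows with any missing column set to its default."""
    return [{**defaults, **row} for row in rows]

rows = [{"id": 0}, {"id": 1, "group": 2}]
filled = fill_defaults(rows)
print(filled)
```

Values already present in a row take precedence over the defaults, and the original rows are left unchanged.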

### Imports

In [2]:
import geopandas as gpd
import netsim.generate as ng
from pathlib import Path

### Set data path

In [4]:
data_path = Path.cwd().parent / "data"
data_path

PosixPath('/Users/jacobdeppen/Desktop/netsim/data')

### Read a sample shapefile

In [5]:
fn_shp = data_path / "sample" / "sample5.shp"

We use geopandas to read the shapefile that contains the locations and the columns needed to run the simulation. If the location file is a simple text file, e.g. *comma-delimited* or *csv*, use
```python
import pandas as pd
df = pd.read_csv(filename)
```
instead.


In [6]:
df= gpd.read_file(fn_shp)
df

Unnamed: 0,id,seq,group,mix,easting,northing,geometry
0,0,1,1,0,530782,4389390,POINT (530782.000 4389390.000)
1,1,1,1,0,531119,4388860,POINT (531119.000 4388860.000)
2,2,1,1,0,530403,4388580,POINT (530403.000 4388580.000)
3,3,1,1,0,530503,4388620,POINT (530503.000 4388620.000)
4,4,1,1,0,530729,4388930,POINT (530729.000 4388930.000)
5,5,1,1,0,530606,4389150,POINT (530606.000 4389150.000)


**N.B.** It is a good idea to make a copy of your original table in case you need to change values along the way.

### Checking values

We run the ```check()``` function in the `generate` module (imported here as `ng`) to verify that the table has all the columns needed to run the simulation and that the values within them are valid.

In [4]:
df = ng.check(df)


 No corrections or errors !! 


Let's introduce some errors and see how the ```check()``` function behaves. Let's remove the 'seq' column.

In [5]:
df.drop(columns=['seq'], inplace= True)

In [6]:
df = ng.check(df)
df


seq column - created sequence with no sequence (default 1.) !


Unnamed: 0,id,group,mix,easting,northing,geometry,seq
0,0,1,0,530782,4389390,POINT (530782 4389390),1
1,1,1,0,531119,4388860,POINT (531119 4388860),1
2,2,1,0,530403,4388580,POINT (530403 4388580),1
3,3,1,0,530503,4388620,POINT (530503 4388620),1
4,4,1,0,530729,4388930,POINT (530729 4388930),1
5,5,1,0,530606,4389150,POINT (530606 4389150),1


The sequence ('seq') column must contain either a value of '1' or a monotonic sequence of numbers for each group. Let's change one value so that the column satisfies neither condition.

In [7]:
df.loc[3,'seq']= 2
df

Unnamed: 0,id,group,mix,easting,northing,geometry,seq
0,0,1,0,530782,4389390,POINT (530782 4389390),1
1,1,1,0,531119,4388860,POINT (531119 4388860),1
2,2,1,0,530403,4388580,POINT (530403 4388580),1
3,3,1,0,530503,4388620,POINT (530503 4388620),2
4,4,1,0,530729,4388930,POINT (530729 4388930),1
5,5,1,0,530606,4389150,POINT (530606 4389150),1


Now, when we run the ```check()``` function, it reports an error:

In [8]:
ng.check(df)



 ERROR: seq column - sequence for group 1 is not 1 or sequential!


TypeError: exceptions must derive from BaseException
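
The rule being enforced here can be approximated as follows. This is a sketch of the rule as this notebook describes it (one shared value, or no repeated values within a group), not the library's implementation, and `seq_is_valid` is a hypothetical name. Whether a shared value other than 1 is accepted by `check()` is not confirmed here.

```python
def seq_is_valid(seq_values):
    """True if all values are identical (no ordering) or all distinct (ranking)."""
    distinct = set(seq_values)
    return len(distinct) == 1 or len(distinct) == len(seq_values)

all_equal = seq_is_valid([1, 1, 1, 1, 1, 1])     # no ordering
all_distinct = seq_is_valid([3, 4, 5, 6, 7, 8])  # full ranking
mixed = seq_is_valid([1, 1, 1, 2, 1, 1])         # neither case
print(all_equal, all_distinct, mixed)
```

The third case corresponds to the table above, where a single location was given a different value from the rest of its group.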

Let's restore this column,

In [9]:
df['seq']= [1,1,1,1,1,1]
df

Unnamed: 0,id,group,mix,easting,northing,geometry,seq
0,0,1,0,530782,4389390,POINT (530782 4389390),1
1,1,1,0,531119,4388860,POINT (531119 4388860),1
2,2,1,0,530403,4388580,POINT (530403 4388580),1
3,3,1,0,530503,4388620,POINT (530503 4388620),1
4,4,1,0,530729,4388930,POINT (530729 4388930),1
5,5,1,0,530606,4389150,POINT (530606 4389150),1


We can replace the original 'seq' column by another one. Provided all values in a group are unique the simulation will work. 

In [10]:
df['seq']= [3,4,5,6,7,8]
ng.check(df)


 No corrections or errors !! 


Unnamed: 0,id,group,mix,easting,northing,geometry,seq
0,0,1,0,530782,4389390,POINT (530782 4389390),3
1,1,1,0,531119,4388860,POINT (531119 4388860),4
2,2,1,0,530403,4388580,POINT (530403 4388580),5
3,3,1,0,530503,4388620,POINT (530503 4388620),6
4,4,1,0,530729,4388930,POINT (530729 4388930),7
5,5,1,0,530606,4389150,POINT (530606 4389150),8
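
With a ranked group, the order in which locations join the network presumably follows the *seq* values. A minimal sketch, assuming that a lower rank means earlier entry (the notebook does not state the direction); `order_by_seq` is a hypothetical helper, not a *NetSim* function.

```python
def order_by_seq(ids, seq):
    """Return the ids sorted by their seq rank (ascending order is an assumption)."""
    return tuple(i for _, i in sorted(zip(seq, ids)))

ranked = order_by_seq([0, 1, 2, 3, 4, 5], [3, 4, 5, 6, 7, 8])
shuffled = order_by_seq([0, 1, 2], [30, 10, 20])
print(ranked)
print(shuffled)
```

Only the relative order of the *seq* values matters in this sketch, which is why a sequence starting at 3 works as well as one starting at 1.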


### Creating a network generator

The next step is to run the ```create_network_generator()``` function. The main aim of this function is to create a **network generator** that we can later use to produce different versions, or iterations, of our network. The function also provides additional information: a table with details about the type of iteration for each group, and the total number of possible iterations. Let's explore this function.

In [11]:
netgentor, net_info, total_iterations = ng.create_network_generator(df)


 iteration broken per group....

   group  num_loc  num_iter iter_type
0      1        6         1    single

 total number of iterations.... 1


In [12]:
net_info

Unnamed: 0,group,num_loc,num_iter,iter_type
0,1,6,1,single


The above example is not very informative, given that we have specified only one group with a single ordering. Let us change the sequence column to all 1s and see what happens.

In [13]:
df['seq'] = [1,1,1,1,1,1]
df

Unnamed: 0,id,group,mix,easting,northing,geometry,seq
0,0,1,0,530782,4389390,POINT (530782 4389390),1
1,1,1,0,531119,4388860,POINT (531119 4388860),1
2,2,1,0,530403,4388580,POINT (530403 4388580),1
3,3,1,0,530503,4388620,POINT (530503 4388620),1
4,4,1,0,530729,4388930,POINT (530729 4388930),1
5,5,1,0,530606,4389150,POINT (530606 4389150),1


In [14]:
netgentor, net_info, total_iterations = ng.create_network_generator(df)


 iteration broken per group....

   group  num_loc  num_iter iter_type
0      1        6       100    sample

 total number of iterations.... 100


In [15]:
net_info

Unnamed: 0,group,num_loc,num_iter,iter_type
0,1,6,100,sample


Notice how the iteration type (iter_type) has changed from 'single' to 'sample' and the number of iterations (num_iter) has gone from 1 to 100 (the total number of iterations is now also 100). This value of 100 is the default number of iterations generated when the number of locations in a group is greater than 5.
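
The way the iteration type and count depend on a group can be summarized in a short sketch. It follows the behaviour described in this notebook (a ranked group yields a single ordering, a small unranked group yields all permutations, a larger one yields *num_samples* random orderings). `describe_group` and `max_perm_size` are hypothetical names, and the threshold of 5 is taken from the text above rather than from the library source.

```python
from math import factorial

def describe_group(num_loc, ranked, num_samples=100, max_perm_size=5):
    """Return (iter_type, num_iter) for one group, as described in this notebook."""
    if ranked:
        return "single", 1
    if num_loc <= max_perm_size:
        return "permutation", factorial(num_loc)
    return "sample", num_samples

ranked_six = describe_group(6, ranked=True)
unranked_six = describe_group(6, ranked=False)
unranked_four = describe_group(4, ranked=False)
print(ranked_six, unranked_six, unranked_four)
```

A group of six unranked locations has 6! = 720 possible orderings, which is presumably why sampling takes over from exhaustive permutation at that size.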

Here is an example of one of the possible network iterations,

In [16]:
list(next(netgentor))

[(5, 0, 2, 4, 3, 1)]
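
Orderings like the one above can be imitated with the standard library. The following is a sketch of sampling with repetition (the same ordering may be drawn more than once), not the code *NetSim* uses; `sample_orderings` and its `seed` parameter are hypothetical.

```python
import random

def sample_orderings(ids, num_samples, seed=None):
    """Yield num_samples random orderings of ids; the same ordering may recur."""
    rng = random.Random(seed)
    for _ in range(num_samples):
        yield tuple(rng.sample(ids, k=len(ids)))

orderings = list(sample_orderings([0, 1, 2, 3, 4, 5], num_samples=3, seed=42))
print(orderings)
```

Each yielded tuple contains every location id exactly once; only the order differs between samples.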

You can call the ```next()``` function on the network generator repeatedly until you reach *total_iterations*. For instance, the following code generates 5 new iterations:

In [17]:
for i in range(5):
    print(next(netgentor))

((4, 1, 5, 3, 0, 2),)
((3, 4, 1, 2, 5, 0),)
((4, 2, 0, 5, 3, 1),)
((1, 0, 3, 2, 5, 4),)
((3, 4, 0, 5, 2, 1),)
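
If the network generator behaves like a standard Python iterator (an assumption, not verified here), calling ```next()``` after the last iteration raises `StopIteration`. `itertools.islice` is one way to take a bounded number of iterations without handling that exception. The sketch below uses a stand-in iterator rather than a real network generator.

```python
from itertools import islice, permutations

# Stand-in for the network generator: a plain iterator with 6 items.
stand_in = permutations([0, 1, 2])

first_four = list(islice(stand_in, 4))  # take at most four iterations
remaining = list(islice(stand_in, 10))  # asks for 10, but only 2 are left
print(first_four)
print(remaining)
```

The second call simply returns what is left instead of raising an error.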


Let's change the number of groups so that we end up with two groups of two and four locations.

In [18]:
sel = df['id'].isin([2, 3, 4, 5])
df.loc[sel,'group']= 2
df

Unnamed: 0,id,group,mix,easting,northing,geometry,seq
0,0,1,0,530782,4389390,POINT (530782 4389390),1
1,1,1,0,531119,4388860,POINT (531119 4388860),1
2,2,2,0,530403,4388580,POINT (530403 4388580),1
3,3,2,0,530503,4388620,POINT (530503 4388620),1
4,4,2,0,530729,4388930,POINT (530729 4388930),1
5,5,2,0,530606,4389150,POINT (530606 4389150),1


Let's first re-run ```check()``` to make sure that our new dataframe is fine and then ```create_network_generator()```

In [19]:
ng.check(df)


 No corrections or errors !! 


Unnamed: 0,id,group,mix,easting,northing,geometry,seq
0,0,1,0,530782,4389390,POINT (530782 4389390),1
1,1,1,0,531119,4388860,POINT (531119 4388860),1
2,2,2,0,530403,4388580,POINT (530403 4388580),1
3,3,2,0,530503,4388620,POINT (530503 4388620),1
4,4,2,0,530729,4388930,POINT (530729 4388930),1
5,5,2,0,530606,4389150,POINT (530606 4389150),1


In [20]:
netgentor, net_info, total_iterations = ng.create_network_generator(df)


 iteration broken per group....

   group  num_loc  num_iter    iter_type
0      1        2         2  permutation
1      2        4        24  permutation

 total number of iterations.... 48


Notice how the iteration type (iter_type) has changed to 'permutation' and how the total number of iterations (48) is the product of the number of permutations in each group (2 × 24). Let's ask for a few iterations and see what they look like:

In [21]:
for i in range(5):
    print(next(netgentor))

((0, 1), (2, 3, 4, 5))
((0, 1), (2, 3, 5, 4))
((0, 1), (2, 4, 3, 5))
((0, 1), (2, 4, 5, 3))
((0, 1), (2, 5, 3, 4))
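
The structure of these iterations can be reproduced with `itertools`: one permutation per group, combined across groups as a Cartesian product. This sketch yields the same first five tuples as the output above, but whether *NetSim* builds its generator this way internally is an assumption; `all_iterations` is a hypothetical name.

```python
from itertools import permutations, product

groups = {1: [0, 1], 2: [2, 3, 4, 5]}

def all_iterations(groups):
    """Return an iterator of tuples, each holding one ordering per group."""
    per_group = [permutations(ids) for ids in groups.values()]
    return product(*per_group)

iterations = list(all_iterations(groups))
print(len(iterations))
for iteration in iterations[:5]:
    print(iteration)
```

The count follows directly from the group sizes: 2! orderings of the first group times 4! orderings of the second gives 48.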


This concludes the tutorial. The next tutorial explores the `generate` module further, focusing on the different types of network layouts that you can generate for a given sequence of locations.