# Introduction

To understand how Blockchain works, we must first discuss how distributed systems work.

The goal of this notebook is to explain the building blocks of any distributed system, and specifically of a peer-to-peer (P2P) system. A P2P environment is a network in which peers are connected to each other, exchange information, and run applications on top of it.

Our approach is as follows:
- We will build a system block by block.
- Distributed systems are hard because many errors appear seemingly at random: race conditions, deadlocks, and other nasty bugs.
- We will simplify the world to get to the juiciest parts without leaving out the essentials.


We will simulate the P2P environment locally to get an idea of how such systems work and to understand the design decisions that developers make.

To show the different trade-offs and why developers and researchers choose specific designs, we need a way to simulate such an environment. Here we will simulate the network and the message exchange with a discrete event simulator: [SimPy](https://simpy.readthedocs.io/en/latest/).
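
To get a feeling for what a discrete event simulator does, here is a minimal sketch written with the standard library only. It is not SimPy and not how `p2psimpy` is implemented; the class `TinySimulator` and the helper `send` are hypothetical names used just for illustration. The idea is that time does not flow continuously: the simulator jumps from one scheduled event to the next.

```python
import heapq
import itertools


class TinySimulator:
    """A toy discrete event loop: events are (time, callback) pairs kept in a heap."""

    def __init__(self):
        self.now = 0.0
        self._queue = []
        self._ids = itertools.count()  # tie-breaker so callbacks are never compared

    def schedule(self, delay, callback):
        """Run `callback` `delay` time units after the current simulated time."""
        heapq.heappush(self._queue, (self.now + delay, next(self._ids), callback))

    def run(self, until):
        """Process events in time order until the clock would pass `until`."""
        while self._queue and self._queue[0][0] <= until:
            self.now, _, callback = heapq.heappop(self._queue)
            callback()
        self.now = until


toy_sim = TinySimulator()
received = []


def send(sender, receiver, latency):
    # Delivering a message is just an event scheduled `latency` ms in the future.
    toy_sim.schedule(latency, lambda: received.append((toy_sim.now, sender, receiver)))


send('A', 'B', latency=12)
send('B', 'A', latency=3)
toy_sim.run(until=100)
print(received)
```

The message with the smaller latency is delivered first, even though it was sent second: the order of events is decided by the simulated clock, not by the order of the calls.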

For ease of use, you can use `p2psimpy`, a wrapper around SimPy that implements a P2P network simulation.


# Simulation of P2P environments

We start the notebook by introducing the backbone for such a simulation: `BaseSimulation`.
This class represents the network of peers together with their physical properties, such as `location` and [`bandwidth`](https://en.wikipedia.org/wiki/Bandwidth_(computing)). 

These properties both define and restrict how fast messages can be transferred and how messages travel through the network (which peers see a message, which channels are used).
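
As a rough illustration of how latency and bandwidth together limit the transfer speed, the sketch below estimates a delivery time as "latency plus message size divided by the slowest link". This is a simplified textbook model and an assumption on our side, not the formula used inside `p2psimpy`; `transfer_time_ms` and `MBIT` are hypothetical names.

```python
MBIT = 1_000_000  # bits per second; a local constant for this sketch only


def transfer_time_ms(size_bytes, latency_ms, sender_ul_bps, receiver_dl_bps):
    """Estimate delivery time: latency plus the time needed to push the bytes
    through the slower of the sender's upload and the receiver's download link."""
    bottleneck_bps = min(sender_ul_bps, receiver_dl_bps)
    serialization_ms = size_bytes * 8 * 1000 / bottleneck_bps
    return latency_ms + serialization_ms


# A 1 MB message over a 12 ms link, 50 Mbit/s upload, 20 Mbit/s download
print(transfer_time_ms(1_000_000, 12, 50 * MBIT, 20 * MBIT))
```

Note how for large messages the bandwidth term dominates, while for small messages (pings, handshakes) the latency dominates.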


`BaseSimulation` takes three main parameters: 
 - *Locations*
 - *Network Topology* 
 - *Services and Message Processors* 

We will explain them one by one. 




### Simulating latencies between locations

The simulator simulates connection delays (`latency`) between the nodes in the physical space (`location`).
This is a very rough representation of reality but is sufficient for us. You can read more about jitter and latency [here](https://www.tpx.com/blog/latency-jitter-and-packet-loss/) or [here](https://www.youtube.com/watch?v=WdbJdUh6W08).   



Locations are defined as a `Config` class. Its variables can be constant or probabilistic (drawn from a probability distribution provided by scipy).
A `Config` class works as a generator of parameters and can be saved to and loaded from a [YAML](https://en.wikipedia.org/wiki/YAML) file.

Location configuration should contain at least two fields: `locations` and `latencies`. 
- `locations` is an array or a tuple with names of locations.
- `latencies` is a dictionary (matrix) with latencies of pairwise connections.

Depending on the experiments, latencies can be constant, but a more realistic model is a probabilistic distribution.
Such a distribution is represented with the `Dist` class. 

The `Dist` class is a wrapper around [scipy.stats](https://docs.scipy.org/doc/scipy/reference/stats.html), which has a rich collection of distributions. Use the [scipy.stats documentation](https://docs.scipy.org/doc/scipy/reference/stats.html) as a reference when choosing a distribution function.   

Finally, `Config.get()` samples all the parameters from their distributions and returns them as a dictionary.  
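
Before using the real classes, it may help to see the idea behind "sampling a configuration" in plain Python. The sketch below is only an assumption about the general technique, not the `p2psimpy` implementation: the function `sample` is hypothetical, and distributions are represented as zero-argument callables instead of `Dist` objects.

```python
import random


def sample(value):
    """Recursively replace every zero-argument callable with one drawn value."""
    if callable(value):
        return value()
    if isinstance(value, dict):
        return {key: sample(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return type(value)(sample(item) for item in value)
    return value  # constants pass through unchanged


rng = random.Random(42)  # seeded so that runs are reproducible

latencies = {
    'LocA': {'LocA': lambda: rng.gauss(2, 0.5), 'LocB': lambda: rng.gauss(12, 2)},
    'LocB': {'LocB': 1},
}

first = sample(latencies)
second = sample(latencies)
print(first)
print(second)
```

Each call draws fresh values for the probabilistic entries and leaves the constant entries untouched.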

Let's try it out:

In [None]:
from p2psimpy.config import *


class ConstLocations(Config):
    '''A configuration with 2 locations: LocA and LocB with constant latencies.'''
    locations = ['LocA', 'LocB']
    latencies = {
        'LocA': {'LocB': 10, 'LocA': 2},
        'LocB': {'LocB': 1}
    }    

class DistLocations(Config):
    '''A configuration with 2 locations: LocA and LocB with variable latencies defined with a statistical function.'''
    locations = ['LocA', 'LocB']
    latencies = {
        'LocB': {'LocB': Dist('gamma', (1, 1, 1))},
        'LocA': {'LocB': Dist('norm', (12, 2)), 'LocA': Dist('norm', (2, 0.5))},
    }    
    

In [None]:
ConstLocations.get()

In [None]:
DistLocations.get()

Every time you call `Config.get()`, it returns a new sample:

In [None]:
DistLocations.get()

To visualize a latency sample, we can represent it as a pandas DataFrame, which gives a readable table:

In [None]:
import pandas as pd 

lat_sample = pd.DataFrame(DistLocations.get()['latencies'])
lat_sample

*Notice that the table has **NaN** values. Missing entries are not automatically filled with their symmetric counterparts.*
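
If you work with the raw dictionary rather than a table, one option is to fall back to the reverse direction at lookup time. This is a minimal sketch under the assumption that latencies are symmetric; `latency_between` is a hypothetical helper, and the source does not state whether `p2psimpy` resolves missing entries this way internally.

```python
def latency_between(latencies, src, dst):
    """Look up a pairwise latency, falling back to the reverse direction."""
    try:
        return latencies[src][dst]
    except KeyError:
        return latencies[dst][src]


latencies = {
    'LocA': {'LocB': 10, 'LocA': 2},
    'LocB': {'LocB': 1},
}
# 'LocB' -> 'LocA' is not specified, so the 'LocA' -> 'LocB' value is used
print(latency_between(latencies, 'LocB', 'LocA'))
```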

We can transform the table to fill them in symmetrically. 

In [None]:
lat_sample[pd.isnull(lat_sample)] = lat_sample.T[pd.isnull(lat_sample)]
lat_sample


### Visualizing the data


 1. Prepare the data: 
  - Transfer to a pandas DataFrame
  - Fill null values with the symmetric counterpart
  - Transform into a list of tuples: 'from', 'to', 'value' 
  
 2. Get 100 samples from the configuration with latencies
 
 3. Visualize in a grid of histograms


The visualization shows the latency matrix for our two locations: 


In [None]:

import matplotlib.pyplot as plt
import seaborn as sns
sns.set(style='darkgrid')

# 1. Prepare the data
def prepare(x):
    x = pd.DataFrame(x)
    x[pd.isnull(x)] = x.T[pd.isnull(x)]
    return x.stack().reset_index()\
                    .rename(columns={'level_0':'from', 'level_1':'to', 0: 'value'})
                    
# 2. Get 100 samples
df = pd.concat([prepare(DistLocations.get()['latencies']) for _ in range(100)], ignore_index=True)

#df 

# 3. Visualize in a grid of histograms
g = sns.FacetGrid(df, col='to', row='from', height=4, aspect=1)
g = g.map(sns.histplot, 'value')


----
## Peers and Topologies 

When designing a system, the architect should usually think about the network topology: *How are peers connected to each other?*


The simulation can work in two modes: 

1. **NetworkX**. You know which topology you want to simulate and pass it to the simulation as a graph.

2. **Emergent**. Peers connect to each other without a predefined topology. A topology emerges from local rules; for example, peers find each other through bootstrap peers and then exchange data directly. Unstructured peer-to-peer topologies typically fall into this category. You can read more [here](https://en.wikipedia.org/wiki/Bootstrapping_node), or read how the [Bitcoin network](https://en.bitcoin.it/wiki/Network) is formed.  


### NetworkX topology

Let's first see how we can form a network topology.


In [None]:
import networkx as nx

peer_num = 10

# Generate a random network topology 
G = nx.erdos_renyi_graph(peer_num, 0.5)

# Assign a type to each peer. The type is later used to decide which services to run.
nx.set_node_attributes(G, {k: 'basic' for k in G.nodes()}, 'type')

sns.set(style='dark')
nx.draw_networkx(G)


NetworkX is a library with rich semantics to work with graphs. You can read more [here](https://networkx.github.io/documentation/stable/reference/generators.html).  

You can generate any graph you like and set `type` as a node attribute. `type` is the name of the peer type. We will explain later how it is used in a simulation.  
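
To see what the random graph generator used above does conceptually, here is a standard-library sketch of the G(n, p) model: every possible pair of peers is connected independently with probability p. It is an illustration only, not the NetworkX implementation; `gnp_topology` is a hypothetical helper that returns plain Python structures instead of a graph object.

```python
import itertools
import random


def gnp_topology(n, p, peer_type='basic', seed=None):
    """Build a G(n, p) random graph: each possible edge is kept with probability p."""
    rng = random.Random(seed)
    # Every node carries a 'type' attribute, mirroring the node attribute set above
    nodes = {node: {'type': peer_type} for node in range(n)}
    edges = [pair for pair in itertools.combinations(range(n), 2) if rng.random() < p]
    return nodes, edges


nodes, edges = gnp_topology(10, 0.5, seed=1)
print(len(nodes), 'peers,', len(edges), 'connections')
```

With 10 peers there are 45 possible connections, so with p = 0.5 we expect roughly half of them to be present in a typical sample.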



### Emergent topology

What if the topology is not known?

In this case, we can model the peer discovery mechanism through *bootstrapping* peers. The general bootstrap process is the following: 
1. Bootstrap nodes are created first and introduced to the network. Their goal is to maintain a list of peers and connect them with each other upon request. Bootstrap nodes are usually known in advance, for example, hardcoded in the supplied code. 
2. The bootstrap procedure is as follows: a) a new peer joins the network, b) the peer sends a `Hello` message to one of the bootstrap nodes, c) the bootstrap node responds to the peer with a random sample of known online peers. 
3. The peer connects with other peers from the given sample.
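
The steps above can be sketched in a few lines of plain Python. This is a simplified model under our own assumptions (no network, no offline peers, a fixed sample size), not the bootstrap logic shipped with `p2psimpy`; the class `BootstrapNode` and its method `hello` are hypothetical names.

```python
import random


class BootstrapNode:
    """Keeps track of known peers and introduces newcomers to a random subset."""

    def __init__(self, sample_size=3, seed=None):
        self.known_peers = set()
        self.sample_size = sample_size
        self._rng = random.Random(seed)

    def hello(self, peer_id):
        """Handle a `Hello`: reply with a random sample of peers, then remember the sender."""
        candidates = sorted(self.known_peers - {peer_id})
        k = min(self.sample_size, len(candidates))
        introduced = self._rng.sample(candidates, k)
        self.known_peers.add(peer_id)
        return introduced


bootstrap = BootstrapNode(sample_size=3, seed=7)
connections = {}
for peer_id in range(1, 7):
    # Steps a) to c): the new peer joins, says hello, and receives a sample of peers
    connections[peer_id] = bootstrap.hello(peer_id)

print(connections)
```

The first peer receives an empty sample because nobody else is known yet; later peers receive up to `sample_size` earlier peers, and this is how a topology emerges from a purely local rule.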


To run the default bootstrap, pass a dictionary of the form `peer_id` -> {`type`: type_name} as the `topology` parameter.
This is all abstracted away in the simulation, but you can specify your own bootstrap logic later. 


<!-- ### Explain more

Discovery  -->

-------
## Peer types and services

Finally, our simulation needs messages and services to process these messages. In `p2psimpy` we model this through manager classes and `PeerType`.

Below we show an example of a simple map with one peer type, `basic`: 

- A peer is described with a `PeerType(PeerConfig, services)`. 
    - `PeerConfig` describes the physical capacities of the node: bandwidth for the messages that go through (upload and download) and location (one of those specified before).
    - `services` is a collection of services that the peer should run. There are some standard services that you can use in your simulation, but you will also implement your own later!
- At a minimum, a peer should have a connection service to connect to the network and respond to introduction messages. These are the standard connection services available to you: 
  - `BaseConnectionManager`: a simple connection service that can connect, ping other peers, and disconnect unresponsive peers. This is the recommended service for the **NetworkX** topology approach.  
  - `P2PConnectionManager`: an extended `BaseConnectionManager` that additionally keeps the number of connections between the specified `min_peers` and `max_peers`. If the number of local connections is lower than `min_peers`, the service will actively poll and ask other peers for new connections. If the number of local connections is higher than `max_peers`, the service will refuse all new connections and disconnect the slowest peers. This is the recommended service for the **Emergent** topology approach. 
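
The `min_peers` / `max_peers` rule described above boils down to a small decision function. The sketch below only restates that rule in code; `connection_action` and the returned labels are hypothetical and are not part of the `p2psimpy` API.

```python
def connection_action(num_connections, min_peers, max_peers):
    """Decide what a peer should do given its current number of connections."""
    if num_connections < min_peers:
        return 'request_peers'  # too few: actively ask other peers for connections
    if num_connections > max_peers:
        return 'drop_slowest'   # too many: refuse new ones and drop the slowest
    return 'steady'             # within bounds: nothing to do


for n in (2, 6, 9):
    print(n, connection_action(n, min_peers=4, max_peers=8))
```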

Let's see how it looks in code:


In [None]:
from p2psimpy.config import *
from p2psimpy.consts import *
    
class PeerConfig(Config):
    # Location of the peer - random location from the locations specified earlier 
    location = Dist('sample', DistLocations.locations)
    # Bandwidth is normally distributed with a mean of 50 Mbit and a standard deviation of 10 Mbit
    bandwidth_ul = Dist( 'norm', (50*MBit, 10*MBit))
    bandwidth_dl = Dist( 'norm', (50*MBit, 10*MBit))

# Add a connection manager: BaseConnectionManager periodically pings
# neighbours and disconnects the unresponsive ones.
from p2psimpy.services.connection_manager import BaseConnectionManager

services = (BaseConnectionManager,)
# We have one peer role: basic
peer_types = {'basic': PeerType(PeerConfig, services)}

How are these configurations used? A `Config` provides parameter values: 
it works as a generator (each `get()` call samples fresh values) and as an attribute descriptor.  

----

# Putting it all together


Let's combine everything you have learned so far to run a simple simulation:

1. Define locations and simulation world parameters.
2. Define topology and number of peers with their types.  
3. Define what each peer type is with a `PeerConfig` and what it does with services. 


Once all of this is defined, we can create a simulation object and run it with `.run(time)`.

The simulation has its own internal clock and a scheduler for all events. The parameter `time` is the time until which the simulation runs. 


**The time is defined in milliseconds!**





## Random given topology 

Let us first try a given random topology, generated as a NetworkX graph:

In [None]:
from p2psimpy.config import *
from p2psimpy.consts import *
import networkx as nx

class Locations(Config):
    locations = ['LocA', 'LocB']
    latencies = {
        'LocB': {'LocB': Dist('gamma', (1, 1, 1))},
        'LocA': {'LocB': Dist('norm', (12, 2)), 'LocA': Dist('norm', (2, 0.5))},
    } 

# Number of nodes
N = 10
    
# Generate network topology 
G = nx.erdos_renyi_graph(N, 0.5)
# Assign a peer type to the peers 
nx.set_node_attributes(G, {k: 'basic' for k in G.nodes()}, 'type')

class PeerConfig(Config):
    location = Dist('sample', Locations.locations)
    bandwidth_ul = Dist( 'norm', (50*MBit, 10*MBit))
    bandwidth_dl = Dist( 'norm', (50*MBit, 10*MBit))

# Add a connection manager: BaseConnectionManager periodically pings
# neighbours and disconnects the unresponsive ones.
from p2psimpy.services.connection_manager import BaseConnectionManager
# For each service you can define its own configuration or use the default values.

services = (BaseConnectionManager,)
peer_types = {'basic': PeerType(PeerConfig, services)}

# Display the topology 
nx.draw_networkx(G)

In [None]:
from p2psimpy.simulation import Simulation

# Create the simulation with the logger enabled; logs are saved in the 'logs' directory.
sim = Simulation(Locations, G, peer_types, enable_logger=True, logger_dir='logs')

# Let's run the simulation for 5 seconds
sim.run(5_000)

----------------------------

What's next? There is no output.

We ran the simulation with the logger enabled so that we can see all the messages exchanged in the network. 

The output is written to the `logs` directory. Let's look at it. 
Each peer writes its own log of the events that happen to it, such as receiving a message or connecting to a peer. 
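
If you want an overview of all peer logs at once, a small helper like the one below may be handy. It is a sketch that only assumes the logs are plain text files with a `.log` extension inside one directory; `read_logs` is a hypothetical helper, not part of `p2psimpy`, and it makes no assumption about the format of the log lines.

```python
from pathlib import Path


def read_logs(log_dir):
    """Return {file name: file contents} for every `.log` file in `log_dir`."""
    return {path.name: path.read_text() for path in sorted(Path(log_dir).glob('*.log'))}


# Print how many lines each peer has logged (prints nothing if there are no logs yet)
for name, content in read_logs('logs').items():
    print(f'{name}: {len(content.splitlines())} lines')
```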

For example, here is the log of peer 1:

In [None]:
with open('logs/Peer_1:basic.log') as s:
    print(s.read())

## Emergent topology

Now let's try an emergent topology. We will use the same configuration, but use bootstrap peers for discovery and for building the network. For the connection manager we will use `P2PConnectionManager`.  


In [None]:
from p2psimpy.services.connection_manager import P2PConnectionManager

num_peers = 10

topology_specs = {i:{'type': 'basic'} for i in range(1, num_peers+1)}

class ConnectionConfig(Config):
    min_peers = 4
    max_peers = 8

peer_types = {'basic': PeerType(PeerConfig, {P2PConnectionManager: ConnectionConfig})}


In [None]:
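# Enable the logger explicitly, as in the first run, so that the logs read below are written
sim = Simulation(Locations, topology_specs, peer_types, enable_logger=True, logger_dir='logs2')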

# Let's run the simulation for 5 seconds
sim.run(5_000)

#### The peers form a random topology on their own

In [None]:
G1 = sim.current_topology()
nx.draw_networkx(G1)

In [None]:
with open('logs2/Peer_1:basic.log') as s:
    print(s.read())

------------------
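
One way to check whether an emergent topology respects the configured connection limits is to count the connections of each peer. The sketch below works on a plain edge list; in this notebook such a list could come from `G1.edges()`, while the example edges here are made up for illustration. `degrees` and `out_of_bounds` are hypothetical helpers.

```python
from collections import Counter


def degrees(edges):
    """Count how many connections each peer has in an undirected edge list."""
    counts = Counter()
    for a, b in edges:
        counts[a] += 1
        counts[b] += 1
    return dict(counts)


def out_of_bounds(edges, min_peers, max_peers):
    """Return the peers whose degree falls outside [min_peers, max_peers]."""
    return {peer: degree for peer, degree in degrees(edges).items()
            if not min_peers <= degree <= max_peers}


# Hypothetical edge list, used only to demonstrate the helpers
example_edges = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4), (4, 5)]
print(out_of_bounds(example_edges, min_peers=2, max_peers=3))
```

Peers without any connection do not appear in an edge list, so they have to be checked separately.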

Now you know how to work with a simulation. In the next notebook we will take a look at services and implement our own.  


## Your experiments here 


1. Change the latency between the locations and look at the logs.  
2. Change the rules for the discovery

