
# Beyond CFT: Byzantine behavior

Beyond ordinary crashes, a peer can violate the protocol in various ways: hide transactions, send bogus data, create Sybil identities, etc.

A goal of a blockchain system is to withstand a powerful adversary.


To make sure a peer eventually sees every message, it must fetch the missing data from neighboring peers once it is back online. But what if the neighboring nodes are malicious and hide certain transactions?
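The recovery step and the hiding attack can be sketched in plain Python. This is a toy model and not the p2psimpy API: `recover_missing`, the store dictionaries, and the message ids are hypothetical names chosen for illustration. It only shows that a peer relying on a single malicious neighbor can miss a transaction, while merging the answers of several neighbors recovers it as long as one of them is honest.

```python
def recover_missing(local_store, neighbour_stores):
    """Merge messages offered by neighbours into a copy of the local store.

    Hypothetical helper: stores are plain dicts {msg_id: data}.
    A value that is already present is never overwritten.
    """
    recovered = dict(local_store)
    for store in neighbour_stores:
        for msg_id, data in store.items():
            recovered.setdefault(msg_id, data)
    return recovered


# Full log as an honest neighbour would hold it (made-up ids and payloads)
full_log = {"1_26": "tx1", "2_26": "tx2", "3_26": "tx3"}
honest = dict(full_log)
# A malicious neighbour that hides message 2_26
hiding = {k: v for k, v in full_log.items() if k != "2_26"}

only_malicious = recover_missing({}, [hiding])
mixed = recover_missing({}, [hiding, honest])

print(sorted(only_malicious))
print(sorted(mixed))
```

Note that this only addresses hidden messages. A neighbor that answers with altered data is a different problem.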

In the next notebooks we will cover techniques that help to detect malicious nodes or prevent their attacks.

# Byzantine gossip agent 


One of the goals of a blockchain system is to record transactions in a 'hard-to-tamper' way.

How can you achieve that in a p2p setting?  
It is common in databases to use Merkle trees and signatures to verify the integrity of a transaction. **[TODO]**
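As a rough illustration of the Merkle tree idea, the sketch below computes a Merkle root with `hashlib` from the standard library. It is a minimal sketch under stated assumptions, not the scheme used by p2psimpy or by any particular blockchain: the helper names and the transaction strings are made up, and duplicating the last hash on odd levels is just one common convention. Signatures are left out entirely.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves):
    """Return the Merkle root (32 bytes) of a non-empty list of byte strings."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = [sha256(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Assumption: pad odd levels by repeating the last hash
            level.append(level[-1])
        level = [sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]


# Made-up transactions for illustration
txs = [b"alice->bob:5", b"bob->carol:2", b"carol->dave:1"]
root = merkle_root(txs)

# Changing any single transaction changes the root
tampered_root = merkle_root([b"alice->bob:500"] + txs[1:])
print(root != tampered_root)
```

A peer that learns the root from a source it trusts can recompute it over the received transactions and notice when any of them was altered.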



Let's first create a byzantine agent that will change the data of received transactions to split the network.  


In [1]:
# Initialize the experiment:
import networkx as nx
import p2psimpy as p2p
import warnings
warnings.filterwarnings('ignore')

# Load the previous experiment configurations
exper = p2p.BaseSimulation.load_experiment(expr_dir='crash_gossip')

Locations, topology, peer_services, serv_impl = exper


## Define byzantine agents 
Let's first assign the byzantine nodes; we pick them at random:

In [18]:
# Change a random fraction of peers to byzantine
from random import sample

frac_byzantine_nodes = 0.3  # 30% of the peers become byzantine


def assign_byzantine_peers(topology, byz_frac):
    type_dict = nx.get_node_attributes(topology, 'type')
    # Collect all nodes currently typed as 'peer'
    peers = [node for node, node_type in type_dict.items() if node_type == 'peer']
    # Pick the byzantine nodes uniformly at random, without replacement
    byz_nodes = sample(peers, int(byz_frac * len(peers)))
    for b in byz_nodes:
        type_dict[b] = 'byzantine'

    nx.set_node_attributes(topology, type_dict, 'type')

assign_byzantine_peers(topology, frac_byzantine_nodes)

## Define byzantine services 

In this notebook byzantine agents will change the content of a message. 

In [19]:
from p2psimpy.messages import *

class ByzantineGossipService(p2p.GossipService):
    
    
    def handle_message(self, msg):
        # Store the original message locally
        self.peer.store('msg_time', msg.id, self.peer.env.now)
        self.peer.store('msg_data', msg.id, msg.data)

        if msg.ttl > 0:
            # Relay the message further, modifying it for some of the peers
            exclude_peers = {msg.sender} | self.exclude_peers
            
            # Send the original message to half of the fanout
            selected = self.peer.gossip(GossipMessage(self.peer, msg.id, msg.data, msg.ttl-1), 
                                        self.fanout//2, 
                                        except_peers=exclude_peers, 
                                        except_type=self.exclude_types)
            # Change the message and send it to the other half
            new_data = 'ChangedString'
            exclude_peers = exclude_peers | set(selected)
            self.peer.gossip(GossipMessage(self.peer, msg.id, new_data, msg.ttl-1), 
                             self.fanout//2, 
                             except_peers=exclude_peers, 
                             except_type=self.exclude_types)
            

##  Add byzantine type and services 

We deliberately keep byzantine nodes uncrashable. 


In [20]:
gossip_config = peer_services['peer'].service_map['RangedPullGossipService']
serv_impl['RangedPullGossipService'] = p2p.GossipService


peer_services['byzantine'] = p2p.PeerType(peer_services['peer'].config,
                                      {p2p.BaseConnectionManager:None,
                                       ByzantineGossipService: gossip_config}
                                     )

## Run simulation 

Let's see how byzantine agents together with crashing nodes affect the message dissemination. 

In [21]:
# Init Graph
sim = p2p.BaseSimulation(Locations, topology, peer_services, serv_impl)
sim.run(5_200)

# Analyze the storage data




## Message data

Let's see how this fraction of byzantine nodes affected the network. 



In [22]:
import pandas as pd

def message_data(sim, peer_id, storage_name):
    store = sim.peers[peer_id].storage[storage_name].txs
    for msg_id, tx in store.items():
        msg_num, client_id = msg_id.split('_')
        client_tx = sim.peers[int(client_id)].storage[storage_name].txs[msg_num]
        yield (int(msg_num), tx.data == client_tx.data)
        
def get_gossip_table(sim, storage_name, func):
    return pd.DataFrame({k: dict(func(sim, k, storage_name)) 
                         for k in set(sim.types_peers['peer'])}).sort_index()

    
df = get_gossip_table(sim, 'msg_data', message_data)
df

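| msg_num | 1 | 2 | 3 | 4 | 5 | 7 | 8 | 9 | 10 | 12 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 22 | 23 | 25 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | True | True | True | True | True | True | True | True | False | True | True | True | True | True | True | True | True | True | True | True |
| 2 | True | True | True | True | True | True | True | True | False | True | True | True | True | True | True | True | True | True | True | True |
| 3 | True | True | True | True | True | True | True | True | True | True | True | True | True | True | False | True | True | True | True | True |
| 4 | True | True | True | True | False | True | True | True | False | True | True | True | False | True | True | True | False | True | False | False |
| 5 | True | True | True | True | True | True | True | False | True | False | True | True | True | True | True | True | True | True | True | False |
| 6 |  |  | True | True | True | True | True | True | True | True | True |  | True | False | True | True | True | True | True | True |
| 7 | True | True | True | True | True | True | True |  | True | True | True | True | True | True | True | True | True | True | True | True |
| 8 | True | True |  | False | True | True |  | False | True | False | True | True | True | False |  | True | True | True | True | True |
| 9 | True | True | False | True | True | True | True | True | False | False | True | True | True | True | True | True | True | False | True | True |
| 10 | True | True | True | True | True | True | True | True | True | True | False | True | True | True | True | True | True | True | True | True |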


In [23]:
df[df==False].count()

1     0
2     1
3     1
4     4
5     2
7     2
8     1
9     4
10    7
12    4
14    1
15    0
16    2
17    5
18    1
19    0
20    3
22    1
23    2
25    3
dtype: int64

Byzantine nodes managed to trick some peers into accepting wrong data. Because peers store the 'first-seen' value, an adversary that gains an advantage over the network can perfectly split it.
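The effect of the 'first-seen' rule can be reproduced in a toy model without the simulator. The sketch below is only an illustration under simplifying assumptions: `deliver`, the peer names, and the message id are hypothetical, and the byzantine relay is assumed to reach every peer before any honest copy does.

```python
def deliver(storage, peer, msg_id, data):
    """Store data for msg_id at peer unless a value is already stored (first-seen rule)."""
    storage.setdefault(peer, {}).setdefault(msg_id, data)


storage = {}
peers = ["p1", "p2", "p3", "p4"]
original, forged = "tx-payload", "ChangedString"

# The byzantine relay is faster: half of the peers get the original,
# the other half get the forged payload
half = len(peers) // 2
for peer in peers[:half]:
    deliver(storage, peer, "1_0", original)
for peer in peers[half:]:
    deliver(storage, peer, "1_0", forged)

# The honest copy arrives later and is ignored wherever a value is already stored
for peer in peers:
    deliver(storage, peer, "1_0", original)

views = {peer: storage[peer]["1_0"] for peer in peers}
print(views)
# {'p1': 'tx-payload', 'p2': 'tx-payload', 'p3': 'ChangedString', 'p4': 'ChangedString'}
```

Both halves keep their own version indefinitely, because a later correct copy never replaces the stored one.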

How to deal with this? 

In general, the answer is to ensure "consensus" **[TODO]**.



# TBA

