## How to create a custom blockchain in FlexBlock
This notebook shows how to create a custom blockchain in FlexBlock. We will use it to train a linear regression model on the diabetes dataset from scikit-learn. First we load the data and create the corresponding `FedDataset`.

This notebook assumes previous experience with the Flex library for federated learning experiments. If you are not familiar with it, we recommend working through the Flex tutorials first.

In [1]:
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split
from flex.data import Dataset, FedDataDistribution
import numpy as np

# Load the diabetes dataset
diabetes = load_diabetes()

# Generate train-test splits
X_train, X_test, y_train, y_test = train_test_split(
    diabetes.data[:, np.newaxis, 2], diabetes.target, test_size=0.33, random_state=42
)

# We use the training set for mining, since we are supposed to learn from it too
train_diabetes = Dataset.from_array(X_train, y_train)
federated_diabetes = FedDataDistribution.iid_distribution(train_diabetes, n_nodes=5)

Now let's define our custom block and blockchain structures. In FlexBlock we can define a custom block by inheriting from the `Block` class and implementing the `__init__` and `compute_hash` methods. The `__init__` method receives the weights stored by the block, which should be passed to the parent class constructor. We can also add additional attributes to the block, for example the mining difficulty or the number of iterations used to train the model. The `compute_hash` method should return a hash of the block. In our case we will just return a constant string. In a real application we would compute the hash from the block attributes using SHA-256 or another cryptographically secure hash function.
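
As a rough illustration of the real-application case, the sketch below hashes a block's contents with SHA-256 from the Python standard library. The helper `hash_block_fields` and its `previous_hash` and `timestamp` parameters are hypothetical and not part of FlexBlock; inside `compute_hash` one would pass whatever attributes the block actually stores, converting NumPy arrays with `.tolist()` first.

```python
import hashlib
import json


def hash_block_fields(weights, previous_hash, timestamp):
    """Return a SHA-256 hex digest of the given block fields.

    All three parameters are illustrative assumptions about what a block stores.
    """
    payload = json.dumps(
        {
            "weights": weights,  # must be JSON-serializable, e.g. nested lists
            "previous_hash": previous_hash,
            "timestamp": timestamp,
        },
        sort_keys=True,  # stable key order so equal blocks hash equally
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


digest = hash_block_fields([[0.5], [1.25, -3.0]], "0" * 64, 1700000000)
same_digest = hash_block_fields([[0.5], [1.25, -3.0]], "0" * 64, 1700000000)
other_digest = hash_block_fields([[0.5], [1.25, -3.1]], "0" * 64, 1700000000)
```

Identical fields always give the same digest, while any change to the weights gives a different one, which is the property a constant string such as `"MyCustomHash"` cannot provide.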

The blockchain is defined by inheriting from the `Blockchain` class and implementing the `__init__` and `add_block` methods. The `__init__` method should call the parent class constructor with the genesis block given as an argument; we can add more parameters through `**kwargs`. Note that we are inheriting from the `Blockchain` class with `MyCustomBlock` as a type parameter. This is not necessary, but it gives us type hinting by default. The `add_block` method should add a new block to the blockchain. Here one should always call the parent method, but we can add additional logic around it. For example, we can check that the block is valid before adding it to the blockchain, or adjust the mining difficulty.

In [2]:
from flexBlock.blockchain import Blockchain, Block

class MyCustomBlock(Block):
    def __init__(self, weights):
        # Here we can add what we want to store on the block
        super().__init__(weights)
    
    def compute_hash(self):
        # Always return the same hash just for testing 
        return "MyCustomHash"

class MyCustomBlockchain(Blockchain[MyCustomBlock]):
    # Remember to always define an __init__ method, since it is abstract by default
    def __init__(self, genesis_block: MyCustomBlock, *args, **kwargs):
        # We initialize the blockchain with the custom genesis block
        super().__init__(genesis_block)
    
    def add_block(self, block: MyCustomBlock):
        # Here we can add custom logic such as adjusting the blockchain difficulty based on the previous blocks
        return super().add_block(block)
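
The kind of extra logic an overridden `add_block` might run before delegating to the parent can be sketched in isolation. `ToyBlock` and `ToyChain` below are hypothetical stand-ins written only for this illustration; they do not reproduce the FlexBlock `Block` or `Blockchain` classes, and the `previous_hash` attribute is an assumption.

```python
class ToyBlock:
    """Hypothetical stand-in for a block; not the FlexBlock `Block` class."""

    def __init__(self, weights, previous_hash):
        self.weights = weights
        self.previous_hash = previous_hash

    def compute_hash(self):
        # Toy hash for illustration only; a real block would use SHA-256
        return f"{self.previous_hash}|{self.weights!r}"


class ToyChain:
    """Hypothetical stand-in for a chain that validates before appending."""

    def __init__(self, genesis_block):
        self.blocks = [genesis_block]

    def add_block(self, block):
        # Reject blocks that do not point at the current tip of the chain
        if block.previous_hash != self.blocks[-1].compute_hash():
            raise ValueError("block does not extend the current chain tip")
        self.blocks.append(block)


genesis = ToyBlock(weights=[], previous_hash="")
chain = ToyChain(genesis)
chain.add_block(ToyBlock(weights=[1.0], previous_hash=genesis.compute_hash()))

try:
    chain.add_block(ToyBlock(weights=[2.0], previous_hash="not-the-tip"))
    rejected = False
except ValueError:
    rejected = True
```

In `MyCustomBlockchain.add_block`, a comparable check would sit before the `super().add_block(block)` call; how the previous hash is exposed on a real FlexBlock block is not shown in this notebook.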

Now let's create our custom `BlockchainPool`. This is done by inheriting from the `BlockchainPool` class (note also the type hinting here) and implementing the `__init__`, `consensus_mechanism` and `pack_block` methods. The `__init__` method should call the `initialize_pool` method with a blockchain and a `FlexPool` of our choice; we can add more parameters through `**kwargs`. In our case we will create a `p2p_pool` for the underlying pool and a genesis block with no weights. In FlexBlock the underlying pool is used to define the clients and miners of the experiment, where the aggregator role denotes the miners of the network.

The `consensus_mechanism` method defines the consensus mechanism of the network, that is, a function that returns which miner will pack the block and do the aggregation. Here `miners` is a dictionary of the form `{miner_id: miner}`, where `miner_id` is the id of the miner and `miner` is a `FlexModel`. Remember that these miners are in fact the aggregators of the underlying pool. Inside this method we can access the blockchain through `self.blockchain`, and gather further information through any extra method we define. In our case we will just return a random miner.

Finally, the `pack_block` method defines how the block is packed. Here we should return a `Block` object with the weights of the model and any other information we want to store. In our case we will just return a `MyCustomBlock` with the weights of the model.
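
Any rule that maps the `miners` dictionary to one of its keys can play the role of a consensus mechanism. As a contrast to random selection, here is a minimal round-robin sketch. `round_robin_miner` and `round_index` are hypothetical names introduced for this example; inside `consensus_mechanism` the round index could, for instance, be derived from the current chain length, though how to read that from `self.blockchain` is an assumption not covered here.

```python
def round_robin_miner(miners, round_index):
    """Pick a miner key deterministically, cycling through the sorted keys."""
    keys = sorted(miners.keys())
    return keys[round_index % len(keys)]


# Placeholder values; in the pool each value would be a miner's FlexModel
miners = {0: "model-0", 1: "model-1", 2: "model-2"}
selected = [round_robin_miner(miners, r) for r in range(5)]
print(selected)  # [0, 1, 2, 0, 1]
```

A deterministic rule like this makes runs reproducible, whereas the random choice used in this notebook gives a different miner sequence on each run.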

In [3]:
from flexBlock.pool import BlockchainPool, PoolConfig
from flex.data import FedDataset
from flex.pool import FlexPool
from typing import Callable
from random import choice

class MyCustomBlockchainPool(BlockchainPool[MyCustomBlockchain]):
    def __init__(self, flex_dataset: FedDataset, init_func: Callable, *args, **kwargs):
        # First we create the underlying flex pool that will be managed
        pool = FlexPool.p2p_pool(flex_dataset, *args, init_func=init_func, **kwargs)
        # Then the blockchain with its custom genesis block
        blockchain = MyCustomBlockchain(MyCustomBlock([]))
        # WARNING: Always call self.initialize_pool()
        self.initialize_pool(
            blockchain,
            pool,
            config=PoolConfig(aggregate_before_acc=True, gossip_before_agg=False),
            **kwargs,
        )

    # Now let's define our consensus mechanism
    def consensus_mechanism(self, miners, *args, **kwargs):
        # We need to return a miner key, in our case our consensus mechanism is just a random choice
        keys = list(miners.keys())
        selected = choice(keys)
        print(f"Selected miner: {selected}")
        return selected
    
    # Also we need to define how we are going to pack the block
    def pack_block(self, weights):
        # Just returning the new block
        return MyCustomBlock(weights)


Now that we have everything defined, we will write some boilerplate for our federated learning experiment.

In [4]:
from flex.pool import aggregate_weights, init_server_model, deploy_server_model, set_aggregated_weights
from flex.model import FlexModel
from sklearn.linear_model import LinearRegression
import copy

from flexBlock.pool import send_weights_to_miner, deploy_miner_model

@aggregate_weights
def aggregate(list_of_weights: list):
    return np.mean(np.asarray(list_of_weights, dtype=object), axis=0)


def train(client_flex_model: FlexModel, client_data: Dataset):
    client_flex_model["model"].fit(client_data.X_data, client_data.y_data)


@send_weights_to_miner
def get_clients_weights(client_flex_model: FlexModel):
    return [client_flex_model["model"].intercept_, client_flex_model["model"].coef_]


@init_server_model
def build_server_model(**kwargs):
    flex_model = FlexModel()
    flex_model["model"] = LinearRegression()
    return flex_model


@deploy_miner_model
def copy_server_model_to_clients(server_flex_model: FlexModel):
    return copy.deepcopy(server_flex_model)

@set_aggregated_weights
def set_weights_to_server_model(server_flex_model: FlexModel, aggregated_weights):
    server_flex_model["model"].intercept_ = aggregated_weights[0]
    server_flex_model["model"].coef_ = aggregated_weights[1]
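
To make the aggregation step concrete, the sketch below averages `[intercept, coef]` pairs component by component. `average_weights` is a hypothetical helper, written as an explicit alternative to the object-array call in `aggregate`, and it assumes every client reports components of matching shapes. The numbers are made up for illustration.

```python
import numpy as np


def average_weights(list_of_weights):
    """Average each weight component across clients, one component at a time."""
    # zip(*...) groups all intercepts together, then all coefficient arrays
    return [
        np.mean(np.asarray(component, dtype=float), axis=0)
        for component in zip(*list_of_weights)
    ]


# Two clients, each reporting [intercept, coef] as in get_clients_weights
client_a = [150.0, np.array([900.0])]
client_b = [154.0, np.array([1000.0])]
averaged = average_weights([client_a, client_b])
```

Here `averaged[0]` is the mean intercept and `averaged[1]` the mean coefficient array, in the same order that `set_weights_to_server_model` expects.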

Now we can create our `MyCustomBlockchainPool` and run the experiment.

In [5]:
# Create pool
p = MyCustomBlockchainPool(
    flex_dataset=federated_diabetes,
    init_func=build_server_model,
)

servers = p.servers
aggregators = p.aggregators
clients = p.clients
print(
    f"Number of nodes in the pool {len(p.actor_ids)}. All of them are miners and clients."
)

Number of nodes in the pool 5. All of them are miners and clients.


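Let's train the model for five rounds. Each call to `p.aggregate` runs the consensus mechanism, which prints the miner selected for that round.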

In [6]:
for _ in range(5):
    servers.map(copy_server_model_to_clients, clients)
    clients.map(train)
    aggregators.map(get_clients_weights, clients)
    p.aggregate(aggregate, set_weights=set_weights_to_server_model)
    aggregators.map(set_weights_to_server_model, servers)

Selected miner: 4
Selected miner: 1
Selected miner: 1
Selected miner: 1
Selected miner: 0
