# Bytes List

This notebook demonstrates the training and evaluation of a neural network designed to convert bytes lists (lists of bits representing a number) into their corresponding integer values.

The experiment explores two training approaches: supervised learning, where the network learns directly from input-output pairs, and fitness-based learning, where the network is trained using a fitness function based on output accuracy.

The notebook covers data generation, neural network construction, training, and performance evaluation, providing insights into how neural networks can learn to interpret binary data.

## Code Implementation

### Importing the Neural Network Library

First, we need to import the necessary classes from the `neural_network` library to construct the neural network.

In [None]:
import logging
import sys

import numpy as np

from neural_network.layer import HiddenLayer, InputLayer, OutputLayer
from neural_network.math.activation_functions import LinearActivation, SigmoidActivation
from neural_network.neural_network import NeuralNetwork

rng = np.random.default_rng()

In [2]:
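class FlushedConsoleHandler(logging.StreamHandler):
    """Handler that rewrites the current console line instead of appending a new one."""

    def emit(self, record: logging.LogRecord) -> None:
        msg = self.format(record)
        # The carriage return moves the cursor back to the start of the line
        sys.stdout.write('\r' + msg)
        sys.stdout.flush()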

formatter = logging.Formatter('[%(asctime)s] %(message)s', datefmt='%d-%m-%Y | %H:%M:%S')

logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)
handler = logging.StreamHandler()
handler.setFormatter(formatter)
logger.handlers = [handler]

flushed_logger = logging.getLogger('flushed_logger')
flushed_logger.setLevel(logging.INFO)
handler = FlushedConsoleHandler()
handler.setFormatter(formatter)
flushed_logger.handlers = [handler]

The following parameters are required to define the architecture of the neural network. The number of inputs is `NUM_BITS`, and the number of outputs is 1.

`DATASET_SIZE` sets the total number of data points. The dataset is split into training and testing sets according to `TRAIN_SIZE_RATIO`, so the network is evaluated on samples set aside from training.

In [3]:
# Neural network parameters
HIDDEN_LAYER_SIZES = [3]
INPUT_ACTIVATION = LinearActivation
HIDDEN_ACTIVATION = SigmoidActivation
OUTPUT_ACTIVATION = SigmoidActivation
WEIGHTS_RANGE = (-1, 1)
BIAS_RANGE = (-0.3, 0.3)
LR = 0.1
SMOOTHING_ALPHA = 0.25

# Dataset parameters
DATASET_SIZE = 30000
TRAIN_SIZE_RATIO = 0.8
EPOCHS = 5

# Bytes lists parameters
NUM_BITS = 8
IN_LIMS = [0, 255]
OUT_LIMS = [0, 1]
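
As a quick check of the dataset parameters above, the following minimal sketch computes how many samples end up on each side of the split. It assumes the split is a plain slice at `int(DATASET_SIZE * TRAIN_SIZE_RATIO)`, which matches the `split_data` function defined later in this notebook; the helper name `split_sizes` is hypothetical and only used for illustration.

```python
DATASET_SIZE = 30000
TRAIN_SIZE_RATIO = 0.8


def split_sizes(dataset_size: int, train_size_ratio: float) -> tuple[int, int]:
    """Hypothetical helper: sample counts for the training and testing sets."""
    train_size = int(dataset_size * train_size_ratio)
    return train_size, dataset_size - train_size


train_size, test_size = split_sizes(DATASET_SIZE, TRAIN_SIZE_RATIO)
print(train_size, test_size)  # 24000 6000
```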


### Creating Methods to Generate Training Data


We will be using 8-bit numbers to train the neural network.
The following functions convert integers to bytes lists and map values between the input and output ranges.
We can use these functions to create the training and testing datasets.
We will select random numbers and train the neural network with the corresponding bytes lists and expected outputs.


In [4]:
def num_to_byte_list(num: int) -> list[int]:
    """
    Convert a number to a list of bits.

    Parameters:
        num (int): Number to convert

    Returns:
        byte_list (list[int]): Number represented as list of bits
    """
    _num_bin = bin(num)
    _num_bytes = _num_bin[2:]
    _padding = [0] * (NUM_BITS - len(_num_bytes))
    return _padding + [int(b) for b in _num_bytes]


def map_val(x: float, in_min: float, in_max: float, out_min: float, out_max: float) -> float:
    """
    Map a value from an input range to an output range.

    Parameters:
        x (float): Number to map to new range
        in_min (float): Lower bound of original range
        in_max (float): Upper bound of original range
        out_min (float): Lower bound of new range
        out_max (float): Upper bound of new range

    Returns:
        y (float): Number mapped to new range
    """
    return (x - in_min) * (out_max - out_min) / (in_max - in_min) + out_min

def training_data_from_num(num: int) -> tuple[list[int], float]:
    """
    Generate byte list and mapped number from a number to use in training.

    Parameters:
        num (int): Number to use for training data

    Returns:
        training_data (tuple[list[int], float]): Input and expected output
    """
    _byte_list = np.array(num_to_byte_list(num))
    _mapped_num = map_val(num, IN_LIMS[0], IN_LIMS[1], OUT_LIMS[0], OUT_LIMS[1])
    return (_byte_list, _mapped_num)


def split_data(
    data: list[tuple[list[int], float]], train_size_ratio: float = TRAIN_SIZE_RATIO
) -> tuple[list[tuple[list[int], float]], list[tuple[list[int], float]]]:
    """
    Split the dataset into training and testing sets.

    Parameters:
        data (list[tuple[list[int], float]]): The dataset to split.
        train_size_ratio (float): The proportion of the dataset to include in the training split.

    Returns:
        tuple: A tuple containing the training and testing datasets.
    """
    train_size = int(len(data) * train_size_ratio)
    train_data = data[:train_size]
    test_data = data[train_size:]
    return train_data, test_data

def calculate_errors(expected_outputs: np.ndarray, actual_outputs: np.ndarray) -> np.ndarray:
    """
    Calculate the error between expected and actual outputs.

    Parameters:
        expected_outputs (np.ndarray): The expected output values.
        actual_outputs (np.ndarray): The actual output values.

    Returns:
        errors (np.ndarray): The calculated errors.
    """
    errors = expected_outputs - np.array(actual_outputs)
    return map_val(errors, OUT_LIMS[0], OUT_LIMS[1], IN_LIMS[0], IN_LIMS[1])


def calculate_percentage_error(errors: np.ndarray) -> float:
    """
    Calculate the relative error from an array of errors.

    The signed errors are averaged before the absolute value is taken.

    Parameters:
        errors (np.ndarray): The array of errors.

    Returns:
        percentage_error (float): The absolute mean error as a fraction of IN_LIMS[1].
    """
    avg_error = np.average(errors)
    return np.abs(avg_error) / IN_LIMS[1]
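
The conversion and mapping helpers above can be checked with a small round trip. The sketch below restates `num_to_byte_list` and `map_val` so that it runs on its own, and adds `byte_list_to_num`, a hypothetical inverse that is not part of this notebook and assumes the most significant bit comes first.

```python
NUM_BITS = 8
IN_LIMS = [0, 255]
OUT_LIMS = [0, 1]


def num_to_byte_list(num: int) -> list[int]:
    """Integer -> zero-padded list of bits, most significant bit first."""
    bits = bin(num)[2:]
    return [0] * (NUM_BITS - len(bits)) + [int(b) for b in bits]


def byte_list_to_num(byte_list: list[int]) -> int:
    """Hypothetical inverse: list of bits (most significant first) -> integer."""
    num = 0
    for bit in byte_list:
        num = num * 2 + bit
    return num


def map_val(x: float, in_min: float, in_max: float, out_min: float, out_max: float) -> float:
    """Linear mapping from [in_min, in_max] to [out_min, out_max]."""
    return (x - in_min) * (out_max - out_min) / (in_max - in_min) + out_min


bits = num_to_byte_list(5)
target = map_val(5, IN_LIMS[0], IN_LIMS[1], OUT_LIMS[0], OUT_LIMS[1])
print(bits)                    # [0, 0, 0, 0, 0, 1, 0, 1]
print(byte_list_to_num(bits))  # 5
print(target)                  # 5 / 255, roughly 0.0196
```

The network therefore sees the eight bits as inputs and is trained towards the mapped value in `OUT_LIMS`, not towards the raw integer.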

### Neural Network Creation

The following functions are used to create and test neural networks using the parameters defined earlier in this notebook.

In [5]:
def create_nn(
    input_size: int = NUM_BITS,
    hidden_layer_sizes: list[int] = HIDDEN_LAYER_SIZES,
    input_activation: type = INPUT_ACTIVATION,
    hidden_activation: type = HIDDEN_ACTIVATION,
    output_activation: type = OUTPUT_ACTIVATION,
    weights_range: tuple[float, float] = WEIGHTS_RANGE,
    bias_range: tuple[float, float] = BIAS_RANGE,
    lr: float = LR,
) -> NeuralNetwork:
    """Create a neural network with specified parameters."""
    input_layer = InputLayer(size=input_size, activation=input_activation)
    hidden_layers = [
        HiddenLayer(size=size, activation=hidden_activation, weights_range=weights_range, bias_range=bias_range)
        for size in hidden_layer_sizes
    ]
    output_layer = OutputLayer(size=1, activation=output_activation, weights_range=weights_range, bias_range=bias_range)

    return NeuralNetwork.from_layers(layers=[input_layer, *hidden_layers, output_layer], lr=lr)

def evaluate_nn(
    nn: NeuralNetwork, data: list[tuple[list[int], float]]
) -> tuple[np.ndarray, float]:
    """
    Evaluate the neural network on a dataset.

    Parameters:
        nn (NeuralNetwork): The neural network to evaluate.
        data (list[tuple[list[int], float]]): The dataset to evaluate on.

    Returns:
        errors (np.ndarray): The list of errors.
        percentage_error (float): The average error as a percentage.
    """
    dataset_size = len(data)
    outputs = []
    for i in range(dataset_size):
        inputs = data[i][0]
        output = nn.feedforward(inputs)[0]
        outputs.append(output)

    errors = calculate_errors(
        expected_outputs=np.array([data[i][1] for i in range(dataset_size)]), actual_outputs=np.array(outputs)
    )
    percentage_error = calculate_percentage_error(errors)
    return errors, percentage_error
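
Because `calculate_percentage_error` averages the signed errors before taking the absolute value, positive and negative errors can offset each other. The sketch below contrasts that behaviour with a mean absolute error on made-up numbers. It uses plain Python instead of NumPy to stay self-contained, and `mean_absolute_error_fraction` is a hypothetical alternative, not something this notebook uses.

```python
IN_LIMS = [0, 255]


def signed_mean_error_fraction(errors: list[float]) -> float:
    """Mirrors calculate_percentage_error: |mean(errors)| / IN_LIMS[1]."""
    return abs(sum(errors) / len(errors)) / IN_LIMS[1]


def mean_absolute_error_fraction(errors: list[float]) -> float:
    """Hypothetical alternative: mean(|errors|) / IN_LIMS[1]."""
    return sum(abs(e) for e in errors) / len(errors) / IN_LIMS[1]


# Illustrative errors in integer units (made up, not results from this notebook)
errors = [10.0, -10.0, 5.0, -5.0]
print(signed_mean_error_fraction(errors))    # 0.0, the errors cancel out
print(mean_absolute_error_fraction(errors))  # 7.5 / 255, roughly 0.0294
```

Both functions return a fraction of the input range; multiplying by 100 would give a percentage.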


### Dataset Creation

The supervised learning approach compares the network's output with the expected output for each input and backpropagates the resulting error.
The fitness-based approach instead calculates a fitness value for each output, given its input, and uses that value to calculate the errors.

In [6]:
# Supervised training
def generate_supervised_training_data(dataset_size: int) -> list[tuple[list[int], float]]:
    """
    Generate supervised training data for the neural network.

    Parameters:
        dataset_size (int): Number of random samples to generate

    Returns:
        training_data (list[tuple[list[int], float]]): Input and expected output pairs
    """
    random_num = rng.integers(low=IN_LIMS[0], high=(IN_LIMS[1] + 1), size=dataset_size)
    return [training_data_from_num(num) for num in random_num]


# Fitness training
def calculate_fitness(expected_output: float, nn_output: float) -> float:
    """
    Calculate fitness based on the accuracy of the neural network's output.

    Parameters:
        expected_output (float): The correct output value.
        nn_output (float): The neural network's predicted output.

    Returns:
        fitness (float): A fitness value where higher is better.
    """
    error = abs(expected_output - nn_output)
    normalized_error = error / (OUT_LIMS[1] - OUT_LIMS[0])
    return np.exp(-normalized_error * 5)


def generate_fitness_training_data(dataset_size: int, nn: NeuralNetwork) -> list[tuple[list[int], float]]:
    """
    Generate fitness training data for the neural network.

    Parameters:
        dataset_size (int): Number of random samples to generate
        nn (NeuralNetwork): Network whose current outputs are scored

    Returns:
        training_data (list[tuple[list[int], float]]): Input and fitness value pairs
    """
    data = generate_supervised_training_data(dataset_size)
    # Fitness is computed once, from the outputs of the network as passed in
    nn_outputs = [nn.feedforward(input_data) for input_data, _ in data]
    return [
        (input_data, calculate_fitness(expected_output, nn_output))
        for (input_data, expected_output), nn_output in zip(data, nn_outputs, strict=False)
    ]
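
To get a feel for the fitness function above, the following sketch evaluates it for a few outputs against a fixed expected value of 0.5. It restates `calculate_fitness` with `math.exp` in place of `np.exp` so that it runs without NumPy. The sample outputs are arbitrary illustration values.

```python
import math

OUT_LIMS = [0, 1]


def calculate_fitness(expected_output: float, nn_output: float) -> float:
    """Same formula as above: exp(-5 * normalized absolute error)."""
    error = abs(expected_output - nn_output)
    normalized_error = error / (OUT_LIMS[1] - OUT_LIMS[0])
    return math.exp(-normalized_error * 5)


# Fitness shrinks as the output moves away from the expected value
for nn_output in (0.5, 0.6, 0.75, 1.0):
    fitness = calculate_fitness(0.5, nn_output)
    print(f"output={nn_output:.2f} fitness={fitness:.3f}")
```

A perfect output gives a fitness of 1, and the largest possible error within `OUT_LIMS` gives `exp(-5)`, roughly 0.0067, so the fitness always stays in the range (0, 1].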

### Running the Algorithm

Now we can run the training algorithm and test the neural network to evaluate its accuracy.
First, we will use the supervised learning approach:

In [7]:
logger.info("Creating neural network for supervised training...")
nn_supervised = create_nn()
logger.info("Generating supervised training data...")
data_supervised = generate_supervised_training_data(DATASET_SIZE)
training_data_supervised, testing_data_supervised = split_data(data_supervised)
logger.info("Training neural network with supervised learning...")
nn_supervised.run_supervised_training(
    inputs=[input_data for input_data, _ in training_data_supervised],
    expected_outputs=[expected_output for _, expected_output in training_data_supervised],
    epochs=EPOCHS,
)

logger.info("Testing neural network with supervised learning...")
_, percentage_error = evaluate_nn(nn_supervised, testing_data_supervised)
msg = f"Supervised training percentage error: {percentage_error:.4f}%"
logger.info(msg)

[05-06-2025 | 00:12:30] Creating neural network for supervised training...
[05-06-2025 | 00:12:30] Generating supervised training data...
[05-06-2025 | 00:12:30] Training neural network with supervised learning...
[05-06-2025 | 00:12:38] Testing neural network with supervised learning...
[05-06-2025 | 00:12:38] Supervised training percentage error: 0.0001%


Now for the fitness-based approach:

In [8]:
# Fitness training
logger.info("Creating neural network for fitness training...")
nn_fitness = create_nn()
logger.info("Generating fitness training data...")
data_fitness = generate_fitness_training_data(DATASET_SIZE, nn_fitness)
training_data_fitness, testing_data_fitness = split_data(data_fitness)
logger.info("Training neural network with fitness-based learning...")
nn_fitness.run_fitness_training(
    inputs=[input_data for input_data, _ in training_data_fitness],
    fitnesses=[fitness for _, fitness in training_data_fitness],
    epochs=EPOCHS,
)

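# Testing
# testing_data_fitness pairs each input with a fitness value, so evaluate_nn
# compares the network output against that fitness rather than the mapped integer.
logger.info("Testing neural network with fitness-based learning...")
_, percentage_error = evaluate_nn(nn_fitness, testing_data_fitness)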
msg = f"Fitness training percentage error: {percentage_error:.4f}%"
logger.info(msg)

[05-06-2025 | 00:12:38] Creating neural network for fitness training...
[05-06-2025 | 00:12:38] Generating fitness training data...
[05-06-2025 | 00:12:39] Training neural network with fitness-based learning...
[05-06-2025 | 00:12:47] Testing neural network with fitness-based learning...
[05-06-2025 | 00:12:47] Fitness training percentage error: 0.0358%
