# AI-ANNE: (A) (N)EURAL (N)ET FOR (E)XPLORATION
#### by Prof. Dr. habil. Dennis Klinkhammer (2025)

![title](img/voice1.png)
# About this Tutorial
In this tutorial you will learn how a neural network can recognize voices and learn to **differentiate between young and old** voices.
# What is a Neural Network?
A neural network is a type of computer program that tries to learn things in a way that’s **a bit like how our brains work**. It looks at examples, finds patterns in the data, and then uses those **patterns to make decisions or predictions**. The network is made up of **layers**, and each layer has small units called neurons. These **neurons are connected to each other**, and each connection has a **weight**, which tells the network how important a piece of information is. Each neuron also has a **bias**, which helps shift the result a bit. When information moves through the network, it goes from one layer to the next, and at each step an **activation function** decides whether a neuron should be activated or not, kind of like how our brain decides **which signals to pay attention** to. This whole system works together so the network can learn and improve by adjusting the weights and biases as it practices.
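
As a minimal sketch of the description above, the following code computes the output of a single neuron: it multiplies each input by its weight, adds the bias, and passes the result through an activation function. The input values, weights, and bias are hypothetical numbers chosen only for illustration, and the sigmoid function is just one possible activation function.

```python
import math

def sigmoid(z):
    # Squashes any number into the range (0, 1)
    return 1 / (1 + math.exp(-z))

def neuron(inputs, weights, bias):
    # Weighted sum of the inputs, shifted by the bias
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    # The activation function decides how strongly the neuron is activated
    return sigmoid(z)

# Hypothetical values, chosen only for illustration
inputs = [0.5, 0.2]
weights = [0.8, -0.4]
bias = 0.1

output = neuron(inputs, weights, bias)
print(round(output, 3))
```

A larger weight makes the corresponding input count more, while the bias shifts the point at which the neuron starts to respond.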

# From Measurements to Data
When we train a neural network, we need to give it two things: **input data (X)** and the **correct answers (y)**, so it can learn to make predictions. **X** is a **list of examples**, where each example is a list of numbers that describe something. These numbers are called **features**. For example, if you're training a neural network to recognize a specific voice, the features might be things like the peak amplitude or the wave period. So each inner list in X represents a different voice, and each number in that list is a measurement of that voice. Furthermore, **y** is a list of **labels that tell the network what the correct answer** is for each example. So, if the first example in X is a young voice, the first number in y is 0. If the second example is an old voice, the second number in y is 1, and so on.

In [None]:
# Features (X = [V, N, M, C]) and Labels (y = 0 / 1)
X = [[ 3.4, 1.3, 6.9, 7.1],
     [ 2.6, 0.6, 6.4, 7.6],
     [ 3.3, 1.3, 6.8, 7.0],
     [ 2.5, 0.6, 6.3, 7.5],
     [ 3.2, 1.2, 6.7, 7.0],
     [ 2.4, 0.5, 6.3, 7.4],
     [ 3.1, 1.2, 6.6, 6.9],
     [ 2.3, 0.5, 6.3, 7.3],
     [ 3.0, 1.1, 6.5, 6.8],
     [ 2.3, 0.5, 6.2, 7.3]]

y = [0,1,0,1,0,1,0,1,0,1]
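
Before training, it can help to check that examples and labels line up. The following sketch uses a shortened copy of the data (the first four examples only) and simply counts examples, features, and labels. The variable names `n_features`, `n_young`, and `n_old` are introduced here for illustration.

```python
# A shortened copy of the data, used only for illustration
X = [[3.4, 1.3, 6.9, 7.1],
     [2.6, 0.6, 6.4, 7.6],
     [3.3, 1.3, 6.8, 7.0],
     [2.5, 0.6, 6.3, 7.5]]
y = [0, 1, 0, 1]

# Every example needs exactly one label
assert len(X) == len(y)

n_features = len(X[0])   # measurements per voice
n_young = y.count(0)     # label 0 = young voice
n_old = y.count(1)       # label 1 = old voice

# Pair each example with its label
for features, label in zip(X, y):
    name = "young" if label == 0 else "old"
    print(features, "->", name)
```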

# Building and Training a Neural Network
In a programming language like MicroPython, a **library** is a collection of **pre-written code** that you can use to make your own programs easier to write. Instead of starting from scratch every time, you can **import a library** that already knows how to do certain tasks. The libraries **random** and **math** are imported to simplify some of the functions required for self-learning neural networks. Neural networks require not only **activation functions** to activate the neurons, but also their derivatives. The derivative of a function represents its instantaneous rate of change at a specific point, which is what makes training possible. In neural networks, **forward propagation** is the process of passing input data through the network's layers to generate a prediction. **Backward propagation**, on the other hand, is the mechanism used to train the network: it calculates the error between the prediction and the actual output and then adjusts the network's weights to minimize that error. This is important for the learning ability of a neural network. Furthermore, a **loss function** quantifies the difference between the model's prediction and the actual outcome, essentially acting as a measure of the model's error. Cross-entropy, a specific type of loss function, is commonly used for classification problems, especially when the model outputs probabilities. Finally, the number of **epochs and the learning rate** need to be specified in MicroPython. An epoch represents one **complete pass of the entire training dataset** through the model. The learning rate determines **how much the model's weights are adjusted** during each update step in the training process. Both are crucial hyperparameters that influence training and model performance.
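
The interplay of forward propagation, loss, backward propagation, epochs, and learning rate can be sketched with a single neuron. This is a simplified illustration, not the network used in this tutorial: the toy data, the starting values, and the settings for `lr` and `epochs` are assumptions. It also uses a known shortcut: for a sigmoid output combined with binary cross-entropy, the gradient with respect to the weighted sum simplifies to `pred - yi`.

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

def bce(pred, y, eps=1e-7):
    # Binary cross-entropy: small when pred is close to the label y
    return -(y * math.log(pred + eps) + (1 - y) * math.log(1 - pred + eps))

# Hypothetical toy data: one feature per example, label 0 or 1
X = [0.0, 0.2, 0.8, 1.0]
y = [0, 0, 1, 1]

w, b = 0.0, 0.0   # untrained weight and bias
lr = 0.5          # learning rate: size of each update step
epochs = 200      # number of complete passes over the data

losses = []
for epoch in range(epochs):
    total = 0.0
    for xi, yi in zip(X, y):
        pred = sigmoid(w * xi + b)   # forward propagation
        total += bce(pred, yi)       # measure the error
        delta = pred - yi            # backward propagation (gradient)
        w -= lr * delta * xi         # adjust the weight
        b -= lr * delta              # adjust the bias
    losses.append(total)

print(f"Loss in first epoch: {losses[0]:.3f}")
print(f"Loss in last epoch:  {losses[-1]:.3f}")
```

With these settings the loss in the last epoch should be lower than in the first one. A smaller learning rate takes smaller steps and therefore usually needs more epochs to reach a similar loss.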

In [None]:
# --- (1) Hidden Code ---
import micropip
await micropip.install('ipywidgets')
import random, math
import ipywidgets as widgets
from IPython.display import display, clear_output, HTML

# --- Normalization (min-max scaling of each feature to the range 0 to 1) ---
def normalize(X):
    transposed = list(zip(*X))
    mins = [min(col) for col in transposed]
    maxs = [max(col) for col in transposed]
    return [[(x_i - min_i) / (max_i - min_i + 1e-9) 
             for x_i, min_i, max_i in zip(x_row, mins, maxs)] for x_row in X]

X = normalize(X)

# --- Activation & Loss ---
def sigmoid(x): return 1 / (1 + math.exp(-x))
def sigmoid_derivative(out): return out * (1 - out)
def relu(x): return max(0, x)
def relu_derivative(out): return 1 if out > 0 else 0
def leaky_relu(x): return x if x > 0 else 0.01 * x
def leaky_relu_derivative(out): return 1 if out > 0 else 0.01
def tanh(x): return math.tanh(x)
def tanh_derivative(out): return 1 - out**2
def binary_cross_entropy(pred, y): 
    epsilon = 1e-7
    return - (y * math.log(pred + epsilon) + (1 - y) * math.log(1 - pred + epsilon))
def binary_cross_entropy_derivative(pred, y): 
    epsilon = 1e-7
    return -(y / (pred + epsilon)) + (1 - y) / (1 - pred + epsilon)

# --- Dense Layer ---
def dense_forward(x, w, b, act='relu'):
    pres = [sum(x[i] * w[j][i] for i in range(len(x))) + b[j] for j in range(len(w))]
    outs = []
    for z in pres:
        if act == 'sigmoid':
            outs.append(sigmoid(z))
        elif act == 'relu':
            outs.append(relu(z))
        elif act == 'leaky_relu':
            outs.append(leaky_relu(z))
        elif act == 'tanh':
            outs.append(tanh(z))
    return outs, pres

def dense_backward(x, grad_out, out, pre, w, b, act='relu', lr=0.01):
    grad_in = [0] * len(x)
    for j in range(len(w)):
        if act == 'sigmoid':
            delta = grad_out[j] * sigmoid_derivative(out[j])
        elif act == 'relu':
            delta = grad_out[j] * relu_derivative(pre[j])
        elif act == 'leaky_relu':
            delta = grad_out[j] * leaky_relu_derivative(pre[j])
        elif act == 'tanh':
            delta = grad_out[j] * tanh_derivative(out[j])
        for i in range(len(x)):
            grad_in[i] += w[j][i] * delta
            w[j][i] -= lr * delta * x[i]
        b[j] -= lr * delta
    return grad_in

# --- Visualization ---
def html_network_with_connections(layer_sizes, input_dim):
    all_layers = [input_dim] + layer_sizes + [1]
    neuron_size = 20
    margin = 10
    column_spacing = 80
    row_spacing = neuron_size + 20
    radius = neuron_size // 2
    total_width = len(all_layers) * column_spacing
    total_height = max(all_layers) * row_spacing + 2 * margin
    svg = f'<svg width="{total_width}" height="{total_height}" style="position:absolute; top:0; left:0;">'
    positions = []
    for li, n in enumerate(all_layers):
        layer_x = li * column_spacing + column_spacing // 2
        layer = []
        total_layer_height = (n - 1) * row_spacing
        offset_y = (total_height - total_layer_height) // 2
        for ni in range(n):
            layer_y = offset_y + ni * row_spacing
            layer.append((layer_x, layer_y))
        positions.append(layer)
    for i in range(len(positions)-1):
        for x1, y1 in positions[i]:
            for x2, y2 in positions[i+1]:
                svg += f'<line x1="{x1}" y1="{y1}" x2="{x2}" y2="{y2}" stroke="gray" stroke-width="1" />'
    svg += '</svg>'
    html = f'''
    <style>
        .network-wrapper {{
            position: relative;
            width: {total_width}px;
            height: {total_height}px;
        }}
        .network {{
            position: absolute;
            top: 0; left: 0;
            display: flex;
            flex-direction: row;
            justify-content: center;
            height: 100%;
        }}
        .layer {{
            display: flex;
            flex-direction: column;
            justify-content: center;
            align-items: center;
            width: {column_spacing}px;
        }}
        .neuron {{
            width: {neuron_size}px;
            height: {neuron_size}px;
            border: 2px solid black;
            border-radius: 50%;
            background-color: #f2f2f2;
            margin: 10px 0;
            box-sizing: border-box;
        }}
    </style>
    <div class="network-wrapper">
        {svg}
        <div class="network">
    '''
    for n_neurons in all_layers:
        html += '<div class="layer">'
        for _ in range(n_neurons):
            html += '<div class="neuron"></div>'
        html += '</div>'
    html += '</div></div>'
    display(HTML(html))

# --- Storage for the trained model ---
trained_model = {}

# --- Training ---
def train_model(X, y, layer_sizes, activations, epochs, lr):
    dims = [len(X[0])] + layer_sizes + [1]
    weights, biases = [], []
    for i in range(len(dims)-1):
        w = [[random.uniform(-0.5, 0.5) for _ in range(dims[i])] for _ in range(dims[i+1])]
        b = [random.uniform(-0.5, 0.5) for _ in range(dims[i+1])]
        weights.append(w)
        biases.append(b)
    loss_trace = []
    for epoch in range(epochs):
        total_loss = 0
        for xi, yi in zip(X, y):
            x = xi
            acts, pres = [], []
            for i in range(len(weights)):
                act = 'sigmoid' if i == len(weights)-1 else activations[i]
                x, pre = dense_forward(x, weights[i], biases[i], act)
                acts.append(x)
                pres.append(pre)
            loss = binary_cross_entropy(acts[-1][0], yi)
            total_loss += loss
            grad = [binary_cross_entropy_derivative(acts[-1][0], yi)]
            for i in reversed(range(len(weights))):
                act = 'sigmoid' if i == len(weights)-1 else activations[i]
                inp = xi if i == 0 else acts[i-1]
                grad = dense_backward(inp, grad, acts[i], pres[i], weights[i], biases[i], act, lr)
        loss_trace.append(total_loss)
    return {
        "weights": weights,
        "biases": biases,
        "loss_trace": loss_trace
    }

# --- Widgets ---
layers_slider = widgets.IntSlider(value=2, min=1, max=5, step=1, description='Layers:')
neuron_and_activation_controls = widgets.VBox()
epochs_slider = widgets.IntSlider(value=100, min=10, max=500, step=10, description='Epochs:')
lr_slider = widgets.FloatSlider(value=0.05, min=0.001, max=1.0, step=0.01, description='L-Rate:')
train_button = widgets.Button(description="Train")
train_output = widgets.Output()

def update_neuron_sliders(*args):
    count = layers_slider.value
    controls = []
    for i in range(count):
        neuron_slider = widgets.IntSlider(value=4, min=1, max=10, step=1, description=f'Layer {i+1}:')
        activation_dropdown = widgets.Dropdown(
            options=['relu', 'leaky_relu', 'sigmoid', 'tanh'],
            value='relu',
            description='with function:'
        )
        controls.append(widgets.HBox([neuron_slider, activation_dropdown]))
    neuron_and_activation_controls.children = controls

layers_slider.observe(update_neuron_sliders, names='value')
update_neuron_sliders()

def on_train_click(b):
    train_output.clear_output()
    layer_sizes = []
    activations = []
    for control in neuron_and_activation_controls.children:
        neuron_slider, activation_dropdown = control.children
        layer_sizes.append(neuron_slider.value)
        activations.append(activation_dropdown.value)
    epochs = epochs_slider.value
    lr = lr_slider.value
    model = train_model(X, y, layer_sizes, activations, epochs, lr)
    trained_model.clear()
    trained_model.update({
        "weights": model["weights"],
        "biases": model["biases"],
        "loss_trace": model["loss_trace"],
        "layer_sizes": layer_sizes,
        "activations": activations  # stored so that evaluation can reuse the chosen functions
    })
    with train_output:
        html_network_with_connections(layer_sizes, input_dim=len(X[0]))

train_button.on_click(on_train_click)

display(widgets.VBox([
    widgets.HTML("<b>1. Number of Layers (N)</b>"),
    layers_slider,
    widgets.HTML("<b>2. Number of Neurons (N) and their functions</b>"),
    neuron_and_activation_controls,
    widgets.HTML("<b>3. Number of Epochs (N)</b>"),
    epochs_slider,
    widgets.HTML("<b>4. Learning Rate</b>"),
    lr_slider,
    train_button,
    train_output
]))

# Evaluating the Performance of a Neural Network
A **confusion matrix** can be used to evaluate the performance of the neural network. A confusion matrix is a simple table used to check **how well a classification model is working**, especially in binary tasks where there are only two possible outcomes, like “young voice” or “old voice”. It shows how many predictions the model got right and wrong by comparing the predicted labels to the true labels. The table has four parts: true positives (the model correctly said “old voice”), true negatives (it correctly said “young voice”), false positives (it said “old voice” but it was actually “young voice”), and false negatives (it said “young voice” but it was actually “old voice”). This helps us see not just how often the model is right, but also what kinds of mistakes it makes. The **accuracy** is a number that tells us **how often the model makes the correct prediction**. Based on a confusion matrix, accuracy is calculated by adding up all the correct predictions (the true positives and true negatives) and dividing that by the total number of predictions made.
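
The four parts of the confusion matrix and the accuracy can be worked through by hand on a small example. The true labels and predictions below are hypothetical and chosen only for illustration (0 = young voice, 1 = old voice).

```python
# Hypothetical true labels and predictions (0 = young voice, 1 = old voice)
y_true = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
y_pred = [0, 1, 0, 0, 0, 1, 1, 1, 0, 1]

pairs = list(zip(y_true, y_pred))
TP = sum(1 for t, p in pairs if t == 1 and p == 1)  # old, predicted old
TN = sum(1 for t, p in pairs if t == 0 and p == 0)  # young, predicted young
FP = sum(1 for t, p in pairs if t == 0 and p == 1)  # young, predicted old
FN = sum(1 for t, p in pairs if t == 1 and p == 0)  # old, predicted young

# Correct predictions divided by all predictions
accuracy = (TP + TN) / len(y_true)

print(f"TN: {TN}  FP: {FP}")     # TN: 4  FP: 1
print(f"FN: {FN}  TP: {TP}")     # FN: 1  TP: 4
print(f"Accuracy: {accuracy:.2f}")  # Accuracy: 0.80
```

Here the model makes two mistakes of different kinds: one young voice is labeled as old (false positive) and one old voice is labeled as young (false negative).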

In [None]:
# --- (2) Hidden Code ---
from IPython.display import display
eval_button = widgets.Button(description="Evaluate")
eval_output = widgets.Output()

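def predict(x, weights, biases):
    # Reuse the hidden-layer activation functions chosen during training, if they were stored;
    # otherwise fall back to 'relu'
    activations = trained_model.get("activations", [])
    for i in range(len(weights)):
        if i == len(weights) - 1:
            act = 'sigmoid'
        else:
            act = activations[i] if i < len(activations) else 'relu'
        x, _ = dense_forward(x, weights[i], biases[i], act)
    return 1 if x[0] > 0.5 else 0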

def on_eval_click(b):
    eval_output.clear_output()
    if not trained_model:
        with eval_output:
            print("Please train the model first!")
        return

    weights = trained_model["weights"]
    biases = trained_model["biases"]
    losses = trained_model["loss_trace"]

    ypred = [predict(xi, weights, biases) for xi in X]
    TP = TN = FP = FN = 0
    for true, pred in zip(y, ypred):
        if true == pred:
            if true == 1: TP += 1
            else: TN += 1
        else:
            if true == 1: FN += 1
            else: FP += 1
    acc = (TP + TN) / len(y)

    with eval_output:
        print("Loss (every 10th epoch):")
        for i in range(9, len(losses), 10):
            print(f"Epoch {i+1:>3}: Loss = {losses[i]:.4f}")
        print("\nConfusion Matrix:")
        print(f"TN: {TN}  FP: {FP}")
        print(f"FN: {FN}  TP: {TP}")
        print(f"Accuracy: {acc:.2f}")

eval_button.on_click(on_eval_click)
display(widgets.VBox([eval_button, eval_output]))

![image.png](img/voice2.png)