### Copyright (C) Infineon Technologies AG 2025
 
Copyright (c) 2025, Infineon Technologies AG, or an affiliate of Infineon Technologies AG. All rights reserved.
This software, associated documentation and materials ("Software") is owned by Infineon Technologies AG or one of its affiliates ("Infineon") and is protected by and subject to worldwide patent protection, worldwide copyright laws, and international treaty provisions. Therefore, you may use this Software only as provided in the license agreement accompanying the software package from which you obtained this Software. If no license agreement applies, then any use, reproduction, modification, translation, or compilation of this Software is prohibited without the express written permission of Infineon.

Disclaimer: UNLESS OTHERWISE EXPRESSLY AGREED WITH INFINEON, THIS SOFTWARE IS PROVIDED AS-IS, WITH NO WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, ALL WARRANTIES OF NON-INFRINGEMENT OF THIRD-PARTY RIGHTS AND IMPLIED WARRANTIES SUCH AS WARRANTIES OF FITNESS FOR A SPECIFIC USE/PURPOSE OR MERCHANTABILITY. Infineon reserves the right to make changes to the Software without notice. You are responsible for properly designing, programming, and testing the functionality and safety of your intended application of the Software, as well as complying with any legal requirements related to its use. Infineon does not guarantee that the Software will be free from intrusion, data theft or loss, or other breaches ("Security Breaches"), and Infineon shall have no liability arising out of any Security Breaches. Unless otherwise explicitly approved by Infineon, the Software may not be used in any application where a failure of the Product or any consequences of the use thereof can reasonably be expected to result in personal injury.

# MNIST MLP Digit Classification

This notebook demonstrates how to build and train a Multi-Layer Perceptron (MLP) for handwritten digit classification using the MNIST dataset and the helper functions from `modelling_helper.py`.

### Notebook Structure

1. **Import Libraries and Helper Functions**
2. **Load and Explore the MNIST Dataset**
3. **Visualize Sample Images**
4. **Data Preprocessing**
5. **Create and Build MLP Model**
6. **Train the Model**
7. **Evaluate Model Performance**
8. **Export Model to ONNX**
9. **Convert Model for AURIX™ TC3x and TC4x Deployment**
10. **Simulated Instruction Counts**

## Import Libraries and Helper Functions

In [None]:
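import os
import sys

import numpy as np  # used below (np.unique); imported explicitly instead of relying on the star imports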

parent_dir = os.path.dirname(os.getcwd())
if parent_dir not in sys.path:
    sys.path.insert(0, parent_dir)

from CentralScripts.helper_functions import *
from modelling_helper import *

origin = "tf"  # "tf" stands for TensorFlow

## Load and Explore the MNIST Dataset

This project uses the **MNIST Handwritten Digit Dataset**, a classic benchmark in machine learning and computer vision. The dataset contains:

- **Training data**: 60,000 handwritten digit images (28×28 pixels), 10% of which are held out for validation in this notebook
- **Test data**: 10,000 handwritten digit images for evaluation
- **10 classes**: Digits 0 through 9
- **Grayscale images**: Single channel; pixel values are scaled to the 0-1 range in the Data Preprocessing step

### Dataset Details

- **Image size**: 28×28 pixels, grayscale
- **Total samples**: 70,000 images
- **Classes**: 10 digit classes (0-9)
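- **Format**: Image arrays and labels as returned by `load_mnist_data`; pixel values are normalized to the 0-1 range in the Data Preprocessing step
- **Data split**: 54,000 training, 6,000 validation, 10,000 test samples (with `val_size=0.1`)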

MNIST provides a controlled environment for demonstrating fundamental computer vision techniques that carry over to more complex automotive applications. Its small image size and low computational requirements make it well suited for validating deployment on embedded systems.

**Data Source**: The dataset is automatically downloaded through TensorFlow/Keras and is part of the standard machine learning benchmarks.

In [None]:
# Load MNIST data
data = load_mnist_data(val_size=0.1)

x_train = data["x_train"]
y_train = data["y_train"]
x_val = data["x_val"]
y_val = data["y_val"]
x_test = data["x_test"]
y_test = data["y_test"]

print(f"Training set: {x_train.shape[0]} samples")
print(f"Validation set: {x_val.shape[0]} samples")
print(f"Test set: {x_test.shape[0]} samples")
print(f"Image shape: {x_train.shape[1:]}")
print(f"Number of classes: {len(np.unique(y_train))}")
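
The implementation of `load_mnist_data` is not shown in this notebook. As a minimal sketch of how a 10% validation hold-out could be produced, assuming a random shuffle of the 60,000 training samples, the hypothetical helper `holdout_split` below uses a synthetic stand-in array so that no download is needed:

```python
import numpy as np


def holdout_split(x, y, val_size=0.1, seed=0):
    """Shuffle the samples and hold out a fraction `val_size` for validation."""
    n_val = int(round(len(x) * val_size))
    idx = np.random.default_rng(seed).permutation(len(x))
    val_idx, train_idx = idx[:n_val], idx[n_val:]
    return x[train_idx], y[train_idx], x[val_idx], y[val_idx]


# Synthetic stand-in with the size of the MNIST training set (hypothetical data)
x_demo = np.zeros((60_000, 1), dtype=np.uint8)
y_demo = np.arange(60_000) % 10
x_tr, y_tr, x_va, y_va = holdout_split(x_demo, y_demo, val_size=0.1)
print(x_tr.shape[0], x_va.shape[0])
```

With `val_size=0.1` this reproduces the 54,000 / 6,000 split listed above. The real helper may shuffle or stratify differently.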

## Visualize Sample Images

In [None]:
# Plot sample images
plot_sample_images(x_train, y_train, n_samples=10)

## Data Preprocessing

In [None]:
# Normalize pixel values to 0-1 range
x_train_norm, x_val_norm, x_test_norm = normalize_data(x_train, x_val, x_test)

print(f"Original pixel range: [{x_train.min()}, {x_train.max()}]")
print(f"Normalized pixel range: [{x_train_norm.min()}, {x_train_norm.max()}]")
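
The body of `normalize_data` is not part of this notebook. A common way to reach the 0-1 range, assuming 8-bit pixel intensities with a maximum of 255, is a plain division. The name `normalize_sketch` and the `max_value` parameter are illustrative only:

```python
import numpy as np


def normalize_sketch(*arrays, max_value=255.0):
    """Scale integer pixel intensities to float32 values in [0, 1]."""
    return tuple(a.astype(np.float32) / max_value for a in arrays)


# Tiny hypothetical image row covering the full 8-bit range
raw = np.array([[0, 128, 255]], dtype=np.uint8)
(scaled,) = normalize_sketch(raw)
print(scaled.min(), scaled.max())
```

Converting to `float32` before dividing keeps memory use low and avoids integer division.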

## Create and Build MLP Model

### Multi-Layer Perceptron (MLP)
- **Architecture**: Fully connected layers with dropout regularization
- **Input**: Flattened 28×28 pixel images (784 features)
- **Hidden layers**: Configurable dense layers with ReLU activation
- **Output**: 10-class softmax classification
- **Use case**: Baseline model, resource-constrained environments

- Adjust the `hidden_units` list to see how the model architecture affects the predictions
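
To make the architecture above concrete, here is a minimal NumPy sketch of the inference pass of such an MLP: flatten, dense layers with ReLU, softmax output. The random weights and the function name `mlp_forward` are assumptions for illustration. Dropout is omitted because it is only active during training.

```python
import numpy as np


def mlp_forward(image, weights, biases):
    """Inference pass: flatten, ReLU hidden layers, softmax output."""
    a = image.reshape(-1)                      # 28x28 -> 784 features
    for w, b in zip(weights[:-1], biases[:-1]):
        a = np.maximum(a @ w + b, 0.0)         # dense layer followed by ReLU
    logits = a @ weights[-1] + biases[-1]
    exp = np.exp(logits - logits.max())        # numerically stable softmax
    return exp / exp.sum()


rng = np.random.default_rng(0)
sizes = [784, 32, 32, 16, 10]                  # input, hidden units, classes
weights = [rng.normal(0.0, 0.1, (m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]
probs = mlp_forward(rng.random((28, 28)), weights, biases)
print(probs.shape, probs.argmax())
```

The output is a vector of 10 class probabilities; the predicted digit is the index of the largest entry.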

In [None]:
input_shape, output_shape = get_input_output_shapes(x_train, y_train)

hidden_units = [32, 32, 16]  # Three hidden layers
dropout_rate = 0.1

model = create_mlp_model(
    input_shape=input_shape,
    output_shape=output_shape,
    units=hidden_units,
    p_drop=dropout_rate,
)

# Display model summary
model.summary()
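
As a cross-check for the summary, the parameter count of a stack of dense layers can be computed by hand. This sketch assumes that `create_mlp_model` builds plain fully connected layers with bias terms and nothing else that carries parameters (dropout adds none). The helper name `dense_param_count` is hypothetical.

```python
def dense_param_count(n_inputs, hidden_units, n_outputs):
    """Count weights and biases of a stack of fully connected layers."""
    sizes = [n_inputs, *hidden_units, n_outputs]
    return [n_in * n_out + n_out for n_in, n_out in zip(sizes[:-1], sizes[1:])]


per_layer = dense_param_count(28 * 28, [32, 32, 16], 10)
print(per_layer, sum(per_layer))  # [25120, 1056, 528, 170] 26874
```

If the assumption holds, the total reported by `model.summary()` should match. A different number would indicate additional layers in the helper.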

## Train the Model

- Training uses early stopping to avoid overfitting.
- To support convergence, the learning rate is reduced when training stalls.

In [None]:
# Train the model
history = train_model(
    model=model,
    x_train=x_train_norm,
    y_train=y_train,
    x_val=x_val_norm,
    y_val=y_val,
    batch_size=64,
    epochs=10,
)

plot_training_history(history, model_name="MLP Classifier")
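
How `train_model` configures early stopping and learning-rate reduction is not visible here. The following pure-Python sketch replays a list of validation losses through both rules to show the logic. The patience values, the reduction factor, and the loss values are assumptions chosen for illustration, not the helper's settings.

```python
def simulate_callbacks(val_losses, lr=1e-3, stop_patience=3, lr_patience=2, lr_factor=0.5):
    """Replay per-epoch validation losses through two plateau rules.

    Returns (stopped_epoch, best_epoch, final_lr); epochs are 0-based.
    """
    best, best_epoch = float("inf"), -1
    epochs_since_best = 0       # counter for early stopping
    epochs_since_lr_drop = 0    # counter for learning-rate reduction
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch = loss, epoch
            epochs_since_best = epochs_since_lr_drop = 0
            continue
        epochs_since_best += 1
        epochs_since_lr_drop += 1
        if epochs_since_lr_drop >= lr_patience:
            lr *= lr_factor             # training stalled: reduce the step size
            epochs_since_lr_drop = 0
        if epochs_since_best >= stop_patience:
            return epoch, best_epoch, lr  # stop; the best weights are from best_epoch
    return len(val_losses) - 1, best_epoch, lr


# Hypothetical loss curve that improves for three epochs and then stalls
stopped, best_epoch, final_lr = simulate_callbacks([0.50, 0.40, 0.35, 0.36, 0.37, 0.38, 0.30])
print(stopped, best_epoch, final_lr)
```

In this example the learning rate is halved once after two stalled epochs, and training stops after the third, so the last loss value is never reached.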

## Evaluate Model Performance

- Classification accuracy serves as the overall performance metric.
- The confusion matrix shows which digits the model tends to misclassify.

In [None]:
# Evaluate on test set
test_loss, test_accuracy = evaluate_model(model, x_test_norm, y_test)

print(f"Test Loss: {test_loss:.4f}")
print(f"Test Accuracy: {test_accuracy:.4f} ({test_accuracy*100:.2f}%)")

# Plot confusion matrix
plot_confusion_matrix(
    model, x_test_norm, y_test, class_names=[str(i) for i in range(10)]
)
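
For reference, both metrics can be computed with a few lines of NumPy. This is a sketch on a toy three-class example, not the implementation behind `evaluate_model` or `plot_confusion_matrix`. The function name and the demo labels are made up.

```python
import numpy as np


def confusion_matrix_sketch(y_true, y_pred, n_classes=10):
    """Rows are true labels, columns are predicted labels."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (y_true, y_pred), 1)   # count each (true, predicted) pair
    return cm


y_true_demo = np.array([0, 1, 2, 2, 1])
y_pred_demo = np.array([0, 2, 2, 2, 1])
cm = confusion_matrix_sketch(y_true_demo, y_pred_demo, n_classes=3)
accuracy = np.trace(cm) / cm.sum()       # correct predictions lie on the diagonal
print(cm)
print(accuracy)
```

Off-diagonal entries show the confusions: here one sample of class 1 was predicted as class 2.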

## Export Model to ONNX

- Generate a model name from the architecture and the test metrics
- Save the model together with a test input and the corresponding output

In [None]:
model_name = generate_model_name(model, test_accuracy, test_loss)

# Prepare a reference input/output pair for the export (same pattern as the RUL example)
input_target = x_train_norm[0]  # single unbatched image, shape (28, 28)
output_target = get_predictions(origin, model, input_target)

save_all(model_name, input_target, output_target, model, origin)
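
One plausible use of the stored input/output pair is to check that a converted model reproduces the reference output within a tolerance. The notebook does not say how the conversion tools use the pair, so the function `outputs_match`, the tolerances, and the demo vectors below are assumptions:

```python
import numpy as np


def outputs_match(reference, candidate, rtol=1e-3, atol=1e-5):
    """Return True if both output vectors agree within the given tolerances."""
    reference = np.asarray(reference, dtype=np.float64)
    candidate = np.asarray(candidate, dtype=np.float64)
    return reference.shape == candidate.shape and bool(
        np.allclose(reference, candidate, rtol=rtol, atol=atol)
    )


# Hypothetical class probabilities for a three-class toy case
ref = np.array([0.01, 0.97, 0.02])
print(outputs_match(ref, ref + 1e-6))        # small numerical deviation
print(outputs_match(ref, [0.5, 0.4, 0.1]))   # clearly different prediction
```

Suitable tolerances depend on the numeric format of the target, so the defaults here are placeholders.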

## Convert Model for AURIX™ TC3x and TC4x Deployment

- Make sure the Docker container with the conversion tools is running.
- Submit the model together with the test data to the container, then download the generated code, binary, and log files.
- Results are saved in the `out/<model_name>/test_<model_name>/<target>/` folder.

In [None]:
ensure_docker_container()

In [None]:
from CentralScripts.python_flask_client import CallTools

model_folder, onnx_model_file = get_output_paths(model_name)

for target in ["TC3", "TC4"]:
    tool = CallTools(
        folder=model_folder, url="http://localhost:8080/convert", target=target
    )
    tool.convert_model()
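
To locate the results after conversion, the folder pattern given above can be assembled with `pathlib`. This sketch assumes that the `<target>` folder is named exactly like the target string passed to `CallTools`. The helper `result_dir` and the name `example_model` are placeholders.

```python
from pathlib import Path


def result_dir(model_name, target, root="out"):
    """Build out/<model_name>/test_<model_name>/<target>/ as a Path."""
    return Path(root) / model_name / f"test_{model_name}" / target


for target in ["TC3", "TC4"]:
    folder = result_dir("example_model", target)  # placeholder model name
    print(folder.as_posix(), "exists:", folder.is_dir())
```

In the notebook, the placeholder would be replaced with the `model_name` generated in the export step.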

## Simulated Instruction Counts

- For each node in the neural network, the instruction count is extracted from the log file and plotted.
- Use the plot to identify nodes that are computational bottlenecks and adjust the network accordingly.

In [None]:
plot_instruction_counts(model_name)
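
Besides the plot, a ranked list can help to spot bottlenecks. The log format is not described in this notebook, so this sketch assumes the per-node counts have already been parsed into a dictionary. The node names and numbers are placeholders, not measured values.

```python
def rank_nodes(instruction_counts):
    """Sort nodes by instruction count and attach their share of the total."""
    total = sum(instruction_counts.values())
    ranked = sorted(instruction_counts.items(), key=lambda kv: kv[1], reverse=True)
    return [(name, count, count / total) for name, count in ranked]


# Placeholder node names and counts, not measured values
demo_counts = {"dense_0": 600, "dense_1": 300, "dense_2": 100}
ranking = rank_nodes(demo_counts)
for name, count, share in ranking:
    print(f"{name}: {count} instructions ({share:.0%})")
```

The node at the top of the list is the first candidate for reducing layer width or restructuring the network.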