<a href="https://colab.research.google.com/github/czaacza/Neural-Networks-for-Machine-Learning-Applications/blob/master/case_0_learning_basics_Mateusz_Czarnecki.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Case 0. Learning basics
**Neural Networks for Machine Learning Applications**<br>
09.01.2023<br>
Mateusz Czarnecki<br>
[Information Technology, Bachelor's Degree](https://www.metropolia.fi/en/academics/bachelors-degrees/information-technology)<br>
[Metropolia University of Applied Sciences](https://www.metropolia.fi/en)

- **v3**: Simplified version based on discussion with JK.
- **v4**: Added conversion of labels [to_categorical](https://www.tensorflow.org/api_docs/python/tf/keras/utils/to_categorical) and changed the loss function to [categorical crossentropy](https://www.tensorflow.org/api_docs/python/tf/keras/metrics/categorical_crossentropy).
- **v5**: Changes to the wording of the instructions.

## 1. Introduction

The main objective of this notebook is to get familiar with the basics of neural networks through a specific example: **classifying grayscale images of handwritten digits**. We are going to create a neural network model that is as simple as possible and train it for a small number of epochs, just to see what neural network training looks like in code and which library is useful for a task like this.



## 2. Setup

To create and train our machine learning model, we are going to use a Python machine learning library called **TensorFlow**.

TensorFlow not only includes all the functions we need to create a simple neural network, but also provides the handwritten digit images.

In [None]:
import tensorflow as tf
print(f'tensorflow: {tf.__version__}')

tensorflow: 2.9.2


## 3. Dataset

The dataset we're using is the **MNIST** digits classification dataset. The data consists of 60 000 training and 10 000 testing elements representing 28x28 pixel images, each containing a handwritten digit (0-9). 

The elements are in fact **not images**, but arrays of integers between 0 and 255, where every number represents the amount of white in a pixel (0 is a black pixel and 255 is a white pixel).

We load the dataset using the load_data() function from the TensorFlow library. The data is assigned to variables named **x_train**, **y_train**, **x_test** and **y_test**, where:

- **x_train** - input images for training
- **y_train** - output labels for training
- **x_test** - input images for testing
- **y_test** - output labels for testing


In [None]:
mnist = tf.keras.datasets.mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz


In [None]:
print(f'x_train: shape {x_train.shape} and ndim {x_train.ndim}')
print(f'x_test:  shape {x_test.shape} and ndim {x_test.ndim}')

print(f'y_train: shape {y_train.shape} and ndim {y_train.ndim}')
print(f'y_test:  shape {y_test.shape} and ndim {y_test.ndim}')

x_train: shape (60000, 28, 28) and ndim 3
x_test:  shape (10000, 28, 28) and ndim 3
y_train: shape (60000,) and ndim 1
y_test:  shape (10000,) and ndim 1


## 4. Preprocessing

Before we can use our data with a neural network, we need to preprocess it.

To make our input data easier to work with, we scale it by dividing each value by 255. This way, instead of integers 0-255 (black to white) for each pixel, we get numbers **from 0 to 1**.

Our output labels also need modification. For now, they are integers from 0 to 9. Instead, we want them to be seen as **separate categories** of data without any mathematical connection between them. If we show our network the handwritten number 5, we don't want it to think of it as something that is closer to 6 than to 1. For this reason, we use the Keras utility function **to_categorical()**, which one-hot encodes the labels: each label becomes a vector of 10 values with a 1 at the position of its digit and 0 everywhere else.

In [None]:
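# Scale pixel values from 0-255 to the range 0-1
x_train = x_train / 255.0
x_test = x_test / 255.0

# One-hot encode the labels (10 classes, digits 0-9)
y_train = tf.keras.utils.to_categorical(y_train)
y_test = tf.keras.utils.to_categorical(y_test)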
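
To make the two preprocessing steps concrete, here is a minimal plain-Python sketch of what they do to a single row of pixels and a single label. The helper names `scale_pixels` and `one_hot` are made up for this illustration; they are not TensorFlow functions and only mimic the effect of the division by 255 and of to_categorical().

```python
def scale_pixels(row):
    """Map integer pixel values 0-255 to floats in the range 0-1."""
    return [value / 255.0 for value in row]


def one_hot(label, num_classes=10):
    """Return a list with 1.0 at index `label` and 0.0 everywhere else."""
    if not 0 <= label < num_classes:
        raise ValueError(f"label {label} is outside 0..{num_classes - 1}")
    return [1.0 if index == label else 0.0 for index in range(num_classes)]


print(scale_pixels([0, 51, 255]))  # [0.0, 0.2, 1.0]
print(one_hot(5))  # 1.0 only at index 5
```

After encoding, the digits 5 and 6 are no "closer" to each other than 5 and 1: every pair of different labels differs in exactly two positions.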

## 5. Modeling

Using the **Sequential()** constructor, we now build our neural network model. I tried to keep it as simple as possible while still reaching 0.97 accuracy, which is why the model has a single hidden layer of 45 neurons. The hidden layer uses the **ReLU** activation function, which sets a neuron's output to 0 whenever its input is negative and passes positive values through unchanged. The output layer has 10 neurons, one per digit, and uses the **softmax** activation to turn its outputs into class probabilities.
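
As a rough illustration of the two activation functions used in the model, the sketch below implements ReLU and softmax for a plain Python list. It is a simplified version written for this notebook, not the code TensorFlow runs internally.

```python
import math


def relu(values):
    """Replace negative values with 0 and keep positive values unchanged."""
    return [max(0.0, v) for v in values]


def softmax(values):
    """Turn arbitrary scores into probabilities that sum to 1."""
    largest = max(values)
    # Subtracting the largest score avoids overflow in exp() without
    # changing the result.
    exps = [math.exp(v - largest) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


print(relu([-2.0, 0.0, 3.5]))  # [0.0, 0.0, 3.5]
print(softmax([1.0, 2.0, 3.0]))  # three probabilities, largest last
```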

In [None]:
model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(45, activation='relu'),
  tf.keras.layers.Dense(10, activation='softmax')
])
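
To get a feeling for the size of this model, the number of trainable parameters can be worked out by hand. The sketch below assumes the standard dense layer layout of one weight per input-unit pair plus one bias per unit; `dense_params` is a helper name invented for this example.

```python
def dense_params(inputs, units):
    """Weights (inputs * units) plus one bias per unit."""
    return inputs * units + units


flattened = 28 * 28                    # Flatten: 28x28 image -> 784 values
hidden = dense_params(flattened, 45)   # hidden Dense layer with 45 units
output = dense_params(45, 10)          # output Dense layer with 10 units

print(flattened, hidden, output, hidden + output)  # 784 35325 460 35785
```

The Flatten layer itself has no parameters; it only reshapes the input. Calling `model.summary()` on the model is a way to check these numbers.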

In [None]:
model.compile(optimizer = 'adam',
              loss = 'categorical_crossentropy',
              metrics = ['accuracy'])
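
The loss function chosen above, categorical crossentropy, compares the one-hot label with the predicted probabilities. The following is a simplified single-sample sketch; the `eps` clipping value is an assumption of this example, and Keras additionally averages the loss over all samples in a batch.

```python
import math


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Negative log of the probability assigned to the true class."""
    # Clipping at eps avoids log(0) when a probability is exactly zero.
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))


y_true = [0.0, 0.0, 1.0]     # the true class is index 2
confident = [0.1, 0.2, 0.7]  # 0.7 assigned to the true class
unsure = [0.4, 0.4, 0.2]     # only 0.2 assigned to the true class

print(categorical_crossentropy(y_true, confident))
print(categorical_crossentropy(y_true, unsure))
```

The more probability the model puts on the correct digit, the smaller the loss, which is what the optimizer tries to achieve during training.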

## 6. Training

The training process is run by calling the fit() function on our model. We pass it the inputs and outputs of our training data, together with the number of epochs, which defines how many full passes over the training data we want. In this case, to keep things simple while still reaching 0.97 accuracy, I'm using 6 epochs.


In [None]:
model.fit(x_train, y_train, epochs = 6)

Epoch 1/6
Epoch 2/6
Epoch 3/6
Epoch 4/6
Epoch 5/6
Epoch 6/6


<keras.callbacks.History at 0x7f6fb45876d0>
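
Within each epoch the data is processed in batches. Since no batch size is passed to fit(), Keras uses its default of 32 samples per batch. The sketch below shows the resulting arithmetic; `steps_per_epoch` is a helper written for this example, not a Keras function.

```python
import math


def steps_per_epoch(num_samples, batch_size=32):
    """Number of batches needed to see every sample once."""
    return math.ceil(num_samples / batch_size)


train_steps = steps_per_epoch(60000)
print(train_steps)      # 1875 weight updates per epoch
print(6 * train_steps)  # 11250 weight updates over 6 epochs
print(steps_per_epoch(10000))  # 313 batches for the 10 000 test images
```

The last number matches the `313/313` shown in the evaluation output in section 7, because evaluate() uses the same default batch size.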

## 7. Performance and evaluation

After training the model, we check its performance on the test data. As the output shows, its accuracy on the test data reaches 0.97.

In [None]:
model.evaluate(x_test,  y_test, verbose=2)

313/313 - 1s - loss: 0.1028 - accuracy: 0.9708 - 522ms/epoch - 2ms/step


[0.10282813757658005, 0.97079998254776]
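
The accuracy metric counts how often the class with the highest predicted probability matches the true label. Here is a minimal sketch of that calculation on a made-up toy batch of three classes; the helper names and the sample values are invented for illustration only.

```python
def argmax(values):
    """Index of the largest value in a list."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(y_true, y_pred):
    """Share of samples where the predicted class equals the true class."""
    hits = sum(argmax(t) == argmax(p) for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


# Toy data: four one-hot labels and four predicted probability vectors.
y_true = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]
y_pred = [[0.8, 0.1, 0.1], [0.2, 0.7, 0.1], [0.3, 0.4, 0.3], [0.1, 0.6, 0.3]]

print(accuracy(y_true, y_pred))  # 0.75, the third sample is misclassified
```

Applied to the result above, an accuracy of 0.9708 on 10 000 test images means that about 9 708 digits were classified correctly.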

## 8. Discussion and conclusions

The settings I tested were the number of dense units, which is now set to 45, and the number of epochs, which is now set to 6. The final performance on the test data was an accuracy of 0.9708 and a loss of 0.1028.

To improve the model in the future, we could look into the learning algorithm and consider adding batch normalization. Another way to improve our model's performance is to increase its depth by adding convolutional and pooling layers.

Creating this model was my first experience with neural networks, and I made many observations along the way. I learned how to create a simple neural network and how its layers are built from multiple neurons. I want to explore deep learning in more detail and create more advanced models in the future.

To learn more, see [TensorFlow tutorials](https://www.tensorflow.org/tutorials).