# Setting up the environment
We will import Keras, a high-level API that runs on top of libraries such as TensorFlow.

To speed up training, we will use a GPU in Google Colab. To do so, click on Connect (top right), then on "Change runtime type", and select T4 GPU.

In [None]:
import keras
keras.__version__

# Introduction to ConvNets: Classifying handwritten numbers


Let's take a look at a simple example of a convnet. We will use it to classify the MNIST dataset (Modified National Institute of Standards and Technology database), which is an open dataset containing handwritten numbers. The MNIST dataset consists of 70,000 grayscale images of handwritten digits (0-9).

![Handwritten numbers from the MNIST dataset](https://corochann.com/wp-content/uploads/2021/09/mnist_plot.png)

We will train our network on the images from the MNIST dataset.

First, we load the dataset into four arrays: `train_images`, `train_labels`, `test_images`, `test_labels`.

Before you design the network, explore the data:

- What is the size of the training dataset?
- What does the training dataset look like? Print the number of images in each category.
- What do the training labels look like? List them.
- Print the first four images of the training dataset.
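
As a hint for this kind of exploration, here is a minimal sketch that inspects a label array with NumPy. The array `toy_labels` is a small hypothetical stand-in, not the real MNIST labels; the same calls should work on any integer label array.

```python
import numpy as np

# Hypothetical stand-in for a label array (not the real MNIST labels).
toy_labels = np.array([5, 0, 4, 1, 9, 2, 1, 3, 1, 4], dtype=np.uint8)

print("number of samples:", toy_labels.shape[0])
print("distinct labels:", np.unique(toy_labels))

# np.bincount counts how many times each non-negative integer occurs.
counts = np.bincount(toy_labels, minlength=10)
for digit, count in enumerate(counts):
    print(f"digit {digit}: {count} samples")
```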

In [None]:
from keras.datasets import mnist
from keras.utils import to_categorical

(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

## Network architecture
Our network will be designed from several parts, which we detail below. The next picture shows a rough draft (approximately what we will define):

![Architecture](https://i.ibb.co/tL8HSPb/2025-01-16-13-45.png)

Let's create a first basic convnet. It's a stack of `Conv2D` and `MaxPooling2D` layers.
The important thing to note is that a convnet takes as input tensors of size `(image_height, image_width, image_channels)`.
To set this input shape, we must first find out the size of the images in our dataset.
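
To illustrate the `(image_height, image_width, image_channels)` convention, here is a small sketch with a hypothetical batch of grayscale images. The sizes used (5 images of 32x32 pixels) are assumptions chosen only for the example.

```python
import numpy as np

# Hypothetical batch of 5 grayscale images of 32x32 pixels, without a channel axis.
batch = np.zeros((5, 32, 32), dtype=np.uint8)
print(batch.shape[1:])            # shape of a single image: (32, 32)

# Grayscale images have a single channel; add it explicitly as the last axis.
batch_with_channel = batch[..., np.newaxis]
print(batch_with_channel.shape)   # (5, 32, 32, 1)
```

Each image in `batch_with_channel` now has the three dimensions a convnet expects.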

## **Convolutional Layer (Conv2D)**
- **Purpose:** Detects features like edges and textures by applying filters to the input.
- **Key Parameters:**
  - `filters`: Number of filters (feature detectors).
  - `kernel_size`: Size of the filter (commonly (3,3)).
  - `activation`: Non-linear function to introduce complexity (commonly ReLU).

```python
model.add(layers.Conv2D(filters???, kernel_size???, activation='', input_shape=input_shape))
```
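
To get an intuition for what a single filter does, here is a minimal NumPy sketch of one 3x3 filter sliding over a toy image without padding, followed by ReLU. The helper `conv2d_valid`, the toy image and the hand-picked filter are all hypothetical; a real `Conv2D` layer learns many filters during training and also adds a bias.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` without padding and return the feature map."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    feature_map = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Element-wise product of the window and the filter, then sum.
            feature_map[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return feature_map

# Toy 5x5 image: dark on the left, bright on the right (a vertical edge).
image = np.array([[0, 0, 1, 1, 1]] * 5, dtype=float)
# Hand-picked 3x3 filter that responds to vertical edges.
vertical_edge = np.array([[-1, 0, 1]] * 3, dtype=float)

feature_map = conv2d_valid(image, vertical_edge)
relu_map = np.maximum(feature_map, 0.0)   # ReLU keeps only positive responses
print(feature_map.shape)                  # (3, 3)
print(relu_map)
```

The response is largest where the window covers the edge and zero where the image is uniform. Note also that the 5x5 input shrinks to 3x3.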


## **Pooling Layer (MaxPooling2D)**
- **Purpose:** Reduces the spatial size of the feature maps, which decreases computation and helps to limit overfitting.
- **Key Parameters:**
  - `pool_size`: The size of the window to take the maximum value.

```python
model.add(layers.MaxPooling2D(pool_size=????))
```
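
As a sketch of the idea, 2x2 max pooling on a single feature map can be written in NumPy as follows. The helper `max_pool_2x2` and the toy values are hypothetical, and the function assumes an even height and width.

```python
import numpy as np

def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2 on a 2D array with even height and width."""
    h, w = feature_map.shape
    # Split the array into non-overlapping 2x2 blocks, then take each block's maximum.
    blocks = feature_map.reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

feature_map = np.array([[1, 3, 2, 0],
                        [4, 2, 1, 1],
                        [0, 0, 5, 6],
                        [1, 2, 7, 8]])
pooled = max_pool_2x2(feature_map)
print(pooled)
# [[4 2]
#  [2 8]]
```

Each 2x2 window is replaced by its largest value, so the height and width are halved.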

## **Flatten Layer**
- **Purpose:** Converts the multi-dimensional feature maps into a 1D vector to feed into the dense layers.

```python
model.add(layers.????)
```

## **Dense (Fully Connected) Layer**
- **Purpose:** Performs classification based on the features extracted by convolutional and pooling layers.
- **Key Parameters:**
  - `units`: Number of neurons.
  - `activation`: Activation function (commonly ReLU for hidden layers, Softmax for output).

```python
model.add(layers.Dense(units=???, activation=''))  # Hidden layer
model.add(layers.Dense(units=???how many classes???, activation=''))
```


## **Activation Functions**
- **ReLU (Rectified Linear Unit):** Common for hidden layers, it speeds up training and reduces the risk of vanishing gradients.
- **Softmax:** Used in the output layer for multi-class classification, converting outputs into probabilities.
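
Both functions are easy to write by hand. The following NumPy sketch is only an illustration with made-up scores; Keras applies its own implementations when you pass the activation name to a layer.

```python
import numpy as np

def relu(x):
    """Replace negative values with zero."""
    return np.maximum(x, 0.0)

def softmax(logits):
    """Turn a vector of raw scores into probabilities that sum to 1."""
    shifted = logits - np.max(logits)   # subtracting the max avoids overflow
    exps = np.exp(shifted)
    return exps / np.sum(exps)

scores = np.array([-1.0, 0.0, 2.0])     # made-up raw outputs
activated = relu(scores)
probs = softmax(scores)
print(activated)                        # [0. 0. 2.]
print(probs, probs.sum())
```

The largest score gets the largest probability, and the probabilities add up to 1.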


## **Loss Function**
- **Purpose:** Measures how well the model's predictions match the true labels.


```python
model.compile(loss='???')
```
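
As an illustration of what a loss function measures, here is a NumPy sketch of categorical cross-entropy for one-hot labels. The helper name and the example predictions are hypothetical, and the clipping constant `eps` is an assumption added to avoid `log(0)`.

```python
import numpy as np

def categorical_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean cross-entropy between one-hot labels and predicted probabilities."""
    y_pred = np.clip(y_pred, eps, 1.0)
    return -np.mean(np.sum(y_true * np.log(y_pred), axis=1))

# Two samples, three classes, one-hot encoded.
y_true = np.array([[0, 0, 1],
                   [0, 1, 0]], dtype=float)
good_pred = np.array([[0.05, 0.05, 0.90],
                      [0.10, 0.80, 0.10]])
bad_pred = np.array([[0.80, 0.10, 0.10],
                     [0.70, 0.20, 0.10]])

loss_good = categorical_cross_entropy(y_true, good_pred)
loss_bad = categorical_cross_entropy(y_true, bad_pred)
print(loss_good, loss_bad)
```

Predictions that put high probability on the true class give a low loss; confident wrong predictions give a high loss.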

## **Optimizer**
- **Purpose:** Updates the network's weights to minimize the loss function.

```python
model.compile(optimizer='???')
```

🔧 **Adjustable Parameters:**
- Adjust learning rate: `optimizer=???(learning_rate=0.001)`
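
To see the effect of the learning rate, here is a sketch of plain gradient descent on a one-parameter toy function. This is not the update rule of any particular Keras optimizer; the function `f(w) = (w - 3)^2` and the helper name are assumptions made for the example.

```python
def gradient_descent(start, learning_rate, steps):
    """Minimize f(w) = (w - 3)^2 with plain gradient descent."""
    w = start
    for _ in range(steps):
        grad = 2.0 * (w - 3.0)          # derivative of (w - 3)^2
        w = w - learning_rate * grad    # step against the gradient
    return w

w_fast = gradient_descent(start=0.0, learning_rate=0.1, steps=50)
w_slow = gradient_descent(start=0.0, learning_rate=0.001, steps=50)
print(w_fast)   # close to the minimum at w = 3
print(w_slow)   # still far from the minimum
```

With the same number of steps, the smaller learning rate moves much more slowly towards the minimum. A rate that is too large can overshoot instead.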

## **Model Training**
- **Purpose:** Teach the model to recognize digits.

```python
model.fit(x_train_data, y_train_data, epochs=????, batch_size=???, validation_split=???)
```

🔧 **Adjustable Parameters:**
- Increase `epochs` to allow more training cycles.
- Change `batch_size` for different update frequencies.
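
To make "update frequency" concrete, the following sketch counts the weight updates per epoch for a given batch size. The helper is hypothetical. It assumes that the validation split is removed before batching and that the last, smaller batch also counts as one update.

```python
import math

def updates_per_epoch(num_samples, batch_size, validation_split=0.0):
    """Number of weight updates in one epoch."""
    num_train = math.floor(num_samples * (1.0 - validation_split))
    return math.ceil(num_train / batch_size)

print(updates_per_epoch(60000, 64))        # 938
print(updates_per_epoch(60000, 128))       # 469
print(updates_per_epoch(60000, 64, 0.1))   # 844
```

Doubling the batch size halves the number of updates per epoch.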

---


The network must have the following layers, in this order:

- First, a convolutional layer (`Conv2D`) with 32 filters of size 3x3 and ReLU activation. In this first layer you must specify the size of the input (`input_shape`).
- Second, a 2x2 max pooling layer (`MaxPooling2D`).
- Third, a convolutional layer with 64 filters of size 3x3 and ReLU activation.
- Fourth, a 2x2 max pooling layer (`MaxPooling2D`).
- Fifth, a convolutional layer with 64 filters of size 3x3 and ReLU activation.


You'll know you've done it right when the model.summary() output is:

![imagen_output.png](https://github.com/laramaktub/MachineLearningI/blob/master/imagen_output.png?raw=true)


In [None]:
from keras import layers
from keras import models

input_shape = ()

model = models.Sequential()
model.add(layers.Conv2D(, , activation='', input_shape=input_shape))
model.add...

You can see above that the output of each `Conv2D` and `MaxPooling2D` layer is a 3D tensor of dimensions (height, width, channels). The width and height tend to decrease as we go deeper into the network. The number of channels is controlled by the first argument passed to the `Conv2D` layers (e.g. 32 or 64), that is, the number of filters applied.
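
The shrinking height and width can be checked with simple arithmetic. The sketch below assumes 28x28 input images, 3x3 convolutions without padding and with stride 1, and non-overlapping 2x2 pooling; the helper names are hypothetical.

```python
def conv_output_size(size, kernel_size):
    """Output size of a convolution without padding and with stride 1."""
    return size - kernel_size + 1

def pool_output_size(size, pool_size):
    """Output size of non-overlapping pooling (any remainder is dropped)."""
    return size // pool_size

size = 28   # assumed input height and width
sizes = []
for layer in ["conv", "pool", "conv", "pool", "conv"]:
    if layer == "conv":
        size = conv_output_size(size, 3)
    else:
        size = pool_output_size(size, 2)
    sizes.append(size)
    print(layer, "->", (size, size))

print(sizes)   # [26, 13, 11, 5, 3]
```

Under these assumptions the spatial size goes from 28 to 3 over the five layers.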

The next step would be to give our last tensor (of dimensions (3, 3, 64)) as input to a densely connected network. These classifiers process vectors, which are 1D, while our output is a 3D tensor. So first we will have to flatten our 3D output and convert it to 1D and then add a few dense layers:

- Flatten the 3D output into a 1D vector
- Add a first dense layer of 64 neurons
- Add a last dense layer of 10 neurons (one per class to classify) with softmax activation

You'll know you've done well when the summary looks like this:

![imagen_output_flat.png](https://github.com/laramaktub/MachineLearningI/blob/master/imagen_output_flat.png?raw=true)

In [None]:
model.add(layers...)


As you can see, our 3D output of shape `(3, 3, 64)` has been flattened into a vector of shape `(576,)` before entering the two dense layers.
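
The number 576 is simply 3 x 3 x 64. A quick NumPy check, using a hypothetical tensor filled with dummy values:

```python
import numpy as np

# Hypothetical stand-in for the output of the last convolutional layer.
feature_maps = np.arange(3 * 3 * 64).reshape(3, 3, 64)

# Flattening keeps every value and only changes the shape.
flat = feature_maps.reshape(-1)
print(flat.shape)   # (576,)
```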

We are now going to train our network with the images from the MNIST dataset.

Remember that we loaded the dataset into four arrays: `train_images`, `train_labels`, `test_images`, `test_labels`.

Next, you will give the training and test datasets the shape the neural network expects. You will also convert the labels, which are currently integers, into their categorical (one-hot) form. See the `keras.utils` documentation.
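
To show what the categorical form looks like, here is a NumPy sketch of one-hot encoding on a few made-up labels. The helper `one_hot` is hypothetical and only illustrates the result; in the exercise you should use the Keras utility instead.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return one row per label, with a 1 at the label's index and 0 elsewhere."""
    return np.eye(num_classes, dtype="float32")[labels]

toy_labels = np.array([3, 0, 9])   # made-up integer labels
encoded = one_hot(toy_labels, num_classes=10)
print(encoded.shape)               # (3, 10)
print(encoded[0])                  # 1.0 at index 3, zeros elsewhere
```

Each row has as many entries as there are classes, which matches the size of the softmax output.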


In [None]:
train_images = train_images.reshape((60000, 28, 28, 1))
train_images = train_images.astype('float32') / 255

test_images = test_images.reshape((10000, 28, 28, 1))
test_images = test_images.astype('float32') / 255

train_labels =
test_labels =



Compile the model using the "rmsprop" optimizer and categorical cross-entropy as the loss function.
Then train the model for 5 epochs with a batch size of 64, indicating what the training data and its labels are.

In [None]:
model.compile( , , metrics=['accuracy'])
model.fit(, , , )

Let's evaluate the model with the test images:

In [None]:
test_loss, test_acc = model.evaluate(,)

Create an image of a digit in your own handwriting and check the prediction.

In [None]:
import tensorflow as tf
from tensorflow.keras.preprocessing import image
import numpy as np
import matplotlib.pyplot as plt

img_width = 28
img_height = 28

# Load the image with TensorFlow (use the image you created)
img = tf.keras.preprocessing.image.load_img('siete.jpg', target_size=(img_height, img_width), color_mode="grayscale")
# Convert the image to a NumPy array
x = tf.keras.preprocessing.image.img_to_array(img)
# Add a batch dimension so the array has shape (1, img_height, img_width, 1)
x = np.expand_dims(x, axis=0)

# Show the image
plt.imshow(x[0, :, :, 0], cmap='gray')
plt.axis('off')  # Hide the axes
plt.show()


In [None]:
model.save('net_numbers.h5')

Load the model that you just saved and make a prediction with the digit you just created. Try several digits. Does it work properly? Explain why you think this happens.
