### 🚀 Deep Learning Insights from Week 8

This week, I dived deep into the fascinating world of Convolutional Neural Networks (CNNs), a powerful tool for analyzing visual imagery. Here are some key takeaways:

Layers of Learning: I worked through the four building blocks of a CNN: the convolution layer, the ReLU activation, the pooling layer, and the dense layer. Each plays a distinct role in extracting and interpreting features from images.

Leveraging Transfer Learning: I explored transfer learning, where I keep the convolutional layers of a pre-trained network and train only new dense layers for my specific task. Reusing the features the network has already learned speeds up training.

Dropout Technique: I implemented Dropout, a regularization method that improves how well a network generalizes. Randomly dropping nodes of a layer during training reduces overfitting and tends to improve the model's performance on unseen data.
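
To make the idea concrete, here is a minimal pure-Python sketch of "inverted" dropout, the variant in which surviving activations are scaled by `1 / (1 - rate)` so their expected sum stays the same. The function name `dropout` and the toy input are illustrative assumptions, not code from the course.

```python
import random


def dropout(activations, rate, rng):
    """Zero each activation with probability `rate`; scale the survivors.

    Illustrative helper (not a Keras API): this mimics what a dropout
    layer does at training time only. At inference nothing is dropped.
    """
    keep_prob = 1.0 - rate
    out = []
    for a in activations:
        if rng.random() < rate:
            out.append(0.0)            # this node is dropped for this step
        else:
            out.append(a / keep_prob)  # rescale so the expected value is unchanged
    return out


rng = random.Random(0)                 # fixed seed so the run is repeatable
layer_output = [1.0] * 10              # toy activations, assumed for the demo
dropped = dropout(layer_output, rate=0.5, rng=rng)
print(dropped)
```

With `rate=0.5`, every entry of `dropped` is either `0.0` or `2.0`. Which positions are zeroed depends on the random draws.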

Data Augmentation: Lastly, I delved into Data Augmentation, a technique that artificially enlarges the dataset by generating new images from existing ones. Small alterations such as flipping, cropping, and adjusting brightness or contrast add variety to the training data and improve model robustness.
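
As a rough illustration of two of these alterations, the sketch below flips a tiny grayscale "image" and brightens it. The image is a nested list of pixel values, and the helper names are hypothetical. Real pipelines would use a library, but the transformations amount to the same thing.

```python
def horizontal_flip(image):
    """Mirror the image left-to-right by reversing every row."""
    return [row[::-1] for row in image]


def adjust_brightness(image, factor):
    """Multiply each pixel by `factor`, clamping to the 0..255 range."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in image]


# Toy 2x3 grayscale image (values assumed for the demo).
image = [
    [0, 50, 100],
    [150, 200, 250],
]

flipped = horizontal_flip(image)
brighter = adjust_brightness(image, 1.2)
print(flipped)
print(brighter)
```

Each transformed copy keeps the original label, which is why the dataset grows without any new labelling effort.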

Stay tuned for more exciting updates as I continue my deep learning journey! 💡

### Dataset
In this homework, we'll build a model for predicting if we have an image of a bee or a wasp. For this, we will use the "Bee or Wasp?" dataset that was obtained from [Kaggle](https://www.kaggle.com/datasets/jerzydziewierz/bee-vs-wasp) and slightly rebuilt.

Saturn Cloud can be used to run the code.

### Model
For this homework we will use a Convolutional Neural Network (CNN). Like in the lectures, we'll use Keras.

You need to develop the model with the following structure:

- The shape for input should be `(150, 150, 3)` 
- Next, create a convolutional layer `(Conv2D)`: 
    - Use 32 filters 
    - Kernel size should be `(3, 3)` (that's the size of the filter) 
    - Use `'relu'` as activation 
- Reduce the size of the feature map with max pooling `(MaxPooling2D)` 
    - Set the pooling size to `(2, 2)` 
- Turn the multi-dimensional result into vectors using a `Flatten` layer 
- Next, add a `Dense` layer with 64 neurons and `'relu'` activation 
- Finally, create the `Dense` layer with 1 neuron - this will be the output 
    - The output layer should have an activation - use the appropriate activation for the binary classification case 
- As optimizer use `SGD` with the following parameters:
    - `SGD(lr=0.002, momentum=0.8)` (recent Keras versions name this argument `learning_rate` instead of `lr`)
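
The structure above could be assembled roughly as follows. This is a sketch under a few assumptions: it uses `tensorflow.keras`, passes `learning_rate` rather than `lr`, picks `sigmoid` as the output activation for the binary case, and uses the binary crossentropy loss highlighted in Question 1. The function name `build_model` is illustrative.

```python
INPUT_SHAPE = (150, 150, 3)
LEARNING_RATE = 0.002
MOMENTUM = 0.8


def build_model():
    """Build and compile the small CNN described in the list above."""
    # Imported inside the function so the constants above can be read
    # even on a machine without TensorFlow installed.
    from tensorflow import keras

    model = keras.Sequential([
        keras.Input(shape=INPUT_SHAPE),
        keras.layers.Conv2D(32, (3, 3), activation="relu"),
        keras.layers.MaxPooling2D(pool_size=(2, 2)),
        keras.layers.Flatten(),
        keras.layers.Dense(64, activation="relu"),
        # One sigmoid unit: outputs the probability of the positive class.
        keras.layers.Dense(1, activation="sigmoid"),
    ])
    optimizer = keras.optimizers.SGD(learning_rate=LEARNING_RATE, momentum=MOMENTUM)
    model.compile(optimizer=optimizer, loss="binary_crossentropy", metrics=["accuracy"])
    return model
```

Calling `build_model().summary()` prints the layer-by-layer parameter counts.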

***Question 1***
Since we have a binary classification problem, what is the best loss function for us?

- mean squared error
- **binary crossentropy**
- categorical crossentropy
- cosine similarity

Note: since we specify an activation for the output layer, we don't need to set `from_logits=True`
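
For intuition, binary crossentropy can be written out by hand. The sketch below assumes the model already outputs probabilities, as it does when the last layer has a sigmoid activation. The function name and the clipping constant `eps` are illustrative choices.

```python
import math


def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean of -(y*log(p) + (1-y)*log(1-p)) over all samples."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


# Toy labels and predicted probabilities, assumed for the demo.
confident = binary_crossentropy([1, 0], [0.9, 0.1])
unsure = binary_crossentropy([1, 0], [0.5, 0.5])
print(confident, unsure)
```

Confident, correct predictions give a loss close to 0. Predicting 0.5 for everything gives `ln 2`, about 0.693.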

***Question 2***
What's the number of parameters in the convolutional layer of our model? You can use the `summary` method for that.

- 1
- 65
- **896**
- 11214912
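
The count can also be checked by hand: each filter has one weight per kernel cell per input channel, plus one bias. The sketch below works this out for every layer of the model, assuming Keras' default `'valid'` padding for `Conv2D`. The helper names are illustrative.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights per filter (kernel cells x input channels) plus one bias each."""
    return (kernel_h * kernel_w * in_channels + 1) * filters


def dense_params(inputs, units):
    """One weight per input plus one bias, for every unit."""
    return (inputs + 1) * units


conv = conv2d_params(3, 3, 3, 32)   # (3*3*3 + 1) * 32 = 896

# 'valid' 3x3 convolution: 150 -> 148; 2x2 max pooling: 148 -> 74.
side = (150 - 3 + 1) // 2
flat = side * side * 32             # size of the flattened vector

hidden = dense_params(flat, 64)     # 11214912
output = dense_params(64, 1)        # 65
print(conv, flat, hidden, output)
```

The other answer options, 65 and 11214912, are the parameter counts of the two dense layers.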

### Generators and Training
For the next two questions, use the following data generator for both train and test sets:

`ImageDataGenerator(rescale=1./255)`
- We don't need to do any additional pre-processing for the images.
- When reading the data from train/test directories, check the `class_mode` parameter. Which value should it be for a binary classification problem?
- Use `batch_size=20`
- Use `shuffle=True` for both training and test sets.

For training, use `.fit()` with the following parameters:

```python
model.fit(
    train_ds,
    epochs=10,
    validation_data=test_ds
)
```
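
The `train_ds` and `test_ds` objects passed to `.fit()` could be created along these lines. This is a sketch: the directory arguments are hypothetical placeholders, `target_size` is assumed to match the model's `(150, 150, 3)` input, and `class_mode="binary"` is the setting that yields a single 0/1 label per image.

```python
BATCH_SIZE = 20
TARGET_SIZE = (150, 150)  # assumed to match the model input


def make_generators(train_dir, test_dir):
    """Return (train_ds, test_ds) iterators that read images from folders."""
    # Imported here so this snippet can be loaded without TensorFlow.
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    gen = ImageDataGenerator(rescale=1.0 / 255)  # scale pixels to [0, 1]
    common = dict(
        target_size=TARGET_SIZE,
        batch_size=BATCH_SIZE,
        class_mode="binary",  # one 0/1 label per image
        shuffle=True,
    )
    train_ds = gen.flow_from_directory(train_dir, **common)
    test_ds = gen.flow_from_directory(test_dir, **common)
    return train_ds, test_ds


# Example call with placeholder paths (adjust to where the data lives):
# train_ds, test_ds = make_generators("./data/train", "./data/test")
```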


***Question 3***
What is the median of training accuracy for all the epochs for this model?

- 0.20
- 0.40
- 0.60
- **0.80**


***Question 4***
What is the standard deviation of training loss for all the epochs for this model?

- 0.031
- 0.061
- **0.091**
- 0.131
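
Both statistics come from the `History` object that `model.fit()` returns, whose `.history` attribute is a dict of per-epoch lists. The sketch below uses the standard library and a stand-in dict with made-up numbers, not the real training results. `statistics.pstdev` is the population standard deviation, which matches the default of `numpy.std`.

```python
import statistics

# Stand-in for `history.history`; the values are invented placeholders,
# NOT results from this homework.
history = {
    "accuracy": [0.55, 0.62, 0.70, 0.74, 0.79],
    "loss": [0.68, 0.64, 0.58, 0.52, 0.47],
}

median_acc = statistics.median(history["accuracy"])
std_loss = statistics.pstdev(history["loss"])  # population std
print(median_acc, std_loss)
```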

### Data Augmentation
For the next two questions, we'll generate more data using data augmentations.

Add the following augmentations to your training data generator:

- `rotation_range=50`
- `width_shift_range=0.1`
- `height_shift_range=0.1`
- `zoom_range=0.1`
- `horizontal_flip=True`
- `fill_mode='nearest'`
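
One way to pass these settings is to collect them in a dict and unpack it into the training generator. This is a sketch: the names `AUGMENTATIONS` and `make_augmented_generator` are illustrative, and the test generator would keep only `rescale`.

```python
AUGMENTATIONS = dict(
    rotation_range=50,
    width_shift_range=0.1,
    height_shift_range=0.1,
    zoom_range=0.1,
    horizontal_flip=True,
    fill_mode="nearest",
)


def make_augmented_generator():
    """Training-only generator: rescaling plus the augmentations above."""
    # Imported here so the settings dict can be inspected without TensorFlow.
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    return ImageDataGenerator(rescale=1.0 / 255, **AUGMENTATIONS)
```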

***Question 5***
Let's train our model for 10 more epochs using the same code as previously.

Note: make sure you don't re-create the model - we want to continue training the model we already started training.

What is the mean of test loss for all the epochs for the model trained with augmentations?

- 0.18
- **0.48**
- 0.78
- 0.108


***Question 6***
What's the average of test accuracy for the last 5 epochs (from 6 to 10) for the model trained with augmentations?

- 0.38
- 0.58
- **0.78**
- 0.98
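
The last two questions reduce to averaging slices of the validation metrics. In the sketch below the lists stand in for `history.history["val_loss"]` and `history.history["val_accuracy"]`, and the numbers are invented placeholders rather than real results. Epochs 6 to 10 correspond to the zero-based slice `[5:10]`.

```python
import statistics

# Placeholder per-epoch values (NOT the homework's real output).
val_loss = [0.70, 0.68, 0.66, 0.64, 0.62, 0.60, 0.58, 0.56, 0.54, 0.52]
val_accuracy = [0.50, 0.52, 0.54, 0.56, 0.58, 0.60, 0.62, 0.64, 0.66, 0.68]

mean_val_loss = statistics.mean(val_loss)   # mean over all 10 epochs
last5 = val_accuracy[5:10]                  # epochs 6..10
mean_last5_acc = statistics.mean(last5)
print(mean_val_loss, mean_last5_acc)
```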