# AlexNet Model
A. Krizhevsky et al. described their architecture as:
* "The first convolutional layer filters the 224x224 input image with 96 kernels of size 11x11 with a stride of 4 pixels."
* Then, "the second convolutional layer takes as input the (response-pooled) output of the first convolutional layer and filters it with 256 kernels of size 5x5."
* "The third, fourth, and fifth convolutional layers are connected to one another without any intervening pooling layers"
    * "The third convolutional layer has 384 kernels of size 3x3 connected to the outputs of the second layer"
    * "The fourth convolutional layer has 384 kernels of size 3x3"
    * "The fifth layer has 256 kernels of size 3x3"
* The fully-connected layers have 4096 neurons each.

A. Krizhevsky et al. also specify that a ReLU function is applied to the output of each convolutional layer.

**Pooling** is applied after the first, second and fifth convolutional layers (A. Krizhevsky et al. 2017).

**Dropout** is also applied before and after the first fully connected layer to avoid *overfitting* (A. Krizhevsky et al. 2017).
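
The effect of a dropout layer during training can be sketched in plain Python. This is a minimal illustration of the "inverted" dropout that PyTorch's `nn.Dropout` applies in training mode: each activation is zeroed with probability `p`, and the survivors are scaled by `1 / (1 - p)` so the expected value is unchanged. The helper name `dropout` and the sample activations are hypothetical and are not part of the model.

```python
import random


def dropout(values, p=0.5, rng=random):
    """Inverted dropout: zero each value with probability p, rescale the rest."""
    keep_scale = 1.0 / (1.0 - p)
    return [0.0 if rng.random() < p else v * keep_scale for v in values]


rng = random.Random(0)                  # fixed seed so the sketch is repeatable
activations = [1.0, 2.0, 3.0, 4.0]      # hypothetical layer outputs
dropped = dropout(activations, p=0.5, rng=rng)

# With p=0.5 every entry is either 0.0 or twice the original activation.
print(dropped)
```

At evaluation time (`model.eval()` in PyTorch) dropout is disabled and the activations pass through unchanged.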

The model, called "AlexNet", is defined in Python with PyTorch as follows:

In [1]:
from torch import nn


class AlexNet(nn.Module):
    """
    This is the definition of the AlexNet which will be compared to the Phi-LetNet-5.
    This AlexNet will be modified in order to classify just two classes.
    """
    def __init__(self) -> None:
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels=3, out_channels=96, kernel_size=11, stride=4),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(in_channels=96, out_channels=256, kernel_size=5, stride=1, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(in_channels=256, out_channels=384, kernel_size=3, stride=1, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_channels=384, out_channels=384, kernel_size=3, stride=1, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_channels=384, out_channels=256, kernel_size=3, stride=1, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2)
        )
        self.flatten = nn.Flatten()
        self.classifier = nn.Sequential(
            nn.Dropout(p=0.5),
            nn.Linear(in_features=6400, out_features=4096),
            nn.ReLU(inplace=True),
            nn.Dropout(p=0.5),
            nn.Linear(in_features=4096, out_features=4096),
            nn.ReLU(inplace=True),
            nn.Linear(in_features=4096, out_features=2),
            nn.LogSoftmax(dim=1)
        )

    def forward(self, x_parameter):
        """
        Data processing method
        """
        x_parameter = self.features(x_parameter)
        x_parameter = self.flatten(x_parameter)
        x_parameter = self.classifier(x_parameter)
        return x_parameter

The class constructor defines three attributes: features, flatten and classifier.

**Features** contains the definition for the convolutional and max pooling layers.

**Flatten** defines a flattening layer before the fully connected layers.
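
The input width of the first fully connected layer (6400) follows from the size of the feature maps that reach the flattening layer. The sketch below traces that arithmetic, assuming a 224x224 input as in the quoted description and the kernel sizes, strides and paddings used in the definition above. The helper `conv_out` is a hypothetical name; it applies the usual floor-based output-size formula, which is also PyTorch's default for `nn.Conv2d` and `nn.MaxPool2d`.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a convolution or pooling layer (floor rounding)."""
    return (size + 2 * padding - kernel) // stride + 1


size = 224                                    # assumed input side length
size = conv_out(size, kernel=11, stride=4)    # conv1 -> 54
size = conv_out(size, kernel=3, stride=2)     # pool1 -> 26
size = conv_out(size, kernel=5, padding=2)    # conv2 -> 26
size = conv_out(size, kernel=3, stride=2)     # pool2 -> 12
size = conv_out(size, kernel=3, padding=1)    # conv3 -> 12
size = conv_out(size, kernel=3, padding=1)    # conv4 -> 12
size = conv_out(size, kernel=3, padding=1)    # conv5 -> 12
size = conv_out(size, kernel=3, stride=2)     # pool3 -> 5

flat_features = 256 * size * size             # 256 channels of 5x5 maps
print(size, flat_features)                    # 5 6400
```

A different input resolution changes this number, so `in_features` of the first `nn.Linear` layer would have to be adjusted accordingly.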

**Classifier** contains the three fully connected layers and the two dropout layers with the specified input sizes. To classify the images into the two categories of the experiment (HAS_CACTUS, NO_CACTUS), a LogSoftmax function is applied to the output of the last fully connected layer. It returns the logarithm of the normalized probability distribution over the two classes, following the definition of E. López-Jiménez et al.
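
To make the role of LogSoftmax concrete, here is a small plain-Python sketch of the same computation for one image. The two raw scores are hypothetical values, and the helper name `log_softmax` is illustrative. The outputs are log-probabilities, so exponentiating them gives a distribution that sums to 1.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax: x_i - log(sum_j exp(x_j))."""
    shift = max(logits)
    log_sum = shift + math.log(sum(math.exp(x - shift) for x in logits))
    return [x - log_sum for x in logits]


logits = [2.0, -1.0]                        # hypothetical scores for the two classes
log_probs = log_softmax(logits)
probs = [math.exp(lp) for lp in log_probs]  # back to ordinary probabilities

# The predicted class is the index with the largest log-probability.
predicted = max(range(len(log_probs)), key=log_probs.__getitem__)
print(log_probs, probs, predicted)
```

Because the model emits log-probabilities, a loss such as PyTorch's `nn.NLLLoss` is the natural match during training; which loss the experiment actually used is not stated here, so treat that pairing as an assumption.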

The *forward* method defines the order in which the inputs are processed:
* First, the inputs pass through the convolutional and max pooling layers (features).
* Then, the output is flattened.
* Finally, the flattened output is fed to the classifier, which determines the category.
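
The three steps above can be traced with a random batch. The sketch below rebuilds the same layers as standalone `nn.Sequential` blocks, rather than importing the `AlexNet` class, so that it runs on its own. The batch of two 224x224 RGB images is an assumed input, and the variable names are illustrative.

```python
import torch
from torch import nn

features = nn.Sequential(
    nn.Conv2d(3, 96, kernel_size=11, stride=4), nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=3, stride=2),
    nn.Conv2d(96, 256, kernel_size=5, padding=2), nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=3, stride=2),
    nn.Conv2d(256, 384, kernel_size=3, padding=1), nn.ReLU(inplace=True),
    nn.Conv2d(384, 384, kernel_size=3, padding=1), nn.ReLU(inplace=True),
    nn.Conv2d(384, 256, kernel_size=3, padding=1), nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=3, stride=2),
)
flatten = nn.Flatten()
classifier = nn.Sequential(
    nn.Dropout(p=0.5), nn.Linear(6400, 4096), nn.ReLU(inplace=True),
    nn.Dropout(p=0.5), nn.Linear(4096, 4096), nn.ReLU(inplace=True),
    nn.Linear(4096, 2), nn.LogSoftmax(dim=1),
)

# Evaluation mode disables dropout so the pass is deterministic.
features.eval()
classifier.eval()

batch = torch.randn(2, 3, 224, 224)      # assumed: two random RGB images
with torch.no_grad():
    feature_maps = features(batch)       # step 1: convolution and pooling
    flat = flatten(feature_maps)         # step 2: flatten to one vector per image
    log_probs = classifier(flat)         # step 3: log-probabilities per class

print(feature_maps.shape)  # torch.Size([2, 256, 5, 5])
print(flat.shape)          # torch.Size([2, 6400])
print(log_probs.shape)     # torch.Size([2, 2])
```

The weights here are randomly initialized, so the printed shapes are meaningful but the predicted classes are not.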


## Execution Parameters
For this model, the hyperparameters defined by E. López-Jiménez et al. were tried first. However, when training was executed, PyTorch reported that a batch size of 2500 could not be used, so the batch size was reduced to 250. The epoch count remained at 150, and the learning rate was lowered to 0.001. This reduction is explained in the results section.

In [2]:
LEARNING_RATE = 0.001
EPOCH_COUNT = 150
BATCH_SIZE = 250
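
As a rough illustration of what the smaller batch size means for training, the sketch below counts weight updates per epoch. The training-set size `TRAIN_SAMPLES` is a hypothetical placeholder, not the size of the dataset used in the experiment, and the count assumes the last partial batch is kept (the default behaviour of PyTorch's `DataLoader`).

```python
import math

LEARNING_RATE = 0.001
EPOCH_COUNT = 150
BATCH_SIZE = 250

TRAIN_SAMPLES = 10_000  # hypothetical dataset size, for illustration only

batches_per_epoch = math.ceil(TRAIN_SAMPLES / BATCH_SIZE)
total_updates = batches_per_epoch * EPOCH_COUNT

# For comparison: the originally intended batch size of 2500.
batches_per_epoch_2500 = math.ceil(TRAIN_SAMPLES / 2500)

print(batches_per_epoch, total_updates, batches_per_epoch_2500)
```

With ten times fewer samples per batch, each epoch performs ten times more optimizer steps, which is one reason batch size and learning rate are usually tuned together.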

## Execution Times
This model's execution time ranges from 2 hours to 2 hours and 30 minutes.

# References
* López-Jiménez, Efren; Vasquez-Gomez, Juan Irving; Sanchez-Acevedo, Miguel Angel; Herrera-Lozada, Juan Carlos; Uriarte-Arcia, Abril Valeria (2019). "Columnar Cactus Recognition in Aerial Images using a Deep Learning Approach". Ecological Informatics. 52: 131–138.
* Krizhevsky, Alex; Sutskever, Ilya; Hinton, Geoffrey E. (2017-05-24). "ImageNet classification with deep convolutional neural networks". Communications of the ACM. 60 (6): 84–90. doi:10.1145/3065386. ISSN 0001-0782.