In [None]:
"""
Notes from the paper:

The AlexNet paper used convolutional neural networks to win the ImageNet competition (ILSVRC) in 2012.

Goal:
Image Classification

Dataset Used:
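ImageNet-1000 (ILSVRC subset)
ImageNet is a dataset of over 15 million labelled high-resolution images in roughly 22,000 categories. ("High-resolution" relatively speaking: MNIST images are 28x28, while these are downsampled to 256x256.)
The 1000-category ILSVRC subset (roughly 1.2 million training images) was used for this paper.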

Method Used:
Five convolutional layers, some of which are followed by max-pooling layers. The final three layers are fully connected, with dropout applied in the first two.
Ends with a 1000-way softmax layer.

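Architecture:
Input (3, 224, 224) (the 55x55 map after the first convolution implies an effective 227x227 input, or padding)
-> Convolution (11x11, 96 filters, stride 4) (96, 55, 55)
-> ReLU -> Local Response Normalization
-> Max Pooling (3x3, stride 2) (96, 27, 27)
-> Convolution (5x5, 256 filters) (256, 27, 27)
-> ReLU -> Local Response Normalization
-> Max Pooling (3x3, stride 2) (256, 13, 13)
-> Convolution (3x3, 384 filters) (384, 13, 13)
-> ReLU
-> Convolution (3x3, 384 filters) (384, 13, 13)
-> ReLU
-> Convolution (3x3, 256 filters) (256, 13, 13)
-> ReLU
-> Max Pooling (3x3, stride 2) (256, 6, 6)
-> Fully Connected (4096) -> ReLU -> Dropout
-> Fully Connected (4096) -> ReLU -> Dropout
-> Fully Connected (1000) -> Softmax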

Training Parameters / Hyperparameters:
- Data Augmentation: Random 224x224 patches are cropped from the 256x256 images, together with their horizontal mirrors.
    - At test time, the image is resized to 256x256, five 224x224 patches (the four corners and the center) are cropped and mirrored, and the network is run on all ten. The final prediction is the average of the 10 softmax predictions.
- They wrote a Cuda ConvNet from scratch to train the network. BASED
- SGD with momentum 0.9 and weight decay 0.0005
- Batch Size: 128

Metrics Defined:
Error Rate
- Number of misclassified test samples / Total number of test samples

Top 1 vs top 5 error rate
- Top 1 error rate is the fraction of test samples for which the correct label is not the single most probable predicted label
- Top 5 error rate is the fraction of test samples for which the correct label is not among the 5 most probable predicted labels

Results (ILSVRC-2010 test set):
- Top-1 error rate: 37.5%
- Top-5 error rate: 17.0%
"""
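
A quick way to sanity-check the feature-map sizes of AlexNet's convolution and pooling stack is the usual formula `out = floor((in + 2*padding - kernel) / stride) + 1`. The sketch below is only an illustration: kernel sizes, strides, and filter counts are taken from the paper, while the padding values and the 227x227 input are assumptions (the paper states 224x224, which does not produce a 55x55 map after the first layer without padding). `conv_out`, `LAYERS`, and `trace_shapes` are hypothetical names.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


# (name, out_channels or None for pooling, kernel, stride, padding)
# Paddings are assumed values chosen to reproduce the 55/27/13 map sizes.
LAYERS = [
    ("conv1", 96, 11, 4, 0),
    ("pool1", None, 3, 2, 0),
    ("conv2", 256, 5, 1, 2),
    ("pool2", None, 3, 2, 0),
    ("conv3", 384, 3, 1, 1),
    ("conv4", 384, 3, 1, 1),
    ("conv5", 256, 3, 1, 1),
    ("pool5", None, 3, 2, 0),
]


def trace_shapes(input_size=227, channels=3):
    """Return [(layer_name, (channels, height, width)), ...] for the stack."""
    shapes = []
    size = input_size
    for name, out_channels, kernel, stride, padding in LAYERS:
        size = conv_out(size, kernel, stride, padding)
        if out_channels is not None:  # pooling keeps the channel count
            channels = out_channels
        shapes.append((name, (channels, size, size)))
    return shapes


for name, shape in trace_shapes():
    print(name, shape)
```

Under these assumptions the last pooling layer yields a (256, 6, 6) tensor, i.e. 9216 inputs to the first fully connected layer.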

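The top-1 / top-5 error rates from the notes can be computed from per-class scores roughly as follows. This is a minimal pure-Python sketch on made-up toy scores (3 classes, so top-2 stands in for top-5); `top_k_error` is a hypothetical helper, not something from the paper or from torchvision.

```python
def top_k_error(scores, labels, k):
    """Fraction of samples whose true label is not among the k highest scores."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    misses = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the best k.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        if label not in top_k:
            misses += 1
    return misses / len(labels)


# Toy data (illustrative only): one row of class scores per sample.
scores = [
    [0.1, 0.6, 0.3],  # true label 1: top-1 hit
    [0.5, 0.2, 0.3],  # true label 2: top-1 miss, top-2 hit
    [0.7, 0.2, 0.1],  # true label 2: miss for both k=1 and k=2
    [0.2, 0.3, 0.5],  # true label 2: top-1 hit
]
labels = [1, 2, 2, 2]

top1 = top_k_error(scores, labels, k=1)
top2 = top_k_error(scores, labels, k=2)
print(f"top-1 error: {top1:.2f}, top-2 error: {top2:.2f}")
```

The top-k error can never exceed the top-1 error, since every top-1 hit is also a top-k hit.
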
# Getting ImageNet

In [None]:
!mkdir -p ./data/Imagenet
!wget https://image-net.org/data/ILSVRC/2012/ILSVRC2012_devkit_t12.tar.gz -O ./data/Imagenet/ILSVRC2012_devkit_t12.tar.gz
# !tar -xvf ./data/Imagenet/ILSVRC2012_devkit_t12.tar.gz
!wget https://image-net.org/data/ILSVRC/2012/ILSVRC2012_img_train.tar -O ./data/Imagenet/ILSVRC2012_img_train.tar
# !tar -xvf ./data/Imagenet/ILSVRC2012_img_train.tar
!wget https://image-net.org/data/ILSVRC/2012/ILSVRC2012_img_val.tar -O ./data/Imagenet/ILSVRC2012_img_val.tar
# !tar -xvf ./data/Imagenet/ILSVRC2012_img_val.tar

--2025-02-13 11:40:03--  https://image-net.org/data/ILSVRC/2012/ILSVRC2012_devkit_t12.tar.gz
Resolving image-net.org (image-net.org)... 171.64.68.16
Connecting to image-net.org (image-net.org)|171.64.68.16|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2568145 (2.4M) [application/x-gzip]
Saving to: ‘./data/Imagenet/ILSVRC2012_devkit_t12.tar.gz’


2025-02-13 11:40:07 (893 KB/s) - ‘./data/Imagenet/ILSVRC2012_devkit_t12.tar.gz’ saved [2568145/2568145]

--2025-02-13 11:40:07--  https://image-net.org/data/ILSVRC/2012/ILSVRC2012_img_train.tar
Resolving image-net.org (image-net.org)... 171.64.68.16
Connecting to image-net.org (image-net.org)|171.64.68.16|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 147897477120 (138G) [application/x-tar]
Saving to: ‘./data/Imagenet/ILSVRC2012_img_train.tar’

g_train.tar           0%[                    ]  51.15M  1.16MB/s    eta 34h 52m^C
--2025-02-13 11:40:54--  https://image-net.org/data/ILSVRC/2012/IL

In [None]:
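from torchvision.datasets import ImageNet

# torchvision does not download ImageNet itself: the devkit and image
# archives fetched above must already be present in `root`.
train_data = ImageNet(root='./data/Imagenet', split='train')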
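
The training update from the notes (SGD, momentum 0.9, weight decay 0.0005, batch size 128) can be sketched in plain Python to make the rule concrete. The form below follows the update written in the paper, `v <- 0.9*v - 0.0005*lr*w - lr*grad` then `w <- w + v`; the initial learning rate of 0.01 is also from the paper. `sgd_momentum_step` is a hypothetical helper operating on flat lists of floats, and `torch.optim.SGD` folds weight decay into the gradient slightly differently, so treat this as an illustration rather than a drop-in equivalent.

```python
def sgd_momentum_step(weights, grads, velocity,
                      lr=0.01, momentum=0.9, weight_decay=0.0005):
    """One SGD step with momentum and weight decay on flat lists of floats.

    `grads` is assumed to be the loss gradient already averaged over the batch.
    Returns the updated (weights, velocity) as new lists.
    """
    new_velocity = [
        momentum * v - weight_decay * lr * w - lr * g
        for w, g, v in zip(weights, grads, velocity)
    ]
    new_weights = [w + v for w, v in zip(weights, new_velocity)]
    return new_weights, new_velocity


# Toy example with a single weight (values are made up for illustration).
w, v = sgd_momentum_step(weights=[1.0], grads=[0.5], velocity=[0.0], lr=0.1)
print(w, v)
```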

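The test-time scheme from the notes (five 224x224 crops plus their mirrors, predictions averaged) can be sketched without any image library by computing only the crop boxes. `ten_crop_boxes` and `average_predictions` are hypothetical helpers, and the per-view probabilities in the example are made up.

```python
def ten_crop_boxes(image_size=256, crop_size=224):
    """Return 10 views as (left, top, right, bottom, mirrored) tuples."""
    far = image_size - crop_size   # offset of the right / bottom crops
    centre = far // 2              # offset of the centre crop
    corners = [(0, 0), (far, 0), (0, far), (far, far), (centre, centre)]
    boxes = []
    for mirrored in (False, True):  # each crop is also used flipped horizontally
        for left, top in corners:
            boxes.append((left, top, left + crop_size, top + crop_size, mirrored))
    return boxes


def average_predictions(per_view_probs):
    """Average class probabilities over all views of one image."""
    n_views = len(per_view_probs)
    n_classes = len(per_view_probs[0])
    return [sum(view[c] for view in per_view_probs) / n_views
            for c in range(n_classes)]


boxes = ten_crop_boxes()
print(len(boxes), boxes[4])

# Made-up softmax outputs for two views of a 2-class problem.
avg = average_predictions([[0.2, 0.8], [0.4, 0.6]])
print(avg)
```

torchvision also provides a `transforms.TenCrop` transform that produces the same kind of ten views directly from an image, which is probably the more practical route inside this notebook.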