## **Extract features from images using ResNet**

### **Import necessary libraries**

In [1]:
import os 
import platform
import sys

import pandas as pd 
import numpy as np 

import matplotlib.pyplot as plt 
%matplotlib inline 
import seaborn as sns 

import warnings 
warnings.filterwarnings('ignore')

from PIL import Image 

import torch
import torchvision 
import torchvision.transforms as transforms 
import torchvision.models as models 
from torch.utils.data import Dataset, DataLoader
import torch.nn as nn 
from torch.autograd import Variable
from Img2Vec import Img2Vec


In [2]:
current_cwd = os.getcwd()

In [3]:
if platform.system() == 'Windows':
    lst = current_cwd.split('\\', 2)
    new_cwd = '/'.join(lst[:2])
    os.chdir(new_cwd)
    sys.path.append(new_cwd + '/code')
    IMAGES_PATH = 'data/flickr8k/Images'
    CAPTIONS_PATH = 'data/flickr8k/captions.txt'
else:
    IMAGES_PATH = os.path.join('../../', 'data', 'flickr8k', 'Images')
    CAPTIONS_PATH = os.path.join('../../', 'data', 'flickr8k', 'captions.txt')
print(platform.system())

Windows


In [4]:
data = pd.read_csv(CAPTIONS_PATH)
data

Unnamed: 0,image,caption
0,1000268201_693b08cb0e.jpg,A child in a pink dress is climbing up a set o...
1,1000268201_693b08cb0e.jpg,A girl going into a wooden building .
2,1000268201_693b08cb0e.jpg,A little girl climbing into a wooden playhouse .
3,1000268201_693b08cb0e.jpg,A little girl climbing the stairs to her playh...
4,1000268201_693b08cb0e.jpg,A little girl in a pink dress going into a woo...
...,...,...
40450,997722733_0cb5439472.jpg,A man in a pink shirt climbs a rock face
40451,997722733_0cb5439472.jpg,A man is rock climbing high in the air .
40452,997722733_0cb5439472.jpg,A person in a red shirt climbing up a rock fac...
40453,997722733_0cb5439472.jpg,A rock climber in a red shirt .


In [5]:
images_data = data.drop_duplicates(subset=['image'])
images_data

Unnamed: 0,image,caption
0,1000268201_693b08cb0e.jpg,A child in a pink dress is climbing up a set o...
5,1001773457_577c3a7d70.jpg,A black dog and a spotted dog are fighting
10,1002674143_1b742ab4b8.jpg,A little girl covered in paint sits in front o...
15,1003163366_44323f5815.jpg,A man lays on a bench while his dog sits by him .
20,1007129816_e794419615.jpg,A man in an orange hat starring at something .
...,...,...
40430,990890291_afc72be141.jpg,A man does a wheelie on his bicycle on the sid...
40435,99171998_7cc800ceef.jpg,A group is sitting around a snowy crevasse .
40440,99679241_adc853a5c0.jpg,A grey bird stands majestically on a beach whi...
40445,997338199_7343367d7f.jpg,A person stands near golden walls .


### **Introduction to ResNet**
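ResNet, or Residual Network, is a deep neural network architecture that uses residual (skip) connections to make very deep networks trainable. The skip connections counter the degradation problem, in which adding layers to a plain network increases its training error, and they give gradients a direct path through the network during backpropagation. Key elements include: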

* Residual Connections: Each block adds its input to the output of its convolutional layers, so the block computes F(x) + x. The layers only have to learn the residual F(x), which makes identity mappings easy to represent and simplifies optimization.
* Building Blocks: The network is a stack of residual blocks. A basic block has two 3x3 convolutions; a bottleneck block has a 1x1, a 3x3, and a 1x1 convolution.
* Variants: ResNet-18 and ResNet-34 use basic blocks, while ResNet-50, ResNet-101, and ResNet-152 use bottleneck blocks to keep the cost of the extra depth manageable.

ResNet backbones are widely used as feature extractors for image classification, segmentation, object detection, and image captioning. In this notebook, each ResNet variant converts every unique Flickr8k image into a fixed-length feature vector.
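
The skip connection described above can be sketched as a small PyTorch module. This is a minimal illustration of a basic block with equal input and output channels, not the torchvision implementation; the class name `BasicResidualBlock` and the tensor sizes are chosen here for the example.

```python
import torch
import torch.nn as nn


class BasicResidualBlock(nn.Module):
    """Illustrative basic block: two 3x3 convolutions plus an identity skip."""

    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU()

    def forward(self, x):
        identity = x                              # keep the input for the skip path
        out = self.relu(self.bn1(self.conv1(x)))  # first conv -> batch norm -> ReLU
        out = self.bn2(self.conv2(out))           # second conv -> batch norm
        out = out + identity                      # residual addition: F(x) + x
        return self.relu(out)


block = BasicResidualBlock(channels=64).eval()
x = torch.randn(1, 64, 56, 56)  # hypothetical feature map
with torch.no_grad():
    y = block(x)
print(y.shape)  # the block preserves the input shape
```

Because the convolutional path is added to the input rather than replacing it, a block whose convolutions output zero simply passes the (rectified) input through, which is why stacking more blocks does not have to hurt training.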

### **Load the ResNet18**

![ResNet18 Architecture](https://www.researchgate.net/publication/382725662/figure/fig2/AS:11431281264192717@1722451506224/ResNet-18-Architecture.ppm)

In [6]:
resnet18_image = {}
img2vec = Img2Vec('resnet18')
# Map each unique image filename to its ResNet18 feature vector
for cnt, img in enumerate(images_data.image, start=1):
    resnet18_image[img] = img2vec.get_vector(os.path.join(IMAGES_PATH, img))
    if cnt % 1000 == 0:
        print(f"Completed {cnt} images")

Completed 1000 images
Completed 2000 images
Completed 3000 images
Completed 4000 images
Completed 5000 images
Completed 6000 images
Completed 7000 images
Completed 8000 images


In [7]:
tensor = resnet18_image[images_data.image[0]]
tensor, len(tensor)

(tensor([3.8345e-01, 1.1888e+00, 2.0343e+00, 6.4980e-01, 1.0371e+00, 5.0759e-01,
         2.0290e+00, 1.4654e+00, 8.8523e-02, 8.8243e-01, 4.3587e-01, 8.8620e-01,
         7.2078e-02, 1.2284e+00, 1.0852e+00, 6.1441e-01, 1.5507e+00, 1.3863e+00,
         1.5646e+00, 1.2428e+00, 1.8518e+00, 2.4511e+00, 2.6280e-01, 6.4997e-01,
         1.5667e+00, 3.0497e-01, 1.6401e+00, 1.1432e+00, 3.0494e-01, 1.4645e-01,
         6.5222e-01, 1.1068e+00, 1.0753e-01, 7.5800e-01, 5.9176e-01, 1.1517e+00,
         2.8855e+00, 2.1484e+00, 3.3196e-01, 1.9821e+00, 1.1292e+00, 9.7843e-01,
         3.4590e-01, 9.2510e-01, 4.8257e-01, 5.1201e-01, 1.0385e+00, 7.7162e-01,
         8.1492e-01, 1.0990e+00, 2.0924e+00, 4.1963e-01, 1.7711e+00, 1.8155e+00,
         5.3877e-01, 1.2503e-01, 2.1425e-01, 3.5798e-01, 7.5806e-01, 1.0883e+00,
         1.1352e+00, 2.4458e+00, 8.2191e-02, 1.6267e+00, 3.0328e-01, 7.6934e-01,
         3.2321e+00, 3.9877e-01, 1.6922e+00, 9.7383e-02, 1.1044e+00, 5.6156e-01,
         9.7783e-01, 6.5980e

### **Load the ResNet34**

![ResNet34 Architecture](https://www.researchgate.net/publication/368590488/figure/fig2/AS:11431281120814346@1676648971853/Architecture-of-ResNet34-29.png)

In [8]:
resnet34_image = {}
img2vec = Img2Vec('resnet34')
# Map each unique image filename to its ResNet34 feature vector
for cnt, img in enumerate(images_data.image, start=1):
    resnet34_image[img] = img2vec.get_vector(os.path.join(IMAGES_PATH, img))
    if cnt % 1000 == 0:
        print(f"Completed {cnt} images")

Completed 1000 images
Completed 2000 images
Completed 3000 images
Completed 4000 images
Completed 5000 images
Completed 6000 images
Completed 7000 images
Completed 8000 images


In [9]:
tensor = resnet34_image[images_data.image[0]]
tensor, len(tensor)

(tensor([1.3280e+00, 7.1229e-01, 1.1002e+00, 3.3689e+00, 8.4398e-01, 5.5503e-01,
         1.2430e+00, 6.7631e-02, 1.3021e+00, 5.3909e-01, 2.0122e+00, 1.3379e-01,
         4.4341e-02, 4.5915e-01, 1.1854e-01, 2.9338e+00, 1.7253e-01, 4.3192e-01,
         1.0882e-01, 7.8633e-01, 1.0270e+00, 1.2848e+00, 1.5129e+00, 2.6812e+00,
         1.7159e+00, 8.1330e-01, 5.1902e-01, 9.1251e-01, 2.4330e+00, 8.3247e-01,
         5.3569e-01, 1.8152e+00, 2.5882e+00, 1.3565e+00, 1.3696e+00, 5.4149e-01,
         8.7416e-01, 1.2207e+00, 5.2889e-02, 1.3354e-01, 9.8121e-01, 1.0766e+00,
         2.9996e-01, 9.4546e-01, 7.6718e-02, 1.9805e+00, 3.1442e-01, 1.0759e+00,
         1.7863e-01, 9.8355e-01, 9.7632e-01, 1.5471e+00, 1.1189e+00, 9.9649e-01,
         3.9684e-01, 4.4689e-01, 6.5092e-01, 1.0906e+00, 9.9826e-01, 1.5802e-01,
         5.3982e-01, 7.3766e-01, 6.2483e-01, 1.0603e+00, 6.5619e-01, 9.6288e-01,
         3.9365e-01, 5.3019e-01, 2.2195e+00, 7.2485e-01, 4.0142e-01, 7.9710e-01,
         1.1819e-02, 4.1698e

### **Load the ResNet50**

![ResNet50 Architecture](https://www.researchgate.net/publication/350421671/figure/fig1/AS:1005790324346881@1616810508674/An-illustration-of-ResNet-50-layers-architecture.png)

In [10]:
resnet50_image = {}
img2vec = Img2Vec('resnet50')
# Map each unique image filename to its ResNet50 feature vector
for cnt, img in enumerate(images_data.image, start=1):
    resnet50_image[img] = img2vec.get_vector(os.path.join(IMAGES_PATH, img))
    if cnt % 1000 == 0:
        print(f"Completed {cnt} images")

Completed 1000 images
Completed 2000 images
Completed 3000 images
Completed 4000 images
Completed 5000 images
Completed 6000 images
Completed 7000 images
Completed 8000 images


In [11]:
tensor = resnet50_image[images_data.image[0]]
tensor, len(tensor)

(tensor([0.7025, 0.0000, 0.1237,  ..., 0.0520, 0.0000, 0.0122]), 2048)

### **Load the ResNet101**

![ResNet101 Architecture](https://www.researchgate.net/publication/354871169/figure/fig4/AS:1072734629011458@1632771274994/The-modified-architecture-of-Resnet-101.ppm)

In [12]:
resnet101_image = {}
img2vec = Img2Vec('resnet101')
# Map each unique image filename to its ResNet101 feature vector
for cnt, img in enumerate(images_data.image, start=1):
    resnet101_image[img] = img2vec.get_vector(os.path.join(IMAGES_PATH, img))
    if cnt % 1000 == 0:
        print(f"Completed {cnt} images")

Completed 1000 images
Completed 2000 images
Completed 3000 images
Completed 4000 images
Completed 5000 images
Completed 6000 images
Completed 7000 images
Completed 8000 images


In [13]:
tensor = resnet101_image[images_data.image[0]]
tensor, len(tensor)

(tensor([0.0000, 0.0000, 0.0000,  ..., 0.6880, 0.0000, 0.0162]), 2048)

### **Load the ResNet152**

![ResNet152 Architecture](https://www.researchgate.net/publication/355839262/figure/fig1/AS:1085510193819652@1635817206566/ResNet152V2-Architecture.png)

In [14]:
resnet152_image = {}
img2vec = Img2Vec('resnet152')
# Map each unique image filename to its ResNet152 feature vector
for cnt, img in enumerate(images_data.image, start=1):
    resnet152_image[img] = img2vec.get_vector(os.path.join(IMAGES_PATH, img))
    if cnt % 1000 == 0:
        print(f"Completed {cnt} images")

Completed 1000 images
Completed 2000 images
Completed 3000 images
Completed 4000 images
Completed 5000 images
Completed 6000 images
Completed 7000 images
Completed 8000 images


In [15]:
tensor = resnet152_image[images_data.image[0]]
tensor, len(tensor)

(tensor([0.0000e+00, 4.1305e-04, 0.0000e+00,  ..., 0.0000e+00, 0.0000e+00,
         2.8977e+00]),
 2048)