# About This Notebook

This implementation is based on a vanilla **swin_large_patch4_window12_384** in Pytorch for the Pawpularity Competition.  
This model uses **both images and dense features** for score prediction.  
**This scores around 18.1 LB.**

Training Params: -
1. **Dataset**: - 3-channel RGB Images (384x384) with separate dense features
2. **Augmentations**: - Resize, Normalize, HorizontalFlip, VerticalFlip, RandomBrightness, RandomResizedCrop, HueSaturationValue, RandomBrightnessContrast
3. **Optimizer**: - AdamW
4. **Scheduler**: - CosineAnnealingLR
5. **Model**: - swin_large_patch4_window12_384
6. **Initial Weights**: - Imagenet
5. **Max Epochs**: - 8 (~23 min per epoch on P100 PCIE GPU)
6. **Saved Weights**: - 10-fold ensemble. Weights having highest OOF score on RMSE metric were saved.

This notebook only contains the inference for the model as described above.

If you are looking for a starter training notebook please follow the link below.
- Baseline Model Notebook:- https://www.kaggle.com/manabendrarout/pawpularity-score-starter-image-dense-train

**NB:-** This training notebook uses a different NN architecture. Not the exact architecture used for this notebook. But apart from the architecture, everything else (training parameters, optimizers, schedulers, etc) is same. I had to use a different architecture for demonstration because Kaggle has a timeout limit which is not possible to adhere with the transformer model.

![SETI](https://www.petfinder.my/images/cuteness_meter.jpg)  

**If you find this notebook useful and use parts of it in your work, please don't forget to show your appreciation by upvoting this kernel. That keeps me motivated and inspires me to write and share these public kernels. 😊**

# Get GPU Info

In [1]:
!nvidia-smi

Thu Oct 21 12:48:33 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.119.04   Driver Version: 450.119.04   CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla P100-PCIE...  Off  | 00000000:00:04.0 Off |                    0 |
| N/A   35C    P0    28W / 250W |      0MiB / 16280MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+---------------------------------------------------------------------------

# Import

In [2]:
import sys

sys.path.append('../input/pytorch-image-models/pytorch-image-models-master')

In [3]:
# Asthetics
import warnings
import sklearn.exceptions

warnings.filterwarnings('ignore', category=DeprecationWarning)
warnings.filterwarnings('ignore', category=FutureWarning)
warnings.filterwarnings("ignore", category=sklearn.exceptions.UndefinedMetricWarning)

# General
from tqdm.auto import tqdm
import pandas as pd
import numpy as np
import os
import glob
import random
import cv2

pd.set_option('display.max_columns', None)

# Image Aug
import albumentations
from albumentations.pytorch.transforms import ToTensorV2

# Deep Learning
import torch
import torchvision
import timm
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import Dataset, DataLoader

# Random Seed Initialize
RANDOM_SEED = 42


def seed_everything(seed=RANDOM_SEED):
	os.environ['PYTHONHASHSEED'] = str(seed)
	np.random.seed(seed)
	random.seed(seed)
	torch.manual_seed(seed)
	torch.cuda.manual_seed(seed)
	torch.backends.cudnn.deterministic = True
	torch.backends.cudnn.benchmark = True


seed_everything()

# Device Optimization
if torch.cuda.is_available():
	device = torch.device('cuda')
else:
	device = torch.device('cpu')
print(f'Using device: {device}')

Using device: cuda


In [4]:
csv_dir = '../input/petfinder-pawpularity-score'
test_dir = '../input/petfinder-pawpularity-score/test'
models_dir = '../input/swin-transformenrs-pet-net'

test_file_path = os.path.join(csv_dir, 'test.csv')
sample_sub_file_path = os.path.join(csv_dir, 'sample_submission.csv')
print(f'Test file: {test_file_path}')
print(f'Models path: {models_dir}')

Test file: ../input/petfinder-pawpularity-score/test.csv
Models path: ../input/swin-transformenrs-pet-net


In [5]:
test_df = pd.read_csv(test_file_path)
sample_df = pd.read_csv(sample_sub_file_path)

In [6]:
def return_filpath(name, folder):
	path = os.path.join(folder, f'{name}.jpg')
	return path

In [7]:
test_df['image_path'] = test_df['Id'].apply(lambda x: return_filpath(x, folder=test_dir))

In [8]:
test_df.head()

Unnamed: 0,Id,Subject Focus,Eyes,Face,Near,Action,Accessory,Group,Collage,Human,Occlusion,Info,Blur,image_path
0,4128bae22183829d2b5fea10effdb0c3,1,0,1,0,0,1,1,0,0,1,0,1,../input/petfinder-pawpularity-score/test/4128...
1,43a2262d7738e3d420d453815151079e,0,1,0,0,0,0,1,1,0,0,0,0,../input/petfinder-pawpularity-score/test/43a2...
2,4e429cead1848a298432a0acad014c9d,0,0,0,1,0,1,1,1,0,1,1,1,../input/petfinder-pawpularity-score/test/4e42...
3,80bc3ccafcc51b66303c2c263aa38486,1,0,1,0,0,0,0,0,0,0,1,0,../input/petfinder-pawpularity-score/test/80bc...
4,8f49844c382931444e68dffbe20228f4,1,1,1,0,1,1,0,1,0,1,1,0,../input/petfinder-pawpularity-score/test/8f49...


# CFG

In [9]:
params = {
	'model': 'swin_large_patch4_window12_384_in22k',
	'dense_features': ['Subject Focus', 'Eyes', 'Face', 'Near',
	                   'Action', 'Accessory', 'Group', 'Collage',
	                   'Human', 'Occlusion', 'Info', 'Blur'],
	'pretrained': False,
	'inp_channels': 3,
	'im_size': 384,
	'device': device,
	'batch_size': 16,
	'num_workers': 2,
	'out_features': 1,
	'debug': False
}

In [10]:
if params['debug']:
	test_df = test_df.sample(frac=0.1)

# Augmentations

In [11]:
def get_test_transforms(DIM=params['im_size']):
	return albumentations.Compose(
		[
			albumentations.Resize(DIM, DIM),
			albumentations.Normalize(
				mean=[0.485, 0.456, 0.406],
				std=[0.229, 0.224, 0.225],
			),
			ToTensorV2(p=1.0)
		]
	)

# Dataset

In [12]:
class CuteDataset(Dataset):
	def __init__(self, images_filepaths, dense_features, targets, transform=None):
		self.images_filepaths = images_filepaths
		self.dense_features = dense_features
		self.targets = targets
		self.transform = transform

	def __len__(self):
		return len(self.images_filepaths)

	def __getitem__(self, idx):
		image_filepath = self.images_filepaths[idx]
		image = cv2.imread(image_filepath)
		image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

		if self.transform is not None:
			image = self.transform(image=image)['image']

		dense = self.dense_features[idx, :]
		label = torch.tensor(self.targets[idx]).float()
		return image, dense, label

# CNN Model

In [13]:
class PetNet(nn.Module):
	def __init__(self, model_name=params['model'], out_features=params['out_features'],
	             inp_channels=params['inp_channels'],
	             pretrained=params['pretrained'], num_dense=len(params['dense_features'])):
		super().__init__()
		self.model = timm.create_model(model_name, pretrained=pretrained, in_chans=inp_channels)
		n_features = self.model.head.in_features
		self.model.head = nn.Linear(n_features, 128)
		self.fc = nn.Sequential(
			nn.Linear(128 + num_dense, 64),
			nn.ReLU(),
			nn.Linear(64, out_features)
		)
		self.dropout = nn.Dropout(0.2)

	def forward(self, image, dense):
		embeddings = self.model(image)
		x = self.dropout(embeddings)
		x = torch.cat([x, dense], dim=1)
		output = self.fc(x)
		return output


class Model(nn.Module):
	def __init__(self):
		super(Model, self).__init__()
		self.model = timm.create_model("vit_base_patch32_384", pretrained=True)
		n_features = self.model.head.in_features
		self.model.head = nn.Linear(n_features, 128)
		self.fc = nn.Sequential(
			nn.Linear(128, 64),
			nn.ReLU(),
			nn.Linear(64, 1),
			nn.Sigmoid()
		)

	def forward(self, x):
		x = self.model(x)
		x = self.fc(x)
		return x


model = Model()
model.load_state_dict(torch.load("../input/dogs-classfier/classfier.pth"))

model.to(device)


# Prediction

In [14]:
import gc

predicted_labels = None
for model_name_cat, model_name_dog in zip(
		glob.glob("../input/d/mithilsalunkhe/swin-transformenrs-pet-net/Models/Cat" + '/*.pth'),
		glob.glob("../input/d/mithilsalunkhe/swin-transformenrs-pet-net/Models/Dog" + '/*.pth')):
	model_cat = PetNet()
	model_cat.load_state_dict(torch.load(model_name_cat))
	model_cat = model_cat.to(params['device'])
	model.eval()
	model_cat.eval()
	model_dog = PetNet()
	model_dog.load_state_dict(torch.load(model_name_dog))
	model_dog = model_dog.to(params['device'])
	model_dog.eval()

	test_dataset = CuteDataset(
		images_filepaths=test_df['image_path'].values,
		dense_features=test_df[params['dense_features']].values,
		targets=sample_df['Pawpularity'].values,
		transform=get_test_transforms()
	)
	test_loader = DataLoader(
		test_dataset, batch_size=params['batch_size'],
		shuffle=False, num_workers=params['num_workers'],
		pin_memory=True
	)

	temp_preds = None
	with torch.no_grad():
		for (images, dense, target) in tqdm(test_loader, desc=f'Predicting. '):
			images = images.to(params['device'], non_blocking=True)
			dense = dense.to(params['device'], non_blocking=True)
			for i in images:
				i = torch.unsqueeze(i, 0)
				with torch.no_grad():
					which_one = model(i).to('cpu').numpy()
				if which_one <= 0.5:
					with torch.no_grad():
						predictions = torch.sigmoid(model_cat(images, dense)).to('cpu').numpy() * 100

				else:
					with torch.no_grad():
						predictions = torch.sigmoid(model_dog(images, dense)).to('cpu').numpy() * 100
		del images
		del model_dog
		del model_cat
		gc.collect()

		if temp_preds is None:
			temp_preds = predictions
		else:
			temp_preds = np.vstack((temp_preds, predictions))

	if predicted_labels is None:
		predicted_labels = temp_preds
	else:
		predicted_labels += temp_preds

	torch.cuda.empty_cache()

predicted_labels /= (len(glob.glob(models_dir + '/*.pth')))

Predicting. :   0%|          | 0/1 [00:00<?, ?it/s]

Predicting. :   0%|          | 0/1 [00:00<?, ?it/s]

Predicting. :   0%|          | 0/1 [00:00<?, ?it/s]

Predicting. :   0%|          | 0/1 [00:00<?, ?it/s]

Predicting. :   0%|          | 0/1 [00:00<?, ?it/s]

Predicting. :   0%|          | 0/1 [00:00<?, ?it/s]

Predicting. :   0%|          | 0/1 [00:00<?, ?it/s]

Predicting. :   0%|          | 0/1 [00:00<?, ?it/s]

Predicting. :   0%|          | 0/1 [00:00<?, ?it/s]

Predicting. :   0%|          | 0/1 [00:00<?, ?it/s]

In [15]:
sub_df = pd.DataFrame()
sub_df['Id'] = test_df['Id']
sub_df['Pawpularity'] = predicted_labels

In [16]:
sub_df.head()

Unnamed: 0,Id,Pawpularity
0,4128bae22183829d2b5fea10effdb0c3,46.967113
1,43a2262d7738e3d420d453815151079e,46.794594
2,4e429cead1848a298432a0acad014c9d,47.196003
3,80bc3ccafcc51b66303c2c263aa38486,47.597878
4,8f49844c382931444e68dffbe20228f4,47.858513


In [17]:
sub_df.to_csv('submission.csv', index=False)

**If you find this notebook useful and use parts of it in your work, please don't forget to show your appreciation by upvoting this kernel. That keeps me motivated and inspires me to write and share these public kernels. 😊**