## Objective

The objective of this task is to build a Convolutional-Recurrent Neural Network
architecture capable of classifying paintings based on:

- Artist
- Style
- Genre

The CNN component extracts spatial features such as brush strokes,
texture, and color composition, while the RNN component captures
relationships between extracted feature representations.

We use the WikiArt dataset for training and evaluation.
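
Classifying by artist, style, and genre at once can be done by sharing one feature extractor and attaching a separate classifier per attribute. The sketch below is one possible arrangement, not the notebook's method (the code that follows trains on style only). The `MultiTaskHead` name, the 256-dimensional feature size, and the class counts are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiTaskHead(nn.Module):
    """One linear classifier per attribute on top of a shared feature vector."""

    def __init__(self, feat_dim, classes_per_task):
        super().__init__()
        self.heads = nn.ModuleDict(
            {task: nn.Linear(feat_dim, n) for task, n in classes_per_task.items()}
        )

    def forward(self, features):
        return {task: head(features) for task, head in self.heads.items()}


# Placeholder class counts (assumptions for illustration, not WikiArt's real counts).
classes_per_task = {"artist": 10, "style": 5, "genre": 4}
heads = MultiTaskHead(feat_dim=256, classes_per_task=classes_per_task)

features = torch.randn(8, 256)  # stand-in for the shared CNN-RNN output
targets = {task: torch.randint(0, n, (8,)) for task, n in classes_per_task.items()}

outputs = heads(features)
# Unweighted sum of per-task cross-entropy losses; task weights are a tuning choice.
loss = sum(F.cross_entropy(outputs[task], targets[task]) for task in outputs)
print({task: tuple(o.shape) for task, o in outputs.items()}, float(loss))
```

Summing the losses trains all three heads through the same shared features; whether the tasks should be weighted equally is left open here.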


## Code in Colab

In [1]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [2]:
!pip install datasets



In [3]:
from datasets import load_dataset

dataset = load_dataset(
    "huggan/wikiart",
    split="train",
    streaming=True
)


In [8]:
label_names = dataset.features["style"].names

for i, name in enumerate(label_names):
    print(i, name)

0 Abstract_Expressionism
1 Action_painting
2 Analytical_Cubism
3 Art_Nouveau
4 Baroque
5 Color_Field_Painting
6 Contemporary_Realism
7 Cubism
8 Early_Renaissance
9 Expressionism
10 Fauvism
11 High_Renaissance
12 Impressionism
13 Mannerism_Late_Renaissance
14 Minimalism
15 Naive_Art_Primitivism
16 New_Realism
17 Northern_Renaissance
18 Pointillism
19 Pop_Art
20 Post_Impressionism
21 Realism
22 Rococo
23 Romanticism
24 Symbolism
25 Synthetic_Cubism
26 Ukiyo_e


In [6]:
from datasets import load_dataset

dataset = load_dataset(
    "huggan/wikiart",
    split="train",
    streaming=True
)

styles_set = set()

for i, item in enumerate(dataset):

    styles_set.add(item["style"])

    if i > 2000:
        break

print(styles_set)

{0, 2, 3, 4, 7, 9, 10, 12, 15, 17, 18, 20, 21, 23, 24, 25}


In [9]:
styles = [4, 12, 18, 2, 21]   # Baroque, Impressionism, Pointillism, Analytical_Cubism, Realism

subset = (
    example for example in dataset
    if example["style"] in styles
)

In [10]:
import os

label_names = dataset.features["style"].names

save_dir = "/content/drive/MyDrive/ArtExtract/wikiart_subset"

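count = 0

# Stream the dataset and save up to 2,500 images from the selected styles,
# one sub-folder per style, so that ImageFolder can read them later
for item in dataset:

    if item["style"] in styles:

        style = label_names[item["style"]]
        img = item["image"]

        style_path = os.path.join(save_dir, style)
        os.makedirs(style_path, exist_ok=True)

        # JPEG cannot store an alpha channel or palette, so convert to RGB first
        img.convert("RGB").save(os.path.join(style_path, f"{count}.jpg"))

        count += 1

    if count >= 2500:
        break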

print("Saved:", count)

Saved: 2500


## Task 1: Convolutional-Recurrent Architectures

In [11]:
!pip install torch torchvision scikit-learn



In [12]:
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

transform = transforms.Compose([
    transforms.Resize((224,224)),
    transforms.ToTensor()
])

data_dir = "/content/drive/MyDrive/ArtExtract/wikiart_subset"

dataset = datasets.ImageFolder(
    root=data_dir,
    transform=transform
)

train_loader = DataLoader(
    dataset,
    batch_size=16,
    shuffle=True
)

num_classes = len(dataset.classes)

print(dataset.classes)

['Analytical_Cubism', 'Baroque', 'Impressionism', 'Pointillism', 'Realism']
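
The loader above wraps every saved image, which leaves nothing held out for evaluation. A minimal sketch of a stratified train/validation split over the class labels follows. The `stratified_split` helper and the 20% validation fraction are assumptions for illustration, and the toy labels stand in for the real ones.

```python
import random
from collections import defaultdict


def stratified_split(labels, val_fraction=0.2, seed=0):
    """Return (train_idx, val_idx) with roughly val_fraction of each class in val."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    train_idx, val_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Keep at least one validation image per class, even for tiny classes.
        n_val = max(1, round(len(indices) * val_fraction))
        val_idx.extend(indices[:n_val])
        train_idx.extend(indices[n_val:])
    return sorted(train_idx), sorted(val_idx)


# Toy labels; with ImageFolder the real labels would be `dataset.targets`.
toy_labels = [0] * 10 + [1] * 50 + [2] * 40
train_idx, val_idx = stratified_split(toy_labels)
print(len(train_idx), len(val_idx))
```

The two index lists can then be wrapped with `torch.utils.data.Subset(dataset, train_idx)` and `Subset(dataset, val_idx)` and given their own `DataLoader` each, so that metrics are computed on images the model has not trained on.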


In [13]:
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)

cuda


In [14]:
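import torch.nn as nn
from torchvision import models

# Load ResNet-50 with ImageNet weights; `weights=` replaces the deprecated `pretrained=True`
cnn = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)

# Freeze the backbone so that only the LSTM and classifier are trained
for param in cnn.parameters():
    param.requires_grad = False

# Drop the final fully connected layer; the output is a (batch, 2048, 1, 1) pooled feature map
cnn = nn.Sequential(*list(cnn.children())[:-1])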





In [17]:
class CNN_RNN(nn.Module):

    def __init__(self, num_classes):
        super().__init__()

        self.cnn = cnn  # frozen ResNet-50 feature extractor defined above
        self.rnn = nn.LSTM(2048, 256, batch_first=True)
        self.fc = nn.Linear(256, num_classes)

    def forward(self, x):

        x = self.cnn(x)             # (batch, 2048, 1, 1)
        x = x.view(x.size(0), -1)   # (batch, 2048)
        x = x.unsqueeze(1)          # (batch, 1, 2048): a sequence of length 1

        out, _ = self.rnn(x)
        out = self.fc(out[:, -1, :])  # classify from the last (only) time step

        return out
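
Because `unsqueeze(1)` produces a sequence of length 1, the LSTM in this model receives a single pooled vector per image and has no sequence to read. One possible variant, sketched here under stated assumptions, keeps the spatial grid of the feature map and feeds its cells to the LSTM as time steps. The `SpatialSequenceClassifier` class and the small `toy_backbone` are hypothetical; the toy backbone only keeps the example runnable without downloading weights.

```python
import torch
import torch.nn as nn


class SpatialSequenceClassifier(nn.Module):
    """Hypothetical variant: treat each cell of the CNN feature map as a time step."""

    def __init__(self, backbone, feat_dim, num_classes, hidden=256):
        super().__init__()
        self.backbone = backbone  # any module returning (batch, feat_dim, H, W)
        self.rnn = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, num_classes)

    def forward(self, x):
        fmap = self.backbone(x)                # (batch, C, H, W)
        seq = fmap.flatten(2).transpose(1, 2)  # (batch, H*W, C), row-major order
        out, _ = self.rnn(seq)
        return self.fc(out[:, -1, :])          # classify from the final time step


# Stand-in backbone (assumption, for illustration only).
toy_backbone = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d((7, 7)),
)

model_sketch = SpatialSequenceClassifier(toy_backbone, feat_dim=32, num_classes=5)
logits = model_sketch(torch.randn(2, 3, 64, 64))
print(logits.shape)
```

With ResNet-50, removing the last two children (`[:-2]` instead of `[:-1]`) keeps the `(batch, 2048, 7, 7)` feature map for 224x224 inputs, which would give a 49-step sequence with `feat_dim=2048`. Whether this ordering of image regions helps style classification would need to be checked on held-out data.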

In [18]:
model = CNN_RNN(num_classes).to(device)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

In [19]:
num_epochs = 5

for epoch in range(num_epochs):

    for images, labels in train_loader:

        images = images.to(device)
        labels = labels.to(device)

        outputs = model(images)
        loss = criterion(outputs, labels)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    print(f"Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}")

Epoch [1/5], Loss: 0.5775
Epoch [2/5], Loss: 0.8560
Epoch [3/5], Loss: 0.2366
Epoch [4/5], Loss: 0.3571
Epoch [5/5], Loss: 0.4671
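
The loss printed after each epoch comes from the final mini-batch alone. With 2,500 images and a batch size of 16, that batch holds only 4 images, so the value is noisy and can rise between epochs even when training is progressing. A small sketch of a sample-weighted running average follows; the `RunningAverage` class is a hypothetical helper and the numbers in the usage lines are made up.

```python
class RunningAverage:
    """Sample-weighted mean of per-batch losses."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def update(self, batch_loss, batch_size):
        # CrossEntropyLoss returns the batch mean, so weight it by batch size.
        self.total += batch_loss * batch_size
        self.count += batch_size

    @property
    def value(self):
        return self.total / self.count if self.count else float("nan")


meter = RunningAverage()
for batch_loss, batch_size in [(0.9, 16), (0.6, 16), (0.3, 4)]:  # made-up values
    meter.update(batch_loss, batch_size)
print(f"{meter.value:.4f}")
```

Inside the training loop this would be called as `meter.update(loss.item(), images.size(0))` after each step, with a fresh meter per epoch.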


In [20]:
from sklearn.metrics import classification_report

y_true = []
y_pred = []

model.eval()

with torch.no_grad():

    for images, labels in train_loader:

        images = images.to(device)

        outputs = model(images)
        _, predicted = torch.max(outputs, 1)

        y_true.extend(labels.cpu().numpy())
        y_pred.extend(predicted.cpu().numpy())

print(classification_report(
    y_true,
    y_pred,
    target_names=dataset.classes
))

                   precision    recall  f1-score   support

Analytical_Cubism       1.00      0.36      0.53        11
          Baroque       0.85      0.91      0.88       194
    Impressionism       0.82      0.97      0.89      1368
      Pointillism       0.00      0.00      0.00        14
          Realism       0.92      0.69      0.79       913

         accuracy                           0.85      2500
        macro avg       0.72      0.59      0.62      2500
     weighted avg       0.86      0.85      0.84      2500





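The classification report shows much weaker performance
on the minority classes: Analytical Cubism (11 images)
reaches a recall of only 0.36, and Pointillism (14 images)
is never predicted at all. The selected subset of the
WikiArt dataset is heavily imbalanced, with Impressionism
(1,368 images) and Realism (913 images) making up about
91% of the 2,500 images.

The model therefore favors the dominant styles. Because
no image is assigned to Pointillism, precision for that
class is undefined, and scikit-learn reports it as 0.00.

Precision, recall, and F1-score were used to assess
model performance across stylistic classes. The weighted
averages (F1 of 0.84) are dominated by the two largest
classes, whereas the macro averages (F1 of 0.62) weight
every class equally and show the effect of the imbalance
more clearly.

The report was computed on `train_loader`, the same
images used for training, so these figures describe fit
to the training data rather than performance on unseen
paintings.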
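
One common response to this kind of imbalance is to weight the loss by inverse class frequency. The sketch below computes such weights from the per-class support in the report, using the heuristic weight = N / (K * n_c) for N images, K classes, and n_c images in class c. The `balanced_class_weights` helper is a hypothetical name, and this weighting was not used in the training run shown here.

```python
def balanced_class_weights(counts):
    """Inverse-frequency weights: N / (K * n_c) for each class c."""
    total = sum(counts.values())
    k = len(counts)
    return {name: total / (k * n) for name, n in counts.items()}


# Per-class support taken from the classification report.
support = {
    "Analytical_Cubism": 11,
    "Baroque": 194,
    "Impressionism": 1368,
    "Pointillism": 14,
    "Realism": 913,
}
weights = balanced_class_weights(support)
for name, w in weights.items():
    print(f"{name:18s} {w:6.2f}")
```

The values, listed in the order of `dataset.classes`, could be passed as `nn.CrossEntropyLoss(weight=torch.tensor(list(weights.values()), dtype=torch.float32).to(device))`. Rare classes then contribute more to the loss per image; with only 11 and 14 images, collecting more examples of those styles may matter more than reweighting.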

In [21]:
import torch.nn.functional as F

outliers = []

model.eval()

with torch.no_grad():

    for images, labels in train_loader:

        images = images.to(device)

        outputs = model(images)

        probs = F.softmax(outputs, dim=1)
        confidence, _ = torch.max(probs, 1)

        # Keep the confidence of every prediction below the 0.4 threshold
        for conf in confidence.tolist():
            if conf < 0.4:
                outliers.append(conf)

print("Number of Outliers:", len(outliers))

Number of Outliers: 11


Outliers were identified as images whose highest softmax
probability falls below 0.4. The model is this uncertain
for 11 of the 2,500 images, which may indicate stylistic
ambiguity or mislabeled paintings.
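
The outlier cell stores only the confidence values, so the flagged paintings cannot be traced back afterwards. A minimal sketch that also records the position and predicted class of each low-confidence prediction is shown below. The `low_confidence` function is a hypothetical helper, and the logits are made up for illustration.

```python
import torch
import torch.nn.functional as F


def low_confidence(logits, threshold=0.4):
    """Return (index, predicted_class, confidence) for predictions below threshold."""
    probs = F.softmax(logits, dim=1)
    confidence, predicted = probs.max(dim=1)
    flagged = (confidence < threshold).nonzero(as_tuple=True)[0]
    return [
        (int(i), int(predicted[i]), float(confidence[i]))
        for i in flagged
    ]


# Made-up logits for three images and five classes.
logits = torch.tensor([
    [4.0, 0.0, 0.0, 0.0, 0.0],   # confident
    [0.2, 0.1, 0.0, 0.1, 0.0],   # nearly uniform, so low confidence
    [0.0, 0.0, 3.0, 0.0, 0.0],   # confident
])
flagged = low_confidence(logits)
print(flagged)
```

To map the returned indices to files, the images would need to be read in a fixed order (a loader with `shuffle=False`), in which case position `i` of batch `b` corresponds to `dataset.samples[b * batch_size + i]`.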

In [22]:
torch.save(
    model.state_dict(),
    "/content/drive/MyDrive/ArtExtract/cnn_rnn_model.pth"
)