# Predicting painting authors

In this notebook, I build a model that predicts the author of a painting. To train it, I use the Kaggle [dataset](https://www.kaggle.com/datasets/ikarus777/best-artworks-of-all-time) of 8,000+ paintings by 50 of the most famous artists.

Let's start by importing the necessary libraries and loading the paintings from the dataset.

In [2]:
import pandas as pd
import tensorflow as tf
import matplotlib.pyplot as plt

In [2]:
image_size = (256, 256)
batch_size = 32
# Forward slashes avoid backslash-escape issues and work on every OS.
data_dir = 'data/images'

# Both calls share validation_split and seed so the two subsets do not overlap.
train_ds = tf.keras.utils.image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset='training',
    seed=17,
    image_size=image_size,
    batch_size=batch_size,
)

valid_ds = tf.keras.utils.image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset='validation',
    seed=17,
    image_size=image_size,
    batch_size=batch_size,
)

Found 8446 files belonging to 50 classes.
Using 6757 files for training.
Found 8446 files belonging to 50 classes.
Using 1689 files for validation.
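The file counts printed above follow from `validation_split=0.2`. As a quick sanity check, the arithmetic can be sketched in plain Python, assuming the validation share is truncated to a whole number of files (which matches the reported counts). The helper `split_counts` is hypothetical and not part of Keras.

```python
import math

def split_counts(n_files, validation_split):
    """Return (train, validation) counts, truncating the validation share."""
    n_valid = int(validation_split * n_files)
    return n_files - n_valid, n_valid

n_train, n_valid = split_counts(8446, 0.2)
print(n_train, n_valid)  # 6757 1689

# With batch_size = 32 the last batch of each subset is only partly filled.
batch_size = 32
train_batches = math.ceil(n_train / batch_size)
valid_batches = math.ceil(n_valid / batch_size)
print(train_batches, valid_batches)  # 212 53
```

So one epoch should run through 212 training batches and 53 validation batches.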


In [3]:
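AUTOTUNE = tf.data.AUTOTUNE

# cache() keeps the loaded batches in memory after the first pass;
# prefetch() prepares the next batches while the current one is being processed.
train_ds = train_ds.cache().prefetch(buffer_size=AUTOTUNE)
valid_ds = valid_ds.cache().prefetch(buffer_size=AUTOTUNE)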

In [None]:
model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1./255),
    tf.keras.layers.Conv2D(16, 3, padding='same', activation='relu'),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Conv2D(32, 3, padding='same', activation='relu'),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Conv2D(64, 3, padding='same', activation='relu'),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(50)  # one raw logit per artist (50 classes), no softmax
])
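To get a feel for the size of this network, the parameter count can be worked out by hand. The sketch below assumes 256x256 RGB inputs (the `image_size` used for loading), `padding='same'` convolutions that keep height and width, and the default `MaxPooling2D()` that halves them. The helpers `conv2d_params` and `dense_params` are hypothetical names for the usual weight-plus-bias formulas.

```python
def conv2d_params(kernel, in_channels, filters):
    """Weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters

def dense_params(in_units, out_units):
    """Weights plus one bias per output unit."""
    return in_units * out_units + out_units

side, channels, total = 256, 3, 0
for filters in (16, 32, 64):
    total += conv2d_params(3, channels, filters)
    channels = filters
    side //= 2  # each MaxPooling2D() halves height and width

flat_units = side * side * channels          # size of the Flatten output
dense_128 = dense_params(flat_units, 128)    # first Dense layer
total += dense_128 + dense_params(128, 50)   # plus the 50-logit output layer

print(flat_units)  # 65536
print(dense_128)   # 8388736
print(total)       # 8418770
```

Almost all of the roughly 8.4 million parameters sit in the first `Dense` layer, because `Flatten` produces 65,536 features. Rescaling, pooling, and dropout layers add no parameters.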

In [None]:
model.compile(
  optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
  loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
  metrics=['accuracy'])
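Because the last layer outputs raw logits, the loss is created with `from_logits=True`, which means the softmax is applied inside the loss. The following plain-Python sketch illustrates that computation for a single example. It is a simplified illustration, not the Keras implementation, and the function names are hypothetical.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities (shifted for numerical stability)."""
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sparse_categorical_crossentropy(logits, label):
    """Negative log-probability of the true class, given as an integer index."""
    return -math.log(softmax(logits)[label])

logits = [2.0, 0.5, -1.0]
probs = softmax(logits)
predicted = max(range(len(probs)), key=probs.__getitem__)  # index of the top class

loss_right = sparse_categorical_crossentropy(logits, 0)  # true class is the top class
loss_wrong = sparse_categorical_crossentropy(logits, 2)  # true class is the lowest-scored

# A model with no preference among 50 artists has a loss of ln(50), about 3.91.
baseline = sparse_categorical_crossentropy([0.0] * 50, 7)
```

The `baseline` value is a handy reference: a validation loss that stays near 3.91 suggests the model is doing no better than guessing uniformly among the 50 artists.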

In [None]:
model.fit(
  train_ds,
  validation_data=valid_ds,
  epochs=10
)

In [None]:
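# VGG16 convolutional base pretrained on ImageNet, without its classification head.
base_model = tf.keras.applications.vgg16.VGG16(
    input_shape=(256, 256, 3), include_top=False, weights='imagenet')
# Freeze the base so that only the new layers are trained.
base_model.trainable = False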

global_average_layer = tf.keras.layers.GlobalAveragePooling2D()
prediction_layer = tf.keras.layers.Dense(50)

model = tf.keras.Sequential([
  base_model,
  global_average_layer,
  prediction_layer
])
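With the base frozen, only the final `Dense(50)` layer is trained. `GlobalAveragePooling2D` reduces the base output to one value per channel, so the size of the trainable head depends only on the number of output channels of the base. The sketch below assumes the standard channel counts of these architectures: 512 for VGG16 and 2048 for the ResNet50 base used later in this notebook.

```python
def head_params(feature_channels, num_classes=50):
    """Trainable parameters of a Dense layer placed after global average pooling."""
    return feature_channels * num_classes + num_classes

vgg16_head = head_params(512)
resnet50_head = head_params(2048)

print(vgg16_head)     # 25650
print(resnet50_head)  # 102450
```

Either head is tiny compared with the roughly 8.4 million parameters of the CNN trained from scratch above.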

In [4]:
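# ResNet50 convolutional base pretrained on ImageNet, without its classification head.
base_model = tf.keras.applications.ResNet50(
    input_shape=(256, 256, 3), include_top=False, weights="imagenet")
# Freeze the base so that only the new layers are trained.
base_model.trainable = False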

global_average_layer = tf.keras.layers.GlobalAveragePooling2D()
prediction_layer = tf.keras.layers.Dense(50)

model = tf.keras.Sequential([
  base_model,
  global_average_layer,
  prediction_layer
])

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/resnet/resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5
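One thing to keep in mind: the Keras documentation states that the ImageNet-pretrained ResNet50 and VGG16 expect inputs prepared with their matching `preprocess_input` functions (for example `tf.keras.applications.resnet50.preprocess_input`), while `image_dataset_from_directory` yields raw RGB values from 0 to 255. For these two models the documented preprocessing converts RGB to BGR and subtracts the ImageNet channel means, without scaling. The plain-Python sketch below mimics that for a single pixel. It is an illustration only, and `caffe_style_preprocess` is a hypothetical name.

```python
# Per-channel ImageNet means in BGR order, as used by this preprocessing style.
IMAGENET_MEANS_BGR = (103.939, 116.779, 123.68)

def caffe_style_preprocess(rgb_pixel):
    """Reorder one RGB pixel to BGR and zero-center each channel."""
    r, g, b = rgb_pixel
    bgr = (b, g, r)
    return tuple(value - mean for value, mean in zip(bgr, IMAGENET_MEANS_BGR))

# A pure red pixel: the red value ends up last and every channel is shifted.
pure_red = caffe_style_preprocess((255.0, 0.0, 0.0))
print(pure_red)
```

In the notebook itself, applying the real `preprocess_input` to the images before they reach `base_model` would be one way to match the conditions the pretrained weights were trained under.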


In [5]:
model.compile(
  optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
  loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
  metrics=['accuracy'])

In [6]:
model.fit(
  train_ds,
  validation_data=valid_ds,
  epochs=10
)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x1bbfb93fd00>