Intro

In this notebook you will be guided through a step-by-step pipeline that produces a model able to generate embeddings for a specific dataset, MNIST.

In [28]:
import tensorflow as tf

print('TensorFlow version: ', tf.__version__)
print('Devices', tf.config.list_physical_devices())

TensorFlow version:  2.6.0
Devices [PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU')]


Here are some folder path definitions that will be useful later.

In [29]:
import os

DATA_DIR = os.path.join(os.getcwd(), 'data')
DATA_OUTPUT_DIR = os.path.join(DATA_DIR, 'output')
MODELS_DIR = os.path.join(os.getcwd(), 'models')
LOGS_DIR = os.path.join(os.getcwd(), 'logs')
INFERENCE_DIR = os.path.join(os.getcwd(), 'inference')

Next comes the preparation of the dataset, which consists of converting the data into npy format. In CHANNELS_MAP you can define which channels to use in the case of multi-band images; an example will be provided in a notebook on the EuroSAT dataset.

In [30]:
from utils.preparation import preparation

IMAGE_FORMAT = "image"
CHANNELS_MAP = { "r": 0, "g": 0, "b": 0}
preparation(DATA_DIR, IMAGE_FORMAT, CHANNELS_MAP)


Detected files inside /home/thomas/Desktop/latent-space-explorer/generator/data/output directory
Detected a metadata file
Detected the same preparation config. Skipping preparation step...
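
As a rough illustration of what a channel map like CHANNELS_MAP could mean, the sketch below builds a three-channel array by picking, for each of r, g and b, the source band at the given index, and then saves it in npy format. The function name select_channels and the (height, width, bands) array layout are assumptions made for this example; the actual utils.preparation code may work differently.

```python
import os
import tempfile

import numpy as np


def select_channels(image, channels_map):
    """Stack the bands named in channels_map into an RGB-ordered array.

    Hypothetical helper: assumes `image` has shape (height, width, bands).
    """
    order = ("r", "g", "b")
    return np.stack([image[..., channels_map[key]] for key in order], axis=-1)


# A random single-band 28x28 image, standing in for a grayscale sample.
rng = np.random.default_rng(0)
gray = rng.integers(0, 256, size=(28, 28, 1), dtype=np.uint8)

# All three output channels read from band 0 of the source image.
rgb = select_channels(gray, {"r": 0, "g": 0, "b": 0})

# Round trip through the npy format in a temporary directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sample.npy")
    np.save(path, rgb)
    restored = np.load(path)

print(rgb.shape)
```

With all three keys mapped to band 0, a single-band image is replicated into three identical channels, which would be consistent with a model configured for three input channels.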


The next step is data preprocessing. Note that the processing is computed only when training starts; this cell only defines the functions to be added to the TensorFlow graph (in order to optimize the computation).

In [31]:
from utils.preprocessing import tf_numpy_load, tf_preprocessing

# Dataset
pattern = os.path.join(DATA_OUTPUT_DIR, '*.npy')
dataset = tf.data.Dataset.list_files(pattern)

dataset = dataset.map(
    lambda file: tf_numpy_load(file),
    num_parallel_calls=tf.data.AUTOTUNE
)

NORMALIZATION_TYPE = "default"
IMAGE_DIM = 28

# Preprocessing
dataset = dataset.map(
    lambda image: tf_preprocessing(
        image,
        tf.constant(IMAGE_DIM, tf.uint16),
        tf.constant(NORMALIZATION_TYPE, tf.string)
    ),
    num_parallel_calls=tf.data.AUTOTUNE
)

length = tf.data.experimental.cardinality(dataset).numpy()
print(f'Dataset: {length}')

Dataset: 10000
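
The deferred execution described above can be illustrated without TensorFlow: Python's built-in map is also lazy, so the mapped function runs only when the result is consumed. This is an analogy for the behaviour of Dataset.map, not its implementation, and scale_to_unit (division by 255) is a made-up stand-in for the real preprocessing.

```python
calls = []  # records every value the function actually processes


def scale_to_unit(value):
    """Hypothetical preprocessing step: scale a 0-255 value to [0, 1]."""
    calls.append(value)
    return value / 255.0


# Defining the pipeline does not call scale_to_unit yet.
pipeline = map(scale_to_unit, [0, 51, 255])
calls_before = len(calls)

# Consuming the pipeline (the analogue of starting training) triggers the work.
values = list(pipeline)
calls_after = len(calls)

print(calls_before, calls_after, values)
```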


A subset of the dataset is held out as a test set to check whether the model generalizes or overfits. Here you can change the SPLIT_THRESHOLD parameter.

The training and test sets are further split into batches in order to fit the available memory and optimize the training. The parameter to adjust here is BATCH_SIZE.

In [32]:
SPLIT_THRESHOLD = 0.8
index = round(length * SPLIT_THRESHOLD)
train_set = dataset.take(index)
test_set = dataset.skip(index)  # skip exactly the elements taken for training

print('Training set: {}'.format(
    tf.data.experimental.cardinality(train_set).numpy()))
print('Test set: {}'.format(
    tf.data.experimental.cardinality(test_set).numpy()))

train_set = train_set.cache()
test_set = test_set.cache()

BATCH_SIZE = 256

train_set = train_set.shuffle(
    len(train_set)).batch(BATCH_SIZE)
test_set = test_set.batch(BATCH_SIZE)

Training set: 8000
Test set: 2000
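
The split arithmetic can be checked on a plain Python list, where slicing stands in for take and skip. This is an analogy rather than the tf.data implementation, and the helper names are hypothetical. It shows that the two parts cover the whole dataset only when both use the same index; skipping index + 1 elements leaves one sample out of both sets.

```python
import math


def take(items, count):
    """List analogue of Dataset.take: keep the first `count` items."""
    return items[:count]


def skip(items, count):
    """List analogue of Dataset.skip: drop the first `count` items."""
    return items[count:]


length = 10000
index = round(length * 0.8)
items = list(range(length))

train = take(items, index)
test_same_index = skip(items, index)
test_plus_one = skip(items, index + 1)

# Number of batches of 256, counting the final partial batch.
train_batches = math.ceil(len(train) / 256)

print(len(train), len(test_same_index), len(test_plus_one), train_batches)
```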


The augmentation step transforms the data in a realistic way to increase the variability of the dataset. The operation is random: so that the model also sees the original data, the AUGMENTATION_THRESHOLD parameter sets the probability of performing the augmentation. The other parameters enable specific augmentations, each of which is executed with a probability of 50%.

In [33]:
from utils.augmentation import tf_augmentation

AUGMENTATION_THRESHOLD = 0.7
AUGMENTATION_FLIP_X = False
AUGMENTATION_FLIP_Y = False
AUGMENTATION_ROTATE = True
AUGMENTATION_ROTATE_DEGREES = 5
AUGMENTATION_SHIFT = True
AUGMENTATION_SHIFT_PERCENTAGE = 10

train_set = train_set.map(
    lambda images:
        tf.cond(
            tf.random.uniform([], 0, 1) < AUGMENTATION_THRESHOLD,
            lambda: tf_augmentation(
                images,
                tf.constant(AUGMENTATION_FLIP_X, tf.bool),
                tf.constant(AUGMENTATION_FLIP_Y, tf.bool),
                tf.constant(AUGMENTATION_ROTATE, tf.bool),
                tf.constant(
                    AUGMENTATION_ROTATE_DEGREES, tf.float32),
                tf.constant(AUGMENTATION_SHIFT, tf.bool),
                tf.constant(
                    AUGMENTATION_SHIFT_PERCENTAGE, tf.float32)
            ),
            lambda: images
        ),
    num_parallel_calls=tf.data.AUTOTUNE
)

train_set = train_set.prefetch(buffer_size=tf.data.AUTOTUNE)
test_set = test_set.prefetch(buffer_size=tf.data.AUTOTUNE)
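
The two-level randomness can be sketched in plain Python: one draw decides whether a batch is augmented at all, and each enabled augmentation is then applied with probability 0.5. The helper maybe_augment and the toy flip function are hypothetical and only mirror the description; they are not the tf_augmentation code. Because the map in this notebook is applied after batch(), the outer draw appears to be made once per batch rather than once per image.

```python
import random


def maybe_augment(batch, threshold, augmentations, rng):
    """Apply augmentations to `batch` with probability `threshold`.

    Each enabled augmentation is then applied with probability 0.5.
    Hypothetical helper; `augmentations` is a list of plain functions.
    """
    if rng.random() >= threshold:
        return batch  # leave the original data untouched
    for augment in augmentations:
        if rng.random() < 0.5:
            batch = augment(batch)
    return batch


def flip(batch):
    """Toy augmentation: reverse each row of a nested list."""
    return [row[::-1] for row in batch]


rng = random.Random(0)
batch = [[1, 2, 3], [4, 5, 6]]

# A threshold of 0 never augments, so the batch comes back unchanged.
never = maybe_augment(batch, 0.0, [flip], rng)

# With threshold 0.7, roughly 0.7 * 0.5 of the batches end up flipped.
trials = [maybe_augment(batch, 0.7, [flip], rng) for _ in range(10000)]
flipped_share = sum(result != batch for result in trials) / len(trials)

print(never == batch, flipped_share)
```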

Next comes the definition of the model, in this case a Convolutional AutoEncoder.

The depth and width of the model are defined through the FILTERS constant: each entry in the array is a convolutional layer with the specified number of kernels/filters. The dimension of the latent vector (the real output of the autoencoder) is set by the LATENT_DIM constant.

In [34]:
from architectures.cae import CAE

IMAGE_DIM = 28
CHANNELS_NUM = 3
LATENT_DIM = 8
FILTERS = [32, 64]

model = CAE(
    image_dim=IMAGE_DIM,
    channels_num=CHANNELS_NUM,
    latent_dim=LATENT_DIM,
    filters=FILTERS
)

OPTIMIZER = "Adam"
LEARNING_RATE = 0.001
LOSS = "MeanSquaredError"
model.compile(
    optimizer=OPTIMIZER,
    learning_rate=LEARNING_RATE,
    loss=LOSS
)

Model: "encoder"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
down_sampling_0 (DownSamplin (None, 14, 14, 32)        1024      
_________________________________________________________________
down_sampling_1 (DownSamplin (None, 7, 7, 64)          18752     
_________________________________________________________________
flatten_2 (Flatten)          (None, 3136)              0         
_________________________________________________________________
dense_4 (Dense)              (None, 8)                 25096     
Total params: 44,872
Trainable params: 44,680
Non-trainable params: 192
_________________________________________________________________
Model: "decoder"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_5 (Dense)              (None, 3136)              28224     
____________________________________
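
The shapes and parameter counts printed for the encoder can be reproduced with a little arithmetic. The sketch assumes each down-sampling block is a 3x3 convolution with bias that halves the spatial resolution, followed by batch normalization. This is a guess about architectures.cae, chosen because it matches the numbers in the summary; the function names are hypothetical.

```python
def conv_block_params(in_channels, out_channels, kernel=3):
    """Parameters of a conv layer with bias followed by batch normalization.

    Assumption: 3x3 kernels and one batch-norm layer per block.
    """
    conv = kernel * kernel * in_channels * out_channels + out_channels
    batch_norm = 4 * out_channels  # gamma, beta, moving mean, moving variance
    return conv + batch_norm


def encoder_summary(image_dim, channels_num, latent_dim, filters):
    """Return (flattened size, per-layer parameter counts) for the encoder."""
    size, in_channels, params = image_dim, channels_num, []
    for out_channels in filters:
        params.append(conv_block_params(in_channels, out_channels))
        size //= 2  # each block halves the spatial resolution
        in_channels = out_channels
    flat = size * size * in_channels
    params.append(flat * latent_dim + latent_dim)  # dense layer to the latent vector
    return flat, params


flat, params = encoder_summary(28, 3, 8, [32, 64])
print(flat, params, sum(params))
```

Under the same assumptions, the 192 non-trainable parameters would correspond to the moving mean and variance of the two batch-normalization layers (2 × 32 + 2 × 64).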

An experiment folder is created to store the models and logs.

In [35]:
import datetime

EXPERIMENT_NAME = "MNIST-ConvAutoEncoder-Small"

experiment_dir = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
experiment_dir = "{}-{}".format(experiment_dir, EXPERIMENT_NAME)

# Create model dir
model_dir = os.path.join(MODELS_DIR, experiment_dir)
os.makedirs(model_dir)

# Set logger
log_dir = os.path.join(LOGS_DIR, experiment_dir)
summary_writer = tf.summary.create_file_writer(log_dir)
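
To make the naming scheme concrete, the sketch below applies the same format string to a fixed timestamp instead of the current time. The helper name experiment_dir_name is hypothetical and the timestamp is chosen only for illustration.

```python
import datetime


def experiment_dir_name(name, now):
    """Build '<timestamp>-<name>' with the timestamp format used above."""
    return "{}-{}".format(now.strftime("%Y%m%d-%H%M%S"), name)


fixed = datetime.datetime(2022, 3, 22, 18, 6, 54)
result = experiment_dir_name("MNIST-ConvAutoEncoder-Small", fixed)
print(result)
```

Since the name includes a timestamp with one-second resolution, each run of the cell normally targets a new directory, so os.makedirs is not expected to hit an existing folder.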

Finally, training can be run. Note that the preprocessing step is performed now, so there may be a delay before training starts. Please be patient.

A mechanism for saving only the best model is implemented. After the middle of the EPOCHS...

In [36]:
from tqdm import tqdm

EPOCHS = 10

for epoch in tqdm(range(EPOCHS)):

    # Train set
    for batch, train_batch in enumerate(train_set):
        model.train_step(train_batch)

    # Test set
    for batch, test_batch in enumerate(test_set):
        model.test_step(test_batch)

    # Save best model
    if epoch > (EPOCHS / 2):
        model.save_best_model(model_dir)

    # Log
    with summary_writer.as_default():
        model.log(epoch, train_batch, test_batch)

    # Reset losses
    model.reset_losses_state()

 51%|█████     | 51/100 [11:12<09:43, 11.91s/it] 

INFO:tensorflow:Assets written to: /home/thomas/Desktop/latent-space-explorer/generator/models/20220322-180654-MNIST-ConvAutoEncoder-Small/assets


 52%|█████▏    | 52/100 [11:27<10:24, 13.01s/it]

INFO:tensorflow:Assets written to: /home/thomas/Desktop/latent-space-explorer/generator/models/20220322-180654-MNIST-ConvAutoEncoder-Small/assets


 57%|█████▋    | 57/100 [12:45<10:12, 14.25s/it]

INFO:tensorflow:Assets written to: /home/thomas/Desktop/latent-space-explorer/generator/models/20220322-180654-MNIST-ConvAutoEncoder-Small/assets


 58%|█████▊    | 58/100 [13:07<11:38, 16.64s/it]

INFO:tensorflow:Assets written to: /home/thomas/Desktop/latent-space-explorer/generator/models/20220322-180654-MNIST-ConvAutoEncoder-Small/assets


 66%|██████▌   | 66/100 [15:26<08:55, 15.76s/it]

INFO:tensorflow:Assets written to: /home/thomas/Desktop/latent-space-explorer/generator/models/20220322-180654-MNIST-ConvAutoEncoder-Small/assets


 67%|██████▋   | 67/100 [15:45<09:06, 16.56s/it]

INFO:tensorflow:Assets written to: /home/thomas/Desktop/latent-space-explorer/generator/models/20220322-180654-MNIST-ConvAutoEncoder-Small/assets


 79%|███████▉  | 79/100 [18:33<04:26, 12.68s/it]

INFO:tensorflow:Assets written to: /home/thomas/Desktop/latent-space-explorer/generator/models/20220322-180654-MNIST-ConvAutoEncoder-Small/assets


 90%|█████████ | 90/100 [20:53<02:00, 12.10s/it]

INFO:tensorflow:Assets written to: /home/thomas/Desktop/latent-space-explorer/generator/models/20220322-180654-MNIST-ConvAutoEncoder-Small/assets


 93%|█████████▎| 93/100 [21:43<01:46, 15.23s/it]

INFO:tensorflow:Assets written to: /home/thomas/Desktop/latent-space-explorer/generator/models/20220322-180654-MNIST-ConvAutoEncoder-Small/assets


 94%|█████████▍| 94/100 [22:03<01:40, 16.82s/it]

INFO:tensorflow:Assets written to: /home/thomas/Desktop/latent-space-explorer/generator/models/20220322-180654-MNIST-ConvAutoEncoder-Small/assets


 96%|█████████▌| 96/100 [22:41<01:10, 17.64s/it]

INFO:tensorflow:Assets written to: /home/thomas/Desktop/latent-space-explorer/generator/models/20220322-180654-MNIST-ConvAutoEncoder-Small/assets


100%|██████████| 100/100 [23:33<00:00, 14.14s/it]
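
The internals of save_best_model are not shown in this notebook. One plausible reading, sketched below with a hypothetical helper, is that a checkpoint is written only when the test loss improves on the best value seen so far, and only in the second half of training. Whether the best loss is tracked from the first epoch or only after the midpoint is an assumption here; the sketch tracks it only after the midpoint. The loss values are invented for the example.

```python
def epochs_saved(test_losses):
    """Return the epochs at which a 'best model' checkpoint would be written.

    Hypothetical logic: save when the epoch is past the midpoint and the
    test loss is lower than every loss seen since the midpoint.
    """
    epochs = len(test_losses)
    best = float("inf")
    saved = []
    for epoch, loss in enumerate(test_losses):
        if epoch > epochs / 2 and loss < best:
            best = loss
            saved.append(epoch)
    return saved


# Made-up test losses for an 8-epoch run.
saved = epochs_saved([5.0, 4.0, 3.0, 2.0, 3.0, 1.0, 2.0, 0.5])
print(saved)
```

A rule of this kind would explain why the log above reports a saved model only at some of the epochs after the midpoint.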


The last step is to transform the input images into embeddings and save them in a well-organized layout, ready to be uploaded to Nextcloud.

In [None]:
# Handle the case where metadata.json is missing
# Create it from the variables?
# Use the variables but overwrite them by reading from the metadata?
# For example:
# IMAGE_DIM = 1
# IMAGE_DIM = experiment_config['image']['dim']
# Or have it run only the 10-epoch training on MNIST
# and then, separately, run the same pipeline using the config_file

experiment_dir = os.path.join(INFERENCE_DIR, experiment_dir)
images_dir = os.path.join(experiment_dir, 'images')
generated_dir = os.path.join(experiment_dir, 'generated')
reductions_dir = os.path.join(experiment_dir, 'reductions')
clusters_dir = os.path.join(experiment_dir, 'clusters')
metadata_path = os.path.join(experiment_dir, 'metadata.json')
embeddings_path = os.path.join(experiment_dir, 'embeddings.json')
labels_path = os.path.join(experiment_dir, 'labels.json')

os.makedirs(experiment_dir)
os.makedirs(images_dir)
os.makedirs(generated_dir)
os.makedirs(reductions_dir)
os.makedirs(clusters_dir)
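
One option raised in the comments at the top of this cell is to build metadata.json from the notebook variables. The sketch below shows what that could look like, creating the same folder layout in a temporary directory and writing a small metadata file. The helper create_inference_layout and the metadata fields are hypothetical; the real schema is not defined in this notebook.

```python
import json
import os
import tempfile


def create_inference_layout(root, metadata):
    """Create the inference folders under `root` and write metadata.json.

    Hypothetical helper; the metadata fields are illustrative only.
    """
    for name in ("images", "generated", "reductions", "clusters"):
        # exist_ok=True makes the call safe to repeat on an existing folder.
        os.makedirs(os.path.join(root, name), exist_ok=True)
    metadata_path = os.path.join(root, "metadata.json")
    with open(metadata_path, "w", encoding="utf-8") as handle:
        json.dump(metadata, handle, indent=2)
    return metadata_path


with tempfile.TemporaryDirectory() as tmp_dir:
    root = os.path.join(tmp_dir, "experiment")
    path = create_inference_layout(root, {"image": {"dim": 28}})
    created = sorted(os.listdir(root))
    with open(path, encoding="utf-8") as handle:
        loaded = json.load(handle)

print(created, loaded)
```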