**General description**

This notebook has been developed by Ana Teresa Lopez Jimenez @ LSHTM

It has been used in the preprint: High-content high-resolution microscopy and deep learning assisted analysis reveals host and bacterial heterogeneity during *Shigella* infection. Ana T. López-Jiménez, Dominik Brokatzky, Kamla Pillay, Tyrese Williams, Gizem Özbaykal Güler and Serge Mostowy (2024)

This notebook was used to classify SEPT7 recruitment to *S. flexneri*.

**Importing packages**


In [None]:
from __future__ import absolute_import, division, print_function, unicode_literals
import tensorflow as tf
from tensorflow import keras
print(tf.__version__)
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import os
import numpy as np
import matplotlib.pyplot as plt

**Data loading**


In [None]:
from google.colab import drive
root = '/content/gdrive/'
drive.mount( root )

Enter the path of the folder that contains the annotated images used to train the CNN. It should hold one subfolder for the training data and one for the validation data.

In [None]:
septin_dir_path = r'/My Drive/folder' # Path of folder containing annotated data
os.makedirs(root+septin_dir_path, exist_ok=True)
os.listdir(root+septin_dir_path)
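
`flow_from_directory` treats each subfolder of the directory it is given as one class, so each split folder (for example `validation`) is expected to contain one subfolder per category. The following is a minimal sketch for checking that layout before training. The helper name `count_images_per_class` and the list of file extensions are assumptions made for illustration; they are not part of the original notebook.

```python
import os

# Assumed image formats; adjust to the formats actually used
IMAGE_EXTENSIONS = ('.png', '.jpg', '.jpeg', '.bmp', '.tif', '.tiff')

def count_images_per_class(split_dir):
    """Return {class_name: number_of_images} for one split folder."""
    counts = {}
    for class_name in sorted(os.listdir(split_dir)):
        class_dir = os.path.join(split_dir, class_name)
        if not os.path.isdir(class_dir):
            continue  # ignore stray files next to the class folders
        counts[class_name] = sum(
            1 for f in os.listdir(class_dir)
            if f.lower().endswith(IMAGE_EXTENSIONS)
        )
    return counts
```

In the notebook this could be called as `count_images_per_class(root + septin_dir_path + '/validation')` to confirm that both classes are present and roughly how balanced they are.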


**Setting Model Parameters**


In [None]:
BATCH_SIZE = 32
IMG_SHAPE  = 128 # Our training data consists of images with width of 128 pixels and height of 128 pixels
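
**Creating a training data generator**

The notebook uses `train_data_gen` in later cells, but the cell that creates it is not shown. The following is a minimal sketch that mirrors the validation generator. It assumes that the training images live in a subfolder named `training` and that only pixel rescaling is applied; any augmentation used in the original work is unknown and is not reproduced here. The helper name `make_train_generator` is hypothetical.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

def make_train_generator(train_dir, batch_size=32, img_size=128):
    """Build a training iterator that mirrors the validation generator.

    Assumption: only rescaling to [0, 1] is applied (no augmentation).
    """
    image_gen_train = ImageDataGenerator(rescale=1./255)
    return image_gen_train.flow_from_directory(
        batch_size=batch_size,
        directory=train_dir,
        shuffle=True,  # shuffle the training images between epochs
        target_size=(img_size, img_size),
        class_mode='binary',
    )
```

In the notebook this would be called as `train_data_gen = make_train_generator(root + septin_dir_path + '/training', BATCH_SIZE, IMG_SHAPE)`, with the folder name changed if necessary.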

In [None]:
train_data_gen.class_indices # Indices for each category or class

**Creating a validation data generator**

In [None]:
image_gen_val = ImageDataGenerator(
    rescale=1./255,
    fill_mode='nearest',)

val_data_gen = image_gen_val.flow_from_directory(batch_size=BATCH_SIZE,
                                                 directory=root+septin_dir_path + '/validation', # change name if necessary for folder containing validation data
                                                 target_size=(IMG_SHAPE, IMG_SHAPE),
                                                 class_mode='binary')

**Generating a Model**

In [None]:
from tensorflow.keras import Model
from tensorflow.keras import Sequential # Model type to be used
from tensorflow.keras.layers import Dense,Activation, Dropout, Conv2D, MaxPooling2D, Flatten, BatchNormalization

In [None]:
model_septin = Sequential([

     Conv2D(4,(3,3), activation = 'relu', input_shape = (128,128,3)),
     BatchNormalization(),
     MaxPooling2D(pool_size=(2,2)),

     Conv2D(8,(3,3), activation = 'relu'),
     BatchNormalization(),
     MaxPooling2D(pool_size=(2,2)),

     Conv2D(16,(3,3), activation = 'relu'),
     Conv2D(16,(3,3), activation = 'relu'),
     BatchNormalization(),
     MaxPooling2D(pool_size=(2,2)),

     Conv2D(32,(3,3), activation = 'relu'),
     Conv2D(32,(3,3), activation = 'relu'),
     BatchNormalization(),
     MaxPooling2D(pool_size=(2,2)),

     Conv2D(64,(3,3), activation = 'relu'),
     BatchNormalization(),
     MaxPooling2D(pool_size=(2,2)),
     Flatten(),
     Dense(300),
     Activation('relu'),
     Dense(1),
     Activation('sigmoid'),
  ])


In [None]:
model_septin.summary()
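
As a cross-check on the summary, the output sizes and parameter counts can be traced by hand. The sketch below assumes the Keras defaults used in the model above (3x3 convolutions with `'valid'` padding and stride 1, 2x2 max pooling with stride 2) and counts four parameters per channel for each `BatchNormalization` layer. The helper names are illustrative only.

```python
def conv_out(size, kernel=3):
    """Spatial size after a 'valid' convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Spatial size after max pooling with stride equal to the pool size."""
    return size // pool

# (number of Conv2D layers, filters) for each block of the model above
blocks = [(1, 4), (1, 8), (2, 16), (2, 32), (1, 64)]

size, channels, total_params = 128, 3, 0
for n_convs, filters in blocks:
    for _ in range(n_convs):
        total_params += 3 * 3 * channels * filters + filters  # kernel weights + biases
        size, channels = conv_out(size), filters
    total_params += 4 * filters  # gamma, beta, moving mean, moving variance
    size = pool_out(size)

flattened = size * size * channels
total_params += flattened * 300 + 300  # Dense(300)
total_params += 300 * 1 + 1            # Dense(1)

print(size, flattened, total_params)  # 1 64 56577
```

The five pooling steps reduce the 128x128 input to a single 1x1 position with 64 channels, so `Flatten` passes 64 values to `Dense(300)`, and most of the parameters sit in the last convolution and the first dense layer.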

**Compiling the model**

In [None]:
optimizer = keras.optimizers.Adam(learning_rate=0.01)

model_septin.compile(optimizer=optimizer,
              loss='binary_crossentropy',
              metrics=['accuracy'])
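
For reference, the `binary_crossentropy` loss compares the sigmoid output of the last layer with the 0/1 label of each image. The following is a minimal pure-Python sketch of that formula for a single image; the function names and the clipping value `eps` are assumptions for illustration and this is not the Keras implementation itself.

```python
import math

def sigmoid(z):
    """Map a raw score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def binary_crossentropy(y_true, p, eps=1e-7):
    """Loss for one image: y_true is 0 or 1, p is the predicted probability."""
    p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))

print(binary_crossentropy(1, 0.9) < binary_crossentropy(1, 0.1))  # True
```

A confident correct prediction gives a loss close to zero, while a confident wrong prediction is penalised heavily.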


**Training the model**


In [None]:
try:
  # %tensorflow_version only exists in Colab.
  %tensorflow_version 2.x
except Exception:
  pass

# Load the TensorBoard notebook extension
%load_ext tensorboard

In [None]:
import datetime, os

In [None]:
logdir = os.path.join("logs", datetime.datetime.now().strftime("%Y%m%d-%H%M%S"))
tensorboard_callback = tf.keras.callbacks.TensorBoard(logdir, histogram_freq=1)

In [None]:
%tensorboard --logdir logs

The model is saved to the path specified below each time the validation loss reaches a new minimum.

In [None]:
from tensorflow.keras.callbacks import ModelCheckpoint
model_path = '/content/gdrive/My Drive/folder/model_septin.hdf5' # specify folder to save model.
best_model = ModelCheckpoint(filepath = model_path,
                             monitor='val_loss',
                             save_weights_only=False,
                             mode='min',
                             save_best_only=True,
                             verbose=1)
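
With `monitor='val_loss'`, `mode='min'` and `save_best_only=True`, the checkpoint file is overwritten only in epochs where the validation loss is strictly lower than every earlier value. The following is a minimal sketch of that rule in plain Python; the function name and the loss values are made up for illustration.

```python
def epochs_that_save(val_losses):
    """Return the 1-based epochs at which a new best validation loss is reached."""
    best = float('inf')
    saved = []
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:  # strictly better than the best seen so far
            best = loss
            saved.append(epoch)
    return saved

print(epochs_that_save([0.70, 0.55, 0.60, 0.52, 0.52]))  # [1, 2, 4]
```

The saved file therefore corresponds to the best epoch seen so far, not to the last epoch of training.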

In [None]:
epochs = 500
history = model_septin.fit(
    train_data_gen,
    steps_per_epoch=train_data_gen.samples // BATCH_SIZE,  # number of full batches per epoch
    epochs=epochs,
    validation_data=val_data_gen,
    validation_steps=val_data_gen.samples // BATCH_SIZE,  # number of full validation batches
    callbacks=[tensorboard_callback, best_model],
)
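
The values of `steps_per_epoch` and `validation_steps` are counted in batches, not in images. The following is a small sketch of the arithmetic, using a hypothetical dataset of 1000 images; the helper name `steps_for` is an assumption for illustration.

```python
import math

def steps_for(n_samples, batch_size, drop_last=False):
    """Number of batches needed to go through n_samples once."""
    if drop_last:
        return n_samples // batch_size  # full batches only
    return math.ceil(n_samples / batch_size)  # include the final partial batch

print(steps_for(1000, 32))                  # 32 (31 full batches + 1 partial)
print(steps_for(1000, 32, drop_last=True))  # 31
```

Integer division skips the last partial batch in each epoch, whereas rounding up covers every image once per epoch.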