[TensorFlow](https://www.tensorflow.org/) is a popular, powerful framework for deep learning used by data scientists across industries.

In this example, you'll train a ResNet50 model to identify different species of birds. The dataset contains more than 40,000 bird images and comes from [Kaggle](https://www.kaggle.com/gpiosenka/100-bird-species).


First, import the necessary libraries.

In [None]:
import tensorflow as tf
import keras
import time

We will use [Weights & Biases](https://wandb.ai/site) to monitor GPU performance. You will need to set your own Weights & Biases credential in your Saturn Cloud environment. See [here](https://saturncloud.io/docs/examples/python/weights-and-biases/qs-wandb/) for more information on creating a W&B account and connecting it to Saturn Cloud. First, log in to Weights & Biases.


In [None]:
import wandb
from wandb.keras import WandbCallback

wandb.login()
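
Before logging in, it can help to confirm that a credential is actually visible to the notebook. The sketch below assumes the Saturn Cloud credential is exposed as the `WANDB_API_KEY` environment variable, which is the variable the wandb client reads. The helper name `wandb_key_available` is hypothetical and only illustrates the check.

```python
import os


def wandb_key_available(env=None):
    """Return True if a non-empty WANDB_API_KEY is present (hypothetical helper)."""
    # Assumption: the credential is exposed as the WANDB_API_KEY environment variable.
    env = os.environ if env is None else env
    return bool(env.get("WANDB_API_KEY", "").strip())


if not wandb_key_available():
    print("WANDB_API_KEY is not set; wandb.login() may prompt for a key.")
```

If the check fails, revisit the credential setup linked above before starting a long training run.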

### Extracting Data
The dataset originally had 285 classes. We have taken a subset of this data with 61 classes. The data is stored in AWS S3. The first time you run this job, you'll need to download the training and test data, which will be saved under `dataset/birds/`.

In [None]:
import s3fs

s3 = s3fs.S3FileSystem(anon=True)
_ = s3.get(
    rpath="s3://saturn-public-data/100-bird-species/100-bird-species/*/*/*.jpg",
    lpath="dataset/birds/",
)
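
After the download, a quick sanity check of the local folder can catch an incomplete or misplaced copy. This is a minimal sketch that assumes the images end up in one sub-directory per class under `dataset/birds/train` (the layout `image_dataset_from_directory` expects). The helper name `summarize_image_folder` is hypothetical.

```python
from pathlib import Path


def summarize_image_folder(root):
    """Map each class sub-directory under root to its number of .jpg files."""
    root = Path(root)
    if not root.is_dir():
        # Nothing downloaded yet (or a different path was used)
        return {}
    return {
        class_dir.name: sum(1 for _ in class_dir.glob("*.jpg"))
        for class_dir in sorted(root.iterdir())
        if class_dir.is_dir()
    }


# Assumed layout: dataset/birds/train/<class name>/<image>.jpg
counts = summarize_image_folder("dataset/birds/train")
print(len(counts), "classes,", sum(counts.values()), "images")
```

An empty result suggests the files were written somewhere else, so check the `lpath` used in the download cell.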

Run the code below so that TensorFlow allocates GPU memory only as needed, instead of claiming all the memory available on your GPUs.

In [None]:
gpus = tf.config.list_physical_devices("GPU")
if gpus:
    try:
        # Currently, memory growth needs to be the same across GPUs
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
        logical_gpus = tf.config.list_logical_devices("GPU")
        print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
    except RuntimeError as e:
        # Memory growth must be set before GPUs have been initialized
        print(e)
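
As an alternative to the per-device call, TensorFlow also honors the `TF_FORCE_GPU_ALLOW_GROWTH` environment variable. The sketch below sets it from Python. It only takes effect if it runs before TensorFlow initializes the GPUs, so in practice it belongs at the very top of the notebook or in the environment configuration.

```python
import os

# Must be set before TensorFlow initializes the GPUs,
# ideally before `import tensorflow` runs at all.
os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"

print("TF_FORCE_GPU_ALLOW_GROWTH =", os.environ["TF_FORCE_GPU_ALLOW_GROWTH"])
```

Either approach is sufficient on its own; the explicit `set_memory_growth` loop has the advantage of reporting an error if it runs too late.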

### Training
The code below builds Keras dataset objects for the training and validation sets using `tf.keras.preprocessing.image_dataset_from_directory`. We use the Adam optimizer with a learning rate of 0.02. The classifier uses the ResNet50 architecture, a 50-layer residual network with 49 convolutional layers on its main path and one fully connected layer, plus one max-pooling and one global average-pooling layer. The model is compiled, trained, and saved at the path `model/keras_single/`.

In [None]:
def train_model_fit(n_epochs, base_lr, batchsize, classes):

    model = tf.keras.applications.ResNet50(include_top=True, weights=None, classes=classes)

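    # --------- Start wandb --------- #
    # wbargs is a module-level dict defined in a later cell; it must exist before this function is called
    wandb.init(config=wbargs, project="wandb_saturn_demo")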

    # Data
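    train_ds = (
        tf.keras.preprocessing.image_dataset_from_directory(
            "dataset/birds/train", image_size=(224, 224), batch_size=batchsize
        )
        .cache()  # keep decoded batches in memory after the first epoch
        .shuffle(1000)  # reshuffle the cached batches every epoch
        .prefetch(2)  # prefetch last so input preparation overlaps with training
    )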

    valid_ds = tf.keras.preprocessing.image_dataset_from_directory(
        "dataset/birds/valid", image_size=(224, 224), batch_size=batchsize
    ).prefetch(2)

    optimizer = tf.keras.optimizers.Adam(learning_rate=base_lr)
    model.compile(loss="sparse_categorical_crossentropy", optimizer=optimizer, metrics=["accuracy"])
    start = time.time()

    model.fit(train_ds, epochs=n_epochs, validation_data=valid_ds, callbacks=[WandbCallback()])
    end = time.time() - start
    print("model training time", end)
    wandb.log({"training_time": end})

    # Close your wandb run
    wandb.run.finish()

    tf.keras.models.save_model(model, "model/keras_single/")

    # Return the trained model so the caller can keep it in memory
    return model
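
The timing logic inside the training function (record a start time, run the work, log the difference) can also be factored into a small reusable helper. The following is only a sketch of that pattern; the `timed` context manager is a hypothetical name, and `sum(range(...))` stands in for the call to `model.fit`.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label):
    """Measure wall-clock seconds spent inside the with-block (hypothetical helper)."""
    result = {"label": label, "seconds": None}
    start = time.perf_counter()
    try:
        yield result
    finally:
        # Filled in even if the block raises, so partial runs are still timed
        result["seconds"] = time.perf_counter() - start


with timed("demo") as t:
    sum(range(100_000))  # stand-in for model.fit(...)

print(t["label"], "took", round(t["seconds"], 4), "seconds")
```

The measured value in `t["seconds"]` could then be passed to `wandb.log` in the same way as `training_time` above.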

The code below sets up the necessary parameters. We run only 2 epochs to save time, but once you have this working you'll have all the information you need to build and run bigger TensorFlow models on Saturn Cloud. A single GPU processes all of the batches in every epoch. All the model parameters, as well as some extra elements such as Notes and Tags, are tracked by Weights & Biases.

In [None]:
model_params = {"n_epochs": 2, "base_lr": 0.02, "classes": 285, "batchsize": 64}

wbargs = {
    **model_params,
    "Notes": "tf_v100_2x",
    "Tags": ["single", "gpu", "tensorflow"],
    "dataset": "Birds",
    "architecture": "ResNet50",
}
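
Because the parameters are passed with `**model_params`, every key has to match an argument name of the training function. A small check like the one below can catch a misspelled key before a run starts. It is a sketch only: `fit_stub` is a hypothetical stand-in with the same signature as the training function, and `check_params` and `example_params` are illustrative names.

```python
import inspect


def fit_stub(n_epochs, base_lr, batchsize, classes):
    """Hypothetical stand-in with the same signature as the training function."""
    return None


def check_params(func, params):
    """Return (missing, unexpected) argument names for calling func(**params)."""
    sig = inspect.signature(func)
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    # Arguments without a default value must be supplied by the caller
    required = {
        name
        for name, p in sig.parameters.items()
        if p.default is inspect.Parameter.empty and p.kind in keyword_kinds
    }
    accepted = set(sig.parameters)
    return required - set(params), set(params) - accepted


# Copy of the values used in this example
example_params = {"n_epochs": 2, "base_lr": 0.02, "classes": 285, "batchsize": 64}
missing, unexpected = check_params(fit_stub, example_params)
print("missing:", missing, "unexpected:", unexpected)
```

Both sets being empty means the dictionary can be unpacked into the function call without a `TypeError`.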

Now run the training process and keep the trained model object in the Jupyter session's memory.

In [None]:
tester = train_model_fit(**model_params)