In [7]:
import dryml
import numpy as np
import tensorflow as tf
import tensorflow_datasets as tfds
import dryml.data.tf
from dryml.data.tf import TFDataset

# DRYML Tutorial 3

## DRYML Trainables

With `Object`s, `Repo`s, `Dataset`s, and `dryml.context`, we are now ready to do some machine learning!

DRYML machine learning model components are all stored within `dryml.models`. The most important of these is `Trainable`, the base class which defines DRYML's machine learning API. Any 'trainable' object must inherit from `Trainable`. `dryml.models` also contains `Pipe`, an analogue of an sklearn pipeline, which lets us chain `Trainable`s together to form a data pipeline.

`Trainable` is a subclass of `Object`, which means any `Trainable` can be serialized and loaded later.

DRYML provides basic support for major ML frameworks in submodules which you must import.
* `dryml.models.tf` - tensorflow
* `dryml.models.torch` - pytorch
* `dryml.models.sklearn` - sklearn
* `dryml.models.xgb` - xgboost

Each submodule provides classes which implement basic functionality for serialization and training. While it is possible to build a monolithic class which implements all of these methods, it is recommended (and the base implementations do this) to use an approach more in line with the Entity Component System (ECS) pattern. In this pattern, separate `Object`s implement different pieces of functionality, such as the training procedure or the model architecture, and are combined in a larger `Object` (the `Trainable`). This larger object can then be customized by swapping in different components, which extends its functionality and reduces the number of classes you need to write.
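
As a rough illustration of this component style, the sketch below uses plain Python classes rather than DRYML's own. The names `LinearModel`, `FixedStepTraining`, and `ComposedTrainable` are hypothetical and are not part of DRYML; the only point is that the model and the training procedure live in separate objects which a larger object combines.

```python
class LinearModel:
    """Model component: holds the single parameter of y = w * x."""

    def __init__(self, w=0.0):
        self.w = w

    def predict(self, x):
        return self.w * x


class FixedStepTraining:
    """Training component: gradient descent on squared error for a fixed number of steps."""

    def __init__(self, steps=100, lr=0.01):
        self.steps = steps
        self.lr = lr

    def __call__(self, model, data):
        for _ in range(self.steps):
            # Gradient of the mean squared error with respect to w
            grad = sum(2 * (model.predict(x) - y) * x for x, y in data) / len(data)
            model.w -= self.lr * grad


class ComposedTrainable:
    """Larger object that combines a model component with a training component."""

    def __init__(self, model, train_fn):
        self.model = model
        self.train_fn = train_fn

    def train(self, data):
        self.train_fn(self.model, data)

    def eval(self, xs):
        return [self.model.predict(x) for x in xs]


data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

# Same model class, two different training components, no new classes needed
quick = ComposedTrainable(LinearModel(), FixedStepTraining(steps=5, lr=0.05))
thorough = ComposedTrainable(LinearModel(), FixedStepTraining(steps=500, lr=0.05))
quick.train(data)
thorough.train(data)
```

Changing how training works here means swapping the `train_fn` component, not writing a new trainable class.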

We'll go over some of the sklearn, tensorflow, and pytorch classes as well as the `Trainable` API.

First, let's enable all contexts for this notebook. (Feel free to allocate a GPU if your machine has one.)

In [2]:
dryml.context.set_context({'default': {}, 'tf': {'gpu/1': 1.}, 'torch': {}})

2022-09-26 13:36:44.738695: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 7369 MB memory:  -> device: 1, name: NVIDIA GeForce GTX 1080, pci bus id: 0000:03:00.0, compute capability: 6.1


## Trainable API

DRYML `Trainable` objects require the user to implement just four methods.

* `prep_train(self)`: This method should perform any necessary preparation for an `Object` to be trained. This is needed in some ML frameworks, for example pytorch.
* `prep_eval(self)`: This method should perform any necessary preparation for an `Object` to be evaluated. This is needed in some ML frameworks.
* `train(self, data, train_spec=None, train_callbacks=[])`: This method governs the training of the `Trainable`. The API here is meant to be resumable, and to allow custom callbacks to be called at each step during the training process.
* `eval(self, data)`: This method evaluates the model on the data. Typically, the model accepts a `Dataset`, and calls the `apply_X` method with an appropriate lambda function.
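
To make the shape of these four methods concrete, here is a toy stand-in written in plain Python. It deliberately does not inherit from the real `Trainable`, and plain lists of `(x, y)` pairs stand in for a `Dataset`. The class name `RunningMeanTrainable`, the one-argument callback signature, and the `training` flag are assumptions made for illustration only. The sketch also defaults `train_callbacks` to `None` instead of a list, a common Python habit that avoids a shared mutable default.

```python
class RunningMeanTrainable:
    """Toy stand-in that predicts the mean of the targets it has seen so far."""

    def __init__(self):
        self.mean = 0.0
        self.count = 0
        self.training = False

    def prep_train(self):
        # A real framework might switch layers into training mode here
        self.training = True

    def prep_eval(self):
        # ... and back into inference mode here
        self.training = False

    def train(self, data, train_spec=None, train_callbacks=None):
        # train_spec is accepted for signature parity but unused in this toy
        for _, y in data:
            self.count += 1
            self.mean += (y - self.mean) / self.count
            # Assumed callback signature: called with the trainable after each step
            for callback in (train_callbacks or []):
                callback(self)

    def eval(self, data):
        # Return (prediction, target) pairs, one per example
        return [(self.mean, y) for _, y in data]


history = []
toy = RunningMeanTrainable()
toy.prep_train()
toy.train([(0, 1.0), (1, 3.0)], train_callbacks=[lambda t: history.append(t.mean)])
toy.train([(2, 5.0)])  # a second call continues from the stored state
toy.prep_eval()
predictions = toy.eval([(3, 4.0)])
```

Because the running state is kept on the object, calling `train` a second time picks up where the first call stopped, which is one simple reading of "resumable".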

## Traditional ML training example

Let's train a simple model on the traditional mnist digits dataset. We'll use the `tensorflow_datasets` module to get the data.

In [3]:
# Load mnist data
(ds_train, ds_test), ds_info = tfds.load(
    'mnist',
    split=['train', 'test'],
    shuffle_files=True,
    as_supervised=True,
    with_info=True)

In [4]:
# Create a simple model with two convolutional layers and a dense output layer
mdl = tf.keras.Sequential([
    tf.keras.layers.Conv2D(16, 3 , input_shape=(28, 28, 1), activation='relu'),
    tf.keras.layers.Conv2D(16, 3, activation='relu'),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation='linear')
])
# prepare loss and optimizer
mdl.compile(
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer=tf.keras.optimizers.Adam(),
)

In [5]:
# train the model
mdl.fit(ds_train.batch(32), epochs=1)

2022-09-26 13:36:46.765687: I tensorflow/stream_executor/cuda/cuda_dnn.cc:368] Loaded cuDNN version 8500




<keras.callbacks.History at 0x7fb362903fa0>

In [9]:
# Compute model accuracy
total_correct = 0
total_num = 0
for x, y in ds_test.batch(32):
    y_pred = tf.argmax(mdl(x), axis=1).numpy()
    total_correct += np.sum(y_pred == y.numpy())
    total_num += y_pred.shape[0]

print(f"accuracy: {total_correct/total_num}")

accuracy: 0.9695


## Basic tensorflow training with DRYML

Now, how do we train such a model in DRYML? We want to create a `Trainable` with the same capability. DRYML offers some pre-built tensorflow functionality, and we'll use the generic `dryml.models.tf.keras.Trainable`. It takes a `model`, an `optimizer`, a `loss`, and a `train_fn`. The `model` object represents a tensorflow model and handles loading and unloading the network for compute mode, as well as saving and restoring the object. The `optimizer` object contains a tensorflow optimizer, and `loss` contains a tensorflow loss. Finally, `train_fn` is a `dryml.models.tf.TrainFunction` object which defines the training method. The `train_fn` object can store hyperparameters of the training procedure, and `model` can store hyperparameters of the network. This means we can mix and match models and training methods without having to create new classes to contain them. Let's see this in action.

We'll use `dryml.models.tf.keras.BasicTraining`, which implements a basic training regimen for keras models, as the `train_fn`, and `dryml.models.tf.keras.SequentialFunctionalModel` as the model.

In [10]:
import dryml.models.tf

In [22]:
# Create Object to hold model
model = dryml.models.tf.keras.SequentialFunctionalModel(
    input_shape=(28, 28, 1),
    layer_defs=[
        ['Conv2D', {'filters': 16, 'kernel_size': 3, 'activation': 'relu'}],
        ['Conv2D', {'filters': 16, 'kernel_size': 3, 'activation': 'relu'}],
        ['Flatten', {}],
        ['Dense', {'units': 10, 'activation': 'linear'}],
    ]
)

# Create Object to hold the training algorithm
train_fn = dryml.models.tf.keras.BasicTraining(
    epochs=1
)

# Create final trainable
mdl = dryml.models.tf.keras.Trainable(
    model=model,
    optimizer=dryml.models.tf.ObjectWrapper(tf.keras.optimizers.Adam),
    loss=dryml.models.tf.ObjectWrapper(tf.keras.losses.SparseCategoricalCrossentropy, obj_kwargs={'from_logits': True}),
    train_fn=train_fn,
)

In [23]:
# Create TFDatasets to wrap the mnist dataset
train_ds = TFDataset(
    ds_train,
    supervised=True
)

test_ds = TFDataset(
    ds_test,
    supervised=True
)

In [24]:
# Prepare the model for training
mdl.prep_train()
# Train the model
mdl.train(train_ds)



In [28]:
# Compute the model's accuracy: eval, then call .numpy() to get numpy arrays we can compute on like before
total_correct = 0
total_num = 0
for mdl_out, y in mdl.eval(test_ds.batch(batch_size=32)).numpy():
    # We have to compute the argmax of the model to get the prediction labels
    y_pred = np.argmax(mdl_out, axis=1)
    # Now we can compute the accuracy
    total_correct += np.sum(y_pred == y)
    total_num += y_pred.shape[0]
print(f"Model accuracy: {total_correct/total_num}")

Model accuracy: 0.9659455128205128


## `Pipe` and data processing

Now, we had to do some extra processing there at the last step: taking the argmax of the model output to get predicted labels. That's where `Pipe` comes in handy. If we need concrete steps to pre- or post-process the data, we can create more `Trainable`s (which may not need training) to do that processing. Let's create a `Pipe`, and add a `dryml.data.transforms.BestCat` `Trainable` after the model.

In [29]:
pipe = dryml.models.Pipe(
    mdl,
    dryml.data.transforms.BestCat()
)

In [31]:
# Compute the pipe's accuracy: eval, then call .numpy() to get numpy arrays we can compute on like before
total_correct = 0
total_num = 0
for y_pred, y in pipe.eval(test_ds.batch(batch_size=32)).numpy():
    # Now we can compute the accuracy
    total_correct += np.sum(y_pred == y)
    total_num += y_pred.shape[0]
print(f"Model accuracy: {total_correct/total_num}")

Model accuracy: 0.9659455128205128
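
The accuracy is unchanged because the pipe only moves the argmax step inside the pipeline. The chaining idea itself can be sketched in a few lines of plain Python. The classes `ScoreStep`, `ArgMaxStep`, and `ToyPipe` below are hypothetical and are not DRYML's implementation; they only show each step's `eval` output being fed to the next step.

```python
class ScoreStep:
    """Stands in for a trained model: returns one score per class for each input."""

    def eval(self, batch):
        # Toy rule: class c scores higher the closer x is to c
        return [[-abs(x - c) for c in range(3)] for x in batch]


class ArgMaxStep:
    """Post-processing step: index of the highest score in each row."""

    def eval(self, batch):
        return [max(range(len(row)), key=row.__getitem__) for row in batch]


class ToyPipe:
    """Feeds the output of each step into the next one."""

    def __init__(self, *steps):
        self.steps = steps

    def eval(self, batch):
        for step in self.steps:
            batch = step.eval(batch)
        return batch


pipe_sketch = ToyPipe(ScoreStep(), ArgMaxStep())
labels = pipe_sketch.eval([0.1, 1.9, 1.2])
print(labels)  # [0, 2, 1]
```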


## DRYML metrics

DRYML also provides a few common metrics which can be computed on a `Dataset`, including a categorical accuracy metric we can use directly!

In [None]:
import dryml.metrics

In [34]:
dryml.metrics.categorical_accuracy(pipe, test_ds)

0.9659455128205128
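
This is the same value the manual loops produced. The tutorial does not show how `categorical_accuracy` is implemented, so the following is only a guess at the kind of accumulation involved, written in plain Python. The helper name `batched_accuracy` is hypothetical, and lists of labels stand in for batches from a `Dataset`.

```python
def batched_accuracy(batches):
    """Fraction of matching labels across an iterable of (predicted, true) batches."""
    total_correct = 0
    total_num = 0
    for y_pred, y_true in batches:
        total_correct += sum(int(p == t) for p, t in zip(y_pred, y_true))
        total_num += len(y_pred)
    return total_correct / total_num


# Two small batches: 3 of 4 correct, then 2 of 2 correct
batches = [([0, 1, 2, 2], [0, 1, 1, 2]), ([1, 0], [1, 0])]
acc = batched_accuracy(batches)
```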

## Sklearn model

Now that we've had some experience using DRYML `Trainable`s, let's look at an `sklearn` model built from the reference implementations in `dryml.models.sklearn`. We'll use `sklearn.neighbors.KNeighborsClassifier` first. One thing to remember about these sklearn methods is that the input needs to have 2 dimensions (examples by features), so we need to flatten the images before they reach the model. Thankfully, we have the data transform `dryml.data.transforms.Flatten()`. We'll add that in front of the model in the `Pipe`.
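
To see what flattening means for shapes, here is a small numpy sketch. It is not the `Flatten` transform itself, only an illustration of the reshape that a batch of mnist-sized images needs before an sklearn estimator can accept it. The batch size of 32 is an arbitrary choice.

```python
import numpy as np

# A batch of 32 images shaped like mnist digits: (batch, height, width, channels)
batch = np.zeros((32, 28, 28, 1), dtype=np.float32)

# Keep the batch dimension and merge the rest into one feature dimension
flat = batch.reshape(batch.shape[0], -1)

print(flat.shape)  # (32, 784)
```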

In [38]:
import dryml.models.sklearn
import sklearn.neighbors

In [45]:
# Build sklearn pipe

model_2 = dryml.models.sklearn.ClassifierModel(
    sklearn.neighbors.KNeighborsClassifier,
    n_neighbors=5,
)

mdl2 = dryml.models.sklearn.Trainable(
    model=model_2,
    train_fn=dryml.models.sklearn.BasicTraining(num_examples=1000)
)

pipe2 = dryml.models.Pipe(
    dryml.data.transforms.Flatten(),
    mdl2,
    dryml.data.transforms.BestCat(),
)

In [47]:
# Train the pipe!
pipe2.train(train_ds)

In [49]:
# Instantly compute model accuracy!
dryml.metrics.categorical_accuracy(pipe2, test_ds)

0.8818108974358975

## Wrap-up

This lesson introduced the `Trainable`, the `Pipe` `Trainable`, data transforms like `dryml.data.transforms.Flatten` and `dryml.data.transforms.BestCat`, and metrics like `dryml.metrics.categorical_accuracy`. While users are free to write monolithic `Trainable`s, they are encouraged to write in the ECS style, where pieces such as the training function are separated into reusable `Object`s.