# Lab 07 - Trends in Deep Learning

<a target="_blank" href="https://colab.research.google.com/github/andrew-nash/CS6421-labs-2025/blob/main/CS6421_Lab_07.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>


For our final lab, I will point to some useful and important libraries that are likely to be relevant to anyone continuing to work in deep learning.

I will also highlight some interesting current topics in DL, with links to reference implementations.



## Deep Learning Libraries in Practice - PyTorch

In these labs, we have worked exclusively in TensorFlow.

You will find that many people, particularly in academic research, work in and publish models in PyTorch.

The following example is taken from: https://github.com/pytorch/examples/blob/main/mnist/main.py

In [1]:
import argparse
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms
from torch.optim.lr_scheduler import StepLR

In [2]:
transform=transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize((0.1307,), (0.3081,))
        ])

dataset1 = datasets.MNIST('./data', train=True, download=True,
                    transform=transform)
dataset2 = datasets.MNIST('./data', train=False,
                    transform=transform)

train_loader = torch.utils.data.DataLoader(dataset1)
test_loader = torch.utils.data.DataLoader(dataset2)



In [4]:
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, 3, 1)
        self.conv2 = nn.Conv2d(32, 64, 3, 1)
        self.dropout1 = nn.Dropout(0.25)
        self.dropout2 = nn.Dropout(0.5)
        self.fc1 = nn.Linear(9216, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = self.conv1(x)
        x = F.relu(x)
        x = self.conv2(x)
        x = F.relu(x)
        x = F.max_pool2d(x, 2)
        x = self.dropout1(x)
        x = torch.flatten(x, 1)
        x = self.fc1(x)
        x = F.relu(x)
        x = self.dropout2(x)
        x = self.fc2(x)
        output = F.log_softmax(x, dim=1)
        return output


In [None]:
device = torch.device("cpu")

model = Net().to(device)
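# Adadelta adapts its own step sizes; lr=1.0 is its default and is what the upstream example uses
optimizer = optim.Adadelta(model.parameters(), lr=1.0)
# the model already outputs log-probabilities (log_softmax), so we pair it with NLLLoss;
# CrossEntropyLoss would apply log_softmax a second time
loss_f = nn.NLLLoss()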

epochs = 10



def test(model, device, test_loader):
    model.eval()
    test_loss = 0
    correct = 0
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            test_loss += F.nll_loss(output, target, reduction='sum').item()  # sum up batch loss
            pred = output.argmax(dim=1, keepdim=True)  # get the index of the max log-probability
            correct += pred.eq(target.view_as(pred)).sum().item()

    test_loss /= len(test_loader.dataset)

    print('\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
        test_loss, correct, len(test_loader.dataset),
        100. * correct / len(test_loader.dataset)))


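# multiply the learning rate by gamma after every epoch; StepLR's default gamma=0.1
# decays very aggressively, so we use the upstream example's value of 0.7
scheduler = StepLR(optimizer, step_size=1, gamma=0.7)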
log_interval = 1000
for epoch in range(1, epochs + 1):
    # here, train() only switches the model into training mode (e.g. it enables dropout);
    # we need to be more explicit about applying backpropagation than in TF
    model.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()
        output = model(data)
        loss = loss_f(output, target)
        loss.backward()
        optimizer.step()
        if batch_idx % log_interval == 0:
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch, batch_idx * len(data), len(train_loader.dataset),
                100. * batch_idx / len(train_loader), loss.item()))

    test(model, device, test_loader)
    scheduler.step()



torch.save(model.state_dict(), "mnist_cnn.pt")


Key differences between PyTorch and TensorFlow are that:

1. Many consider PyTorch easier for prototyping: its syntax is closer to standard Python than TensorFlow's, and computation graphs are built dynamically as the code runs (TensorFlow 1.x used static computation graphs; TensorFlow 2 executes eagerly by default, but still compiles static graphs via `tf.function`)
2. Customizing, deploying, scaling and visualizing models has traditionally been considered better supported in TensorFlow
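
To make the point about dynamic graphs concrete, here is a minimal sketch (not part of the upstream example; the function name `double_until_large` and the input values are made up for illustration). The number of loop iterations depends on the data, and autograd simply records whatever operations were actually executed:

```python
import torch

def double_until_large(x):
    # Ordinary Python control flow: the loop count depends on the values in x
    y = x
    steps = 0
    while y.norm() < 10:
        y = y * 2
        steps += 1
    return y.sum(), steps

x = torch.tensor([1.0, 2.0], requires_grad=True)
out, steps = double_until_large(x)

# Backpropagate through exactly the operations that ran for this input
out.backward()
print(steps, x.grad)
```

For this input the loop runs three times, so the output is `8 * x.sum()` and each entry of `x.grad` is 8.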

More information is available on the PyTorch website: https://pytorch.org/tutorials/ .

It is sometimes possible to convert models that were trained in PyTorch to be usable in TensorFlow and vice versa (depending on the specific operations and layers used), using standardized model formats such as ONNX - https://github.com/onnx/onnx .



## Accelerating Operations in Python on GPU and TPU - JAX

As the prevalence of Deep Learning has grown, along with the size and complexity of models in academia and industry, efficient use of GPU and TPU hardware to accelerate linear algebra operations has become correspondingly more important.

As we have seen in these labs, most Deep Learning in Python, and scientific computing in Python in general, is built on NumPy. While more efficient than base Python, NumPy operations only utilize CPU resources.

In 2018, Google released JAX, which is in effect an implementation of the NumPy API (and some of SciPy's API) with:

1. Operation vectorization
2. Auto-differentiation
3. Parallel computation

all while fully utilizing GPU and TPU acceleration, with the option of Just-in-time (JIT) compilation.
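
As a small illustration of the vectorization point, the sketch below uses `jax.vmap` to turn a function written for a single row into one that works on a whole batch, without writing a Python loop. The function name `weighted_sum` and the sample arrays are assumptions made up for this example:

```python
import jax.numpy as jnp
from jax import vmap

def weighted_sum(row, w):
    # Written for a single 1-D row
    return jnp.dot(row, w)

batch = jnp.arange(6.0).reshape(3, 2)   # three rows of two features
w = jnp.array([1.0, 2.0])

# Map over axis 0 of `batch`, and reuse the same `w` for every row
batched_weighted_sum = vmap(weighted_sum, in_axes=(0, None))
result = batched_weighted_sum(batch, w)
print(result)
```

Here `in_axes=(0, None)` tells JAX which argument carries the batch dimension; the result has one entry per row of `batch`.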


Since its introduction, JAX has grown in popularity and is likely to become an increasingly important component of scientific computing in Python. I would expect that at some point in the future, data processing and defining custom operations in TensorFlow, Keras and other libraries will rely on working in JAX.


Example taken from: https://docs.jax.dev/en/latest/quickstart.html

If you are interested, there are good tutorials available on JAX's own website.


In [11]:
import jax.numpy as jnp

In [12]:
def selu(x, alpha=1.67, lmbda=1.05):
  return lmbda * jnp.where(x > 0, x, alpha * jnp.exp(x) - alpha)

x = jnp.arange(5.0)
print(selu(x))


[0.        1.05      2.1       3.1499999 4.2      ]


Just-in-time compilation can be enabled simply:

In [13]:
from jax import jit
selu_jit = jit(selu)

selu_jit(x)

Array([0.       , 1.05     , 2.1      , 3.1499999, 4.2      ], dtype=float32)

Auto-differentiation is built in, and slightly more straightforward than in TensorFlow:

In [14]:
from jax import grad

def sum_logistic(x):
  return jnp.sum(1.0 / (1.0 + jnp.exp(-x)))

x_small = jnp.arange(3.)
derivative_fn = grad(sum_logistic)
print(derivative_fn(x_small))

[0.25       0.19661197 0.10499357]
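
As a sanity check on these values, the logistic function $\sigma(x) = 1/(1+e^{-x})$ has the closed-form derivative $\sigma'(x) = \sigma(x)\,(1-\sigma(x))$, so the gradient of the sum with respect to each element should match that expression. A minimal sketch of the comparison (the variable names `autodiff` and `analytic` are just for illustration):

```python
import jax.numpy as jnp
from jax import grad

def sum_logistic(x):
    return jnp.sum(1.0 / (1.0 + jnp.exp(-x)))

x_small = jnp.arange(3.)

# Gradient computed by JAX
autodiff = grad(sum_logistic)(x_small)

# Gradient from the closed-form derivative sigma(x) * (1 - sigma(x))
sigma = 1.0 / (1.0 + jnp.exp(-x_small))
analytic = sigma * (1.0 - sigma)

print(jnp.allclose(autodiff, analytic, atol=1e-6))
```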


JAX is not a silver bullet, however - for small datasets, the overhead may mean that JAX runs slower than straightforward NumPy operations on CPU.

There are also some noticeable limitations to JAX operations compared to NumPy (https://docs.jax.dev/en/latest/notebooks/Common_Gotchas_in_JAX.html):

1. JAX functions should not depend on global variables - functions must all be "pure", where all inputs are passed as arguments and all results are returned as the function's output. Side effects, even those of print statements inside the functions, can behave unexpectedly. Read the section on "Pure functions" at the link above for more details.
2. JAX arrays are immutable, so in-place updates (i.e. updating elements or ranges of elements by index) are performed functionally: `x.at[idx].set(y)` returns an updated copy rather than modifying `x`
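
Both limitations can be seen in a short sketch. The function names `scaled_impure` and `scaled_pure` and all values are made up for illustration, and the behaviour of the impure version is described in hedged terms because it depends on JAX's tracing cache:

```python
import jax.numpy as jnp
from jax import jit

# 1. An impure function that reads a global instead of taking it as an argument
scale = 2.0

@jit
def scaled_impure(v):
    return v * scale  # `scale` is read when the function is traced

first = scaled_impure(jnp.ones(3))
scale = 3.0                           # changing the global does not trigger a retrace
second = scaled_impure(jnp.ones(3))   # may silently reuse the cached value of `scale`

# The pure alternative: pass the value in explicitly
@jit
def scaled_pure(v, s):
    return v * s

third = scaled_pure(jnp.ones(3), 3.0)

# 2. Functional updates: each call returns a new array, the original is untouched
x = jnp.zeros(5)
y = x.at[1].set(10.0)
z = y.at[2:4].add(1.0)

print(first, second, third)
print(x, y, z)
```

Comparing `second` with `third` shows why passing values as arguments is the safer habit under `jit`.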

# Current Topics in Deep Learning


The following is not exhaustive, but is a selection of some interesting and important topics in current Deep Learning research and common practice that involve implementation details outside the scope of these labs. I have provided links to documentation and tutorials on each.


## Model Deployment

While the techniques you have seen so far are sufficient for prototyping, when developing models in industry, more steps are involved.

When using deep models for practical purposes, companies will develop pipelines for data processing, model definition, hyper-parameter tuning and deployment. This broad process falls under the title of MLOps.

There are many MLOps libraries in use, but TFX (https://www.tensorflow.org/tfx) is a well-documented solution that is closely integrated with the models you have seen so far.

TensorFlow also provides some very specific case studies that I would encourage you to read (https://www.tensorflow.org/about/case-studies?filter=TFX) that show exactly how this process is applied in the real world.

## Edge AI, Model Optimization and TinyML

With the increasing size of models being applied in practice, there is much focus on methods to compress models for the most efficient, low-cost and low-latency inference possible - particularly on mobile devices.

Common techniques for model optimization are:

1. Quantization, where weight and bias values are converted to a more compact (or more efficiently utilizable) data type, either after or during training
2. Pruning, where weight and bias values that contribute little to effective classification are removed
3. ADVANCED - model distillation, where large models (teachers) are trained to high performance on a particular dataset and then smaller models (students) are trained to reproduce the outputs of the larger models. https://keras.io/examples/vision/knowledge_distillation/
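
To give a feel for the first two techniques, here is a minimal NumPy sketch. It is not how the TensorFlow toolkit implements them: it assumes symmetric per-tensor int8 quantization and simple magnitude-based pruning, and the function names (`quantize_int8`, `dequantize`, `magnitude_prune`) and the random weight matrix are made up for illustration.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor quantization of float weights to int8."""
    scale = np.max(np.abs(w)) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Map int8 values back to approximate float weights."""
    return q.astype(np.float32) * scale

def magnitude_prune(w, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    k = int(sparsity * w.size)
    if k == 0:
        return w.copy()
    threshold = np.sort(np.abs(w), axis=None)[k - 1]
    return np.where(np.abs(w) <= threshold, 0.0, w)

rng = np.random.default_rng(0)
weights = rng.normal(size=(4, 4)).astype(np.float32)

q, scale = quantize_int8(weights)
recovered = dequantize(q, scale)
max_err = np.max(np.abs(weights - recovered))   # bounded by about scale / 2

pruned = magnitude_prune(weights, sparsity=0.5)

print("bytes before/after quantization:", weights.nbytes, q.nbytes)
print("max quantization error:", max_err)
print("non-zero weights after pruning:", np.count_nonzero(pruned))
```

The int8 copy needs a quarter of the storage of the float32 weights, at the cost of a rounding error of at most roughly half the quantization step.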

TensorFlow features a full toolkit for model optimization: https://www.tensorflow.org/model_optimization


LiteRT (previously TFLite) https://ai.google.dev/edge/litert provides excellent resources for compressing deep models, and provides efficient runtimes and SDKs for various platforms and languages.

This includes LiteRT for Microcontrollers (https://ai.google.dev/edge/litert/microcontrollers/overview), a C++ library that enables models to be executed on microcontrollers.






## Responsible and Explainable AI, Fairness & Privacy

New EU legislation, https://digital-strategy.ec.europa.eu/en/policies/regulatory-framework-ai, has placed massive focus on AI governance, transparency, safety & security and fairness.

TensorFlow provides a basic set of guides to introduce some of the important concepts in this area. https://www.tensorflow.org/responsible_ai

Of these, Federated Learning, used in particular for privacy-preserving applications, has become a topic of much research lately.
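
The core idea can be sketched in a few lines of NumPy. This is a toy illustration under stated assumptions, not a production approach: it assumes a federated-averaging style scheme, a linear regression model, and synthetic data, and the names `local_update` and `federated_average` are hypothetical. Each client trains on its own private data, and only the resulting weights (never the raw data) are sent back and averaged:

```python
import numpy as np

def local_update(w, X, y, lr=0.1, steps=20):
    """A few gradient descent steps on one client's private data."""
    w = w.copy()
    for _ in range(steps):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)   # gradient of mean squared error
        w -= lr * grad
    return w

def federated_average(client_weights, client_sizes):
    """Average client weights, weighted by how much data each client holds."""
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))

# Synthetic private datasets for three clients (assumed for illustration)
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
clients = []
for n in (30, 50, 20):
    X = rng.normal(size=(n, 2))
    y = X @ true_w + 0.01 * rng.normal(size=n)
    clients.append((X, y))

global_w = np.zeros(2)
for communication_round in range(10):
    # Each client starts from the current global model and trains locally
    updates = [local_update(global_w, X, y) for X, y in clients]
    # The server only ever sees the weights, which it averages
    global_w = federated_average(updates, [len(y) for _, y in clients])

print(global_w)
```

After a few communication rounds the shared model should approach the weights used to generate the data, even though no client's data ever left that client.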