# Intro to PyTorch
<figure style='float:right;max-width:30%;'>
<img src='https://upload.wikimedia.org/wikipedia/commons/thumb/c/c6/PyTorch_logo_black.svg/640px-PyTorch_logo_black.svg.png' style='padding:10px;background-color:white'>
<figcaption style='text-align:right'>Source: <a href=https://commons.wikimedia.org/wiki/File:PyTorch_logo_black.svg>Wikimedia Commons</a></figcaption>
</figure>

PyTorch is a machine learning framework with a major focus on neural networks used for computer vision, audio, and natural language processing. The user-facing frontend is written in Python, but the number-crunching is handled by a more optimized C++ backend, including support for offloading computations to graphics cards (GPUs) for a substantial increase in speed. PyTorch was originally created by Meta (formerly known as Facebook), but has always been open source, permissively licensed ([BSD-3](https://en.wikipedia.org/wiki/BSD_licenses#3-clause)), and since September 2022 is managed by the non-profit PyTorch Foundation, a subsidiary of the [Linux Foundation](https://en.wikipedia.org/wiki/Linux_Foundation).

The accessible interface, huge community, and optimized implementations have established PyTorch among the top choices for education, research, and production in the field of neural network design.

**Caveat emptor:** PyTorch is not really "better" or "worse" than other popular frameworks like [Keras](https://keras.io/) or [TensorFlow](https://www.tensorflow.org/). While each framework has its particular strengths, they differ more in their style, philosophy, and user base than in their feature lists and performance. You should absolutely explore other options available to you and find what you like best!


## Structure

The core package of PyTorch is called [`torch`](https://pypi.org/project/torch/). This package contains all the code required to set up and compute general purpose neural networks. It is extended by packages that offer more specialized functions and objects specific to various applications: [`torchvision`](https://pytorch.org/vision/stable/index.html) for computer vision (working with images or videos), [`torchaudio`](https://pytorch.org/audio/stable/index.html) for audio processing (e.g. speech recognition or synthesis), and [`torchtext`](https://pytorch.org/text/stable/index.html) for natural language processing. PyTorch is extended by various other packages that comprise the [*PyTorch Ecosystem*](https://pytorch.org/ecosystem/).

## Core components

A plethora of functions and objects can be found within PyTorch. But arguably the most important basic components are:

1. The [`Tensor`](https://pytorch.org/docs/stable/tensors.html#tensor-class-reference) class
2. The differentiation engine [`Autograd`](https://pytorch.org/docs/stable/autograd.html#module-torch.autograd)
3. The neural network building blocks (layers and activation functions) found in [`torch.nn`](https://pytorch.org/docs/stable/nn.html#module-torch.nn)

Before we build our first neural network from scratch, let us walk through these components one at a time:

### The `Tensor` class
<figure style='float:right;max-width:10%;'>
<img src=https://imgs.xkcd.com/comics/machine_learning.png style='padding-right:10px'>
<figcaption>Source: <a href=https://xkcd.com/license.html>XKCD</a> </figcaption>
</figure>

Neural networks are essentially a sequence of linear algebra operations. A [mathematical tensor](https://en.wikipedia.org/wiki/Tensor) is the most general algebraic object, from which simpler algebraic objects can be derived:

- A scalar is a tensor of rank 0:
$$
\left[ 0 \right]
$$
- A vector is a tensor of rank 1 (a.k.a. a collection of rank 0 tensors):
$$
\begin{bmatrix} \left[ 0 \right], \left[ 1 \right], \left[ 2 \right] \end{bmatrix}
$$
- A matrix is a tensor of rank 2 (a.k.a. a collection of rank 1 tensors):
$$
\begin{bmatrix} 
  \begin{bmatrix} \left[ 0 \right], \left[ 1 \right], \left[ 2 \right] \end{bmatrix} \\
  \begin{bmatrix} \left[ 3 \right], \left[ 4 \right], \left[ 5 \right] \end{bmatrix} \\
  \begin{bmatrix} \left[ 6 \right], \left[ 7 \right], \left[ 8 \right] \end{bmatrix} 
\end{bmatrix}
$$
- An $n$-dimensional array is a tensor of rank $n$ (a.k.a. a collection of rank $n-1$ tensors):
$$
\begin{bmatrix}
\begin{bmatrix} 
  \begin{bmatrix} \left[ 0 \right], \left[ 1 \right], \left[ 2 \right] \end{bmatrix} \\
  \begin{bmatrix} \left[ 3 \right], \left[ 4 \right], \left[ 5 \right] \end{bmatrix} \\
  \begin{bmatrix} \left[ 6 \right], \left[ 7 \right], \left[ 8 \right] \end{bmatrix} 
\end{bmatrix}, 
\begin{bmatrix} 
  \begin{bmatrix} \left[ 9 \right], \left[ 10 \right], \left[ 11 \right] \end{bmatrix} \\
  \begin{bmatrix} \left[ 12 \right], \left[ 13 \right], \left[ 14 \right] \end{bmatrix} \\
  \begin{bmatrix} \left[ 15 \right], \left[ 16 \right], \left[ 17 \right] \end{bmatrix} 
\end{bmatrix}, 
\ldots
\end{bmatrix}
$$

**Note:** Describing a mathematical tensor as a generalized matrix is not [the whole story](https://medium.com/@quantumsteinke/whats-the-difference-between-a-matrix-and-a-tensor-4505fbdc576c). For the purposes of this introduction, this simplified definition shall, however, suffice.

In PyTorch, everything runs on tensors: Your data is encoded in a tensor, the neural networks are expressed as tensors, sending the data through the network is a series of transformations on a tensor. All of these tensors are represented by a class named [`Tensor`](https://pytorch.org/docs/stable/tensors.html#torch-tensor) found in the core `torch` module. 

In [None]:
import torch
torch.set_default_dtype(torch.float64)

torch.Tensor([[0, 1, 2], [3, 4, 5]])

#### Creating a `Tensor`
There are [many ways to conveniently create tensors](https://pytorch.org/docs/stable/torch.html#creation-ops) from existing data, with specific initializations, or of specific shapes. Let's try out some examples:

In [None]:
print("A rank 3 tensor filled with zeros: \n", zeros := torch.zeros(2, 2, 2))
print("A tensor of the same shape but filled with ones: \n", torch.ones_like(zeros))
print("A rank 2 tensor representing an identity matrix:\n", torch.eye(4))
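
Tensors can also be created from existing NumPy arrays or from ranges. The following is a minimal sketch (the variable names are illustrative only) that also inspects the `shape` and `dtype` attributes of the result:

```python
import numpy as np
import torch

# Build a tensor from an existing NumPy array (it shares memory with the array)
array = np.arange(6, dtype=np.int64).reshape(2, 3)
from_numpy = torch.from_numpy(array)

# Build a tensor from a range of integers and reshape it to rank 2
from_range = torch.arange(6).reshape(2, 3)

print("Shape:", from_numpy.shape, "dtype:", from_numpy.dtype)
print("Both tensors hold the same values:", torch.equal(from_numpy, from_range))
```

Because `torch.from_numpy` shares memory with the source array, changing one changes the other; use `torch.tensor(array)` if you want an independent copy.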

#### Indexing, slicing, operations

As the central data structure in `PyTorch`, `Tensor` objects support all features of a normal multi-dimensional array. You can access individual elements in a `Tensor` by using Python's or `numpy`'s regular indexing and slicing notation:


In [None]:
t = torch.tensor([[11, 12, 13], [21, 22, 23], [31, 32, 33]])
print(t)
print("Python-style indexing:", t[1][2])
print("Numpy-style indexing:", t[0, 1])
print("Python-style slicing (first row):", t[0][:])
print("Numpy-style slicing (last column):", t[:, -1])

We can perform basic calculations with tensors just as we would expect:

In [None]:
a = torch.tensor([1, 1, 1])
b = torch.tensor([2, 2, 2])

print(f'{a = }, {b = }')
print(f'Addition: {a + b = }')
print(f'Element-wise product: {a * b = }')
print(f'Element-wise division: {a / b = }')
print(f"Matrix multiplication (dot product for 1-D tensors): {a @ b = }")
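
What the `@` operator computes depends on the rank of its operands. As a small sketch (the matrix `M` is made up for this illustration), compare the three most common cases:

```python
import torch

a = torch.tensor([1, 1, 1])
b = torch.tensor([2, 2, 2])
M = torch.tensor([[1, 0, 0], [0, 2, 0], [0, 0, 3]])  # hypothetical diagonal matrix

dot = a @ b      # rank 1 @ rank 1: dot product, returns a rank 0 tensor
mat_vec = M @ b  # rank 2 @ rank 1: matrix-vector product
mat_mat = M @ M  # rank 2 @ rank 2: matrix-matrix product

print(dot, mat_vec, mat_mat, sep="\n")
```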

In addition to basic operations, a number of methods and functions are provided to mirror the functionality offered by `numpy`:

In [None]:
print("Find the maximum value in the tensor:", t.max())
print("Find the minimum value in the tensor:", t.min())
print("Calculate the sum along the first dimension:", t.sum(dim=0))
print("Calculate the sine of each element: ", t.sin())

What makes `Tensor`s special, however, is the added functionality specifically designed for machine learning workflows.

We can, for example, move a tensor to the computer's GPU. 

<div class="alert alert-block alert-info"> 
<b>Note:</b> Any GPU-related operation on a computer without GPU access makes your program fail with an error. You should therefore keep portability in mind: Machine-learning code is often developed and tested on a local machine (e.g. a laptop) and then moved to a cluster or other high-performance computer to do the actual number crunching. 
</div>

There is, however, a way to make sure we only use a device that is actually available on the current machine:

In [None]:
if torch.cuda.is_available():  # CUDA is usually the most desirable backend
    backend = 'cuda'
elif torch.backends.mps.is_available():  # MPS is supported by some macOS devices
    backend = 'mps'
else:
    backend = 'cpu'

device = torch.device(backend)
print(f'Using {backend.upper()} backend!')

Now we can safely move a tensor to a different device:

In [None]:
t = t.to(device)
t.device
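
Tensors can also be created directly on the target device, and results can be moved back to the CPU when another library needs them. The sketch below simplifies the device selection to CUDA or CPU for brevity; that simplification is an assumption of this example, not a recommendation:

```python
import torch

# Simplified device selection for this sketch (CUDA if present, else CPU)
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Create the tensor directly on the chosen device
x = torch.ones(2, 3, device=device)
y = (x * 2).sum()  # the computation runs on the same device as x

# NumPy only works with CPU memory, so move the result back first
y_numpy = y.cpu().numpy()
print(x.device, y_numpy)
```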

Now all subsequent operations involving the `Tensor` are going to run on the GPU (if available)!

Another major benefit of using `Tensor`s over regular `numpy` arrays is the automatic calculation of gradients, which we will learn more about in the next section. 

### The `Autograd` differentiation engine

<div class="alert alert-block alert-info"> 

This section is in large parts taken from the [`PyTorch` tutorial "The Fundamentals of Autograd"](https://pytorch.org/tutorials/beginner/introyt/autogradyt_tutorial.html#what-do-we-need-autograd-for)!
</div>

#### Gradients in neural network training
Calculating gradients is *the* most important computation when training neural networks. If you would like a quick reminder why, read on. Otherwise, you can skip ahead to the next section.

A machine learning model is a function, with inputs and outputs. For this discussion, we’ll treat the inputs as an $n$-dimensional vector $\vec{x}$, with elements $x_{i}$. We can then express the model, $M$, as a vector-valued function of the input: 
$$
\vec{y} = \vec{M}\left(\vec{x}\right) 
$$
(We treat the value of $M$’s output as a vector because in general, a model may have any number of outputs.)

Since we’ll mostly be discussing autograd in the context of training, our output of interest will be the model’s loss. The loss function 
$$
L\left(\vec{y}\right) = L\left(\vec{M}\left(\vec{x}\right)\right)
$$ 

is a single-valued scalar function of the model’s output. This function expresses how far off our model’s prediction was from a particular input’s ideal output. *Note:* After this point, we will often omit the vector sign where it should be contextually clear - e.g., $y$ instead of $\vec{y}$.

In training a model, we want to minimize the loss. In the idealized case of a perfect model, that means adjusting its learning weights - that is, the adjustable parameters of the function - such that loss is zero for all inputs. In the real world, it means an iterative process of nudging the learning weights until we see that we get a tolerable loss for a wide variety of inputs.

How do we decide how far and in which direction to nudge the weights? We want to minimize the loss, which means making its first derivative with respect to the input equal to 0: 

$$
\frac{\partial L}{\partial x} = 0 
$$

Recall, though, that the loss is not directly derived from the input, but a function of the model’s output (which is a function of the input directly):

$$
\frac{\partial L}{\partial x}  = \frac{\partial {L({\vec y})}}{\partial x} 
$$

By the chain rule of differential calculus, we have 

$$
\frac{\partial {L({\vec y})}}{\partial x} = \frac{\partial L}{\partial y}\frac{\partial y}{\partial x}  = \frac{\partial L}{\partial y}\frac{\partial M(x)}{\partial x}.
$$

$\frac{\partial M(x)}{\partial x}$ is where things get complex. The partial derivatives of the model’s outputs with respect to its inputs, if we were to expand the expression using the chain rule again, would involve many local partial derivatives over every multiplied learning weight, every activation function, and every other mathematical transformation in the model. The full expression for each such partial derivative is the sum of the products of the local gradient of every possible path through the computation graph that ends with the variable whose gradient we are trying to measure.

In particular, the gradients over the learning weights are of interest to us - they tell us what direction to change each weight to get the loss function closer to zero.

Since the number of such local derivatives (each corresponding to a separate path through the model’s computation graph) will tend to go up exponentially with the depth of a neural network, so does the complexity in computing them. 
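
As a concrete illustration, consider a toy model with a single learning weight $w$ and a squared-error loss with target $t$. This model is chosen here purely as an example and is not part of the original tutorial:

$$
y = M(x) = w x, \qquad L(y) = \left(y - t\right)^2
$$

Applying the chain rule to the weight gives

$$
\frac{\partial L}{\partial w} = \frac{\partial L}{\partial y}\frac{\partial y}{\partial w} = 2\left(y - t\right) \cdot x
$$

For $x = 3$, $w = 1$, and $t = 6$ we get $y = 3$ and $\frac{\partial L}{\partial w} = 2 \cdot (3 - 6) \cdot 3 = -18$. The negative sign tells us that increasing $w$ reduces the loss, which matches intuition: the loss is zero at $w = 2$.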

#### `Autograd` to the rescue

This is where `Autograd` comes in: It tracks the history of every computation performed on a tensor. Every computed tensor in your `PyTorch` model carries a history of its input tensors and the function used to create it. Combined with the fact that `PyTorch` functions meant to act on tensors each have a built-in implementation for computing their own derivatives, this greatly speeds the computation of the local derivatives needed for learning.

Let's look at a simple example: We will create a set of equidistant values between $0$ and $2\pi$ and then apply a few functions to it. Afterwards we will walk *backwards* through the sequence of calculations and differentiate every step along the way.

First up, we create a tensor of 25 linearly spaced values on the interval $[0, 2\pi]$. By default, `Autograd` will not track the gradient of tensors created in this way. We have to set `requires_grad` explicitly to `True`!

In [None]:
import math

a = torch.linspace(0., 2. * math.pi, steps=25, requires_grad=True)
a

We can now do a calculation on it:

In [None]:
b = torch.sin(a)
b

Notice that the result tensor `b` has a property called `grad_fn` that tells us that it is the result of a `sin` operation!

Let's do some more computations:

In [None]:
c = 2 * b
print(c)
d = c + 1
print(d)

Finally, let’s compute a single-element output, as is the case when computing a loss function.

In [None]:
out = d.sum()
out

Each `grad_fn` stored with our tensors allows you to walk the computation all the way back to its inputs with its `next_functions` property. We can drill down on this property to show us the gradient functions for all the prior tensors. 

In [None]:
print('out:')
print(out.grad_fn)
print(out.grad_fn.next_functions)
print(out.grad_fn.next_functions[0][0].next_functions)
print(out.grad_fn.next_functions[0][0].next_functions[0][0].next_functions)
print(out.grad_fn.next_functions[0][0].next_functions[0][0].next_functions[0][0].next_functions)
print(out.grad_fn.next_functions[0][0].next_functions[0][0].next_functions[0][0].next_functions[0][0].next_functions)
print('\nd:')
print(d.grad_fn)
print(d.grad_fn.next_functions)
print(d.grad_fn.next_functions[0][0].next_functions)
print(d.grad_fn.next_functions[0][0].next_functions[0][0].next_functions)
print(d.grad_fn.next_functions[0][0].next_functions[0][0].next_functions[0][0].next_functions)
print('\nc:')
print(c.grad_fn)
print('\nb:')
print(b.grad_fn)
print('\na:')
print(a.grad_fn)

Note that `a.grad_fn` is reported as `None`, indicating that this was an input to the function with no history of its own.

With all this machinery in place, how do we get derivatives out? You call the `backward()` method on the output, and check the input’s `grad` property to inspect the gradients:

In [None]:
out.backward()
print(a.grad)

We will visualize this in a second, but let's try to figure out what we *should* see here. Recall that the computations we did were the following:

$$
d = 2 \cdot \sin\left(a\right) + 1
$$

So the derivative with respect to $a$ should be:

$$
\frac{\partial d}{\partial a} = 2\cdot \cos\left(a\right)
$$

Let's check this by visualizing our result:

In [None]:
import matplotlib.pyplot as plt

# We need to call the method detach() to signal that the gradients should not be tracked from this point on
plt.plot(a.detach(), a.grad.detach())
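
Besides the visual check, the result can be compared numerically with the analytical derivative. The following sketch repeats the computation in compact form so that it runs on its own; the name `expected` is introduced only for this example:

```python
import math
import torch

# Repeat the computation from above in one expression
a = torch.linspace(0., 2. * math.pi, steps=25, requires_grad=True)
out = (2 * torch.sin(a) + 1).sum()
out.backward()

# Analytical derivative: 2 * cos(a)
expected = 2 * torch.cos(a.detach())
print("Autograd matches the analytical derivative:", torch.allclose(a.grad, expected))
```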

🎉 Success! 👏

We will see more of `Autograd` in action later, but first we need to talk about how to build neural networks in `PyTorch`.

### Building blocks for neural networks

<figure style='float:right;width:30%'>

<img src="https://pytorch.org/assets/images/densenet1.png">
<figcaption>

The many layers of [DenseNet](https://pytorch.org/hub/pytorch_vision_densenet/)
</figcaption>
</figure>

Neural networks are made up of a sequence of layers. Unsurprisingly, `PyTorch` offers a great variety of layers as building blocks to string together any desired architecture. They can be found alongside various activation functions in the submodule [`torch.nn`](https://pytorch.org/docs/stable/nn.html).

You can find descriptions of each layer [in the documentation](https://pytorch.org/docs/stable/nn.html#module-torch.nn), but let's quickly mention some of the basic layers together:

- [`torch.nn.Linear`](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html#torch.nn.Linear): A linear transformation of the incoming data:
  $$ y = xA^\mathrm{T} + b$$
- [`torch.nn.Conv2d`](https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html#conv2d): A 2D convolution over the incoming data
- [`torch.nn.LSTM`](https://pytorch.org/docs/stable/generated/torch.nn.LSTM.html#lstm): Applies a long short-term memory RNN to the input data
- [`torch.nn.Dropout`](https://pytorch.org/docs/stable/generated/torch.nn.Dropout.html#dropout): A dropout layer to randomly zero some of the input values during training

The number of available layer types constantly increases as new architectures are developed in the field. In addition to these, `torch.nn` contains a number of containers to facilitate composing multiple layers into a neural network, which we will do in the next section.
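
To make this more tangible, here is a small sketch with arbitrarily chosen layer sizes. It checks the formula of the linear layer by hand and then strings layers together with the container `nn.Sequential`:

```python
import torch
import torch.nn as nn

# A linear layer stores the matrix A as `weight` and the vector b as `bias`
layer = nn.Linear(in_features=4, out_features=8)
x = torch.rand(5, 4)  # a batch of 5 observations with 4 features each

# Reproduce y = x A^T + b by hand and compare with the layer's output
manual = x @ layer.weight.T + layer.bias
print("Manual result matches the layer:", torch.allclose(layer(x), manual))

# nn.Sequential passes the input through its layers in the given order
model = nn.Sequential(nn.Linear(4, 8), nn.Tanh(), nn.Linear(8, 3))
logits = model(x)
print("Output shape:", logits.shape)
```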


## A neural network from scratch
Let's design a neural network to classify flowers from the [Iris dataset](https://archive.ics.uci.edu/ml/datasets/iris)! 

We can read the data from the provided CSV file and split it into train and test set using [`pandas`](https://pandas.pydata.org/):

In [None]:
import pandas as pd

dataset = pd.read_csv('data/iris.csv')
# Encode species
class_names = dataset['class'].unique()
dataset['class'] = dataset['class'].map({name: idx for idx, name in enumerate(class_names)})

# Split data randomly into training (90 %) and test (10 %) sets
training_data = dataset.sample(frac=0.9)
test_data = dataset.drop(training_data.index)

Now that we have some data, we can start designing a neural network for it. Identifying the optimal architecture for this problem is beyond the scope of this notebook. Our focus here is on understanding the building blocks of our neural network and its implementation. We could therefore try something like this:

<img src="img/iris_network.svg" style="background-color:white;padding:1em">

All neural networks (and, as a matter of fact, also all layers) are derived from the container `Module` found in `torch.nn`. We define the architecture in the subclass' `__init__` method. We also need to implement the `forward` method to define how our network is processing input data:

In [None]:
import torch.nn as nn

class MyNeuralNetwork(nn.Module):
    def __init__(self, input_length, hidden_layer_size, n_classes):
        # Initialize the superclass first
        super().__init__()

        # Now we can define the network's structure
        # The network consists of a single, linear hidden layer...
        self.hidden_layer = nn.Linear(input_length, hidden_layer_size)
        # ...and an output layer
        self.output_layer = nn.Linear(hidden_layer_size, n_classes)

    def forward(self, x):
        """Define the network's behavior as inputs are passed through it."""
        # The output of the hidden layer is passed through a tanh activation function
        hidden_layer_activation = torch.tanh(self.hidden_layer(x))
        logits = self.output_layer(hidden_layer_activation)
        return logits

In [None]:
n_features = len(dataset.columns) - 1 
n_species = dataset['class'].nunique()
net = MyNeuralNetwork(input_length=n_features, hidden_layer_size=8, n_classes=n_species)
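
It can be instructive to look at what such a network actually contains. The sketch below uses a compact stand-in for the class above (named `DemoNetwork` here to avoid overwriting anything) with sizes assumed for the Iris data, i.e. 4 features and 3 species, and lists its learning weights:

```python
import torch
import torch.nn as nn

class DemoNetwork(nn.Module):
    """Compact stand-in for the network defined above."""
    def __init__(self, input_length, hidden_layer_size, n_classes):
        super().__init__()
        self.hidden_layer = nn.Linear(input_length, hidden_layer_size)
        self.output_layer = nn.Linear(hidden_layer_size, n_classes)

    def forward(self, x):
        return self.output_layer(torch.tanh(self.hidden_layer(x)))

demo_net = DemoNetwork(input_length=4, hidden_layer_size=8, n_classes=3)

# Each layer registers its weight matrix and bias vector as parameters
for name, parameter in demo_net.named_parameters():
    print(name, tuple(parameter.shape))

n_parameters = sum(p.numel() for p in demo_net.parameters())
print("Number of learning weights:", n_parameters)

# Passing a single observation through the network yields one logit per class
dummy_logits = demo_net(torch.rand(4))
print(dummy_logits.shape)
```

The total is $4 \cdot 8 + 8 = 40$ weights in the hidden layer plus $8 \cdot 3 + 3 = 27$ in the output layer.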

In [None]:
import torch.optim as optim

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
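
Note that `nn.CrossEntropyLoss` expects the raw network outputs (logits) and applies the softmax internally, which is why the network above does not end with a softmax layer. A minimal sketch with made-up logits shows the equivalence to the negative log-softmax of the true class:

```python
import torch
import torch.nn as nn

# Made-up logits for one observation and three classes; the true class is 0
logits = torch.tensor([[2.0, 0.5, -1.0]])
label = torch.tensor([0])

loss = nn.CrossEntropyLoss()(logits, label)

# The same value computed by hand: negative log-probability of the true class
manual = -torch.log_softmax(logits, dim=1)[0, label[0]]
print(loss.item(), manual.item())
```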

In [None]:
print('Training', end='')
for epoch in range(100):
    if epoch % 10 == 0:
        print('.', end='')
    for _, observation in training_data.sample(frac=1).iterrows():
        inputs = torch.tensor(observation[['sepal length in cm', 'sepal width in cm', 'petal length in cm', 'petal width in cm']].values)
        label = torch.tensor(observation['class'].astype('long'))

        # Reset the optimizer so all gradients are equal to zero
        optimizer.zero_grad()

        # Generate predictions
        outputs = net(inputs)
        # Calculate the loss
        loss = criterion(outputs, label)
        # Compute the gradients
        loss.backward()
        # Optimize weights
        optimizer.step()

print('finished.')
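
The loop above updates the weights after every single observation. A common alternative is to process many observations at once as a batch. The following sketch illustrates the idea on synthetic stand-in data (not the Iris data) with a learning rate chosen only for this example:

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

# Synthetic stand-in data: 60 observations, 4 features, 3 classes
inputs = torch.randn(60, 4)
labels = inputs[:, :3].argmax(dim=1)  # the class is the index of the largest of the first 3 features

model = nn.Sequential(nn.Linear(4, 8), nn.Tanh(), nn.Linear(8, 3))
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

losses = []
for epoch in range(100):
    optimizer.zero_grad()                     # reset gradients from the previous step
    loss = criterion(model(inputs), labels)   # loss averaged over the whole batch
    loss.backward()                           # compute the gradients
    optimizer.step()                          # update the weights
    losses.append(loss.item())

print(f"First loss: {losses[0]:.3f}, last loss: {losses[-1]:.3f}")
```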

In [None]:
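# The network returns logits (not probabilities); the largest logit marks the predicted class.
# Gradients are not needed for inference, so we switch off their tracking.
with torch.no_grad():
    logits = net(torch.tensor(test_data.iloc[:, :-1].values))

_, predicted_class = logits.max(dim=1)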

In [None]:
from sklearn.metrics import classification_report
print(classification_report(test_data['class'], predicted_class, target_names=class_names))

## Using a pre-trained "off-the-shelf" neural network

Data used: 

@data{DVN/1ECTVN_2020,
author = {Tung, K},
publisher = {Harvard Dataverse},
title = {{Flowers Dataset}},
UNF = {UNF:6:z6JGwpi2tftxFU+tbVH/3g==},
year = {2020},
version = {V8},
doi = {10.7910/DVN/1ECTVN},
url = {https://doi.org/10.7910/DVN/1ECTVN}
}

In [None]:
from torchvision.models import mobilenet_v2, MobileNet_V2_Weights

weights = MobileNet_V2_Weights.DEFAULT
net = mobilenet_v2(weights=weights).float()
preprocess = weights.transforms()


In [None]:
from torchvision.io import read_image, ImageReadMode
import numpy as np

img = read_image('data/flowers/flower_photos/train/daisy/5547758_eea9edfd54_n.jpg', ImageReadMode.RGB)

def imshow(inp, title=None):
    """Imshow for Tensor."""
    inp = inp.numpy().transpose((1, 2, 0))
    mean = np.array([0.485, 0.456, 0.406])
    std = np.array([0.229, 0.224, 0.225])
    inp = std * inp + mean
    inp = np.clip(inp, 0, 1)
    plt.imshow(inp)
    if title is not None:
        plt.title(title)
    plt.pause(0.001)  # pause a bit so that plots are updated
    
batch = preprocess(img).unsqueeze(0)

imshow(batch.squeeze(0))

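# Switch to evaluation mode so that dropout is disabled and batch normalization uses its running statistics
net.eval()
with torch.no_grad():
    prediction = net(batch).squeeze(0).softmax(0)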
class_id = prediction.argmax().item()
score = prediction[class_id].item()
category_name = weights.meta["categories"][class_id]
print(f"{category_name}: {100 * score:.1f}%")
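
Often the runners-up are as informative as the top prediction. The sketch below shows how `topk` returns the five most likely classes; it uses random logits as a stand-in for the network output so that it runs without downloading the model:

```python
import torch

torch.manual_seed(0)

# Random stand-in for `net(batch).squeeze(0)`: one logit per class (1000 classes)
logits = torch.randn(1000)
probabilities = logits.softmax(0)

# topk returns the k largest values and their indices, sorted in descending order
top_scores, top_ids = probabilities.topk(5)
for score, class_id in zip(top_scores.tolist(), top_ids.tolist()):
    print(f"class {class_id}: {100 * score:.1f}%")
```

With the real model, each index could be translated into a category name via `weights.meta["categories"]` as in the cell above.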


## Adapting an off-the-shelf neural network using transfer learning

In [None]:
from pathlib import Path
import numpy as np

# Create a dataset metadata file
dataset = pd.DataFrame([file for file in Path("data/flowers/flower_photos/").rglob('*.jpg')], columns=['path'])
dataset['species'] = dataset.path.apply(lambda x: str(x.parent.name))
dataset['subset'] = 'train'
# Put one sample per species aside for testing
test_set = dataset.groupby(by='species').sample(n=1).index
dataset.loc[test_set, 'subset'] = 'test'

# Create a 0-based index of the class labels
class_names = ['daisy', 'dandelion', 'roses', 'sunflowers', 'tulips']
dataset['class_id'] = dataset['species'].map({name: idx for idx, name in enumerate(class_names)}).astype('long')

In [None]:
import time

import matplotlib.pyplot as plt

def train_model(model, preprocess, training_data, n_batches, criterion, optimizer, scheduler=None, num_epochs=25):
    since = time.time()
    losses = []
    
    model.train()
    
    for epoch in range(num_epochs):
        print(f'Epoch {epoch}/{num_epochs - 1}')
        print('-' * 10)

        # Iterate over data in batches
        for batch in np.array_split(training_data.sample(frac=1), n_batches):
            print('.', end='')
            inputs = []
            labels = []
            for _, meta in batch.iterrows():
                img = read_image(str(meta['path']), ImageReadMode.RGB)
                img = preprocess(img)
                inputs.append(img)
                labels.append(torch.tensor(meta['class_id']))
            inputs = torch.stack(inputs)    
            labels = torch.stack(labels)
            
            # zero the parameter gradients
            optimizer.zero_grad()

            # forward
            outputs = model(inputs)
            loss = criterion(outputs, labels)
            losses.append(loss.detach())
            print(f"Loss: {loss}")

            # backward + optimize
            loss.backward()
            optimizer.step()
            if scheduler:
                scheduler.step()
        print("")

    time_elapsed = time.time() - since
    print(f'Training complete in {time_elapsed // 60:.0f}m {time_elapsed % 60:.0f}s')
    
    plt.plot(losses)

    return model

In [None]:
from torch.optim import lr_scheduler

# Replace the classifier layer
n_features = net.classifier[1].in_features
net.classifier[1] = nn.Linear(n_features, dataset['species'].nunique())
net = net.float()

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.classifier.parameters(), lr=0.01)

net = train_model(model=net, preprocess=preprocess, training_data=dataset.query('subset == "train"'), n_batches=10, criterion=criterion, optimizer=optimizer, num_epochs=1)
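
In the cell above, only the classifier's parameters are handed to the optimizer, so the pre-trained feature layers are not updated. A related and common technique is to freeze those layers explicitly by setting `requires_grad` to `False`, which also spares `Autograd` from computing their gradients. The sketch below demonstrates this on a tiny, hypothetical stand-in network (`TinyNet`) rather than the real MobileNet:

```python
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    """Hypothetical stand-in for a pre-trained network with two parts."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(nn.Linear(10, 16), nn.ReLU())
        self.classifier = nn.Sequential(nn.Dropout(0.2), nn.Linear(16, 1000))

    def forward(self, x):
        return self.classifier(self.features(x))

model = TinyNet()

# Freeze the feature extractor: no gradients are tracked for these weights
for param in model.features.parameters():
    param.requires_grad = False

# Replace the final layer; newly created layers require gradients by default
model.classifier[1] = nn.Linear(model.classifier[1].in_features, 5)

trainable = [name for name, p in model.named_parameters() if p.requires_grad]
print("Trainable parameters:", trainable)
print("Output shape:", model(torch.rand(2, 10)).shape)
```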


In [None]:
net.eval()

for _, meta in dataset.query('subset == "test"').iterrows():
    img = read_image(str(meta['path']), ImageReadMode.RGB)
    batch = preprocess(img).unsqueeze(0)

    prediction = net(batch).squeeze(0).softmax(0)
    class_id = prediction.argmax().item()
    score = prediction[class_id].item()
    print(f"{class_names[class_id]}: {100 * score:.1f}%, True class: {meta['species']}")

## Next steps

[Official `PyTorch` tutorials](https://pytorch.org/tutorials/index.html)

<table >
<tbody>
  <tr>
    <td style="padding:0px;border-width:0px;vertical-align:center">    
    Created by Simon Stone for Dartmouth College Library under <a href="https://creativecommons.org/licenses/by/4.0/">Creative Commons CC BY-NC 4.0 License</a>.<br>For questions, comments, or improvements, email <a href="mailto:researchdatahelp@groups.dartmouth.edu">Research Data Services</a>.
    </td>
    <td style="padding:0 0 0 1em;border-width:0px;vertical-align:center"><img alt="Creative Commons License" src="https://i.creativecommons.org/l/by/4.0/88x31.png"/></td>
  </tr>
</tbody>
</table>