# Sonnet
<img src="https://camo.githubusercontent.com/60d40f8702f33a7f7427aec58ff28001668f51767b33032f959265a0b8ab6382/68747470733a2f2f736f6e6e65742e6465762f696d616765732f736f6e6e65745f6c6f676f2e706e67" width=300/>

### URLs:
* GitHub: https://github.com/deepmind/sonnet
* GitHub tags: https://github.com/deepmind/sonnet/tags
* PyPI: https://pypi.org/project/dm-sonnet
* PyPI release history: https://pypi.org/project/dm-sonnet/#history

### Installation:
**Note:** ⚠️ The latest Sonnet version compatible with TF1 (as of 2021-07) is `v1.36`.
```bash
pip install 'dm-sonnet==1.36'
```

### Overview
Sonnet is an **object-oriented** library written in Python for building neural networks on top of TensorFlow. It was released by DeepMind in 2017.

Sonnet intends to cleanly separate the following two aspects of building computation graphs from objects:
* The configuration of objects called modules
* The connection of objects to computation graphs
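
To make this separation concrete, here is a minimal pure-Python sketch of the pattern. It is not Sonnet's implementation: `SketchModule` and `Scale` are hypothetical names introduced only for illustration. The constructor stores configuration, and calling the object delegates to `_build`, which connects the module to an input.

```python
class SketchModule:
    """Hypothetical stand-in for a module base class (not Sonnet's API)."""

    def __init__(self, name):
        # Configuration only: nothing is computed here.
        self.name = name

    def __call__(self, *args):
        # Connection: calling the object delegates to _build.
        return self._build(*args)

    def _build(self, *args):
        raise NotImplementedError


class Scale(SketchModule):
    """Multiplies every input value by a configured factor."""

    def __init__(self, factor, name="scale"):
        super().__init__(name=name)
        self._factor = factor

    def _build(self, values):
        return [self._factor * v for v in values]


double = Scale(factor=2)     # configure once
first = double([1, 2, 3])    # connect to one input
second = double([10, 20])    # connect the same object to another input
print(first, second)
```

The same configured object is reused for both inputs, which mirrors how one model instance is later called on both the train and the test batches.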

The modules are defined as sub-classes of the abstract class `sonnet.AbstractModule`. 
At the time of writing this book, the following modules are available in Sonnet:

* **Basic modules:** `AddBias`, `BatchApply`, `BatchFlatten`, `BatchReshape`, `FlattenTrailingDimensions`, `Linear`, `MergeDims`, `SelectInput`, `SliceByDim`, `TileByDim`, `TrainableVariable`
* **Recurrent modules:** `DeepRNN`, `ModelRNN`, `VanillaRNN`, `BatchNormLSTM`, `GRU`, and `LSTM`
* **Recurrent + ConvNet modules:** `Conv1DLSTM` and `Conv2DLSTM`
* **ConvNet modules:** `Conv1D`, `Conv2D`, `Conv3D`, `Conv1DTranspose`, `Conv2DTranspose`, `Conv3DTranspose`, `DepthWiseConv2D`, `InPlaneConv2D`, and `SeparableConv2D`
* **ResidualNets:** `Residual`, `ResidualCore`, and `SkipConnectionCore`
* **Others:** `BatchNorm`, `LayerNorm`, `clip_gradient`, and `scale_gradient`

We can define our own new modules by creating a sub-class of `sonnet.AbstractModule`.

### Sonnet Workflow
1. Create classes for the dataset and network architecture which inherit from `sonnet.AbstractModule`. In our example, we create an `MNIST` class and an `MLP` class.
2. Define the parameters and hyperparameters.
3. Define the test and train datasets from the dataset classes defined in the preceding step.
4. Define the model using the network class. In our example, `model = MLP([20, n_classes])` creates an MLP with two layers of 20 and `n_classes` neurons, respectively.
5. Define the `y_hat` prediction tensors for the train and test sets by calling the model on each input.
6. Define the loss tensors for the train and test sets.
7. Define the optimizer using the train loss tensor.
8. Run the optimizer operation in a TensorFlow session for the desired number of training steps (named epochs in the code) to optimize the parameters.
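
As a worked example for step 4, the size of the two linear layers can be checked with simple arithmetic. This sketch assumes flattened 28x28 MNIST images (784 inputs) and one weight matrix plus one bias vector per layer; `count_linear_params` is a hypothetical helper, not part of Sonnet.

```python
def count_linear_params(input_size, output_sizes):
    """Return a (weights, biases) count for each fully connected layer."""
    counts = []
    fan_in = input_size
    for fan_out in output_sizes:
        counts.append((fan_in * fan_out, fan_out))
        fan_in = fan_out  # the next layer consumes this layer's outputs
    return counts


n_inputs = 28 * 28  # assumed: flattened 28x28 images
n_classes = 10
layer_counts = count_linear_params(n_inputs, [20, n_classes])
total = sum(w + b for w, b in layer_counts)
print(layer_counts, total)
```

Under these assumptions almost all of the parameters sit in the first layer, because its input is much wider than its output.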

### Final Note:
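Sonnet's pattern of defining a network as a class that is configured in its constructor and connected when called is reminiscent of PyTorch's `torch.nn.Module`.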

In [1]:
import tensorflow as tf
tf.reset_default_graph()

In [2]:
import os
import sonnet as snt

In [3]:
from tensorflow.examples.tutorials.mnist import input_data

In [4]:
tf.logging.set_verbosity(tf.logging.INFO)

#### Note:
`tf.gather()`
> Gather slices from `params` axis `axis` according to `indices`. `indices` must be an integer tensor of any dimension (usually 0-D or 1-D).

```python
tf.gather(
    params, indices, validate_indices=None, name=None, axis=None, batch_dims=0
)
```
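
The row-selection behaviour can be tried without building a TensorFlow graph. The sketch below uses NumPy's `np.take` along axis 0 as an analogue of gathering rows; it illustrates the semantics under that assumption and is not TensorFlow code.

```python
import numpy as np

# Three rows of "params"; indices may repeat and may be in any order.
params = np.array([[1, 2], [3, 4], [5, 6]])
indices = np.array([2, 0, 2])

# Select whole rows (axis 0) in the order given by indices.
rows = np.take(params, indices, axis=0)
print(rows)
```

The result has one row per index, so its leading dimension follows `indices` rather than `params`.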

In [5]:
# Define MNIST class (Inherits from snt.AbstractModule!)
class MNIST(snt.AbstractModule):

    def __init__(self, mnist_part, batch_size, name='MNIST'):

        super(MNIST, self).__init__(name=name)

        self._X = tf.constant(mnist_part.images, dtype=tf.float32)
        self._Y = tf.constant(mnist_part.labels, dtype=tf.float32)
        self._batch_size = batch_size
        self._M = mnist_part.num_examples

    def _build(self):
        idx = tf.random_uniform([self._batch_size], 0, self._M, tf.int64)
        X = tf.gather(self._X, idx)  # See: https://www.tensorflow.org/versions/r1.15/api_docs/python/tf/gather
        Y = tf.gather(self._Y, idx)
        return X, Y
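
The `_build` method above draws batch indices uniformly at random, so a batch is sampled with replacement and may contain the same example more than once. A minimal NumPy sketch of the same idea follows; the toy arrays and the variable names are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset: 6 "images" with 2 features each; row i is [2*i, 2*i + 1].
images = np.arange(12, dtype=np.float32).reshape(6, 2)
labels = np.arange(6)
batch_size = 4

# Draw indices in [0, number of examples), duplicates allowed.
idx = rng.integers(0, len(images), size=batch_size)

# Index images and labels with the same indices so pairs stay aligned.
X = images[idx]
Y = labels[idx]
print(idx, X.shape, Y.shape)
```

Using one index vector for both arrays is what keeps each image paired with its own label.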

In [6]:
# Define MLP class (Inherits from snt.AbstractModule!)
class MLP(snt.AbstractModule):
    def __init__(self, output_sizes, name='mlp'):
        super(MLP, self).__init__(name=name)

        self._layers = []

        for output_size in output_sizes:
            self._layers.append(snt.Linear(output_size=output_size))

    def _build(self, X):

        # hidden layers: every layer except the last uses a sigmoid activation
        model = X
        for layer in self._layers[:-1]:
            model = tf.sigmoid(layer(model))

        # output layer: softmax turns the final activations into class probabilities
        model = tf.nn.softmax(self._layers[-1](model))

        return model
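
For intuition, the forward pass that `MLP([20, 10])` describes can be sketched in NumPy: a sigmoid hidden layer followed by a softmax output layer. The random weights, the input of 5 rows with 784 features, and the helper names are assumptions for illustration; this is not how Sonnet initialises or stores its variables.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def softmax(z):
    # Subtract the row-wise max for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


rng = np.random.default_rng(0)
X = rng.random((5, 784))  # 5 toy inputs with 784 features each

# Hypothetical weights and biases for layer sizes 784 -> 20 -> 10.
W1, b1 = rng.normal(scale=0.05, size=(784, 20)), np.zeros(20)
W2, b2 = rng.normal(scale=0.05, size=(20, 10)), np.zeros(10)

hidden = sigmoid(X @ W1 + b1)      # hidden layer with sigmoid activation
probs = softmax(hidden @ W2 + b2)  # output layer with softmax
print(hidden.shape, probs.shape)
```

Each row of `probs` sums to 1, which is what lets the loss treat the outputs as class probabilities.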

In [7]:
batch_size = 100
n_classes = 10
n_epochs = 10

In [8]:
mnist = input_data.read_data_sets(
    os.path.join('.', 'mnist'),
    one_hot=True
)

Instructions for updating:
Please use alternatives such as official/mnist/dataset.py from tensorflow/models.
Instructions for updating:
Please write your own downloading logic.
Instructions for updating:
Please use tf.data to implement this functionality.
Extracting ./mnist/train-images-idx3-ubyte.gz
Instructions for updating:
Please use tf.data to implement this functionality.
Extracting ./mnist/train-labels-idx1-ubyte.gz
Instructions for updating:
Please use tf.one_hot on tensors.
Extracting ./mnist/t10k-images-idx3-ubyte.gz
Extracting ./mnist/t10k-labels-idx1-ubyte.gz
Instructions for updating:
Please use alternatives such as official/mnist/dataset.py from tensorflow/models.


In [9]:
train = MNIST(mnist.train, batch_size=batch_size)
test = MNIST(mnist.test, batch_size=batch_size)

X_train, Y_train = train()
X_test, Y_test = test()

In [10]:
model = MLP([20, n_classes])

In [11]:
# Note: **before any training**
Y_train_hat = model(X_train)
Y_test_hat = model(X_test)

Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor


In [12]:
def loss(Y_hat, Y):
    return -tf.reduce_sum(Y * tf.log(Y_hat))
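
Note that this loss is summed over the whole batch rather than averaged. As a rough sanity check, the pure-Python sketch below evaluates the same formula for a model that predicts a uniform distribution over 10 classes on a batch of 100 one-hot labels. `summed_cross_entropy` is a hypothetical helper written for this illustration.

```python
import math


def summed_cross_entropy(y_hat, y):
    """Compute -sum(y * log(y_hat)) over all rows and classes."""
    return -sum(
        t * math.log(p)
        for probs, targets in zip(y_hat, y)
        for p, t in zip(probs, targets)
    )


batch_size, n_classes = 100, 10

# Uniform predictions: every class gets probability 0.1.
y_hat = [[1.0 / n_classes] * n_classes for _ in range(batch_size)]

# One-hot labels; which class is "hot" does not matter for uniform predictions.
y = [[1.0 if c == i % n_classes else 0.0 for c in range(n_classes)]
     for i in range(batch_size)]

baseline = summed_cross_entropy(y_hat, y)
print(baseline)
```

The baseline equals `batch_size * ln(n_classes)`, roughly 230, which is the scale to expect from an untrained network with this batch size.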

In [13]:
L_train = loss(Y_train_hat, Y_train)
L_test = loss(Y_test_hat, Y_test)

In [14]:
print(Y_train_hat)
print(L_train)

Tensor("mlp/Softmax:0", shape=(100, 10), dtype=float32)
Tensor("Neg:0", shape=(), dtype=float32)


In [15]:
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(L_train)
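
Each run of this operation applies one gradient descent update: every parameter moves against its gradient, scaled by the learning rate. A minimal sketch of that update rule on a one-dimensional toy loss follows; the function name and the toy loss are assumptions for illustration, not TensorFlow internals.

```python
def gradient_descent(grad_fn, w, learning_rate, n_steps):
    """Repeatedly apply w <- w - learning_rate * grad(w) and record each value."""
    history = [w]
    for _ in range(n_steps):
        w = w - learning_rate * grad_fn(w)
        history.append(w)
    return history


# Toy loss L(w) = (w - 3)^2 has gradient 2 * (w - 3) and its minimum at w = 3.
history = gradient_descent(
    lambda w: 2.0 * (w - 3.0), w=0.0, learning_rate=0.1, n_steps=50
)
print(history[1], history[-1])
```

With this learning rate the distance to the minimum shrinks by a constant factor on every step, so the iterate approaches 3 steadily.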

In [16]:
with tf.Session() as tfs:
    tf.global_variables_initializer().run()
    
    for epoch in range(n_epochs):
        loss_val, _ = tfs.run((L_train, optimizer))
        print('Epoch : {} Training Loss : {}'.format(epoch, loss_val))

    loss_val = tfs.run(L_test)
    print('Test loss : {}'.format(loss_val))

Epoch : 0 Training Loss : 236.387939453125
Epoch : 1 Training Loss : 229.48268127441406
Epoch : 2 Training Loss : 222.2772216796875
Epoch : 3 Training Loss : 221.61489868164062
Epoch : 4 Training Loss : 219.23606872558594
Epoch : 5 Training Loss : 214.39312744140625
Epoch : 6 Training Loss : 203.90213012695312
Epoch : 7 Training Loss : 197.7329864501953
Epoch : 8 Training Loss : 197.83961486816406
Epoch : 9 Training Loss : 190.5469970703125
Test loss : 185.12721252441406
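
Each iteration of the loop above draws a single random batch, so the 10 iterations are training steps rather than full passes over the data. The arithmetic below puts the numbers in perspective; the training-set size of 55,000 is an assumption (read `mnist.train.num_examples` for the actual value).

```python
import math

batch_size = 100
n_iterations = 10
n_train_examples = 55_000  # assumed size; check mnist.train.num_examples

# Examples drawn in total (with replacement) during the run above.
examples_drawn = batch_size * n_iterations

# Steps needed to draw as many examples as one full pass would contain.
steps_per_pass = math.ceil(n_train_examples / batch_size)

# The printed losses are sums over a batch; divide to get a per-example mean.
final_train_loss = 190.5469970703125
mean_loss = final_train_loss / batch_size
print(examples_drawn, steps_per_pass, mean_loss)
```

Under that assumption the run touches only a small fraction of the training set, which is consistent with the loss still being high after the last step.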
