In [11]:
from mxnet import nd
from mxnet.gluon import nn

x = nd.random.uniform(shape=(2, 20))

net = nn.Sequential()
net.add(nn.Dense(256, activation='relu'))
net.add(nn.Dense(10))
net.initialize()
net(x)


[[ 0.00284808  0.01469023  0.04312117 -0.02863533  0.03269168 -0.0081876
  -0.07919035  0.05432465  0.01863154  0.07643033]
 [ 0.05154045  0.01456386  0.00201678 -0.02624297  0.02174724  0.01724856
  -0.05465944  0.01346207 -0.01783044  0.0595474 ]]
<NDArray 2x10 @cpu(0)>

Common features of a block:
1. It needs to ingest data (the input).
2. It needs to produce a meaningful output. This is typically encoded in what we call the forward function, which lets us invoke a block via `net(x)` to obtain the desired output.
3. It needs to produce a gradient with respect to its input when `backward` is invoked.
4. It needs to store parameters that are inherent to the block.
5. It needs to initialize these parameters as needed.
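
These five requirements can be sketched without any framework. The following is a toy, pure-Python stand-in, not the Gluon implementation: the names `ToyBlock` and `ToyScale` are hypothetical, and the gradient is written by hand for a single scaling operation, whereas Gluon obtains gradients automatically through `autograd`.

```python
class ToyBlock:
    """Hypothetical minimal block interface; not part of Gluon."""

    def __init__(self):
        self.params = {}               # feature 4: parameter storage

    def initialize(self):              # feature 5: parameter initialization
        raise NotImplementedError

    def forward(self, x):              # feature 2: the forward computation
        raise NotImplementedError

    def backward(self, grad_out):      # feature 3: gradient w.r.t. the input
        raise NotImplementedError

    def __call__(self, x):             # feature 1: ingest the input
        return self.forward(x)


class ToyScale(ToyBlock):
    """Computes y = w * x elementwise for one scalar weight w."""

    def initialize(self):
        self.params['w'] = 0.5         # fixed value, chosen for illustration

    def forward(self, x):
        return [self.params['w'] * v for v in x]

    def backward(self, grad_out):
        # dy/dx = w, so the input gradient is w times the upstream gradient
        return [self.params['w'] * g for g in grad_out]


block = ToyScale()
block.initialize()
y = block([1.0, 2.0, 4.0])             # calling the block runs forward
grad_x = block.backward([1.0, 1.0, 1.0])
print(y)        # [0.5, 1.0, 2.0]
print(grad_x)   # [0.5, 0.5, 0.5]
```

Calling `block(...)` rather than `block.forward(...)` mirrors how `net(x)` is used in the cells of this section.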

# A Custom Block

The `nn.Block` class provides much of the functionality we need. It is a model constructor provided in the `nn` module, from which we can inherit to define the model we want.

In [3]:
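class MLP(nn.Block):
    
    def __init__(self, **kwargs):
        # Call the parent constructor to perform the necessary initialization
        super(MLP, self).__init__(**kwargs)
        self.hidden = nn.Dense(256, activation='relu')  # hidden layer
        self.output = nn.Dense(10)  # output layer
        
    def forward(self, x):
        # Forward computation: hidden layer first, then the output layer
        return self.output(self.hidden(x))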

In [4]:
net = MLP()
net.initialize()
net(x)


[[ 0.00362229  0.00633331  0.03201145 -0.01369375  0.10336448 -0.0350802
  -0.00032165 -0.01676024  0.06978628  0.01303309]
 [ 0.03871717  0.02608212  0.03544958 -0.02521311  0.11005436 -0.01430663
  -0.03052467 -0.03852826  0.06321152  0.0038594 ]]
<NDArray 2x10 @cpu(0)>

# A Sequential Block

The `Sequential` class is derived from the `Block` class. When the forward computation of a model simply chains the computations of its layers one after another, we can define the model much more simply by using the `add` function.
We implement a `MySequential` class that has the same functionality as the `Sequential` class as follows:

In [5]:
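class MySequential(nn.Block):
    
    def __init__(self, **kwargs):
        super(MySequential, self).__init__(**kwargs)
        
    def add(self, block):
        # `block` is an instance of a Block subclass, assumed to have a
        # unique name. `_children` is an ordered dictionary, so the
        # insertion order is preserved.
        self._children[block.name] = block
        
    def forward(self, x):
        # Apply the blocks in the order in which they were added
        for block in self._children.values():
            x = block(x)
        return x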

In [6]:
net = MySequential()
net.add(nn.Dense(256, activation='relu'))
net.add(nn.Dense(10))
net.initialize()
net(x)


[[ 0.07787763  0.00216403  0.01682201  0.03059879 -0.00702019  0.01668715
   0.04822846  0.0039432  -0.09300035 -0.04494302]
 [ 0.08891078 -0.00625484 -0.01619131  0.0380718  -0.01451489  0.02006172
   0.0303478   0.02463485 -0.07605448 -0.04389168]]
<NDArray 2x10 @cpu(0)>
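
The reason for keeping blocks in an ordered dictionary can be illustrated with a small pure-Python sketch. It does not use MXNet: `ToyAffine` and `ToySequential` are hypothetical stand-ins, and `collect_params` here is a simplified imitation of how a container can gather parameters by walking its registered children. The point is that the order of `add` calls determines the result, and that only registered children are visited.

```python
from collections import OrderedDict


class ToyAffine:
    """Hypothetical layer computing scale * x + shift elementwise."""

    def __init__(self, name, scale, shift):
        self.name = name
        self.scale = scale
        self.shift = shift

    def __call__(self, x):
        return [self.scale * v + self.shift for v in x]

    def collect_params(self):
        return OrderedDict([(self.name + '_scale', self.scale),
                            (self.name + '_shift', self.shift)])


class ToySequential:
    """Hypothetical container mirroring the structure of MySequential."""

    def __init__(self):
        self._children = OrderedDict()

    def add(self, block):
        self._children[block.name] = block

    def __call__(self, x):
        # Children run in insertion order
        for block in self._children.values():
            x = block(x)
        return x

    def collect_params(self):
        # Parameters are found by walking the registered children
        params = OrderedDict()
        for block in self._children.values():
            params.update(block.collect_params())
        return params


net_a = ToySequential()
net_a.add(ToyAffine('double', 2.0, 0.0))
net_a.add(ToyAffine('plus3', 1.0, 3.0))

net_b = ToySequential()
net_b.add(ToyAffine('plus3', 1.0, 3.0))
net_b.add(ToyAffine('double', 2.0, 0.0))

out_a = net_a([1.0])                       # (1 * 2) + 3
out_b = net_b([1.0])                       # (1 + 3) * 2
param_names = list(net_a.collect_params())
print(out_a, out_b)   # [5.0] [8.0]
print(param_names)
```

Because the dictionary is keyed by name, adding two blocks with the same name would overwrite the first one in this sketch, which is why `MySequential` assumes unique block names.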

# Blocks with Code

In [26]:
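class FancyMLP(nn.Block):
    
    def __init__(self, **kwargs):
        super(FancyMLP, self).__init__(**kwargs)
        # Constant parameter: created once and not updated during training
        self.rand_weight = self.params.get_constant('rand_weight', nd.random.uniform(shape=(20, 20)))
        self.dense = nn.Dense(20, activation='relu')
        
    def forward(self, x):
        x = self.dense(x)
        # Use the constant weights together with NDArray functions
        x = nd.relu(nd.dot(x, self.rand_weight.data()) + 1)
        # Reuse the same dense layer, so both calls share its parameters
        x = self.dense(x)
        # Python control flow: halve x until its L2 norm is at most 1
        while x.norm().asscalar() > 1:
            x /= 2
        # Scale x up if its norm is now below 0.8
        if x.norm().asscalar() < 0.8:
            x *= 10
        return x.sum()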

In [27]:
net = FancyMLP()
net.initialize()
net(x)


[2.6033711]
<NDArray 1 @cpu(0)>
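
The control flow at the end of `FancyMLP.forward` can be traced on plain numbers. The sketch below is pure Python and does not use MXNet; the helper names `l2_norm` and `rescale_and_sum` are hypothetical. It reproduces only the `while` and `if` statements and the final sum, not the dense layers.

```python
import math


def l2_norm(x):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(v * v for v in x))


def rescale_and_sum(x):
    x = list(x)
    # Halve every element until the norm is at most 1
    while l2_norm(x) > 1:
        x = [v / 2 for v in x]
    # If the norm is below 0.8, scale up by a factor of 10
    if l2_norm(x) < 0.8:
        x = [v * 10 for v in x]
    return sum(x)


# Norm 5 -> 2.5 -> 1.25 -> 0.625, which is below 0.8, so the vector
# [0.375, 0.5] is multiplied by 10 before summing.
big = rescale_and_sum([3.0, 4.0])
# Norm 0.9: neither branch changes the vector.
unchanged = rescale_and_sum([0.0, 0.9])
print(big)        # 8.75
print(unchanged)  # 0.9
```

Each branch depends on a Python number computed from the data, which is why the block calls `asscalar()` before comparing.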

In [15]:
class NestMLP(nn.Block):
    
    def __init__(self, **kwargs):
        super(NestMLP, self).__init__(**kwargs)
        self.net = nn.Sequential()
        self.net.add(nn.Dense(64, activation='relu'),
                     nn.Dense(32, activation='relu'))
        self.dense = nn.Dense(16, activation='relu')
        
    def forward(self, x):
        return self.dense(self.net(x))
    
chimera = nn.Sequential()
chimera.add(NestMLP(), nn.Dense(20), FancyMLP())

chimera.initialize()
chimera(x)


[3.176933]
<NDArray 1 @cpu(0)>
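
To see how the nested blocks fit together, it can help to trace the shape of the data through `chimera`. The sketch below is plain Python with a hypothetical helper, `out_shape`; it assumes a 2-D input of shape `(2, 20)` and that every step listed maps the last axis to the stated number of features, which is how the dense layers and the 20 x 20 constant matrix in this section behave.

```python
def out_shape(shape, features):
    """Shape of a 2-D batch after a step that outputs `features` columns."""
    batch_size, _ = shape
    return (batch_size, features)


# (label, number of output features) for each step, in the order applied
steps = [
    ('NestMLP.net[0]', 64),
    ('NestMLP.net[1]', 32),
    ('NestMLP.dense', 16),
    ('Dense(20)', 20),
    ('FancyMLP.dense (1st call)', 20),
    ('dot with rand_weight', 20),
    ('FancyMLP.dense (2nd call)', 20),
]

shape = (2, 20)  # the input x used throughout this section
trace = [('input', shape)]
for label, features in steps:
    shape = out_shape(shape, features)
    trace.append((label, shape))

for label, s in trace:
    print(label, s)
```

`FancyMLP` applies the same `dense` layer twice, and the second call receives 20 features, so the first call must receive 20 features as well. That is presumably why `nn.Dense(20)` sits between `NestMLP`, which outputs 16 features, and `FancyMLP`. The final `x.sum()` then reduces the `(2, 20)` array to the single value shown in the output above.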

# Exercises

1. What kind of error message will you get if the `__init__` function of your subclass does not call the `__init__` function of the parent class?

2. What kinds of problems will occur if you remove the `asscalar` function in the `FancyMLP` class?

3. What kinds of problems will occur if you change `self.net`, defined by the Sequential instance in the `NestMLP` class, to `self.net = [nn.Dense(64, activation='relu'), nn.Dense(32, activation='relu')]`?

4. Implement a block that takes two blocks as arguments, say `net1` and `net2`, and returns the concatenated output of both networks in the forward pass (this is also called a parallel block).

5. Assume that you want to concatenate multiple instances of the same network. Implement a factory function that generates multiple instances of the same block and builds a larger network from them.