In [3]:
import sys
sys.path.append("/workspace/server")
import warnings

from thera.python.mxnet import mxnet as mx
warnings.filterwarnings('ignore')

import random
import numpy as np
import mxnet as mx
from mxnet import gluon
import gluonnlp as nlp
# https://gluon-nlp.mxnet.io/master/examples/word_embedding/word_embedding.html
import re

In [4]:
def add(a, b):
    return a + b

def fancy_func(a, b, c, d):
    e = add(a, b)
    f = add(c, d)
    g = add(e, f)
    return g

fancy_func(1, 2, 3, 4)

10

As expected, Python will perform an addition when running the statement e = add(a, b), and will store the result as the variable e, thereby changing the program’s state. The next two statements f = add(c, d) and g = add(e, f) will similarly perform additions and store the results as variables.

Although imperative programming is convenient, it may be inefficient. On the one hand, even if the add function is repeatedly called throughout the fancy_func function, Python will execute the three function calling statements individually, one after the other. On the other hand, we need to save the variable values of e and f until all the statements in fancy_func have been executed. This is because we do not know whether the variables e and f will be used by other parts of the program after the statements e = add(a, b) and f = add(c, d) have been executed.

Contrary to imperative programming, symbolic programming is usually performed after the computational process has been fully defined. Symbolic programming is used by multiple deep learning frameworks, including Theano and TensorFlow. The process of symbolic programming generally requires the following three steps:

    1. Define the computation process.

    2. Compile the computation process into an executable program.

    3. Provide the required inputs and call on the compiled program for execution.

In the example below, we utilize symbolic programming to re-implement the imperative programming code provided at the beginning of this section.

In [5]:
def add_str():
    return '''
def add(a, b):
    return a + b
'''

def fancy_func_str():
    return '''
def fancy_func(a, b, c, d):
    e = add(a, b)
    f = add(c, d)
    g = add(e, f)
    return g
'''

def evoke_str():
    return add_str() + fancy_func_str() + '''
print(fancy_func(1, 2, 3, 4))
'''

prog = evoke_str()
print(prog)
y = compile(prog, '', 'exec')
exec(y)


def add(a, b):
    return a + b

def fancy_func(a, b, c, d):
    e = add(a, b)
    f = add(c, d)
    g = add(e, f)
    return g

print(fancy_func(1, 2, 3, 4))



10
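
The string-based example above leans on Python's own compile and exec. As a complementary sketch, the same three steps (define, compile, execute) can be imitated with a tiny expression graph in plain Python. The names Var, Add, compile_graph and fancy_func_symbolic are hypothetical and exist only for this illustration; no real framework works exactly this way.

```python
# Toy symbolic-programming sketch (hypothetical names, not a framework API).

class Var:
    """A named placeholder whose value is supplied at run time."""
    def __init__(self, name):
        self.name = name

    def __add__(self, other):
        # Adding two nodes builds a new graph node; no arithmetic happens yet.
        return Add(self, other)

    def evaluate(self, env):
        return env[self.name]


class Add(Var):
    """A node recording that two sub-expressions should be added."""
    def __init__(self, left, right):
        self.left, self.right = left, right

    def evaluate(self, env):
        return self.left.evaluate(env) + self.right.evaluate(env)


# Step 1: define the computation. Only a graph is built here.
a, b, c, d = (Var(n) for n in "abcd")
graph = (a + b) + (c + d)

# Step 2: "compile" the graph into a callable.
def compile_graph(node):
    return lambda **inputs: node.evaluate(inputs)

fancy_func_symbolic = compile_graph(graph)

# Step 3: provide the inputs and execute the compiled program.
result = fancy_func_symbolic(a=1, b=2, c=3, d=4)
print(result)
```

Because the whole graph is known before any input arrives, a real framework has the opportunity to optimize it (for example by discarding intermediate values such as e and f as soon as they are consumed) before execution.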


Most deep learning frameworks choose either imperative or symbolic programming. For example, both Theano and TensorFlow (inspired by the former) make use of symbolic programming, while Chainer and its successor PyTorch utilize imperative programming. When designing Gluon, developers considered whether it was possible to harness the benefits of both imperative and symbolic programming. The developers believed that users should be able to develop and debug using pure imperative programming, while having the ability to convert most programs into symbolic programs to be run when production-level computing performance and deployment are required. Gluon achieves this through the introduction of hybrid programming.

In hybrid programming, we can build models using either the HybridBlock or the HybridSequential class. By default, they are executed in the same way as Block or Sequential instances are executed in imperative programming. When the hybridize function is called, Gluon converts the program's execution into the style used in symbolic programming. In fact, most models can make use of hybrid programming's execution style.

This section demonstrates the benefits of hybrid programming through experiments.

Previously, we learned how to use the Sequential class to concatenate multiple layers. Next, we will replace the Sequential class with the HybridSequential class in order to make use of hybrid programming.

In [7]:


from mxnet import nd, sym
from mxnet.gluon import nn
import time

def get_net():
    net = nn.HybridSequential()  # Here we use the class HybridSequential.
    net.add(nn.Dense(256, activation='relu'),
            nn.Dense(128, activation='relu'),
            nn.Dense(2))
    net.initialize()
    return net

x = nd.random.normal(shape=(1, 512))
net = get_net()
net(x)




[[0.08827581 0.00505182]]
<NDArray 1x2 @cpu(0)>

By calling the hybridize function, we are able to compile and optimize the computation of the concatenated layers in the HybridSequential instance. The model's computation result remains unchanged.



In [10]:
net.hybridize()
net(x)


[[0.08827581 0.00505182]]
<NDArray 1x2 @cpu(0)>

To demonstrate the performance improvement gained by the use of symbolic programming, we will compare the computation time before and after calling the hybridize function. Here we time 1000 forward computations of net twice: first before net has called the hybridize function, when they run imperatively, and then after, when they run symbolically.


In [13]:

def benchmark(net, x):
    start = time.time()
    for i in range(1000):
        _ = net(x)
    nd.waitall()  # To facilitate timing, we wait for all computations to be completed.
    return time.time() - start

net = get_net()
print('before hybridizing: %.4f sec' % (benchmark(net, x)))
net.hybridize()
print('after hybridizing: %.4f sec' % (benchmark(net, x)))



before hybridizing: 25.3146 sec
after hybridizing: 16.0770 sec
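
The absolute times above depend on the machine. In addition, the first call after hybridize may include one-time setup work, so a timing helper with an explicit warm-up phase can give a fairer comparison. The following is a minimal sketch, not part of the original notebook: the parameters repeats, warmup and sync are assumptions introduced here, and the helper works with any callable.

```python
import time

def benchmark(fn, x, repeats=1000, warmup=1, sync=None):
    """Return the wall-clock seconds needed to call fn(x) `repeats` times.

    `sync` is an optional zero-argument callable that blocks until all
    pending asynchronous work has finished.
    """
    for _ in range(warmup):
        fn(x)  # warm-up calls are excluded from the measurement
    if sync is not None:
        sync()
    start = time.perf_counter()
    for _ in range(repeats):
        fn(x)
    if sync is not None:
        sync()  # make sure queued work is included in the measured time
    return time.perf_counter() - start

# Stand-alone usage with a trivial function in place of a network.
elapsed = benchmark(lambda v: v * 2, 3.0, repeats=100)
print('%.6f sec' % elapsed)
```

With the objects defined in this notebook, the equivalent call would be benchmark(net, x, sync=nd.waitall), run once before and once after net.hybridize().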


In [14]:
x = sym.var('data')
net(x)

<Symbol dense8_fwd>

Just as the Sequential class is a subclass of Block, the HybridSequential class is a subclass of HybridBlock. Unlike a Block instance, which implements the forward function, a HybridBlock instance implements the hybrid_forward function.

Earlier, we demonstrated that, after calling the hybridize function, the model is able to achieve superior computing performance and portability. However, calling the hybridize function can also reduce the model's flexibility. We will demonstrate this by constructing a model using the HybridBlock class.

In [15]:

class HybridNet(nn.HybridBlock):
    def __init__(self, **kwargs):
        super(HybridNet, self).__init__(**kwargs)
        self.hidden = nn.Dense(10)
        self.output = nn.Dense(2)

    def hybrid_forward(self, F, x):
        print('F: ', F)
        print('x: ', x)
        x = F.relu(self.hidden(x))
        print('hidden: ', x)
        return self.output(x)



In [16]:
net = HybridNet()
net.initialize()
x = nd.random.normal(shape=(1, 4))
net(x)

F:  <module 'mxnet.ndarray' from '/workspace/server/thera/python/../../third_party/mxnet/python/mxnet/ndarray/__init__.py'>
x:  
[[-1.8077929   0.6785336  -0.43897748  0.7891813 ]]
<NDArray 1x4 @cpu(0)>
hidden:  
[[0.         0.         0.         0.         0.         0.15095195
  0.10973488 0.03116961 0.00184327 0.        ]]
<NDArray 1x10 @cpu(0)>



[[-0.0070825  -0.01219805]]
<NDArray 1x2 @cpu(0)>

In [19]:
net.hybridize()
net(x)

F:  <module 'mxnet.symbol' from '/workspace/server/thera/python/../../third_party/mxnet/python/mxnet/symbol/__init__.py'>
x:  <Symbol data>
hidden:  <Symbol hybridnet0_relu0>



[[-0.0070825  -0.01219805]]
<NDArray 1x2 @cpu(0)>

The first call to net(x) after hybridize still runs hybrid_forward once, this time with F set to the Symbol module, in order to build the symbolic program; this is why the print statements above show Symbol objects. If we run net(x) again, the three print statements defined in the hybrid_forward function no longer print anything. MXNet no longer needs to access the Python code and can execute the symbolic program directly in the C++ backend. This is another reason why computing performance improves after the hybridize function is called.

The price is a loss of flexibility. If we wanted to use the three print statements to debug the code in the above example, they would be skipped whenever the symbolic program is executed. Additionally, a few functions are not supported by Symbol (such as asnumpy), and in-place operations such as a += b and a[:] = a + b are not supported either (they must be rewritten as a = a + b). We therefore cannot use such functions or operations in hybrid_forward if forward computation is to run after the hybridize function has been called.

In [20]:


net(x)




[[-0.0070825  -0.01219805]]
<NDArray 1x2 @cpu(0)>
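
The behaviour seen above, where the Python body runs during the first hybridized call but not afterwards, can be imitated in plain Python. The following is a toy sketch under loud assumptions: Sym, ToyHybrid and net_toy are hypothetical names invented for this illustration, and the code only mimics the trace-once-then-replay idea; it is not how MXNet is implemented.

```python
# Toy illustration only: Sym and ToyHybrid are hypothetical, not MXNet classes.

class Sym:
    """Placeholder that records operations instead of computing them."""
    def __init__(self, fn):
        self.fn = fn  # maps a concrete input to the recorded result

    def __mul__(self, k):
        return Sym(lambda v, f=self.fn: f(v) * k)

    def __add__(self, k):
        return Sym(lambda v, f=self.fn: f(v) + k)


class ToyHybrid:
    """Runs `forward` imperatively until hybridize() is called, then traces once."""
    def __init__(self, forward):
        self.forward = forward
        self.hybrid = False
        self.cached = None

    def hybridize(self):
        self.hybrid = True

    def __call__(self, x):
        if not self.hybrid:
            return self.forward(x)  # the Python body runs on every call
        if self.cached is None:
            # First hybridized call: run forward once on a placeholder.
            self.cached = self.forward(Sym(lambda v: v)).fn
        return self.cached(x)  # later calls replay the recording only


calls = []  # records the input type each time the Python body actually runs

def forward(x):
    calls.append(type(x).__name__)
    print('forward saw a', type(x).__name__)
    return x * 2 + 1

net_toy = ToyHybrid(forward)
out_imperative = net_toy(3)  # body runs with an int
net_toy.hybridize()
out_first = net_toy(3)       # body runs once more, with a Sym placeholder
out_second = net_toy(3)      # body is skipped, so nothing is printed
print(out_imperative, out_first, out_second, calls)
```

The body of forward executes only twice across the three calls, yet all three return the same value. This is also why operations that need concrete values inside the body, such as converting to a NumPy array, cannot work on the placeholder during tracing.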