# Symbol Tutorial
Besides the tensor computation interface NDArray, the other main object in MXNet is the Symbol, provided by ml.dmlc.mxnet.Symbol. A symbol represents a multi-output symbolic expression. Symbols are composed of operators, such as simple matrix operations (e.g. “+”) or neural network layers (e.g. a convolution layer). An operator can take several input variables, produce more than one output variable, and have internal state variables. A variable is either free, meaning we can bind it to a value later, or the output of another symbol.


## Jupyter Scala kernel
Add the MXNet Scala jar, which is created as part of the MXNet Scala package installation, to the classpath as follows:

**Note**: The process for adding this jar to your Scala kernel's classpath can differ depending on the Scala kernel you are using.

We used the [jupyter-scala kernel](https://github.com/alexarchambault/jupyter-scala) to create this notebook.

```
// General form: classpath.addPath(<path_to_jar>)
// For example:
classpath.addPath("mxnet-full_2.11-osx-x86_64-cpu-0.1.2-SNAPSHOT.jar")
```

## Symbol Composition
### Basic Operators
The following example composes a simple expression a+b. We first create the placeholders a and b, giving each a name with Symbol.Variable, and then construct the desired symbol using the operator +. When no name is given at creation, MXNet automatically generates a unique name for the symbol, which is the case for c.

```scala
import ml.dmlc.mxnet._
import ml.dmlc.mxnet.Visualization

val a = Symbol.Variable("a")
val b = Symbol.Variable("b")
val c = a + b
(a, b, c)
```

```
import ml.dmlc.mxnet._
import ml.dmlc.mxnet.Visualization
a: ml.dmlc.mxnet.Symbol = ml.dmlc.mxnet.Symbol@62b80558
b: ml.dmlc.mxnet.Symbol = ml.dmlc.mxnet.Symbol@4565f56d
c: ml.dmlc.mxnet.Symbol = ml.dmlc.mxnet.Symbol@345db42b
res1_5: (ml.dmlc.mxnet.Symbol, ml.dmlc.mxnet.Symbol, ml.dmlc.mxnet.Symbol) = (
  ml.dmlc.mxnet.Symbol@62b80558,
  ml.dmlc.mxnet.Symbol@4565f56d,
  ml.dmlc.mxnet.Symbol@345db42b
)
```

Most NDArray operators can be applied to Symbol, for example:


```scala
// element-wise multiplication
val d = a * b
// matrix multiplication
val e = Symbol.dot()(a, b)()
// reshape
val f = Symbol.Reshape()(d + e)()
// broadcast
val g = Symbol.broadcast_to()(f)()
```

```
d: Symbol = ml.dmlc.mxnet.Symbol@7e72d295
e: Symbol = ml.dmlc.mxnet.Symbol@7695a017
f: Symbol = ml.dmlc.mxnet.Symbol@253b4644
g: Symbol = ml.dmlc.mxnet.Symbol@2ac34e90
```

### Visualization

The MXNet Scala package uses a simplified implementation of the python-Graphviz library's functionality, based on https://github.com/xflr6/graphviz/tree/master/graphviz. You can find the detailed [source code here](https://github.com/dmlc/mxnet/blob/master/scala-package/core/src/main/scala/ml/dmlc/mxnet/Visualization.scala).

To visualize the network, create a folder for the images or PDFs and pass its path to the `dot.render()` method as follows:

```scala
val dot = Visualization.plotNetwork(symbol = g)
dot.render(engine = "dot", fileName = "g", path = ".")
```

```
dot: Visualization.Dot = ml.dmlc.mxnet.Visualization$Dot@3fc1c14
```

### Basic Neural Networks
Besides the basic operators, Symbol has a rich set of neural network layers. The following code constructs a two-layer fully connected neural network, whose structure we then visualize.

```scala
// Output may vary
val data = Symbol.Variable("data")
val fc1 = Symbol.FullyConnected(name = "fc1")()(Map("data" -> data, "num_hidden" -> 128))
val act1 = Symbol.Activation(name = "relu1")()(Map("data" -> fc1, "act_type" -> "relu"))
val fc2 = Symbol.FullyConnected(name = "fc2")()(Map("data" -> act1, "num_hidden" -> 10))
val net = Symbol.SoftmaxOutput(name = "out")()(Map("data" -> fc2))
```

```
data: Symbol = ml.dmlc.mxnet.Symbol@4e147b90
fc1: Symbol = ml.dmlc.mxnet.Symbol@449a676
act1: Symbol = ml.dmlc.mxnet.Symbol@624e9263
fc2: Symbol = ml.dmlc.mxnet.Symbol@202a4b05
net: Symbol = ml.dmlc.mxnet.Symbol@4d8f7c6b
```

To visualize the network:

```scala
val dot = Visualization.plotNetwork(symbol = net)
dot.render(engine = "dot", fileName = "net", path = ".")
```

```
dot: Visualization.Dot = ml.dmlc.mxnet.Visualization$Dot@56525af2
```

### Modularized Construction for Deep Networks
For deep networks such as Google Inception, constructing the network layer by layer is painful given the large number of layers. For these networks, we often modularize the construction. Taking Google Inception as an example, we first define a factory function that chains the convolution layer, batch normalization layer, and ReLU activation layer together:

```scala
// Output may vary
def ConvFactory(data: Symbol, numFilter: Int, kernel: (Int, Int), stride: (Int, Int) = (1, 1),
    pad: (Int, Int) = (0, 0), name: String = "", suffix: String = ""): Symbol = {
  val conv = Symbol.Convolution(s"conv_${name}${suffix}")()(
    Map("data" -> data, "num_filter" -> numFilter, "kernel" -> s"$kernel",
      "stride" -> s"$stride", "pad" -> s"$pad"))

  val bn = Symbol.BatchNorm(s"bn_${name}${suffix}")()(Map("data" -> conv))

  val act = Symbol.Activation(s"relu_${name}${suffix}")()(
    Map("data" -> bn, "act_type" -> "relu"))
  act
}

val prev = Symbol.Variable("PreviosOutput")
val convComp = ConvFactory(data = prev, numFilter = 64, kernel = (7, 7), stride = (2, 2))
val shape = Shape(128, 3, 28, 28)
```

```
defined function ConvFactory
prev: Symbol = ml.dmlc.mxnet.Symbol@a763473
convComp: Symbol = ml.dmlc.mxnet.Symbol@251e3edb
shape: Shape = (128,3,28,28)
```

To visualize the network:

```scala
val dot = Visualization.plotNetwork(symbol = convComp, title = "ConvFactory",
  shape = Map("PreviosOutput" -> shape),
  nodeAttrs = Map("shape" -> "oval", "fixedsize" -> "false"))

dot.render(engine = "dot", fileName = "ConvFactory", path = ".")
```

```
dot: Visualization.Dot = ml.dmlc.mxnet.Visualization$Dot@4958afa4
```

Then we define a function that constructs an Inception module based on ConvFactory:


```scala
def InceptionFactoryA(data: Symbol, num1x1: Int, num3x3red: Int, num3x3: Int,
    numd3x3red: Int, numd3x3: Int, pool: String, proj: Int, name: String): Symbol = {
  // 1x1
  val c1x1 = ConvFactory(data = data, numFilter = num1x1,
    kernel = (1, 1), name = s"${name}_1x1")
  // 3x3 reduce + 3x3
  val c3x3r = ConvFactory(data = data, numFilter = num3x3red,
    kernel = (1, 1), name = s"${name}_3x3", suffix = "_reduce")
  val c3x3 = ConvFactory(data = c3x3r, numFilter = num3x3,
    kernel = (3, 3), pad = (1, 1), name = s"${name}_3x3")
  // double 3x3 reduce + double 3x3
  val cd3x3r = ConvFactory(data = data, numFilter = numd3x3red,
    kernel = (1, 1), name = s"${name}_double_3x3", suffix = "_reduce")
  var cd3x3 = ConvFactory(data = cd3x3r, numFilter = numd3x3,
    kernel = (3, 3), pad = (1, 1), name = s"${name}_double_3x3_0")
  cd3x3 = ConvFactory(data = cd3x3, numFilter = numd3x3,
    kernel = (3, 3), pad = (1, 1), name = s"${name}_double_3x3_1")
  // pool + proj
  val pooling = Symbol.Pooling(s"${pool}_pool_${name}_pool")()(
    Map("data" -> data, "kernel" -> "(3, 3)", "stride" -> "(1, 1)",
      "pad" -> "(1, 1)", "pool_type" -> pool))
  val cproj = ConvFactory(data = pooling, numFilter = proj,
    kernel = (1, 1), name = s"${name}_proj")
  // concat
  val concat = Symbol.Concat(s"ch_concat_${name}_chconcat")(c1x1, c3x3, cd3x3, cproj)()
  concat
}

val prev = Symbol.Variable("PreviosOutput")
val in3a = InceptionFactoryA(prev, 64, 64, 64, 64, 96, "avg", 32, "in3a")
```

```
defined function InceptionFactoryA
prev: Symbol = ml.dmlc.mxnet.Symbol@21f75c67
in3a: Symbol = ml.dmlc.mxnet.Symbol@775e917d
```

To visualize the network:

```scala
val dot = Visualization.plotNetwork(symbol = in3a, shape = Map("PreviosOutput" -> shape),
  nodeAttrs = Map("shape" -> "oval", "fixedsize" -> "false"))

dot.render(engine = "dot", fileName = "InceptionFactoryA", path = ".")
```

```
dot: Visualization.Dot = ml.dmlc.mxnet.Visualization$Dot@50a4da62
```

Finally, we can obtain the whole network by chaining multiple Inception modules. A complete example is available in the [visualization example](https://github.com/dmlc/mxnet/tree/master/scala-package/examples/src/main/scala/ml/dmlc/mxnet/examples/visualization).

### Group Multiple Symbols
To construct neural networks with multiple loss layers, we can use Symbol.Group to group multiple symbols together. The following example groups two outputs:

```scala
val data = Symbol.Variable("data")
val fc1 = Symbol.FullyConnected(name = "fc1")()(Map("data" -> data, "num_hidden" -> 128))
val net = Symbol.Activation(name = "relu1")()(Map("data" -> fc1, "act_type" -> "relu"))
// Both loss layers take the activation defined in this cell as input
val out1 = Symbol.SoftmaxOutput(name = "softmax")()(Map("data" -> net))
val out2 = Symbol.LinearRegressionOutput("regression")()(Map("data" -> net))
val group = Symbol.Group(out1, out2)
group.listOutputs()
```

```
data: Symbol = ml.dmlc.mxnet.Symbol@27b69901
fc1: Symbol = ml.dmlc.mxnet.Symbol@5b6d60d9
net: Symbol = ml.dmlc.mxnet.Symbol@5535de50
out1: Symbol = ml.dmlc.mxnet.Symbol@3593b1c3
out2: Symbol = ml.dmlc.mxnet.Symbol@26fe58e1
group: Symbol = ml.dmlc.mxnet.Symbol@16a59f4f
res10_6: IndexedSeq[String] = ArrayBuffer("softmax_output", "regression_output")
```

## Relations to NDArray
As we have seen, both Symbol and NDArray provide multi-dimensional array operations, such as c=a+b. Users are sometimes unsure which one to use. We briefly clarify the difference here; a more detailed explanation is available [here](http://mxnet.io/architecture/program_model.html).

NDArray provides an imperative-style programming interface, in which computations are evaluated statement by statement. Symbol is closer to declarative programming, in which we first declare the computation and then evaluate it with data. Other examples of declarative programming include regular expressions and SQL.
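
The declare-then-evaluate idea can be illustrated without MXNet at all. The following is a minimal sketch in plain Scala, not the MXNet API: `SymbolicSketch`, `Expr`, `Var`, `Add`, `Mul`, `arguments`, and `eval` are hypothetical names introduced only for this illustration. Building `d` performs no arithmetic; values are supplied only when `eval` is called.

```scala
// A toy expression tree, NOT the MXNet API: every name here is hypothetical.
object SymbolicSketch {
  sealed trait Expr
  final case class Var(name: String) extends Expr        // free variable, bound later
  final case class Add(lhs: Expr, rhs: Expr) extends Expr
  final case class Mul(lhs: Expr, rhs: Expr) extends Expr

  // Names of the free variables in first-seen order (compare Symbol.listArguments).
  def arguments(e: Expr): List[String] = e match {
    case Var(n)    => List(n)
    case Add(l, r) => (arguments(l) ++ arguments(r)).distinct
    case Mul(l, r) => (arguments(l) ++ arguments(r)).distinct
  }

  // Evaluation happens only here, once every free variable has a value.
  def eval(e: Expr, env: Map[String, Float]): Float = e match {
    case Var(n)    => env(n)
    case Add(l, r) => eval(l, env) + eval(r, env)
    case Mul(l, r) => eval(l, env) * eval(r, env)
  }

  // Declarative step: describe a + b * c without computing anything yet.
  val d: Expr = Add(Var("a"), Mul(Var("b"), Var("c")))
}
```

The same declared expression can then be evaluated with different data, for example `SymbolicSketch.eval(SymbolicSketch.d, Map("a" -> 1f, "b" -> 2f, "c" -> 3f))`. An imperative interface would instead compute each intermediate value as soon as the statement runs.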

The pros for NDArray:

- straightforward
- easy to work with other language features (for loops, if-else conditions, ...) and libraries (numpy, ...)
- easy to debug step by step

The pros for Symbol:

- provides almost all functionalities of NDArray, such as +, *, sin, and reshape
- provides a large number of neural network related operators such as Convolution, Activation, and BatchNorm
- provides automatic differentiation
- easy to construct and manipulate complex computations such as deep neural networks
- easy to save, load, and visualize
- easy for the backend to optimize the computation and memory usage

We will show in the mixed programming tutorial how these two interfaces can be used together to develop a complete training program. This tutorial focuses on the usage of Symbol.

## Symbol Manipulation *
One important difference between Symbol and NDArray is that with Symbol we first declare the computation and then bind it with data to run.

In this section we introduce the functions for manipulating a symbol directly. Note that most of them are nicely wrapped by mx.module, so this section can safely be skipped.

### Shape Inference
For each symbol, we can query its inputs (or arguments) and outputs. We can also infer the output shape from the input shapes, which facilitates memory allocation.

```scala
val argName = c.listArguments()  // get the names of the inputs
val outName = c.listOutputs()    // get the names of the outputs
val (argShape, outShape, _) = c.inferShape(Map("a" -> Shape(2, 3), "b" -> Shape(2, 3)))
```

```
argName: IndexedSeq[String] = ArrayBuffer("a", "b")
outName: IndexedSeq[String] = ArrayBuffer("_plus0_output")
argShape: IndexedSeq[Shape] = Vector((2,3), (2,3))
outShape: IndexedSeq[Shape] = Vector((2,3))
```

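To give a rough picture of what shape inference has to work out, here is a small sketch in plain Scala. It is not how MXNet implements `inferShape`; `ShapeSketch`, `Dims`, `elemwise`, and `fullyConnected` are hypothetical names, and the fully connected rule assumes 2-D input data of shape (batch, features).

```scala
// Toy shape rules, NOT the MXNet implementation: every name here is hypothetical.
object ShapeSketch {
  type Dims = Vector[Int]

  // Element-wise operators such as a + b need equal input shapes and keep that shape.
  def elemwise(a: Dims, b: Dims): Dims = {
    require(a == b, s"shape mismatch: $a vs $b")
    a
  }

  // Fully connected layer on (batch, features) data with numHidden units.
  // Returns (weight shape, bias shape, output shape).
  def fullyConnected(data: Dims, numHidden: Int): (Dims, Dims, Dims) = {
    require(data.length == 2, "this sketch assumes 2-D data")
    val batch = data(0)
    val features = data(1)
    (Vector(numHidden, features), Vector(numHidden), Vector(batch, numHidden))
  }
}
```

For instance, `ShapeSketch.elemwise(Vector(2, 3), Vector(2, 3))` returns the shape (2, 3), matching the `outShape` reported for `c` above. Knowing every argument and output shape before any data arrives is what allows memory to be planned ahead of execution.
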
### Bind with Data and Evaluate
The symbol c we constructed declares what computation should be run. To evaluate it, we first need to feed its arguments, namely the free variables, with data. We do this with the bind method, which accepts a device context and a Map from free variable names to NDArrays and returns an executor. The executor provides the forward method for evaluation and the outputs attribute for retrieving all results.

```scala
val ex = c.bind(ctx = Context.cpu(), args = Map("a" -> NDArray.ones(2, 3),
                                                "b" -> NDArray.ones(2, 3)))
ex.forward()
println("number of outputs = " + ex.outputs.length)
ex.outputs(0).toArray
```

```
number of outputs = 1

ex: Executor = ml.dmlc.mxnet.Executor@502139d3
res12_3: Array[Float] = Array(2.0F, 2.0F, 2.0F, 2.0F, 2.0F, 2.0F)
```

We can evaluate the same symbol on a GPU with different data:


```scala
// Requires a GPU-enabled MXNet build and an available GPU device
val ex_gpu = c.bind(ctx = Context.gpu(), args = Map(
  "a" -> NDArray.ones(shape = Shape(3, 4), Context.gpu(), dtype = DType.Float32) * 2,
  "b" -> NDArray.ones(shape = Shape(3, 4), Context.gpu(), dtype = DType.Float32) * 3))
ex_gpu.forward()
ex_gpu.outputs(0).toArray
```

```
ex_gpu: Executor = ml.dmlc.mxnet.Executor@19e1e3a7
res14_2: Array[Float] = Array(5.0F, 5.0F, 5.0F, 5.0F, 5.0F, 5.0F, 5.0F, 5.0F, 5.0F, 5.0F, 5.0F, 5.0F)
```

### Load and Save
Similar to NDArray, we can serialize a Symbol object directly with the save and load methods. Unlike the binary format chosen by NDArray, Symbol uses the more readable JSON format for serialization. The toJson method returns the JSON string.

```scala
println(c.toJson)
c.save("symbol-c.json")
val c2 = Symbol.load("symbol-c.json")
c.toJson == c2.toJson
```

```
{
  "nodes": [
    {
      "op": "null",
      "name": "a",
      "inputs": []
    },
    {
      "op": "null",
      "name": "b",
      "inputs": []
    },
    {
      "op": "elemwise_add",
      "name": "_plus0",
      "inputs": [[0, 0, 0], [1, 0, 0]]
    }
  ],
  "arg_nodes": [0, 1],
  "node_row_ptr": [0, 1, 2, 3],
  "heads": [[2, 0, 0]],
  "attrs": {"mxnet_version": ["int", 904]}
}

c2: Symbol = ml.dmlc.mxnet.Symbol@234ea43a
res15_3: Boolean = true
```

## Customized Symbol *
Most operators, such as Symbol.Convolution and Symbol.Reshape, are implemented in C++ for better performance. MXNet also allows users to write new operators in a frontend language such as Python or Scala, which often makes development and debugging much easier.

To implement an operator in Scala, we just need to define the two computation methods forward and backward, along with several methods for querying the operator's properties, such as listArguments and inferShape.

NDArray is the default argument type in both forward and backward, so we often implement the computation with NDArray operations as well.

We first create a subclass of Operator.CustomOp and then define forward and backward.

```scala
class Softmax(_param: Map[String, String]) extends CustomOp {

  override def forward(sTrain: Boolean, req: Array[String],
    inData: Array[NDArray], outData: Array[NDArray], aux: Array[NDArray]): Unit = {
    val xShape = inData(0).shape
    val x = inData(0).toArray.grouped(xShape(1)).toArray
    val yArr = x.map { it =>
      val max = it.max
      val tmp = it.map(e => Math.exp(e.toDouble - max).toFloat)
      val sum = tmp.sum
      tmp.map(_ / sum)
    }.flatten
    val y = NDArray.empty(xShape, outData(0).context)
    y.set(yArr)
    this.assign(outData(0), req(0), y)
    y.dispose()
  }

  override def backward(req: Array[String], outGrad: Array[NDArray],
    inData: Array[NDArray], outData: Array[NDArray],
    inGrad: Array[NDArray], aux: Array[NDArray]): Unit = {
    val l = inData(1).toArray.map(_.toInt)
    val oShape = outData(0).shape
    val yArr = outData(0).toArray.grouped(oShape(1)).toArray
    l.indices.foreach { i =>
      yArr(i)(l(i)) -= 1.0f
    }
    val y = NDArray.empty(oShape, inGrad(0).context)
    y.set(yArr.flatten)
    this.assign(inGrad(0), req(0), y)
    y.dispose()
  }
}
```

```
defined class Softmax
```
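
The arithmetic inside forward and backward can be read in isolation from the NDArray plumbing. The sketch below restates it for a single row in plain Scala; `SoftmaxSketch`, `softmax`, and `inputGrad` are hypothetical names used only here. Subtracting the row maximum before exponentiating keeps `exp` from overflowing, and subtracting 1 at the label index is the gradient of a cross-entropy loss with respect to the softmax input.

```scala
// Plain-Scala restatement of the per-row math, NOT part of the MXNet API.
object SoftmaxSketch {
  // Softmax of one row; subtracting the maximum keeps exp from overflowing.
  def softmax(row: Array[Float]): Array[Float] = {
    val max = row.max
    val exps = row.map(v => math.exp((v - max).toDouble).toFloat)
    val sum = exps.sum
    exps.map(_ / sum)
  }

  // Gradient for one row: the softmax probabilities minus a one-hot label vector.
  def inputGrad(probs: Array[Float], label: Int): Array[Float] = {
    val grad = probs.clone()
    grad(label) -= 1.0f
    grad
  }
}
```

Because of the maximum subtraction, a row of large equal values such as `Array(1000f, 1000f)` still yields equal probabilities rather than NaN.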

Here we use CustomOp.assign to store the result in the destination NDArray according to the value of req, which can request either overwriting the destination or adding to it.
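
As a rough illustration of those two behaviours, the following plain-Scala sketch applies a request to ordinary arrays. It is not the MXNet implementation: `AssignSketch` is a hypothetical name, and the request strings `"write"`, `"add"`, and `"null"` are assumptions made for this example rather than values taken from this tutorial.

```scala
// Hypothetical stand-in for the assignment step, using plain arrays instead of NDArray.
object AssignSketch {
  def assign(dst: Array[Float], req: String, src: Array[Float]): Unit = {
    require(dst.length == src.length, "dst and src must have the same length")
    req match {
      case "write" => Array.copy(src, 0, dst, 0, src.length)   // overwrite dst with src
      case "add"   => for (i <- dst.indices) dst(i) += src(i)  // accumulate src into dst
      case "null"  => ()                                       // nothing was requested
      case other   => throw new IllegalArgumentException(s"unknown req: $other")
    }
  }
}
```

Accumulating rather than overwriting matters when several operators contribute gradients to the same array.
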
Next we create a subclass of Operator.CustomOpProp for querying the properties.

```scala
class SoftmaxProp(needTopGrad: Boolean = false)
  extends CustomOpProp(needTopGrad) {

  override def listArguments(): Array[String] = Array("data", "label")

  override def listOutputs(): Array[String] = Array("output")

  override def inferShape(inShape: Array[Shape]):
    (Array[Shape], Array[Shape], Array[Shape]) = {
    val dataShape = inShape(0)
    val labelShape = Shape(dataShape(0))
    val outputShape = dataShape
    (Array(dataShape, labelShape), Array(outputShape), null)
  }

  override def createOperator(ctx: String, inShapes: Array[Array[Int]],
    inDtypes: Array[Int]): CustomOp = new Softmax(this.kwargs)
}
```

```
defined class SoftmaxProp
```

Finally, we can use Symbol.Custom with the registered name to use this operator:

```scala
// fc3 and label are Symbols defined elsewhere (not shown in this tutorial)
val mlp = Symbol.Custom("softmax")()(Map("data" -> fc3,
        "label" -> label, "op_type" -> "softmax"))
```

## Advanced Usages *
### Type Cast
MXNet uses 32-bit floats by default. Sometimes we want to use a lower-precision data type for a better accuracy-performance trade-off. For example, Nvidia Tesla Pascal GPUs (e.g. P100) have improved 16-bit float performance, while GTX Pascal GPUs (e.g. GTX 1080) are fast on 8-bit integers.

We can use the Symbol.Cast operator to convert the data type.

```scala
val a = Symbol.Variable("data")
val b = Symbol.Cast()()(Map("data" -> a, "dtype" -> "float16"))
val (argb, outb, _) = b.inferType(Map("data" -> DType.Float32))
println(argb, outb)

val c = Symbol.Cast()()(Map("data" -> a, "dtype" -> "uint8"))
val (argc, outc, _) = c.inferType(Map("data" -> DType.Int32))
print(argc, outc)
```

```
(ListBuffer(Float32),ListBuffer(Float16))
(ListBuffer(Int32),ListBuffer(UInt8))

a: Symbol = ml.dmlc.mxnet.Symbol@550368a3
b: Symbol = ml.dmlc.mxnet.Symbol@63cbf626
argb: Seq[DType.DType] = ListBuffer(Float32)
outb: Seq[DType.DType] = ListBuffer(Float16)
c: Symbol = ml.dmlc.mxnet.Symbol@182f0324
argc: Seq[DType.DType] = ListBuffer(Int32)
outc: Seq[DType.DType] = ListBuffer(UInt8)
```

### Variable Sharing
Sometimes we want to share contents between several symbols. This can be done simply by binding these symbols to the same array.

```scala
val a = Symbol.Variable("a")
val b = Symbol.Variable("b")
val c = Symbol.Variable("c")
val d = a + b * c

val data = NDArray.ones(2, 3) * 2
val ex = d.bind(ctx = Context.cpu(), args = Map("a" -> data, "b" -> data, "c" -> data))
ex.forward()
ex.outputs(0).toArray
```

```
a: Symbol = ml.dmlc.mxnet.Symbol@74892999
b: Symbol = ml.dmlc.mxnet.Symbol@33cf54a9
c: Symbol = ml.dmlc.mxnet.Symbol@52dcc5fe
d: Symbol = ml.dmlc.mxnet.Symbol@3f5139ac
data: NDArray = ml.dmlc.mxnet.NDArray@8b9a99a8
ex: Executor = ml.dmlc.mxnet.Executor@1b2182b8
res19_7: Array[Float] = Array(6.0F, 6.0F, 6.0F, 6.0F, 6.0F, 6.0F)
```

## Further Readings


- [NDArray API](http://mxnet.io/api/scala/docs/index.html#ml.dmlc.mxnet.NDArray)
- [Symbol API](http://mxnet.io/api/scala/docs/index.html#ml.dmlc.mxnet.Symbol)
- [Visualization API](http://mxnet.io/api/scala/docs/index.html#ml.dmlc.mxnet.Visualization$)