## Background

In this article, we will use a [softmax](https://en.wikipedia.org/wiki/Softmax_function) classifier to build a simple image classification neural network with an accuracy of about 32%. A softmax classifier generalizes binary logistic regression to multiple classes: for each input, it outputs a probability for every category.

We will first define a softmax classifier, then use the training set of [CIFAR10](https://www.cs.toronto.edu/~kriz/cifar.html) to train the neural network, and finally use the test set to verify the accuracy of the neural network.

Let’s get started.

## Import dependencies

As in the previous tutorial, [GettingStarted](https://thoughtworksinc.github.io/DeepLearning.scala/demo/GettingStarted.html), we first need to import the DeepLearning.scala classes.

In [1]:
// import $ivy.`org.nd4j::nd4s:0.9.1`
// import $ivy.`org.nd4j:nd4j-cuda-8.0-platform:0.9.1`
import $ivy.`org.nd4j::nd4s:0.8.0`
import $ivy.`org.nd4j:nd4j-native-platform:0.8.0`
import $ivy.`com.chuusai::shapeless:2.3.2`
import $ivy.`org.rauschig:jarchivelib:0.5.0`
import $ivy.`com.thoughtworks.deeplearning::plugins-builtins:2.0.1`
import $ivy.`org.plotly-scala::plotly-jupyter-scala:0.3.2`
import $ivy.`com.thoughtworks.each::each:3.3.1`

import $ivy.`com.thoughtworks.each:each_2.11:3.3.1`

import $plugin.$ivy.`org.scalamacros:paradise_2.11.11:2.1.0`

// import scala.concurrent.ExecutionContext.Implicits.global



import org.nd4j.linalg.api.ndarray.INDArray
import org.nd4j.linalg.factory.Nd4j
import com.thoughtworks.deeplearning.DeepLearning
import com.thoughtworks.deeplearning.plugins._
import com.thoughtworks.feature.Factory
import plotly._
import plotly.element._
import plotly.layout._
import plotly.JupyterScala._
plotly.JupyterScala.init()

import com.thoughtworks.future._
import scala.concurrent.Await
import scala.concurrent.duration.Duration
import com.thoughtworks.each.Monadic._
import scalaz.std.stream._

To reduce the number of lines that `jupyter-scala` prints and keep the page output short, we set `pprintConfig`.

In [2]:
pprintConfig() = pprintConfig().copy(height = 2)

## Build your own neural network

### Set learning rate

A learning rate needs to be set for the fully connected layer. The learning rate controls how much each `weight` changes per update. If the learning rate is too low, `loss` decreases slowly and training takes longer; if it is too high, `loss` may decrease rapidly at first and then fluctuate around the minimum.
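The effect of the learning rate can be illustrated without any neural network. The following is a minimal sketch in plain Scala, assuming a toy loss `f(w) = w²` with gradient `2w`; `LearningRateSketch` and `descend` are hypothetical names used only for illustration and are not part of DeepLearning.scala.

```scala
object LearningRateSketch {
  // Toy loss f(w) = w * w has its minimum at w = 0 and gradient f'(w) = 2w.
  def gradient(w: Double): Double = 2 * w

  // Run plain gradient descent and return the weight after each step,
  // starting with the initial weight itself.
  def descend(learningRate: Double, steps: Int, start: Double = 1.0): Seq[Double] =
    Iterator
      .iterate(start)(w => w - learningRate * gradient(w))
      .take(steps + 1)
      .toSeq
}
```

With a start value of 1.0 and 10 steps, `descend(0.001, 10).last` is still about 0.98 (too low, barely moves), `descend(0.1, 10).last` is about 0.107 (steady progress), and `descend(1.0, 10)` alternates between 1.0 and -1.0 without ever settling (too high).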

In [4]:
val INDArrayLearningRatePluginUrl = "https://gist.githubusercontent.com/issimo-sakura/f06279e648e45bd574dc382abb4c44ac/raw/7bd7a871030988c58524108c5985f71002f82012/INDArrayLearningRate.sc"
interp.load(scala.io.Source.fromURL(new java.net.URL(INDArrayLearningRatePluginUrl)).mkString)

In [6]:
val CNNsPluginUrl = "https://gist.github.com/Atry/15b7d9a4c63d95ad3d67e94bf20b4f69/raw/59f7ee4dff0dde3753f560633574265e950edc93/CNN.sc"
interp.load(scala.io.Source.fromURL(new java.net.URL(CNNsPluginUrl)).mkString)

In [8]:
val L2RegularizationPluginUrl = "https://gist.githubusercontent.com/TerrorJack/a60ff752270c40a6485ee787837390aa/raw/119cbacb29dc12d74ae676b4b02687a8f38b02e4/L2Regularization.sc"
interp.load(scala.io.Source.fromURL(new java.net.URL(L2RegularizationPluginUrl)).mkString)

In [8]:
// val AdamPluginUrl = "https://gist.githubusercontent.com/issimo-sakura/0c2fc6ba4cfa536e4788112a94200b50/raw/9f68360023ba7db17e7437a9501739ffaf375ce2/Adam.sc"
// interp.load(scala.io.Source.fromURL(new java.net.URL(AdamPluginUrl)).mkString)

In [9]:
import java.util.concurrent.Executors
import scala.concurrent.ExecutionContext
val singleThreadExecutor = Executors.newSingleThreadExecutor()
implicit val singleThreadExecutionContext = ExecutionContext.fromExecutor(singleThreadExecutor, { throwable =>
    // Print the failure (including its cause, if any) first, then terminate the process
    throwable.printStackTrace()
    sys.exit(-1)
})

In [11]:
// `interp.load` is a workaround for https://github.com/lihaoyi/Ammonite/issues/649 and https://github.com/scala/bug/issues/10390

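// Adam is not enabled here (its plugin is commented out above); a fixed learning rate with L2 regularization is used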
interp.load("""
  val hyperparameters = Factory[Builtins with CNNs with INDArrayLearningRate  with L2Regularization ].
  newInstance(learningRate = 0.0001, l2Regularization = 0.5)
""")

### Write softmax

To use a `softmax` classifier (a neural network that combines `softmax` with a fully connected layer), we first need to write the softmax function. Formula: ![](https://www.zhihu.com/equation?tex=f_j%28z%29%3D%5Cfrac%7Be%5E%7Bz_j%7D%7D%7B%5Csum_ke%5E%7Bz_k%7D%7D)
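As a rough check of the formula, independent of DeepLearning.scala, the sketch below computes softmax for a single row of scores in plain Scala. `SoftmaxSketch` is a hypothetical name. Subtracting the row maximum before exponentiating is a common numerical-stability trick assumed here; it is not what the notebook's own `softmax` does (that version adds `1e-8` to the numerator and denominator instead).

```scala
object SoftmaxSketch {
  // Softmax over one non-empty row of scores: f_j(z) = exp(z_j) / sum_k exp(z_k).
  // Subtracting the row maximum leaves the result mathematically unchanged
  // but keeps exp from overflowing for large scores.
  def softmax(scores: Array[Double]): Array[Double] = {
    val maxScore = scores.max
    val expScores = scores.map(score => math.exp(score - maxScore))
    val total = expScores.sum
    expScores.map(_ / total)
  }
}
```

For example, `SoftmaxSketch.softmax(Array(1.0, 2.0, 3.0))` returns three probabilities that sum to 1, with the largest probability assigned to the third score.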

In [12]:
import hyperparameters.implicits._

In [13]:
import hyperparameters.INDArrayLayer

def softmax(scores: INDArrayLayer): INDArrayLayer = {
  val expScores = hyperparameters.exp(scores)
  (expScores + 1e-8) / (expScores.sum(1) + 1e-8)
}

In [14]:
val fileHandler = new java.util.logging.FileHandler("CNNsmall%g.log")
hyperparameters.logger.addHandler(fileHandler)

### Compose your neural network

Define a fully connected layer and [initialize `Weight`](https://github.com/ThoughtWorksInc/DeepLearning.scala/wiki/Getting-Started#231--weight-intialization). `Weight` should be a two-dimensional `INDArray` of shape `NumberOfPixels × NumberOfClasses`. `scores` holds the score of each image for each category, which indicates how likely the image is to belong to that category.
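The shapes involved can be sketched in plain Scala, without ND4J. In the sketch below, `AffineSketch` and its `Matrix` alias are hypothetical names for illustration only: an `N × D` input multiplied by a `D × C` weight, plus a bias of length `C`, yields `N × C` scores, one row per image and one column per category.

```scala
object AffineSketch {
  type Matrix = Array[Array[Double]]

  // scores = input · weight + bias
  // input:  N × D (one flattened image per row)
  // weight: D × C (one column per category)
  // bias:   C values, added to every row of the product
  def affine(input: Matrix, weight: Matrix, bias: Array[Double]): Matrix =
    input.map { row =>
      require(row.length == weight.length, "input columns must match weight rows")
      bias.indices.map { c =>
        row.indices.map(d => row(d) * weight(d)(c)).sum + bias(c)
      }.toArray
    }
}
```

With `D = NumberOfPixels` and `C = NumberOfClasses`, this corresponds to the `NumberOfPixels × NumberOfClasses` weight described above.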

In [15]:
import $ivy.`com.thoughtworks.deeplearning.etl::cifar10:1.1.0`
import com.thoughtworks.deeplearning.etl.Cifar10
import com.thoughtworks.future._
val cifar10 = Cifar10.load().blockingAwait

In [16]:
// The 10 labels of CIFAR10 images (airplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck)
val NumberOfClasses: Int = 10
// 32 × 32 pixels × 3 color channels
val NumberOfPixels: Int = 3072

In [17]:
import hyperparameters.INDArrayWeight
// case class CnnLayer(numberOfFilters: Int, hasPooling: Boolean)

// // val cnnLayers: Array[CnnLayer] = Array(
// //     CnnLayer(16, hasPooling = true),
// //     CnnLayer(18, hasPooling = true),
// //     CnnLayer(20, hasPooling = false),
// //     CnnLayer(22, hasPooling = true),
// //     CnnLayer(24, hasPooling = true)
// // )

// val cnnLayers: Array[CnnLayer] = Array(
//     CnnLayer(64, hasPooling = false),
//     CnnLayer(64, hasPooling = true),
//     CnnLayer(128, hasPooling = false),
//     CnnLayer(128, hasPooling = true),
//     CnnLayer(256, hasPooling = false),
//     CnnLayer(256, hasPooling = true)
// )

// val lastCnnWidth = cnnLayers.foldLeft(defaultPixelSize) { (width, cnnLayerConfigure) =>
//     if (cnnLayerConfigure.hasPooling) {
//     assert(width > 1)
//         width / 2
//     } else {
//         width
//     }
// }
// val outputPixel = cnnLayers.last.numberOfFilters * lastCnnWidth * lastCnnWidth

// final case class Model(cnnLayerParameters: Seq[CnnLayerParameter],
//                        fullConnectedWeight: INDArrayWeight,
//                        fullConnectedBias: INDArrayWeight) 

// final case class CnnLayerParameter(weight: INDArrayWeight, bias: INDArrayWeight)

// def isExistVersion(version: Int): Boolean = {
//     import ammonite.ops._
//     val filePath = pwd / "backup" / version.toString
//     exists(filePath)
// }

In [17]:
// def writeWeightsAndBias(version: Int, model: Model): Unit = {
//     import ammonite.ops._
    
//     def write(outputFilePath: Path, dataOfWeightOrBias: INDArrayWeight): Unit = {
//         import java.io.FileOutputStream
//         import java.io.ObjectOutputStream
//         val outputStream: ObjectOutputStream = new ObjectOutputStream(new FileOutputStream(outputFilePath.toIO))
//         try {
//             outputStream.writeObject(dataOfWeightOrBias.data)
//         } finally {
//             outputStream.close()
//         }
//     }
    
//     val Model(cnnLayerParameters, 
//           fullConnectedWeight,
//           fullConnectedBias) = model
    
        
//     for (CnnLayerParameter(weight, bias) <- cnnLayerParameters; index <- cnnLayerParameters.indices) {
//         //val CnnLayerParameter(weight, bias) = weightAndBias
        
//         val filePath = pwd / "backup" / version.toString
//         val weightFilePath = filePath / s"weight$index"
//         val biasFilePath = filePath / s"bais$index"
//         write(weightFilePath, weight)
//         write(biasFilePath, bias)

//     }
    
//     val fullConnectedIndex = cnnLayers.length
    
//     val fullConnectedfilePath = pwd / "backup" / version.toString
//     val fullConnectedweightFilePath = fullConnectedfilePath / s"weight$fullConnectedIndex"
//     val fullConnectedbiasFilePath = fullConnectedfilePath / s"bais$fullConnectedIndex"
    
//     write(fullConnectedweightFilePath, fullConnectedWeight)
//     write(fullConnectedbiasFilePath, fullConnectedBias)
// }



In [17]:
// def readWeightsAndBias(version: Int): Model = {
//     if (isExistVersion(version) == false) {
//         throw new IllegalArgumentException(s"The version$version isn't exist!")
//     }
    
//     import ammonite.ops._
    
//     def read(inputFilePath: Path): INDArrayWeight = {
//         import java.io.FileInputStream
//         import java.io.ObjectInputStream
//         val inputStream: ObjectInputStream = new ObjectInputStream(new FileInputStream(inputFilePath.toIO))
//         try {
//             INDArrayWeight(inputStream.readObject().asInstanceOf[INDArray])
//         } finally {
//             inputStream.close()
//         }
//     }
    
//     val cnnLayerParameter = for (index <- cnnLayers.indices) yield {
//         val filePath = pwd / "backup" / version.toString
//         val weightFilePath = filePath / s"weight$index"
//         val biasFilePath = filePath / s"bais$index"
//         val weight = read(weightFilePath)
//         val bias = read(biasFilePath)
//         CnnLayerParameter(weight, bias)
//     }
//     val fullConnectedIndex = cnnLayers.length
    
//     val fullConnectedfilePath = pwd / "backup" / version.toString
//     val fullConnectedweightFilePath = fullConnectedfilePath / s"weight$fullConnectedIndex"
//     val fullConnectedbiasFilePath = fullConnectedfilePath / s"bais$fullConnectedIndex"
    
//     val fullConnectedWeight = read(fullConnectedweightFilePath)
//     val fullConnectedBias = read(fullConnectedbiasFilePath)
    
//     Model(cnnLayerParameter, fullConnectedWeight, fullConnectedBias)
// }

In [17]:
// def initializeWeightAndBias(version: Int): Model = {
//     import org.nd4s.Implicits._
//     def NumberOfChannels = 3 // magic Number
//     def loadWeightAndBias() = {
//         readWeightsAndBias(version)
//     }

//     def randomlyInitializeWeightAndBias() = {
        
//         val cnnLayerParameter = for (i <- cnnLayers.indices) yield {
//             val inputDepth = if (i == 0) {
//                 NumberOfChannels
//             } else {
//                 cnnLayers(i - 1).numberOfFilters
//             }
            

//             val numberOfFilters = cnnLayers(i).numberOfFilters
//             val weight = INDArrayWeight(
//                 Nd4j.randn(
//                     Array(numberOfFilters, 
//                     inputDepth, 
//                     KernelWidth, 
//                     KernelHeight)
//                 ) / math.sqrt(inputDepth * KernelWidth * KernelHeight / 2))
            
//             val bias = INDArrayWeight(Nd4j.zeros(numberOfFilters))
            
//             CnnLayerParameter(weight, bias)
//         }
        
//         val numberOfFilters = cnnLayers.last.numberOfFilters
//         val fullConnectedWeight = INDArrayWeight(Nd4j.randn(Array(outputPixel, Cifar10.NumberOfClasses)) / math.sqrt(outputPixel / 2))
//         val fullConnectedBias = INDArrayWeight(Nd4j.zeros(Cifar10.NumberOfClasses))
//         Model(cnnLayerParameter, fullConnectedWeight, fullConnectedBias)
//     }
    
//     if (isExistVersion(version)) {
//         loadWeightAndBias()
//     } else {
//         randomlyInitializeWeightAndBias()
//     }
// }

In [17]:
// val currentVersion = 0
// val model = initializeWeightAndBias(currentVersion)

In [17]:
// for (CnnLayerParameter(weight, bias) <- model.cnnLayerParameters) {
//     hyperparameters.logger.info(s"${weight.data.shape.toSeq} ${bias.data.shape.toSeq}")
// }

In [17]:
// def dropOut(layer: INDArrayLayer, p: Double): INDArrayLayer = {
//     import com.thoughtworks.raii.asynchronous._
//     import scalaz.syntax.all._
//     import com.thoughtworks.deeplearning.DeepLearning.Tape
// //     import org.nd4j.linalg.api.ndarray.INDArray._
// //     import org.nd4s.Implicits._
//     val doTape: Do[Tape[INDArray, INDArray]] = layer.forward.flatMap { tape: Tape[INDArray, INDArray] =>
//         (layer * (Nd4j.randn(tape.data.shape) gt p)).forward
//     }
//     INDArrayLayer(doTape)
// }

In [17]:
// def dropOut(layer: INDArrayLayer, p: Double): INDArrayLayer = INDArrayLayer(monadic[Do] {
//     (layer * (Nd4j.randn(layer.forward.each.data.shape) > p)).forward.each
// })

In [18]:
val KernelSize: Int = 7
val KernelWidth: Int = KernelSize
val KernelHeight: Int = KernelSize

val NumFilters: Int = 32
val HiddenDim: Int = 100 //define hidden_layer->affineRuleOfCnnLayer
val WeightScale: Double = 1e-2
// def defaultPixelSize = Cifar10.Width
def PixelHeight = Cifar10.Height
def PixelWidth = Cifar10.Width
val Padding: Int = (KernelSize - 1) / 2
val Stride: Int = 1
val PoolSize: Int = 2

In [19]:
object AllWeightsAndBias {
    val cnnWeight: INDArrayWeight = INDArrayWeight(Nd4j.randn(Array(NumFilters, Cifar10.NumberOfChannels, KernelHeight, KernelWidth)) mul WeightScale)
    val cnnBias: INDArrayWeight = INDArrayWeight(Nd4j.zeros(NumFilters))
    val affineWeight: INDArrayWeight = INDArrayWeight(Nd4j.randn(Array(NumFilters * (PixelHeight / PoolSize) * (PixelWidth / PoolSize), HiddenDim)) mul WeightScale)
    val affineBias: INDArrayWeight = INDArrayWeight(Nd4j.zeros(HiddenDim))
    val affineLastWeight: INDArrayWeight = INDArrayWeight(Nd4j.randn(Array(HiddenDim, Cifar10.NumberOfClasses)) mul WeightScale)
    val affineLastBias: INDArrayWeight = INDArrayWeight(Nd4j.zeros(Cifar10.NumberOfClasses))
}

In [20]:
import AllWeightsAndBias._

In [21]:
def affine(input: INDArrayLayer, weight: INDArrayWeight, bias: INDArrayWeight): INDArrayLayer = {
    input dot weight + bias
}

def relu(input: INDArrayLayer): INDArrayLayer = {
    import hyperparameters.max
    max(input, 0.0)
}

In [22]:

// def __init__(self, input_dim=(3, 32, 32), num_filters=32, filter_size=7,
//                hidden_dim=100, num_classes=10, weight_scale=1e-3, reg=0.0,
//                dtype=np.float32):

//     C, H, W = input_dim
//     F, HH, WW = num_filters, filter_size, filter_size
//     self.params['W1'] = weight_scale * np.random.randn(F, C, HH, WW)
//     self.params['W2'] = weight_scale * np.random.randn(F*H/2*W/2, hidden_dim)
//     self.params['W3'] = weight_scale * np.random.randn(hidden_dim, num_classes)
//     self.params['b1'] = np.zeros(F)
//     self.params['b2'] = np.zeros(hidden_dim)
//     self.params['b3'] = np.zeros(num_classes)

// INDArrayWeight(Nd4j.randn(
//                     Array(numberOfFilters, 
//                     inputDepth, 
//                     KernelWidth, 
//                     KernelHeight)
//                 ) / math.sqrt(inputDepth * KernelWidth * KernelHeight / 2))

def myNeuralNetwork(input: INDArray):  INDArrayLayer = {
    import hyperparameters.max
    import hyperparameters.maxPool
    import hyperparameters.conv2d
    
    val cnnLayer = maxPool(relu(conv2d(input.reshape(input.shape()(0), Cifar10.NumberOfChannels, PixelHeight, PixelWidth), cnnWeight, cnnBias, (KernelHeight, KernelWidth), (Stride, Stride), (Padding, Padding))), (PoolSize, PoolSize))

    val affineRuleOfCnnLayer = relu(affine(cnnLayer.reshape(input.shape()(0), NumFilters * (PixelHeight / PoolSize) * (PixelWidth / PoolSize)), affineWeight, affineBias))

    val affineOfaffineRuleOfCnnLayer = affine(affineRuleOfCnnLayer.reshape(input.shape()(0), HiddenDim), affineLastWeight, affineLastBias)

    val softmaxValue = softmax(affineOfaffineRuleOfCnnLayer)

    softmaxValue
    
}

In [22]:
// def myNeuralNetwork(input: INDArray):  INDArrayLayer = {
//     import hyperparameters.max
//     import hyperparameters.maxPool
//     import hyperparameters.conv2d
    
//     val Model(cnnLayerParameters, fullConnectedWeight, fullConnectedBias) = model
    
//     def loop(i: Int): INDArrayLayer = {
//         val CnnLayerParameter(weight, bias) = cnnLayerParameters(i)
//         val cnnLayer = if (i == 0) {
//             max(conv2d(input, weight, bias, (3, 3), (1, 1), (1, 1)), 0.0)
//         } else {
//             max(conv2d(loop(i - 1), weight, bias, (3, 3), (1, 1), (1, 1)), 0.0)
//         }
//         if (cnnLayers(i).hasPooling) {
//             maxPool(cnnLayer, (2, 2))
//         } else {
//             cnnLayer
//         }
//     }
    
//     val layer6 = loop(5)
    
    
//     // ??? Width ?= Height
//     val layer6DropOut = dropOut(layer6, 0.5)
//     val layer7 = layer6DropOut.reshape(input.shape.head, outputPixel) dot fullConnectedWeight + fullConnectedBias
//     softmax(layer7)
// }


// def myNeuralNetwork(input: INDArray): INDArrayLayer = {
//     import hyperparameters.max
//     import hyperparameters.maxPool
//     import hyperparameters.conv2d
//     val layer1 = maxPool(max(conv2d(input.reshape(input.shape()(0), 3, 32, 32), weight1, bias1, (3, 3), (1, 1), (1, 1)), 0.0), (2, 2))
//     val layer2 = maxPool(max(conv2d(layer1, weight2, bias2, (3, 3), (1, 1), (1, 1)), 0.0), (2, 2))
//     val layer3 = maxPool(max(conv2d(layer2, weight3, bias3, (3, 3), (1, 1), (1, 1)), 0.0), (2, 2))
//     val layer4 = maxPool(max(conv2d(layer3, weight4, bias4, (3, 3), (1, 1), (1, 1)), 0.0), (2, 2))
//     val layer5 = maxPool(max(conv2d(layer4, weight5, bias5, (3, 3), (1, 1), (1, 1)), 0.0), (2, 2))

//     val layer6 = layer5.reshape(input.shape()(0), 24) dot weight6 + bias6
//     softmax(layer6)
// }

### Create LossFunction

To evaluate the predictions of the neural network, we need to write the loss function `lossFunction`. We use [cross-entropy loss](https://en.wikipedia.org/wiki/Cross_entropy) to compare the predicted result with the actual result and return the resulting score. Formula:
![](https://zhihu.com/equation?tex=%5Cdisplaystyle+H%28p%2Cq%29%3D-%5Csum_xp%28x%29+logq%28x%29)
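To make the formula concrete, here is a minimal plain-Scala sketch for a single sample, where `p` is the one-hot expected label and `q` is the predicted probability vector. `CrossEntropySketch` is a hypothetical helper, not part of DeepLearning.scala; skipping terms with `p(x) = 0` is an assumption made here to avoid evaluating `0 * log(0)`. Note that the notebook's `lossFunction` additionally averages over all entries of the batch with `.mean`.

```scala
object CrossEntropySketch {
  // H(p, q) = -sum_x p(x) * log q(x)
  // expected:  the label distribution p (one-hot in this notebook)
  // predicted: the predicted probabilities q, e.g. the output of softmax
  def crossEntropy(expected: Array[Double], predicted: Array[Double]): Double = {
    require(expected.length == predicted.length, "label and prediction must have the same length")
    val total = expected.zip(predicted).collect {
      // Terms with p(x) = 0 contribute nothing, so they are skipped
      case (p, q) if p != 0.0 => p * math.log(q)
    }.sum
    -total
  }
}
```

With a one-hot label, the loss reduces to the negative log of the probability assigned to the correct class, so a confident correct prediction gives a loss near 0 and a confident wrong prediction gives a large loss.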

In [23]:
import hyperparameters.DoubleLayer

def lossFunction(input: INDArray, expectOutput: INDArray): DoubleLayer = {

    
    val probabilities = myNeuralNetwork(input)
    val result = -(hyperparameters.log(probabilities) * expectOutput).mean
    
    result
}

## Prepare data

### Read data

To read the images and their corresponding labels from the CIFAR10 dataset and process them, we need [`import $file.ReadCIFAR10ToNDArray`](https://github.com/ThoughtWorksInc/DeepLearning.scala-website/blob/master/ipynbs/ReadCIFAR10ToNDArray.sc). This script file, provided with this course, reads and processes the CIFAR10 data.

In [23]:
// import $url.{`https://raw.githubusercontent.com/ThoughtWorksInc/DeepLearning.scala-website/v1.0.0-doc/ipynbs/ReadCIFAR10ToNDArray.sc` => ReadCIFAR10ToNDArray}

// val trainNDArray = ReadCIFAR10ToNDArray.readFromResource("/cifar-10-batches-bin/data_batch_1.bin", 1000)

// val testNDArray = ReadCIFAR10ToNDArray.readFromResource("/cifar-10-batches-bin/test_batch.bin", 100)



### Process data

Before passing data to the softmax classifier, we first process the label data with [one hot encoding](https://en.wikipedia.org/wiki/One-hot): we transform an `N × 1` INDArray (one label per image, where `N` is the number of images) into an `N × NumberOfClasses` INDArray. In each row, the column of the correct class is 1 and all other columns are 0. We keep the training set and test set separate so that we can tell whether the network has been overtrained, which leads to [overfitting](https://en.wikipedia.org/wiki/Overfitting). To process the label data, we used [Utils](https://github.com/ThoughtWorksInc/DeepLearning.scala-website/blob/master/ipynbs/Utils.sc), which is also provided in this course.
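One hot encoding can be sketched in a few lines of plain Scala. The helper below, `OneHotSketch.oneHot`, is a hypothetical stand-in written for illustration; it is not the `Utils.makeVectorized` function from the course and works on ordinary arrays rather than `INDArray`.

```scala
object OneHotSketch {
  // Turn a sequence of class indices into a matrix with one row per label
  // and numberOfClasses columns, holding 1.0 in the label's column and 0.0 elsewhere.
  def oneHot(labels: Seq[Int], numberOfClasses: Int): Array[Array[Double]] =
    labels.map { label =>
      require(label >= 0 && label < numberOfClasses, s"label $label is out of range")
      Array.tabulate(numberOfClasses)(c => if (c == label) 1.0 else 0.0)
    }.toArray
}
```

For instance, with three classes the labels `Seq(2, 0)` become the rows `(0, 0, 1)` and `(1, 0, 0)`.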

In [23]:
// val trainData = trainNDArray.head
// val testData = testNDArray.head


// val trainExpectResult = trainNDArray.tail.head
// val testExpectResult = testNDArray.tail.head

// import $url.{`https://raw.githubusercontent.com/ThoughtWorksInc/DeepLearning.scala-website/v1.0.0-doc/ipynbs/Utils.sc` => Utils}

// val vectorizedTrainExpectResult = Utils.makeVectorized(trainExpectResult, NumberOfClasses)
// val vectorizedTestExpectResult = Utils.makeVectorized(testExpectResult, NumberOfClasses)

## Train your neural network

To observe the training process of the neural network, we output `loss`; as training proceeds, `loss` should decrease.

In [24]:
// var lossSeq: IndexedSeq[Double] = IndexedSeq.empty

// @monadic[Future]
// val trainTask: Future[Unit] = {
//   val lossStream = for (_ <- (1 to 2000).toStream) yield {
//     val loss = lossFunction(trainData, vectorizedTrainExpectResult).train.each
//     kernel.publish.markdown(s"loss: $loss")
//     loss
//   }
//   lossSeq = IndexedSeq.concat(lossStream)
// }


class Trainer(batchSize: Int, numberOfEpoches: Int = 5) {
    import scalaz.std.anyVal._
    import scalaz.syntax.all._
    @volatile
    private var isShuttingDown: Boolean = false

    private val lossBuffer = scala.collection.mutable.Buffer.empty[Double]
        
    def poltLoss(): Unit = Seq(Scatter(lossBuffer.indices, lossBuffer)).plot(title = "loss by time")
    
    def interrupt(): Unit = isShuttingDown = true

    def startTrain(): Unit = {

        @monadic[Future]
        def trainTask: Future[Unit] = {
            isShuttingDown = false
            var epoch = 0
            
            while (epoch < numberOfEpoches && !isShuttingDown) {
                val iterator = cifar10.epoch(batchSize).zipWithIndex
                while (iterator.hasNext && !isShuttingDown) {
                    val (Cifar10.Batch(labels, batch), i) = iterator.next()
                    val loss = lossFunction(batch, labels).train.each
                    lossBuffer += loss
                    hyperparameters.logger.info(s"epoch=$epoch iteration=$i batchSize=$batchSize loss=$loss")
                }
                epoch += 1
            }
            
            hyperparameters.logger.info("Done")
        }
        trainTask.onComplete { tryUnit: scala.util.Try[Unit] => tryUnit.get }
        
    }
}

In [25]:
val trainBatchSize = 50

val trainer = new Trainer(batchSize = trainBatchSize, numberOfEpoches = 100)
trainer.startTrain()

In [26]:
trainer.interrupt()

In [None]:
// Serialization
// writeWeightsAndBias(version = currentVersion, model = model)

## Predict with your neural network

We use the processed test data to verify the predictions of the neural network and compute the accuracy. The accuracy should be about 32%.
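The accuracy computation itself is simple: pick the category with the highest score for each image and count how often it matches the expected label. The following plain-Scala sketch shows the idea on ordinary arrays; `AccuracySketch` is a hypothetical helper for illustration, while the notebook's own `getAccuracy` works on `INDArray` values through `Nd4j.argMax`.

```scala
object AccuracySketch {
  // Index of the largest score in one row, i.e. the predicted class.
  def argMax(row: Array[Double]): Int = row.indexOf(row.max)

  // Percentage of rows whose predicted class equals the expected label.
  def accuracy(scores: Array[Array[Double]], labels: Array[Int]): Double = {
    require(scores.nonEmpty && scores.length == labels.length, "need one label per non-empty score row")
    val correct = scores.zip(labels).count { case (row, label) => argMax(row) == label }
    correct.toDouble / scores.length * 100
  }
}
```

Converting `correct` to `Double` before dividing matters: integer division would truncate every result below 100% down to 0.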

In [27]:
// val predictResult = Await.result(myNeuralNetwork(testData).predict.toScalaFuture, Duration.Inf)

// myNeuralNetwork

def findMaxItemIndex(iNDArray: INDArray): INDArray = {
    Nd4j.argMax(iNDArray, 1)
}

// def getAccuracy(score: INDArray, testExpectLabel: INDArray): Double = {
//     val scoreIndex = findMaxItemIndex(score)
//     val numberOfCorrectPrediction = (0 until scoreIndex.shape()(0)).count { row =>
//         scoreIndex.getDouble(row, 0) == testExpectLabel.getDouble(row, 0)
//     }
//     (numberOfCorrectPrediction / score.shape()(0)) * 100
// }

def getAccuracy(score: INDArray, testExpectLabel: INDArray): Double = {
    import org.nd4s.Implicits._
    val scoreIndex = findMaxItemIndex(score)
    if (testExpectLabel.shape().toSeq.last == 1) { //not vectorized
      val acc = for (row <- 0 until scoreIndex.shape()(0)) yield {
        if (scoreIndex.getDouble(row, 0) ==
              testExpectLabel.getDouble(row, 0)) {
          1.0
        } else 0.0
      }
      (acc.sum / score.shape()(0)) * 100
    } else if (testExpectLabel.shape().toSeq.last == 10) { //vectorized
      val expectResultIndex = findMaxItemIndex(testExpectLabel)
      val accINDArray = scoreIndex.eq(expectResultIndex)
      (accINDArray.sumT / score.shape()(0)) * 100
    } else
      throw new IllegalArgumentException("Unacceptable testExpectLabel")
}

val accuracyResultBuffer = scala.collection.mutable.Buffer.empty[Double]
val iterator = cifar10.testBatches(trainBatchSize)
while (iterator.hasNext) {
    val Cifar10.Batch(testDatalabels, testDataBatch) = iterator.next()
    val predictResult = Await.result(myNeuralNetwork(testDataBatch).predict.toScalaFuture, Duration.Inf)
    val accuracyResult = getAccuracy(predictResult ,testDatalabels)
    accuracyResultBuffer += accuracyResult
}

val accuracy = accuracyResultBuffer.sum / accuracyResultBuffer.length

println("The accuracy is " + accuracy + "%")


In [None]:
trainer.poltLoss()

## Summary

In this article, we have learned how to:

* Prepare and process CIFAR10 data
* Write a softmax classifier
* Use the neural network built with the softmax classifier to predict the probability of each category for an image.