# Training Neural Networks with MinPy Solver

This notebook demonstrates how to use MinPy's solver to train a 2-layer fully connected network. MinPy supports NumPy-style syntax, which lets researchers specify network architectures flexibly. MinPy's solver also simplifies the training procedure: once the model is specified, training can start immediately with the built-in solver.

Below is the general pipeline for training neural networks with MinPy's solver:
1. Define a class that specifies the neural network architecture. The class should inherit from `minpy.nn.model.ModelBase`, which provides the interfaces that make the model compatible with MinPy's solver.
2. Load the data and wrap it in a `minpy.nn.io.NDArrayIter` instance.
3. Pass the `NDArrayIter` and neural network objects to an instance of `minpy.nn.solver.Solver`, initialize the parameters (`solver.init()`), and start training (`solver.train()`).

This notebook walks through each of these steps.

## Some Imports
Import MinPy's NumPy interface, which is the core of the system. Also import MinPy's `nn` package, which provides the neural network and data utilities.

In [2]:
import minpy.numpy as np
from minpy.nn import layers
from minpy.nn.model import ModelBase
from minpy.nn.solver import Solver
from minpy.nn.io import NDArrayIter
from examples.utils.data_utils import get_CIFAR10_data

## Define Hyperparameters
You can try different settings by changing the hyperparameters below.

In [None]:
# Define hyperparameters regarding network architecture.
input_size           = (3, 32, 32)
flattened_input_size = 3 * 32 * 32
hidden_size          = 512
num_classes          = 10

# Define hyperparameters regarding training data.
batch_size = 128

# Define hyperparameters regarding optimizer.
learning_rate = 1e-4
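
As a quick sanity check on these settings, the short sketch below counts the trainable values implied by the architecture hyperparameters. It uses plain Python only; `param_shapes` and `count_elements` are illustrative names introduced here, not part of MinPy.

```python
# Values copied from the hyperparameter cell above.
flattened_input_size = 3 * 32 * 32
hidden_size = 512
num_classes = 10

# Shapes of the four trainable arrays of a 2-layer fully connected network.
param_shapes = {
    'w1': (flattened_input_size, hidden_size),
    'b1': (hidden_size,),
    'w2': (hidden_size, num_classes),
    'b2': (num_classes,),
}

def count_elements(shape):
    """Return the number of scalar entries in an array of the given shape."""
    n = 1
    for dim in shape:
        n *= dim
    return n

total_params = sum(count_elements(shape) for shape in param_shapes.values())
print(total_params)  # 1578506
```

Almost all of the parameters sit in the first weight matrix, so `hidden_size` is the setting that most affects model size.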

## Specify Neural Network Architecture
To specify a model structure, define a class that inherits from `minpy.nn.model.ModelBase` and do the following:
1. In the initializer, call the initializer of the base class.
2. In the initializer, call `self.add_param` to specify the names and shapes of the trainable parameters.
3. Define a `forward` function, which receives the input data and the mode and returns the network outputs. It is recommended to use the layers provided in `minpy.nn.layers` where possible.
4. Define a `loss` function, which receives the network prediction and the ground truth and returns the loss. It is recommended to simplify this function by using the loss functions defined in `minpy.nn.layers`.

In [None]:
class TwoLayerNet(ModelBase):
    def __init__(self):
        super(TwoLayerNet, self).__init__()
        # Define model parameters.
        self.add_param(name='w1', shape=(flattened_input_size, hidden_size)) \
            .add_param(name='b1', shape=(hidden_size,)) \
            .add_param(name='w2', shape=(hidden_size, num_classes)) \
            .add_param(name='b2', shape=(num_classes,))

    def forward(self, X, mode):
        # X is the input and mode is a string, which is either 'training' or 'test'.
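        # Flatten the input data to a matrix. The batch dimension is read from
        # X itself so that a batch of any size can be processed.
        X = np.reshape(X, (X.shape[0], flattened_input_size))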
        # First affine layer (fully-connected layer).
        y1 = layers.affine(X, self.params['w1'], self.params['b1'])
        # ReLU activation.
        y2 = layers.relu(y1)
        # Second affine layer.
        y3 = layers.affine(y2, self.params['w2'], self.params['b2'])
        return y3

    def loss(self, predict, y):
        # Compute softmax loss between the output and the label.
        return layers.softmax_loss(predict, y)
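
To make the computation inside `forward` and `loss` concrete, here is a minimal sketch of the same affine, ReLU, affine, softmax-loss chain written in plain NumPy rather than `minpy.numpy`. The helper functions below are stand-ins written for this illustration; they assume that MinPy's `layers.affine`, `layers.relu`, and `layers.softmax_loss` follow the usual definitions (in particular, that the loss is the mean cross-entropy over the batch), which this notebook does not spell out.

```python
import numpy as np  # plain NumPy here, not minpy.numpy

def affine(x, w, b):
    """Fully connected layer: x has shape (N, D), w (D, H), b (H,)."""
    return x.dot(w) + b

def relu(x):
    """Element-wise max(x, 0)."""
    return np.maximum(x, 0)

def softmax_loss(scores, labels):
    """Mean cross-entropy between softmax(scores) and integer labels."""
    # Subtract the row-wise max for numerical stability.
    shifted = scores - scores.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    n = scores.shape[0]
    return -log_probs[np.arange(n), labels].mean()

rng = np.random.RandomState(0)
n, d, h, c = 4, 3 * 32 * 32, 512, 10

# A tiny random batch shaped like CIFAR-10 images, with random labels.
X = rng.randn(n, 3, 32, 32)
y = rng.randint(0, c, size=n)

# Small Gaussian weights and zero biases (an assumption for this sketch).
w1, b1 = 0.001 * rng.randn(d, h), np.zeros(h)
w2, b2 = 0.001 * rng.randn(h, c), np.zeros(c)

flat = X.reshape(X.shape[0], -1)                      # (4, 3072)
scores = affine(relu(affine(flat, w1, b1)), w2, b2)   # (4, 10)
loss = softmax_loss(scores, y)
print(scores.shape, loss)
```

With weights this small, the scores are close to zero, so the loss should be close to `log(10)`, the value for a uniform guess over ten classes. That makes a handy check before training starts.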

## Load Data
Follow these steps to load the CIFAR-10 data set and convert it to the format supported by MinPy's solver. Specify the location of the data set in `data_dir`.

In [None]:
# Create data iterators for training and testing sets.
data_dir = 'cifar'
data = get_CIFAR10_data(data_dir)
train_dataiter = NDArrayIter(
    data       = data['X_train'],
    label      = data['y_train'],
    batch_size = batch_size,
    shuffle    = True
)
test_dataiter = NDArrayIter(
    data       = data['X_test'],
    label      = data['y_test'],
    batch_size = batch_size,
    shuffle    = False
)
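
Conceptually, a data iterator slices the data set into mini-batches and optionally shuffles the sample order. The sketch below shows that idea in plain NumPy. `iterate_minibatches` is a hypothetical helper written for this illustration and is not how `NDArrayIter` is implemented; in particular, this sketch lets the final batch be smaller than `batch_size`, whereas the way `NDArrayIter` treats a final partial batch is not covered in this notebook.

```python
import numpy as np

def iterate_minibatches(data, label, batch_size, shuffle=False, seed=0):
    """Hypothetical helper: yield (data, label) mini-batches in order.

    The last batch may contain fewer than batch_size samples.
    """
    n = data.shape[0]
    order = np.arange(n)
    if shuffle:
        # Shuffle sample indices so that data and labels stay aligned.
        np.random.RandomState(seed).shuffle(order)
    for start in range(0, n, batch_size):
        idx = order[start:start + batch_size]
        yield data[idx], label[idx]

# Fake data shaped like CIFAR-10 images: 300 samples, 10 classes.
X = np.zeros((300, 3, 32, 32), dtype=np.float32)
y = np.arange(300) % 10

batches = list(iterate_minibatches(X, y, batch_size=128, shuffle=True))
print(len(batches))           # 3
print(batches[0][0].shape)    # (128, 3, 32, 32)
print(batches[-1][0].shape)   # (44, 3, 32, 32)
```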


## Start Training!
Create a model instance, specify the configuration of MinPy's solver, initialize the model parameters, and start training!

In [None]:
# Create model.
model = TwoLayerNet()
# Create solver.
solver = Solver(
    model,
    train_dataiter,
    test_dataiter,
    num_epochs   = 1,
    init_rule    = 'gaussian',
    init_config  = {'stdvar' : 0.001},
    update_rule  = 'sgd',
    optim_config = {'learning_rate': learning_rate},
    verbose      = True,
    print_every  = 20
)
# Initialize model parameters.
solver.init()
# Train!
solver.train()
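
For intuition about the `init_rule` and `update_rule` settings, the sketch below shows Gaussian initialization and one vanilla SGD step in plain NumPy. It is an illustration under assumptions, not MinPy's implementation: `init_gaussian` and `sgd_step` are hypothetical helpers, the `'stdvar'` value is treated here as a standard deviation, and `'sgd'` is taken to mean the plain update `param -= learning_rate * grad` without momentum.

```python
import numpy as np

def init_gaussian(shapes, std, seed=0):
    """Hypothetical helper: draw each parameter from N(0, std**2)."""
    rng = np.random.RandomState(seed)
    return {name: std * rng.randn(*shape) for name, shape in shapes.items()}

def sgd_step(params, grads, learning_rate):
    """Hypothetical helper: one in-place vanilla SGD update per parameter."""
    for name in params:
        params[name] -= learning_rate * grads[name]

# Tiny toy parameters; the real model has 'w1', 'b1', 'w2', 'b2'.
params = init_gaussian({'w': (3, 2), 'b': (2,)}, std=0.001)

# Pretend every gradient entry equals 1 to make the update easy to read.
grads = {name: np.ones_like(value) for name, value in params.items()}

before = {name: value.copy() for name, value in params.items()}
sgd_step(params, grads, learning_rate=1e-4)

# Each entry moved by exactly learning_rate * gradient = 1e-4.
print(before['w'][0, 0] - params['w'][0, 0])
```

In the real solver, the gradients come from automatic differentiation of the loss rather than being fixed by hand as they are here.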