# Intro - Tutorial 1
## Basics of Zenkai

This tutorial shows how to build a basic learning machine in Zenkai that implements backprop, then expands on it by adding loops.

In [None]:
import initialize

# Steps

1) Create a standard network trained with backprop
2) Create the STENetwork, which uses the straight-through estimator
3) Create an alternative network that loops within each layer

In [None]:

from tools import training, evaluate
from tools.learners import intro

from torchvision.datasets import FashionMNIST
from torchvision import transforms



In [None]:
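# Convert images to tensors in [0, 1], then normalize them to [-1, 1]
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,)),
])

# Training split (train=True is the torchvision default, stated here for clarity)
training_dataset = FashionMNIST(
    '../../Datasets/', train=True,
    transform=transform, download=True
)

# Held-out test split
testing_dataset = FashionMNIST(
    '../../Datasets/', train=False,
    transform=transform, download=True
)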

## Standard network

This network implements ordinary backprop training with Zenkai, purely for demonstration. In practice this only makes sense when you want to connect a "standard" layer to other types of layers.

In [None]:
network = intro.Network(
    784, 64, 32, 10
)

losses = training.train(network, training_dataset, 40, batch_size=128)

classification = training.classify(network, testing_dataset)

print(classification)

In [None]:
training.plot_loss_line(
    [losses], ['Network'], 'Training Loss', save_file='images/t1x1_standard_network.png'
)

## Straight-through layer network

This network is trained with the straight-through estimator (STE). The STE makes it possible to train through hard activation functions by passing the gradient "straight through" the hard activation function on the backward pass. This can also be implemented directly with PyTorch.
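As a rough illustration of the straight-through idea in plain PyTorch, the sketch below defines a sign activation whose backward pass returns the incoming gradient as-is. The class name `SignSTE` and the choice of `torch.sign` as the hard activation are assumptions made for this example; the internals of `intro.STENetwork` are not shown in this tutorial and may differ.

```python
import torch


class SignSTE(torch.autograd.Function):
    """Hypothetical example: sign activation with a straight-through backward pass."""

    @staticmethod
    def forward(ctx, x):
        # Hard, non-differentiable activation on the forward pass
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Straight-through: hand the incoming gradient back unchanged
        return grad_output


x = torch.tensor([-0.5, 0.3, 2.0], requires_grad=True)
y = SignSTE.apply(x)
y.sum().backward()

print(y)       # hard outputs: -1, 1, 1
print(x.grad)  # gradient of ones, as if the activation were the identity
```

Without the custom backward, the gradient of `torch.sign` would be zero everywhere, so no learning signal would reach the layers below the activation.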

In [None]:
network = intro.STENetwork(
    784, 64, 32, 10
)

losses = training.train(network, training_dataset, 40, batch_size=128)

classification = training.classify(network, testing_dataset)
print(classification)

In [None]:
training.plot_loss_line(
    [losses], ['STENetwork'], 'Training Loss', save_file='images/t1x1_ste_network.png'
)

## Loop network

This network uses layers that loop inside both step() and step_x(). Each layer executes step() before it executes step_x().

To update parameters, each layer:

- divides the minibatch into sub-minibatches
- runs multiple epochs over them

To compute the targets for the previous layer, it runs multiple loops.

You probably would not want to implement this for every layer. In some cases, though, you may wish to train the whole network on one large minibatch in a single pass, and then divide that minibatch into sub-minibatches for some of the layers.
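The sub-minibatch looping used for the parameter update can be sketched in plain Python. This is only an illustration of the index bookkeeping: the helper name `sub_minibatch_indices`, the sub-minibatch size of 128, and the two passes are assumptions for the example, not values taken from `intro.LoopNetwork`.

```python
import random


def sub_minibatch_indices(n_items, sub_batch_size, n_epochs, seed=0):
    """Hypothetical helper: yield index lists covering n_items once per epoch."""
    rng = random.Random(seed)
    for _ in range(n_epochs):
        order = list(range(n_items))
        rng.shuffle(order)  # reshuffle the minibatch on every pass
        for start in range(0, n_items, sub_batch_size):
            yield order[start:start + sub_batch_size]


# One minibatch of 2048 items, split into sub-minibatches of 128, for 2 passes
batches = list(sub_minibatch_indices(2048, 128, n_epochs=2))
print(len(batches))  # 32
```

Inside a layer, each yielded index list would select the rows of the inputs and targets used for one parameter update, so a single large minibatch produces many smaller updates for that layer.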


In [None]:
network = intro.LoopNetwork(
    784, 64, 32, 10
)

losses = training.train(network, training_dataset, 40, batch_size=2048)

classification = training.classify(network, testing_dataset)

print(classification)

In [None]:
training.plot_loss_line(
    [losses], ['Loop Network'], 'Training Loss', save_file='images/t1x1_loop_network.png'
)