Logs   
- [2023/03/08]   
  Restart this notebook if you change the scratch library

- [2024/04/15]   
  You do not need to restart this notebook when updating the scratch library

In [1]:
import operator

from typing import List, Callable, Iterable, Tuple
from scratch.neural_networks import NeuralNetworks as nn



In [2]:
%load_ext autoreload
%autoreload 2 


Deep learning is a technique that uses neural networks with many layers
to solve a wide range of problems, both supervised and unsupervised.

To do deep learning, we need a few data-structure abstractions.

## The Tensor

The `Tensor` type here is a shortcut: it is just a Python `list`.  
We do this for practical reasons, so that we can learn the concepts first.   
More general tensor data types are provided by popular libraries such as TensorFlow and PyTorch.

Here we treat the Tensor data type as a (possibly nested) list. Strictly speaking,
in mathematical terms an $n$-dimensional array *is not*
a tensor.

In [3]:
Tensor = list

In [4]:
# Find a tensor's shape:
def shape(tensor: Tensor) -> List[int]:
  sizes: List[int] = []
  while isinstance(tensor, list):
    sizes.append(len(tensor))
    tensor = tensor[0]    # descend into the first element and repeat for the deeper levels
  return sizes

assert shape([1, 2, 3]) == [3]
assert shape([[1, 2], [3, 4], [5, 6]]) == [3, 2]
assert shape([[[1, 2], [2, 3], [4, 5]],
       [[6, 7], [8, 9], [10, 11]]]) == [2, 3, 2]
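Note that `shape` only inspects the first element at each level, so it assumes the nested list is rectangular. For a ragged input such as `[[1, 2], [3]]` it reports `[2, 2]`. A minimal sketch of a stricter variant follows; `checked_shape` is a hypothetical name introduced here for illustration and is not part of the scratch library.

```python
from typing import List, Optional

Tensor = list

def checked_shape(tensor) -> Optional[List[int]]:
    """Hypothetical helper: return the shape of a nested list,
    or None if sub-tensors at the same depth have different shapes."""
    if not isinstance(tensor, list):
        return []                       # a scalar has an empty shape
    sub_shapes = [checked_shape(t) for t in tensor]
    if any(s is None for s in sub_shapes):
        return None                     # raggedness found deeper down
    if any(s != sub_shapes[0] for s in sub_shapes):
        return None                     # siblings disagree on shape
    return [len(tensor)] + (sub_shapes[0] if sub_shapes else [])

assert checked_shape([1, 2, 3]) == [3]
assert checked_shape([[1, 2], [3, 4], [5, 6]]) == [3, 2]
assert checked_shape([[1, 2], [3]]) is None
```

Unlike `shape`, this version visits every element, so it costs time proportional to the total number of entries.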

We implement each tensor function for the 1-d case first;    
the generalization to more than one dimension can then be implemented recursively.


In [5]:
def is_1d(tensor: Tensor) -> bool:
  """If tensor[0] is a list, it's a higher-order tensor.
     Otherwise, tensor is 1-dimensional (that is, a vector).""" 
  return not isinstance(tensor[0], list)

assert is_1d([1, 2, 3])
assert not is_1d([[1, 2], [3, 4]])

Recursive `tensor_sum` function

In [6]:
def tensor_sum(tensor: Tensor) -> float:
  """Sums up all the values in the tensor"""
  if is_1d(tensor):
    return sum(tensor)      # just a list of floats, use Python sum
  else:
    return sum(tensor_sum(tensor_i)     # Call tensor_sum on each row
                for tensor_i in tensor) # and sum up those results.

assert tensor_sum([1, 2, 3]) == 6
assert tensor_sum([[1, 2], [3, 4]]) == 10
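The same recursive pattern works at any depth and for other reductions. The sketch below repeats the two helpers so that it runs on its own, then adds `tensor_count` and `tensor_mean`, which are hypothetical names introduced here for illustration.

```python
Tensor = list

def is_1d(tensor: Tensor) -> bool:
    return not isinstance(tensor[0], list)

def tensor_sum(tensor: Tensor) -> float:
    if is_1d(tensor):
        return sum(tensor)
    return sum(tensor_sum(tensor_i) for tensor_i in tensor)

def tensor_count(tensor: Tensor) -> int:
    """Hypothetical helper: number of scalar entries in the tensor."""
    if is_1d(tensor):
        return len(tensor)
    return sum(tensor_count(tensor_i) for tensor_i in tensor)

def tensor_mean(tensor: Tensor) -> float:
    """Hypothetical helper: average over all entries."""
    return tensor_sum(tensor) / tensor_count(tensor)

# A 2 x 2 x 2 tensor: the recursion descends two levels before summing.
cube = [[[1, 2], [3, 4]],
        [[5, 6], [7, 8]]]
assert tensor_sum(cube) == 36
assert tensor_count(cube) == 8
assert tensor_mean(cube) == 4.5
```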

Recursive function to apply a function to a tensor

In [7]:
def tensor_apply(f: Callable[[float], float], tensor: Tensor) -> Tensor:
  """Applies f elementwise"""
  if is_1d(tensor):
    return [f(x) for x in tensor]
  else:
    return [tensor_apply(f, tensor_i) for tensor_i in tensor]

assert tensor_apply(lambda x: x + 1, [1, 2, 3]) == [2, 3, 4]
assert tensor_apply(lambda x: 2*x, [[1, 2], [3, 4]]) == [[2, 4], [6, 8]]
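As an illustration of why an elementwise helper is useful, the sketch below applies an activation function to every entry of a 2-d tensor. The `relu` function is an assumption added here as an example; the helpers are repeated so the block runs on its own.

```python
from typing import Callable

Tensor = list

def is_1d(tensor: Tensor) -> bool:
    return not isinstance(tensor[0], list)

def tensor_apply(f: Callable[[float], float], tensor: Tensor) -> Tensor:
    if is_1d(tensor):
        return [f(x) for x in tensor]
    return [tensor_apply(f, tensor_i) for tensor_i in tensor]

def relu(x: float) -> float:
    """Example activation: keep positive values, replace the rest with 0."""
    return max(0.0, x)

inputs = [[-1.0, 2.0], [0.5, -3.0]]
activations = tensor_apply(relu, inputs)

# The result has the same nesting as the input, and the input is unchanged.
assert activations == [[0.0, 2.0], [0.5, 0.0]]
assert inputs == [[-1.0, 2.0], [0.5, -3.0]]
```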

Use the above `tensor_apply` to create a zero tensor with the same shape as   
a given tensor

In [8]:
def zeros_like(tensor: Tensor) -> Tensor:
  return tensor_apply(lambda _: 0.0, tensor)

assert zeros_like([1, 2, 3]) == [0, 0, 0]
assert zeros_like([[1, 2], [3, 4]]) == [[0, 0], [0, 0]]
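One reason to build the zero tensor through `tensor_apply` rather than list multiplication is aliasing: `[[0.0] * 2] * 2` repeats the *same* inner list, while `zeros_like` creates a fresh list for every row. A small sketch of the difference, with the helpers repeated so it runs standalone:

```python
Tensor = list

def is_1d(tensor: Tensor) -> bool:
    return not isinstance(tensor[0], list)

def tensor_apply(f, tensor: Tensor) -> Tensor:
    if is_1d(tensor):
        return [f(x) for x in tensor]
    return [tensor_apply(f, tensor_i) for tensor_i in tensor]

def zeros_like(tensor: Tensor) -> Tensor:
    return tensor_apply(lambda _: 0.0, tensor)

# List multiplication shares one row object between both rows.
aliased = [[0.0] * 2] * 2
aliased[0][0] = 1.0
assert aliased == [[1.0, 0.0], [1.0, 0.0]]

# zeros_like builds independent rows, so only one entry changes.
fresh = zeros_like([[1, 2], [3, 4]])
fresh[0][0] = 1.0
assert fresh == [[1.0, 0.0], [0.0, 0.0]]
```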

Element-wise operations on two tensors

In [9]:
def tensor_combine(f: Callable[[float, float], float],
                    t1: Tensor, t2: Tensor) -> Tensor:
  """Applies f to corresponding elements of t1 and t2"""
  if is_1d(t1):
    return [f(x, y) for x, y in zip(t1, t2)]
  else:
    return [tensor_combine(f, t1_i, t2_i) for t1_i, t2_i in zip(t1, t2)]

assert tensor_combine(operator.add, [1, 2, 3], [4, 5, 6]) == [5, 7, 9]
assert tensor_combine(operator.mul, [1, 2, 3], [4, 5, 6]) == [4, 10, 18]
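The asserts above only exercise the 1-d branch. The sketch below also covers the recursive branch with 2-d tensors, and shows one plausible use: a single gradient-descent step on a weight tensor. The values of `weights`, `grads`, and `learning_rate` are made up for illustration.

```python
import operator
from typing import Callable

Tensor = list

def is_1d(tensor: Tensor) -> bool:
    return not isinstance(tensor[0], list)

def tensor_combine(f: Callable[[float, float], float],
                   t1: Tensor, t2: Tensor) -> Tensor:
    if is_1d(t1):
        return [f(x, y) for x, y in zip(t1, t2)]
    return [tensor_combine(f, t1_i, t2_i) for t1_i, t2_i in zip(t1, t2)]

# 2-d inputs go through the recursive branch.
assert tensor_combine(operator.add,
                      [[1, 2], [3, 4]],
                      [[10, 20], [30, 40]]) == [[11, 22], [33, 44]]

# Illustrative gradient step: new_weight = weight - learning_rate * gradient
weights = [[1.0, 2.0], [3.0, 4.0]]
grads = [[2.0, -2.0], [4.0, 0.0]]
learning_rate = 0.5
updated = tensor_combine(lambda w, g: w - learning_rate * g, weights, grads)
assert updated == [[0.0, 3.0], [1.0, 4.0]]
```

Because `zip` stops at the shorter input, mismatched shapes are truncated silently rather than raising an error, so the caller is responsible for passing tensors of equal shape.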

## The Layer Abstraction

This `Layer` class defines an abstraction from which specific layers are derived.    
A layer is a function that performs multidimensional array operations.

In [10]:
class Layer:
  """Our neural networks will be composed of Layers, each of which knows how to do  
  some computation on its inputs in the "forward" direction and propagate
  gradients in the "backward" direction"""

  def forward(self, input):
    """Note the lack of types. We're not going to be prescriptive about what kinds
    of inputs layers can take and what kinds of outputs they can return"""
    raise NotImplementedError

  def backward(self, gradient):
    """Similarly, we're not going to be prescriptive about what the gradient 
    looks like. It's up to you the user to make sure that you're doing things
    sensibly"""
    raise NotImplementedError

  def params(self) -> Iterable[Tensor]:
    """Returns the parameters of this layer. The default implementation returns
    nothing, so that if you have a layer with no parameters you don't have to 
    implement this."""
    return ()

  def grads(self) -> Iterable[Tensor]:
    """Returns the gradients, in the same order as params()"""
    return ()

The `Layer` class above is an abstraction; specific layers are defined   
by inheriting from it. We call `Layer` the parent  
class, and every specific layer class is derived from it.

In each specific layer, we can update the parameters (returned by `params`) of our  
network using their gradients. We can also ask each specific layer for its  
parameters and gradients.
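To make the contract concrete, here is a minimal sketch of a subclass with one learnable parameter. `Scale` is a hypothetical layer invented for this illustration (it is not from the scratch library), and the step size `0.5` is arbitrary. The base class is repeated in shortened form so the block runs on its own.

```python
from typing import Iterable

Tensor = list

class Layer:
    def forward(self, input):
        raise NotImplementedError

    def backward(self, gradient):
        raise NotImplementedError

    def params(self) -> Iterable[Tensor]:
        return ()

    def grads(self) -> Iterable[Tensor]:
        return ()

class Scale(Layer):
    """Hypothetical layer: multiplies a 1-d input by one learnable factor."""
    def __init__(self, factor: float) -> None:
        self.w = [factor]      # the parameter, stored as a 1-d tensor
        self.w_grad = [0.0]    # its gradient, same shape

    def forward(self, input: Tensor) -> Tensor:
        self.input = input     # saved for use in backward
        return [self.w[0] * x for x in input]

    def backward(self, gradient: Tensor) -> Tensor:
        # d(out_i)/d(w) = x_i, so the parameter gradient is sum(x_i * grad_i)
        self.w_grad = [sum(x * g for x, g in zip(self.input, gradient))]
        # d(out_i)/d(x_i) = w, so the input gradient is w * grad_i
        return [self.w[0] * g for g in gradient]

    def params(self) -> Iterable[Tensor]:
        return [self.w]

    def grads(self) -> Iterable[Tensor]:
        return [self.w_grad]

layer = Scale(2.0)
assert layer.forward([1.0, 2.0, 3.0]) == [2.0, 4.0, 6.0]
assert layer.backward([1.0, 1.0, 1.0]) == [2.0, 2.0, 2.0]
assert list(layer.grads()) == [[6.0]]

# One in-place update step, pairing each parameter with its gradient.
for param, grad in zip(layer.params(), layer.grads()):
    param[:] = [p - 0.5 * g for p, g in zip(param, grad)]
assert layer.w == [-1.0]
```

Returning the parameter lists themselves (rather than copies) from `params` is what lets the loop at the end modify the layer in place.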

Let us define a specific layer, the `Sigmoid` class:

In [11]:
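import math

# Assumption: `sigmoid` is not imported in the cells above, so we define it here.
def sigmoid(t: float) -> float:
  return 1 / (1 + math.exp(-t))

class Sigmoid(Layer):
  def forward(self, input: Tensor) -> Tensor:
    """Apply sigmoid to each element of the input tensor, and save the results 
    to use in backpropagation."""
    self.sigmoids = tensor_apply(sigmoid, input)
    return self.sigmoids

  def backward(self, gradient: Tensor) -> Tensor:
    # d(sigmoid)/dx = sigmoid * (1 - sigmoid), chained with the incoming gradient
    return tensor_combine(lambda sig, grad: sig * (1 - sig) * grad,
                          self.sigmoids, gradient)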

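A quick way to sanity-check a layer such as `Sigmoid` is to compare what `backward` returns against a finite-difference estimate of the derivative. The sketch below is standalone: it repeats the tensor helpers, writes out `sigmoid` locally (an assumption, since the notebook does not show where it is imported from), and omits the `Layer` base class for brevity.

```python
import math

Tensor = list

def sigmoid(t: float) -> float:
    # Written out locally as an assumption; not taken from the scratch library.
    return 1 / (1 + math.exp(-t))

def tensor_apply(f, tensor: Tensor) -> Tensor:
    if not isinstance(tensor[0], list):
        return [f(x) for x in tensor]
    return [tensor_apply(f, t_i) for t_i in tensor]

def tensor_combine(f, t1: Tensor, t2: Tensor) -> Tensor:
    if not isinstance(t1[0], list):
        return [f(x, y) for x, y in zip(t1, t2)]
    return [tensor_combine(f, a, b) for a, b in zip(t1, t2)]

class Sigmoid:
    def forward(self, input: Tensor) -> Tensor:
        self.sigmoids = tensor_apply(sigmoid, input)
        return self.sigmoids

    def backward(self, gradient: Tensor) -> Tensor:
        return tensor_combine(lambda sig, grad: sig * (1 - sig) * grad,
                              self.sigmoids, gradient)

layer = Sigmoid()

# At 0 the sigmoid is 0.5 and its derivative is 0.5 * (1 - 0.5) = 0.25.
assert layer.forward([0.0]) == [0.5]
assert layer.backward([1.0]) == [0.25]

# Central finite difference at an arbitrary point x.
x, h = 0.3, 1e-5
numeric = (sigmoid(x + h) - sigmoid(x - h)) / (2 * h)
layer.forward([x])
analytic = layer.backward([1.0])[0]
assert abs(numeric - analytic) < 1e-6
```

Passing a gradient of `1.0` into `backward` isolates the layer's own derivative, which is what the finite difference approximates.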

## The Linear Layer

## Neural Networks as a Sequence of Layers

## Loss and Optimization

## Example: XOR Revisited

## Other Activation Functions

## Example: FizzBuzz Revisited

## Softmaxes and Cross-Entropy

## Dropout

## Example: MNIST

## Saving and Loading Models