# CSC321: HW3
**December 2017**

Problem set source: http://www.cs.toronto.edu/~rgrosse/courses/csc321_2017/


Solutions by: **Andrew Riberio @ [AndrewRib.com](http://www.andrewrib.com)**

**Note:** This notebook contains interactive elements and LaTeX snippets that do not render in GitHub Markdown.
To use the interactive elements, run the notebook in a local Jupyter environment. To render only the LaTeX, open this repo's URL in the [online NBViewer](https://nbviewer.jupyter.org/).


## Libraries

In [115]:
import numpy as np
import sympy as sp
from IPython.display import display

sp.init_printing(order='rev-lex',use_latex='mathjax')

## 3.1 Hard-Coding a Network

In [324]:
# Build a vectorized activation function from a scalar function
def newActivationFn(fn):
    actFn = np.vectorize(fn)
    return lambda vect: np.apply_along_axis(actFn,0,vect)

Φ = newActivationFn( lambda z: 1 if z >= 0 else 0 )

def networkOut(X,W_1,b_1,w_2,b_2,debug=False,activationFn=Φ):
    
    if(debug):
        # X can hold any number of rows (one per example); the symbol shows a single row for printing.
        X_s   = sp.MatrixSymbol("X"  ,1,4)
        W_1_s = sp.MatrixSymbol("W_1",3,4)
        b_1_s = sp.MatrixSymbol("b_1",1,3)
        w_2_s = sp.MatrixSymbol("w_2",1,3)
        b_2_s = sp.MatrixSymbol("b_2",1,1)
        h     = sp.MatrixSymbol("h",1,3)
        phi   = sp.Function(r'\phi')
        
        display(X_s)
        display(sp.Matrix(X))
        
        display(X_s* W_1_s.T)
        display(sp.Matrix(X*W_1.T))
        
        display(X_s* W_1_s.T+b_1_s)
        display(sp.Matrix(X*W_1.T+b_1))
        
        display(sp.Eq(h,phi(X_s* W_1_s.T+b_1_s)))
        display(sp.Matrix(activationFn(X*W_1.T + b_1)))
        
        display(w_2_s*h.T)
        display(sp.Matrix(w_2*activationFn(X*W_1.T + b_1).T))
        
        display(w_2_s*h.T + b_2_s)
        display(sp.Matrix(w_2*activationFn(X*W_1.T + b_1).T  + b_2))
        
        display(phi(w_2_s*h.T + b_2_s))
        display(sp.Matrix(activationFn(w_2*activationFn(X*W_1.T + b_1).T  + b_2)))
        
    else:
        h = activationFn(X*W_1.T + b_1)
        return activationFn(w_2*h.T + b_2)
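
The forward pass computed by `networkOut` is $h = \phi(X W_1^T + b_1)$ followed by $y = \phi(w_2 h^T + b_2)$. The same computation can be sketched with plain NumPy arrays and the `@` operator instead of `np.matrix`. This is only a minimal sketch, not the notebook's implementation: `step` and `forward` are illustrative names, and the weights are the ones used for the strictly increasing case in the next section.

```python
import numpy as np

def step(z):
    """Hard threshold (illustrative helper): 1 where z >= 0, else 0."""
    return (np.asarray(z) >= 0).astype(int)

def forward(X, W_1, b_1, w_2, b_2):
    """Forward pass for a batch X of shape (N, 4)."""
    h = step(X @ W_1.T + b_1)   # hidden activations, shape (N, 3)
    return step(h @ w_2 + b_2)  # one output per example, shape (N,)

X   = np.array([[1, 2, 3, 4], [4, 3, 2, 1]])
W_1 = np.array([[-1, 1, 0, 0], [0, -1, 1, 0], [0, 0, -1, 1]])
b_1 = np.array([-1, -1, -1])
w_2 = np.array([1, 1, 1])
b_2 = -3

y = forward(X, W_1, b_1, w_2, b_2)
print(y)
```

Each row of `X` is one example, so the hidden layer has one row per example and the output has one entry per example.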

### Solution for $x_1<x_2<x_3<x_4 \mid x_i \in \mathbb{R}$

In [325]:
W_1 = np.matrix([[-1,1,0,0],[0,-1,1,0],[0,0,-1,1]])
b_1 = np.matrix([[-1,-1,-1]])
w_2 = np.matrix([[1,1,1]])
b_2 = -3

In [326]:
examples = np.matrix([[0,1,2,3],[1,2,3,4],[1,1,2,2],[1,5,9,10],[-4,-3,-2,-1],[-3,-4,-1,0]])

In [327]:
networkOut(examples,W_1,b_1,w_2,b_2)

matrix([[1, 1, 0, 1, 1, 0]])
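
With $b_1 = -1$, hidden unit $i$ fires only when $x_{i+1} - x_i \geq 1$, so the weights above decide strict ordering exactly for integer inputs but reject ordered real-valued rows whose gaps are smaller than 1, such as $(0.1, 0.2, 0.3, 0.4)$. One possible variant for real inputs, which is an assumption of this sketch and not part of the original solution, lets each hidden unit detect a violation $x_i \geq x_{i+1}$ and lets the output fire only when no violation is present. The names `step`, `strictly_increasing`, and the `*_real` weights are hypothetical.

```python
import numpy as np

def step(z):
    """Hard threshold (illustrative helper): 1 where z >= 0, else 0."""
    return (np.asarray(z) >= 0).astype(int)

# Hypothetical alternative weights: hidden unit i fires on a violation x_i >= x_{i+1}.
W_1_real = np.array([[1, -1, 0, 0], [0, 1, -1, 0], [0, 0, 1, -1]])
b_1_real = np.zeros(3)
w_2_real = np.array([-1, -1, -1])  # any violation drives the output pre-activation below 0
b_2_real = 0

def strictly_increasing(X):
    h = step(X @ W_1_real.T + b_1_real)
    return step(h @ w_2_real + b_2_real)

X = np.array([[0.1, 0.2, 0.3, 0.4],
              [1, 1, 2, 2],
              [1, 2, 3, 4],
              [-3, -4, -1, 0]])
out = strictly_increasing(X)
print(out)
```

The output unit here acts as a NOR over the hidden units rather than an AND, which is what allows a strict inequality to be expressed with a threshold that fires at zero.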

### Solution for $x_1\leq x_2 \leq x_3 \leq x_4 \mid x_i \in \mathbb{R}$
To make the solution accept ties (less than or equal), we set the hidden-layer bias to zero, so that each hidden unit fires when $x_{i+1} - x_i \geq 0$.

In [212]:
W_1 = np.matrix([[-1,1,0,0],[0,-1,1,0],[0,0,-1,1]])
b_1 = np.matrix([[0,0,0]])
w_2 = np.matrix([[1,1,1]])
b_2 = -3

In [201]:
examples = np.matrix([[1,2,3,4],[1,1,2,2],[-3,-4,-1,0]])

In [202]:
networkOut(examples,W_1,b_1,w_2,b_2)

matrix([[1, 1, 0]])
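
The effect of the hidden bias on ties can be checked on a single row. The sketch below uses the notebook's `W_1` but an illustrative `step` helper, and compares the hidden activations for the row $(1, 1, 2, 2)$ under $b_1 = -1$ and $b_1 = 0$.

```python
import numpy as np

def step(z):
    """Hard threshold (illustrative helper): 1 where z >= 0, else 0."""
    return (np.asarray(z) >= 0).astype(int)

W_1 = np.array([[-1, 1, 0, 0], [0, -1, 1, 0], [0, 0, -1, 1]])
row = np.array([1, 1, 2, 2])  # contains two ties

diffs    = W_1 @ row          # [x2 - x1, x3 - x2, x4 - x3]
h_strict = step(diffs - 1)    # b_1 = -1: tied pairs do not fire
h_loose  = step(diffs)        # b_1 =  0: tied pairs fire

print(diffs, h_strict, h_loose)
```

Since $w_2 = (1, 1, 1)$ and $b_2 = -3$ make the output unit an AND over the three hidden units, the row is accepted only in the zero-bias case, where all three units fire.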

### Solution for $x_1\geq x_2 \geq x_3 \geq x_4 \mid x_i \in \mathbb{R}$

In [295]:
W_1 = np.matrix([[1,-1,0,0],[0,1,-1,0],[0,0,1,-1]])
b_1 = np.matrix([[0,0,0]])
w_2 = np.matrix([[1,1,1]])
b_2 = -3

In [247]:
examples = np.matrix([[1,2,3,4],[1,1,2,2],[-3,-4,-1,0],[4,3,2,1]])

In [248]:
networkOut(examples,W_1,b_1,w_2,b_2)

matrix([[0, 0, 0, 1]])

### Solution for $(x_1 + x_2) \lt (x_3 + x_4) \mid x_i \in \mathbb{R}$

In [291]:
W_1 = np.matrix([[-1,-1,1,1],[0,0,0,0],[0,0,0,0]])
b_1 = np.matrix([[-1,-1,-1]])
w_2 = np.matrix([[1,1,1]])
b_2 = -1

In [293]:
examples = np.matrix([[3,2,1,6],[2,1,3,4],[-3,-2,-1,-6],[-1,-6,-3,-2],[6,2,10,1],[10,1,2,6],[10,-2,1,10]])

In [294]:
networkOut(examples,W_1,b_1,w_2,b_2)

matrix([[1, 1, 0, 1, 1, 0, 1]])
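
In this network only the first hidden unit carries information: the other two rows of `W_1` are zero, so with bias $-1$ those units never fire, and with $b_2 = -1$ the output simply copies the first hidden unit. The sketch below, which uses an illustrative `step` helper, reproduces the result with that single unit. As in the strictly increasing case, the $-1$ bias means the unit fires when $(x_3 + x_4) - (x_1 + x_2) \geq 1$, which matches the strict inequality for integer inputs.

```python
import numpy as np

def step(z):
    """Hard threshold (illustrative helper): 1 where z >= 0, else 0."""
    return (np.asarray(z) >= 0).astype(int)

examples = np.array([[3, 2, 1, 6], [2, 1, 3, 4], [-3, -2, -1, -6], [-1, -6, -3, -2],
                     [6, 2, 10, 1], [10, 1, 2, 6], [10, -2, 1, 10]])

w = np.array([-1, -1, 1, 1])  # computes (x3 + x4) - (x1 + x2)
gap = examples @ w
single_unit = step(gap - 1)   # fires when the gap is at least 1

print(gap)
print(single_unit)
```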

### Solution for consecutive squares $x_1<x_2<x_3<x_4 \mid x_i \in \{n^2 \mid n \in \mathbb{N}\}$
The question becomes: can we make a neuron compute a less-than and check for a square?

In [466]:
W_1 = np.matrix([[1,-2,1,0],[0,1,-2,1],[1,-1,-1,1]])
b_1 = np.matrix([[-2,-2,-4]])
w_2 = np.matrix([[1,1,1]])
b_2 = -3

In [471]:
#examples = np.matrix([[1,4,9,16],[1,2,3,4]])
#examples = np.matrix([[1,4,9,16]])
#examples = np.matrix([[9,16,25,36]])
#examples = np.matrix([[10**2,11**2,12**2,13**2]])
#examples = np.matrix([[1,4,9,16]])
examples = np.matrix([[1,4,9,16],[9,16,25,36],[49,64,81,100],[1,2,3,4],[81,49,64,100],[16,9,4,1],[9,16,81,100],
                      [10**2,11**2,12**2,13**2]])


In [483]:
res = networkOut(examples,W_1,b_1,w_2,b_2,False)

sp.Matrix( np.hstack( (examples,res.T) ) )

⎡ 1    4    9   16   1⎤
⎢                     ⎥
⎢ 9   16   25   36   1⎥
⎢                     ⎥
⎢49   64   81   100  1⎥
⎢                     ⎥
⎢ 1    2    3    4   0⎥
⎢                     ⎥
⎢81   49   64   100  1⎥
⎢                     ⎥
⎢16    9    4    1   1⎥
⎢                     ⎥
⎢ 9   16   81   100  0⎥
⎢                     ⎥
⎣100  121  144  169  1⎦
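
The weights above rely on difference identities for consecutive squares $n^2, (n+1)^2, (n+2)^2, (n+3)^2$: the second differences $x_1 - 2x_2 + x_3$ and $x_2 - 2x_3 + x_4$ both equal 2, and $x_1 - x_2 - x_3 + x_4$ equals 4, so with $b_1 = (-2, -2, -4)$ every hidden pre-activation is exactly 0 on such rows. Because the threshold tests $\geq 0$ rather than equality, other rows can pass as well. The sketch below prints the hidden pre-activations for a few rows from the table, which shows why $(81, 49, 64, 100)$ and the reversed squares $(16, 9, 4, 1)$ are labelled 1 even though they are not increasing consecutive squares.

```python
import numpy as np

W_1 = np.array([[1, -2, 1, 0], [0, 1, -2, 1], [1, -1, -1, 1]])
b_1 = np.array([-2, -2, -4])

rows = np.array([[1, 4, 9, 16],
                 [100, 121, 144, 169],
                 [81, 49, 64, 100],
                 [16, 9, 4, 1],
                 [1, 2, 3, 4]])

# Hidden pre-activations: a unit fires when its entry is >= 0.
pre = rows @ W_1.T + b_1
print(pre)
```

The reversed row gives all zeros because these difference expressions are unchanged when the order of the four inputs is reversed, so this network alone does not enforce $x_1 < x_2 < x_3 < x_4$.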