## Ultimate goal : Thinking Machine

### Where to start : Biological Neuron

<img src="https://www.simplilearn.com/ice9/free_resources_article_thumb/diagram-of-a-biological-neuron.jpg" width="500">

### Mathematical neuron model 
<img src="http://bit.ly/2ldH0Bg" width="450">

### Logistic regression units.

<img src="./logistic-units.png" width="300">
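
A logistic unit computes a weighted sum of its inputs plus a bias and squashes the result with the sigmoid function. The sketch below is a minimal illustration of that idea; the names `sigmoid` and `logistic_unit` and the example weights are assumptions chosen here, not values taken from the figure.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def logistic_unit(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through the sigmoid."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)

# Hypothetical weights and bias, chosen only for illustration.
activation = logistic_unit([1.0, 2.0], weights=[0.5, -0.25], bias=0.0)
print(activation)  # z = 0.5*1.0 - 0.25*2.0 + 0.0 = 0.0, so the output is 0.5
```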

### Hardware implementations
Frank Rosenblatt, 1957: Mark 1 perceptron
<img src="https://upload.wikimedia.org/wikipedia/en/thumb/5/52/Mark_I_perceptron.jpeg/220px-Mark_I_perceptron.jpeg" width="200">

### False Promises
“The Navy revealed the embryo of an electronic computer today that it expects will be able to walk, talk, see, write, reproduce itself and be conscious of its existence … Dr. Frank Rosenblatt, a research psychologist at the Cornell Aeronautical Laboratory, Buffalo, said Perceptrons might be fired to the planets as mechanical space explorers” - New York Times; July 08, 1958


### XOR problem: linearly separable?

<img src="https://qph.fs.quoracdn.net/main-qimg-a6c557af4280d1f85cacc66e048e82f3">
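
The answer to the question in the heading is no: a single linear threshold unit can represent AND and OR, but not XOR. The sketch below searches a small grid of weights for each function. The grid, the helper names, and the threshold convention are assumptions made for illustration, and a grid search is only a sanity check, not a proof.

```python
from itertools import product

INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]
TARGETS = {
    "AND": [0, 0, 0, 1],
    "OR": [0, 1, 1, 1],
    "XOR": [0, 1, 1, 0],
}

def threshold_unit(x1, x2, w1, w2, b):
    """Single linear unit with a hard threshold at zero."""
    return 1 if w1 * x1 + w2 * x2 + b > 0 else 0

def find_weights(targets, grid):
    """Return the first (w1, w2, b) on the grid that reproduces targets, else None."""
    for w1, w2, b in product(grid, repeat=3):
        outputs = [threshold_unit(x1, x2, w1, w2, b) for x1, x2 in INPUTS]
        if outputs == targets:
            return (w1, w2, b)
    return None

grid = [i / 2 for i in range(-8, 9)]  # -4.0, -3.5, ..., 4.0
results = {name: find_weights(t, grid) for name, t in TARGETS.items()}
for name, weights in results.items():
    print(name, weights)  # AND and OR find weights; XOR prints None
```

The search fails for XOR for a reason that holds for all weights, not just this grid: XOR needs `b <= 0`, `w1 + b > 0`, `w2 + b > 0`, and `w1 + w2 + b <= 0`. Adding the two middle inequalities gives `w1 + w2 + 2b > 0`, so `w1 + w2 + b > -b >= 0`, which contradicts the last condition.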

### MLP can solve XOR problem
<img src="MLP.png" width="450">
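
One way to see that a hidden layer is enough is to wire a two-layer network by hand. The sketch below is an assumption-laden illustration: the weights are hand-picked (not learned, and not necessarily those in the figure), with one hidden unit acting as OR, one as NAND, and the output unit as AND.

```python
def step(z):
    """Hard threshold activation."""
    return 1 if z > 0 else 0

def xor_mlp(x1, x2):
    """Two hidden threshold units followed by one output unit."""
    h_or = step(x1 + x2 - 0.5)        # fires if at least one input is 1
    h_nand = step(-x1 - x2 + 1.5)     # fires unless both inputs are 1
    return step(h_or + h_nand - 1.5)  # fires only if both hidden units fire

table = {(x1, x2): xor_mlp(x1, x2) for x1 in (0, 1) for x2 in (0, 1)}
print(table)  # {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
```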

### Perceptrons (1969)
* We need to use MLPs (multilayer perceptrons)
* No one on earth had found a viable way to train MLPs good enough to learn such simple functions.

### 1st Winter (1969)
"No one on earth had found a viable way to train..." (Marvin Minsky 1969)

### Backpropagation(1986) 
(1974, 1982 by Paul Werbos, 1986 by Hinton)
<img src="https://i.stack.imgur.com/H1KsG.png" width="400">
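
At its core, backpropagation is the chain rule applied from the loss back to each parameter. The sketch below does this for a single sigmoid unit with a squared-error loss and checks the result against a finite-difference estimate. The loss, the helper names, and the numeric values are assumptions chosen for illustration; a real network repeats the same step layer by layer.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss(w, b, x, y):
    """Squared error of a single sigmoid unit on one example."""
    return 0.5 * (sigmoid(w * x + b) - y) ** 2

def backprop_gradients(w, b, x, y):
    """Chain rule: dL/dw = dL/da * da/dz * dz/dw, and likewise for b."""
    a = sigmoid(w * x + b)   # forward pass
    dL_da = a - y            # derivative of the squared error
    da_dz = a * (1.0 - a)    # derivative of the sigmoid
    delta = dL_da * da_dz    # error signal at the unit's input
    return delta * x, delta  # dz/dw = x, dz/db = 1

def numerical_gradient(f, value, eps=1e-6):
    """Central finite difference, used only as a sanity check."""
    return (f(value + eps) - f(value - eps)) / (2 * eps)

w, b, x, y = 0.8, -0.3, 1.5, 1.0  # hypothetical values
grad_w, grad_b = backprop_gradients(w, b, x, y)
num_w = numerical_gradient(lambda v: loss(v, b, x, y), w)
num_b = numerical_gradient(lambda v: loss(w, v, x, y), b)
print(grad_w, num_w)  # the two numbers should agree closely
print(grad_b, num_b)
```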

### CNN (Convolutional Neural Network)
(motivated by biological insights)

<img src="https://1569708099.rsc.cdn77.org/wp-content/uploads/2017/09/lenet-5-825x285.png?x31195">

### BIG Problem
* Backpropagation just did not work well for normal neural nets with many layers
* Other rising machine learning algorithms : SVM, RandomForest, etc.


### 2nd Winter (1995)
* Yann LeCun, 1995 paper "Comparison of Learning Algorithms for Handwritten Digit Recognition"
"New Machine Learning approach worked better"

### Breakthrough (2006, 2007) by Hinton and Bengio
* Neural networks with many layers really could be trained well, if the weights are initialized in a clever way rather than randomly. (By Hinton)
* Deep machine learning methods (that is, methods with many processing steps, or equivalently with hierarchical feature representations of the data) are more efficient for difficult problems than shallow methods (of which two-layer ANNs and support vector machines are examples). (By Bengio)
* Rebranding to **Deep Nets, Deep Learning**

### Deep Learning =
Lots of training data + Parallel Computation + Scalable, smart algorithms

<img src="http://3.bp.blogspot.com/-zQlQvmK9U9g/VT_Hk6yKlmI/AAAAAAAAODQ/nNNcpVM4UPM/s1600/bg_pipeline-01.png">

### IMAGENET - Large Scale Visual Recognition Challenge


<img src="https://thegradient.pub/content/images/2018/07/image_1.png" width="400">

<img src="https://www.researchgate.net/profile/Gustav_Von_Zitzewitz/publication/324476862/figure/fig7/AS:614545865310213@1523530560584/Winner-results-of-the-ImageNet-large-scale-visual-recognition-challenge-LSVRC-of-the_W640.jpg" width="450">

### Neural networks that can explain photos

<img src="https://gigaom.com/wp-content/uploads/sites/1/2014/11/googlernncnn-804x326.png" width="450">

from https://gigaom.com/2014/11/18/google-stanford-build-hybrid-neural-networks-that-can-explain-photos/

### Geoffrey Hinton’s summary of findings up to today
* Our labeled datasets were thousands of times too small.
* Our computers were millions of times too slow.
* We initialized the weights in a stupid way.
* We used the wrong type of non-linearity.
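
One common reading of the last point is that sigmoid units shrink gradients as they pass backward through many layers, while ReLU units do not. The sketch below compares only the activation slopes and ignores the weights entirely, so it is a simplified assumption rather than a full account; the depth of 10 is hypothetical.

```python
import math

def sigmoid_grad(z):
    """Slope of the sigmoid; it peaks at 0.25 when z = 0."""
    s = 1.0 / (1.0 + math.exp(-z))
    return s * (1.0 - s)

def relu_grad(z):
    """Slope of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if z > 0 else 0.0

depth = 10  # hypothetical number of layers

# Best case for the sigmoid: every pre-activation sits at z = 0.
sigmoid_factor = sigmoid_grad(0.0) ** depth
# For active ReLU units (z > 0) the slope is exactly 1.
relu_factor = relu_grad(1.0) ** depth

print(sigmoid_factor)  # 0.25 ** 10, roughly 9.5e-07
print(relu_factor)     # 1.0
```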


### Why should I care?

* Youtube.com : auto-generated caption & translation
* Facebook.com : 
* Google Search :
* Netflix : user oriented recommendation system
* Amazon Mall : 
* etc.


### Why Now?
* Students/Researchers
  - Not too late to be a world expert
  - Not too complicated (mathematically)
* Practitioner
  - Accurate enough to be used in practice
  - Many ready-to-use tools such as TensorFlow
  - Many easy/simple programming languages such as Python
* After all, it is fun!

## Lab 8 Tensor Manipulation


#### Simple 1D array and slicing

In [12]:
import numpy as np
import pprint as pp
t = np.array([0., 1., 2., 3., 4., 5., 6.])

pp.pprint(t)  
print(t.ndim)  # rank
print(t.shape) # shape
print(t[0], t[1], t[-1])
print(t[2:5], t[4:-1])
print(t[:2],t[3:])

array([0., 1., 2., 3., 4., 5., 6.])
1
(7,)
0.0 1.0 6.0
[2. 3. 4.] [4. 5.]
[0. 1.] [3. 4. 5. 6.]


In [15]:
t2 = np.array([[1., 2., 3.], [4., 5., 6.], [7., 8., 9.], [10., 11., 12.]])
pp.pprint(t2)
print(t2.ndim)
print(t2.shape)

array([[ 1.,  2.,  3.],
       [ 4.,  5.,  6.],
       [ 7.,  8.,  9.],
       [10., 11., 12.]])
2
(4, 3)
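
Slicing extends to 2D arrays with one index or slice per axis. The sketch below reuses the values of `t2`; the variable names are illustrative assumptions.

```python
import numpy as np

t2 = np.array([[1., 2., 3.], [4., 5., 6.], [7., 8., 9.], [10., 11., 12.]])

first_row = t2[0]            # integer index drops the row axis: shape (3,)
last_column = t2[:, -1]      # every row, last column: shape (4,)
middle_block = t2[1:3, :2]   # rows 1 and 2, columns 0 and 1: shape (2, 2)

print(first_row.shape, last_column.shape, middle_block.shape)  # (3,) (4,) (2, 2)
print(middle_block.tolist())  # [[4.0, 5.0], [7.0, 8.0]]
```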


#### Shape, Rank, Axis

In [51]:
import tensorflow as tf

t = tf.constant([1,2,3,4])
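print(t.shape)  # static shape, known when the graph is built
tf.shape(t)     # a tensor that yields the shape when run in a session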

(4,)


<tf.Tensor 'Shape_21:0' shape=(1,) dtype=int32>

In [50]:
t = tf.constant([[1,2],
                [3,4]])
print(t.shape)
tf.shape(t)

(2, 2)


<tf.Tensor 'Shape_20:0' shape=(2,) dtype=int32>

In [69]:
t = tf.constant([[ [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]] ,
               [[13,14,15,16], [17,18,19,20], [21, 22, 23, 24]] ]] )
print(t.shape)
tf.shape(t)

(1, 2, 3, 4)


<tf.Tensor 'Shape_36:0' shape=(4,) dtype=int32>
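
The rank is the number of axes, which equals the length of the shape; axis 0 is the outermost bracket and axis -1 the innermost. The sketch below uses NumPy as a stand-in for the same 4D values, which is an assumption made so it runs without a TensorFlow session.

```python
import numpy as np

t = np.array([[[[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]],
               [[13, 14, 15, 16], [17, 18, 19, 20], [21, 22, 23, 24]]]])

rank = t.ndim    # number of axes
shape = t.shape  # size along each axis, outermost first
print(rank, shape)  # 4 (1, 2, 3, 4)

# Reducing along an axis removes that axis from the shape.
sum_last = t.sum(axis=-1)   # innermost axis removed: shape (1, 2, 3)
sum_first = t.sum(axis=0)   # outermost axis removed: shape (2, 3, 4)
print(sum_last.shape, sum_first.shape)
```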

In [78]:
matrix1 = tf.constant([[1., 2.], [3., 4.]])
matrix2 = tf.constant([[1.], [2.]])
print("Matrix 1 shape : ", matrix1.shape)
print("Matrix 2 shape : ", matrix2.shape)

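e = tf.matmul(matrix1, matrix2)  # true matrix product, shape (2, 1)
sess = tf.Session()
sess.run(e)  # not echoed below: the notebook displays only the cell's last expression

f = matrix1 * matrix2  # element-wise product with broadcasting, not matrix multiplication
sess.run(f)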

Matrix 1 shape :  (2, 2)
Matrix 2 shape :  (2, 1)


array([[1., 2.],
       [6., 8.]], dtype=float32)
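
The two operations give results of different shapes, which is easy to miss because only the last one is displayed above. The sketch below reproduces both with NumPy (an assumption made to avoid the session API): `@` is the matrix product and `*` is the element-wise product with broadcasting.

```python
import numpy as np

matrix1 = np.array([[1., 2.], [3., 4.]])
matrix2 = np.array([[1.], [2.]])

product = matrix1 @ matrix2      # (2, 2) @ (2, 1) -> (2, 1)
elementwise = matrix1 * matrix2  # (2, 1) is stretched across columns -> (2, 2)

print(product.tolist())      # [[5.0], [11.0]]
print(elementwise.tolist())  # [[1.0, 2.0], [6.0, 8.0]]
```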

#### Broadcasting (use it carefully)

In [83]:
mat1 = tf.constant([[1., 2.]])
mat2 = tf.constant(3.)
sess.run(mat1+mat2)

array([[4., 5.]], dtype=float32)

In [84]:
mat1 = tf.constant([[1., 2.]])
mat2 = tf.constant([3., 4])
sess.run(mat1+mat2)

array([[4., 6.]], dtype=float32)

In [85]:
mat1 = tf.constant([[1., 2.]])
mat2 = tf.constant([[3.], [4]])
sess.run(mat1*mat2)

array([[3., 6.],
       [4., 8.]], dtype=float32)
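
The reason to use broadcasting carefully is that mismatched shapes often combine silently instead of raising an error. The sketch below shows one typical slip in NumPy, which follows the same broadcasting rules as the cells above; the variable names `predictions` and `labels` are hypothetical.

```python
import numpy as np

predictions = np.array([1., 2., 3.])     # shape (3,)
labels = np.array([[1.], [2.], [3.]])    # shape (3, 1)

# Intended: three element-wise differences. Actual: a 3x3 table of all pairs.
diff = predictions - labels
print(diff.shape)  # (3, 3)

# Making the shapes match first gives the intended result.
fixed = predictions - labels.reshape(-1)
print(fixed.tolist())  # [0.0, 0.0, 0.0]
```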

#### Reduce mean

In [91]:
sess.run(tf.reduce_mean([1,2], axis=0)) # integer input gives an integer result (1, not 1.5); use floats as in the next cell

1

In [90]:
sess.run(tf.reduce_mean([1.,2.], axis=0))

1.5

In [93]:
x = [[1., 2.],
     [3., 4.]]
sess.run(tf.reduce_mean(x))

2.5

In [94]:
sess.run(tf.reduce_mean(x, axis=0))

array([2., 3.], dtype=float32)

In [95]:
sess.run(tf.reduce_mean(x, axis=1))

array([1.5, 3.5], dtype=float32)

In [96]:
sess.run(tf.reduce_mean(x, axis=-1))

array([1.5, 3.5], dtype=float32)
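
The `axis` argument names the axis that disappears: `axis=0` averages down the rows and leaves one value per column, while `axis=1` (or `-1` for a 2D array) leaves one value per row. The sketch below shows the same reductions in NumPy, which is an assumption made for a session-free example; note that NumPy, unlike the integer `tf.reduce_mean` call above, promotes integer input to float.

```python
import numpy as np

x = np.array([[1., 2.],
              [3., 4.]])

col_mean = x.mean(axis=0)                      # one value per column
row_mean = x.mean(axis=1)                      # one value per row
row_mean_kept = x.mean(axis=1, keepdims=True)  # keeps the reduced axis with size 1
int_mean = np.mean([1, 2])                     # NumPy returns 1.5 here

print(col_mean.tolist(), row_mean.tolist())  # [2.0, 3.0] [1.5, 3.5]
print(row_mean_kept.shape, int_mean)         # (2, 1) 1.5
```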

#### Reduce sum

In [97]:
x = [[1., 2.],
     [3., 4.]]
sess.run(tf.reduce_sum(x))

10.0

In [98]:
sess.run(tf.reduce_sum(x, axis=0))

array([4., 6.], dtype=float32)

In [99]:
sess.run(tf.reduce_sum(x, axis=-1))

array([3., 7.], dtype=float32)

In [101]:
sess.run(tf.reduce_mean(tf.reduce_sum(x, axis=-1)))

5.0

#### Argmax

In [102]:
x = [[0, 1, 2],
     [2, 1, 0]]
sess.run(tf.argmax(x, axis=0)) # index of the maximum along axis 0 (one index per column)

array([1, 0, 0], dtype=int64)

In [103]:
sess.run(tf.argmax(x, axis=1))

array([2, 0], dtype=int64)

In [104]:
sess.run(tf.argmax(x, axis=-1))

array([2, 0], dtype=int64)
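
A typical use of argmax is turning per-class scores into predicted labels. The sketch below does this with NumPy; the scores and labels are made-up values for illustration.

```python
import numpy as np

# Hypothetical scores: one row per example, one column per class.
scores = np.array([[0.1, 0.7, 0.2],
                   [0.8, 0.1, 0.1],
                   [0.3, 0.3, 0.4]])
labels = np.array([1, 0, 1])

predicted = np.argmax(scores, axis=1)    # index of the best class in each row
accuracy = np.mean(predicted == labels)  # fraction of rows predicted correctly

print(predicted.tolist())  # [1, 0, 2]
print(accuracy)            # 2 of 3 correct
```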

#### Reshape (squeeze, expand)

In [105]:
t = np.array([[[0,1,2],
               [3,4,5]],
              [[6,7,8],
               [9,10,11]]])
t.shape

(2, 2, 3)

In [108]:
sess.run(tf.reshape(t, shape=[-1,3])) # -1 means "infer this dimension": 12 elements / 3 = 4 rows

array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11]])

In [109]:
sess.run(tf.reshape(t, shape=[-1,1,3])) # -1 is inferred as 4, giving shape (4, 1, 3)

array([[[ 0,  1,  2]],

       [[ 3,  4,  5]],

       [[ 6,  7,  8]],

       [[ 9, 10, 11]]])

In [111]:
sess.run(tf.squeeze([[0],[1],[2]]))

array([0, 1, 2])

In [113]:
sess.run(tf.expand_dims([0,1,2,], axis=1))

array([[0],
       [1],
       [2]])
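
Reshape, squeeze, and expand_dims change only the shape, never the number of elements, so a `-1` dimension can be inferred only when the sizes divide evenly. The sketch below shows the NumPy equivalents, used here as an assumption to keep the example session-free.

```python
import numpy as np

t = np.arange(12).reshape(2, 2, 3)

flat_rows = t.reshape(-1, 3)                           # -1 inferred as 4
column = np.expand_dims(np.array([0, 1, 2]), axis=1)   # (3,) -> (3, 1)
back = np.squeeze(column)                              # (3, 1) -> (3,)
print(flat_rows.shape, column.shape, back.shape)       # (4, 3) (3, 1) (3,)

# 12 elements cannot be split into rows of 5, so this reshape fails.
try:
    t.reshape(-1, 5)
    reshape_failed = False
except ValueError:
    reshape_failed = True
print(reshape_failed)  # True
```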

#### One hot

In [114]:
sess.run(tf.one_hot([[0],[1],[2],[0]], depth=3))

array([[[1., 0., 0.]],

       [[0., 1., 0.]],

       [[0., 0., 1.]],

       [[1., 0., 0.]]], dtype=float32)

In [115]:
t = tf.one_hot([[0], [1], [2], [0]], depth=3)
sess.run(tf.reshape(t, shape=[-1,3]))

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.],
       [1., 0., 0.]], dtype=float32)
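
One-hot encoding appends a new innermost axis of length `depth`, which is why the `(4, 1)` input above becomes `(4, 1, 3)` and needs a reshape. The sketch below reproduces this with NumPy by indexing into an identity matrix; the helper `one_hot` is a hypothetical stand-in written for this example, not TensorFlow's implementation.

```python
import numpy as np

def one_hot(indices, depth):
    """Row i of the identity matrix is the one-hot vector for class i."""
    return np.eye(depth, dtype=np.float32)[np.asarray(indices)]

labels = [[0], [1], [2], [0]]        # shape (4, 1)
encoded = one_hot(labels, depth=3)   # new last axis: shape (4, 1, 3)
flattened = encoded.reshape(-1, 3)   # drop the extra axis: shape (4, 3)

print(encoded.shape, flattened.shape)  # (4, 1, 3) (4, 3)
print(flattened.tolist())
```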

#### Casting

In [116]:
sess.run(tf.cast([1.8, 2.2, 3.3, 4.9], tf.int32)) # the fractional part is dropped, not rounded

array([1, 2, 3, 4])

In [118]:
sess.run(tf.cast([True, False, 1 == 1, 0 == 1], tf.int32))

array([1, 0, 1, 0])
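
Casting floats to integers drops the fractional part rather than rounding, and casting booleans gives 0 and 1, which is handy for computing an accuracy. The sketch below shows both in NumPy, used here as an assumption in place of `tf.cast`; the values are made up.

```python
import numpy as np

values = np.array([1.8, 2.2, -1.8, -2.2])
truncated = values.astype(np.int32)          # drops the fraction (toward zero)
rounded = np.rint(values).astype(np.int32)   # round first if rounding is wanted

print(truncated.tolist())  # [1, 2, -1, -2]
print(rounded.tolist())    # [2, 2, -2, -2]

# Hypothetical per-example correctness flags.
correct = np.array([True, False, True, True])
accuracy = correct.astype(np.float32).mean()
print(accuracy)  # 0.75
```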

#### Stack

In [120]:
x = [1, 4]
y = [2, 5]
z = [3, 6]

# pack along first dimension
sess.run(tf.stack([x, y, z]))

array([[1, 4],
       [2, 5],
       [3, 6]])

In [123]:
sess.run(tf.stack([x, y, z], axis=1))

array([[1, 2, 3],
       [4, 5, 6]])
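
Stacking creates a new axis at the position given by `axis`, whereas concatenation joins along an existing one. The sketch below contrasts the two in NumPy, which is an assumption made for a session-free example.

```python
import numpy as np

x, y, z = [1, 4], [2, 5], [3, 6]

stacked_rows = np.stack([x, y, z])          # new axis 0: shape (3, 2)
stacked_cols = np.stack([x, y, z], axis=1)  # new axis 1: shape (2, 3)
joined = np.concatenate([x, y, z])          # no new axis: shape (6,)

print(stacked_rows.shape, stacked_cols.shape, joined.shape)  # (3, 2) (2, 3) (6,)
print(joined.tolist())  # [1, 4, 2, 5, 3, 6]
```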

#### Ones and Zeros like

In [125]:
x = [[0, 1, 2], 
     [2, 1, 0]]

sess.run(tf.ones_like(x))

array([[1, 1, 1],
       [1, 1, 1]])

In [127]:
sess.run(tf.zeros_like(x))

array([[0, 0, 0],
       [0, 0, 0]])

#### zip

In [128]:
for x, y in zip([1,2,3], [4,5,6]):
    print(x,y)

1 4
2 5
3 6


In [129]:
for x, y, z in zip([1,2,3], [4,5,6], [7,8,9]):
    print(x,y,z)

1 4 7
2 5 8
3 6 9
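
Two details of `zip` are worth knowing: it stops at the shortest input, and `zip(*pairs)` reverses the pairing. The sketch below is a small illustration with made-up names and values.

```python
names = ["w", "b"]
values = [0.5, -1.0, 99.0]  # one extra element on purpose

pairs = list(zip(names, values))  # the unmatched 99.0 is silently dropped
print(pairs)  # [('w', 0.5), ('b', -1.0)]

# Unpacking the pairs back into separate tuples.
unzipped_names, unzipped_values = zip(*pairs)
print(unzipped_names, unzipped_values)  # ('w', 'b') (0.5, -1.0)
```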
