# Part 1
Chapters 1 through 4 cover an introduction to neural networks and Keras.

In [1]:
import pandas as pd
import numpy as np


In [20]:
cube = np.array([[np.zeros(5), np.zeros(5)]*5, [np.ones(5), np.ones(5)]*5])

# Tensor Operations
```python
keras.layers.Dense(512, activation='relu')
```
This layer can be interpreted as a function, which takes a 2D tensor as
input and returns a 2D tensor as output, with shape (*, 512). 

It implements the operation `output = activation(dot(input, W) + bias)`, where
`W` is a 2D tensor (a matrix) and `bias` is a 1D tensor (a vector).
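
The operation above can be sketched in plain NumPy. This is only an illustration under assumptions, not how Keras implements the layer: the helper names `relu` and `naive_dense`, the batch size of 3, the input width of 784, and the random weights are all made up for the example. `W` is given shape `(n_in, n_out)` so that `dot(input, W)` is defined.

```python
import numpy as np

def relu(z):
    # Element-wise max(z, 0)
    return np.maximum(z, 0)

def naive_dense(inputs, W, bias):
    # inputs: (batch, n_in), W: (n_in, n_out), bias: (n_out,)
    # bias is broadcast across the batch axis by NumPy
    return relu(np.dot(inputs, W) + bias)

# Hypothetical sizes chosen for illustration
inputs = np.random.random((3, 784))
W = np.random.randn(784, 512) * 0.01
bias = np.zeros(512)

output = naive_dense(inputs, W, bias)
print(output.shape)
```

The first axis of the output is whatever batch size was passed in, which is what the `*` in `(*, 512)` stands for.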

## Broadcasting
relu (`max(z, 0)`) is an element-wise operation, as is addition between tensors.
When two tensors of different shapes are added, their elements cannot be paired
one-to-one, so the smaller tensor is broadcast.

When possible, and if there’s no ambiguity, the smaller tensor is broadcast to
match the shape of the larger tensor. Broadcasting consists of two steps:
1. Axes (called broadcast axes) are added to the smaller tensor to match the ndim of
the larger tensor.
2. The smaller tensor is repeated along these new axes to match the shape of the larger
tensor.

So if the smaller tensor is a vector of weights, these are repeated for each sample in the
input tensor.
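
The two steps can be spelled out by hand as a rough sketch. The variable names `y_expanded` and `y_repeated` are assumptions for the example. NumPy does not actually materialize the repeated copy when it broadcasts; the explicit `np.repeat` here is only meant to make the steps visible.

```python
import numpy as np

x = np.ones((10, 5))     # larger tensor, shape (10, 5)
y = np.full(5, 4.0)      # smaller tensor, shape (5,)

# Step 1: add a broadcast axis so y has the same ndim as x
y_expanded = np.expand_dims(y, axis=0)                   # shape (1, 5)

# Step 2: repeat y along the new axis to match the shape of x
y_repeated = np.repeat(y_expanded, x.shape[0], axis=0)   # shape (10, 5)

manual = x + y_repeated
print(y_expanded.shape, y_repeated.shape)

# The manual result matches what NumPy's implicit broadcasting produces
print(np.array_equal(manual, x + y))
```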

In [41]:
# Broadcasting
# x is a 2D tensor of all 1's
x = np.ones((10, 5))
# y is a 1D tensor of all 4's
y = np.full(5, 4.0)
print(x.shape, y.shape)

# The shapes differ, but x and y can still be added because NumPy broadcasts y across the rows of x
x + y

(10, 5) (5,)


array([[ 5.,  5.,  5.,  5.,  5.],
       [ 5.,  5.,  5.,  5.,  5.],
       [ 5.,  5.,  5.,  5.,  5.],
       [ 5.,  5.,  5.,  5.,  5.],
       [ 5.,  5.,  5.,  5.,  5.],
       [ 5.,  5.,  5.,  5.,  5.],
       [ 5.,  5.,  5.,  5.,  5.],
       [ 5.,  5.,  5.,  5.,  5.],
       [ 5.,  5.,  5.,  5.,  5.],
       [ 5.,  5.,  5.,  5.,  5.]])

In [53]:
# Tensor dot (tensor product)
# Sum of the element-wise products of two vectors of the same length
def naive_vector_dot(x, y):
    assert len(x.shape) == 1  # x is a vector
    assert len(y.shape) == 1  # y is a vector
    assert x.shape[0] == y.shape[0]  # x and y have the same length
    n_elements = x.shape[0]
    z = sum(x[i] * y[i] for i in range(n_elements))
    return z

x = np.random.random(5)
y = np.random.random(5)
print(naive_vector_dot(x, y), np.dot(x, y))


1.59053941779 1.59053941779


![alt](dot-product.PNG)

In [47]:
%%timeit
naive_vector_dot(x, y)

The slowest run took 5.04 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 3.07 µs per loop


In [48]:
%%timeit
np.dot(x, y)

The slowest run took 34.41 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 658 ns per loop


In [54]:
# Reshaping
# Rearrange the rows and columns of a tensor to match a given shape
# e.g. reshape(6) turns [[1, 2, 3], [4, 5, 6]] into [1, 2, 3, 4, 5, 6]
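
The cell above only describes reshaping in comments, so here is a small runnable sketch of the same example. The variable names `flat` and `pairs` are assumptions for the example. The transpose at the end is included only to show that it is a different operation from reshaping to `(3, 2)`.

```python
import numpy as np

x = np.array([[1, 2, 3],
              [4, 5, 6]])
print(x.shape)

# Flatten to a vector of 6 elements, read row by row
flat = x.reshape(6)
print(flat)

# Refill the same 6 elements into 3 rows of 2
pairs = x.reshape((3, 2))
print(pairs)

# Transposing swaps the axes instead of refilling row by row
print(x.T)
```

Reshaping keeps the total number of elements fixed, so `x.reshape(5)` would raise an error here.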


# Stochastic Gradient Descent

![alt](sgd.png)

# Chapter 3: Getting started with NN
## Pieces
- **Layers**: Data-processing modules that take one or more tensors as input
and return one or more tensors as output. Examples: `Dense`, `LSTM`, `Conv2D`.
- **Models**: Layers stacked together into a network.
- **Loss functions**: The quantity to be minimized during training. Choosing the right objective
is critical because the optimizer will take any shortcut it can to minimize it.
- **Optimizer**: Determines how the network's weights are updated based on the loss.
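
To see how these pieces interact, here is a minimal NumPy sketch of a training loop: a single linear layer as the model, mean squared error as the loss, and plain mini-batch stochastic gradient descent as the optimizer. The synthetic data, the learning rate, the batch size, and every name below are assumptions chosen for illustration. This is not how Keras implements training.

```python
import numpy as np

np.random.seed(0)

# Synthetic, noise-free regression data (hypothetical): y = 3*x0 - 2*x1 + 1
X = np.random.random((256, 2))
y = np.dot(X, np.array([3.0, -2.0])) + 1.0

# Layer / model: one linear layer with weights W and bias b
W = np.zeros(2)
b = 0.0

def model(inputs):
    return np.dot(inputs, W) + b

# Loss function: the quantity to be minimized
def mse_loss(pred, target):
    return np.mean((pred - target) ** 2)

learning_rate = 0.1
batch_size = 32
initial_loss = mse_loss(model(X), y)

for epoch in range(200):
    order = np.random.permutation(len(X))  # reshuffle samples each epoch
    for start in range(0, len(X), batch_size):
        idx = order[start:start + batch_size]
        xb, yb = X[idx], y[idx]
        error = model(xb) - yb
        # Gradients of the batch MSE with respect to W and b
        grad_W = 2 * np.dot(xb.T, error) / len(xb)
        grad_b = 2 * error.mean()
        # Optimizer: plain SGD update
        W -= learning_rate * grad_W
        b -= learning_rate * grad_b

final_loss = mse_loss(model(X), y)
print(initial_loss, final_loss)
print(W, b)
```

Swapping the loss or the update rule changes what the loop optimizes and how fast, while the model code stays the same, which is the separation that the Keras `compile` step exposes.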

# Keras
## First step: define network
The main way to define a model is via the `Sequential` class, which works only for linear
stacks of layers (by far the most common kind):
```python
from keras import models
from keras import layers

model = models.Sequential()
model.add(layers.Dense(32, activation='relu', input_shape=(784,)))
model.add(layers.Dense(10, activation='softmax'))
```

## Second step: define the loss function and optimizer

```python
from keras import optimizers

model.compile(optimizer=optimizers.RMSprop(lr=0.001),
              loss='mse',
              metrics=['accuracy'])
```

## Last step: fit
```python
model.fit(input_tensor, target_tensor, batch_size=128, epochs=10)
```