### Using Keras
- As of April 2020, we recommend using Keras **through** TensorFlow, i.e. ``import tensorflow.keras``

3 steps:
- define which model you want
    - e.g. architecture: how many layers, how many neurons per layer, which activation functions, etc.
- compile the model
    - specify the optimizer, metrics, and loss function
- then fit the model
    - specify the number of epochs (how many full passes over the training data are made during training) and the batch size (the data is fed to the network in batches, not all at once, and the weights are updated after each batch)
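
To make the relationship between epochs and batch size concrete, here is a minimal sketch of the arithmetic. It assumes one weight update per batch, with a smaller final batch when the dataset size is not a multiple of the batch size. The helper name `updates_per_run` and the values in the second call are made up for illustration.

```python
import math

def updates_per_run(n_samples, batch_size, epochs):
    """Hypothetical helper: total number of weight updates in one training run."""
    # One update per batch; the last batch of an epoch may be smaller.
    batches_per_epoch = math.ceil(n_samples / batch_size)
    return batches_per_epoch * epochs

# 4 data points in a single batch of 4: one update per epoch
print(updates_per_run(n_samples=4, batch_size=4, epochs=5000))

# 60,000 data points in batches of 32 (illustrative values)
print(updates_per_run(n_samples=60000, batch_size=32, epochs=10))
```

With a batch size equal to the dataset size, the number of updates is simply the number of epochs.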

In [1]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation
from tensorflow.keras import backend as K
import numpy as np

K.clear_session()


#The XOR problem!
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], dtype=np.float64)
y = np.array([0.0, 1.0, 1.0, 0.0], dtype=np.float64)


model = Sequential([
    Dense(2, input_shape=(2,)),
    Activation('sigmoid'),
    Dense(1),
    Activation('sigmoid'),
])
# 2-layer network: the first (hidden) layer consists of 2 neurons, and the second (output) layer consists of 1.
# If each target has N values (e.g. N one-hot-encoded classes), the output layer must also have N neurons!

# input_shape is SUPER IMPORTANT: it tells Keras the shape of a SINGLE DATA POINT, as a tuple.
# The network doesn't care how many data points it receives (it is going to get a different number of
# data points in training and testing anyway).

# The Dense layers get their biases automatically; Keras handles this for us. Weights are also initialized
# automatically, unless you override the default with the kernel_initializer argument.

model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy'])
# The complicated step, and conceptually very new. This is where Keras actually communicates with TensorFlow
# and sets up what's called a 'computation graph', which Keras needs in order to run the model. The model is
# compiled into a very abstract form that is executed by TensorFlow's C++ backend.
# Include metrics=['accuracy'] to have the accuracy reported for each epoch.


# One caveat about compile: if you run this piece of code more than once in a single session, Keras can get confused.
# Running Keras in Jupyter is fine, but do this first:
# from tensorflow.keras import backend as K
# K.clear_session()
# You should do this every time you rebuild a model, because it clears the state left over
# from previously built models.

model.fit(X, y, epochs=5000, batch_size=4, verbose=0)
# verbose: Integer. 0, 1, or 2. Verbosity mode. 0 = silent, 1 = progress bar, 2 = one line per epoch.
# 5000 epochs is obviously overkill for such a small dataset, but this setting will become important with larger datasets


# The batch size used for evaluation doesn't have to match the one used for training
score = model.evaluate(X, y, batch_size=4)
print(score)

print(model.predict(X))

[0.14267781376838684, 1.0]
[[0.13714878]
 [0.82563996]
 [0.90951204]
 [0.12781586]]
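
The two numbers printed by ``model.evaluate`` are the loss followed by the accuracy metric, and both can be checked by hand from the predicted probabilities. The NumPy sketch below assumes the usual mean binary cross-entropy and a 0.5 decision threshold for accuracy; it ignores the tiny clipping constant Keras applies internally, so the loss should agree with the printed value only up to rounding.

```python
import numpy as np

# Targets and predicted probabilities copied from the cell output above
y_true = np.array([0.0, 1.0, 1.0, 0.0])
y_prob = np.array([0.13714878, 0.82563996, 0.90951204, 0.12781586])

# Binary cross-entropy, averaged over the data points
loss = -np.mean(y_true * np.log(y_prob) + (1.0 - y_true) * np.log(1.0 - y_prob))

# Accuracy: threshold the probabilities at 0.5 and compare with the targets
accuracy = np.mean((y_prob > 0.5) == (y_true == 1.0))

print(loss, accuracy)
```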


some useful features:
- model.summary() provides a really good overview of the model after you compile it.
- you can save the model architecture to a JSON file and the weights to an HDF5 (``.h5``) file, which can be inspected with the h5py library (copy from class example)
- you can look at the model weights with model.get_weights()
    - but this is more interesting to look at within *TensorBoard*

In [2]:
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense (Dense)                (None, 2)                 6         
_________________________________________________________________
activation (Activation)      (None, 2)                 0         
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 3         
_________________________________________________________________
activation_1 (Activation)    (None, 1)                 0         
Total params: 9
Trainable params: 9
Non-trainable params: 0
_________________________________________________________________
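
The ``Param #`` column can be reproduced by counting weights and biases. The sketch below assumes a fully connected layer with one weight per (input, neuron) pair plus one bias per neuron; `dense_param_count` is a hypothetical helper, not a Keras function.

```python
def dense_param_count(n_inputs, n_units):
    """Hypothetical helper: parameters in a fully connected layer with biases."""
    # One weight per (input, unit) pair, plus one bias per unit
    return n_inputs * n_units + n_units

hidden = dense_param_count(n_inputs=2, n_units=2)   # first Dense layer
output = dense_param_count(n_inputs=2, n_units=1)   # second Dense layer
print(hidden, output, hidden + output)
```

The activation layers contribute no parameters, which is why they show 0 in the summary.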


In [3]:
w = model.get_weights()
w

[array([[ 5.7002287,  4.360515 ],
        [-5.262735 , -4.3788095]], dtype=float32),
 array([ 3.1496096, -2.4337506], dtype=float32),
 array([[-5.2985444],
        [ 4.558196 ]], dtype=float32),
 array([2.8740137], dtype=float32)]
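
The list returned by ``get_weights()`` holds, in order, the kernel and bias of the first Dense layer and then the kernel and bias of the second. As a rough check of what the network computes, the forward pass can be redone by hand in NumPy. This is a sketch that hard-codes the values printed above; its result should agree closely with the ``model.predict(X)`` output shown earlier.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])

# Values copied from the get_weights() output above:
# [hidden kernel, hidden bias, output kernel, output bias]
W1 = np.array([[5.7002287, 4.360515], [-5.262735, -4.3788095]])
b1 = np.array([3.1496096, -2.4337506])
W2 = np.array([[-5.2985444], [4.558196]])
b2 = np.array([2.8740137])

hidden = sigmoid(X @ W1 + b1)       # shape (4, 2)
output = sigmoid(hidden @ W2 + b2)  # shape (4, 1)
print(output)
```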

### Save the Keras model as a JSON file (containing the structure) and an H5 file (containing the parameters).

In [4]:
# serialize model to JSON
with open("model.json", "w") as json_file:
    json_file.write(model.to_json())

# serialize weights to HDF5
model.save_weights("model.h5")
print("Saved model to disk")

Saved model to disk


### Reading the contents of the h5 file.

In [5]:
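import h5py

filename = "model.h5"

# Open read-only; the context manager closes the file when the block ends
with h5py.File(filename, "r") as h5:
    # List all top-level groups (one per layer)
    print("Keys: {}\n".format(h5.keys()))
    group_keys = list(h5.keys())
    print(group_keys)

    # Get the data
    # data = list(h5[a_group_key])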

Keys: <KeysViewHDF5 ['activation', 'activation_1', 'dense', 'dense_1']>

['activation', 'activation_1', 'dense', 'dense_1']


---

---

### The MNIST dataset: guiding principles
- Flatten the X data to be an array of shape (N, 784).
- This means that each incoming data point into the network (i.e. input shape) should be shaped ``(784,)``
- You will also have to one-hot encode the ``y`` labels. You can use Keras' ``to_categorical()`` function for this.
- This is a multi-class classification problem, so how many neurons should you have in the final layer? And which activation function should it be?
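
The first three points can be sketched with NumPy alone. The example below uses small random stand-in arrays with the MNIST layout rather than the real data, and it uses ``np.eye`` indexing as an assumed stand-in for ``to_categorical()``; the names `X_fake` and `y_fake` are made up for illustration.

```python
import numpy as np

# Stand-in data with the MNIST layout (random values, not the real images)
rng = np.random.default_rng(0)
X_fake = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)
y_fake = np.array([3, 0, 9, 1, 3])

# Flatten each 28x28 image into a vector of 784 values
X_flat = X_fake.reshape(len(X_fake), 28 * 28)

# One-hot encode: row i has a single 1 in column y_fake[i]
y_onehot = np.eye(10, dtype=np.float32)[y_fake]

print(X_flat.shape, y_onehot.shape)
```

The same ``reshape`` call applies to the real ``X_train`` and ``X_test`` arrays, with the number of rows taken from each array.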

In [6]:
from tensorflow.keras.datasets import mnist

(X_train, y_train), (X_test, y_test) = mnist.load_data()

In [7]:
X_train.shape

(60000, 28, 28)

What do the numbers represent?
- **60,000** images (NumPy arrays), where each image has
- **28** rows (i.e. height), and
- **28** columns (i.e. width)

In [8]:
y_train.shape

(60000,)

``y_train`` is a *vector*: it has only one dimension, so it is just a long sequence of 60,000 numbers
- in other words, it doesn't have "columns", since it's not a matrix
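
The difference between a vector and a one-column matrix shows up in the array's shape. A small sketch with made-up labels standing in for ``y_train``:

```python
import numpy as np

y = np.array([5, 0, 4, 1])    # made-up labels, standing in for y_train
print(y.ndim, y.shape)        # one dimension: a vector

y_col = y.reshape(-1, 1)      # the same numbers as an explicit column matrix
print(y_col.ndim, y_col.shape)
```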