[The webpage where I'm following my first tutorial: the TensorFlow Keras documentation page](https://www.tensorflow.org/guide/keras#import_tfkeras)

Keras is a high-level API to build and train deep learning models. It's used for fast prototyping, advanced research, and production, with three key advantages:

User friendly
Keras has a simple, consistent interface optimized for __common use cases__. It provides clear and actionable feedback for user errors.
Modular and composable
Keras models are made by connecting configurable building blocks together, with few restrictions.
Easy to extend
Write custom building blocks to express new ideas for research. Create new layers and loss functions, and develop state-of-the-art models.

In [3]:
import tensorflow as tf
from tensorflow import keras

In [5]:
tf.keras

<module 'tensorflow.keras' from '/Users/zhiwei/anaconda3/lib/python3.6/site-packages/tensorflow/keras/__init__.py'>

What is HDF5?

HDF: Hierarchical Data Format
    
1. A set of file formats designed to store and organize large amounts of data.
2. HDF5: The fifth and current major version of the format. Keras can save model weights to HDF5 files.


PyPI: Python Package Index

1. A repository of software for the Python programming language. [click here](https://pypi.org)
2. It's a webpage where you can find all the software developed and shared by the Python community.

#### Building a simple model

Sequential model

In Keras, you assemble layers to build models. A model is (usually) a graph of layers. The most common type of model is a stack of layers: the tf.keras.Sequential model.

To build a simple, fully-connected network (i.e. multi-layer perceptron):

---

What is a multi-layer perceptron?
1. A feed forward neural network.
2. A neural network that consists of at least 3 layers of nodes.

What does feed forward mean?
1. In a feed forward neural network information always moves in one direction and never goes backwards.

<img src='feed_forward.png' width='200'>

[Image from wiki](https://en.wikipedia.org/wiki/Feedforward_neural_network)
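
To make "information only moves forward" concrete, here is a minimal NumPy sketch of a forward pass. The layer sizes (32, 64, 64, 10), the random weights, and the helper names `relu` and `forward` are my own assumptions for illustration; this is not how Keras implements it internally, and a real classifier would end in softmax rather than ReLU.

```python
import numpy as np

def relu(z):
    # Element-wise max(0, z).
    return np.maximum(z, 0.0)

rng = np.random.default_rng(0)

# Hypothetical layer sizes: 32 inputs -> 64 -> 64 -> 10 outputs.
sizes = [32, 64, 64, 10]
weights = [rng.normal(scale=0.1, size=(n_in, n_out))
           for n_in, n_out in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n_out) for n_out in sizes[1:]]

def forward(x):
    # Each layer only consumes the previous layer's output; nothing loops back.
    activation = x
    for w, b in zip(weights, biases):
        activation = relu(activation @ w + b)
    return activation

batch = rng.random((5, 32))    # 5 samples, 32 features each
output = forward(batch)
print(output.shape)            # (5, 10)
```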

In [8]:
# The most common type of model is a stack of layers: the tf.keras.Sequential model.

model = keras.Sequential()

model.add(keras.layers.Dense(64, activation='relu')) #first layer with 64 neurons
model.add(keras.layers.Dense(64, activation='relu')) #second layer with 64 neurons

model.add(keras.layers.Dense(10, activation='softmax')) #output layer with 10 neurons and a 'softmax' activation function.



<tensorflow.python.keras.engine.sequential.Sequential at 0xb387dda90>

There are many tf.keras.layers available with some common constructor parameters:

__activation:__ Set the activation function for the layer. This parameter is specified by the name of a built-in function or as a callable object. By default, no activation is applied.

__kernel_initializer and bias_initializer:__ The initialization schemes that create the layer's weights (kernel and bias). Each parameter is a name or a callable object. The kernel defaults to the "Glorot uniform" initializer, and the bias defaults to zeros.

__kernel_regularizer and bias_regularizer:__ The regularization schemes that apply to the layer's weights (kernel and bias), such as L1 or L2 regularization. By default, no regularization is applied.
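
To get a feel for what an initializer and a regularizer actually compute, here is a small NumPy sketch. The function names `glorot_uniform` and `l2_penalty` and the strength value 0.01 are assumptions of mine for illustration; only the formulas (uniform sampling within sqrt(6 / (fan_in + fan_out)), and a penalty proportional to the sum of squared weights) are the standard definitions.

```python
import numpy as np

def glorot_uniform(fan_in, fan_out, rng):
    # Glorot (Xavier) uniform: sample from U(-limit, limit),
    # with limit = sqrt(6 / (fan_in + fan_out)).
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

def l2_penalty(kernel, strength=0.01):
    # L2 regularization adds strength * sum(w^2) to the loss.
    return strength * np.sum(kernel ** 2)

rng = np.random.default_rng(0)
kernel = glorot_uniform(32, 64, rng)   # weights for a 32 -> 64 dense layer
bias = np.zeros(64)                    # biases start at zero

limit = np.sqrt(6.0 / (32 + 64))       # 0.25 for these sizes
print(kernel.shape, bool(np.abs(kernel).max() <= limit))
print(l2_penalty(kernel))
```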

---

What is dense in keras?
1. Dense describes how the neurons within the layer are connected.
2. In a dense (fully connected) layer, each neuron receives input from every neuron in the previous layer.
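
As a rough sketch of what "fully connected" implies for the number of weights, here is a NumPy example. The sizes 32 and 64 are assumed for illustration, and the code covers only the matrix arithmetic, not the Keras layer itself.

```python
import numpy as np

n_in, n_out = 32, 64
rng = np.random.default_rng(0)
kernel = rng.normal(size=(n_in, n_out))  # one weight per (input, neuron) pair
bias = np.zeros(n_out)                   # one bias per neuron

x = rng.random((1, n_in))                # a single sample with 32 features
y = x @ kernel + bias                    # shape (1, 64), before any activation

n_params = kernel.size + bias.size       # 32 * 64 + 64 = 2112
print(y.shape, n_params)
```

Because every input feeds every neuron, the parameter count grows with the product of the two layer sizes.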

What is the softmax activation function?
1. A function often used in the final layer of a neural network classifier. It turns the raw outputs into values between 0 and 1 that sum to 1, so they can be read as class probabilities.
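
A minimal NumPy sketch of softmax, assuming a batch of made-up raw scores. The max subtraction is a common numerical-stability trick and does not change the result.

```python
import numpy as np

def softmax(logits):
    # Subtract the row maximum first for numerical stability.
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exps = np.exp(shifted)
    return exps / np.sum(exps, axis=-1, keepdims=True)

scores = np.array([[2.0, 1.0, 0.1]])   # made-up raw outputs for 3 classes
probs = softmax(scores)
print(probs)                # each value lies between 0 and 1
print(probs.sum(axis=-1))   # each row sums to 1 (up to floating point error)
```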

What is a kernel?

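In Keras, the kernel is a layer's weight matrix, stored separately from the bias. The word has other meanings in statistics: [you can read about them here, it gets confusing](https://stats.stackexchange.com/questions/152897/how-to-intuitively-explain-what-a-kernel-is)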

What is a feature vector?
1. A vector of N numerical features describing one sample, with shape 1 x N.
2. A matrix has shape M x N. Stacking the feature vectors of M samples gives an M x N matrix, so a single feature vector is the special case with one row.
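
A quick NumPy check of the difference, using made-up feature values:

```python
import numpy as np

feature_vector = np.array([5.1, 3.5, 1.4, 0.2])                  # one sample, 4 made-up features
feature_matrix = np.stack([feature_vector, feature_vector * 2])  # 2 samples stacked as rows

print(feature_vector.ndim, feature_vector.shape)   # 1 (4,)
print(feature_matrix.ndim, feature_matrix.shape)   # 2 (2, 4)
```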

---

#### Compiling the model

In [None]:
# After the model is constructed, configure its learning process by calling the compile method:


model.compile(optimizer=tf.train.AdamOptimizer(0.001),
             loss='categorical_crossentropy',
             metrics=['accuracy'])

tf.keras.Model.compile takes three important arguments:

__optimizer:__ This object specifies the training procedure. Pass it optimizer instances from the tf.train module, such as AdamOptimizer, RMSPropOptimizer, or GradientDescentOptimizer.

__loss:__ The function to minimize during optimization. Common choices include mean square error (mse), categorical_crossentropy, and binary_crossentropy. Loss functions are specified by name or by passing a callable object from the tf.keras.losses module.

__metrics:__ Used to monitor training. These are string names or callables from the tf.keras.metrics module.
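
To see what the loss and the metric measure, here is a NumPy sketch of categorical cross-entropy and accuracy on made-up one-hot labels and predictions. The function names and the `eps` clipping value are my own assumptions; this mirrors the usual definitions rather than the Keras source code.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    # Mean over samples of -sum(y_true * log(y_pred)).
    y_pred = np.clip(y_pred, eps, 1.0)   # avoid log(0)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=1)))

def accuracy(y_true, y_pred):
    # Fraction of samples whose highest-probability class matches the label.
    return float(np.mean(np.argmax(y_true, axis=1) == np.argmax(y_pred, axis=1)))

y_true = np.array([[1, 0, 0], [0, 1, 0]])              # one-hot labels
y_pred = np.array([[0.7, 0.2, 0.1], [0.6, 0.3, 0.1]])  # made-up predictions

loss = categorical_crossentropy(y_true, y_pred)
acc = accuracy(y_true, y_pred)
print(loss)
print(acc)   # 0.5: the first sample is right, the second is wrong
```

The optimizer's job is to change the weights so that the loss goes down. The metric is only reported, not optimized.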

#### Input NumPy data

For small datasets, use in-memory NumPy arrays to train and evaluate a model. The model is "fit" to the training data using the fit method.

In [10]:
import numpy as np

data = np.random.random((1000,32)) # 1000 rows, 32 columns
labels = np.random.random((1000,10)) # 1000 rows, 10 columns - random placeholder targets, one column per output neuron

model.fit(data, labels, epochs=10, batch_size=32) # remember that fitting is training 

In [13]:
data.shape #just to take note that np has the .shape attribute too. 

(1000, 32)

tf.keras.Model.fit takes three important arguments:

__epochs:__ Training is structured into epochs. An epoch is one iteration over the entire input data (this is done in smaller batches).

__batch_size:__ When passed NumPy data, the model slices the data into smaller batches and iterates over these batches during training. This integer specifies the size of each batch. Be aware that the last batch may be smaller if the total number of samples is not divisible by the batch size.

__validation_data:__ When prototyping a model, you want to easily monitor its performance on some validation data. Passing this argument—a tuple of inputs and labels—allows the model to display the loss and metrics in inference mode for the passed data, at the end of each epoch.
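
To check my understanding of epochs and batch_size, here is a plain-Python sketch of how 1000 samples split into batches of 32. The helper `iter_batches` is hypothetical, and the sketch ignores the shuffling that fit does by default.

```python
def iter_batches(n_samples, batch_size):
    # Yield (start, stop) index pairs covering the data once, i.e. one epoch.
    for start in range(0, n_samples, batch_size):
        yield start, min(start + batch_size, n_samples)

n_samples, batch_size, epochs = 1000, 32, 10
batches = list(iter_batches(n_samples, batch_size))

print(len(batches))                      # 32 batches per epoch
print(batches[-1][1] - batches[-1][0])   # 8 samples in the last batch
print(len(batches) * epochs)             # 320 batches processed over 10 epochs
```

So batch_size is the number of samples per batch, and the number of batches follows from the dataset size.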

#### Evaluate and predict

In [None]:
model.evaluate(data, labels, batch_size=32) # evaluates in batches of 32 samples each (not 32 times); returns the loss and metric values
model.predict(data, batch_size=32) # predicts in batches of 32 samples each; returns the model's outputs
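
For a softmax classifier, each row of the prediction output holds one probability per class. Here is a NumPy sketch of turning such rows into class labels. The array below is a made-up stand-in for the output of model.predict, not real model output.

```python
import numpy as np

# Made-up stand-in for predict output: 3 samples, 10 class probabilities each.
rng = np.random.default_rng(0)
raw = rng.random((3, 10))
predictions = raw / raw.sum(axis=1, keepdims=True)   # rows sum to 1, like softmax output

predicted_classes = np.argmax(predictions, axis=1)   # most likely class per sample
print(predictions.shape)         # (3, 10)
print(predicted_classes.shape)   # (3,)
```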

#### Building advanced models

The tf.keras.Sequential model is a simple stack of layers that cannot represent arbitrary models. Use the Keras functional API to build complex model topologies such as:

* Multi-input models,
* Multi-output models,
* Models with shared layers (the same layer called several times),
* Models with non-sequential data flows (e.g. residual connections).

Building a model with the functional API works like this:

1. A layer instance is callable: call it on a tensor and it returns a tensor.
2. Input tensors and output tensors are used to define a tf.keras.Model instance.
3. This model is trained just like the Sequential model.


In [None]:
### Creating the model ###

#input
inputs = keras.Input(shape=(32,))
#hidden layers
x = keras.layers.Dense(64, activation='relu')(inputs)
x = keras.layers.Dense(64, activation='relu')(x)
#output
predictions  = keras.layers.Dense(10, activation='softmax')(x)

###Instantiate the model###

model = keras.Model(inputs=inputs, outputs=predictions)

###Compile the model###

model.compile(optimizer=tf.train.RMSPropOptimizer(0.001),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

###Train the model###

model.fit(data, labels, batch_size=32, epochs=5)



What does callable mean?

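1. A callable object is any object that can be called with parentheses, accepting some arguments (also called parameters) and possibly returning an object. A function is the simplest callable object in Python. Classes, and instances of classes that define `__call__`, are callable too, which is why a layer instance can be called on a tensor.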
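
A small plain-Python sketch of a callable object. The `Scale` class is a made-up example, only loosely analogous to how a Keras layer is created first and called afterwards.

```python
class Scale:
    """A tiny callable object: configured once, then called on data."""

    def __init__(self, factor):
        self.factor = factor          # configuration is stored at construction time

    def __call__(self, values):
        # Defining __call__ is what makes instances callable.
        return [self.factor * v for v in values]

double = Scale(2)                     # create the object (compare: Dense(64, ...))
result = double([1, 2, 3])            # call it on data (compare: Dense(64, ...)(inputs))

print(result)                                           # [2, 4, 6]
print(callable(double), callable(len), callable(42))    # True True False
```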

#### Distribution

The Estimators API is used for training models for distributed environments. This targets industry use cases such as distributed training on large datasets that can export a model for production.

A tf.keras.Model can be trained with the tf.estimator API by converting the model to a tf.estimator.Estimator object with tf.keras.estimator.model_to_estimator. See Creating Estimators from Keras models.

In [None]:
model = keras.Sequential([keras.layers.Dense(10, activation='softmax'),
                          keras.layers.Dense(10, activation='softmax')])

model.compile(optimizer=tf.train.RMSPropOptimizer(0.001),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

estimator = keras.estimator.model_to_estimator(model) 
#this converts the model to an estimator so that it can be trained in a distributed environment.