# Keras

In this module we will introduce [Keras](https://keras.io/), a high-level API for Neural Networks.


To be specific
- we will mostly restrict ourselves to the Keras Sequential model
- this will greatly simplify your learning and coding
- it will restrict the type of Deep Learning programs that you can write
    - but this is not a meaningful restriction for the simple programs that you will write in this course

After we introduce the high-level Keras API
- we will review the history of Deep Learning programming to see how we got here
- this will give you greater insight into what Keras does "under the covers"
    - helping you appreciate the history
    - and aiding your debugging
  

**Note**: 

The code snippets in this notebook are *fragments* of a larger [notebook](DNN_TensorFlow_example.ipynb)
- they are illustrative: they will not execute in this notebook, but they will in the complete notebook

Confusion warning:
- There are two similar *but different* packages that implement Keras
    - one built into TensorFlow (the one we will use)
    - a separate project

<div class="alert alert-block alert-warning">
    <b>TL;DR</b> 
    <br>
    <br>
        <b>YES</b><br>
    <ul>
        <li>
            <font face="Courier" color="black" weight="bold">
            import tensorflow as tf<br>
            tf.keras.layers.Dense(...)
            </font>
        </li>
        <li>
            <font face="Courier" color="black" weight="bold">
            from tensorflow import keras<br>
            keras.layers.Dense(...)
            </font>
        </li>
    </ul>
    <br>
        <b>NO</b><br>
    <ul>
    <li> <font face="Courier" color="black" weight="bold">
        import keras<br>
        keras.layers.Dense( ... )
        </font>
        </li>
    </ul>
        </div>
        
If you want to know the details, visit this [notebook](Tensorflow_Keras_Archaeology.ipynb#tensorflow.keras-vs-keras-(Confusion-alert))


# The Keras Sequential Model

**Reference**: [Getting started with the Keras Sequential Model](https://keras.io/getting-started/sequential-model-guide/)

Keras has two programming models
- Sequential
- Functional

We will start with the Sequential model

The Sequential model allows you to build Neural Networks (NN) that are composed of a *sequence* of layers
- just like our cartoon
- a very prevalent paradigm

This will likely be sufficient in your initial studies
- but it restricts the architecture of the Neural Networks that  you can build
- use the Functional API for full generality
    - but it might appear more complicated

Let's jump into some code.

Some old friends, in new clothing:

- A model uses the `Sequential` architecture
- A sequence (implemented as an array) of layers
    - Single element array
    - Consisting of a `Dense` (Fully connected) layer
        - with $1$ output
        - No activation
        - Implements Regression
- Loss is `mse`
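
A minimal sketch of the regression model described above, assuming the Keras that is built into TensorFlow. The names `n_features` and `model_regression` are hypothetical, and the `compile` call (discussed later in this section) appears only to show where the loss is named.

```python
import tensorflow as tf

n_features = 3  # hypothetical: number of features per example

# A single-element list of layers: one Dense unit, no activation.
# The output is a linear function of the input, i.e., Linear Regression.
model_regression = tf.keras.Sequential([
    tf.keras.layers.Dense(1, activation=None, input_shape=(n_features,))
])

# The loss is mean squared error.
model_regression.compile(loss="mse")

# One weight per feature, plus one bias
print(model_regression.count_params())
```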

- A model uses the `Sequential` architecture
- A sequence (implemented as an array) of layers
    - Single element array
    - Consisting of a `Dense` (Fully connected) layer
        - with $1$ output: binary classification
        - sigmoid activation
        - Implements Classification
- Loss is `binary_crossentropy`
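
The binary classification variant might look like the sketch below. It differs from the regression model only in the activation and the loss. The names `n_features`, `model_binary`, and `X_demo` are hypothetical, and the random data exists only to show that the outputs behave like probabilities.

```python
import numpy as np
import tensorflow as tf

n_features = 3  # hypothetical: number of features per example

# Same single Dense unit as before, but with a sigmoid activation:
# the output can be read as the probability of the Positive class.
model_binary = tf.keras.Sequential([
    tf.keras.layers.Dense(1, activation="sigmoid", input_shape=(n_features,))
])
model_binary.compile(loss="binary_crossentropy")

# Made-up examples, just to inspect the output range
X_demo = np.random.default_rng(0).normal(size=(5, n_features)).astype("float32")
probs = model_binary.predict(X_demo, verbose=0)
print(probs.shape)  # one probability per example
```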

<div class="alert alert-block alert-warning">
    <b>TL;DR</b> 
    <br>
    <ul>
        <li>Both examples are a single layer
            <ul>
                <li>Dense, with 1 unit ("neuron")</li>
            </ul>
        </li>
        <li>Regression example
             <ul>
                <li>No activation</li>
                <li>MSE loss</li>
             </ul>
        </li>
        <li>Binary classification example
             <ul>
                <li>Sigmoid activation</li>
                <li>Binary cross entropy loss</li>
             </ul>
        </li>
    </ul>
    
</div>

Hopefully you get the idea.

Let's explore a slightly more complicated model.


- A model uses the `Sequential` architecture
- A sequence (implemented as an array) of layers
    - 3 layers (3 element array)
    - 2 `Dense` layers 
        - with varying number of outputs: `n_hidden_1`,`n_hidden_2`
        - `relu` activation
    - A `Dense` layer implementing Multinomial Classification
        - number of outputs equal to number of classes
        - `softmax` activation


- The first two layers "transform" the input
- The "head" layer implements Multinomial Classification
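
A sketch of the three-layer model just described. The names `n_hidden_1` and `n_hidden_2` come from the text; the concrete sizes, `n_features`, and `n_classes` are assumptions chosen for illustration.

```python
import numpy as np
import tensorflow as tf

# Hypothetical sizes, for illustration only
n_features, n_hidden_1, n_hidden_2, n_classes = 784, 128, 32, 10

model = tf.keras.Sequential([
    # Two hidden layers that "transform" the input
    tf.keras.layers.Dense(n_hidden_1, activation="relu", input_shape=(n_features,)),
    tf.keras.layers.Dense(n_hidden_2, activation="relu"),
    # The "head": one output per class; softmax turns them into probabilities
    tf.keras.layers.Dense(n_classes, activation="softmax"),
])

# Each output row is a probability distribution over the classes
probs = model.predict(np.zeros((2, n_features), dtype="float32"), verbose=0)
print(probs.shape, probs.sum(axis=1))
```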

To use the model, you first need to "compile" it

"Compiling" is quite significant as we will demonstrate later
- For now: *it is where you define the Loss function*
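
A compile step could be sketched as follows. The sizes, the choice of `sparse_categorical_crossentropy` (which assumes the labels are integer class indices rather than one-hot vectors), the `adam` optimizer, and the `accuracy` metric are all assumptions made for this illustration.

```python
import tensorflow as tf

n_features, n_classes = 4, 3  # hypothetical sizes

model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, activation="relu", input_shape=(n_features,)),
    tf.keras.layers.Dense(n_classes, activation="softmax"),
])

# Compiling attaches the Loss function (and an optimizer) to the model.
model.compile(
    loss="sparse_categorical_crossentropy",  # assumes integer class labels
    optimizer="adam",                         # assumed choice of optimizer
    metrics=["accuracy"],                     # reported in addition to the loss
)
```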
    
Next, just as in `sklearn`: you "fit" the model to the training data.
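
A sketch of the fit step on made-up data. The random `X_train` and `y_train`, the layer sizes, and the values of `epochs` and `batch_size` are assumptions, so the resulting losses carry no meaning beyond showing the mechanics.

```python
import numpy as np
import tensorflow as tf

n_features, n_classes = 4, 3  # hypothetical sizes

# Made-up training data: 64 examples with integer class labels
rng = np.random.default_rng(0)
X_train = rng.normal(size=(64, n_features)).astype("float32")
y_train = rng.integers(0, n_classes, size=64)

model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, activation="relu", input_shape=(n_features,)),
    tf.keras.layers.Dense(n_classes, activation="softmax"),
])
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")

# Like sklearn's fit(X, y), but with training-loop controls
history = model.fit(X_train, y_train, epochs=2, batch_size=16, verbose=0)

# fit returns a History object: one loss value per epoch
print(history.history["loss"])
```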

Once the model is fit, you can predict, just like `sklearn`.

Here we evaluate the model on the Test dataset.
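
Prediction and evaluation can be sketched together. The data below is random, and the names `X_test`, `y_test`, and the layer sizes are hypothetical, so the reported accuracy is meaningless; only the shape of the calls matters.

```python
import numpy as np
import tensorflow as tf

n_features, n_classes = 4, 3  # hypothetical sizes

rng = np.random.default_rng(0)
X_train = rng.normal(size=(64, n_features)).astype("float32")
y_train = rng.integers(0, n_classes, size=64)
X_test = rng.normal(size=(16, n_features)).astype("float32")
y_test = rng.integers(0, n_classes, size=16)

model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, activation="relu", input_shape=(n_features,)),
    tf.keras.layers.Dense(n_classes, activation="softmax"),
])
model.compile(loss="sparse_categorical_crossentropy",
              optimizer="adam", metrics=["accuracy"])
model.fit(X_train, y_train, epochs=1, verbose=0)

# predict: one row of class probabilities per example
probs = model.predict(X_test, verbose=0)
predicted_class = probs.argmax(axis=1)

# evaluate: the loss, followed by each metric named in compile
test_loss, test_accuracy = model.evaluate(X_test, y_test, verbose=0)
print(test_loss, test_accuracy)
```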

The idea is quite simple
- Keras Sequential implements an `sklearn`-like API
    - define a model
    - fit the model
    - predict


We have glossed over a lot of details
- What does each layer do ?
- Why do we need to "compile" ?
    - and why does it need an optimizer ?

We will explain some of these details as part of our explanation of the code in our first example notebook (last section of this module).

## Sequential model summary
- A Sequential model consists of an array of layers
- The array may be constructed in several ways
    - By passing it to the `Sequential` constructor
    - By appending one layer at a time to an existing model
    
    `model.add( Dense(10, activation="relu") )`
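
The two construction styles can be compared side by side. This is a sketch with hypothetical layer sizes; both models end up with the same layers and the same number of weights.

```python
import tensorflow as tf
from tensorflow.keras.layers import Dense

n_features = 4  # hypothetical: number of features per example

# Style 1: pass the whole list to the constructor
model_a = tf.keras.Sequential([
    Dense(10, activation="relu", input_shape=(n_features,)),
    Dense(1),
])

# Style 2: start empty and append one layer at a time
model_b = tf.keras.Sequential()
model_b.add(Dense(10, activation="relu", input_shape=(n_features,)))
model_b.add(Dense(1))

print(len(model_a.layers), len(model_b.layers))
print(model_a.count_params(), model_b.count_params())
```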

The first layer of a Sequential model is special.

**Best practice** is to provide an indication of the size of an example as part of the first layer's definition
- Usually via the `input_shape` argument (though some layers take a differently named argument)

    `Sequential( [ Dense(10, activation="relu", input_shape=X[0].shape) ] )`
    
    - where `X` is the matrix of training examples
        - so `X[0].shape` is the shape of the first (and hopefully each) example
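
A sketch of this practice, where `X` is a made-up matrix of 100 examples with 8 features each. Because the input shape is known up front, the model can be inspected immediately.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Dense

# Hypothetical training examples: 100 rows, 8 features per row
X = np.zeros((100, 8), dtype="float32")

# X[0].shape is (8,): the shape of a single example, without the batch dimension
model = tf.keras.Sequential([
    Dense(10, activation="relu", input_shape=X[0].shape)
])

# The weights already exist, so the model can be examined right away
model.summary()
```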

Specifying the input shape is **not** required (but it is still a good idea).  If you omit it
- you can't examine the model, e.g., with

    `model.summary()`
- until **after** you first call the model on data
    - the shape of the input will be inferred from the first example with which the model is called
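
The deferred behavior can be observed directly. In this sketch (the names `first_layer` and `X` and all sizes are hypothetical), the first layer has no weights until the model is called on data.

```python
import numpy as np
import tensorflow as tf

# Hypothetical training examples: 100 rows, 8 features per row
X = np.zeros((100, 8), dtype="float32")

# No input shape is given anywhere
first_layer = tf.keras.layers.Dense(10, activation="relu")
model = tf.keras.Sequential([first_layer, tf.keras.layers.Dense(1)])

# Before the first call, the layer has not created any weights
n_weights_before = len(first_layer.weights)

# Calling the model on data lets it infer the input shape
_ = model(X[:1])

# Now the weight matrix exists: (features per example, units)
kernel_shape = tuple(first_layer.kernel.shape)
print(n_weights_before, kernel_shape)

model.summary()  # the weights now exist, so there is something to report
```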

**Technical note**

The number of weights of the first layer depends on the shape of an input example.
- If you provide the shape: it is known immediately
- If you do not provide the shape, it will not be known until the model is first run
    - inferred from first example

The number of weights of layers subsequent to the first can be inferred from the previous layer.
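
The arithmetic can be made concrete with a small helper. `dense_param_counts` is a hypothetical function, not part of Keras; it assumes every layer is `Dense` with a bias, so each layer has one weight per (input, unit) pair plus one bias per unit.

```python
def dense_param_counts(n_inputs, layer_units):
    """Weights per layer for a stack of Dense layers (with biases).

    n_inputs:    number of features in one input example
    layer_units: number of units in each successive Dense layer
    """
    counts = []
    for units in layer_units:
        # one weight per (input, unit) pair, plus one bias per unit
        counts.append(n_inputs * units + units)
        # this layer's outputs are the next layer's inputs
        n_inputs = units
    return counts


# Only the first count depends on the shape of an input example
counts = dense_param_counts(784, [128, 32, 10])
print(counts)       # [100480, 4128, 330]
print(sum(counts))  # 104938
```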

# The Keras Functional Model

- More verbose than `Sequential`
- Also more flexible
    - you can define more complex computation graphs (multiple inputs/outputs, shared layers)
    

Highlights:
- Manually invoke a single layer at a time
    - Passing as input the output of the prior layer.

- You must define an `Input` layer (placeholder for the input/define its shape)
    - by contrast, `Sequential` uses the `input_shape=` parameter to the first layer
- You "wrap" the graph into a "model" by a `Model` statement
    - looks like a function definition
        - names the input and output formal parameters
    - a `Model` acts just like a layer (but with internals that you create)
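
A sketch of a three-layer classifier written with the Functional API. All sizes are hypothetical, and the architecture is one that `Sequential` could also express; it is used here only so that the two styles can be compared.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Dense

# Hypothetical sizes, for illustration only
n_features, n_hidden_1, n_hidden_2, n_classes = 784, 128, 32, 10

# The Input layer is a placeholder that defines the shape of one example
inputs = tf.keras.Input(shape=(n_features,))

# Invoke one layer at a time, passing in the output of the prior layer
x = Dense(n_hidden_1, activation="relu")(inputs)
x = Dense(n_hidden_2, activation="relu")(x)
outputs = Dense(n_classes, activation="softmax")(x)

# "Wrap" the graph into a model by naming its inputs and outputs
model = tf.keras.Model(inputs=inputs, outputs=outputs)

probs = model.predict(np.zeros((2, n_features), dtype="float32"), verbose=0)
print(probs.shape, model.count_params())
```

Once wrapped, the model is compiled, fit, and used for prediction in the same way as a Sequential model.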

As a beginner, you will probably exclusively use the Sequential model.

Keep the Functional API in the back of your mind.

# Let's code !


Let's see a working notebook.

Two options
- Run on your local machine: [DNN TensorFlow example notebook](DNN_TensorFlow_example.ipynb) (local)
    - TensorFlow version 2+ only !
- Run on Google Colab: [DNN TensorFlow example notebook from GitHub](https://colab.research.google.com/github/kenperry-public/ML_Spring_2024/blob/master/DNN_TensorFlow_example.ipynb) (**Colab**)

<!--- #include (DNN_TensorFlow_example.ipynb) --->

