### Part 1: Introduction to Keras and its features
- Keras is a high-level API for building and training deep learning models
- It is easy to use, flexible, and allows for rapid prototyping of models
- Keras is built on top of the TensorFlow library

#### Keras History and Current Status
- Keras was developed by Francois Chollet in 2015 as an open-source software library for building neural networks
- In 2017, Keras was integrated into TensorFlow (tensorflow.keras), and it became TensorFlow's default high-level API for building neural networks with TensorFlow 2.0
- As of 2021, Keras is widely used in the deep learning community and has a large and active developer community

#### Features of Keras
- Keras supports a variety of neural network architectures including feedforward, convolutional, and recurrent neural networks. It allows for easy customization of models through the use of layers, activation functions, and loss functions
- Keras also includes a range of pre-built models and datasets for common tasks such as image classification and natural language processing

#### Some of the key features of TensorFlow/Keras include:
- Easy model building and prototyping using tensorflow.keras
- Efficient computation on both CPUs and GPUs
- Built-in support for distributed training
- Large community with active development and support
- Easy deployment of models on mobile and embedded devices

#### Difference between TensorFlow and Keras
TensorFlow and Keras are often used together to build deep learning models.
1. TensorFlow is a low-level open-source library developed by the Google Brain team that provides a wide range of tools for building and deploying machine learning models. It is highly customizable and allows for fine-grained control over the training process.
2. Keras, on the other hand, is a high-level API that sits on top of TensorFlow. It provides a simpler, easier-to-use interface for building and training deep learning models, and is designed for building models quickly and efficiently.

The combination of the two libraries allows for rapid prototyping of deep learning models while still having access to the full power of TensorFlow. In this course, we will mainly be using Keras to build and train our models.

### Part 2: Installation and setup
TensorFlow/Keras can be installed using pip or conda. It is recommended to use a virtual environment to avoid conflicts with other libraries.

1. Installing TensorFlow/Keras using pip or conda
```bash
# Using pip
pip install tensorflow

# Or, alternatively, using conda
conda install tensorflow
```

2. If you want to install TensorFlow with GPU support, I recommend using the conda command
```bash
conda install tensorflow-gpu
```
which will automatically install the GPU version of TensorFlow along with the necessary dependencies (such as the CUDA toolkit and cuDNN).

3. Verify version and GPU availability
```python
import tensorflow as tf
print(tf.__version__)
print(tf.test.gpu_device_name())
```

### Part 3: Introduction to Tensors and Variables in TensorFlow
Tensors are the basic building blocks in TensorFlow/Keras, and can be thought of as multidimensional arrays or matrices. They are the data structures used to store and manipulate data in TensorFlow/Keras, and can be used to represent a wide range of data types such as images, audio, text, and numerical data.

Variables, on the other hand, are a specific type of tensor that can be modified during the course of training. They are used to store the weights and biases of a model, and are updated during the training process using an optimization algorithm such as Stochastic Gradient Descent.

In TensorFlow, tensors can be created using the tf.constant() method or the tf.Variable() method.
```python
import tensorflow as tf

# Create a tensor with a constant value of 5
a = tf.constant(5)
print(a)
```

Output:
```python
tf.Tensor(5, shape=(), dtype=int32)
```

Creating a variable:
```python
# Create a variable with an initial value of 5
b = tf.Variable(5)
print(b)
```

Output:
```python
<tf.Variable 'Variable:0' shape=() dtype=int32, numpy=5>
```
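
To make the difference between a constant tensor and a variable concrete, the following minimal sketch changes a variable's value in place using its `assign` and `assign_add` methods. The name `v` and the values are only illustrative.

```python
import tensorflow as tf

# A variable holds state that can be changed after creation
v = tf.Variable(5)

v.assign(7)       # overwrite the stored value
v.assign_add(3)   # add to the stored value in place

result = int(v.numpy())
print(result)
```

A tensor created with `tf.constant()` has no such methods: its value is fixed once it is created.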

It is also possible to create tensors from NumPy arrays using the tf.convert_to_tensor() method.
```python
import numpy as np

# Create a tensor from a NumPy array
c = tf.convert_to_tensor(np.array([1, 2, 3]))
print(c)
```

Output:
```python
tf.Tensor([1 2 3], shape=(3,), dtype=int64)
```

#### Tensor Operations
As in NumPy or PyTorch, TensorFlow/Keras supports a wide range of tensor operations such as addition, subtraction, multiplication, division, and matrix multiplication. These operations can be performed using the tf.add(), tf.subtract(), tf.multiply(), tf.divide(), and tf.matmul() methods respectively.

```python
# Create two scalar tensors
a = tf.constant(5)
b = tf.constant(10)

# Basic element-wise arithmetic operations on tensors
c = tf.add(a, b)
d = tf.subtract(a, b)
e = tf.multiply(a, b)
f = tf.divide(a, b)

# Matrix multiplication requires tensors of rank 2 or higher
m1 = tf.constant([[1, 2], [3, 4]])
m2 = tf.constant([[5, 6], [7, 8]])
g = tf.matmul(m1, m2)
```

Other operations such as reshaping and concatenation are also available. Most of them work the same way as in PyTorch, so refer to the PyTorch course for details.

```python
# Convert a numpy array to a tensor
numpy_array = np.array([[1, 2, 3], [4, 5, 6]])
tensor_b = tf.convert_to_tensor(numpy_array)

# Reshape a tensor from shape (2, 3) to shape (3, 2)
tensor_c = tf.reshape(tensor_b, [3, 2])

# Concatenate two tensors along the first axis
# (they must have the same dtype and the same size on the other axes)
tensor_a = tf.convert_to_tensor(np.array([[7, 8]]))
tensor_d = tf.concat([tensor_a, tensor_c], axis=0)
```

### Part 4: Gradients and Variable Updates
Variables can be updated during training using Keras optimizers such as tf.keras.optimizers.Adam() and tf.keras.optimizers.SGD(). For example:
```python
import tensorflow as tf
import numpy as np

# Create a variable
initial_value = np.random.randn(3, 4).astype("float32")
variable_a = tf.Variable(initial_value)

# Placeholder loss, used only for illustration: the sum of squared entries
def some_function(v):
    return tf.reduce_sum(tf.square(v))

# Update the variable
optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)
with tf.GradientTape() as tape:
    # Compute the loss
    loss = some_function(variable_a)
# Compute the gradient of the loss with respect to the variable
gradients = tape.gradient(loss, variable_a)
# apply_gradients expects an iterable of (gradient, variable) pairs
optimizer.apply_gradients([(gradients, variable_a)])
```
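
As a rough illustration of what an optimizer update does, the sketch below applies the plain gradient descent rule `w <- w - learning_rate * gradient` by hand, without TensorFlow. The loss `(w - 3)^2`, the learning rate, and the number of steps are assumptions chosen for the example; Adam uses a more elaborate update rule, but the overall loop has the same structure.

```python
# Hypothetical one-dimensional loss with its minimum at w = 3
def loss_fn(w):
    return (w - 3.0) ** 2

# Analytical gradient of the loss with respect to w
def grad_fn(w):
    return 2.0 * (w - 3.0)

w = 0.0
learning_rate = 0.1
for step in range(100):
    # Move w a small step against the gradient
    w = w - learning_rate * grad_fn(w)

print(round(w, 4))
```

In TensorFlow, `tf.GradientTape` plays the role of `grad_fn` and the optimizer performs the update.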

**Important note** The previous code may seem a bit confusing at first. However, we will not be using this method to update variables in this course. Instead, we will use the Keras API to build and train our models; this method is only presented here for completeness. Note that:
- Keras allows training models with far fewer lines of code.
- Keras accepts NumPy arrays as model inputs, so we don't have to convert them to tensors. It is still worth knowing how tensors work.

#### Using Keras backend to access low-level operations
Keras provides a backend module (keras.backend) that gives the user access to low-level TensorFlow operations. This is useful when creating custom layers or loss functions. Because backend operations are regular differentiable tensor operations, gradients can still be computed through them automatically during backpropagation.

One example where it may be necessary to use the backend is when working with custom loss functions. For instance, let's say we have a custom loss function that requires a mathematical operation that is not directly supported by Keras. In this case, we can use the backend to perform the operation and still be able to compute gradients for backpropagation.
```python
import tensorflow as tf
from tensorflow.keras import backend as K

# Custom loss function using low-level TensorFlow operations
def custom_loss(y_true, y_pred):
    return K.sum(tf.square(y_true - y_pred))
```

### Part 5: Building a feedforward neural network with dense layers
To build a feedforward neural network with dense layers in Keras, we first need to import the necessary modules and functions:
```python
from tensorflow import keras
from tensorflow.keras import layers
```

Then, we can define the model by creating a Sequential object and adding layers to it using the add method. Here's an example of a feedforward neural network with one hidden layer:
```python
# Define the model
model = keras.Sequential()
model.add(layers.Dense(64, activation='relu', input_shape=(784,)))
model.add(layers.Dense(10, activation='softmax'))
```

It is also possible to pass a list of layers to the Sequential constructor:
```python
model = keras.Sequential([
    layers.Dense(64, activation='relu', input_shape=(784,)),
    layers.Dense(10, activation='softmax')
])
```

In this example, we define a model with two layers: a Dense layer with 64 units and ReLU activation function as the hidden layer, and a Dense layer with 10 units and softmax activation function as the output layer. The input_shape argument specifies the shape of the input data.
- The Dense layer is equivalent to nn.Linear in PyTorch: it represents a fully connected layer.

#### Activation functions
Activation functions are essential components of a neural network because they allow the neural network to model more complex non-linear functions. Keras provides a variety of activation functions. For instance, we saw in the previous example the ReLU and softmax activation functions:
- ReLU stands for Rectified Linear Unit and is defined as ``max(x, 0)``. It is widely used, notably in computer vision models.
- Softmax, defined as ``exp(x_i) / sum(exp(x))``, is an activation function that is used in the output layer of classification models. It is used to convert the output of the model to a probability distribution over the predicted output classes.
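
The two definitions above can be checked with a few lines of plain Python. This is only a sketch of the formulas, not how Keras implements them; the function names and sample values are chosen for illustration.

```python
import math

def relu(x):
    # max(x, 0): negative inputs are clipped to zero
    return max(x, 0.0)

def softmax(values):
    # Subtracting the maximum avoids overflow in exp(); the result is unchanged
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

relu_out = [relu(v) for v in [-2.0, 0.0, 3.5]]
probs = softmax([1.0, 2.0, 3.0])
print(relu_out)
print(probs)
```

The softmax outputs are all positive and sum to 1, which is why they can be read as class probabilities.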

In the previous example, we set the activation function by passing it as a string to the activation argument. However, we can also pass the activation function as a callable. Keras provides multiple activation functions in keras.activations, for example:
```python
model.add(layers.Dense(64, activation=keras.activations.relu))
```

For some activation functions, we can also use them as layers. For example, ReLU is available as the keras.layers.ReLU layer. This allows us to set some hyperparameters of the activation function:
```python
model.add(layers.Dense(64))
model.add(layers.ReLU(negative_slope=0.01))
```
- Keras provides a whole set of activations. Some of them can be specified by a simple string, such as "tanh" or "sigmoid"; for more complex functions, it is necessary to create the object first and then pass it to the layer.

**Note** When no activation is specified, the layer uses the linear (identity) activation by default.

#### Compiling the model
After defining the model, we need to compile it before training it. To compile the model, we need to specify the loss function, the optimizer, and the metrics to monitor during training. For example:
```python
model.compile(
    loss='categorical_crossentropy',
    optimizer='adam',
    metrics=['accuracy'],
)
```

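Similar to the activation functions, we can also pass the loss function, optimizer, and metrics as objects. Note that this example uses the sparse variants, which expect integer class labels, whereas 'categorical_crossentropy' expects one-hot encoded labels: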
```python
model.compile(
    loss=keras.losses.SparseCategoricalCrossentropy(),
    optimizer=keras.optimizers.Adam(),
    metrics=[keras.metrics.SparseCategoricalAccuracy()],
)
```
- Compiling the model configures it for training by attaching the loss function, the optimizer, and the metrics.
- It is also possible to pass multiple metrics to the metrics argument, for example, metrics=['accuracy', 'mse'].
- Similarly, if we have multiple outputs, we can pass multiple loss functions to the loss argument, for example, loss=['categorical_crossentropy', 'mse'] (We will see next how we can create models with multiple outputs and inputs).
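
The two compile examples above use two variants of the cross-entropy loss that differ only in how the labels are encoded. The plain Python sketch below computes both for a single sample to show that they give the same value. The helper names and numbers are hypothetical, and the Keras implementations also handle batching and numerical stability, which this sketch ignores.

```python
import math

# Predicted class probabilities for one sample with 3 classes
y_pred = [0.1, 0.7, 0.2]

# The same label in two encodings
y_true_one_hot = [0, 1, 0]   # expected by categorical cross-entropy
y_true_index = 1             # expected by sparse categorical cross-entropy

def categorical_ce(one_hot, probs):
    # Only the term of the true class contributes to the sum
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs))

def sparse_categorical_ce(index, probs):
    # Directly pick the probability of the true class
    return -math.log(probs[index])

loss_a = categorical_ce(y_true_one_hot, y_pred)
loss_b = sparse_categorical_ce(y_true_index, y_pred)
print(loss_a, loss_b)
```

In practice, choose the variant that matches how your labels are stored, so that no extra conversion is needed.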

**Note** It is also possible to print a summary of the model, which contains detailed information about each layer and the number of parameters in the model. Doing this after defining the model is recommended as a sanity check:
```python
model.summary()
```
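
As a sanity check on the parameter counts reported by the summary, the number of parameters of a Dense layer can be computed by hand. The sketch below does this for the 784 -> 64 -> 10 network defined earlier; `dense_params` is a hypothetical helper, not a Keras function.

```python
def dense_params(n_inputs, n_units):
    # One weight per (input, unit) pair plus one bias per unit
    return n_inputs * n_units + n_units

hidden = dense_params(784, 64)   # hidden Dense layer
output = dense_params(64, 10)    # output Dense layer
total = hidden + output
print(hidden, output, total)
```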

#### Training the model
After compiling the model, we can train it using the fit method:
```python
history = model.fit(x_train, y_train,
                    batch_size=32,
                    epochs=10,
                    verbose=1,
                    validation_data=(x_val, y_val))
```
- The fit method takes the training data (x_train as inputs and y_train as targets), the batch size, the number of epochs, and the validation data, if any.
- It returns a History object that contains the training history of the model.
- Its history attribute is a dictionary containing the loss and metric values measured at each epoch during training and validation.
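
The sketch below trains a tiny model on random data, only to show what the returned History object contains. The dataset, layer sizes, and number of epochs are assumptions made for the example; with random labels the model learns nothing useful.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Small random dataset, only to make the example run (not real data)
rng = np.random.default_rng(0)
x = rng.random((64, 8)).astype("float32")
y = rng.integers(0, 3, size=(64,))

model = keras.Sequential([
    keras.Input(shape=(8,)),
    layers.Dense(16, activation="relu"),
    layers.Dense(3, activation="softmax"),
])
model.compile(loss="sparse_categorical_crossentropy",
              optimizer="adam",
              metrics=["accuracy"])

history = model.fit(x, y, epochs=3, batch_size=16,
                    validation_split=0.25, verbose=0)

# history.history maps each metric name to a list with one value per epoch
print(sorted(history.history.keys()))
print(len(history.history["loss"]))
```

These per-epoch lists are what you would typically plot to compare training and validation curves.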

**Important note** Unlike PyTorch, where we had to create DataLoaders for the train and validation sets, in Keras we can directly pass NumPy arrays containing our data. For instance, x_train and y_train are NumPy arrays, and Keras takes care of turning them into tensors.

#### Evaluating the model
After training the model, we can evaluate it on the test set using the evaluate method:
```python
test_loss, test_accuracy = model.evaluate(x_test, y_test, verbose=2)
```
The evaluate method returns a list containing the loss and metric values for the test set.

#### Making predictions
After training the model, we can use it to make predictions on new data using the predict method (single examples or batches of examples):
```python
predictions = model.predict(x_test)
```
``predictions`` will contain the outputs of the model. If the model has multiple outputs, then ``predictions`` will be a list.

Regarding the previous example where the output was a probability distribution over the predicted output classes, we can get the predicted class by taking the argmax of the output:
```python
import numpy as np

# Get the class probabilities for the test images
probs = model.predict(x_test)

# Get the predicted class labels
y_pred = np.argmax(probs, axis=1)

# Print the first 10 predicted labels
print(y_pred[:10])
```

### Part 6: Functional API
- The Sequential API is very convenient for building simple models. However, it is limited in that it does not allow creating models with multiple inputs and outputs. For instance, if we want to create a model with two inputs and one output, we can't use the Sequential API. Instead, we need to use the Functional API.
- The Functional API offers more flexibility and freedom when constructing architectures.
- The Functional API consists of creating the different components of the model independently and connecting them together. For instance, the previous example using Sequential can be rewritten using the Functional API as follows:
```python
# Define the input layer
inputs = keras.Input(shape=(784,))
# Define the hidden layer
x = layers.Dense(64, activation='relu')(inputs)
# Define the output layer
outputs = layers.Dense(10, activation='softmax')(x)

# Create the model
model = keras.Model(inputs=inputs, outputs=outputs)
```

- Note how each layer is connected to the previous one by calling it on the previous output, using the syntax ``x_i = layer_i(x_{i-1})``. ``layer_i`` can be created in place or assigned to a variable first.
- At the end, the keras.Model() constructor takes the inputs and outputs to create the model.
- The final model has exactly the same functionalities as the Sequential model, we can compile it and train it the same way.

We can also use Sequential models as part of functional models, for instance:
```python
# Define the input layer
inputs = keras.Input(shape=(784,))

# Define the hidden layers
hiddens = keras.Sequential([
    layers.Dense(64, activation='relu'),
    layers.Dense(32, activation='relu'),
])

h = hiddens(inputs)

# Define the output layer
outputs = layers.Dense(10, activation='softmax')(h)

# Create the model
model = keras.Model(inputs=inputs, outputs=outputs)
```

#### Multiple inputs and outputs (With multiple loss functions)
- The Functional API allows creating models with multiple inputs and outputs. For instance, if we want to create a model with two inputs and two outputs, we can do it as follows:
```python
# Define the first input
inputs1 = keras.Input(shape=(784,))
# Define the second input
inputs2 = keras.Input(shape=(784,))
# Concatenate the two inputs
x = layers.concatenate([inputs1, inputs2])
# Define the hidden layer
x = layers.Dense(64, activation='relu')(x)
# Define the first output
outputs1 = layers.Dense(10, activation='softmax')(x)
# Define the second output
outputs2 = layers.Dense(784, activation='sigmoid')(x)

# Create the model
model = keras.Model(inputs=[inputs1, inputs2], outputs=[outputs1, outputs2])

# Compile the model
model.compile(
    loss=['categorical_crossentropy', 'mse'],
    optimizer='adam',
    metrics=['accuracy'],
)
```

In this example, we have two inputs and two outputs. The first output is a classification output with 10 classes, and the second output is a reconstruction output with 784 pixels. We set 2 loss functions, one for each output. By default, Keras will optimize the sum of the two losses, however, we can also set different weights for each loss function using the loss_weights argument:
```python
model.compile(
    loss=['categorical_crossentropy', 'mse'],
    loss_weights=[1.0, 0.5],
    optimizer='adam',
    metrics=['accuracy'],
)
```
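
To make the effect of loss_weights concrete, the plain Python sketch below combines two per-output loss values, first with the default unweighted sum and then with the weights used above. The loss values are hypothetical numbers, not results from a real model.

```python
# Hypothetical loss values for the classification and reconstruction outputs
loss_values = [0.80, 0.30]
loss_weights = [1.0, 0.5]

# Default behaviour: plain sum of the losses
total_default = sum(loss_values)

# With loss_weights: each loss is scaled before summing
total_weighted = sum(w * l for w, l in zip(loss_weights, loss_values))

print(total_default, total_weighted)
```

Lowering a weight reduces how much that output's loss influences the gradients during training.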

### Part 7: Custom layers
- Keras allows creating custom layers by subclassing the Layer class. For instance, if we want to create a custom layer that adds a bias to the input, we can do it as follows:
```python
class BiasLayer(keras.layers.Layer):
    def __init__(self, units=32, **kwargs):
        super(BiasLayer, self).__init__(**kwargs)
        self.units = units

    def build(self, input_shape):
        self.bias = self.add_weight(
            shape=(self.units,),
            initializer='zeros',
            trainable=True,
        )

    def call(self, inputs):
        return inputs + self.bias
```

- The BiasLayer class inherits from the Layer class and overrides the __init__, build and call methods.
- The __init__ method is used to initialize the layer, it takes the layer parameters as arguments.
- The build method is used to create the layer weights, it takes the input_shape as argument.
- The call method is used to compute the layer output, it takes the inputs as argument (Equivalent to the forward() method in PyTorch).
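
One detail worth seeing in action is that build is not called when the layer is created, but the first time the layer is called on an input. The sketch below repeats the BiasLayer definition so that it is self-contained, with one deliberate change: the bias is initialized to ones (an assumption for the example) so that its effect is visible in the output.

```python
import tensorflow as tf
from tensorflow import keras

class BiasLayer(keras.layers.Layer):
    def __init__(self, units=32, **kwargs):
        super().__init__(**kwargs)
        self.units = units

    def build(self, input_shape):
        # 'ones' instead of 'zeros' so that the added bias shows in the output
        self.bias = self.add_weight(
            shape=(self.units,),
            initializer="ones",
            trainable=True,
        )

    def call(self, inputs):
        return inputs + self.bias

layer = BiasLayer(4)
built_before = layer.built          # no weights have been created yet
outputs = layer(tf.zeros((2, 4)))   # the first call triggers build()
built_after = layer.built
print(built_before, built_after)
print(outputs.numpy())
```

Creating weights lazily in build lets a layer adapt its weight shapes to the input it actually receives.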

The BiasLayer class can be used as any other layer in Keras:
```python
# Create the model
model = keras.Sequential([
    layers.Dense(64, activation='relu'),
    BiasLayer(64),
    layers.Dense(10, activation='softmax'),
])
```

### Part 8: Custom loss functions
- Keras allows creating custom loss functions by subclassing the Loss class. For instance, if we want to create a custom loss function that computes the mean squared error between the true and predicted outputs, we can do it as follows:
```python
class MSE(keras.losses.Loss):
    def __init__(self, **kwargs):
        super(MSE, self).__init__(**kwargs)

    def call(self, y_true, y_pred):
        return tf.reduce_mean(tf.square(y_true - y_pred))
```

- The MSE class inherits from the Loss class and overrides the __init__ and call methods.
- The __init__ method is used to initialize internal parameters to the loss.
- The call method is used to compute the loss, it takes the true and predicted outputs as arguments.

**Note** The call method always takes the true and predicted outputs as arguments, even if the loss function does not use the true outputs (e.g. in the case of a GAN).

**Note2** Make sure to compute the loss with TensorFlow or Keras backend operations (rather than, for example, NumPy functions). Otherwise the loss will not be differentiable, because the computation graph is broken at that point.

The new loss function can be used as any other loss function in Keras:
```python
# Compile the model
model.compile(
    loss=MSE(),
    optimizer='adam',
    metrics=['accuracy'],
)
```

- Similarly, it is also possible to create custom metrics by subclassing the Metric class. However, we'll leave this part to the reader.

### Part 9: Keras custom data generators
- Keras allows creating custom data generators by subclassing the Sequence class. This can be helpful when loading and processing data on the fly, for instance when the data is too large to be loaded into memory all at once. Here's an example:
```python
import math
from tensorflow import keras

class CustomDataGenerator(keras.utils.Sequence):
    def __init__(self, x_set, y_set, batch_size=32, **kwargs):
        super().__init__(**kwargs)
        self.x, self.y = x_set, y_set
        self.batch_size = batch_size

    def __len__(self):
        return math.ceil(len(self.x) / self.batch_size)

    def __getitem__(self, idx):
        batch_x = self.x[idx * self.batch_size:(idx + 1) * self.batch_size]
        batch_y = self.y[idx * self.batch_size:(idx + 1) * self.batch_size]

        # you can preprocess your data at this point

        return batch_x, batch_y

    def on_epoch_end(self):
        # you can shuffle your data or update the dataset at this point
        # this function is optional
        pass
```

- The CustomDataGenerator class inherits from the Sequence class and overrides the __init__, __len__ and __getitem__ methods, as well as the optional on_epoch_end method.
- The __init__ method is used to initialize the data generator, it takes the data and batch_size as arguments.
- The __len__ method is used to compute the number of batches in an epoch, it takes no arguments.
- The __getitem__ method is used to load and process a batch of data, it takes the batch index as argument.
- The on_epoch_end method is called at the end of each epoch, it is optional and takes no arguments.
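
The batching arithmetic used in __len__ and __getitem__ can be checked on a small example without Keras. In the sketch below, a list of 10 integers stands in for the dataset and the batch size of 4 is an arbitrary choice.

```python
import math

data = list(range(10))   # stand-in for a dataset of 10 samples
batch_size = 4

# Same formula as in __len__: round up so the last partial batch is kept
num_batches = math.ceil(len(data) / batch_size)

# Same slicing as in __getitem__, applied to every batch index
batches = [data[i * batch_size:(i + 1) * batch_size]
           for i in range(num_batches)]

print(num_batches)
print(batches)
```

Note that the last batch is smaller than the others when the dataset size is not a multiple of the batch size.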

**Note** If you have enough memory, it is recommended to process all your data inside the __init__ method and store the processed data in memory. Avoid heavy preprocessing during training, as it can slow training down.

The new data generator can be used as any other data generator in Keras:
```python
# Create the data generator
train_generator = CustomDataGenerator(x_train, y_train, batch_size=32)
validation_generator = CustomDataGenerator(x_val, y_val, batch_size=32)

# Train the model
model.fit(
    train_generator,
    validation_data=validation_generator,
    epochs=10,
    # ... other arguments
)
```
