## (a)
*In computational graphs, we say that the "+" gate is a gradient distributor. Why ? How can we qualify the multiplication gate and the max gate ?*

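An addition gate is a gradient distributor because its local derivative with respect to each input is 1, so during backpropagation the upstream gradient is passed on unchanged to every input:

$$
\frac{\partial}{\partial x} (x + y) = 1 \quad \frac{\partial}{\partial y} (x + y) = 1
$$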

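The multiplication gate can be called a gradient switcher, because the local gradient with respect to one input is the value of the other input:

$$
\frac{\partial}{\partial x} (x\cdot y) = y \quad \frac{\partial}{\partial y} (x\cdot y) = x
$$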

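The max gate can be called a gradient router, because the upstream gradient is passed entirely to the larger input while the other input receives zero:

$$
\frac{\partial}{\partial x} \max(x, y) = 1 \quad \frac{\partial}{\partial y} \max(x, y) = 0 \quad \text{assuming that } x > y
$$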
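The behaviour of the three gates can be checked numerically. The following is a minimal sketch in plain Python; the function names and the tie-breaking rule of the max gate (ties go to `x`) are assumptions made for illustration only.

```python
def add_backward(upstream, x, y):
    # Local derivatives are both 1: the upstream gradient is distributed as-is.
    return upstream, upstream


def mul_backward(upstream, x, y):
    # Local derivatives are the swapped inputs: d/dx = y, d/dy = x.
    return upstream * y, upstream * x


def max_backward(upstream, x, y):
    # The gradient is routed to the larger input (assumption: ties go to x).
    return (upstream, 0.0) if x >= y else (0.0, upstream)


upstream = 2.0
x, y = 3.0, -4.0

add_grads = add_backward(upstream, x, y)  # (2.0, 2.0)
mul_grads = mul_backward(upstream, x, y)  # (-8.0, 6.0)
max_grads = max_backward(upstream, x, y)  # (2.0, 0.0)
print(add_grads, mul_grads, max_grads)
```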

## (b)
*Give 4 advantages of using computational graphs for learning strategies.*
 - The gradient calculation is implemented once per elementary operation and can then be reused to differentiate arbitrarily complex graphs.
 - Backpropagation of gradient is intuitive.
 - Graphs can be compiled and hence optimized.
 - Nodes of a graph can be implemented as classes and therefore can easily be adapted or extended.
 - Custom nodes can be composed of several atomic operations, or can be factorized in order to reduce computational complexity.
 - The loss function, gradient calculation and weight update can be included in the graph. Therefore only a minimal data flow between the GPU memory and the CPU memory is required.
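To illustrate the points about granular gradients and nodes as classes, here is a minimal sketch of a computational graph in plain Python. The class names `Value`, `Add` and `Mul` are hypothetical and do not come from any library; the recursive `backward` is kept deliberately simple and revisits shared sub-graphs, which a real framework would avoid.

```python
class Value:
    """Leaf node: holds a number and accumulates its gradient."""

    def __init__(self, data):
        self.data = data
        self.grad = 0.0

    def backward(self, upstream):
        self.grad += upstream


class Add:
    """Addition node: distributes the upstream gradient to both inputs."""

    def __init__(self, a, b):
        self.a, self.b = a, b
        self.data = a.data + b.data

    def backward(self, upstream):
        self.a.backward(upstream)
        self.b.backward(upstream)


class Mul:
    """Multiplication node: scales the upstream gradient by the other input."""

    def __init__(self, a, b):
        self.a, self.b = a, b
        self.data = a.data * b.data

    def backward(self, upstream):
        self.a.backward(upstream * self.b.data)
        self.b.backward(upstream * self.a.data)


x, y, z = Value(2.0), Value(3.0), Value(4.0)
f = Add(Mul(x, y), Mul(x, z))  # f = x*y + x*z
f.backward(1.0)

# df/dx = y + z, df/dy = x, df/dz = x
print(f.data, x.grad, y.grad, z.grad)
```

Each node only knows its own local derivative, yet the gradient of the whole expression falls out of chaining the `backward` calls.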
 

## (c)
*What are the expected advantages / disadvantages of a static graph strategy (TensorFlow) versus a dynamic graph strategy (PyTorch) ?*

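A static graph strategy is expected to be more efficient at run time, because the whole graph is known before execution and can be optimized and compiled ahead of time. In return it is harder to debug and less flexible. A dynamic graph strategy builds the graph while the code runs, which makes it more convenient to work with in the experimental / development phase, but leaves fewer opportunities for global optimization.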

## (d)
*What is the use of the @tf.function decorator for functions in TensorFlow 2.0 ?*

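It traces a Python callable into a TensorFlow graph the first time it is called. Later calls with compatible inputs execute that graph instead of running the Python code eagerly.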
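A minimal sketch of this behaviour, assuming TensorFlow 2.x is installed. The function `scaled_sum` and the `trace_log` list are hypothetical names used only to make the tracing step visible.

```python
import tensorflow as tf

trace_log = []  # hypothetical helper, only used to observe tracing


@tf.function
def scaled_sum(x, scale):
    # Plain Python statements run while the function is being traced,
    # not on every execution of the resulting graph.
    trace_log.append("traced")
    return tf.reduce_sum(x) * scale


# Both calls use tensors of the same shape and dtype, so the graph built
# for the first call is expected to be reused for the second one.
first = scaled_sum(tf.constant([1.0, 2.0]), tf.constant(3.0))
second = scaled_sum(tf.constant([4.0, 5.0]), tf.constant(2.0))

print(float(first), float(second))  # 9.0 18.0
print(len(trace_log))  # number of times the Python body was traced
```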

## (e)
*What does the function `tf.gradients(ys, xs)` do ? Describe precisely the output.*

It computes the gradients (local rates of change) of the tensors in `ys` with respect to the tensors in `xs`. It returns a list of tensors of length `len(xs)`, where the $i$-th tensor has the same shape as `xs[i]` and contains the sum $\sum_{y\in ys}\frac{\partial y}{\partial x_i}$, i.e. the gradients are summed over all tensors in `ys`.
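The summation over `ys` can be seen in a small example. This is a sketch assuming TensorFlow 2.x, where `tf.gradients` has to be called in a graph context such as a `tf.function`; the function name `summed_gradients` and the chosen expressions are illustrative only.

```python
import tensorflow as tf


@tf.function
def summed_gradients(x, w):
    y1 = w * x ** 2  # dy1/dx = 2*w*x, dy1/dw = x**2
    y2 = 3.0 * x     # dy2/dx = 3,     dy2/dw = 0
    # One gradient tensor per entry of xs, summed over all entries of ys.
    return tf.gradients([y1, y2], [x, w])


x = tf.constant([1.0, 2.0])
w = tf.constant([1.0, 1.0])
grads = summed_gradients(x, w)

print(len(grads))                # 2, i.e. len(xs)
print(grads[0].numpy().tolist())  # [5.0, 7.0] = 2*w*x + 3
print(grads[1].numpy().tolist())  # [1.0, 4.0] = x**2
```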


## (f)
*Explain the difference between the Keras sequential and functional API.*

 - The Sequential API allows you to create models layer by layer by stacking them. It is limited in that it does not allow you to create models that share layers or have multiple inputs or outputs.
 
```python
from tensorflow.keras import Sequential, Model
from tensorflow.keras import layers

seq_model = Sequential()
seq_model.add(layers.Dense(4, input_shape=(10,2)))
seq_model.add(layers.Dense(4))
seq_model.add(layers.Dense(1))
seq_model.summary()
```
 
 - The Keras functional API provides more flexibility: layers are not restricted to connecting to the previous and next layer, but can be connected to any other layer. As a result, you can create complex networks such as a Residual Network, or models with multiple inputs and outputs.

 
```python
from tensorflow.keras import Model, Input
from tensorflow.keras import layers

input1 = Input(shape=(10,2))
# The input shape is already fixed by Input, so Dense does not need input_shape.
lay1 = layers.Dense(4)(input1)
lay2 = layers.Dense(4)(lay1)
out1 = layers.Dense(1)(lay2)
out2 = layers.Dense(1)(lay2)
func_model = Model(inputs=input1, outputs=[out1, out2])
func_model.summary()
```

 - There is a third (in my opinion the most elegant) way to build a Keras model: subclassing the `Model` class. In that case, you define your layers in `__init__` and implement the model's forward pass in `call`. Note that the `training` flag has to be set manually (`model(x, training=True)`) when the model is called directly during training.
 
```python
import tensorflow as tf

class MyModel(tf.keras.Model):
    def __init__(self):
        super(MyModel, self).__init__()
        self.dense1 = tf.keras.layers.Dense(4, activation=tf.nn.relu)
        self.dense2 = tf.keras.layers.Dense(5, activation=tf.nn.relu)
        self.dropout = tf.keras.layers.Dropout(0.5)
    
    def call(self, inputs, training=False):
        x = self.dense1(inputs)
        if training:
            x = self.dropout(x, training=training)
        return self.dense2(x)
   
model = MyModel()
```

from: https://medium.com/analytics-vidhya/keras-model-sequential-api-vs-functional-api-fc1439a6fb10