![TensorFlow logo](https://upload.wikimedia.org/wikipedia/commons/thumb/a/ab/TensorFlow_logo.svg/1200px-TensorFlow_logo.svg.png)

In [None]:
import tensorflow as tf

In [None]:
tf.__version__

# Tensor Manipulation Basics

### Tensor Creation

A 0-dimensional tensor is a __scalar__

In [None]:
scalar = tf.constant(0)
print(f"Value of the tensor = {scalar}")
print(f"Number of dimensions = {len(scalar.shape)}")
print(f"Tensor's shape = {scalar.shape}")

A 1-dimensional tensor is a __vector__

In [None]:
vector = tf.constant([1, 2, 3])
print(f"Value of the tensor = {vector}")
print(f"Number of dimensions = {len(vector.shape)}")
print(f"Tensor's shape = {vector.shape}")


A 2-dimensional tensor is a __matrix__

In [None]:
matrix = tf.constant(
    [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
)
print(f"Value of the tensor = \n {matrix}")
print(f"Number of dimensions = {len(matrix.shape)}")
print(f"Tensor's shape = {matrix.shape}")

We can generalize tensors to __n dimensions__

In [None]:
n = 3
tensor = tf.random.normal(tuple(3 for _ in range(n)))
print(f"Value of the tensor = \n {tensor}")
print(f"Number of dimensions = {len(tensor.shape)}")
print(f"Tensor's shape = {tensor.shape}")

Each tensor is also characterized by a __data type__

In [None]:
tensor.dtype

A tensor can be cast to another data type, which changes its numerical representation (e.g., casting floats to integers discards the fractional part).

In [None]:
int_tensor = tf.cast(tensor, dtype=tf.int32)
print(int_tensor)
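
As a small illustration of what the cast does to the values, the sketch below casts a float tensor to `tf.int32`. The variable names and values are made up for this example; the point is that the fractional part is dropped rather than rounded.

```python
import tensorflow as tf

# Hypothetical example values, chosen to make the truncation visible
float_values = tf.constant([1.7, -1.7, 0.2])
int_values = tf.cast(float_values, dtype=tf.int32)

print(float_values.dtype, int_values.dtype)
# The fractional parts are discarded (truncation toward zero), not rounded
print(int_values.numpy())
```

If rounding is what you want, apply `tf.round` before casting.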

### Tensor indexing and slicing

Tensors can be __indexed__ (e.g., ``tensor[i, :, :]``) or __sliced__ (e.g., ``tensor[:i, ...]``).

Indexing removes every dimension that receives an integer index, so the rank of the result equals the number of "free" dimensions left.

In [None]:
print(f"Scalar = {tensor[0, 1, 0]}")
print(f"Vector = {tensor[:, 0, -1]}") # : means all elements in that dimension
print(f"Matrix = {tensor[:, 2]}")
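
To make the effect on the rank explicit, here is a minimal sketch that repeats the same three indexing patterns on a tensor with known values and prints the resulting shapes. The tensor `t` and the variable names are assumptions made for this example.

```python
import tensorflow as tf

# Rank-3 tensor holding the values 0..26, so results are easy to check by hand
t = tf.reshape(tf.range(27), (3, 3, 3))

scalar_part = t[0, 1, 0]   # three indexed axes, no free axis  -> rank 0
vector_part = t[:, 0, -1]  # two indexed axes, one free axis   -> rank 1
matrix_part = t[:, 2]      # one indexed axis, two free axes   -> rank 2

for name, part in [("scalar", scalar_part),
                   ("vector", vector_part),
                   ("matrix", matrix_part)]:
    print(f"{name}: shape = {part.shape}, rank = {len(part.shape)}")
```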

Slicing reduces the size of the sliced dimension but keeps the dimension itself. Slices can only describe regular ranges (a start, a stop, and an optional step). To pick an arbitrary set of indices, use [``tf.gather``](https://www.tensorflow.org/api_docs/python/tf/gather).

In [None]:
tensor_slice = tensor[1:]
print(f"tensor_slice = {tensor_slice}")
print(f"Shape of tensor_slice = {tensor_slice.shape}")
print(f"Number of dimensions = {len(tensor_slice.shape)}")

In [None]:
tensor_slice = tensor[1:, :, -1:]
print(f"tensor_slice = {tensor_slice}")
print(f"Shape of tensor_slice = {tensor_slice.shape}")
print(f"Number of dimensions = {len(tensor_slice.shape)}")

Indexing and slicing can also be mixed.

In [None]:
tensor_slice = tensor[1:, :, 2]
print(f"tensor_slice = {tensor_slice}")
print(f"Shape of tensor_slice = {tensor_slice.shape}")
print(f"Number of dimensions = {len(tensor_slice.shape)}")
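
For the scattered case mentioned earlier, a minimal sketch of `tf.gather` follows. The tensor `t` and the chosen indices are assumptions for illustration only.

```python
import tensorflow as tf

# Rank-3 tensor holding the values 0..26
t = tf.reshape(tf.range(27), (3, 3, 3))

# Pick positions 0 and 2 along axis 0, skipping position 1
picked = tf.gather(t, [0, 2], axis=0)

print(f"Shape of picked = {picked.shape}")
print(f"picked = \n {picked}")
```

Unlike integer indexing, `tf.gather` with a list of indices keeps the gathered dimension; its size becomes the number of indices requested.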

### Concat and stack

- ``tf.concat``: concatenates n tensors on the `axis` dimension. All the dimensions of the input tensors, except for the `axis` dimension, must match.
- ``tf.stack``: stacks n tensors on the `axis` dimension, which is added for the resulting tensor. All the dimensions of the input tensors must match. 

In [None]:
tensor_1 = tf.random.normal((2, 3, 4))
tensor_2 = tf.random.uniform((2, 6, 4))
concat_tensor = tf.concat([tensor_1, tensor_2], axis=1)
print(f"Shape = {concat_tensor.shape}")

In [None]:
tensor_2 = tf.random.uniform((2, 3, 4))
stacked_tensor = tf.stack([tensor_1, tensor_2], axis=1)
print(f"Shape = {stacked_tensor.shape}")
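
One way to see how the two operations relate: stacking behaves like adding a new axis of size 1 to each input and then concatenating along that axis. The sketch below checks this on random tensors; the names `a`, `b`, `stacked`, and `manual` are made up for this example.

```python
import tensorflow as tf

a = tf.random.normal((2, 3, 4))
b = tf.random.normal((2, 3, 4))

# tf.stack adds a new axis of size 2 at position 1
stacked = tf.stack([a, b], axis=1)

# Same result built by hand: add a size-1 axis to each input, then concatenate
manual = tf.concat(
    [tf.expand_dims(a, axis=1), tf.expand_dims(b, axis=1)],
    axis=1,
)

print(f"Shape = {stacked.shape}")
print(f"Same values = {bool(tf.reduce_all(stacked == manual))}")
```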

### Your turn!

1. Create a tensor with random numbers of shape (2, 5, 3);
2. Get the last element on dimension 1;
3. Put it at the beginning of dimension 1.

_Note_: use the placeholders on the next cell.

In [None]:
init_tensor = None
final_tensor = None

In [None]:
if final_tensor.shape == (2, 5, 3):
    if tf.reduce_all(final_tensor[:, 0] == init_tensor[:, -1]):
        print("Good job!")
    else:
        print("Mh, correct dimensions but wrong values")
else:
    print(f"Wrong, dimensions are {init_tensor.shape} and {final_tensor.shape}")

# Tensor operations

``+, *, /, -`` are overloaded to support tensor operations. All of these operations are element-wise and support broadcasting: a dimension of size 1 is stretched to match the corresponding dimension of the other operand (e.g., (2, 1, 2) + (2, 3, 2) --> (2, 3, 2)).

In [None]:
tensor = tf.random.normal((3, 3, 3))
print(tensor)

In [None]:
tensor + 1

In [None]:
tensor * 5

In [None]:
tensor / 2

In [None]:
tensor_1 = tf.random.normal((3, 2))
tensor_2 = tf.random.normal((3, 2))
tensor_1 + tensor_2
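
The broadcasting case from the text, (2, 1, 2) + (2, 3, 2), can be sketched as follows. The tensors `a` and `b` are hypothetical; `tf.broadcast_to` is used only to show the repetition that the `+` operator performs implicitly.

```python
import tensorflow as tf

a = tf.ones((2, 1, 2))
b = tf.reshape(tf.range(12, dtype=tf.float32), (2, 3, 2))

# The size-1 axis of `a` is stretched to size 3 to match `b`
result = a + b
print(f"Shape = {result.shape}")

# Making the repetition explicit gives the same values
explicit = tf.broadcast_to(a, (2, 3, 2)) + b
print(f"Same values = {bool(tf.reduce_all(result == explicit))}")
```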

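`@` performs matrix multiplication between two tensors. The inner dimensions must match, i.e., the last dimension of the left operand must equal the second-to-last dimension of the right operand; any leading dimensions are treated as batch dimensions.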

In [None]:
tensor_1 @ tf.transpose(tensor_2, [1, 0])

In [None]:
tensor_1 = tf.random.normal((10, 3, 2))
tensor_2 = tf.random.normal((10, 3, 2))
tensor_1 @ tf.transpose(tensor_2, [0, 2, 1])
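
As a sketch of the shape rule in the batched case, the example below multiplies ten (3, 2) matrices by ten (2, 3) matrices. It also shows `tf.matmul` with `transpose_b=True`, which should give the same product without an explicit `tf.transpose`. The variable names are assumptions for this example.

```python
import tensorflow as tf

a = tf.random.normal((10, 3, 2))
b = tf.random.normal((10, 3, 2))

# (10, 3, 2) @ (10, 2, 3): the leading axis is a batch of 10 independent products
via_operator = a @ tf.transpose(b, [0, 2, 1])

# Same product, letting matmul transpose the last two axes of b
via_matmul = tf.matmul(a, b, transpose_b=True)

print(f"Shape = {via_operator.shape}")
max_diff = float(tf.reduce_max(tf.abs(via_operator - via_matmul)))
print(f"Largest difference = {max_diff}")
```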

# Learning with TensorFlow

### Digit Classification with a Linear Classifier

In [None]:
from PIL import Image

In [None]:
(train_x, train_y), (test_x, test_y) = tf.keras.datasets.mnist.load_data(path='ds')

In [None]:
print(f"Label is {train_y[0]}")
Image.fromarray(train_x[0])

In [None]:
train_x.shape

Images need to be flattened to be taken as input by a linear classifier.

In [None]:
train_x = tf.reshape(train_x, [train_x.shape[0], -1])

We need to __rescale__ the pixel values from [0, 255] to the interval [0, 1].

In [None]:
train_x = train_x / 255

Defining the linear model requires only a weight matrix and a bias vector.

In [None]:
W, b = tf.Variable(tf.random.normal((784, 10))), tf.Variable(tf.zeros(10))

In [None]:
prediction = tf.nn.softmax(train_x @ W + b, axis=-1)
prediction
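
To read the output of the model, it may help to check two properties on a small batch: each row of the softmax output sums to 1, and `tf.argmax` turns a row into a predicted digit. The sketch below uses random inputs in place of the MNIST images; the names `x`, `weights`, `bias`, and `probs` are made up for this example.

```python
import tensorflow as tf

# Random stand-in for 5 flattened, rescaled images (not real MNIST data)
x = tf.random.uniform((5, 784))
weights = tf.Variable(tf.random.normal((784, 10)))
bias = tf.Variable(tf.zeros(10))

probs = tf.nn.softmax(x @ weights + bias, axis=-1)
row_sums = tf.reduce_sum(probs, axis=-1)
predicted_digits = tf.argmax(probs, axis=-1)

print(f"Shape = {probs.shape}")
print(f"Row sums = {row_sums.numpy()}")
print(f"Predicted digits = {predicted_digits.numpy()}")
```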

Now that we have our linear model, let's define the tools for the optimization, i.e., __the optimizer and the loss function__.

In [None]:
optimizer = tf.keras.optimizers.SGD(learning_rate=1e-1)
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy()
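
To see what this loss computes, the sketch below compares it with a hand calculation on two made-up samples: with integer labels and predicted probabilities, the loss is the mean of the negative log-probability assigned to the true class. The labels and probabilities are hypothetical values chosen for this example.

```python
import math
import tensorflow as tf

loss_fn = tf.keras.losses.SparseCategoricalCrossentropy()

# Two samples, two classes; labels are class indices, not one-hot vectors
labels = tf.constant([0, 1])
probs = tf.constant([[0.9, 0.1],
                     [0.2, 0.8]])

keras_loss = float(loss_fn(labels, probs))

# Mean of -log(probability of the true class) over the two samples
manual_loss = (-math.log(0.9) - math.log(0.8)) / 2

print(f"Keras loss  = {keras_loss}")
print(f"Manual loss = {manual_loss}")
```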

A training loop in TensorFlow looks as follows.

In [None]:
from sklearn.metrics import accuracy_score
epochs = 500 

for e in range(epochs):
    # The GradientTape context records every operation applied to tensors
    # inside the context. The tape can then be used to compute the gradient
    # of a computation with respect to the tensors it has "watched".
    with tf.GradientTape() as tape:
        prediction = tf.nn.softmax(train_x @ W + b, axis=-1)
        loss_value = loss_fn(train_y, prediction)
    
    # We compute the gradient of the loss with respect to the parameters
    # of the model
    grads = tape.gradient(loss_value, [W, b])

    # We apply the gradient to the parameters of the model
    optimizer.apply_gradients(zip(grads, [W, b]))

    # We print the training accuracy every 20 epochs, recomputing the
    # predictions with the updated parameters
    if e % 20 == 0:
        prediction = tf.nn.softmax(train_x @ W + b, axis=-1)
        print(f"Epoch {e}: accuracy = {accuracy_score(train_y, tf.argmax(prediction, axis=-1))}")
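
The tape mechanism used in the loop can be seen in isolation on a one-variable function. This is a minimal sketch, assuming a toy function y = x² and a hand-written update with a learning rate of 0.1; it illustrates the idea of a plain gradient descent step and is not how the Keras optimizer is implemented internally.

```python
import tensorflow as tf

x = tf.Variable(3.0)

# Operations on the variable are recorded while the tape is active
with tf.GradientTape() as tape:
    y = x * x

# dy/dx = 2 * x, evaluated at x = 3
grad = tape.gradient(y, x)
print(f"Gradient = {float(grad)}")

# One gradient descent step done by hand: x <- x - learning_rate * gradient
learning_rate = 0.1
x.assign_sub(learning_rate * grad)
print(f"Updated x = {float(x)}")
```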

If we switch the optimizer to Adam, training on this problem typically converges much faster.

In [None]:
W, b = tf.Variable(tf.random.normal((784, 10))), tf.Variable(tf.zeros(10))
optimizer = tf.keras.optimizers.Adam(learning_rate=0.1)
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy()

epochs = 500

for e in range(epochs):
    with tf.GradientTape() as tape:
        prediction = tf.nn.softmax(train_x @ W + b, axis=-1)
        loss_value = loss_fn(train_y, prediction)
    
    grads = tape.gradient(loss_value, [W, b])
    optimizer.apply_gradients(zip(grads, [W, b]))

    if e % 20 == 0:
        prediction = tf.nn.softmax(train_x @ W + b, axis=-1)
        print(f"Epoch {e}: accuracy = {accuracy_score(train_y, tf.argmax(prediction, axis=-1))}")

### Your turn!

Try to implement your own training loop with a linear regression on the Boston Housing dataset. 

Notes:
1. It is a regression problem, so you need to use a different loss function;
2. MinMax Scaling may not be the most appropriate way to rescale the input features (maybe sklearn's StandardScaler?).

In [None]:
(train_x, train_y), _ = tf.keras.datasets.boston_housing.load_data("./ds")

In [None]:
train_x.shape
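
As a starting point for the two notes above, here is a minimal sketch on synthetic data (not the Boston Housing data). It standardizes features by hand with NumPy, which is assumed here to match what sklearn's `StandardScaler` does by default (zero mean and unit variance per column), and evaluates a mean squared error loss on made-up targets. All names and values are hypothetical.

```python
import numpy as np
import tensorflow as tf

# Synthetic features with a large offset and spread, 13 columns
rng = np.random.default_rng(0)
features = rng.normal(loc=50.0, scale=10.0, size=(100, 13)).astype("float32")

# Standardization: subtract the column mean, divide by the column std
mean = features.mean(axis=0)
std = features.std(axis=0)
scaled = (features - mean) / std

print(f"Column means after scaling ~ {np.abs(scaled.mean(axis=0)).max()}")
print(f"Column stds after scaling  ~ {scaled.std(axis=0).mean()}")

# A regression loss: mean of the squared differences
mse = tf.keras.losses.MeanSquaredError()
targets = tf.constant([[1.0], [2.0], [3.0]])
predictions = tf.constant([[1.5], [2.0], [2.0]])
mse_value = float(mse(targets, predictions))
print(f"MSE = {mse_value}")
```

When scaling real data, compute the mean and standard deviation on the training set only and reuse them for the test set.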