https://colab.research.google.com/drive/1dmgloT5ADF_6rH9t5FvK2MuwPPBKtm9d?usp=sharing

More instructions and examples in the notebook
The purpose of the project is to learn the mapping from polar coordinates to a discrete 10x10 grid of cells in the plane, using a neural network.

The supervised dataset is given to you in the form of a generator (to be considered as a black box).

The model must achieve an accuracy of 95%, and it will be evaluated in a way **inversely proportional to the number of its parameters: the smaller, the better.**

**WARNING**: Any solution taking advantage of meta-knowledge about the generator will be automatically rejected.

In [450]:
from keras.layers import Input, Dense, Lambda
from keras.models import Model
import numpy as np
import tensorflow as tf
from keras.optimizers import Adam
tf.random.set_seed(42)

Here is the generator. It returns triples of the form ((theta,rho),out) where (theta,rho) are the polar coordinates of a point in the first quadrant of the plane, and out is a 10x10 map with "1" in the cell corresponding to the point position, and "0" everywhere else.

By setting flat=True, the resulting map is flattened into a vector with a single dimension 100. You can use this variant, if you wish. 

In [451]:
def polar_generator(batchsize,grid=(10,10),noise=.002,flat=False):
  while True:
    x = np.random.rand(batchsize)
    y = np.random.rand(batchsize)
    out = np.zeros((batchsize,grid[0],grid[1]))
    xc = (x*grid[0]).astype(int)
    yc = (y*grid[1]).astype(int)
    for b in range(batchsize):
      out[b,xc[b],yc[b]] = 1
    #compute rho and theta and add some noise
    rho = np.sqrt(x**2+y**2) + np.random.normal(scale=noise)
    theta = np.arctan(y/np.maximum(x,.00001)) + np.random.normal(scale=noise)
    if flat:
      out = np.reshape(out,(batchsize,grid[0]*grid[1]))
    yield ((theta,rho),out)

Let's create an instance of the generator on a grid with dimension 3x4

In [452]:
g1,g2 = 3,4
gen = polar_generator(1,grid=(g1,g2),noise=0.0,flat=True)

And now let's see a few samples.

In [453]:
(theta,rho),maps = next(gen)
for i,map in enumerate(maps):
  #let us compute the cartesian coordinates
  x = np.cos(theta[i])*rho[i]
  y = np.sin(theta[i])*rho[i]
  print("x coordinate (row): {}".format(int(x*g1)))
  print("y coordinate (col): {}".format(int(y*g2)))
  print("map:")
  print(np.reshape(map,(g1,g2)))

x coordinate (row): 1
y coordinate (col): 3
map:
[[0. 0. 0. 0.]
 [0. 0. 0. 1.]
 [0. 0. 0. 0.]]


Exercise: add noise to the generator, and check the effect on the "ground truth".
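
One possible way to approach this exercise, as a rough sketch rather than a reference solution: recover the cell from the noisy (theta, rho) pair in the same way as the sample cell above, and count how often it disagrees with the cell marked in the map. The helper name `mismatch_rate`, the batch sizes and the noise levels are choices made for this illustration; the generator is copied from the cell above (only re-indented) so that the snippet runs on its own.

```python
import numpy as np

def polar_generator(batchsize, grid=(10, 10), noise=.002, flat=False):
    # Copied from the notebook cell above, only re-indented.
    while True:
        x = np.random.rand(batchsize)
        y = np.random.rand(batchsize)
        out = np.zeros((batchsize, grid[0], grid[1]))
        xc = (x * grid[0]).astype(int)
        yc = (y * grid[1]).astype(int)
        for b in range(batchsize):
            out[b, xc[b], yc[b]] = 1
        rho = np.sqrt(x**2 + y**2) + np.random.normal(scale=noise)
        theta = np.arctan(y / np.maximum(x, .00001)) + np.random.normal(scale=noise)
        if flat:
            out = np.reshape(out, (batchsize, grid[0] * grid[1]))
        yield ((theta, rho), out)

def mismatch_rate(noise, n_batches=200, batchsize=100, grid=(10, 10)):
    """Fraction of samples whose noisy polar coordinates land in a
    different cell than the one marked in the map (hypothetical helper)."""
    gen = polar_generator(batchsize, grid=grid, noise=noise, flat=True)
    wrong = 0
    for _ in range(n_batches):
        (theta, rho), maps = next(gen)
        # Back to cartesian coordinates, as in the sample cell above.
        x = np.cos(theta) * rho
        y = np.sin(theta) * rho
        row = np.floor(x * grid[0]).astype(int)
        col = np.floor(y * grid[1]).astype(int)
        # Cell actually marked in the flat map.
        true_row, true_col = np.divmod(np.argmax(maps, axis=1), grid[1])
        wrong += int(np.sum((row != true_row) | (col != true_col)))
    return wrong / (n_batches * batchsize)

np.random.seed(0)
for noise in (0.0, 0.002, 0.02):
    print("noise={}: mismatch rate {:.4f}".format(noise, mismatch_rate(noise)))
```

Without noise the recovered cell should agree with the map essentially always; the mismatch rate is expected to grow with the noise level, since points close to a cell border get pushed into the neighbouring cell.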

# What to deliver

For the purposes of the project you are supposed to work with the **default 10x10 grid, and the default noise=.002**

The generator must be treated as a black box: do not tweak it, and do not exploit its semantics, which are supposed to be unknown. You are allowed to work with the "flat" modality if you prefer.

You need to:
1.   define an accuracy function (take inspiration from the code of the previous cell)
2.   define a neural network taking in input theta and rho, and returning out
3. measure the network's accuracy, which must be above 95% (accuracy must be evaluated over at least 20000 samples)
4. tune the network, trying to decrease the number of parameters as much as possible while preserving an accuracy above 95%. Only your best network must be delivered.

You must deliver a SINGLE notebook working on colab, containing the code of the network, its summary, the training history, the code for the accuracy metric and its evaluation on the network.

**N.B.** The accuracy must be above 95% but apart from that it does not influence the evaluation. Your score will only depend on the number of parameters: the lower, the better.

# Good work!




# PROJECT

In this project I created a neural network that takes theta and rho as input and returns out.
The network makes use of the Transfer Learning technique in order to reduce the number of parameters: a second model is built on top of an already trained first model.
The first model takes theta and rho as input and returns 2 vectors indicating the row and the column of the cell in the 10x10 grid. Each of the 2 vectors is a vector of zeros with a single 1 in the position corresponding to the row (or the column).
The second model takes the output of the first model as input and returns out as a flattened 10x10 map.
This is done by applying a Lambda layer that converts the 2 vectors into the flattened 10x10 grid.
This way, the output of the net is the one required by the project, and the accuracy is not affected.

So, for example, if we have the following map:
  [0 0 0 0 1 0 0 0 0 0
   0 0 0 0 0 0 0 0 0 0
   0 0 0 0 0 0 0 0 0 0
   0 0 0 0 0 0 0 0 0 0
   0 0 0 0 0 0 0 0 0 0
   0 0 0 0 0 0 0 0 0 0
   0 0 0 0 0 0 0 0 0 0
   0 0 0 0 0 0 0 0 0 0
   0 0 0 0 0 0 0 0 0 0
   0 0 0 0 0 0 0 0 0 0]
  the output of the first net will be y=[row, col] where:
  row = [1 0 0 0 0 0 0 0 0 0]
  col = [0 0 0 0 1 0 0 0 0 0]
  but the general output (the output of the second model) will be the initial map.
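
The decomposition in this example can be sketched in plain NumPy. This is only an illustration: the helper names `split_map` and `join_map` are made up here and are not part of the notebook, which performs the same steps inside generate_data_grid (splitting) and transform (joining).

```python
import numpy as np

def split_map(flat_map, n_rows=10, n_cols=10):
    """Turn a flat one-hot map into one-hot row and column vectors."""
    index = int(np.argmax(flat_map))   # position of the 1 in the flat map
    r, c = divmod(index, n_cols)       # row-major layout: index = r*n_cols + c
    row = np.zeros(n_rows)
    row[r] = 1
    col = np.zeros(n_cols)
    col[c] = 1
    return row, col

def join_map(row, col):
    """Rebuild the flat one-hot map from the row and column vectors."""
    n_cols = len(col)
    flat = np.zeros(len(row) * n_cols)
    flat[int(np.argmax(row)) * n_cols + int(np.argmax(col))] = 1
    return flat

# The map from the example: a single 1 in row 0, column 4.
grid = np.zeros((10, 10))
grid[0, 4] = 1
flat_map = grid.reshape(100)

row, col = split_map(flat_map)
print(row)  # 1 at index 0
print(col)  # 1 at index 4
print(np.array_equal(join_map(row, col), flat_map))  # True
```

The round trip loses nothing, because every cell of the grid corresponds to exactly one (row, col) pair.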

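In the first model, the 2x10 representation (one 10-way softmax for the row and one for the column) is preferable to a single 100-way softmax over the flattened 10x10 grid. It needs far fewer output weights: with a last hidden layer of 3 units, the two heads take 2*(3+1)*10 = 80 parameters, whereas a 100-way output layer would take (3+1)*100 = 400. Each head also solves a simpler 10-class problem instead of a 100-class one.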
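
One concrete benefit of the 2x10 representation is the size of the output layer, which can be checked with a little arithmetic. The sketch below counts Dense-layer parameters (weights plus biases) for the hidden sizes used later in this notebook, [5, 3]; the helper names `dense_params` and `mlp_params` are invented for this illustration.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

def mlp_params(sizes):
    """Total parameters of a chain of Dense layers with the given sizes."""
    return sum(dense_params(a, b) for a, b in zip(sizes, sizes[1:]))

# Input of size 2 (theta, rho) followed by hidden layers of 5 and 3 units.
trunk = mlp_params([2, 5, 3])                 # 15 + 18 = 33

two_heads = trunk + 2 * dense_params(3, 10)   # row head + column head
one_head = trunk + dense_params(3, 100)       # single 100-way output

print(two_heads, one_head)  # 113 433
```

With these sizes the two 10-way heads cost 80 parameters against 400 for a single 100-way layer, which matters here because the score depends only on the parameter count.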

- **generate_data_grid(num_samples)** generates the dataset.
  For every example it draws theta, rho and maps from the generator.
  It stores theta and rho together as one array, which is the input x of the net.
  It reduces the current maps item from an array of 100 elements to a single index (y_item): the position of the 1 in the flat map.
  From that index it calculates the row and the column of the cell in the 10x10 grid.
  Then it creates two arrays, one representing the row and the other representing the column, each with a 1 at the corresponding index and 0's elsewhere.
  These two arrays are the output y of the first net.
  grid is the flat map of 100 elements as returned by the generator, with a 1 in the position of the cell and 0's elsewhere.
  Finally, it returns the generated dataset as the tuple <x, y, grid>.

- **create_model(current_layers, dims)** builds the layers of the first neural network.
  It takes two arguments: current_layers (the input layer of the model) and dims (the sizes of the hidden layers in the model).
  The function loops through the dims list and, for each entry, adds a Dense layer of the specified size with the 'swish' activation function.
  Then it creates two Dense layers with 10 neurons and the 'softmax' activation function, called o1 and o2, which are the output layers corresponding to the row and the column of the 10x10 grid.
  Finally, it stacks o1 and o2 into a single tensor of shape 2x10 and returns it; the Model object itself is built from the input layer and this tensor in a later cell.

- **transform(tensor)** takes a tensor of shape 2x10 (per sample) and converts it into the flat grid.
  It finds the position of the maximum element of the row vector and of the column vector using the tf.argmax() function, i.e. the positions where the 1 is stored.
  Then it creates a new vector of 100 elements filled with zeros, with a 1 at position row*10+col, using the tf.one_hot() function.
  Finally, it returns this vector as the output of the function.

- **my_accuracy(y_true, y_pred)** calculates the accuracy of the first model. It takes two arguments: y_true (the true output values of the dataset) and y_pred (the output values predicted by the model).
  The function uses the tf.argmax() function to find the index of the maximum element in both y_pred and y_true (i.e. the position of the 1).
  Then it compares these indices using the tf.equal() function to determine whether the model's prediction was correct.
  The tf.reduce_all() function then requires both the row and the column to be correct, instead of counting them as 2 different examples.
  Finally, it casts the resulting boolean tensor to float and takes its mean, which represents the accuracy of the model.
  It returns this accuracy as the output of the function.

- **my_accuracy_grid(y_true, y_pred)** calculates the accuracy of the second model. It takes two arguments: y_true (the true output values of the dataset) and y_pred (the output values predicted by the model).
  The function uses the tf.argmax() function to find the index of the maximum element in both y_pred and y_true (i.e. the position of the 1).
  Then it compares these indices using the tf.equal() function to determine whether the model's prediction was correct.
  The difference here is that the model's output is a tensor of shape [100] and no longer a tensor of shape [2,10], so tf.reduce_all() is not needed.
  Finally, it casts the resulting boolean tensor to float and takes its mean, which represents the accuracy of the model.
  It returns this accuracy as the output of the function.
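
To make the role of reduce_all concrete, here is a small NumPy re-implementation of the first metric, offered as a sketch and not as the code used for the reported results. The function names `joint_accuracy` and `one_hot_pairs` and the toy batch are invented for this illustration.

```python
import numpy as np

def joint_accuracy(y_true, y_pred):
    """NumPy analogue of my_accuracy for arrays of shape (batch, 2, 10).

    Returns the joint accuracy and, for comparison, the per-head accuracies.
    """
    hits = np.argmax(y_pred, axis=-1) == np.argmax(y_true, axis=-1)  # (batch, 2)
    both = np.all(hits, axis=1)  # True only if row AND column are right
    return float(np.mean(both)), hits.mean(axis=0)

def one_hot_pairs(pairs, n=10):
    """Encode (row, col) pairs as an array of shape (len(pairs), 2, n)."""
    out = np.zeros((len(pairs), 2, n))
    for i, (r, c) in enumerate(pairs):
        out[i, 0, r] = 1
        out[i, 1, c] = 1
    return out

y_true = one_hot_pairs([(0, 4), (3, 3), (9, 0), (5, 7)])
# Second sample has a wrong column, third sample has a wrong row.
y_pred = one_hot_pairs([(0, 4), (3, 2), (8, 0), (5, 7)])

joint, per_head = joint_accuracy(y_true, y_pred)
print(joint)     # 0.5
print(per_head)  # [0.75 0.75]
```

Each head is right on 3 of the 4 samples, yet only 2 samples have both right, so the joint metric is the stricter one and is the quantity that corresponds to picking the correct cell.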

In [None]:
# generates dataset
def generate_data_grid(num_samples):
  x = []
  y = []
  grid = []
  gen = polar_generator(1,grid=(10,10),noise=0.002,flat=True)
  for i in range(num_samples):
    (theta,rho),maps = next(gen)
    x_item = np.array([theta[0], rho[0]])
    x.append(x_item)
    y_item = maps[0] # flat one-hot map of 100 elements
    y_item = np.where(y_item==1)[0] # index of the 1 in the flat map (array with 1 element)
    row = (y_item // 10)[0] # calculates row of the 10x10 grid
    col = (y_item % 10)[0] # calculates col of the 10x10 grid
    r = np.zeros(10)
    r[row] = 1
    c = np.zeros(10)
    c[col] = 1
    y_item = np.array([r, c])
    y.append(y_item)
    grid.append(maps[0])
  return np.array(x), np.array(y), np.array(grid)

'''
x_prova, y_prova, grid_prova = [], [], []
x_prova, y_prova, grid_prova = generate_data_grid(2)
grid_prova = tf.reshape(grid_prova, [2,10,10])
# test to print some generated data
for i in range(2):
  #print("x_train[", i, "] = ", x_prova[i])
  print("y_train[", i, "] = ", y_prova[i])
  print("grid_train[", i, "] = ", grid_prova[i])
print("x_train.shape = ", x_prova.shape)
print("y_train.shape = ", y_prova.shape)
print("grid_train.shape = ", grid_prova.shape)
'''


# create the first model
def create_model(current_layers, dims):
  for i in range(0,len(dims)):
    current_layers = Dense(dims[i], activation='swish')(current_layers)
  o1 = Dense(10, activation='softmax')(current_layers)
  o2 = Dense(10, activation='softmax')(current_layers)
  out = tf.stack([o1, o2], axis=1)
  return out

# transform the row and col tensors into a flat one-hot 10x10 grid with 1 in the [row, col] position
def transform(tensor):
  row = tf.cast(tf.argmax(tensor[:, 0], axis=-1), dtype=tf.int32)
  col = tf.cast(tf.argmax(tensor[:, 1], axis=-1), dtype=tf.int32)
  one_hot = tf.cast(tf.one_hot(row*10+col, 100), dtype=tf.float32)
  return one_hot

'''
# test transform with a tensor
x_test = tf.constant([[[1,0,0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0,0,1]],[[0,0,0,0,0,0,0,0,0,1],[0,0,0,0,0,0,0,0,0,1]]])
print("transform(x_test): ", transform(x_test))
'''


# accuracy for the first model
def my_accuracy(y_true, y_pred):
  correct_predictions = tf.equal(tf.argmax(y_pred, -1), tf.argmax(y_true, -1))
  correct_predictions = tf.reduce_all(correct_predictions, axis=1)
  return tf.reduce_mean(tf.cast(correct_predictions, tf.float32))

# accuracy for the second model
def my_accuracy_grid(y_true, y_pred):
  correct_predictions = tf.equal(tf.argmax(y_pred, axis=1), tf.argmax(y_true, axis=1))
  return tf.reduce_mean(tf.cast(correct_predictions, tf.float32))

'''
# generate some examples to use accuracy
labels = tf.constant([[[1,0,0,0,0,0,0,0,0,0], [0,0,1,0,0,0,0,0,0,0]]])
predictions = tf.constant([[[1,0,0,0,0,0,0,0,0,0], [0,1,0,0,0,0,0,0,0,0]]])
tf.print("accuracy: ", my_accuracy(labels, predictions))
'''

In [None]:
# x_train and y_train are the training dataset
x_train, y_train, _ = generate_data_grid(80000)

"""
# test to print some generated data
for i in range(3):
  print("x_train[", i, "] = ", x_train[i])
  print("y_train[", i, "] = ", y_train[i])
print("x_train.shape = ", x_train.shape)
print("y_train.shape = ", y_train.shape)
"""

In [456]:
input_layer = Input(shape=(2,))
# size of hidden layers (one for each of them)
inner_layers_dims = [5,3]
layers = create_model(input_layer, inner_layers_dims)

In [457]:
first_model = Model(inputs=input_layer,outputs=layers)

In [458]:
first_model.summary()

Model: "model_47"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
 input_31 (InputLayer)          [(None, 2)]          0           []                               
                                                                                                  
 dense_168 (Dense)              (None, 5)            15          ['input_31[0][0]']               
                                                                                                  
 dense_169 (Dense)              (None, 3)            18          ['dense_168[0][0]']              
                                                                                                  
 dense_170 (Dense)              (None, 10)           40          ['dense_169[0][0]']              
                                                                                           

In [459]:
my_optimizer = Adam(learning_rate=0.01)
first_model.compile(optimizer=my_optimizer, loss=['categorical_crossentropy'], metrics=my_accuracy)

In [460]:
batchsize = 512
history = first_model.fit(x= x_train, y= y_train, batch_size=batchsize, epochs=1000, validation_split=0.2, verbose=2)

Epoch 1/1000
125/125 - 2s - loss: 2.0123 - my_accuracy: 0.0310 - val_loss: 1.7227 - val_my_accuracy: 0.0623 - 2s/epoch - 17ms/step
Epoch 2/1000
125/125 - 1s - loss: 1.3153 - my_accuracy: 0.1815 - val_loss: 1.0338 - val_my_accuracy: 0.2879 - 502ms/epoch - 4ms/step
Epoch 3/1000
125/125 - 1s - loss: 0.9403 - my_accuracy: 0.3313 - val_loss: 0.8758 - val_my_accuracy: 0.3776 - 519ms/epoch - 4ms/step
Epoch 4/1000
125/125 - 1s - loss: 0.8115 - my_accuracy: 0.4420 - val_loss: 0.7752 - val_my_accuracy: 0.4913 - 527ms/epoch - 4ms/step
Epoch 5/1000
125/125 - 1s - loss: 0.7295 - my_accuracy: 0.5132 - val_loss: 0.6965 - val_my_accuracy: 0.5610 - 519ms/epoch - 4ms/step
Epoch 6/1000
125/125 - 0s - loss: 0.6615 - my_accuracy: 0.5670 - val_loss: 0.6366 - val_my_accuracy: 0.5779 - 458ms/epoch - 4ms/step
Epoch 7/1000
125/125 - 0s - loss: 0.6013 - my_accuracy: 0.6121 - val_loss: 0.5745 - val_my_accuracy: 0.6472 - 491ms/epoch - 4ms/step
Epoch 8/1000
125/125 - 0s - loss: 0.5449 - my_accuracy: 0.6658 - val_lo

In [461]:
# create test with new generated data
x_test, y_test, grid_test = generate_data_grid(20000)

In [462]:
# evaluate model on test data
loss, my_accuracy_score = first_model.evaluate(x_test, y_test, verbose=0)
print('Loss: ', loss)
print('My Accuracy: ', my_accuracy_score)

Loss:  0.060111064463853836
My Accuracy:  0.9501000046730042


The following code creates a new model that shares the input of first_model and passes the output of first_model through a Lambda(transform) layer. The Lambda layer has no trainable weights, so the number of parameters does not change; it only converts the 2x10 output into the flat 100-element map required by the project. The second model is compiled with the accuracy metric for the flat map (my_accuracy_grid).
Finally, it evaluates the second model on the test data.
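
The second evaluation can be expected to report the same accuracy as the first, because row*10+col identifies a cell uniquely: the flat prediction is correct exactly when both the row and the column are. The NumPy sketch below checks this on random data. `np_transform` is a hypothetical stand-in for transform written for this illustration, and the random "predictions" are not outputs of the trained network.

```python
import numpy as np

def np_transform(pred):
    """NumPy stand-in for transform: (batch, 2, 10) scores -> (batch, 100) one-hot."""
    row = np.argmax(pred[:, 0], axis=-1)
    col = np.argmax(pred[:, 1], axis=-1)
    return np.eye(100)[row * 10 + col]

rng = np.random.default_rng(0)
batch = 1000

# Random ground-truth cells and their two encodings.
true_row = rng.integers(0, 10, size=batch)
true_col = rng.integers(0, 10, size=batch)
y_true = np.stack([np.eye(10)[true_row], np.eye(10)[true_col]], axis=1)  # (batch, 2, 10)
grid_true = np.eye(100)[true_row * 10 + true_col]                        # (batch, 100)

# Fake scores: the truth plus noise, so that some predictions are wrong.
y_pred = y_true + rng.normal(scale=0.5, size=y_true.shape)

# Joint (row AND column) accuracy on the 2x10 output ...
joint = np.mean(np.all(np.argmax(y_pred, -1) == np.argmax(y_true, -1), axis=1))
# ... and plain argmax accuracy on the flat map produced by the transform.
flat = np.mean(np.argmax(np_transform(y_pred), 1) == np.argmax(grid_true, 1))

print(joint == flat)  # True
```

Only the accuracy is comparable between the two models; the loss of the second model is computed on hard one-hot outputs rather than on softmax probabilities, so it is not expected to match the loss of the first model.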

In [None]:
# create the second model
second_model = Model(inputs=first_model.input,outputs=Lambda(transform)(first_model.output))
second_model.compile(optimizer=my_optimizer, loss=['categorical_crossentropy'], metrics=my_accuracy_grid)

In [464]:
loss, my_accuracy_score = second_model.evaluate(x_test, grid_test, verbose=0)
print('Loss: ',loss)
print('My Accuracy: ',my_accuracy_score)

Loss:  0.8042952418327332
My Accuracy:  0.9501000046730042
