# Auto Encoders
<hr>

Hi guys,

We will be working on the same dataset as in **Part 5 - Boltzmann Machines**, so the data preprocessing phase is the same for *Parts 5* and *6*. If you have already completed *Part 5*, feel free to skip the next five tutorials and jump directly to the lecture **Building an AutoEncoder - Step 6**.

*Enjoy Deep Learning!*

## Importing the Libraries

In [None]:
import numpy as np
import pandas as pd
import torch
import torch.nn as nn
import torch.nn.parallel
import torch.optim as optim
import torch.utils.data
from torch.autograd import Variable

## Importing the Dataset

In [None]:
movies = pd.read_csv("ml-1m/movies.dat", sep = "::", header = None, engine = "python", encoding = "latin-1")
movies

In [None]:
users = pd.read_csv("ml-1m/users.dat", sep = "::", header = None, engine = "python", encoding = "latin-1")
users

In [None]:
ratings = pd.read_csv("ml-1m/ratings.dat", sep = "::", header = None, engine = "python", encoding = "latin-1")
ratings

<hr>

### Preparing the training set and the test set

In [None]:
# u1.base has no header row, so header = None keeps the first rating as data
training_set = pd.read_csv("ml-100k/u1.base", delimiter = "\t", header = None)
training_set

In [None]:
training_set = np.array(training_set, dtype = "int")
training_set

In [None]:
# u1.test has no header row, so header = None keeps the first rating as data
test_set = pd.read_csv("ml-100k/u1.test", delimiter = "\t", header = None)
test_set

In [None]:
test_set = np.array(test_set, dtype = "int")
test_set

<hr>

### Getting the number of users and movies

In [None]:
nb_users = int(max(max(training_set[:, 0]), max(test_set[:, 0])))
nb_users

In [None]:
nb_movies = int(max(max(training_set[:, 1]), max(test_set[:, 1])))
nb_movies

<hr>

## Homework Challenge - Coding Exercise

So far our training and test sets have the following format:

*Column 1:* User

*Column 2:* Movie

*Column 3:* Rating

*Column 4:* Timestamp

Define a function that converts this format into a list of horizontal lists, where each horizontal list corresponds to one user and contains all of that user's ratings. Each list should also include the movies the user didn't rate; for those movies, just put a zero. What you should get in the end is a huge list of **943** horizontal lists (because there are **943** users):

*List of User 1:* `[Ratings of all the movies by User 1]`

*List of User 2:* `[Ratings of all the movies by User 2]`

................................................................................

*List of User 943:* `[Ratings of all the movies by User 943]`

Why do this? Because we want to create a new data structure in the shape of a 2D array where:

- the rows are the users,
- the columns are the movies,
- the cells are the ratings.

This coding exercise will be excellent practice for you because you will work with four important techniques in Python:

- functions
- lists and arrays
- for loops
- handling indexes

Try as hard as you can to complete this homework; the more you try, the more you will progress.

The solution is in the next tutorial.

*Good luck!*

<hr>

### Converting the data into an array with users in rows and movies in columns

In [None]:
def convert(data):
    new_data = []
    
    for id_users in range(1, nb_users + 1):
        id_movies = data[:, 1][data[:, 0] == id_users]
        id_ratings = data[:, 2][data[:, 0] == id_users]
        ratings = np.zeros(nb_movies)
        ratings[id_movies - 1] = id_ratings
        new_data.append(list(ratings))
    return new_data
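
As a quick sanity check of the conversion described above, the sketch below builds the same users-by-movies layout for a toy array. It is a vectorized NumPy variant rather than the tutorial's loop, and the helper name `to_user_movie_matrix`, its parameters, and the toy ratings are all made up for illustration.

```python
import numpy as np

def to_user_movie_matrix(data, n_users, n_movies):
    """Hypothetical helper: rows are users, columns are movies, cells are ratings."""
    matrix = np.zeros((n_users, n_movies))
    # IDs start at 1 in the data, so shift them to 0-based indexes.
    matrix[data[:, 0] - 1, data[:, 1] - 1] = data[:, 2]
    return matrix

# Toy data in the same column order: user, movie, rating, timestamp.
toy = np.array([
    [1, 1, 5, 0],
    [1, 3, 4, 0],
    [2, 2, 3, 0],
])

toy_matrix = to_user_movie_matrix(toy, n_users=2, n_movies=3)
print(toy_matrix)
# [[5. 0. 4.]
#  [0. 3. 0.]]
```

Movies a user did not rate stay at zero, which is exactly the convention the rest of the notebook relies on.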

In [None]:
training_set = convert(training_set)
test_set = convert(test_set)

In [None]:
training_set

In [None]:
test_set

<hr>

### Converting the data into Torch tensors

In [None]:
training_set = torch.FloatTensor(training_set)
training_set

In [None]:
test_set = torch.FloatTensor(test_set)
test_set

<hr>

### Creating the Architecture of the Neural Network

In [None]:
class SAE(nn.Module):
    # Define the architecture of the neural network.
    def __init__(self):
        # Call the parent constructor so that SAE inherits all the attributes and methods of nn.Module.
        super(SAE, self).__init__()

        # The below line represents the full connection between the first input vector features and the first encoded vector.
        self.fc1 = nn.Linear(nb_movies, 20) # (number of features in the input vector, No. of Nodes in First Hidden Layer)
        self.fc2 = nn.Linear(20, 10) # (No. of Nodes in First Hidden Layer, No. of Nodes in Second Hidden Layer)
        self.fc3 = nn.Linear(10, 20) # (No. of Nodes in Second Hidden Layer, No. of Nodes in Third Hidden Layer)
        self.fc4 = nn.Linear(20, nb_movies) # (No. of Nodes in Third Hidden Layer, No. of Nodes in Output Layer and "output_vec = input_vec")

        # Specifying the Activation Function
        self.activation = nn.Sigmoid()


    # Forward pass: encodes and then decodes an observation, applying the activation function after each full connection except the last one.
    # It returns the vector of predicted ratings, which we will compare to the vector of real ratings, that is, the input vector.
    def forward(self, x): # "x" -> input vector
        # Encode the input vector "x" into a first shorter vector of 20 elements in the first hidden layer.

        # This is the first encoded vector, obtained by applying the activation function (AF) to the first full connection.
        x = self.activation(self.fc1(x))

        # Do the same for the other full connections
        x = self.activation(self.fc2(x))
        x = self.activation(self.fc3(x))
        x = self.fc4(x) # This is the output layer, so we don't apply the activation function (AF) here.

        return x
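
To see how the vector shrinks and grows back through the network, the sketch below runs one dummy observation through a stand-in model. `TinySAE` and `n_movies_toy` are hypothetical names; the layer sizes copy the 20-10-20 layout above, but the input size is an arbitrary toy value rather than the real `nb_movies`.

```python
import torch
import torch.nn as nn

n_movies_toy = 50  # hypothetical size; the tutorial uses nb_movies from the dataset

class TinySAE(nn.Module):
    """Stand-in with the same 20-10-20 layout, taking the input size as an argument."""
    def __init__(self, n_movies):
        super().__init__()
        self.fc1 = nn.Linear(n_movies, 20)
        self.fc2 = nn.Linear(20, 10)
        self.fc3 = nn.Linear(10, 20)
        self.fc4 = nn.Linear(20, n_movies)
        self.activation = nn.Sigmoid()

    def forward(self, x):
        x = self.activation(self.fc1(x))
        x = self.activation(self.fc2(x))
        x = self.activation(self.fc3(x))
        return self.fc4(x)  # no activation on the output layer

toy_model = TinySAE(n_movies_toy)
toy_input = torch.zeros(1, n_movies_toy)  # one user, no ratings

# The output has the same size as the input: one predicted rating per movie.
toy_output = toy_model(toy_input)

# The middle of the network compresses the user into 10 values.
hidden1 = toy_model.activation(toy_model.fc1(toy_input))
bottleneck = toy_model.activation(toy_model.fc2(hidden1))

print(toy_output.shape, bottleneck.shape)
```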

<hr>

### Creating the object of SAE Class

In [None]:
sae = SAE()

# Loss criterion: the Mean Squared Error.
criterion = nn.MSELoss()

# Creating an Optimizer
optimizer = optim.RMSprop(sae.parameters(), lr = 0.01, weight_decay = 0.5)    # (all the parameters of our AutoEncoder, learning rate, weight decay: an L2 penalty on the weights that regularizes training; it does not change the learning rate)
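
In `torch.optim`, `weight_decay` adds an L2 penalty term to each parameter's gradient; it is not a learning-rate schedule. The sketch below illustrates this on a single toy parameter (`w` and `toy_optimizer` are made-up names): even when the loss gradient is exactly zero, the step pulls the weight toward zero, while the learning rate stored in the optimizer stays the same.

```python
import torch
import torch.optim as optim

# One toy parameter, same optimizer settings as in the tutorial.
w = torch.nn.Parameter(torch.tensor([1.0]))
toy_optimizer = optim.RMSprop([w], lr=0.01, weight_decay=0.5)

# A loss whose gradient with respect to w is exactly zero.
loss = (w * 0.0).sum()

toy_optimizer.zero_grad()
loss.backward()
toy_optimizer.step()

# The weight shrank only because of the weight decay term.
lr_after = toy_optimizer.param_groups[0]["lr"]
print(w.item(), lr_after)
```

If the intent is to lower the learning rate over time, that is the job of the schedulers in `torch.optim.lr_scheduler`, which this notebook does not use.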

<hr>

### Training the SAE

In [None]:
# No. of epochs
nb_epochs = 200

for epoch in range(1, nb_epochs + 1):
    train_loss = 0
    s = 0.  # Counter

    for id_user in range(nb_users):
        input = Variable(training_set[id_user]).unsqueeze(0)
        # Cloning the input variable
        target = input.clone()

        # To save memory and computation, we only process the users who rated at least one movie.
        # "target.data" holds all the ratings; unrated movies are stored as zero, so we count the ratings larger than zero.
        if torch.sum(target.data > 0) > 0:
            # Reset the gradients left over from the previous user, otherwise they accumulate
            optimizer.zero_grad()
            # Vector of predicted ratings
            output = sae(input)
            target.requires_grad = False
            output[target == 0] = 0
            loss = criterion(output, target)
            # Count the rated movies first, then add a tiny epsilon to avoid dividing by zero
            mean_corrector = nb_movies/float(torch.sum(target.data > 0) + 1e-10)
            loss.backward()
            train_loss += np.sqrt(loss.item() * mean_corrector)
            s += 1.
            optimizer.step()

    print("epoch: " + str(epoch) + " loss: " + str(train_loss/s))
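
The role of `mean_corrector` can be checked on a toy example. `nn.MSELoss` averages over every movie, including the unrated ones that were zeroed out, so multiplying by (number of movies / number of rated movies) should turn it into the mean squared error over the rated movies only. The tensors below are made up for illustration.

```python
import torch
import torch.nn as nn

# Toy user: 4 movies, 2 of them rated (zeros mean "not rated").
target = torch.tensor([[5.0, 0.0, 3.0, 0.0]])
output = torch.tensor([[4.0, 2.5, 1.0, 3.0]])  # made-up predictions

# Zero out the predictions for unrated movies so they add no error.
masked = output.clone()
masked[target == 0] = 0

# MSELoss divides by all 4 movies: (1 + 0 + 4 + 0) / 4 = 1.25
loss = nn.MSELoss()(masked, target)

# Rescale so the average is taken over the 2 rated movies only.
n_rated = torch.sum(target > 0).item()
mean_corrector = target.shape[1] / (n_rated + 1e-10)
corrected = loss.item() * mean_corrector

# Direct computation over the rated movies: (1 + 4) / 2 = 2.5
rated = target > 0
direct = ((output - target)[rated] ** 2).mean().item()

print(corrected, direct)
```

Taking the square root of the corrected value gives the per-user RMSE that the training loop accumulates in `train_loss`.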

<hr>

### Testing the SAE

In [None]:
test_loss = 0
s = 0.

for id_user in range(nb_users):
    input = Variable(training_set[id_user]).unsqueeze(0)
    target = Variable(test_set[id_user])

    if torch.sum(target.data > 0) > 0:
        output = sae(input)
        target.requires_grad = False
        target = target.unsqueeze(0)  # Add a dimension at index 0
        mask = target == 0
        output[mask] = 0
        loss = criterion(output, target)
        # Count the rated movies first, then add a tiny epsilon to avoid dividing by zero
        mean_corrector = nb_movies/float(torch.sum(target.data > 0) + 1e-10)
        test_loss += np.sqrt(loss.item() * mean_corrector)
        s += 1.

print('test loss: '+str(test_loss/s))
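
Evaluation does not need gradients, so the same measurement can be written under `torch.no_grad()`. The sketch below is one possible way to do it, not the tutorial's code: `masked_rmse` is a hypothetical helper, and `nn.Identity()` stands in for a trained model so that the toy result can be checked by hand.

```python
import torch
import torch.nn as nn

def masked_rmse(model, inputs, targets):
    """Hypothetical helper: mean per-user RMSE over the movies rated in `targets`."""
    total, counted = 0.0, 0
    model.eval()
    with torch.no_grad():  # no gradient bookkeeping during evaluation
        for user_input, user_target in zip(inputs, targets):
            rated = user_target > 0
            if rated.sum() == 0:
                continue  # skip users with no ratings in the test data
            prediction = model(user_input.unsqueeze(0)).squeeze(0)
            squared_error = (prediction[rated] - user_target[rated]) ** 2
            total += torch.sqrt(squared_error.mean()).item()
            counted += 1
    return total / counted

# Toy data: 2 users, 3 movies. Only user 1 has a test rating (movie 2, rated 4).
toy_train = torch.tensor([[5.0, 0.0, 3.0], [0.0, 4.0, 0.0]])
toy_test = torch.tensor([[0.0, 4.0, 0.0], [0.0, 0.0, 0.0]])

# With the identity "model", user 1's prediction for movie 2 is 0, so the error is 4.
toy_rmse = masked_rmse(nn.Identity(), toy_train, toy_test)
print(toy_rmse)  # 4.0
```

As in the test loop above, the model sees the training ratings as input and is scored only on the movies the user rated in the test set.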

<hr>