# Autoencoder and Tensorflow in Python

This is a copy of the code presented in the Medium article **Recommender system on the Movielens dataset using an Autoencoder and Tensorflow in Python** [https://medium.com/@connectwithghosh/recommender-system-on-the-movielens-using-an-autoencoder-using-tensorflow-in-python-f13d3e8d600d]

--------------

## Recommender system on the Movielens dataset using an Autoencoder and Tensorflow in Python

I’m a huge fan of autoencoders. They have a ton of uses: dimensionality reduction, image denoising and a lot of other stuff.

Today I’ll use one to build a recommender system on the MovieLens 1M (1 million ratings) dataset. You can download it yourself from here. I was mostly inspired by this research paper to build this model. First let me show you what the neural net model will look like. I took this pic straight out of the research paper.

In [3]:
from IPython.display import Image

# For local image file
#Image(filename='path/to/image.png')

# For image URL
Image(url='https://miro.medium.com/v2/resize:fit:1400/format:webp/1*5cYwk8ChU0Uf2pNkgT8R1Q.png', width=600)

This is a shallow neural net with only one hidden layer. I’ll feed in all the ratings a user has given and expect a more generalized rating distribution for that user to come out. I can use that to get an idea of what their ratings would be for movies they haven’t watched.

Now there are only about six thousand users in this dataset (6,040). That’s nowhere near what we need to build a good neural net model, but it makes a good exercise. Let’s start by importing some libraries.

In [5]:
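# Importing tensorflow
# The code below uses the TensorFlow 1.x graph API (placeholders, sessions).
# On TensorFlow 2.x it can be loaded through the compat module like this;
# on a 1.x install, plain `import tensorflow as tf` is enough.
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()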

# Importing some more libraries
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error as MSE

Then I read the ratings data. It has user_id, movie_id, rating and a timestamp. I drop the timestamp and pivot the data so that each row is a user and that user’s movie ratings are the features. Then I create train and test sets from the pivoted data.

In [6]:
# reading the ratings data
ratings = pd.read_csv('ml-1m/ratings.dat',
          sep="::", header=None, engine='python')
ratings.head()

|   | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| 0 | 1 | 1193 | 5 | 978300760 |
| 1 | 1 | 661 | 3 | 978302109 |
| 2 | 1 | 914 | 3 | 978301968 |
| 3 | 1 | 3408 | 4 | 978300275 |
| 4 | 1 | 2355 | 5 | 978824291 |


In [7]:
# Let's pivot the data to get it at a user level
ratings_pivot = pd.pivot_table(ratings[[0,1,2]],
          values=2, index=0, columns=1).fillna(0)
ratings_pivot.head()

1,1,2,3,4,5,6,7,8,9,10,...,3943,3944,3945,3946,3947,3948,3949,3950,3951,3952
0,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
1,5.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,0.0,0.0,0.0,0.0,0.0,2.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
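
To make the pivot step easier to follow, here is a minimal sketch on a made-up toy frame laid out like `ratings.dat` (columns 0 to 3 standing for user id, movie id, rating and timestamp). The values and the names `toy` and `toy_pivot` are purely illustrative and not part of the original article.

```python
import pandas as pd

# Toy ratings in the same layout as ratings.dat (made-up values):
# column 0 = user id, 1 = movie id, 2 = rating, 3 = timestamp.
toy = pd.DataFrame([
    [1, 10, 5, 111],
    [1, 20, 3, 112],
    [2, 20, 4, 113],
    [3, 30, 2, 114],
])

# One row per user, one column per movie that has at least one rating.
# Cells with no rating come out as NaN and are then filled with 0.
toy_pivot = pd.pivot_table(toy[[0, 1, 2]], values=2, index=0, columns=1).fillna(0)
print(toy_pivot)
```

Only movies that appear in the ratings get a column, which would explain why the real pivot has 3706 columns even though the movie ids shown in its header run up to 3952.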


In [8]:
# creating train and test sets
X_train, X_test = train_test_split(ratings_pivot, train_size=0.8)

So right now I have the data at user level, with each user’s ratings as the features. There are 3706 movies in total, so the dataset has 3706 features per user. I replaced every rating the user didn’t provide with 0, which makes things simpler. Now let’s start building the model.

In [9]:
# Deciding how many nodes each layer should have
n_nodes_inpl = 3706
n_nodes_hl1  = 256
n_nodes_outl = 3706
# hidden layer has (3706 + 1) * 256 weights; the extra row belongs to the bias node
hidden_1_layer_vals = {'weights': tf.Variable(tf.random_normal([n_nodes_inpl + 1, n_nodes_hl1]))}
# output layer has (256 + 1) * 3706 weights; the extra row belongs to the bias node
output_layer_vals = {'weights': tf.Variable(tf.random_normal([n_nodes_hl1 + 1, n_nodes_outl]))}

I’ve defined how many nodes each layer has and what the weight matrices look like. Notice that I’m not defining any bias vector for the layers. That’s because I’m going to add a bias node with a constant value of one to each layer instead. This is a bit different from how I usually build a network, but I’m sticking to what’s depicted in the original pic.

In [None]:
# user with 3706 ratings goes in
input_layer = tf.placeholder('float', [None, 3706])
# add a constant node to the first layer
# it needs to have the same number of rows as the input layer for me
# to be able to concatenate it later
input_layer_const = tf.fill( [tf.shape(input_layer)[0], 1] ,1.0  )
input_layer_concat =  tf.concat([input_layer, input_layer_const], 1)
# multiply output of input_layer with a weight matrix
layer_1 = tf.nn.sigmoid(tf.matmul(input_layer_concat,
                                  hidden_1_layer_vals['weights']))
# adding one bias node to the hidden layer
layer1_const = tf.fill( [tf.shape(layer_1)[0], 1] ,1.0  )
layer_concat =  tf.concat([layer_1, layer1_const], 1)
# multiply output of hidden with a weight matrix to get final output
output_layer = tf.matmul( layer_concat,output_layer_vals['weights'])
# output_true shall have the original shape for error calculations
output_true = tf.placeholder('float', [None, 3706])
# define our cost function
meansq =    tf.reduce_mean(tf.square(output_layer - output_true))
# define our optimizer
learn_rate = 0.1   # how fast the model should learn
optimizer = tf.train.AdagradOptimizer(learn_rate).minimize(meansq)
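
For readers who find the graph code hard to trace, the following is a rough NumPy sketch of what the forward pass and the cost compute for one batch: append a constant node, apply the sigmoid hidden layer, append another constant node, then apply a linear output layer and take the mean squared error against the input. The helper names (`sigmoid`, `add_bias_node`, `forward`) and the tiny sizes are assumptions made for illustration; this is not the TensorFlow code itself and it does no training.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def add_bias_node(a):
    # Append a column of ones, mirroring the tf.fill + tf.concat step.
    return np.concatenate([a, np.ones((a.shape[0], 1))], axis=1)

def forward(x, w_hidden, w_out):
    hidden = sigmoid(add_bias_node(x) @ w_hidden)   # shape (batch, n_hidden)
    return add_bias_node(hidden) @ w_out            # shape (batch, n_movies)

rng = np.random.default_rng(0)
n_movies, n_hidden, n_users = 6, 3, 4               # tiny stand-ins for 3706 and 256

x = rng.integers(0, 6, size=(n_users, n_movies)).astype(float)  # fake ratings 0..5
w_hidden = rng.normal(size=(n_movies + 1, n_hidden))
w_out = rng.normal(size=(n_hidden + 1, n_movies))

recon = forward(x, w_hidden, w_out)
mse = float(np.mean((recon - x) ** 2))              # same form as `meansq`
print(recon.shape, mse)
```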

Usually I add a bias vector instead of a bias node, but the two are supposed to act the same way. Alright, now that we’re done building the model, I’ll define how many epochs I want to run and the batch size.
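
The claim that a bias node behaves like a separate bias term can be checked numerically. The sketch below uses NumPy with small made-up sizes: the last row of the augmented weight matrix plays the role of the bias, so multiplying the ones-augmented input by it gives the same result as `x @ W + b`. All variable names here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, batch = 5, 3, 4                    # tiny stand-ins for 3706 and 256

x = rng.normal(size=(batch, n_in))
W_aug = rng.normal(size=(n_in + 1, n_hidden))      # extra last row acts as the bias

# Bias-node form: append a column of ones, then a single matmul.
x_aug = np.concatenate([x, np.ones((batch, 1))], axis=1)
out_node = x_aug @ W_aug

# Separate-bias form: split the same matrix into weights and a bias vector.
W, b = W_aug[:-1], W_aug[-1]
out_vector = x @ W + b

same = np.allclose(out_node, out_vector)
print(same)
```

One practical difference is initialization: with the bias-node form used in this article the bias row is drawn from the same random normal as the other weights, whereas a separate bias is often started at zero.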

In [None]:
# initialising variables and starting the session
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
# defining batch size and number of epochs
batch_size = 100  # how many users to use together for training
hm_epochs = 200   # how many times to go through the entire dataset
tot_users = X_train.shape[0] # total number of users in the training set

Now I’ll start the training. All I’m doing is iterating through the data in batches, training the model and printing out the train and test error after each epoch.

In [None]:
# running the model for 200 epochs, taking users in batches of 100
# train/test MSE and the summed batch loss are printed after each epoch
for epoch in range(hm_epochs):
    epoch_loss = 0    # initializing error as 0

    for i in range(int(tot_users/batch_size)):
        epoch_x = X_train[ i*batch_size : (i+1)*batch_size ]
        _, c = sess.run([optimizer, meansq],
               feed_dict={input_layer: epoch_x,
               output_true: epoch_x})
        epoch_loss += c

    output_train = sess.run(output_layer,
               feed_dict={input_layer:X_train})
    output_test = sess.run(output_layer,
                   feed_dict={input_layer:X_test})

    print('MSE train', MSE(output_train, X_train),'MSE test', MSE(output_test, X_test))
    print('Epoch', epoch, '/', hm_epochs, 'loss:',epoch_loss)
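
One detail of the loop above: the number of batches is the row count divided by the batch size and truncated to an integer, so any users left over after the last full batch are never used for training. A possible variation, shown here as a sketch with a hypothetical helper `batch_bounds`, yields row ranges that also cover that final partial batch. The figure 4832 assumes an 80% split of 6,040 users.

```python
def batch_bounds(n_rows, batch_size):
    """Yield (start, stop) row ranges that together cover all n_rows."""
    for start in range(0, n_rows, batch_size):
        yield start, min(start + batch_size, n_rows)

# 4832 rows is an assumed training-set size (80% of 6,040 users).
bounds = list(batch_bounds(4832, 100))
print(len(bounds), bounds[0], bounds[-1])
```

Inside the training loop, each `(start, stop)` pair could then be used as `X_train[start:stop]` in place of the `i*batch_size : (i+1)*batch_size` slice.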

I’m almost finished. Now to get the predicted ratings for a user, all you have to do is pass their ratings through the net once. If you’re going to use the output to recommend that user some movies, you can pick the ones with the highest predicted ratings that they haven’t seen yet.

In [None]:
# pick a user
sample_user = X_test.iloc[99,:]
#get the predicted ratings
sample_user_pred = sess.run(output_layer, feed_dict={input_layer:[sample_user]})
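
Picking the highest-rated unseen movies could look roughly like the sketch below. It uses NumPy on made-up vectors; `top_n_unseen` is a hypothetical helper, and it assumes, as in this article, that a stored rating of 0 means the user has not rated the movie.

```python
import numpy as np

def top_n_unseen(actual, predicted, n=3):
    """Return column positions of the n highest predictions among unrated movies."""
    actual = np.asarray(actual, dtype=float)
    scores = np.asarray(predicted, dtype=float).copy()
    scores[actual > 0] = -np.inf             # exclude movies the user already rated
    order = np.argsort(-scores)              # positions sorted by descending score
    n_unseen = int((actual == 0).sum())
    return order[:min(n, n_unseen)]

# Made-up example: the user rated movies at positions 0 and 3.
actual = np.array([5.0, 0.0, 0.0, 3.0, 0.0])
predicted = np.array([4.8, 2.1, 4.2, 3.3, 3.9])
picks = top_n_unseen(actual, predicted, n=2)
print(picks)
```

With the variables from the cell above, the call would presumably be `top_n_unseen(sample_user.values, sample_user_pred[0])`, and `sample_user.index[picks]` would map the returned positions back to movie ids.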

Thanks for making it to the end. Like and leave a comment if this was helpful to you.