<a href="https://colab.research.google.com/github/villafue/Capstone_2_MovieLens/blob/main/Notebook/8_Deep_Learning.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 8.

<a name="top"></a>
## Deep Learning

### Table of Contents

Note: The internal links work in Google Colab.

1. **[Preface](https://colab.research.google.com/github/villafue/Capstone_2_MovieLens/blob/main/MovieLens.ipynb#preface)**
2. **[Introduction](https://colab.research.google.com/github/villafue/Capstone_2_MovieLens/blob/main/MovieLens.ipynb#introduction)**
3. **[Exploratory Data Analysis](https://colab.research.google.com/github/villafue/Capstone_2_MovieLens/blob/main/Notebook/3_Exploratory_Data_Analysis.ipynb.ipynb#eda)**
4. **[Framework](https://colab.research.google.com/github/villafue/Capstone_2_MovieLens/blob/main/Notebook/4_Framework.ipynb#framework)**
5. **[Content Based Recommenders](https://colab.research.google.com/github/villafue/Capstone_2_MovieLens/blob/main/Notebook/5_Content_Based_Recommenders.ipynb#content)**
6. **[Collaborative Based Recommenders](https://colab.research.google.com/github/villafue/Capstone_2_MovieLens/blob/main/Notebook/6_Collaborative_Based_Recommenders.ipynb#collaborative)**
7. **[Matrix Factorization Methods](https://colab.research.google.com/github/villafue/Capstone_2_MovieLens/blob/main/Notebook/7_Matrix_Factorization_Methods.ipynb#matrix)**
8. **[Deep Learning](#deep_learning)**
    - 8.1 - [Introduction](#introduction)
    - 8.2 - [Import Files](#import)
    - 8.3 - [Restricted Boltzmann Machines (RBM)](#rbm)
      - 8.3.1 - [RBM.py](#rbm_py)
      - 8.3.2 - [RBMAlgorithm.py](#rbm_algo)
      - 8.3.3 - [RBM Model](#rbm_model)
    - 8.4 - [Autoencoders for Recommendations](#autorec)
      - 8.4.1 - [AutoRec.py](#autorec_py)
      - 8.4.2 - [AutoRecAlgorithm.py](#autorec_algo)
      - 8.4.3 - [AutoRec Model](#autorec_mod)
    - 8.5 - [Results](#results)

***

<a name="introduction"></a>
### 8.1 - Introduction

In this section, I try two neural networks coded by Frank. The first is a [Restricted Boltzmann Machine (RBM)](https://wiki.pathmind.com/restricted-boltzmann-machine) and the second is [Autoencoders for Recommendations (AutoRec)](https://medium.com/@connectwithghosh/recommender-system-on-the-movielens-using-an-autoencoder-using-tensorflow-in-python-f13d3e8d600d). Although I am impressed and satisfied with my SVD results, Frank argues that there are good reasons for building recommenders with deep learning techniques. First, it lets us take advantage of the rapid advances in AI and deep learning: neural networks have the potential to find complex patterns that more traditional machine learning techniques may miss. Second, we can handle larger datasets, especially when running TensorFlow across clusters of GPUs.

The Python scripts and files I used are in this [folder](https://github.com/villafue/Capstone_2_MovieLens/tree/main/Python%20Scripts/DeepLearning). 


***

**[Back to Top](#top)**

***

<a name="import"></a>
### 8.2 - Import Files

In [1]:
import os
os.makedirs('/content/deep_learning', exist_ok=True)  # safe to re-run if the folder already exists
print('Folder created!')
os.chdir('/content/deep_learning')
print('Files are in this folder!')

Folder created!
Files are in this folder!


In [2]:
pip install scikit-surprise

Collecting scikit-surprise
  Downloading https://files.pythonhosted.org/packages/97/37/5d334adaf5ddd65da99fc65f6507e0e4599d092ba048f4302fe8775619e8/scikit-surprise-1.1.1.tar.gz (11.8MB)
     |████████████████████████████████| 11.8MB 4.9MB/s 
Building wheels for collected packages: scikit-surprise
  Building wheel for scikit-surprise (setup.py) ... done
  Created wheel for scikit-surprise: filename=scikit_surprise-1.1.1-cp37-cp37m-linux_x86_64.whl size=1617541 sha256=f2de6f701dfaaf76a6b8fcee8e1fee4ab9540d93eca8090d6969241917f3bbbc
  Stored in directory: /root/.cache/pip/wheels/78/9c/3d/41b419c9d2aff5b6e2b4c0fc8d25c538202834058f9ed110d0
Successfully built scikit-surprise
Installing collected packages: scikit-surprise
Successfully installed scikit-surprise-1.1.1


In [3]:
print("Loading Architecture.")
!python "MovieLens.py"
print('1 of 5: Done')
!python "RecommenderMetrics.py"
print('2 of 5: Done')
!python "EvaluationData.py"
print('3 of 5: Done')
!python "EvaluatedAlgorithm.py"
print('4 of 5: Done')
!python "Evaluator.py"
print('5 of 5: Core Framework Loaded!')

Loading Architecture.
1 of 5: Done
2 of 5: Done
3 of 5: Done
4 of 5: Done
5 of 5: Core Framework Loaded!


In [4]:
print('Loading Model Algorithms')
print('\n', '-' * 136)
!python "RBM.py"
!python "RBMAlgorithm.py"
!python "AutoRec.py"
!python "AutoRecAlgorithm.py"
print('\n', '-' * 136) 
print('1 of 2: Loaded RBM')
print('2 of 2: Loaded AutoRec')
print('All Models Loaded!')

Loading Model Algorithms

 ----------------------------------------------------------------------------------------------------------------------------------------
2021-03-23 21:33:27.943654: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2021-03-23 21:33:30.913382: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2021-03-23 21:33:33.364720: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2021-03-23 21:33:36.252782: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0

 ----------------------------------------------------------------------------------------------------------------------------------------
1 of 2: Loaded RBM
2 of 2: Loaded AutoRec
All Models Loaded!


***

**[Back to Top](#top)**

***

<a name="rbm"></a>
### 8.3 - Restricted Boltzmann Machines (RBM)

Frank calls RBMs "The granddaddy of neural networks and recommender systems." They have been in use since 2007 and are still a common technique today. In fact, a few years ago, Netflix confirmed it was still using RBMs as part of its recommender system. The original paper is [here](https://www.cs.toronto.edu/~rsalakhu/papers/rbmcf.pdf).
 
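RBMs are also one of the simplest neural networks: just two layers, a visible layer and a hidden layer. We train one by feeding our training data into the visible layer in a forward pass, reconstructing the visible layer from the hidden layer in a backward pass, and then adjusting the weights and biases based on the difference between the input and its reconstruction. This procedure is called contrastive divergence, and it takes the place of the backpropagation used in most other networks. In the code below, a sigmoid activation function produces the output of each hidden neuron.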
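
To make the forward and backward passes concrete, here is a minimal NumPy sketch of a single contrastive-divergence update with one Gibbs step (k=1). It is an illustration under simplifying assumptions, not the course code: the visible units are plain binary with a sigmoid reconstruction (no per-movie softmax or masking), the weight update is averaged over the batch, and the names `cd1_step`, `b_hidden`, and `b_visible` are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, b_hidden, b_visible, lr=0.01, rng=None):
    """One contrastive-divergence (k=1) update for a binary RBM.

    v0 is a batch of visible vectors with shape [batch, n_visible].
    Returns updated copies of (W, b_hidden, b_visible).
    """
    rng = np.random.default_rng() if rng is None else rng
    # Forward pass: hidden probabilities, then a binary sample of the hidden layer
    h_prob0 = sigmoid(v0 @ W + b_hidden)
    h_sample = (rng.random(h_prob0.shape) < h_prob0).astype(v0.dtype)
    # Backward pass: reconstruct the visible layer, then re-infer the hidden layer
    v_prob = sigmoid(h_sample @ W.T + b_visible)
    h_prob1 = sigmoid(v_prob @ W + b_hidden)
    # Move the weights toward the data statistics and away from the reconstruction
    batch = v0.shape[0]
    W_new = W + lr * (v0.T @ h_sample - v_prob.T @ h_prob1) / batch
    b_hidden_new = b_hidden + lr * (h_prob0 - h_prob1).mean(axis=0)
    b_visible_new = b_visible + lr * (v0 - v_prob).mean(axis=0)
    return W_new, b_hidden_new, b_visible_new

# Toy data: 4 "users", 6 visible units, 3 hidden units
rng = np.random.default_rng(0)
v = (rng.random((4, 6)) < 0.5).astype(np.float64)
W = rng.normal(scale=0.1, size=(6, 3))
W, bh, bv = cd1_step(v, W, np.zeros(3), np.zeros(6), rng=rng)
print(W.shape, bh.shape, bv.shape)
```

Note that the parameters are created once, outside the update function, and passed back in on every step so that learning accumulates across batches.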

In this section, I include the two algorithms Frank wrote and my final optimized RBM model.

<a name="rbm_py"></a>
#### 8.3.1 - RBM.py


`RBM.py` implements the RBM itself, with the twists that are specific to recommender systems, such as contrastive divergence. Frank used TensorFlow 2, so I can use my GPU to accelerate the runtime.
 
**Line (15)**, in the `__init__` function, has all the hyperparameters that can be tuned. `epochs` is how many times we cycle through the training data in hopes that the model converges on a good set of weights and biases for prediction. `hiddenDimensions` is the number of hidden neurons in the system. `learningRate` controls how quickly the model moves toward a solution on each iteration. `batchSize` controls how many users we train on at a time.

Although not a tunable hyperparameter, `visibleDimensions` is the product of the number of movies and the number of distinct rating values. `ratingValues` is 10 because there are 10 distinct ratings once half-star values are included (0.5 through 5.0).
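
As a small illustration of where that product comes from, the sketch below one-hot encodes a user's star ratings into a flat visible vector, using the same rating-to-index arithmetic as the wrapper in section 8.3.2 (`int(rating * 2) - 1`). The helper names `rating_to_index` and `encode_user` and the three-movie toy catalog are assumptions made for this example.

```python
import numpy as np

RATING_VALUES = 10  # 0.5, 1.0, ..., 5.0

def rating_to_index(rating):
    # 0.5 stars -> 0, 1.0 stars -> 1, ..., 5.0 stars -> 9
    return int(float(rating) * 2.0) - 1

def encode_user(ratings_by_item, num_items):
    """ratings_by_item maps an item index to a star rating; returns a flat 0/1 vector."""
    visible = np.zeros((num_items, RATING_VALUES), dtype=np.float32)
    for item, rating in ratings_by_item.items():
        visible[item, rating_to_index(rating)] = 1.0
    return visible.reshape(-1)

num_items = 3
user = encode_user({0: 3.5, 2: 5.0}, num_items)
print(user.shape)             # (30,), i.e. num_items * RATING_VALUES
print(np.flatnonzero(user))   # [ 6 29]
```

Movie 1 is unrated, so its ten slots stay at zero, which is what the mask in `MakeGraph` later keys on.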

The RBM itself is coded on **Line (44)** in the `MakeGraph` function. Frank admits the code is pretty convoluted, but he was trying to deal with the sparsity issue and to use softmax to further sort and rank movies that were tied with the same rating.
 


In [None]:
# -*- coding: utf-8 -*-
"""
Updated on Sun Dec 1 08:32:13 2019

@author: Frank

@modified: Saurabh
"""

import numpy as np
import tensorflow as tf

class RBM(object):

    def __init__(self, visibleDimensions, epochs=30, hiddenDimensions=50, ratingValues=10, learningRate=0.001, batchSize=100):

        self.visibleDimensions = visibleDimensions
        self.epochs = epochs
        self.hiddenDimensions = hiddenDimensions
        self.ratingValues = ratingValues
        self.learningRate = learningRate
        self.batchSize = batchSize
        
                
    def Train(self, X):

        for epoch in range(self.epochs):
            np.random.shuffle(X)
            
            trX = np.array(X)
            for i in range(0, trX.shape[0], self.batchSize):
                epochX = trX[i:i+self.batchSize]
                self.MakeGraph(epochX)

            print("Trained epoch ", epoch)


    def GetRecommendations(self, inputUser):
        
        feed = self.MakeHidden(inputUser)
        rec = self.MakeVisible(feed)
        return rec[0]       

    def MakeGraph(self, inputUser):

        # Initialize weights randomly
        maxWeight = -4.0 * np.sqrt(6.0 / (self.hiddenDimensions + self.visibleDimensions))
        self.weights = tf.Variable(tf.random.uniform([self.visibleDimensions, self.hiddenDimensions], minval=-maxWeight, maxval=maxWeight), tf.float32, name="weights")
        
        self.hiddenBias = tf.Variable(tf.zeros([self.hiddenDimensions], tf.float32, name="hiddenBias"))
        self.visibleBias = tf.Variable(tf.zeros([self.visibleDimensions], tf.float32, name="visibleBias"))
        
        # Perform Gibbs Sampling for Contrastive Divergence, per the paper we assume k=1 instead of iterating over the 
        # forward pass multiple times since it seems to work just fine
        
        # Forward pass
        # Sample hidden layer given visible...
        # Get tensor of hidden probabilities
        hProb0 = tf.nn.sigmoid(tf.matmul(inputUser, self.weights) + self.hiddenBias)
        # Sample from all of the distributions
        hSample = tf.nn.relu(tf.sign(hProb0 - tf.random.uniform(tf.shape(hProb0))))
        # Stitch it together
        forward = tf.matmul(tf.transpose(inputUser), hSample)
        
        # Backward pass
        # Reconstruct visible layer given hidden layer sample
        v = tf.matmul(hSample, tf.transpose(self.weights)) + self.visibleBias
        
        # Build up our mask for missing ratings
        vMask = tf.sign(inputUser) # Make sure everything is 0 or 1
        vMask3D = tf.reshape(vMask, [tf.shape(v)[0], -1, self.ratingValues]) # Reshape into arrays of individual ratings
        vMask3D = tf.reduce_max(vMask3D, axis=[2], keepdims=True) # Use reduce_max to either give us 1 for ratings that exist, and 0 for missing ratings
        
        # Extract rating vectors for each individual set of 10 rating binary values
        v = tf.reshape(v, [tf.shape(v)[0], -1, self.ratingValues])
        vProb = tf.nn.softmax(v * vMask3D) # Apply softmax activation function
        vProb = tf.reshape(vProb, [tf.shape(v)[0], -1]) # And shove them back into the flattened state. Reconstruction is done now.
        # Stitch it together to define the backward pass and updated hidden biases
        hProb1 = tf.nn.sigmoid(tf.matmul(vProb, self.weights) + self.hiddenBias)
        backward = tf.matmul(tf.transpose(vProb), hProb1)
    
        # Now define what each epoch will do...
        # Run the forward and backward passes, and update the weights
        weightUpdate = self.weights.assign_add(self.learningRate * (forward - backward))
        # Update hidden bias, minimizing the divergence in the hidden nodes
        hiddenBiasUpdate = self.hiddenBias.assign_add(self.learningRate * tf.reduce_mean(hProb0 - hProb1, 0))
        # Update the visible bias, minimizing divergence in the visible results
        visibleBiasUpdate = self.visibleBias.assign_add(self.learningRate * tf.reduce_mean(inputUser - vProb, 0))

        self.update = [weightUpdate, hiddenBiasUpdate, visibleBiasUpdate]
        
    def MakeHidden(self, inputUser):
        hidden = tf.nn.sigmoid(tf.matmul(inputUser, self.weights) + self.hiddenBias)
        self.MakeGraph(inputUser)
        return hidden
    
    def MakeVisible(self, feed):
        visible = tf.nn.sigmoid(tf.matmul(feed, tf.transpose(self.weights)) + self.visibleBias)
        #self.MakeGraph(feed)
        return visible


***

**[Back to Top](#top)**

***

<a name="rbm_algo"></a>
#### 8.3.2 - RBMAlgorithm.py

`RBMAlgorithm.py` is basically a wrapper around the `RBM.py` module that makes it easier to use and ties it into Frank's core recommendation framework.
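
One step in the wrapper's `fit` method is worth isolating: instead of taking the single highest-scoring rating slot, it normalizes the ten scores for a movie with softmax and uses the expected value as the prediction. The sketch below reproduces that arithmetic on its own. The function name `expected_rating` is hypothetical, and the max-subtraction inside `softmax` is a numerical-stability detail added here rather than something taken from the wrapper.

```python
import numpy as np

def softmax(x):
    shifted = x - np.max(x)  # subtract the max for numerical stability
    e = np.exp(shifted)
    return e / e.sum()

def expected_rating(scores):
    """scores holds 10 raw outputs for one movie, one per half-star value."""
    probs = softmax(np.asarray(scores, dtype=np.float64))
    expected_index = np.dot(np.arange(len(probs)), probs)
    return (expected_index + 1) * 0.5  # index 0 -> 0.5 stars, index 9 -> 5.0 stars

flat = expected_rating(np.zeros(10))                          # no slot stands out
peaked = expected_rating([0, 0, 0, 0, 0, 0, 0, 0, 0, 50])     # 5-star slot dominates
print(flat, peaked)
```

A completely flat score vector gives 2.75 stars, the midpoint of the scale. Since the RBM's visible outputs pass through a sigmoid and so all lie between 0 and 1, the ten scores can never differ by much, which may be why the sample predictions in section 8.3.3 sit just under 3 stars.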

In [None]:
# -*- coding: utf-8 -*-
"""
Created on Fri May  4 13:08:25 2018

@author: Frank
"""

from surprise import AlgoBase
from surprise import PredictionImpossible
import numpy as np
from RBM import RBM

class RBMAlgorithm(AlgoBase):

    def __init__(self, epochs=30, hiddenDim=100, learningRate=0.001, batchSize=100, sim_options={}):
        AlgoBase.__init__(self)
        self.epochs = epochs
        self.hiddenDim = hiddenDim
        self.learningRate = learningRate
        self.batchSize = batchSize
        
    def softmax(self, x):
        return np.exp(x) / np.sum(np.exp(x), axis=0)

    def fit(self, trainset):
        AlgoBase.fit(self, trainset)

        numUsers = trainset.n_users
        numItems = trainset.n_items
        
        trainingMatrix = np.zeros([numUsers, numItems, 10], dtype=np.float32)
        
        for (uid, iid, rating) in trainset.all_ratings():
            adjustedRating = int(float(rating)*2.0) - 1
            trainingMatrix[int(uid), int(iid), adjustedRating] = 1
        
        # Flatten to a 2D array, with nodes for each possible rating type on each possible item, for every user.
        trainingMatrix = np.reshape(trainingMatrix, [trainingMatrix.shape[0], -1])
        
        # Create an RBM with (num items * rating values) visible nodes
        rbm = RBM(trainingMatrix.shape[1], hiddenDimensions=self.hiddenDim, learningRate=self.learningRate, batchSize=self.batchSize, epochs=self.epochs)
        rbm.Train(trainingMatrix)

        self.predictedRatings = np.zeros([numUsers, numItems], dtype=np.float32)
        for uiid in range(trainset.n_users):
            if (uiid % 50 == 0):
                print("Processing user ", uiid)
            recs = rbm.GetRecommendations([trainingMatrix[uiid]])
            recs = np.reshape(recs, [numItems, 10])
            
            for itemID, rec in enumerate(recs):
                # The obvious thing would be to just take the rating with the highest score:                
                #rating = rec.argmax()
                # ... but this just leads to a huge multi-way tie for 5-star predictions.
                # The paper suggests performing normalization over K values to get probabilities
                # and take the expectation as your prediction, so we'll do that instead:
                normalized = self.softmax(rec)
                rating = np.average(np.arange(10), weights=normalized)
                self.predictedRatings[uiid, itemID] = (rating + 1) * 0.5
        
        return self


    def estimate(self, u, i):

        if not (self.trainset.knows_user(u) and self.trainset.knows_item(i)):
            raise PredictionImpossible('User and/or item is unknown.')
        
        rating = self.predictedRatings[u, i]
        
        if (rating < 0.001):
            raise PredictionImpossible('No valid prediction exists.')
            
        return rating
    

***

**[Back to Top](#top)**

***

<a name="rbm_model"></a>
#### 8.3.3 - RBM Model

When I first attempted to tune this model, I tried Surprise's `RandomizedSearchCV`. Unfortunately, just as with my SVD++ model, it kept producing errors. Furthermore, running the model Frank used took hours, so I felt very constrained in exploring hyperparameters. With five-fold cross-validation in `GridSearchCV`, I did not have the time to truly optimize the model. The run below took all night to finish training.
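
A quick back-of-the-envelope count shows why even a small grid is expensive. The sketch below uses the grid and fold count from the cell that follows and the wrapper's default of 30 epochs; it only counts the cross-validation fits and ignores the per-user prediction pass that follows each one.

```python
from itertools import product

param_grid = {'hiddenDim': [20, 30, 40], 'learningRate': [0.1, 0.15, 0.2]}
cv_folds = 5
epochs_per_fit = 30  # default `epochs` in RBMAlgorithm

combinations = list(product(*param_grid.values()))
total_fits = len(combinations) * cv_folds
total_epochs = total_fits * epochs_per_fit

print(len(combinations), total_fits, total_epochs)  # 9 45 1350
```

Adding a third hyperparameter with three values would triple every one of those numbers.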

In [None]:
# -*- coding: utf-8 -*-
"""
Created on Thu May  3 11:11:13 2018

@author: Frank
"""

from MovieLens import MovieLens
from RBMAlgorithm import RBMAlgorithm
from surprise import NormalPredictor
from Evaluator import Evaluator
from surprise.model_selection import GridSearchCV

import random
import numpy as np

def LoadMovieLensData():
    ml = MovieLens()
    print("Loading movie ratings...")
    data = ml.loadMovieLensLatestSmall()
    print("\nComputing movie popularity ranks so we can measure novelty later...")
    rankings = ml.getPopularityRanks()
    return (ml, data, rankings)

np.random.seed(29)
random.seed(29)

# Load up common data set for the recommender algorithms
(ml, evaluationData, rankings) = LoadMovieLensData()

print("Searching for best parameters...")
param_grid = {'hiddenDim': [20, 30, 40], 'learningRate': [0.1, 0.15, 0.2]}
gs = GridSearchCV(RBMAlgorithm, param_grid, measures=['rmse', 'mae'], cv=5)

gs.fit(evaluationData)

# best RMSE score
print("Best RMSE score attained: ", gs.best_score['rmse'])

# combination of parameters that gave the best RMSE score
print(gs.best_params['rmse'])

# Construct an Evaluator to, you know, evaluate them
evaluator = Evaluator(evaluationData, rankings)

params = gs.best_params['rmse']
RBMtuned = RBMAlgorithm(hiddenDim = params['hiddenDim'], learningRate = params['learningRate'])
evaluator.AddAlgorithm(RBMtuned, "RBM - Tuned")

RBMUntuned = RBMAlgorithm()
evaluator.AddAlgorithm(RBMUntuned, "RBM - Untuned")

# Just make random recommendations
Random = NormalPredictor()
evaluator.AddAlgorithm(Random, "Random")

# Fight!
evaluator.Evaluate(True)

evaluator.SampleTopNRecs(ml)


Loading movie ratings...

Computing movie popularity ranks so we can measure novelty later...
Searching for best parameters...
Trained epoch  0
Trained epoch  1
Trained epoch  2
Trained epoch  3
Trained epoch  4
Trained epoch  5
Trained epoch  6
Trained epoch  7
Trained epoch  8
Trained epoch  9
Trained epoch  10
Trained epoch  11
Trained epoch  12
Trained epoch  13
Trained epoch  14
Trained epoch  15
Trained epoch  16
Trained epoch  17
Trained epoch  18
Trained epoch  19
Trained epoch  20
Trained epoch  21
Trained epoch  22
Trained epoch  23
Trained epoch  24
Trained epoch  25
Trained epoch  26
Trained epoch  27
Trained epoch  28
Trained epoch  29
Processing user  0
Processing user  50
Processing user  100
Processing user  150
Processing user  200
Processing user  250
Processing user  300
Processing user  350
Processing user  400
Processing user  450
Processing user  500
Processing user  550
Processing user  600
Trained epoch  0
Trained epoch  1
Trained epoch  2
Trained epoch  3
Train

Looking at the metrics below, the tuned `RMSE` (1.2797) is barely better than the untuned one (1.2815). I do, however, prefer the `Novelty` score of the tuned version. Next, I'll look at the movie recommendations.

| Algorithm | RMSE | MAE | HR | cHR | ARHR | Coverage | Diversity | Novelty |
|-|-|-|-|-|-|-|-|-|
| RBM - Tuned | 1.2797 | 1.0856 | 0.0230 | 0.0230 | 0.0089 | 0.0000 | 0.0574 | 965.0569 | 
| RBM - Untuned | 1.2815 | 1.0872 | 0.0016 | 0.0016 | 0.0002 | 0.0000 | 0.7377 | 4912.7415 |
| Random | 1.4218 | 1.1346 | 0.0115 | 0.0115 | 0.0042 | 1.0000 | 0.0499 | 860.2303 | 

```
Legend:

RMSE:      Root Mean Squared Error. Lower values mean better accuracy.
MAE:       Mean Absolute Error. Lower values mean better accuracy.
HR:        Hit Rate; how often we are able to recommend a left-out rating. Higher is better.
cHR:       Cumulative Hit Rate; hit rate, confined to ratings above a certain threshold. Higher is better.
ARHR:      Average Reciprocal Hit Rank - Hit rate that takes the ranking into account. Higher is better.
Coverage:  Ratio of users for whom recommendations above a certain threshold exist. Higher is better.
Diversity: 1-S, where S is the average similarity score between every possible pair of recommendations
           for a given user. Higher means more diverse.
Novelty:   Average popularity rank of recommended items. Higher means more novel.
```
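
For readers who want to see the arithmetic behind the first few entries of the legend, here is a small sketch of RMSE, MAE, and hit rate. These are textbook definitions written for illustration. They are not the implementations in `RecommenderMetrics.py`, and the toy ratings, user IDs, and movie labels are made up.

```python
import numpy as np

def rmse(actual, predicted):
    err = np.asarray(actual, dtype=float) - np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean(err ** 2)))

def mae(actual, predicted):
    err = np.asarray(actual, dtype=float) - np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(err)))

def hit_rate(top_n_by_user, left_out_by_user):
    """Fraction of users whose left-out movie appears in their top-N list."""
    hits = sum(1 for user, movie in left_out_by_user.items()
               if movie in top_n_by_user.get(user, []))
    return hits / len(left_out_by_user)

actual = [4.0, 3.0, 5.0, 2.0]
predicted = [3.0, 3.0, 4.0, 4.0]
print(round(rmse(actual, predicted), 4))  # 1.2247
print(mae(actual, predicted))             # 1.0

top_n = {'u1': ['A', 'B'], 'u2': ['C']}
left_out = {'u1': 'B', 'u2': 'D'}
print(hit_rate(top_n, left_out))          # 0.5
```

RMSE penalizes the single two-star miss more heavily than MAE does, which is why the two values differ on the same predictions.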

Despite the `RMSE` being almost the same, the tuned RBM model's movie recommendations are a huge improvement in quality. That makes sense: its `Novelty` score is much lower than the untuned version's, meaning it is recommending more popular movies.

```
Using recommender  RBM - Tuned

We recommend:
Departed, The (2006) 2.892468
Howl's Moving Castle (Hauru no ugoku shiro) (2004) 2.8908362
Fight Club (1999) 2.8869665
Memento (2000) 2.8868942
Inside Out (2015) 2.8860178
Monsters, Inc. (2001) 2.8839574
Spirited Away (Sen to Chihiro no kamikakushi) (2001) 2.8832886
Finding Nemo (2003) 2.8820338
Godfather, The (1972) 2.8793077
How to Train Your Dragon (2010) 2.8788555

---------------------------------------

Using recommender  RBM - Untuned

We recommend:
My Dog Skip (1999) 2.7925565
Coco Before Chanel (Coco avant Chanel) (2009) 2.789673
Story of Women (Affaire de femmes, Une) (1988) 2.7859182
Sunshine (1999) 2.7854161
Music of the Heart (1999) 2.783457
Four Christmases (2008) 2.7831357
Chronicles of Narnia: The Voyage of the Dawn Treader, The (2010) 2.7824636
Splash (1984) 2.7824612
Doctor Strange (2016) 2.782002
Natural Born Killers (1994) 2.7810924

--------------------------

Using recommender  Random

We recommend:
Ed Wood (1994) 5
Mrs. Doubtfire (1993) 5
Batman (1989) 5
Highlander (1986) 5
Great Mouse Detective, The (1986) 5
Rocketeer, The (1991) 5
Willow (1988) 5
Texas Chainsaw Massacre, The (1974) 5
Superman II (1980) 5
Frankenstein (1931) 5
```

Next, I will look at the AutoRec model.

***

**[Back to Top](#top)**

***

<a name="autorec"></a>
### 8.4 - Autoencoders for Recommendations (AutoRec)

RBMs are a very early type of neural network, and the field of deep learning has evolved considerably since then. A group from the Australian National University published a paper called [AutoRec: Autoencoders Meet Collaborative Filtering](https://users.cecs.anu.edu.au/~akmenon/papers/autorec/autorec-paper.pdf), and it is the framework for the model I'll use in this section.
 
An autoencoder is just a three-layer neural network with an input layer, a hidden layer, and an output layer. Learning the weights between the input and hidden layer is called encoding, and reconstructing predictions with the weights between the hidden layer and the output layer is called decoding. Fundamentally, though, it's just a neural network with one hidden layer.
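
The encode and decode steps amount to two matrix multiplications with an activation after each. The NumPy sketch below shows one forward pass with sigmoid activations on both layers. The function and variable names (`autoencoder_forward`, `W_enc`, `W_dec`) and the toy sizes are assumptions for illustration, and no training happens here.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def autoencoder_forward(x, W_enc, b_enc, W_dec, b_dec):
    hidden = sigmoid(x @ W_enc + b_enc)        # encoding: input -> hidden
    output = sigmoid(hidden @ W_dec + b_dec)   # decoding: hidden -> reconstruction
    return hidden, output

rng = np.random.default_rng(0)
n_items, n_hidden = 8, 3
x = rng.random((2, n_items))                   # 2 users, ratings scaled into [0, 1]
W_enc = rng.normal(size=(n_items, n_hidden))
W_dec = rng.normal(size=(n_hidden, n_items))
hidden, output = autoencoder_forward(x, W_enc, np.zeros(n_hidden), W_dec, np.zeros(n_items))
print(hidden.shape, output.shape)              # (2, 3) (2, 8)
```

The hidden layer is narrower than the input, so the network has to compress each user's ratings into a few numbers before reconstructing all of them.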

In this section, I include the two algorithms Frank wrote and my final optimized AutoRec model.




<a name="autorec_py"></a>
#### 8.4.1 - AutoRec.py

Like the RBM code, the AutoRec model uses TensorFlow and is structured in much the same way. However, it fundamentally differs on **line (44)** with the `neural_net` function. The `weights` for encoding and decoding are initialized on **line (49)**, and the `biases` are set up on **line (55)**.

After that, the layers are set up. On **line (61)**, `inputLayer` receives the ratings for each item for a given user. The hidden layer is constructed on **line (64)** and the output layer on **line (67)**.

Our output layer (line 67) applies the learned decoder weights and biases to what's in the hidden layer and applies the sigmoid activation function to that result as well. This makes up the predicted ratings for every item for a given user.

The `run_optimization` function on **line (71)** trains the model and uses TensorFlow's [RMSProp](https://www.tensorflow.org/api_docs/python/tf/keras/optimizers/RMSprop) as its optimizer.
  
 



In [None]:
# -*- coding: utf-8 -*-
"""
Updated on Thu Nov 28 10:39:24 2019

@author: Frank

@modified: Saurabh
"""

import numpy as np
import tensorflow as tf

class AutoRec(object):

    def __init__(self, visibleDimensions, epochs=200, hiddenDimensions=50, learningRate=0.1, batchSize=100):

        self.visibleDimensions = visibleDimensions
        self.epochs = epochs
        self.hiddenDimensions = hiddenDimensions
        self.learningRate = learningRate
        self.batchSize = batchSize
        self.optimizer = tf.keras.optimizers.RMSprop(self.learningRate)
        
                
    def Train(self, X):
        
        for epoch in range(self.epochs):
            for i in range(0, X.shape[0], self.batchSize):
                epochX = X[i:i+self.batchSize]
                self.run_optimization(epochX)


            print("Trained epoch ", epoch)

    def GetRecommendations(self, inputUser):
        
        # Feed through a single user and return predictions from the output layer.
        rec = self.neural_net(inputUser)
        
        # It is being used as the return type is Eager Tensor.
        return rec[0]

    
    def neural_net(self, inputUser):

        #tf.set_random_seed(0)
        
        # Create variables for weights for the encoding (visible->hidden) and decoding (hidden->output) stages, randomly initialized
        self.weights = {
            'h1': tf.Variable(tf.random.normal([self.visibleDimensions, self.hiddenDimensions])),
            'out': tf.Variable(tf.random.normal([self.hiddenDimensions, self.visibleDimensions]))
            }
        
        # Create biases
        self.biases = {
            'b1': tf.Variable(tf.random.normal([self.hiddenDimensions])),
            'out': tf.Variable(tf.random.normal([self.visibleDimensions]))
            }
        
        # Create the input layer
        self.inputLayer = inputUser
        
        # hidden layer
        hidden = tf.nn.sigmoid(tf.add(tf.matmul(self.inputLayer, self.weights['h1']), self.biases['b1']))
        
        # output layer for our predictions.
        self.outputLayer = tf.nn.sigmoid(tf.add(tf.matmul(hidden, self.weights['out']), self.biases['out']))
        
        return self.outputLayer
    
    def run_optimization(self, inputUser):
        with tf.GradientTape() as g:
            pred = self.neural_net(inputUser)
            loss = tf.keras.losses.MSE(inputUser, pred)
            
        trainable_variables = list(self.weights.values()) + list(self.biases.values())
        
        gradients = g.gradient(loss, trainable_variables)
        
        self.optimizer.apply_gradients(zip(gradients, trainable_variables))


***

**[Back to Top](#top)**

***

<a name="autorec_algo"></a>
#### 8.4.2 - AutoRecAlgorithm.py

`AutoRecAlgorithm.py` is basically a wrapper around the `AutoRec.py` module that makes it easier to use and ties it into Frank's core recommendation framework.
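
Because the network's output layer is a sigmoid, its predictions fall between 0 and 1, so the wrapper divides ratings by 5 on the way in and multiplies by 5 on the way out. The sketch below shows that round trip in isolation. The helper names `to_unit_interval` and `to_stars` are hypothetical.

```python
import numpy as np

MAX_RATING = 5.0

def to_unit_interval(ratings):
    # Star ratings (0.5 to 5.0) become values in (0, 1]; unrated entries stay at 0
    return np.asarray(ratings, dtype=np.float32) / MAX_RATING

def to_stars(outputs):
    # Network outputs in (0, 1) are mapped back onto the star scale
    return np.asarray(outputs, dtype=np.float32) * MAX_RATING

stars = np.array([0.0, 0.5, 3.5, 5.0])  # 0.0 stands for "not rated"
scaled = to_unit_interval(stars)
print(scaled)
print(to_stars(scaled))
```

Unrated movies are indistinguishable from a target of 0 after this scaling, so the plain MSE loss in `AutoRec.py` treats them like any other entry.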

In [None]:
# -*- coding: utf-8 -*-
"""
Created on Fri May  4 13:08:25 2018

@author: Frank
"""

from surprise import AlgoBase
from surprise import PredictionImpossible
import numpy as np
from AutoRec import AutoRec

class AutoRecAlgorithm(AlgoBase):

    def __init__(self, epochs=100, hiddenDim=100, learningRate=0.01, batchSize=100, sim_options={}):
        AlgoBase.__init__(self)
        self.epochs = epochs
        self.hiddenDim = hiddenDim
        self.learningRate = learningRate
        self.batchSize = batchSize

    def fit(self, trainset):
        AlgoBase.fit(self, trainset)

        numUsers = trainset.n_users
        numItems = trainset.n_items
        
        trainingMatrix = np.zeros([numUsers, numItems], dtype=np.float32)
        
        for (uid, iid, rating) in trainset.all_ratings():
            trainingMatrix[int(uid), int(iid)] = rating / 5.0
        
        # Create an AutoRec network with one visible node per item
        autoRec = AutoRec(trainingMatrix.shape[1], hiddenDimensions=self.hiddenDim, learningRate=self.learningRate, batchSize=self.batchSize, epochs=self.epochs)
        autoRec.Train(trainingMatrix)

        self.predictedRatings = np.zeros([numUsers, numItems], dtype=np.float32)
        
        for uiid in range(trainset.n_users):
            if (uiid % 50 == 0):
                print("Processing user ", uiid)
            recs = autoRec.GetRecommendations([trainingMatrix[uiid]])
            
            for itemID, rec in enumerate(recs):
                self.predictedRatings[uiid, itemID] = rec * 5.0
        
        return self


    def estimate(self, u, i):

        if not (self.trainset.knows_user(u) and self.trainset.knows_item(i)):
            raise PredictionImpossible('User and/or item is unknown.')
        
        rating = self.predictedRatings[u, i]
        
        if (rating < 0.001):
            raise PredictionImpossible('No valid prediction exists.')
            
        return rating
    

***

**[Back to Top](#top)**

***

<a name="autorec_mod"></a>
#### 8.4.3 - AutoRec Model

When I first attempted to tune this model, I tried Surprise's `RandomizedSearchCV`. Unfortunately, just as with my SVD++ model, it kept producing errors. Furthermore, running the model Frank used took hours, so I felt very constrained in exploring hyperparameters. With five-fold cross-validation in `GridSearchCV`, I did not have the time to truly optimize the model. The run below took all night to finish training.

Next, I'll run a `GridSearchCV` on the AutoRec model.

Update: My first attempt timed out after about 18 hours. I'm running it again with fewer hyperparameters and only 2 cross-validation folds instead of 5.

In [None]:
# -*- coding: utf-8 -*-
"""
Created on Thu May  3 11:11:13 2018

@author: Frank
"""

from MovieLens import MovieLens
from AutoRecAlgorithm import AutoRecAlgorithm
from surprise import NormalPredictor
from Evaluator import Evaluator
from surprise.model_selection import GridSearchCV

import random
import numpy as np

def LoadMovieLensData():
    ml = MovieLens()
    print("Loading movie ratings...")
    data = ml.loadMovieLensLatestSmall()
    print("\nComputing movie popularity ranks so we can measure novelty later...")
    rankings = ml.getPopularityRanks()
    return (ml, data, rankings)

np.random.seed(29)
random.seed(29)

# Load up common data set for the recommender algorithms
(ml, evaluationData, rankings) = LoadMovieLensData()

print("Searching for best parameters...")
param_grid = {'hiddenDim': [100, 200], 'learningRate': [0.1, 0.2]}
gs_ac = GridSearchCV(AutoRecAlgorithm, param_grid, measures=['rmse', 'mae'], cv=2)

gs_ac.fit(evaluationData)

# best RMSE score
print("Best RMSE score attained: ", gs_ac.best_score['rmse'])

# combination of parameters that gave the best RMSE score
print(gs_ac.best_params['rmse'])

# Construct an Evaluator to, you know, evaluate them
evaluator = Evaluator(evaluationData, rankings)

params = gs_ac.best_params['rmse']
AutoRectuned = AutoRecAlgorithm(hiddenDim = params['hiddenDim'], learningRate = params['learningRate'])
evaluator.AddAlgorithm(AutoRectuned, "AutoRec - Tuned")

#Autoencoder
AutoRecUntuned = AutoRecAlgorithm()
evaluator.AddAlgorithm(AutoRecUntuned, "AutoRec - Untuned")

# Just make random recommendations
Random = NormalPredictor()
evaluator.AddAlgorithm(Random, "Random")

# Fight!
evaluator.Evaluate(True)

evaluator.SampleTopNRecs(ml)


Loading movie ratings...

Computing movie popularity ranks so we can measure novelty later...
Searching for best parameters...
Trained epoch  0
Trained epoch  1
Trained epoch  2
Trained epoch  3
Trained epoch  4
Trained epoch  5
Trained epoch  6
Trained epoch  7
Trained epoch  8
Trained epoch  9
Trained epoch  10
Trained epoch  11
Trained epoch  12
Trained epoch  13
Trained epoch  14
Trained epoch  15
Trained epoch  16
Trained epoch  17
Trained epoch  18
Trained epoch  19
Trained epoch  20
Trained epoch  21
Trained epoch  22
Trained epoch  23
Trained epoch  24
Trained epoch  25
Trained epoch  26
Trained epoch  27
Trained epoch  28
Trained epoch  29
Trained epoch  30
Trained epoch  31
Trained epoch  32
Trained epoch  33
Trained epoch  34
Trained epoch  35
Trained epoch  36
Trained epoch  37
Trained epoch  38
Trained epoch  39
Trained epoch  40
Trained epoch  41
Trained epoch  42
Trained epoch  43
Trained epoch  44
Trained epoch  45
Trained epoch  46
Trained epoch  47
Trained epoch  48
T