<a href="https://colab.research.google.com/github/mayukh776/Projects/blob/master/Movie_Recommendations.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Boltzmann Machines

## Notebook Details
- __notebook name__: `Movie_Recommendations`
- __notebook version/date__: `1.0.0`/`17-04-20`
- __notebook server__: Google Colab
- __python version__: `3.6`
- __pytorch version__: `1.1.0`

In [None]:
import numpy as np
import pandas as pd
import torch
import torch.nn as nn
import torch.nn.parallel
import torch.utils.data
import torch.optim as optim
from torch.autograd import Variable

The following dataset lists 3952 movies along with their genres, which are instrumental in determining a user's preference for a particular movie. Viewers are more likely to enjoy a movie whose genre is similar to what they usually watch. For instance, a viewer who enjoyed The Godfather may well like drama movies in general, so he/she is more likely to enjoy another drama such as Goodfellas than an adventure film like Jumanji. Besides genre, other factors may help predict whether a person will enjoy a certain movie, such as:
- whether the film won an Oscar
- who the director is
- who the lead actor is


In [None]:
# Importing the datasets
movies_headers = ['MOVIE_ID','TITLE','GENRE']
movies = pd.read_csv('movies.dat',sep ='::',header=None,engine='python',encoding='latin-1').values
movies = pd.DataFrame(movies,columns=movies_headers)
movies = movies.set_index(['MOVIE_ID'])
print(movies.head(10))

                                        TITLE                         GENRE
SERIAL_NO                                                                  
1                            Toy Story (1995)   Animation|Children's|Comedy
2                              Jumanji (1995)  Adventure|Children's|Fantasy
3                     Grumpier Old Men (1995)                Comedy|Romance
4                    Waiting to Exhale (1995)                  Comedy|Drama
5          Father of the Bride Part II (1995)                        Comedy
6                                 Heat (1995)         Action|Crime|Thriller
7                              Sabrina (1995)                Comedy|Romance
8                         Tom and Huck (1995)          Adventure|Children's
9                         Sudden Death (1995)                        Action
10                           GoldenEye (1995)     Action|Adventure|Thriller


Listed below are all the users who provide ratings, along with their details. These details are of little use here beyond uniquely identifying each user.

In [None]:
user_headers = ['USER_NO','GENDER','AGE','OCCUPATION_CODE','VISIT_CODE']
users = pd.read_csv('users.dat',sep ='::',header = None,engine='python',encoding='latin-1').values
users = pd.DataFrame(users,columns=user_headers)
users = users.set_index(['USER_NO'])
print(users.head(10))   

          GENDER AGE OCCUPATION_CODE VISIT_CODE
SERIAL_NO                                      
1              F   1              10      48067
2              M  56              16      70072
3              M  25              15      55117
4              M  45               7      02460
5              M  25              20      55455
6              F  50               9      55117
7              M  35               1      06810
8              M  25              12      11413
9              M  25              17      61614
10             F  35               1      95370


Listed below are the ratings each user gave to every movie he/she has watched. Ratings range from 1 to 5. Rating data of this form (the `u*.base` and `u*.test` files loaded further down) is what we use to train our Boltzmann machine.

In [None]:
ratings_headers = ['USER_NO','MOVIE_ID','RATINGS','TIME_STAMPS']
ratings = pd.read_csv('ratings.dat',sep ='::',header = None,engine='python',encoding='latin-1').values
ratings = pd.DataFrame(ratings,columns=ratings_headers)
ratings = ratings.set_index(['USER_NO'])
print(ratings.head(10))

         MOVIE_ID  RATINGS  TIME_STAMPS
USER_NO                                
1            1193        5    978300760
1             661        3    978302109
1             914        3    978301968
1            3408        4    978300275
1            2355        5    978824291
1            1197        3    978302268
1            1287        5    978302039
1            2804        5    978300719
1             594        4    978302268
1             919        4    978301368


# Boltzmann Machines - Principle and Working
A Boltzmann machine is a type of stochastic recurrent neural network.
Although learning is impractical in general Boltzmann machines, it can be made quite efficient in a restricted Boltzmann machine (RBM), which does not allow intralayer connections: there are no visible-to-visible or hidden-to-hidden links. After training one RBM, the activities of its hidden units can be treated as data for training a higher-level RBM. This method of stacking RBMs makes it possible to train many layers of hidden units efficiently and is one of the most common deep learning strategies. As each new layer is added, the generative model improves.
An extension to the restricted Boltzmann machine allows using real-valued data rather than binary data.
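
For reference, the standard formulation of a binary RBM can be written as follows. The mapping to this notebook's code is an assumption made for readability: $W$ stands for the `nh × nv` weight matrix `W`, $a$ for the hidden bias `A`, $b$ for the visible bias `B`, and $\sigma$ for the logistic sigmoid.

$$
E(v, h) = -\, b^{\top} v \;-\; a^{\top} h \;-\; h^{\top} W v
$$

$$
P(h_j = 1 \mid v) = \sigma\Big(a_j + \sum_i W_{ji}\, v_i\Big),
\qquad
P(v_i = 1 \mid h) = \sigma\Big(b_i + \sum_j W_{ji}\, h_j\Big)
$$

Because there are no intralayer connections, each conditional factorizes over units, which is what makes sampling a whole layer in one matrix product possible.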

# TEST SET I

In [None]:
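# Preparing the training sets and test sets
# Note: without header=None, read_csv treats the first row of each file as
# column names, so that first rating is not part of the data.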
training_set = pd.read_csv('u1.base',delimiter='\t')
training_set = np.array(training_set,dtype='int')
test_set = pd.read_csv('u1.test',delimiter='\t')
test_set = np.array(test_set,dtype='int')
print(training_set)

[[        1         2         3 876893171]
 [        1         3         4 878542960]
 [        1         4         3 876893119]
 ...
 [      943      1188         3 888640250]
 [      943      1228         3 888640275]
 [      943      1330         3 888692465]]


In [None]:
no_users = int(max(max(training_set[:,0]), max(test_set[:,0])))
no_movies = int(max(max(training_set[:,1]), max(test_set[:,1])))

In [None]:
# Converting the data into a matrix with users in rows and movies in columns
def convert(data):
  new_data = []
  for id_users in range(1,no_users + 1):
    id_movies = data[:,1][data[:,0] == id_users]
    id_ratings = data[:,2][data[:,0] == id_users]
    ratings = np.zeros(no_movies)
    ratings[id_movies-1] = id_ratings
    new_data.append(list(ratings))
  return new_data
training_set_list = convert(training_set)
test_set_list = convert(test_set)
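
As a rough illustration of what `convert` produces, here is a minimal NumPy sketch that builds the same kind of user-by-movie matrix on toy data. The function name `to_user_movie_matrix` and the toy rows are hypothetical, and the sketch assumes each (user, movie) pair appears at most once.

```python
import numpy as np

def to_user_movie_matrix(data, n_users, n_movies):
    """Build a dense (n_users, n_movies) rating matrix; 0 means 'not rated'."""
    matrix = np.zeros((n_users, n_movies))
    # IDs in the rating files start at 1, so shift them to 0-based indices.
    matrix[data[:, 0] - 1, data[:, 1] - 1] = data[:, 2]
    return matrix

# Toy rows in the same column order as the rating files:
# user, movie, rating, timestamp (timestamps are ignored).
toy = np.array([
    [1, 1, 5, 0],
    [1, 3, 2, 0],
    [2, 2, 4, 0],
])
toy_matrix = to_user_movie_matrix(toy, n_users=2, n_movies=3)
print(toy_matrix)
```

Each row is one user and each column one movie, so user 1 ends up with ratings 5 and 2 for movies 1 and 3, and a 0 for the movie that was never rated.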

In [None]:
# Torch Tensors
# Input for a Tensor would be a list of lists 
training_set_list = torch.FloatTensor(training_set_list)
test_set_list = torch.FloatTensor(test_set_list)

In [None]:
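# Encode ratings as binary "liked" flags; the == 0 lines must run first
# Movies the user has not rated become -1
training_set_list[training_set_list == 0] = -1
# Ratings of 1 or 2 become 0 (not liked)
training_set_list[training_set_list == 1] = 0
training_set_list[training_set_list == 2] = 0
# Ratings of 3 or more become 1 (liked)
training_set_list[training_set_list >= 3] = 1
# Same encoding for the test set
test_set_list[test_set_list == 0] = -1
test_set_list[test_set_list == 1] = 0
test_set_list[test_set_list == 2] = 0
test_set_list[test_set_list >= 3] = 1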
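
The encoding above can also be expressed in a single pass, which may make the mapping easier to check. This is a small NumPy sketch rather than the notebook's torch code, and the helper name `binarize` is hypothetical.

```python
import numpy as np

def binarize(ratings):
    """Map 0 -> -1 (not rated), 1-2 -> 0 (not liked), 3-5 -> 1 (liked)."""
    ratings = np.asarray(ratings)
    return np.where(ratings == 0, -1, np.where(ratings >= 3, 1, 0))

encoded = binarize([0, 1, 2, 3, 4, 5])
print(encoded)
```

Computing all three cases from the original values avoids the order dependence of the in-place version, where the `== 0` assignment has to run before ratings 1 and 2 are themselves turned into 0.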

In [None]:
# Create the architecture of the RBM
class RBM():
  def __init__(self,nv,nh):
    # Weights connecting the hidden and visible nodes, shape (nh, nv)
    self.W = torch.randn(nh,nv)
    # Bias of the hidden nodes, used in P(h|v); the leading 1 is the batch dimension
    self.A = torch.randn(1,nh)
    # Bias of the visible nodes, used in P(v|h)
    self.B = torch.randn(1,nv)
  # Sampling the hidden nodes according to the probabilities P(h|v)
  def sample_h(self,x):
    # Product of x and the transpose of W, shape (batch, nh)
    wx = torch.mm(x,self.W.t())
    # Expand the bias so it is applied to each row of the mini-batch wx
    activation = wx + self.A.expand_as(wx)
    p_h_given_v = torch.sigmoid(activation)
    # Returns the probabilities and Bernoulli samples drawn from them
    return p_h_given_v,torch.bernoulli(p_h_given_v)
  # Sampling the visible nodes according to the probabilities P(v|h)
  def sample_v(self,y):
    wy = torch.mm(y,self.W) # Product of y and W, shape (batch, nv)
    activation = wy + self.B.expand_as(wy) # Expand the bias to the shape of wy
    p_v_given_h = torch.sigmoid(activation)
    return p_v_given_h,torch.bernoulli(p_v_given_h)
  # Contrastive Divergence update of the weights and biases
  def train(self,v0,vk,ph0,phk):
    self.W = self.W + (torch.mm(v0.t(),ph0) - torch.mm(vk.t(),phk)).t()
    self.B = self.B + torch.sum((v0 - vk),0)
    self.A = self.A + torch.sum((ph0 - phk),0)
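
To make the shapes in the contrastive divergence update easier to follow, here is a minimal sketch of a single Gibbs step (CD-1) in NumPy. It is not the notebook's torch code: the toy sizes, the seed, and the use of one step are assumptions chosen only for illustration, while the three update expressions mirror those in `RBM.train`.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

nv, nh, batch = 6, 3, 4            # toy sizes, for illustration only
W = rng.standard_normal((nh, nv))  # weights, shape (nh, nv)
A = rng.standard_normal((1, nh))   # hidden bias
B = rng.standard_normal((1, nv))   # visible bias

# A random binary mini-batch standing in for v0.
v0 = rng.integers(0, 2, size=(batch, nv)).astype(float)

# Positive phase: hidden probabilities given the data.
ph0 = sigmoid(v0 @ W.T + A)

# One Gibbs step: sample hidden units, then visible units.
h0 = (rng.random(ph0.shape) < ph0).astype(float)
pv1 = sigmoid(h0 @ W + B)
v1 = (rng.random(pv1.shape) < pv1).astype(float)

# Negative phase: hidden probabilities given the reconstruction.
ph1 = sigmoid(v1 @ W.T + A)

# Same update expressions as RBM.train, with vk = v1 and phk = ph1.
dW = (v0.T @ ph0 - v1.T @ ph1).T
dB = (v0 - v1).sum(axis=0)
dA = (ph0 - ph1).sum(axis=0)
print(dW.shape, dB.shape, dA.shape)
```

The update for `W` has the same `(nh, nv)` shape as `W` itself, and the bias updates have one entry per visible or hidden unit, so each can be added directly to the corresponding parameter.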

In [None]:
# no of movies
nv = len(training_set_list[0])
# no of features we want to detect
nh = 100
batch_size = 100
# Creating the object
rbm = RBM(nv,nh)
# Training the RBM
no_epoch = 10
for epoch in range(1,no_epoch + 1):
    train_loss = 0 # diff b/w pred and actual ratings
    s = 0.0
    for id_users in range(0,no_users - batch_size,batch_size):
        vk = training_set_list[id_users:id_users + batch_size]
        v0 = training_set_list[id_users:id_users + batch_size]  # Actual ratings
        ph0,_ = rbm.sample_h(v0)
        # K-step Contrastive Divergence
        for k in range(10):
            _,hk = rbm.sample_h(vk)
            _,vk = rbm.sample_v(hk)
            vk[v0<0] = v0[v0<0]
        phk,_ = rbm.sample_h(vk)
        rbm.train(v0,vk,ph0,phk)
        train_loss += torch.mean(torch.abs(v0[v0>=0]-vk[v0>=0]))
        s+=1
    print('epoch: ' + str(epoch) + ' loss: ' + str(train_loss/s))

epoch: 1 loss: tensor(0.3327)
epoch: 2 loss: tensor(0.2470)
epoch: 3 loss: tensor(0.2527)
epoch: 4 loss: tensor(0.2485)
epoch: 5 loss: tensor(0.2507)
epoch: 6 loss: tensor(0.2460)
epoch: 7 loss: tensor(0.2463)
epoch: 8 loss: tensor(0.2479)
epoch: 9 loss: tensor(0.2447)
epoch: 10 loss: tensor(0.2488)
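
One detail of the training loop worth checking is which users the mini-batches actually cover. The sketch below reproduces only the `range` call from the loop, assuming `no_users = 943` as the largest user ID in the printed training set suggests.

```python
no_users = 943      # assumed from the largest user ID printed above
batch_size = 100

# Same range expression as the training loop.
batch_starts = list(range(0, no_users - batch_size, batch_size))
covered = batch_starts[-1] + batch_size

print(batch_starts)
print(no_users - covered)   # users that never appear in a training batch
```

With these values the last batch starts at index 800, so the final 43 users are not visited during training. Whether that matters is a design choice; iterating with `range(0, no_users, batch_size)` would include a smaller last batch instead.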


In [None]:
# diff b/w pred and actual ratings
test_loss = 0 
s = 0.0
for id_users in range(no_users):
    v = training_set_list[id_users:id_users + 1]
    vt = test_set_list[id_users:id_users + 1]  # Actual ratings
    if len(vt[vt>=0])>0:
        _,h = rbm.sample_h(v)
        _,v = rbm.sample_v(h)
        test_loss += torch.mean(torch.abs(vt[vt>=0]-v[vt>=0]))
        s+=1
print('test loss:' + str(test_loss/s))

test loss:tensor(0.2464)
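
Because both the actual and the sampled ratings are binary, the mean absolute error over rated entries is simply the fraction of rated movies predicted incorrectly. The pure-Python sketch below shows this on made-up values; `masked_mae` and the two lists are hypothetical.

```python
def masked_mae(actual, predicted):
    """Mean absolute error over entries where `actual` is rated (>= 0)."""
    pairs = [(a, p) for a, p in zip(actual, predicted) if a >= 0]
    return sum(abs(a - p) for a, p in pairs) / len(pairs)

actual    = [1, 0, -1, 1, -1, 0]   # -1 marks 'not rated'
predicted = [1, 1,  0, 1,  1, 0]   # hypothetical binary predictions
loss = masked_mae(actual, predicted)
print(loss)        # 0.25: one of the four rated entries is wrong
print(1 - loss)    # 0.75: fraction of rated entries predicted correctly
```

Read this way, a test loss of 0.2464 means the sampled predictions agree with roughly 75% of the held-out ratings.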


# TEST SET II

In [None]:
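# Preparing the training sets and test sets
# Note: without header=None, read_csv treats the first row of each file as
# column names, so that first rating is not part of the data.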
training_set = pd.read_csv('u2.base',delimiter='\t')
training_set = np.array(training_set,dtype='int')
test_set = pd.read_csv('u2.test',delimiter='\t')
test_set = np.array(test_set,dtype='int')
print(training_set)

[[        1         4         3 876893119]
 [        1         5         3 889751712]
 [        1         6         5 887431973]
 ...
 [      943      1188         3 888640250]
 [      943      1228         3 888640275]
 [      943      1330         3 888692465]]


In [None]:
no_users = int(max(max(training_set[:,0]), max(test_set[:,0])))
no_movies = int(max(max(training_set[:,1]), max(test_set[:,1])))

In [None]:
# Converting the data into a matrix with users in rows and movies in columns
def convert(data):
  new_data = []
  for id_users in range(1,no_users + 1):
    id_movies = data[:,1][data[:,0] == id_users]
    id_ratings = data[:,2][data[:,0] == id_users]
    ratings = np.zeros(no_movies)
    ratings[id_movies-1] = id_ratings
    new_data.append(list(ratings))
  return new_data
training_set_list = convert(training_set)
test_set_list = convert(test_set)

In [None]:
# Torch Tensors
# Input for a Tensor would be a list of lists 
training_set_list = torch.FloatTensor(training_set_list)
test_set_list = torch.FloatTensor(test_set_list)

In [None]:
# Encode ratings: not rated -> -1, ratings 1-2 -> 0, ratings 3-5 -> 1
training_set_list[training_set_list == 0] = -1
training_set_list[training_set_list == 1] = 0
training_set_list[training_set_list == 2] = 0
training_set_list[training_set_list >= 3] = 1
# Same encoding for the test set
test_set_list[test_set_list == 0] = -1
test_set_list[test_set_list == 1] = 0
test_set_list[test_set_list == 2] = 0
test_set_list[test_set_list >= 3] = 1

In [None]:
# Create the architecture of the RBM
class RBM():
  def __init__(self,nv,nh):
    # Weights connecting the hidden and visible nodes, shape (nh, nv)
    self.W = torch.randn(nh,nv)
    # Bias of the hidden nodes, used in P(h|v); the leading 1 is the batch dimension
    self.A = torch.randn(1,nh)
    # Bias of the visible nodes, used in P(v|h)
    self.B = torch.randn(1,nv)
  # Sampling the hidden nodes according to the probabilities P(h|v)
  def sample_h(self,x):
    # Product of x and the transpose of W, shape (batch, nh)
    wx = torch.mm(x,self.W.t())
    # Expand the bias so it is applied to each row of the mini-batch wx
    activation = wx + self.A.expand_as(wx)
    p_h_given_v = torch.sigmoid(activation)
    # Returns the probabilities and Bernoulli samples drawn from them
    return p_h_given_v,torch.bernoulli(p_h_given_v)
  # Sampling the visible nodes according to the probabilities P(v|h)
  def sample_v(self,y):
    wy = torch.mm(y,self.W) # Product of y and W, shape (batch, nv)
    activation = wy + self.B.expand_as(wy) # Expand the bias to the shape of wy
    p_v_given_h = torch.sigmoid(activation)
    return p_v_given_h,torch.bernoulli(p_v_given_h)
  # Contrastive Divergence update of the weights and biases
  def train(self,v0,vk,ph0,phk):
    self.W = self.W + (torch.mm(v0.t(),ph0) - torch.mm(vk.t(),phk)).t()
    self.B = self.B + torch.sum((v0 - vk),0)
    self.A = self.A + torch.sum((ph0 - phk),0)

In [None]:
# no of movies
nv = len(training_set_list[0])
# no of features we want to detect
nh = 100
batch_size = 100
# Creating the object
rbm = RBM(nv,nh)
# Training the RBM
no_epoch = 10
for epoch in range(1,no_epoch + 1):
    train_loss = 0 # diff b/w pred and actual ratings
    s = 0.0
    for id_users in range(0,no_users - batch_size,batch_size):
        vk = training_set_list[id_users:id_users + batch_size]
        v0 = training_set_list[id_users:id_users + batch_size]  # Actual ratings
        ph0,_ = rbm.sample_h(v0)
        # K-step Contrastive Divergence
        for k in range(10):
            _,hk = rbm.sample_h(vk)
            _,vk = rbm.sample_v(hk)
            vk[v0<0] = v0[v0<0]
        phk,_ = rbm.sample_h(vk)
        rbm.train(v0,vk,ph0,phk)
        train_loss += torch.mean(torch.abs(v0[v0>=0]-vk[v0>=0]))
        s+=1
    print('epoch: ' + str(epoch) + ' loss: ' + str(train_loss/s))

epoch: 1 loss: tensor(0.3170)
epoch: 2 loss: tensor(0.2568)
epoch: 3 loss: tensor(0.2561)
epoch: 4 loss: tensor(0.2523)
epoch: 5 loss: tensor(0.2492)
epoch: 6 loss: tensor(0.2456)
epoch: 7 loss: tensor(0.2487)
epoch: 8 loss: tensor(0.2493)
epoch: 9 loss: tensor(0.2528)
epoch: 10 loss: tensor(0.2482)


In [None]:
# diff b/w pred and actual ratings
test_loss = 0 
s = 0.0
for id_users in range(no_users):
    v = training_set_list[id_users:id_users + 1]
    vt = test_set_list[id_users:id_users + 1]  # Actual ratings
    if len(vt[vt>=0])>0:
        _,h = rbm.sample_h(v)
        _,v = rbm.sample_v(h)
        test_loss += torch.mean(torch.abs(vt[vt>=0]-v[vt>=0]))
        s+=1
print('test loss:' + str(test_loss/s))

test loss:tensor(0.2477)


# TEST SET III

In [None]:
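# Preparing the training sets and test sets
# Note: without header=None, read_csv treats the first row of each file as
# column names, so that first rating is not part of the data.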
training_set = pd.read_csv('u3.base',delimiter='\t')
training_set = np.array(training_set,dtype='int')
test_set = pd.read_csv('u3.test',delimiter='\t')
test_set = np.array(test_set,dtype='int')
print(training_set)

[[        1         2         3 876893171]
 [        1         3         4 878542960]
 [        1         4         3 876893119]
 ...
 [      943      1188         3 888640250]
 [      943      1228         3 888640275]
 [      943      1330         3 888692465]]


In [None]:
no_users = int(max(max(training_set[:,0]), max(test_set[:,0])))
no_movies = int(max(max(training_set[:,1]), max(test_set[:,1])))

In [None]:
# Converting the data into a matrix with users in rows and movies in columns
def convert(data):
  new_data = []
  for id_users in range(1,no_users + 1):
    id_movies = data[:,1][data[:,0] == id_users]
    id_ratings = data[:,2][data[:,0] == id_users]
    ratings = np.zeros(no_movies)
    ratings[id_movies-1] = id_ratings
    new_data.append(list(ratings))
  return new_data
training_set_list = convert(training_set)
test_set_list = convert(test_set)

In [None]:
# Torch Tensors
# Input for a Tensor would be a list of lists 
training_set_list = torch.FloatTensor(training_set_list)
test_set_list = torch.FloatTensor(test_set_list)

In [None]:
# Encode ratings: not rated -> -1, ratings 1-2 -> 0, ratings 3-5 -> 1
training_set_list[training_set_list == 0] = -1
training_set_list[training_set_list == 1] = 0
training_set_list[training_set_list == 2] = 0
training_set_list[training_set_list >= 3] = 1
# Same encoding for the test set
test_set_list[test_set_list == 0] = -1
test_set_list[test_set_list == 1] = 0
test_set_list[test_set_list == 2] = 0
test_set_list[test_set_list >= 3] = 1

In [None]:
# Create the architecture of the RBM
class RBM():
  def __init__(self,nv,nh):
    # Weights connecting the hidden and visible nodes, shape (nh, nv)
    self.W = torch.randn(nh,nv)
    # Bias of the hidden nodes, used in P(h|v); the leading 1 is the batch dimension
    self.A = torch.randn(1,nh)
    # Bias of the visible nodes, used in P(v|h)
    self.B = torch.randn(1,nv)
  # Sampling the hidden nodes according to the probabilities P(h|v)
  def sample_h(self,x):
    # Product of x and the transpose of W, shape (batch, nh)
    wx = torch.mm(x,self.W.t())
    # Expand the bias so it is applied to each row of the mini-batch wx
    activation = wx + self.A.expand_as(wx)
    p_h_given_v = torch.sigmoid(activation)
    # Returns the probabilities and Bernoulli samples drawn from them
    return p_h_given_v,torch.bernoulli(p_h_given_v)
  # Sampling the visible nodes according to the probabilities P(v|h)
  def sample_v(self,y):
    wy = torch.mm(y,self.W) # Product of y and W, shape (batch, nv)
    activation = wy + self.B.expand_as(wy) # Expand the bias to the shape of wy
    p_v_given_h = torch.sigmoid(activation)
    return p_v_given_h,torch.bernoulli(p_v_given_h)
  # Contrastive Divergence update of the weights and biases
  def train(self,v0,vk,ph0,phk):
    self.W = self.W + (torch.mm(v0.t(),ph0) - torch.mm(vk.t(),phk)).t()
    self.B = self.B + torch.sum((v0 - vk),0)
    self.A = self.A + torch.sum((ph0 - phk),0)

In [None]:
# no of movies
nv = len(training_set_list[0])
# no of features we want to detect
nh = 100
batch_size = 100
# Creating the object
rbm = RBM(nv,nh)
# Training the RBM
no_epoch = 10
for epoch in range(1,no_epoch + 1):
    train_loss = 0 # diff b/w pred and actual ratings
    s = 0.0
    for id_users in range(0,no_users - batch_size,batch_size):
        vk = training_set_list[id_users:id_users + batch_size]
        v0 = training_set_list[id_users:id_users + batch_size]  # Actual ratings
        ph0,_ = rbm.sample_h(v0)
        # K-step Contrastive Divergence
        for k in range(10):
            _,hk = rbm.sample_h(vk)
            _,vk = rbm.sample_v(hk)
            vk[v0<0] = v0[v0<0]
        phk,_ = rbm.sample_h(vk)
        rbm.train(v0,vk,ph0,phk)
        train_loss += torch.mean(torch.abs(v0[v0>=0]-vk[v0>=0]))
        s+=1
    print('epoch: ' + str(epoch) + ' loss: ' + str(train_loss/s))

epoch: 1 loss: tensor(0.3169)
epoch: 2 loss: tensor(0.2503)
epoch: 3 loss: tensor(0.2496)
epoch: 4 loss: tensor(0.2510)
epoch: 5 loss: tensor(0.2534)
epoch: 6 loss: tensor(0.2499)
epoch: 7 loss: tensor(0.2509)
epoch: 8 loss: tensor(0.2507)
epoch: 9 loss: tensor(0.2522)
epoch: 10 loss: tensor(0.2500)


In [None]:
# diff b/w pred and actual ratings
test_loss = 0 
s = 0.0
for id_users in range(no_users):
    v = training_set_list[id_users:id_users + 1]
    vt = test_set_list[id_users:id_users + 1]  # Actual ratings
    if len(vt[vt>=0])>0:
        _,h = rbm.sample_h(v)
        _,v = rbm.sample_v(h)
        test_loss += torch.mean(torch.abs(vt[vt>=0]-v[vt>=0]))
        s+=1
print('test loss:' + str(test_loss/s))

test loss:tensor(0.2528)
