# Boltzmann Machines

<img src="img/67_blog_image_2.png" width="800" height="500">

* Use-case: Recommender Systems
* Undirected and Unsupervised models
  * Generative deep-learning
  * No output layer 
  * Each visible node is something that we measure, e.g. each component of a power plant
  * Each hidden node is something we don't measure, e.g. wind speed, humidity <img src="img/67_blog_image_3.png" width="500" height="250" >
* Energy-based models (EBM)
  * Uses the Boltzmann distribution formula <img src="img/boltzmann_dist.png" width="100" height="50" >
  * Systems tend to move towards their lowest energy state
  * [A Tutorial on Energy-Based Learning](http://yann.lecun.com/exdb/publis/pdf/lecun-06.pdf)
  * Mr. Nobody
* Restricted Boltzmann Machine (RBM) <img src="img/restricted_boltzmann_machine.png" height="500" width="800" >
  * Nodes in the same layer cannot connect to each other (no visible-visible or hidden-hidden connections)
  * Visible nodes can be different movies
  * Hidden nodes can be features such as movie genre, actor, award, director, etc.
* Contrastive Divergence
  * Allows RBMs to learn through Gibbs sampling
  * Iteratively updates the weights so that the training data ends up in low-energy (high-probability) states
  * [A Fast Learning Algorithm for Deep Belief Nets](https://www.cs.toronto.edu/~hinton/absps/fastnc.pdf)
* Deep Belief Networks (DBN)
  * Stacked RBMs
  * Greedy Layer-wise training
  * Wake-Sleep algorithm
  * Top two layers are undirected
  * Connections between the lower layers are directed and flow downwards towards the inputs
  * [Greedy Layer-Wise Training of Deep Networks](http://www.iro.umontreal.ca/~lisa/pointeurs/BengioNips2006all.pdf)
* Deep Boltzmann Machines (DBM)
  * Stacked RBMs in which all connections remain undirected
  * [Deep Boltzmann Machines](http://www.utstat.toronto.edu/~rsalakhu/dbm.pdf)
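
To make the Contrastive Divergence bullets above more concrete, here is a minimal sketch of a Bernoulli RBM trained with one Gibbs step per update (CD-1). It is an illustration only, not the implementation used later in this notebook: the class name `TinyRBM`, the learning rate, and the weight initialisation are all assumptions, and it is written in plain NumPy so it runs without a GPU or PyTorch.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class TinyRBM:
    """Hypothetical minimal Bernoulli RBM (illustrative names and defaults)."""

    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = np.random.default_rng(seed)
        # Only visible-hidden weights exist: no connections within a layer.
        self.W = self.rng.normal(0.0, 0.01, size=(n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)  # visible biases
        self.b_h = np.zeros(n_hidden)   # hidden biases

    def energy(self, v, h):
        # E(v, h) = -b_v.v - b_h.h - v^T W h, computed per row of the batch.
        return -(v @ self.b_v) - (h @ self.b_h) - np.sum((v @ self.W) * h, axis=1)

    def sample_h(self, v):
        # p(h_j = 1 | v), followed by a Bernoulli sample.
        p = sigmoid(v @ self.W + self.b_h)
        return p, (self.rng.random(p.shape) < p).astype(float)

    def sample_v(self, h):
        # p(v_i = 1 | h), followed by a Bernoulli sample.
        p = sigmoid(h @ self.W.T + self.b_v)
        return p, (self.rng.random(p.shape) < p).astype(float)

    def cd1_step(self, v0, lr=0.1):
        # Positive phase: hidden activations driven by the data.
        p_h0, h0 = self.sample_h(v0)
        # Negative phase: one Gibbs step (h -> v -> h).
        p_v1, v1 = self.sample_v(h0)
        p_h1, _ = self.sample_h(v1)
        # Update = data statistics minus reconstruction statistics.
        batch = v0.shape[0]
        self.W += lr * (v0.T @ p_h0 - v1.T @ p_h1) / batch
        self.b_v += lr * (v0 - v1).mean(axis=0)
        self.b_h += lr * (p_h0 - p_h1).mean(axis=0)
        # Mean squared reconstruction error, useful only as a rough monitor.
        return float(np.mean((v0 - p_v1) ** 2))
```

A typical use would be to call `cd1_step` repeatedly on mini-batches of binary visible vectors (for example liked / not liked per movie) and watch the returned reconstruction error as a rough progress signal.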

In [1]:
from os.path import dirname, abspath, join, curdir

import numpy as np
import pandas as pd
import torch
import torch.nn as nn
import torch.nn.parallel
import torch.optim as optim
import torch.utils.data

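# Variable is deprecated since PyTorch 0.4 (tensors track gradients directly);
# the import still works and is kept for compatibility.
from torch.autograd import Variable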

In [8]:
# Import the dataset
datapath = join(dirname(dirname(abspath(curdir))), "data", "raw", "rbm")

movies = pd.read_csv(join(datapath, "movielens-1m", "movies.dat"),
                     sep="::",
                     header=None,
                     engine="python",
                     encoding="latin-1")

users = pd.read_csv(join(datapath, "movielens-1m", "users.dat"),
                    sep="::",
                    header=None,
                    engine="python",
                    encoding="latin-1")

ratings = pd.read_csv(join(datapath, "movielens-1m", "ratings.dat"),
                      sep="::",
                      header=None,
                      engine="python",
                      encoding="latin-1")

movies.shape, users.shape, ratings.shape

((3883, 3), (6040, 5), (1000209, 4))

In [15]:
# Prepare training and test sets
train_df = pd.read_csv(join(datapath, "movielens-100k", "u1.base"),
                       sep="\t",
                       header=None)

train_set = np.array(train_df, dtype="int")

test_df = pd.read_csv(join(datapath, "movielens-100k", "u1.test"),
                      sep="\t",
                      header=None)

test_set = np.array(test_df, dtype="int")

train_set.shape, test_set.shape

((80000, 4), (20000, 4))
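
Since each visible node of the RBM stands for one movie, the four-column rows above (user id, movie id, rating, timestamp in the MovieLens 100k files) need to become one vector per user. The sketch below shows one possible way to do that. The helper names `to_user_item_matrix` and `binarize` are hypothetical, the ids are assumed to be 1-based, and the "liked" threshold of 3 is an assumption rather than something fixed by the dataset.

```python
import numpy as np


def to_user_item_matrix(rows, n_users, n_movies):
    """Build a users x movies rating matrix; 0 means "not rated".

    rows: integer array with columns (user_id, movie_id, rating, timestamp),
    with 1-based ids (assumed here).
    """
    matrix = np.zeros((n_users, n_movies), dtype=float)
    matrix[rows[:, 0] - 1, rows[:, 1] - 1] = rows[:, 2]
    return matrix


def binarize(matrix, like_threshold=3):
    """Map ratings to 1 (liked), 0 (not liked) and -1 (not rated).

    The threshold is an illustrative choice, not a property of the data.
    """
    out = np.full(matrix.shape, -1.0)
    rated = matrix > 0
    out[rated & (matrix >= like_threshold)] = 1.0
    out[rated & (matrix < like_threshold)] = 0.0
    return out


# Tiny synthetic example: 2 users, 3 movies.
rows = np.array([[1, 1, 5, 0],
                 [1, 3, 2, 0],
                 [2, 2, 3, 0]])
ratings_matrix = to_user_item_matrix(rows, n_users=2, n_movies=3)
binary_matrix = binarize(ratings_matrix)
```

With the real data, `n_users` and `n_movies` would be taken as the largest user and movie ids seen across `train_set` and `test_set`, so that both matrices share the same shape.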