Commit
Jan Melchior authored and Jan Melchior committed Apr 24, 2017
1 parent af7dd26, commit ba3cf92
Showing 9 changed files with 252 additions and 6 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,94 @@ | ||
Big centered binary RBM on MNIST | ||
========================================================== | ||
|
||
Example for training a centered binary restricted Boltzmann machine on the MNIST handwritten digit dataset. | ||
The model has 500 hidden units and is trained for 200 epochs; the log-likelihood is evaluated using Annealed Importance Sampling. | ||
The example allows you to reproduce the results from the publication `How to Center Deep Boltzmann Machines. Melchior, J., Fischer, A., & Wiskott, L.. (2016). Journal of Machine Learning Research, 17(99), 1–61. <http://jmlr.org/papers/v17/14-237.html>`_. | ||
Running the code as is reproduces a single trial of the plot in Figure 9 (PCD-1) for $dd^b_s$. | ||
|
||
See also `RBM_MNIST_small <RBM_MNIST_small.html#RBM_MNIST_small>`__. | ||
|
||
Theory | ||
*********** | ||
|
||
For an analysis of the advantages of centering in RBMs, see `How to Center Deep Boltzmann Machines. Melchior, J., Fischer, A., & Wiskott, L.. (2016). Journal of Machine Learning Research, 17(99), 1–61. <http://jmlr.org/papers/v17/14-237.html>`_. | ||
|
||
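As a rough illustration of what centering involves, the following NumPy sketch implements the offset update described in the paper: the offsets follow a moving average of the batch means, and the biases are shifted so that the modeled distribution stays the same. The function names, argument names, and array shapes are assumptions made for this example; they are not PyDeep's API.

```python
import numpy as np


def energy(v, h, W, b, c, mu, lam):
    """Energy of a centered binary RBM for single state vectors v and h."""
    return -(v - mu) @ b - (h - lam) @ c - (v - mu) @ W @ (h - lam)


def update_offsets_step(W, b, c, mu, lam, v_mean, h_mean, nu):
    """Move the offsets toward the batch means with sliding factor nu.

    The biases are shifted first so that the energy changes only by a
    constant, which leaves the probability distribution unchanged.
    """
    b_new = b + nu * W @ (h_mean - lam)
    c_new = c + nu * W.T @ (v_mean - mu)
    mu_new = (1.0 - nu) * mu + nu * v_mean
    lam_new = (1.0 - nu) * lam + nu * h_mean
    return b_new, c_new, mu_new, lam_new
```

With `nu = 0.0` the offsets and biases are returned unchanged, so offsets that start at zero stay at zero.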
If you are new to RBMs, have a look at my `master's thesis <https://www.ini.rub.de/PEOPLE/wiskott/Reprints/Melchior-2012-MasterThesis-RBMs.pdf>`_. | ||
|
||
A good theoretical introduction is also given by the `Course Material RBMs <https://www.ini.rub.de/PEOPLE/wiskott/Teaching/Material/index.html>`_ and in the following videos. | ||
|
||
.. raw:: html | ||
|
||
<div style="margin-top:10px;"> | ||
<iframe width="560" height="315" src="http://www.youtube.com/embed/bMaITeXhOaE" frameborder="0" allowfullscreen></iframe> | ||
</div> | ||
|
||
and | ||
|
||
.. raw:: html | ||
|
||
<div style="margin-top:10px;"> | ||
<iframe width="560" height="315" src="http://www.youtube.com/embed/nyk5XUklb5M" frameborder="0" allowfullscreen></iframe> | ||
</div> | ||
|
||
Results | ||
*********** | ||
|
||
The code_ given below produces the following output. | ||
|
||
Learned filters of a centered binary RBM with 500 hidden units on the MNIST dataset. | ||
The filters have been normalized such that the structure is more prominent. | ||
|
||
.. figure:: images/BRBM_big_centered_weights.png | ||
:scale: 75 % | ||
:alt: weights centered | ||
|
||
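The exact normalization used for the figure is not stated here. One common choice, shown below purely as an assumption, is to rescale every filter (one column of the weight matrix per hidden unit) to the range [0, 1] independently, so that filters with small weights remain visible. The function name `normalize_filters` is hypothetical.

```python
import numpy as np


def normalize_filters(W):
    """Rescale each column of W (one filter per hidden unit) to [0, 1].

    This is an assumed min-max normalization for display purposes only.
    """
    w_min = W.min(axis=0, keepdims=True)
    w_max = W.max(axis=0, keepdims=True)
    # Avoid division by zero for constant filters.
    span = np.where(w_max > w_min, w_max - w_min, 1.0)
    return (W - w_min) / span
```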
Sampling results for some examples. The first row shows training data and the following rows are the results after one Gibbs-sampling step starting from the previous row. | ||
|
||
.. figure:: images/BRBM_big_centered_samples.png | ||
:scale: 75 % | ||
:alt: samples centered | ||
|
||
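A single Gibbs-sampling step of the kind shown in each row can be sketched in NumPy as follows. This is a minimal illustration for a centered binary RBM, not PyDeep's implementation: the function `gibbs_step`, its argument names, and the convention that `W` has shape (visibles, hiddens) are assumptions for this example.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def gibbs_step(v, W, b, c, mu, lam, rng):
    """One v -> h -> v Gibbs step for a centered binary RBM.

    v has shape (n_samples, n_visible); mu and lam are the visible and
    hidden offsets (all zeros gives the non-centered model).
    """
    # Sample hidden states given the visible states.
    p_h = sigmoid((v - mu) @ W + c)
    h = (rng.random(p_h.shape) < p_h).astype(v.dtype)
    # Sample new visible states given the hidden states.
    p_v = sigmoid((h - lam) @ W.T + b)
    v_new = (rng.random(p_v.shape) < p_v).astype(v.dtype)
    return v_new, p_v
```

Applying `gibbs_step` repeatedly, each time to the previous output, gives a chain of samples comparable to the successive rows of the figure.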
The log-likelihood is estimated using annealed importance sampling (optimistic) and reverse annealed importance sampling (pessimistic). | ||
|
||
.. code-block:: Python | ||
Training time: 0:49:51.186054 | ||
AIS Partition: 951.21017149 (LL: -76.0479396244) | ||
reverse AIS Partition: 954.687597369 (LL: -79.525365503) | ||
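
The two lines above are consistent with the usual relation LL = (mean unnormalized log-probability of the data) − log Z, with the printed "Partition" value read as log Z. The short check below uses only the numbers printed above; the interpretation of the printed value as log Z is an inference from those numbers, not something stated in the output itself.

```python
# Values copied from the output above.
log_z_ais = 951.21017149
ll_ais = -76.0479396244
log_z_rais = 954.687597369
ll_rais = -79.525365503

# If LL = mean_unnormalized_log_prob - log Z, then both estimates must
# imply the same mean unnormalized log-probability of the training data.
mean_unnorm_ais = ll_ais + log_z_ais
mean_unnorm_rais = ll_rais + log_z_rais

# The gap between the optimistic and pessimistic log-likelihood equals
# the gap between the two partition function estimates.
ll_gap = ll_ais - ll_rais
log_z_gap = log_z_rais - log_z_ais

print(mean_unnorm_ais, mean_unnorm_rais)
print(ll_gap, log_z_gap)
```

Both estimates imply the same data term, so the difference of roughly 3.5 nats between the two log-likelihood values comes entirely from the two partition function estimates.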
The code can also be executed without centering by setting | ||
|
||
.. code-block:: python | ||
update_offsets = 0.0 | ||
This results in the following weights and sampling steps.
The filters have been normalized such that the structure is more prominent. | ||
|
||
.. figure:: images/BRBM_big_normal_weights.png | ||
:scale: 75 % | ||
:alt: weights normal | ||
|
||
Sampling results for some examples. The first row shows training data and the following rows are the results after one Gibbs-sampling step starting from the previous row. | ||
|
||
.. figure:: images/BRBM_big_normal_samples.png | ||
:scale: 75 % | ||
:alt: samples normal | ||
|
||
The Log-Likelihood for this model is significantly worse (8 nats lower). | ||
|
||
.. code-block:: Python | ||
Training time: 0:49:51.186054 | ||
AIS Partition: 951.21017149 (LL: -76.0479396244) | ||
reverse AIS Partition: 954.687597369 (LL: -79.525365503) | ||
Source code | ||
*********** | ||
|
||
.. figure:: images/download_icon.png | ||
:scale: 20 % | ||
:target: https://github.com/MelJan/PyDeep/blob/master/examples/RBM_MNIST_big.py | ||
|
||
.. literalinclude:: ../../examples/RBM_MNIST_big.py |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,136 @@ | ||
''' Example using a big BB-RBM on the MNIST handwritten digit database. | ||
:Version: | ||
1.1.0 | ||
:Date: | ||
24.04.2017 | ||
:Author: | ||
Jan Melchior | ||
:Contact: | ||
JanMelchior@gmx.de | ||
:License: | ||
Copyright (C) 2017 Jan Melchior | ||
This file is part of the Python library PyDeep. | ||
PyDeep is free software: you can redistribute it and/or modify | ||
it under the terms of the GNU General Public License as published by | ||
the Free Software Foundation, either version 3 of the License, or | ||
(at your option) any later version. | ||
This program is distributed in the hope that it will be useful, | ||
but WITHOUT ANY WARRANTY; without even the implied warranty of | ||
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the | ||
GNU General Public License for more details. | ||
You should have received a copy of the GNU General Public License | ||
along with this program. If not, see <http://www.gnu.org/licenses/>. | ||
''' | ||
import numpy as numx | ||
import pydeep.rbm.model as model | ||
import pydeep.rbm.trainer as trainer | ||
import pydeep.rbm.estimator as estimator | ||
|
||
import pydeep.misc.io as io | ||
import pydeep.misc.visualization as vis | ||
import pydeep.misc.measuring as mea | ||
|
||
# Set random seed (optional) | ||
numx.random.seed(42) | ||
|
||
# normal RBM | ||
#update_offsets = 0.0 | ||
# centered RBM (any positive value selects the centered branch below; 0.01 is an assumed value) | ||
update_offsets = 0.01 | ||
|
||
# Input and hidden dimensionality | ||
v1 = v2 = 28 | ||
h1 = 25 | ||
h2 = 20 | ||
|
||
# Load data; get it from 'deeplearning.net/data/mnist/mnist.pkl.gz' | ||
train_data = io.load_mnist("../../data/mnist.pkl.gz", True)[0] | ||
|
||
# Training parameters | ||
batch_size = 100 | ||
epochs = 200 | ||
rbm = io.load_object("mnist500.rbm") | ||
|
||
# Create centered or normal model | ||
if update_offsets <= 0.0: | ||
    rbm = model.BinaryBinaryRBM(number_visibles=v1 * v2, | ||
                                number_hiddens=h1 * h2, | ||
                                data=train_data, | ||
                                initial_visible_offsets=0.0, | ||
                                initial_hidden_offsets=0.0) | ||
else: | ||
    rbm = model.BinaryBinaryRBM(number_visibles=v1 * v2, | ||
                                number_hiddens=h1 * h2, | ||
                                data=train_data, | ||
                                initial_visible_offsets='AUTO', | ||
                                initial_hidden_offsets='AUTO') | ||
|
||
trainer_pcd = trainer.PCD(rbm, batch_size) | ||
|
||
# Measuring time | ||
measurer = mea.Stopwatch() | ||
|
||
# Train model | ||
print('Training') | ||
print('Epoch\t\tRecon. Error\tLog likelihood \tExpected End-Time') | ||
for epoch in range(1, epochs + 1): | ||
|
||
    # Shuffle training samples (optional) | ||
    train_data = numx.random.permutation(train_data) | ||
|
||
    # Loop over all batches | ||
    for b in range(0, train_data.shape[0], batch_size): | ||
        batch = train_data[b:b + batch_size, :] | ||
        trainer_pcd.train(data=batch, epsilon=0.01) | ||
|
||
    # Calculate reconstruction error and expected end time every 10th epoch | ||
    if epoch % 10 == 0: | ||
        RE = numx.mean(estimator.reconstruction_error(rbm, train_data)) | ||
        print('{}\t\t{:.4f}\t\t\t{}'.format( | ||
            epoch, RE, measurer.get_expected_end_time(epoch, epochs))) | ||
    else: | ||
        print(epoch) | ||
|
||
# Save the model | ||
io.save_object(rbm, "mnist500.rbm") | ||
|
||
# Stop time measurement | ||
measurer.end() | ||
|
||
# Print end time | ||
print("End-time: \t{}".format(measurer.get_end_time())) | ||
print("Training time:\t{}".format(measurer.get_interval())) | ||
|
||
# Approximate partition function using AIS for lower bound approximation | ||
Z = estimator.annealed_importance_sampling(rbm)[0] | ||
print("AIS Partition: {} (LL: {})".format(Z, numx.mean( | ||
estimator.log_likelihood_v(rbm, Z, train_data)))) | ||
|
||
# Approximate partition function using reverse AIS for upper bound approximation | ||
Z = estimator.reverse_annealed_importance_sampling(rbm)[0] | ||
print("reverse AIS Partition: {} (LL: {})".format(Z, numx.mean( | ||
estimator.log_likelihood_v(rbm, Z, train_data)))) | ||
|
||
# Reorder RBM features by average activity decreasingly | ||
reordered_rbm = vis.reorder_filter_by_hidden_activation(rbm, train_data) | ||
|
||
# Display RBM parameters | ||
vis.imshow_standard_rbm_parameters(reordered_rbm, v1, v2, h1, h2) | ||
|
||
# Sample some steps and show results | ||
samples = vis.generate_samples(rbm, train_data[0:30], 30, 1, v1, v2, False, None) | ||
vis.imshow_matrix(samples, 'Samples') | ||
|
||
# Display results | ||
vis.show() |
File renamed without changes.