
[WIP]Restricted Boltzmann Machine #1027

Closed
wants to merge 26 commits

Conversation

kris-singh (Contributor)

I have implemented:

  1. Base layer (visible and hidden layers)
  2. RBM (wrapper)
  3. CD-k algorithm (in progress)

I will add tests after the CD-k algorithm is added.

@kris-singh kris-singh changed the title [WIP}Restricted Boltzmann Machine [WIP]Restricted Boltzmann Machine Jun 14, 2017
@kris-singh (Contributor Author)

I am also confused about how to handle the biases of the hidden and visible layers. The visible layer uses the biases of the hidden layer. Right now every layer has two biases, ownBias and otherBias; the problem is that when we initialize the parameters of the layer, we have to swap how the parameters are initialized by the initializer.

@lozhnikov (Contributor) left a comment

I commented on the main issues. There are some other minor issues, but I think they'll be fixed once you solve the main ones.

typename InputDataType = arma::mat,
typename OutputDataType = arma::mat
>
class VisibleLayer
Contributor

I guess it is better to call the class BinaryRBMVisibleLayer or BinaryRBMLayer since the implementation assumes that the class is intended only for the binary RBM.

* @param output samples from the parameters
*/
template<typename eT>
void Sample(arma::Mat<eT>&& input, arma::Mat<eT>&& output);
Contributor

What is the difference between Sample() and Forward()? Why don't you want to merge these functions into one?

Contributor Author

If we look at the deeplearning.net example, the Forward() function computes the activation from the sigmoid layer, and Sample() uses the activations to sample the visible/hidden layer. I think we need both.

Contributor (@lozhnikov, Jun 15, 2017)

I guess Forward() can compute both the activation and samples.

Contributor Author

Well, in many cases you need the pre-activation outputs. I think we need both functions.

OutputDataType ownBias;

//! Locally-stored bias of the other layer.
OutputDataType otherBias;
Contributor

Hidden variables are sampled from P(h=1|v) = Sigm(W^T * v + c). The weight matrix W is shared between the visible and the hidden layers; in your notation it is denoted by weight. The bias c is denoted by ownBias (I suggest just bias). So, you need to store only two matrices: weight and bias.

Contributor Author

Okay... so you mean bias = (ownBias + otherBias)? Because c actually always denotes the other bias.

Contributor

No, c denotes ownBias; otherBias is not needed at all. So, for the visible layer bias means the visible biases, and for the hidden layer bias means the hidden biases.

Contributor Author

But the forward pass of the visible layer requires us to have the hidden bias. Where do we get that from? Should I include it as an input to the Forward function? Then I would have to concatenate the data point and the hidden bias, since we have to maintain the signature of the Forward function.

Contributor

Hmm... I looked again through the article; the forward pass of the visible layer doesn't need the hidden biases. Hidden variables are sampled from P(h=1|v) = Sigm(W^T * v + c), where W denotes the shared weight matrix and c denotes the visible biases.

Contributor Author (@kris-singh, Jun 16, 2017)

I think you can look at equation 7; c represents the hidden biases: http://deeplearning.net/tutorial/rbm.html#equation-rbm_propup

Contributor

Oh, you are right. Nevertheless, you can use the technique that I suggested below.

Contributor Author (@kris-singh, Jun 16, 2017)

In your solution we have only one bias for the BinaryRBMLayer. The problem is that the forward function uses the bias of the other layer (the hidden layer uses the bias of the visible layer), so how can we code that up? Are you suggesting we still keep both biases and do something like this?

if(visible)
{
// Parameters for the visible layer
weight = arma::mat(weights.memptr(), outSize, inSize, false, false);
ownBias = arma::mat(weights.memptr()+weight.n_elem, inSize, 1, false, false);
otherBias = arma::mat(weights.memptr()+weight.n_elem+inSize, outSize, 1, false, false);
}
else
{
// Parameters for the hidden layer
weight = arma::mat(weights.memptr(), outSize, inSize, false, false);
ownBias = arma::mat(weights.memptr()+weight.n_elem+inSize, outSize, 1, false, false);
otherBias = arma::mat(weights.memptr()+weight.n_elem, inSize, 1, false, false);
}

Contributor

Yeah, of course.

inSize(inSize),
outSize(outSize)
{
weights.set_size(outSize * inSize + inSize, 1);
Contributor

I guess it is better to allocate memory inside the main RBM class.

Contributor Author

I did not understand your comment. Are you saying that when we declare a new layer we do not allocate memory for it? Other layers allocate memory when they are created. Also, we cannot do this in the vanilla_rbm, as we do not have access to the inSize and outSize there.

if(Parameters().empty())
Reset();
else
LogisticFunction::Fn(weight * input + otherBias, output);
Contributor

This is just a forward pass through Linear and then through SigmoidLayer.
We discussed that it is better to use the existing API.
So this line should look like

linear.Forward(std::move(input), std::move(linear.OutputParameters()));
sigmoid.Forward(std::move(linear.OutputParameters()), std::move(output));

Contributor Author

The reason for not doing so is that we would have to declare the linear and sigmoid layers, which means we would have to store their parameters and reset them as well (overhead). Also, all layers in mlpack are independent; that's why I decided to go with this.

void VisibleLayer<InputDataType, OutputDataType>::Sample(arma::Mat<eT>&& input, arma::Mat<eT>&& output)
{
Forward(std::move(input), std::move(output));
for(size_t i = 0; i < output.n_elem; i++)
Contributor

We also discussed that this is just a forward pass through some layer; I sent an example.
Moreover, the binomial sampling should look like the math::Random() functions (see core/math/random.hpp).

Contributor Author

math::Random() generates a number between 0 and 1. The way the RBM uses it is to compare the generated number against the forward-pass activation of the layer to decide whether output(i) = 1.
If we use math::Random() directly, it does not take the input into account.

Contributor

I mean you should add a function to core/math/random.hpp:) for example RandBinomial

template<typename InitializationRuleType,
typename VisibleLayerType,
typename HiddenLayerType>
class VanillaRBM
Contributor

I guess if you pass layer types as template parameters these parameters completely define the type of the RBM. So, for example RBM<... , BinaryRBMLayer, BinaryRBMLayer> represents the binary RBM and RBM<..., SSRBMVisibleLayer, SSRBMHiddenLayer> represents the spike and slab RBM. In such a way it is better to call the class just RBM.

Contributor Author

Yes, I think we can do that.

* @param persistence pcdk / not
* @param output store the gradients
*/
void Gradient(arma::mat input,
Contributor

You can pass k and persistence at the constructor. If you do so, the method will look like other Gradient() methods.

//! The matrix of data points (predictors).
arma::mat predictors;
// Network
std::vector<LayerTypes> network;
Contributor

The network consists of only two layers: visible and hidden. So, this variable is not needed.

/**
* Simple add module test.
*/
BOOST_AUTO_TEST_CASE(SimpleBinaryRbmLayerTest)
Contributor

It seems you forgot to remove this test.

Contributor Author

I think we need to test the rbm_binary layer, so this test should stay.

lozhnikov (Contributor) commented Jun 15, 2017

Regarding your question about shared parameters:
I suggest the following:

  1. The main RBM class allocates memory:
template<.........>
void RBM<VisibleLayerType, HiddenLayerType, .........>::ResetParameters()
{
  size_t weights = visible.Parameters().n_elem;

  parameter.set_size(weights, 1);
  initializeRule.Initialize(parameter, parameter.n_elem, 1);

  visible.Parameters() = arma::mat(parameters.memptr(),
      visible.Parameters().n_rows, visible.Parameters().n_cols, false, false);

  // And the same for the hidden layer
  hidden.Parameters() = arma::mat(parameters.memptr(),
      visible.Parameters().n_rows, visible.Parameters().n_cols, false, false);

  // The Reset() function has to set up the links properly
  visible.Reset();
  hidden.Reset();
}
  2. Implementation of the Reset() method:
    The implementation assumes that you have to define the visible variable (in the constructor, for example).
template<typename InputDataType, typename OutputDataType>
void BinaryRBMLayer<InputDataType, OutputDataType>::Reset()
{
  if (visible)
  {
    // The number of input (visible) variables is equal to inSize
    // The number of output (hidden) variables is equal to outSize
    weight = arma::mat(weights.memptr(), outSize, inSize, false, false);

    // The offset is equal to the number of elements of the weight matrix
    bias = arma::mat(weights.memptr() + weight.n_elem, outSize, 1, false, false);
  }
  else
  {
    // The number of input (hidden) variables is equal to inSize
    // The number of output (visible) variables is equal to outSize
    weight = arma::mat(weights.memptr(), inSize, outSize, false, false);

    // The offset is equal to the number of elements of the weight matrix and the size of visible biases
    bias = arma::mat(weights.memptr() + weight.n_elem + inSize, outSize, 1, false, false);
  }
}

So, the weights matrix contains all parameters. The weight matrix W is denoted by weight. And each layer has only its own biases.

@kris-singh (Contributor Author)

Just a few minor things, I think:
visible.Parameters() = arma::mat(parameters.memptr(), visible.Parameters().n_elem, 1, false, false);
since parameters is a 1-d vector. Also here:
// The offset is equal to the number of elements of the weight matrix
bias = arma::mat(weights.memptr() + weight.n_elem, inSize, 1, false, false);

@@ -66,6 +66,31 @@ inline double Random(const double lo, const double hi)
}

/**
* Generates a binomial random number in the specified range.
*/
inline double BinomialRandom(const int input)
Contributor

I guess the comment is invalid, since input is equal to the success probability (and the mean). Moreover, input is a real variable and belongs to [0, 1].
Actually, this is the Bernoulli distribution, since the binomial distribution depends on the number of trials. And the name of the function should be similar to the others (RandNormal, RandInt, etc.).
So, I suggest RandBernoulli.

/**
* Generates a binomial random number in the specified range.
*/
template<typename InputDatatype, typename OutputDatatype>
Contributor

I think this function is not needed (e.g. since arma::mat::size() and others return the dimensions).

Member

I agree, you can just use for_each or imbue to do the same without introducing another function.

Contributor Author

Okay, I get it. I will remove this now.

if (shuffle) // Determine order of visitation.
visitationOrder = arma::shuffle(visitationOrder);
}

Contributor

Well, right now the implementation of the CDk optimizer is similar to the SGD optimizer. I guess there is no need to duplicate code.

Contributor Author

Hmmm, I agree for the most part. But the problem remains that the SGD function uses an Evaluate function as the stopping criterion. Maybe we can set the tolerance so low that the difference between the previous and current Evaluate values is never less than the tolerance level, but we cannot attach any guarantees to that. Also, I will look at the ssRBM and let you know whether we need the CDk optimizer or not.

Contributor

If I remember right, the ssRBM paper doesn't mention a stopping criterion. You can use the SGD optimizer with tolerance < 0.

ownBias = arma::mat(weights.memptr() + weight.n_elem, inSize, false, false);
otherBias = arma::mat(weight.memptr() + weight.n_elem + inSize, outSize, false, false);

if(typeVisible)
Contributor

I think there are some errors since visible.inSize = hidden.outSize and visible.outSize = hidden.inSize.

Contributor Author

Do you mean that when we are initializing, (1) hidden layer(outSize, inSize, 0)?
In that case I think you might be correct, but I wrote it with the idea of (2) hidden layer(inSize, outSize, 0). Should I change this keeping (1) in mind?

Contributor

I think inSize should denote the number of input units. I guess if the meaning of inSize depends on the type of the layer, it is confusing.

VisibleLayerType,
HiddenLayerType>::Evaluate(const arma::mat& /* parameters*/, const size_t i)
{
arma::mat freeEnergy, freeEnergyCorrupted, output;
Contributor

Could you send me the link to the source?
P.S. FreeEnergy() returns double.

Contributor Author

Ohhh, sorry, I will change this. There are two sources I am following:

  1. https://github.com/scikit-learn/scikit-learn/blob/14031f6/sklearn/neural_network/rbm.py#L28
  2. http://deeplearning.net/tutorial/code/rbm.py (have a look at the pseudo-likelihood cost function)

@lozhnikov (Contributor) left a comment

I commented on some issues.
I didn't point out style-guide issues (check spaces, variable names and so on).

{
if(Parameters().empty())
Reset();
else
Contributor

I guess you forgot braces.

*/
void CalcGradient(arma::mat&& input, arma::mat&& output)
{
arma::mat temp1, temp2;
Contributor

Could you reduce memory allocations here? As for me it is better to define these temporary variables as class members.

for(size_t j = 0; j < num_steps - 1; j++)
{
// Use probabilities for the updates until the last step (section 3, Hinton)
ForwardVisible(std::move(temp), std::move(temp1));
Contributor

I think you forgot to sample variables after each Forward() call.

Contributor Author

I think you can read about it here: https://www.cs.toronto.edu/~hinton/absps/guideTR.pdf.
But earlier I was doing just SampleVisible and SampleHidden for numSteps, and the result was still not converging.

@kris-singh (Contributor Author)

How can I fix the AppVeyor check? I want to add a new test to ann_layer_test.cpp. Also, all the style checks on my files pass; I don't understand why the test is failing.

zoq (Member) commented Jun 19, 2017

If you rebase the PR against the latest master branch the style checks should pass and fix the appveyor and travis build issue. For more information: https://github.com/edx/edx-platform/wiki/How-to-Rebase-a-Pull-Request


{
std::cout << rbm.Evaluate(iterate, orderFunction) << std::endl;
// Update the step
iterate += stepSize * cumgradient;
Member

I haven't checked the Gradient function yet, but should we subtract or add here? Also, should we divide cumgradient by batchSize?

Contributor Author

I did subtract and divide the cumgradient by batchSize, but it didn't help with the training. Though I agree with the scaling by batch size; if you look at the update for the RBM, the gradients are added to the parameters.

gaussian, visible, hidden);
CDK<RBM<GaussianInitialization, BinaryLayer<>, BinaryLayer<> >> cdk(model,
100, 0.1, 15 * trainData.n_cols, 20, true, true);
model.Train(trainData, cdk);
Member

Since you used std::move() inside the RBM constructor you can't reuse trainData here. You could remove std::move from the RBM constructor or pass a new copy to the Train function.

Contributor Author

I have removed the reference in the model constructor now.

freeEnergy = FreeEnergy(std::move(currentInput));
currentInput(corruptIdx) = 1 - predictors(corruptIdx);
freeEnergyCorrupted = FreeEnergy(std::move(currentInput));
output(i) = SoftplusFunction::Fn(freeEnergyCorrupted - freeEnergy + 1);
Member

orderFunction holds the visitation order of the current batch and is of size batchSize, right? So let's say orderFunction is [120, 121, 122, ...]. I guess I run into an issue if I do output(i) with i = 120, since output has the same size as orderFunction, which is batchSize. Maybe I missed something?

Member

So I guess, you meant to use the index of the current iteration instead of using the value?

Member

Also, we should avoid auto here, I don't think it makes it easier for someone trying to understand the code if you have to lookup the type.

Contributor Author

Yes, I meant to use the iteration index here for the output. But funnily, the code does not actually give an error on this, which is strange.

Member

What test case do you use? It will not fail if numFunctions = batchSize which I think is the case for the SGDFunction.

@kris-singh (Contributor Author)

The training error / Evaluate function now only increases rather than going up and down.
The samples are also always one.

@lozhnikov (Contributor) left a comment

I think I found the fix. Try the following changes.

// Now iterate!
arma::mat gradient(iterate.n_rows, iterate.n_cols);
arma::mat cumgradient(iterate.n_rows, iterate.n_cols);

Contributor

You forgot to add

cumgradient.zeros();

here, or just add arma::fill::zeros to the constructor.


if (i % batchSize == 0)
{
std::cout << overallCost << std::endl;
Contributor

There is no need to print overallCost each time. I think it is reasonable to print it after each training epoch (in order to debug, of course).
I suggest the following:

    if (shuffle)
      rbm.Gradient(visitationOrder[currentFunction], gradient);
    else
      rbm.Gradient(currentFunction, gradient);

    cumgradient += gradient;

    if (shuffle)
      overallCost += rbm.Evaluate(iterate, visitationOrder[currentFunction]);
    else
      overallCost += rbm.Evaluate(iterate, currentFunction);

    if (i % batchSize == 0)
    {
      // Update the step
      iterate -= stepSize * (cumgradient / batchSize);
      cumgradient.zeros();
    }

    if (i % numFunctions == 0)
    {
      overallCost /= numFunctions;
      std::cout << overallCost << std::endl;
      overallCost = 0;
    }

// Collect the negative gradients
CalcGradient(std::move(negativeSamples), std::move(negativeGradient));

output = positiveGradient - negativeGradient;
Contributor (@lozhnikov, Jun 21, 2017)

This is the main issue: the sign is invalid.
So write the following

output = -(positiveGradient - negativeGradient);

Or change the signs of positiveGradient and negativeGradient.
If you calculate the derivatives of the free-energy function, you will see that they have a minus sign at the beginning.

BOOST_AUTO_TEST_CASE(VanillaNetworkTest)
{
arma::mat dataset;
dataset.load("/Users/kris/Desktop/GsoC/mlpack/src/mlpack/tests/data/mnist_first250_training_4s_and_9s.arm");
Contributor

Of course here should be just dataset.load("mnist_first250_training_4s_and_9s.arm");

@lozhnikov (Contributor)

In case I forgot to comment on something, here is a patch that seems to work. Don't commit this patch (there are a lot of debug prints and so on).
patch.zip

lozhnikov (Contributor) commented Jun 29, 2017

Try the following:

arma::mat("1.0 0.0 1.0;"
          "1.0 0.0 0.0;"
          /*etc*/);

model.Train(trainData, msgd);
model.Reset();
// Set the parameters from a learned RBM
model.Parameters() = {-0.23224054, -0.23000632,
Contributor

Could you also fix the initializer list here?

model.HiddenLayer().Weight()), 0, 1e-2);

// Check free energy
arma::vec freeEnergy = {-0.87523715, 0.50615066, 0.46923476, 1.21509084};
Contributor

And here.

@lozhnikov (Contributor)

Regarding the serialization test:
You couldn't compile the test because the BinaryLayer class contains constants such as inSize, outSize and typeVisible. So, you should remove the const modifiers.

@kris-singh (Contributor Author)

Only the krann test is failing, so you can review this.

@lozhnikov (Contributor) left a comment

Sorry for the slow response. I noticed that the regression accuracy is too low (about 10%). So, I looked through the test and found a critical error in the data-loading code (see below). I fixed it and got the following results:
the regression accuracy is 67.3267 and
the RBM accuracy is 26.2695.

I tried the digits dataset (the dataset from the scikit example) and got 76.1825 and 27.6155 for the regression and the RBM respectively. So, I think you made an error somewhere. I didn't dig into it, but I pointed out some issues below.


for (size_t i = 0; i < testLabels.n_rows; ++i)
{
testLabels(i) = arma::as_scalar(testLabels.row(i));
Contributor

I found a critical error: that's equivalent to testLabels = testLabels.

Contributor Author

I did not understand your point here; can you elaborate a bit?
I have used arma::as_scalar for converting the 1x1 matrix to a scalar quantity.

Contributor

You don't initialize testLabels properly: you initialize testLabelsTemp but you don't copy it to testLabels, so testLabels is equal to zero. Moreover, arma::as_scalar(testLabels.row(i)) is equal to testLabels(i), since testLabels is a vector.

Contributor Author (@kris-singh, Jul 4, 2017)

Okay, got your point, so
testLabels(i) = arma::as_scalar(testLabelsTemp.row(i));
Even with this I am getting a classification accuracy of 10% for softmax
and 84% for the RBM.
Are you using the same dataset that I committed?

Contributor

Yes, I use the same dataset, and I build the project from scratch.
If I change
testLabels(i) = arma::as_scalar(testLabels.row(i));
to
testLabels(i) = testLabelsTemp(i);
the accuracy becomes 27% for the RBM.
But I think you didn't notice another error:
for (size_t i = 0; i < testLabels.n_rows; ++i)
testLabels is a row, so testLabels.n_rows is equal to 1.
In such a way, you should write for (size_t i = 0; i < testLabelsTemp.n_rows; ++i).

When I fixed that, I got 67% and 27% for the regression and the RBM respectively.

arma::mat trainData, testData, dataset;
arma::vec trainLabelsTemp, testLabelsTemp;
data::Load("mnist_data_small.csv", dataset, true);
trainData = dataset.submat(0, 0,
Contributor

As for me, it is much better to divide the dataset into four parts, e.g. test.csv, train.csv, test_labels.csv, train_labels.csv, since your way doesn't allow anyone else to use the dataset, as the splitting algorithm is not documented anywhere.

weightNegativeGrad = hiddenBiasNegativeGrad * negativeSamples.t();
visibleBiasNegativeGrad = negativeSamples;

output = (positiveGradient - negativeGradient);
Contributor

I guess the sign is invalid since you switched to the mini-batch SGD optimizer.


// Train the model.
Timer::Start("rbm_optimization");
const double out = optimizer.Optimize(*this, parameter);
Contributor

The out variable is not used.

"-0.23224054, -0.23000632, -0.25701271, -0.25122418, -0.20716651"
"-0.20962217, -0.59922456, -0.60003836, -0.6, -0.625, -0.475;");
// Check Weight Shared
BOOST_REQUIRE_CLOSE(arma::accu(model.VisibleLayer().Weight() -
Contributor

I think the weights should be equal. Why did you choose 1e-2 accuracy? The precision of double around 1 is about 1e-16, so you can set something like 1e-14.

RBM<GaussianInitialization, BinaryLayer<>, BinaryLayer<>> model(trainData,
gaussian, visible, hidden, 1, true);
MiniBatchSGD msgd(10, 0.06, trainData.n_cols * 20, 0, true);
model.Train(trainData, msgd);
Contributor

I don't understand why you train the model if you don't use the parameters.

calcultedFreeEnergy(i) = model.FreeEnergy(std::move(trainData.col(i)));
}

BOOST_REQUIRE_LE(arma::accu(calcultedFreeEnergy - freeEnergy), 1);
Contributor

I think this should be BOOST_REQUIRE_CLOSE().

@kris-singh (Contributor Author)

Strange, the accuracy I was getting was around 90%. I will check again and let you know.

testData = testData.t();

arma::mat output, XRbm(hiddenLayerSize, trainData.n_cols),
YRbm(hiddenLayerSize, trainLabels.n_cols);
@lozhnikov lozhnikov (Contributor), Jul 4, 2017

I think I have found the fix. You don't define the dimensions of YRbm properly.
Try the following:
YRbm(hiddenLayerSize, testData.n_cols)

@kris-singh force-pushed the Binary_RBM branch 2 times, most recently from 1425a40 to c2bd07c, on July 6, 2017 at 07:57.
@kris-singh (Contributor Author)

Ready to merge.

int hiddenLayerSize = 100;
arma::mat trainData, testData, dataset;
arma::vec trainLabelsTemp, testLabelsTemp;
data::Load("mnisttrain.txt", trainData, true);
Contributor

  1. If I am not mistaken, you didn't upload the MNIST dataset; you uploaded the Digit dataset that I sent you. So the file names should look like digit_train.txt, digit_train_labels.txt, etc.
  2. Moreover you uploaded the same binary files two times: in 646c6c6 and in c2bd07c. So, I guess you should do git-rebase.
  3. The official webpage of the dataset (http://archive.ics.uci.edu/ml/datasets/Pen-Based+Recognition+of+Handwritten+Digits) points out the citation policy (http://archive.ics.uci.edu/ml/citation_policy.html). So, I think we have to refer to the repo in the source code.

Member

I'm curious, have you tried storing the datasets as arma_binary? I think it would be great if we could keep the size of the repo and the datasets low.

{
Forward(std::move(input), std::move(output));
for (size_t i = 0; i < output.n_elem; i++)
output(i) = math::RandBernoulli(output(i));
Member

I'm wondering if using arma::randi<arma::mat>(10, 10, arma::distr_param(0, 1)); is faster.

Contributor Author

I tested this, and RandBernoulli is faster.

Member

Okay, thanks for checking.

@kris-singh (Contributor Author)

If there are no further comments, I think you can merge this.

@lozhnikov (Contributor)

Could you add binary data in a separate commit?

arma::mat trainLabelsTemp, testLabelsTemp;
trainData.load("digits_train.arm");
testData.load("digits_test.arm");
trainLabelsTemp.load("digits_train_label.arm");
@lozhnikov lozhnikov (Contributor), Jul 8, 2017

It seems I overlooked that. There is no sense in binary labels, since each double takes 8 bytes and each label is a single digit; plain-text labels should take only 2*numLabels bytes (digit + separator). In my opinion it is better to save an arma::Row&lt;size_t&gt; in plain format. That should also simplify the code, since it allows avoiding the temporary matrices (such as trainLabelsTemp and testLabelsTemp).

P.S. On second thoughts, that is not critical, so I think you can leave that as is since the dataset is added in a separate commit.

trainLabelsTemp.load("digits_train_label.arm");
testLabelsTemp.load("digits_test_label.arm");

std::cout << arma::size(testLabelsTemp) << std::endl;
Contributor

Could you remove all unnecessary output?

@lozhnikov (Contributor)

I'll close this PR since the code was merged into PR #1046 (SpikeSlabRBM) and was heavily refactored there. The binary RBM will therefore be merged together with the spike and slab RBM.

@lozhnikov lozhnikov closed this Jul 12, 2017
3 participants