SpikeSlabRBM #1046

Closed — wants to merge 62 commits into base: master

4 participants
@kris-singh
kris-singh (Contributor) commented Jul 3, 2017

1) Right now this implements the ssRBM layer types. (ssRBM.hpp is not ready; I will finish that by tonight or so.)

kris-singh added some commits Jun 14, 2017

@lozhnikov
lozhnikov (Contributor) commented Aug 7, 2017

> Should I add a test for the float data type that checks the classification accuracy with float type?

Looks reasonable to me.

> Should I add the test for CIFAR images, or should we only test this with MNIST?

I don't like this idea, since it requires adding a new dataset to the repo.

@lozhnikov

I pointed out some issues.

(Outdated review comments on src/mlpack/methods/ann/rbm.hpp.)
namespace ann {
template<typename DataType>
BinaryRBMPolicy<DataType>

@lozhnikov
lozhnikov (Contributor) Aug 7, 2017

Correct me if I am mistaken, but according to the style guide it seems we should write
BinaryRBMPolicy<DataType>::BinaryRBMPolicy(
without line breaks.

@zoq
zoq (Member) Aug 15, 2017

I agree, but I guess I'm biased.

(Outdated review comments on src/mlpack/methods/ann/rbm/binary_rbm_policy_impl.hpp, spike_slab_rbm_policy.hpp, spike_slab_rbm_policy_impl.hpp, and src/mlpack/tests/rbm_network_test.cpp.)
@zoq

Mainly commented on a bunch of pedantic style issues; nice work!

(Outdated review comments on src/mlpack/methods/ann/rbm.hpp, src/mlpack/methods/ann/rbm/binary_rbm_policy.hpp, and binary_rbm_policy_impl.hpp.)
// visible penalty 1 * 1 => D * D(when used)
visiblePenalty = DataType(parameter.memptr() + weight.n_elem +
spikeBias.n_elem, 1, 1, false, false);

@zoq
zoq (Member) Aug 8, 2017

Should we use visibleSize here, since that's what is used in:

parameter.set_size(visibleSize * hiddenSize * poolSize + visibleSize + hiddenSize);

@kris-singh
kris-singh (Contributor) Aug 8, 2017

visibleSize indicates the number of neurons in the visible layer. The visiblePenalty we are using, though, is a 1 * 1 scalar. It actually has to be a D * D diagonal matrix, but we reduced it to a 1 * 1 scalar since, for better performance, we end up using scalar * identity matrix for visiblePenalty.

(Outdated review comments on src/mlpack/methods/ann/rbm/spike_slab_rbm_policy_impl.hpp.)

kris-singh added some commits Aug 7, 2017

@kris-singh
kris-singh (Contributor) commented Aug 10, 2017

Fixed all of the changes requested. Let me know if any other changes are needed.

@kris-singh
kris-singh (Contributor) commented Aug 11, 2017

I don't understand why the second Travis test case is failing.

@zoq
zoq (Member) commented Aug 15, 2017

About the Travis error: it looks like the time to run the RBM test is pretty high. Do you think we can reduce the time somehow?

(Outdated review comment on src/mlpack/methods/ann/layer/layer.hpp.)
* RBM class
*
* @tparam IntialiserType rule to intialise the parameters of the network
* @tparam RBMPolicy type of rbm

@zoq
zoq (Member) Aug 15, 2017

Do you mind being a little more descriptive about the parameters?

/*
* RBM class
*
* @tparam IntialiserType rule to intialise the parameters of the network

@zoq
zoq (Member) Aug 15, 2017

Pedantic style issue: missing . at the end, and start with a capital letter.

(Outdated review comments on src/mlpack/methods/ann/rbm.hpp, src/mlpack/methods/ann/rbm/spike_slab_rbm_policy_impl.hpp, and src/mlpack/methods/ann/rbm_impl.hpp.)
RBM<GaussianInitialization, SpikeSlabRBMPolicy<>> modelssRBM(trainData,
gaussian, ss_rbm, 1, 1, true, false);
size_t numRBMIterations = trainData.n_cols * numEpoches;

@zoq
zoq (Member) Aug 15, 2017

Maybe we can decrease the number of epochs to decrease the test time?

@zoq
zoq (Member) Aug 23, 2017

Have you tested the code with a lower number of iterations? Currently you use 17970 iterations, but maybe we can get reasonable results with fewer iterations?

@kris-singh
kris-singh (Contributor) Aug 23, 2017

So I was able to reduce the number of epochs to 22 with a hidden layer size of 100, and this was still beating the softmax classification function. But yes, the time decrease was not that much on my local machine. I will push that code and see the results on Travis, though.

@zoq
zoq (Member) Aug 24, 2017

I think we don't have to beat the softmax function; if the results are reasonable, that's probably fine in this case.

(Outdated review comments on src/mlpack/tests/rbm_network_test.cpp and src/mlpack/tests/serialization_test.cpp.)
@lozhnikov
lozhnikov (Contributor) commented Aug 16, 2017

@zoq That's a known issue. Right now the ssRBM is ten times slower than the binary RBM. I tried to investigate why the ssRBM is so slow; gprof says that arma::eop_core::apply_inplace_plus takes 69% of the time. My guess is that Armadillo allocates some temporary memory. I haven't dug into that yet.

void SpikeSlabRBMPolicy<DataType>::Reset()
{
// Weight shape D * K * N
weight = arma::Cube<ElemType>(parameter.memptr(),

@zoq
zoq (Member) Aug 23, 2017

Do you think it might be possible to use arma::mat instead of arma::cube for the weights? That way we could vectorize some expressions (avoid a for loop), since you can't e.g. multiply a cube with a matrix and store the result as a matrix. It looks like there are a lot of places where that would come in handy.

@kris-singh
kris-singh (Contributor) Aug 23, 2017

So, if I understand this right, I think you are proposing weight = arma::Mat<ElemType>(parameter.memptr(), visibleSize, poolSize * hiddenSize);
and replacing all the expressions that were using the weight cube with weight.submat(..., ...). I still think we would require the for loop, though.

@zoq
zoq (Member) Aug 24, 2017

Using .cols and .each_col; but you are right, we can't avoid for loops entirely.

@kris-singh
kris-singh (Contributor) Aug 25, 2017

@zoq Well, I just looked at the code again. Most (all) of the functions actually use weight.slice(i), so I think we would not be able to vectorize most of the expressions. We could use submat, something like weight.submat(i * visibleSize * poolSize, 0, (i + 1) * visibleSize * poolSize, poolSize); that would not get rid of the loop, but I don't know if an Armadillo matrix is preferred over cubes for performance reasons.

@zoq
zoq (Member) Aug 25, 2017

I'll take a second look and let you know if I find an expression that could benefit from arma::mat over arma::cube. In the meantime, can you test different numbers of iterations?

@kris-singh
kris-singh (Contributor) commented Aug 26, 2017

Finally, I was able to make it work within the time constraints. The accuracy is also good, around ~77-78%.

@lozhnikov
lozhnikov (Contributor) commented Aug 26, 2017

@kris-singh I played with the parameters a little. Try the following patch:
ssrbm_test.zip

@kris-singh
kris-singh (Contributor) commented Aug 27, 2017

Should we merge this if everything is okay?

@lozhnikov
lozhnikov (Contributor) commented Aug 27, 2017

@kris-singh What do you think: is it better to remove useMonitoringCost entirely? In my opinion it is better to rewrite the Evaluate() function in the following way:

template<typename InitializationRuleType, typename RBMPolicy>
double RBM<InitializationRuleType, RBMPolicy>::Evaluate(
    const arma::Mat<ElemType>& /* parameters */,
    const size_t i)
{
  Gibbs(std::move(predictors.col(i)), std::move(negativeSamples));
  return std::fabs(FreeEnergy(std::move(predictors.col(i))) -
      FreeEnergy(std::move(negativeSamples)));
}

and remove the rest of the code. Tell me if you'll fix that, or I can refactor it later as soon as I merge the PR.

Regarding the merge: let me try the implementation on the CIFAR dataset. I haven't tried it yet; I was focused on the GAN PR.

@kris-singh
kris-singh (Contributor) commented Aug 27, 2017

Okay, I will remove useMonitoringCost. Regarding the CIFAR dataset, here is code for both creating the patches and the test: https://gist.github.com/kris-singh/55a4934c463bbddb3c8d321dd889d194.
I remember that I was getting good accuracy (> 75%) only for a subset of the dataset, but as I increased the size, the accuracy kept falling.

@ShikharJ
ShikharJ (Member) commented Jul 27, 2018

@kris-singh Thanks for your work! These commits have been re-used in #1208.
@zoq This can be closed now.

@zoq
zoq (Member) commented Jul 27, 2018

Agreed. Thanks @kris-singh and @lozhnikov for all the time you put into the code.

@zoq zoq closed this Jul 27, 2018
