Implementation - Find and Fill Method for Dropout Layer #3684
base: master
Conversation
Should we split this one into two implementations, like #3683? The numbers in #3662 show that this implementation is actually a slowdown, although the implementation here is general enough to be used with Bandicoot too.
(Requesting changes just because I don't want to merge a slowdown without a bit of discussion)
It is up to you. To my understanding, dropout usually happens once per epoch, so while there is a slowdown, it is not going to affect the overall speed of training.
It will happen at every step of the forward pass, so the slowdown may be noticeable. One quick test would just be to run the …
Okay, it makes sense then to do the benchmark.
@rcurtin @shrit Absolutely, I'm on the same page. I was already leaning towards running additional benchmarks for exactly that reason. I've got a few ideas to potentially optimize the algorithm, and I'm eager to test them out in some upcoming benchmarks. Fingers crossed, I should be able to run them later today :)
@rcurtin @shrit The new algorithm appears slightly faster than the old one when processing small matrices, but significantly slower with larger matrices (also visible in the graph in the earlier discussion). If it's not necessary to replace it, I wouldn't. However, if we need to eliminate the transform function, this could be a viable alternative. For mnist_simple:

For mnist_cnn:
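(For anyone who wants to reproduce this kind of comparison outside of mlpack, a minimal standalone timing sketch is below. The sizes and dropout ratio are illustrative assumptions; this is not the benchmark that produced the numbers above.)

#include <armadillo>
#include <iostream>

int main()
{
  const double ratio = 0.5;
  arma::wall_clock timer;

  for (const arma::uword n : {100, 1000, 3000})
  {
    arma::mat mask(n, n, arma::fill::randu);

    // Old approach: element-wise transform into a {0, 1} mask.
    arma::mat m1 = mask;
    timer.tic();
    m1.transform([ratio](double v) { return (v > ratio) ? 1.0 : 0.0; });
    const double tTransform = timer.toc();

    // Find-and-fill approach from this PR.
    arma::mat m2 = mask;
    timer.tic();
    arma::uvec kept = arma::find(m2 > ratio);
    m2.zeros();
    m2.elem(kept).fill(1.0);
    const double tFindFill = timer.toc();

    std::cout << n << "x" << n << ": transform " << tTransform
              << "s, find/fill " << tFindFill << "s\n";
  }
}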
@MarkFischinger Okay, I see the benchmark; in this case it would make sense to have both implementations. Please update this PR to follow a structure similar to the one in #3683.
@shrit In the latest commit, I've implemented both algorithms using a structure similar to what you did in #3683. I have a question about that: why do we wrap the entire function within the MLPACK_HAS_COOT preprocessor guard? Could the guard be placed inside the function instead? Is there a specific reason for this that I might not know about yet? I'd appreciate understanding your reasoning behind that :)
@MarkFischinger The other implementation that you added is for the Bandicoot library. It is a linear algebra library that we are developing (it has the same API as Armadillo) to provide GPU acceleration for matrix operations.
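(To make the "same API" point concrete: under that design, code like the sketch below looks identical for both backends, with only the matrix type changing. This is an illustrative example, not code from this PR.)

#include <armadillo>
// #include <bandicoot>  // only available when mlpack is built with Bandicoot

int main()
{
  // With Armadillo, this runs on the CPU.
  arma::mat a(100, 100, arma::fill::randu);
  arma::mat b = a * a.t() + 1.0;

  // With Bandicoot, the same expressions would be written with coot::mat
  // instead of arma::mat and would run on the GPU (assuming matching
  // constructors and operators, which is the stated goal of the library).
  // coot::mat c(100, 100, coot::fill::randu);
  // coot::mat d = c * c.t() + 1.0;

  return (b.n_elem > 0) ? 0 : 1;
}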
@shrit Thanks for your response! :) Sorry if my initial question wasn't as clear. I understand the general concept, but I'm specifically curious about the decision to place the guard around the entire function, because my first thought was to implement it directly inside the function, like this (for #3683, for example):

template<typename MatType>
void LogSoftMaxType<MatType>::ForwardImpl(const MatType& input,
                                          MatType& output)
{
  MatType maxInput = repmat(max(input), input.n_rows, 1);
  output = (maxInput - input);
  ApplyExponentialFunction(output);
  maxInput.each_row() += log(sum(output));
  output = input - maxInput;
}

template<typename MatType>
void LogSoftMaxType<MatType>::ApplyExponentialFunction(MatType& output)
{
#ifdef MLPACK_HAS_COOT  // <-- inside the function like this
  if constexpr (coot::is_coot_type<MatType>::value)
  {
    output = exp(output * -1);
  }
  else
#endif
  {
    // Approximation of the base-e exponential function. Credits to Leon Bottou.
    output.transform([](double x)
    {
      static constexpr double A0 = 1.0, A1 = 0.125, A2 = 0.0078125,
                              A3 = 0.00032552083, A4 = 1.0172526e-5;
      if (x < 13.0)
      {
        double y = A0 + x * (A1 + x * (A2 + x * (A3 + x * A4)));
        y *= y; y *= y; y *= y;
        return 1 / y;
      }
      return 0.0;
    });
  }
}

I couldn't find any best practices on this topic in C++ on the web. Do you know which is the preferred implementation?
@MarkFischinger we cannot use …
Looking good, just some comments on the multiple implementations whenever you have a chance. 👍
 * @param input Input data used for evaluating the specified function.
 * @param output Resulting output activation.
 */
void ForwardImpl(const MatType& input, MatType& output);
You'll have to use SFINAE here, e.g. add an argument like const typename std::enable_if_t<arma::is_arma_type<MatType>::value>::type* = 0
(I just did that from memory, it may not be exactly right).
@@ -74,10 +74,18 @@ DropoutType<MatType>::operator=(DropoutType&& other)
  return *this;
}

No need for an extra line 👍
Also, this one is still missing.
 */
void ForwardImpl(const MatType& input, MatType& output);

#ifdef MLPACK_HAS_COOT
I don't think we actually need this guard here: we want to use the optimized implementation when we have an Armadillo matrix, and the general implementation otherwise. So long as the second implementation of ForwardImpl() doesn't use the name coot:: inside of it (and it is possible to avoid that), we should be able to avoid the need for MLPACK_HAS_COOT, and just use SFINAE here too with a negated condition (i.e. std::enable_if_t<!arma::is_arma_type< ...).
mask.randu(input.n_rows, input.n_cols);
arma::uvec indices = arma::find(mask > ratio);
mask.zeros();
mask.elem(indices).fill(1);
Suggested change:
- mask.elem(indices).fill(1);
+ mask.elem(find(mask > ratio)).fill(1);
Would this work? It avoids the use of arma:: here, and would make this function fully general. But I think then we would need to find a way to get rid of the .zeros() call... maybe we can just do mask = (mask > ratio)? Or something along those lines? (It would be interesting to see the speed of that approach.)
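(A hedged sketch of that last idea: in Armadillo the relational expression mask > ratio has unsigned integer elements, so some conversion is needed to store the result back into a floating-point mask. Whether that beats find-and-fill is exactly the timing question raised above; the function below is illustrative only and is not the code in this PR.)

#include <armadillo>

// Build a {0, 1} dropout mask directly from a relational expression.
// Note the explicit arma::conv_to, so this exact line is not
// backend-agnostic as written; it only illustrates the idea.
arma::mat RelationalMask(const arma::uword rows,
                         const arma::uword cols,
                         const double ratio)
{
  arma::mat mask(rows, cols, arma::fill::randu);
  mask = arma::conv_to<arma::mat>::from(mask > ratio);
  return mask;
}

// Usage (hypothetical): arma::mat m = RelationalMask(1000, 1000, 0.2);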
@rcurtin You can take a look at my implementation in the latest commit. It should be a bit faster and is fully functional. I avoided using arma:: to keep it more general.
Nice, it looks good to me! Thanks for doing that.
template<typename T = MatType, typename std::enable_if_t<arma::is_arma_type<T>::value, int> = 0>
void ForwardImpl(const T& input, T& output);

/**
 * General implementation of the forward pass of the dropout layer.
 *
 * @param input Input data used for evaluating the specified function.
 * @param output Resulting output activation.
 */
template<typename T = MatType, typename std::enable_if_t<!arma::is_arma_type<T>::value, int> = 0>
void ForwardImpl(const T& input, T& output);
@MarkFischinger Looks better, just use MatType directly instead of T. It is better for readability, because T does not mean a lot. Also, could you break the line? The mlpack code base usually keeps lines under 80 characters. Also, the syntax is correct, but I would remove the int and instead add a * after the >, so the signature looks as follows:

typename std::enable_if_t<!arma::is_arma_type<T>::value>* = 0>

The reason for this is to have the same signature over all the code base, making it easier for everyone.
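(For illustration, the declarations after applying those comments might look roughly like the sketch below. A defaulted template parameter is kept, renamed here to InputMatType purely for readability, because the enable_if condition has to depend on a template parameter of the function itself for SFINAE to select between the two overloads. This is a guess at the final shape, not the code actually merged.)

/**
 * Armadillo-specific implementation of the forward pass of the dropout layer.
 *
 * @param input Input data used for evaluating the specified function.
 * @param output Resulting output activation.
 */
template<typename InputMatType = MatType,
         typename std::enable_if_t<
             arma::is_arma_type<InputMatType>::value>* = 0>
void ForwardImpl(const InputMatType& input, InputMatType& output);

/**
 * General implementation of the forward pass of the dropout layer.
 *
 * @param input Input data used for evaluating the specified function.
 * @param output Resulting output activation.
 */
template<typename InputMatType = MatType,
         typename std::enable_if_t<
             !arma::is_arma_type<InputMatType>::value>* = 0>
void ForwardImpl(const InputMatType& input, InputMatType& output);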
Following our discussion in issue #3662, I've implemented the new algorithm in the dropout layer.