
Implementation - Find and Fill Method for Dropout Layer #3684

Open

wants to merge 7 commits into master

Conversation

MarkFischinger
Contributor

Following our discussion in issue #3662, I've implemented the new algorithm in the dropout layer.

Member

@rcurtin left a comment

Should we split this one into two implementations, like #3683? The numbers in #3662 show that this implementation is actually a slowdown, although the implementation here is general enough to be used with Bandicoot too.

(Requesting changes just because I don't want to merge a slowdown without a bit of discussion)

@shrit
Member

shrit commented Apr 11, 2024

It is up to you. To my understanding, dropout usually happens once per epoch, so while there is a slowdown, it is not going to affect the overall speed of training.
We can do the split, but I would weigh the extra code maintenance and size against the number of nanoseconds we would lose 😄

@rcurtin
Member

rcurtin commented Apr 11, 2024

It will happen on every forward pass (i.e., every training step), so the slowdown may be noticeable. One quick test would be to run the mnist_simple and/or mnist_cnn examples with both implementations and see what the overall time per epoch is.

@shrit
Member

shrit commented Apr 11, 2024

Okay, then it makes sense to do the benchmark.
@MarkFischinger I hope you have some time for this, if you do not mind.

@MarkFischinger
Contributor Author

@rcurtin @shrit Absolutely, I'm on the same page. I was actually already leaning towards running additional benchmarks for exactly that reason. I have a few ideas for potentially optimizing the algorithm, and I'm eager to test them in some upcoming benchmarks. Fingers crossed, I should be able to run them later today :)

@MarkFischinger
Contributor Author

@rcurtin @shrit The new algorithm appears slightly faster than the old one when processing small matrices, but significantly slower with larger matrices (also visible in the graph in the earlier discussion). If it's not necessary to replace it, I wouldn't. However, if we need to eliminate the transform function, this could be a viable alternative.

For mnist_simple:

New Algorithm:
    Duration: 78910ms
    Loss: 0.0331713
Old Algorithm:
    Duration: 79709ms
    Loss: 0.0331713

For mnist_cnn:

Old Algorithm:
    Duration: 403866ms
    Loss: 0.919632 (Epoch: 32.9s, Step: 170ms)
    Validation loss: 2894.57
New Algorithm:
    Duration: 415430ms
    Loss: 0.919632 (Epoch: 31.93s, Step: 174ms)
    Validation loss: 2894.57
    Accuracy: Train = 86.9947%, Valid = 86.5476%

@shrit
Member

shrit commented Apr 14, 2024

@MarkFischinger Okay, I see the benchmark. In this case it would make sense to have both implementations. Please update this PR to follow a structure similar to the one in #3683.

@MarkFischinger
Contributor Author

@shrit In the latest commit, I've implemented both algorithms using a structure similar to what you did in #3683. I have a question about that: why do we wrap the entire function in the #ifdef directive? Could it be placed inside the function instead? Is there a specific reason for this that I might not be aware of yet? I'd appreciate understanding your reasoning behind that :)

@shrit
Member

shrit commented Apr 14, 2024

@MarkFischinger The other implementation that you added is for the Bandicoot library. It is a linear algebra library that we are developing (with the same API as Armadillo) to provide GPU acceleration for matrix operations.
The idea of putting all of that inside an #ifdef is to avoid a hard dependency on Bandicoot; otherwise, you would not be able to build mlpack or run the tests without including the Bandicoot header, which is something we are trying to avoid.
Technically, in the future, the user will only need to define MLPACK_HAS_COOT, and the build will then look for Bandicoot.
Our objective is to not add dependencies to mlpack, since they add a massive maintenance and compilation burden; when necessary, we only add single-file, header-only dependencies.
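For illustration only, here is a minimal, self-contained sketch of that guard pattern, using a hypothetical DEMO_HAS_COOT macro and a toy Scale() function rather than the actual mlpack sources. The backend-specific overload and its header are only seen by the compiler when the macro is defined, so a default build carries no Bandicoot dependency.

// guard_demo.cpp: hypothetical sketch, not the mlpack sources.
#include <armadillo>
#include <iostream>

// General implementation: only uses operations that both Armadillo and
// Bandicoot provide, so it never names coot:: directly.
template<typename MatType>
void Scale(MatType& m)
{
  m *= 2.0;
}

#ifdef DEMO_HAS_COOT
// Compiled only when the optional GPU backend is requested; the Bandicoot
// header is not included anywhere else, so default builds never need it.
#include <bandicoot>

void Scale(coot::mat& m)
{
  m *= 2.0;
}
#endif

int main()
{
  arma::mat a(3, 3, arma::fill::ones);
  Scale(a);  // In a default (CPU-only) build, the general template is used.
  std::cout << a;
  return 0;
}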

@MarkFischinger
Contributor Author

MarkFischinger commented Apr 14, 2024

@shrit Thanks for your response! :) Sorry if my initial question wasn't as clear as it could have been. I understand the general concept, but I'm specifically curious about the decision to place the #ifdef outside the function. Is this meant to make it easier to distinguish between GPU and CPU operations, thereby improving maintainability?

Because my first thought was to implement it directly inside the function, like this (using #3683 as an example):

template<typename MatType>
void LogSoftMaxType<MatType>::ForwardImpl(const MatType& input,
                                          MatType& output)
{
  MatType maxInput = repmat(max(input), input.n_rows, 1);
  output = (maxInput - input);

  ApplyExponentialFunction(output);

  maxInput.each_row() += log(sum(output));
  output = input - maxInput;
}

template<typename MatType>
void LogSoftMaxType<MatType>::ApplyExponentialFunction(MatType& output)
{
#ifdef MLPACK_HAS_COOT // <-- Inside the function like this 
  if constexpr (coot::is_coot_type<MatType>::value)
  {
    output = exp(output * -1);
  }
  else
#endif
  {
    // Approximation of the base-e exponential function. Credits to Leon Bottou.
    output.transform([](double x)
    {
      static constexpr double A0 = 1.0, A1 = 0.125, A2 = 0.0078125,
                              A3 = 0.00032552083, A4 = 1.0172526e-5;
      if (x < 13.0)
      {
        double y = A0 + x * (A1 + x * (A2 + x * (A3 + x * A4)));
        y *= y; y *= y; y *= y;
        return 1 / y;
      }
      }
      return 0.0;
    });
  }
}

I couldn't find any best practices in C++ on this topic on the web. Do you know which is the preferred implementation?

@shrit
Member

shrit commented Apr 15, 2024

@MarkFischinger We cannot use if constexpr because we do not yet support C++17.
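For context, a pre-C++17 way to get a compile-time split inside the function is tag dispatch. The sketch below is purely illustrative and uses hypothetical free functions rather than mlpack's actual declarations; only the overload selected for the given matrix type is ever instantiated.

// tag_dispatch_demo.cpp: hypothetical pre-C++17 alternative to if constexpr.
#include <armadillo>
#include <type_traits>

// Overload chosen for Armadillo types: Bottou's approximation of exp(-x).
template<typename MatType>
void ApplyExponentialFunction(MatType& output, std::true_type /* isArma */)
{
  output.transform([](double x)
  {
    static constexpr double A0 = 1.0, A1 = 0.125, A2 = 0.0078125,
                            A3 = 0.00032552083, A4 = 1.0172526e-5;
    if (x < 13.0)
    {
      double y = A0 + x * (A1 + x * (A2 + x * (A3 + x * A4)));
      y *= y; y *= y; y *= y;
      return 1 / y;
    }
    return 0.0;
  });
}

// Overload for any other matrix type: rely on the library's own exp().
template<typename MatType>
void ApplyExponentialFunction(MatType& output, std::false_type /* isArma */)
{
  output = exp(output * -1);
}

// Dispatcher: selects one of the overloads above at compile time.
template<typename MatType>
void ApplyExponentialFunction(MatType& output)
{
  ApplyExponentialFunction(output,
      std::integral_constant<bool, arma::is_arma_type<MatType>::value>());
}

int main()
{
  arma::mat m(3, 3, arma::fill::randu);
  ApplyExponentialFunction(m);  // Resolves to the Armadillo overload.
  return 0;
}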

Member

@rcurtin left a comment

Looking good, just some comments on the multiple implementations whenever you have a chance. 👍

* @param input Input data used for evaluating the specified function.
* @param output Resulting output activation.
*/
void ForwardImpl(const MatType& input, MatType& output);
Member

You'll have to use SFINAE here, e.g. add an argument like const typename std::enable_if_t<arma::is_arma_type<MatType>::value>::type* = 0 (I just did that from memory, it may not be exactly right).
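As a standalone illustration of that kind of SFINAE dispatch, with hypothetical free functions rather than mlpack's actual declarations, and using the plain std::enable_if<...>::type spelling:

// sfinae_demo.cpp: illustrative sketch only.
#include <armadillo>
#include <iostream>
#include <type_traits>

// Chosen when MatType is an Armadillo type (the optimized path).
template<typename MatType,
         typename std::enable_if<
             arma::is_arma_type<MatType>::value>::type* = nullptr>
void ForwardImpl(const MatType& input, MatType& output)
{
  std::cout << "Armadillo-specific implementation" << std::endl;
  output = input;
}

// Chosen for any other matrix type (the general path, e.g. Bandicoot).
template<typename MatType,
         typename std::enable_if<
             !arma::is_arma_type<MatType>::value>::type* = nullptr>
void ForwardImpl(const MatType& input, MatType& output)
{
  std::cout << "general implementation" << std::endl;
  output = input;
}

int main()
{
  arma::mat in(2, 2, arma::fill::randu), out;
  ForwardImpl(in, out);  // Resolves to the Armadillo-specific overload.
  return 0;
}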

@@ -74,10 +74,18 @@ DropoutType<MatType>::operator=(DropoutType&& other)
return *this;
}


Member

Suggested change

No need for an extra line 👍

Member

Also, this one is still missing.

*/
void ForwardImpl(const MatType& input, MatType& output);

#ifdef MLPACK_HAS_COOT
Member

I don't think we actually need this guard here: we want to use the optimized implementation when we have an Armadillo matrix, and the general implementation otherwise. So long as the second implementation of ForwardImpl() doesn't use the name coot:: inside of it (and it is possible to avoid that), we should be able to avoid the need for MLPACK_HAS_COOT, and just use SFINAE here too with a negated condition (i.e. std::enable_if_t<!arma::is_arma_type< ...).

mask.randu(input.n_rows, input.n_cols);
arma::uvec indices = arma::find(mask > ratio);
mask.zeros();
mask.elem(indices).fill(1);
Member

Suggested change
mask.elem(indices).fill(1);
mask.elem(find(mask > ratio)).fill(1);

Would this work? It avoids the use of arma:: here, and would make this function fully general. But I think then we would need to find a way to get rid of the .zeros() call... maybe can we just do mask = (mask > ratio)? Or something along those lines? (It would be interesting to see the speed of that approach.)
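To make the comparison concrete, here is a small standalone Armadillo sketch of the three mask-building variants discussed in this thread; the sizes and names are hypothetical, not the PR's actual code. Note that mask > ratio yields an unsigned integer matrix, so assigning it back into a floating-point mask likely needs an explicit conv_to.

// mask_demo.cpp: illustrative sketch of the mask variants, not the PR code.
#include <armadillo>

int main()
{
  const double ratio = 0.2;                  // Dropout probability.
  const double scale = 1.0 / (1.0 - ratio);  // Inverted-dropout scaling.

  arma::mat input(1000, 1000, arma::fill::randu);
  arma::mat mask;

  // 1) transform()-based mask (the existing approach).
  mask.randu(input.n_rows, input.n_cols);
  mask.transform([ratio](double v) { return (v > ratio) ? 1.0 : 0.0; });

  // 2) Find-and-fill mask (this PR's approach).
  mask.randu(input.n_rows, input.n_cols);
  arma::uvec kept = arma::find(mask > ratio);
  mask.zeros();
  mask.elem(kept).fill(1.0);

  // 3) Relational-expression mask (the suggestion above): no find()/zeros().
  mask.randu(input.n_rows, input.n_cols);
  mask = arma::conv_to<arma::mat>::from(mask > ratio);

  // Apply the mask with inverted-dropout scaling.
  arma::mat output = input % mask * scale;
  return (output.n_elem > 0) ? 0 : 1;
}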

Contributor Author

@rcurtin You can take a look at my implementation in the latest commit. It should be a bit faster and is fully functional. I avoided using arma:: to keep it more general.

Member

Nice, it looks good to me! Thanks for doing that.

@@ -74,10 +74,18 @@ DropoutType<MatType>::operator=(DropoutType&& other)
return *this;
}


Member

Also, this one is still missing.

Comment on lines +86 to +96
template<typename T = MatType, typename std::enable_if_t<arma::is_arma_type<T>::value, int> = 0>
void ForwardImpl(const T& input, T& output);

/**
* General implementation of the forward pass of the dropout layer.
*
* @param input Input data used for evaluating the specified function.
* @param output Resulting output activation.
*/
template<typename T = MatType, typename std::enable_if_t<!arma::is_arma_type<T>::value, int> = 0>
void ForwardImpl(const T& input, T& output);
Member

@MarkFischinger
Looks better; just use MatType directly instead of T. It is better for readability, because T does not mean a lot.
Also, could you break the line? The mlpack code base usually keeps lines under 80 characters.
Also, the syntax is correct, but I would remove int and instead add a * after >, so the signature looks as follows:

typename std::enable_if_t<!arma::is_arma_type<T>::value>* = 0>

The reason for this is to have the same signature style across the whole code base, making it easier for everyone.
