
Experiment to see the advantage of boost::visitor for the ANN code #2647

Closed
rcurtin opened this issue Oct 2, 2020 · 17 comments

@rcurtin (Member) commented Oct 2, 2020

One of the things that we are finding in #2458 is that the ANN code takes a lot of time and a huge amount of memory to compile. This is probably because we use boost::visitor throughout that code to avoid the runtime overhead that may be associated with virtual inheritance. But, the problem is that the use of boost::visitor (and other boost libraries) is really hard on the compiler, and so we have a situation where a lot of people can't compile mlpack in a reasonable amount of time (if at all!). We're also removing other Boost dependencies (see #2440).
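For context, the dispatch pattern being replaced looks roughly like the following. This is just a generic illustration with stand-in layer types, not mlpack's actual visitor classes:

#include <boost/variant.hpp>
#include <armadillo>

// Stand-in layer types; the real ANN code has dozens of these.
struct LinearStub { void Forward(const arma::mat& in, arma::mat& out) { out = 2 * in; } };
struct TanhStub   { void Forward(const arma::mat& in, arma::mat& out) { out = arma::tanh(in); } };

using LayerTypes = boost::variant<LinearStub*, TanhStub*>;

// A visitor like this gets instantiated for every type in the variant at
// every call site, which is where much of the compile-time cost comes from.
struct ForwardVisitor : public boost::static_visitor<void>
{
  ForwardVisitor(const arma::mat& input, arma::mat& output) :
      input(input), output(output) { }

  template<typename LayerType>
  void operator()(LayerType* layer) const { layer->Forward(input, output); }

  const arma::mat& input;
  arma::mat& output;
};

// Dispatch on a LayerTypes object `layer`:
// boost::apply_visitor(ForwardVisitor(input, output), layer);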

One possible way around this is to use virtual inheritance... so, the structure of the Linear<> class in src/mlpack/methods/ann/layer/linear.hpp would become something like this:

template<
    typename InputDataType = arma::mat,
    typename OutputDataType = arma::mat,
    typename RegularizerType = NoRegularizer
>
class Linear : public Layer
{
 public:
  Linear(const size_t inSize, const size_t outSize, RegularizerType regularizer = RegularizerType());

  virtual void Reset();

  template<typename eT>
  virtual void Forward(const arma::Mat<eT>& input, arma::Mat<eT>& output);

  template<typename eT>
  virtual void Backward(const arma::Mat<eT>& input, const arma::Mat<eT>& gy, arma::Mat<eT>& g);

  template<typename eT>
  virtual void Gradient(const arma::Mat<eT>& input, const arma::Mat<eT>& error, arma::Mat<eT>& gradient);

  ...
};
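One wrinkle worth noting up front: a virtual member function can't also be a function template, so the Layer base class would either need to fix the matrix type or be a class template itself. A minimal, purely hypothetical sketch of the first option:

#include <armadillo>

// Hypothetical Layer base class that Linear would derive from, with the
// matrix type fixed to arma::mat so the member functions can be virtual.
class Layer
{
 public:
  virtual ~Layer() { }

  virtual void Reset() { }

  virtual void Forward(const arma::mat& input, arma::mat& output) = 0;
  virtual void Backward(const arma::mat& input, const arma::mat& gy, arma::mat& g) = 0;
  virtual void Gradient(const arma::mat& input, const arma::mat& error, arma::mat& gradient) = 0;
};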

So, the question is, what kind of runtime effects would this have? (It would almost certainly reduce compilation time and memory usage significantly.)

We should run an experiment to see what the effect is. I would suggest setting up a situation where we call very many virtual functions, in order to incur the maximum overhead. Then we can compare this with the original code and see if there is any significant runtime difference.

So, here's a way to do that:

  1. Change the Linear class in something like the way suggested above. You'll need to create a Layer base class, of course. You can probably ignore some functions like serialization, but that's up to you. All we need the network to do in the end is train and predict. Ignore all the other layers (unless you want to try other layers too).

  2. Change the FFN class to use std::vector<Layer*> throughout the code, and replace uses of visitors with direct calls on the Layer* objects (a rough sketch follows this list). Note that we only need to do training and then evaluation, so you can probably leave some members of FFN commented out. Don't worry about RNN or WGAN or any other classes.

  3. Write a simple test program that creates an FFN with many small linear layers (maybe try 5, 10, 25, 100?). When training the network, use a reasonably small low-dimensional dataset, and a batch size of 1 for the optimizer. This will hopefully cause a lot of virtual function calls.

  4. Write that same network for the existing mlpack implementation.

  5. Run the programs from (3) and (4) and see how they compare in terms of runtime! You'll want to make sure the initialization and number of iterations is exactly the same, of course.
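For step (2), a hedged sketch of what the dispatch inside FFN might look like once visitors are replaced with direct virtual calls; this assumes the hypothetical Layer base class sketched earlier, and the member names here are placeholders rather than the real FFN internals:

#include <vector>
#include <utility>
#include <armadillo>

class FFN
{
 public:
  void Add(Layer* layer) { network.push_back(layer); }

  void Forward(const arma::mat& input, arma::mat& output)
  {
    arma::mat current = input;
    for (Layer* layer : network)
    {
      arma::mat next;
      layer->Forward(current, next);  // One virtual call per layer per pass.
      current = std::move(next);
    }
    output = std::move(current);
  }

  // The backward and gradient passes would walk `network` in reverse in the
  // same way, replacing the backward and gradient visitor calls.

 private:
  std::vector<Layer*> network;
};

With many small layers and a batch size of 1, as in step (3), this loop maximizes the number of virtual calls relative to the arithmetic being done, which is exactly the overhead the experiment is trying to expose.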

This is just an experiment, so you can leave a lot of stuff commented out for the sake of the experiment. Actually refactoring the ANN codebase will be a much larger effort (and we could split it up so lots of people could work on it, too), and all we want to see here is whether or not it would be a good idea to embark on this refactoring.

Anyway, I have been meaning to do this for some time, but increasingly I'm realizing that I'm not going to find the time anytime soon, so @shrit suggested that I write this up as an issue today. So here it is! 👍

Let me know if I can clarify anything, and if you're going to give it a shot, get ready for lots of compiler errors. 😄

@kartikdutt18 (Member)

Hey @rcurtin, @shrit, thanks for opening this issue. Maybe we could have a couple of people working on the experiment: one person could make the changes in the Linear layer while another works on the FFN class at the same time. That would split the workload and make it easier to test.

@geekypathak21 (Member)

@kartikdutt18 I think it is better if only one person tries this, since we have to ignore many things here and it is more like a hack. I would like to do it, but I can't promise anything right now 😞 since I'm busy with other stuff.

@Aakash-kaushik (Contributor)

Hey @kartikdutt18, @himanshupathak21061998, is it okay if I work on this experiment in the way that Himanshu suggested?

@rcurtin (Member, Author) commented Oct 6, 2020

@Aakash-kaushik I think you should go for it---I do agree that it should be done kind of like a hack. All we need to verify is whether the approach has any runtime drawbacks, and then we can decide what to do. 👍

@Aakash-kaushik (Contributor)

@rcurtin I will start working on it. 🚀

@Aakash-kaushik (Contributor) commented Oct 8, 2020

I needed some help discussing how the base class should be implemented. I can't figure out how to implement the base class as an abstract class with virtual functions that take template parameters. One thing I tried, in order to make the base class abstract, was to move the function template parameters onto the class itself, but then I ran into some declaration problems.
The second thing is that once we make the base class members virtual and derive the Linear class from it, we can't really use templates for the overridden members there either.

Also, one addition: I made the virtual functions pure virtual, because I don't think we will need an empty Layer object anywhere.
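For reference, virtual member functions can't also be function templates in C++, which is why that combination fails to compile. The workaround of moving the template parameter onto the class itself might look roughly like this (only an illustrative sketch, not the actual code from the experiment):

#include <armadillo>

// Illustrative sketch: the template parameter is lifted from the member
// functions onto the class, so the functions themselves can be pure virtual.
template<typename MatType = arma::mat>
class Layer
{
 public:
  virtual ~Layer() { }

  virtual void Forward(const MatType& input, MatType& output) = 0;
  virtual void Backward(const MatType& input, const MatType& gy, MatType& g) = 0;
  virtual void Gradient(const MatType& input, const MatType& error, MatType& gradient) = 0;
};

The trade-off is that Layer<arma::mat> and Layer<arma::fmat> become unrelated base types, so a container like std::vector<Layer*> has to commit to a single MatType.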

@shrit (Member) commented Oct 8, 2020

@Aakash-kaushik Would it be possible to open a pull request with the modifications you have made? Even if it is far from complete, this will let us better visualize the work you have done, and we can provide better help and support 👍

@Aakash-kaushik (Contributor)

@shrit Yup, I will do that, thanks.

@mlpack-bot (bot) commented Nov 8, 2020

This issue has been automatically marked as stale because it has not had any recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions! 👍

mlpack-bot added the "s: stale" label on Nov 8, 2020
@Aakash-kaushik (Contributor)

Please keep this open. I was working on this but got extremely busy; I will start working on it again.

mlpack-bot removed the "s: stale" label on Nov 8, 2020
@Aakash-kaushik (Contributor)

@zoq can you keep this open?

zoq added this to the mlpack 4.0.0 milestone on Dec 13, 2020
@zoq (Member) commented Dec 13, 2020

Ideally we can sort this out and make it part of mlpack 4.0. @Aakash-kaushik, do you think you could open a PR with the code you already have? Happy to help you solve the remaining issues.

@Aakash-kaushik (Contributor)

The thing is that right now I am going through my university exams, and to create a PR with the code I have, I would need to sort out a lot more bugs. Feel free to use the code from my repo; if we can wait until the start of January, I will be able to complete this experiment, and if we don't see a performance drop (or fail any of the other criteria we were considering), I can help with porting the other methods too.

@zoq (Member) commented Dec 16, 2020

@Aakash-kaushik no problem, I can pick up what you did and continue, I'll keep you updated. Best of luck with your exams.

@Aakash-kaushik (Contributor)

Thank you so much.

@Aakash-kaushik (Contributor)

Hey @zoq, surprisingly I have some gaps between my university exams, so I will keep updating that repo.

@rcurtin (Member, Author) commented May 29, 2022

Since #2777 is merged, this one is completed now. 👍

rcurtin closed this as completed on May 29, 2022