Experiment to see the advantage of `boost::visitor` for the ANN code #2647
Comments
@kartikdutt18 I think it is better if only one person tries this, because we have to ignore many things here; it is more like a hack. I would like to do this, but currently I can't promise 😞 — just busy with other stuff.
Hey @kartikdutt18, @himanshupathak21061998, is it okay if I work on this experimentation the way that Himanshu suggested?
@Aakash-kaushik I think you should go for it---I do agree that it should be done kind of like a hack. All we need to verify is whether the approach has any runtime drawbacks, and then we can decide what to do. 👍
@rcurtin I will start working on it. 🚀
I needed some help discussing how the base class should be implemented. I can't figure out how to implement the base class as an abstract class with virtual functions that take templates. One thing I tried, to make the base class abstract, was to move the function template parameters up to the class itself, but then I had some declaration problems. Also, one addition: I made the virtual functions pure virtual, because I don't think we will need an empty layer object anywhere.
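The constraint behind this comment can be sketched concretely: C++ does not allow a virtual member function to itself be a function template, so the template parameter has to move up to the class, as described above. A minimal illustration of that workaround (hypothetical names using `std::vector` rather than Armadillo — this is not mlpack's actual API):

```cpp
#include <vector>

// A virtual member function cannot be a template, so this is ill-formed:
//
//   class Layer
//   {
//    public:
//     template<typename MatType>
//     virtual void Forward(const MatType& in, MatType& out) = 0;  // error!
//   };
//
// One workaround is to move the template parameter onto the class itself,
// so the virtual function signature is fixed per instantiation.
template<typename MatType = std::vector<double>>
class Layer
{
 public:
  virtual ~Layer() { }
  // Pure virtual: no empty layer objects needed, so the base is abstract.
  virtual void Forward(const MatType& input, MatType& output) = 0;
};

template<typename MatType = std::vector<double>>
class Identity : public Layer<MatType>
{
 public:
  void Forward(const MatType& input, MatType& output) override
  {
    output = input;
  }
};
```

The trade-off is that `Layer<arma::mat>` and `Layer<arma::fmat>` become unrelated base classes, so a network must commit to one matrix type for its whole layer stack.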
@Aakash-kaushik Would it be possible to open a pull request with the modifications you have made? Even if it is far from complete, this will allow us to better visualize the work you have done, and will let us provide you with better help and support 👍
@shrit Yup, I will do that, thanks.
This issue has been automatically marked as stale because it has not had any recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions! 👍
Please keep this open. I was working on this but got extremely busy, and I will start working on it again.
@zoq Can you keep this open?
Ideally we can sort this out and make it part of mlpack 4.0. @Aakash-kaushik, do you think you could open a PR with the code you already have? Happy to help you solve the remaining issues.
The thing right now is that I am going through my university exams, and to create a PR with the code I have right now I would have to sort out a lot more bugs. If you want, feel free to use the code from my repo; and if we can wait until the start of January, I will be able to complete this experiment. If we don't see a performance drop (or problems with the other criteria we were considering), I can help with porting the other methods too.
@Aakash-kaushik No problem, I can pick up what you did and continue; I'll keep you updated. Best of luck with your exams.
Thank you so much.
Hey @zoq, surprisingly I have time gaps between my university exams, so I will keep updating that repo.
Since #2777 is merged, this one is complete now. 👍
One of the things that we are finding in #2458 is that the ANN code takes a lot of time and a huge amount of memory to compile. This is probably because we use `boost::visitor` throughout that code to avoid the runtime overhead that may be associated with virtual inheritance. But the problem is that the use of `boost::visitor` (and other Boost libraries) is really hard on the compiler, and so we have a situation where a lot of people can't compile mlpack in a reasonable amount of time (if at all!). We're also removing other Boost dependencies (see #2440).

One possible way around this is to use virtual inheritance... so, the structure of the `Linear<>` class in `src/mlpack/methods/ann/layer/linear.hpp` would become something like this:

So, the question is, what kind of runtime effects would this have? (It would almost certainly have really big impacts on compilation time, reducing it significantly.)
We should run an experiment to see what the effect is. I would suggest that one way to do this would be to set up a situation where we are calling very many `virtual` functions, in order to incur the maximum overhead. Then we can compare this with the original code and see if there is any significant runtime difference. So, here's a way to do that:

1. Change the `Linear` class in something like the way suggested above. You'll need to create a `Layer` base class, of course. You can probably ignore some functions, like serialization---but that's up to you. All we need the network to do in the end is train and predict. Ignore all the other layers (unless you want to try other layers too).
2. Change the `FFN` class to use `std::vector<Layer*>` throughout the code, and replace uses of visitors with direct calls on a `Layer*` object. Note that we only need to do training and then evaluation, so you can probably leave some members of `FFN` commented out. Don't worry about `RNN` or `WGAN` or any other classes.
3. Write a simple test program that creates an `FFN` with many small linear layers (maybe try 5, 10, 25, 100?). When training the network, use a reasonably small low-dimensional dataset, and a batch size of 1 for the optimizer. This will hopefully cause a lot of `virtual` function calls.
4. Write that same network with the existing mlpack implementation.
5. Run the programs from (3) and (4) and see how they compare in terms of runtime! You'll want to make sure the initialization and the number of iterations are exactly the same, of course.
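To get a feel for what step (3) is measuring, the dispatch-overhead comparison can be sketched in isolation, without mlpack at all (toy code with hypothetical names; real timings should of course come from the actual `FFN` classes):

```cpp
#include <cstddef>
#include <vector>

// Toy benchmark scaffolding (not mlpack code): the same arithmetic done
// through virtual dispatch and through plain calls, for timing comparison.
struct Layer
{
  virtual ~Layer() { }
  virtual double Forward(double x) = 0;
};

struct Scale : Layer
{
  explicit Scale(double f) : factor(f) { }
  double Forward(double x) override { return factor * x; }
  double factor;
};

// Run `iterations` forward passes through the stack of layers; every call
// goes through the vtable, mimicking std::vector<Layer*> in the FFN.
double RunVirtual(const std::vector<Layer*>& net, std::size_t iterations)
{
  double x = 1.0;
  for (std::size_t it = 0; it < iterations; ++it)
    for (Layer* l : net)
      x = l->Forward(x);
  return x;
}

// The same computation with no virtual dispatch at all.
double RunDirect(double factor, std::size_t layers, std::size_t iterations)
{
  double x = 1.0;
  for (std::size_t it = 0; it < iterations; ++it)
    for (std::size_t i = 0; i < layers; ++i)
      x = factor * x;
  return x;
}
```

Wrapping each call in `std::chrono::steady_clock::now()` and comparing the durations gives a rough per-call dispatch cost. Note that at `-O2` the direct loop may be aggressively optimized, so the measured gap is closer to an upper bound on what a real network, where each layer does a matrix multiply, would actually see.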
This is just an experiment, so you can leave a lot of stuff commented out for the sake of the experiment. Actually refactoring the ANN codebase will be a much larger effort (and we could split it up so lots of people could work on it, too), and all we want to see here is whether or not it would be a good idea to embark on this refactoring.
Anyway, I have been meaning to do this for some time, but increasingly I'm realizing that I'm not going to find the time anytime soon, so @shrit suggested that I write this up as an issue today. So here it is! 👍
Let me know if I can clarify anything, and if you're going to give it a shot, get ready for lots of compiler errors. 😄