Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove boost::visitor from Model classes #2803

Merged
merged 17 commits into from Feb 7, 2021

Conversation

rcurtin
Copy link
Member

@rcurtin rcurtin commented Jan 11, 2021

We have a number of bindings that use different template parameters for various model types. For instance, the knn binding allows usage of different tree types for nearest neighbor search. These correspond to different template parameters for the NeighborSearch class. That in turn means that the knn binding (and other bindings like cf, krann, kfn, range_search, and kde) could use one of many template types.

In the past, we've solved that issue by using boost::variant<> to encapsulate all the different template types we might use, and then we use boost::visitor to provide a standardized interface to each of them.

In various discussions in #2647, #2777, #2779, and #2440, we've talked about a desire to improve our compilation times in part by removing usage of boost::visitor in favor of virtual methods and inheritance. Therefore, I've refactored the following classes:

  • NSModel
  • RSModel
  • RAModel
  • KDEModel (@robertohueso would you mind taking a look at this one and making sure I didn't break anything? :))
  • CFModel

The refactoring is fairly similar for all five of those classes. To take an example, the idea is that NSModel now holds an NSWrapperBase*, where NSWrapperBase is a base class from which another type is derived. So, for that binding, the NSWrapper<TreeType, ...> class is derived, but the NSWrapperBase class provides a virtual interface that the derived class will implement. In some cases, even NSWrapper is derived for particular tree types, in cases where the behavior must be different.

While virtual methods may have overhead associated with them, the virtual methods associated with each class above might be called once or twice in a single run of a binding---so the effect of a vtable lookup will be entirely negligible.

The compilation time savings are noticeable. With mlpack configured to build only the command-line executables and the tests, time make -j4 on my system on this branch gives:

real    16m56.780s
user    62m33.058s
sys     1m38.195s

and on the master branch, the same thing gives:

real    17m38.051s
user    65m22.727s
sys     1m39.126s

At a more granular level, after I do make clean, the output for time make -j1 mlpack_knn mlpack_kfn mlpack_kde mlpack_range_search mlpack_krann mlpack_cf on this branch is:

real    10m28.012s
user    9m43.960s
sys     0m26.909s

and on the master branch, the same thing gives:

real    10m34.075s
user    9m51.537s
sys     0m26.892s

(Those results are a bit more marginal than I would have expected, actually. Part of the complexity here is that there are now additional .cpp files. Those don't need to be recompiled for the tests, so that may account for more of the savings above.)

I didn't test RAM usage during compilation (since these models don't appear to be the biggest bottleneck for mlpack compilation; I think that's in the ANN code for now), but I assume that is improved too.

This will break serialization compatibility for those bindings, but we already did that with the cereal transition and haven't had any mlpack releases since then, so that shouldn't be an issue.

@rcurtin
Copy link
Member Author

rcurtin commented Jan 14, 2021

It seems that all the static code analysis warnings are false positives.

Comment on lines 376 to 377
Log::Fatal << "List of query users must be one-dimensional!"
<< std::endl;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it makes sense to use {} here to improve readability, also in like 379-380 we use 4 spaces instead of aligning the <<.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed---fixed in 2f504f5. 👍

@jeffin143
Copy link
Member

@rcurtin , Can you take a look at the merge conflicts and we can quickly get this upstream so that there are more conflicts

@rcurtin
Copy link
Member Author

rcurtin commented Jan 22, 2021

Weird, that was a strange merge conflict. I would have expected git to have no problem with it. Anyway, fixed now. 👍

@zoq
Copy link
Member

zoq commented Jan 28, 2021

Looks like all 'valid' static code analysis issues are fixed in #2807.

Copy link
Member

@zoq zoq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No further comments from my side.

Copy link

@mlpack-bot mlpack-bot bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Second approval provided automatically after 24 hours. 👍

Copy link
Member

@shrit shrit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, everything looks great and easier to read without boost::visitor

@jeffin143 jeffin143 merged commit f96934d into mlpack:master Feb 7, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants