Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move to C++17 #3704

Merged
merged 42 commits into from
May 24, 2024
Merged

Move to C++17 #3704

merged 42 commits into from
May 24, 2024

Conversation

shrit
Copy link
Member

@shrit shrit commented May 14, 2024

No description provided.

shrit added 2 commits May 14, 2024 13:00
Signed-off-by: Omar Shrit <omar@avontech.fr>
Signed-off-by: Omar Shrit <omar@avontech.fr>
@rcurtin
Copy link
Member

rcurtin commented May 14, 2024

Nice, it will be good to get this merged in.

There are also a number of methods that were recently modified to have a huge pile of constructors and Train() methods, but with C++17 we can use std::optional to reduce the number of overloads. For instance: #3578, #3572. We should definitely do that as part of this PR. I think any method that has documentation in user/methods/ should be checked for this issue.

Also grepping through the documentation and source code for words like C++14 or C++17 (or even C++11...) might help uncover some other things that can be updated.

@shrit
Copy link
Member Author

shrit commented May 14, 2024

We should open another two issues for things we should do using C++17 features. I would prefer to merge this as it is and then we can think slowly of things to do later.

@rcurtin
Copy link
Member

rcurtin commented May 14, 2024

The issue is that then we have only half-transitioned to C++17, and then you have a codebase that is inconsistent (it's like having ifdefs for ancient Armadillo versions that are less than the required version). There are a handful of things that we wanted to do when we made this transition, so why not do them now? There's not a particular hurry to get this merged, and I don't think it will take that long anyway---the cleanups and refactorings should be pretty straightforward.

We also need to update the documentation where relevant to indicate that C++17 is the minimum required standard now.

shrit added 10 commits May 15, 2024 11:13
Signed-off-by: Omar Shrit <omar@avontech.fr>
Signed-off-by: Omar Shrit <omar@avontech.fr>
Signed-off-by: Omar Shrit <omar@avontech.fr>
Signed-off-by: Omar Shrit <omar@avontech.fr>
Signed-off-by: Omar Shrit <omar@avontech.fr>
Signed-off-by: Omar Shrit <omar@avontech.fr>
Signed-off-by: Omar Shrit <omar@avontech.fr>
Signed-off-by: Omar Shrit <omar@avontech.fr>
Signed-off-by: Omar Shrit <omar@avontech.fr>
Signed-off-by: Omar Shrit <omar@avontech.fr>
@shrit
Copy link
Member Author

shrit commented May 16, 2024

@rcurtin this is ready for review.

src/mlpack/base.hpp Outdated Show resolved Hide resolved
Copy link
Member

@rcurtin rcurtin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, everything here is looking good to me. I wasn't sure why the tests failed, so I built locally and dug in a bit. The comments I left detail what's wrong, it actually turned out to be a pretty simple issue.

I also went through all the methods I had documented to find more where I had written a bunch of Train() methods because I couldn't use std::optional. Here are the others I found:

  • AdaBoost
  • LinearRegression
  • LinearSVM
  • NaiveBayesClassifier
  • Perceptron

Happy to help out with the conversion process, just let me know. I think the new versions of the Train() functions could be simplified with the ternary operator; I left a comment with a suggestion.

The only other main thing on my list was documentation so that users know C++17 is required. I think these places need to be modified:

  • HISTORY.md (just a changelog note should be good)
  • section 2 of README.md (C++14 compiler -> C++17 compiler)
  • The R bindings specify C++14, but we should bump that in src/mlpack/bindings/R/mlpack/R/inline.R, to settings$env$USE_CXX17 (I think that will work)
  • The documentation for the R bindings in src/mlpack/bindings/R/mlpack/DESCRIPTION.in

That's just what I found by grepping, I think that should be everything but I'm not 100% sure.

src/mlpack/methods/hoeffding_trees/hoeffding_tree_impl.hpp Outdated Show resolved Hide resolved
shrit added 5 commits May 19, 2024 17:50
Signed-off-by: Omar Shrit <omar@avontech.fr>
Signed-off-by: Omar Shrit <omar@avontech.fr>
Signed-off-by: Omar Shrit <omar@avontech.fr>
Signed-off-by: Omar Shrit <omar@avontech.fr>
@shrit
Copy link
Member Author

shrit commented May 19, 2024

@rcurtin This is done on my side.
NaiveBayesClassifier can not be refactored, looking at the functions each Train function is different from the others.

LinearSVM Train functions were more complicated the Callbacks... are at the end of the function parameters and are shadowing the function calls, so I was only able to remove two functions, anything else will trigger an error.

@conradsnicta
Copy link
Contributor

@shrit What about detection of C++17 as mentioned in comment #3704 (comment) ?

@shrit
Copy link
Member Author

shrit commented May 20, 2024

@conradsnicta Yes I will add it, sorry, I forgot about it.
Not to mention that the tests are still failing

Copy link
Member

@rcurtin rcurtin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, everything is looking good to me. Just a couple comments throughout, hopefully they are helpful.

I took a look at NaiveBayesClassifier, it seems like we should be able to combine the first two Train() overloads; did I overlook a reason why that doesn't work?

HISTORY.md Outdated Show resolved Hide resolved
# C++14 is required
if (getRversion() < "4.2.0") settings$env$USE_CXX14 <- "yes"
# C++17 is required
if (getRversion() < "4.2.0") settings$env$USE_CXX17 <- "yes"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eddelbuettel @coatless can you sanity check this change here? And are there any other things than this (and the update to DESCRIPTION.in) that need to be updated if we are migrating to C++17?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is technically correct ... but inline is not something we use when we build the CRAN package it is simply a (very nice to have, now that we're header-only) convenience for users. We could just say "R (>= 4.3.0)" in DESCRIPTION. Either way works.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably should do both.

src/mlpack/methods/adaboost/adaboost.hpp Show resolved Hide resolved
src/mlpack/methods/hoeffding_trees/hoeffding_tree_impl.hpp Outdated Show resolved Hide resolved
@@ -268,7 +194,7 @@ class LinearRegression
ElemType Train(const MatType& predictors,
const ResponsesType& responses,
const WeightsType& weights,
const double lambda);
const std::optional<double> lambda = std::nullopt);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is one more overload below with intercept, can we combine that here too?

const double lambda,
const double delta);
const std::optional<double> lambda = std::nullopt,
const std::optional<double> delta = std::nullopt);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can tell it is also possible to combine this with the version that has callbacks below; I tested it locally. The signature is this:

  template<typename MatType,
           typename... CallbackTypes,
           typename = typename std::enable_if<IsEnsCallbackTypes<
               CallbackTypes...
           >::value>::type>
  ElemType Train(const MatType& data,
                 const arma::Row<size_t>& labels,
                 const size_t numClasses,
                 const std::optional<double> lambda = std::nullopt,
                 const std::optional<double> delta = std::nullopt,
                 const std::optional<bool> fitIntercept = std::nullopt,
                 CallbackTypes&&... callbacks);

Did you encounter any problems when trying that? It could allow us to remove another overload.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, most of the problems were in the tests, especially if you remove the overload with the optimiser parameter and the callbacks and then try using an optimizer and not adding callbacks then the compiler will not be able to find the correct one

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I tried the code above, I had no compilation issues in the tests. Do you have a particular test case that fails to compile? I can try and debug it.

@eddelbuettel
Copy link
Contributor

eddelbuettel commented May 21, 2024

R has no issues with C++17. It has been the default for the last two releases (conditional on a suitable compiler present). We should be able to remove all checks (at the minimal "cost" of no longer supporting ancient R 3.6.* or so).

The manual says that C++17 became the minimum with R 4.3.0 last year. C++17 as a standard is not uncommon at CRAN. (A few packages use C++20, that is much dicier because of compiler support).

The package I look after for $work has been C++17 for some time now (and our internal code [which the R package may at time consume via pre-made library blobs we make available] uses C++20 but we keep the API at C++17).

Copy link
Member

@rcurtin rcurtin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nearly there, I think from my side the only thing that remains is #3704 (comment) 👍

(I requested changes just to prevent any accidental merge before handling that)

@shrit
Copy link
Member Author

shrit commented May 22, 2024

Well the linear regression tests are failing, and GDB is not of a greater help 😄 , I need to compile with debug option to see what is really happening

shrit and others added 2 commits May 22, 2024 16:58
Co-authored-by: Ryan Curtin <ryan@ratml.org>
Co-authored-by: Ryan Curtin <ryan@ratml.org>
shrit and others added 7 commits May 23, 2024 15:27
Co-authored-by: Ryan Curtin <ryan@ratml.org>
Co-authored-by: Ryan Curtin <ryan@ratml.org>
Co-authored-by: Ryan Curtin <ryan@ratml.org>
Co-authored-by: Ryan Curtin <ryan@ratml.org>
Co-authored-by: Ryan Curtin <ryan@ratml.org>
Co-authored-by: Ryan Curtin <ryan@ratml.org>
@rcurtin
Copy link
Member

rcurtin commented May 23, 2024

I pushed a change that should fix the build; the compiler was using the deprecated version for some LinearRegression::Train() calls due to implicit conversions, so I made the deprecated versions "explicit" by using SFINAE to disable implicit conversions. It's an ugly workaround for the lack of explicit keyword for arbitrary member functions...

Copy link
Member

@rcurtin rcurtin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming the build passes, everything looks good to me. 👍

@conradsnicta conradsnicta merged commit 9389197 into mlpack:master May 24, 2024
19 of 20 checks passed
@shrit shrit deleted the cpp17 branch May 24, 2024 11:40
This was referenced May 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants