Skip to content

Conversation

@akleeman
Copy link
Contributor

This PR contains three different options that I've managed to get working each of which allow us to reuse some logic from a different class, effectively inheritance but not always as I'll describe below.

It's probably easiest to start with the first commit, understand it, then look at diffs between the subsequent commits to see what changed. The summary is:

I want to be able to create a model for some standard algorithm (Gaussian Processes, Least Squares, ...) but I want those models to be extensible. A good example is probably LeastSquares.

class LeastSquares : public ModelBase<LeastSquaresRegression> {
public:

  void fit_(const std::vector<Eigen::VectorXd> &features,
                const MarginalDistribution &targets) const {
    // Build the design matrix
    int m = static_cast<int>(features.size());
    int n = static_cast<int>(features[0].size());
    Eigen::MatrixXd A(m, n);
    for (int i = 0; i < m; i++) {
      A.row(i) = features[static_cast<std::size_t>(i)];
    }
    this->model_fit = {least_squares_solver(A, targets.mean)};
  }
}

There ^ we have a LeastSquares class which when given features which consist of row vectors in a design matrix it can perform least squares to estimate the coefficients. It uses the CRTP ModelBase class (see below) to introduce functionality based on which fit_ methods are defined. This LeastSquares class is useful if you already have the rows of the design matrix, but often that's not the case and you instead need to build the design matrix from something else. Take LinearRegression as an example,

class LinearRegression : public LeastSquares<LinearRegression> {

public:
  Eigen::VectorXd convert_feature(const double &feature) const {
    Eigen::VectorXd converted(2);
    converted << 1., feature;
    return converted;
  }

  void fit_(const std::vector<double> &features,
                    const MarginalDistribution &targets) const {
    std::vector<Eigen::VectorXd> rows;
    for (std::size_t i = 0; i < features.size(); ++i) {
        rows.push_back(convert_feature(features[i]));
    }
    fit_(rows, targets);
  }

}

If that worked (which it doesn't) we could then call fit which would first convert the features, then use the LeastSquares class to produce the fit.

Question

Now for the question: "How do we best organize CRTP classes to allow for this sort of shared functionality".

Here's a great summary of the problem and a couple options outlied here

Option 1: Cascading Inheritance

If A is the base class (ModelBase), B is the middle class (LeastSquares) and C is the final class ('LinearRegression) then we have can do something like:

template <typename Derived>
class A {};

template <typename Derived>
class B : public A<B<Derived>> {};

class C : public B<C> {};

This approach let's A inspect B and B inspect C so we can achieve what we need, but there are a few cons. This means that A is not the sole class in charge of performing CRTP style inspection and extension. See the link above for an example, but it effectively means that A needs to deal with all possible implementations of B and B needs to deal with all possible implementations of C. This sort of nested structure would make the LeastSquares class difficult to implement and would require redundant (and complicated) code shared between a LeastSquares class and a GaussianProcess class which each wanted to allow for subsequent specialization of which feature types are supported.

See commit c89563aab3f5.

Option 2: Transferred Inheritance

In this case we would have:

template <typename Derived>
class A {};

template <typename Derived>
class B : public A<Derived> {};

class C : public B<C> {};

Note that B now inherits from A<Derived> not A<B<Derived>>. This is nice because it let's all the CRTP inspection happen inside A which now has full access to Derived. The downside is that we loose a bit of safety, having B inherit from A<C> is a bit strange. As you can see in the code examples in option 1 I had a safety catch which prevented that sort of (often accidental) inheritance.

See commit e0414deec34.

Option 3: No inheritance

Avoid inheritance all together.

template <typename Derived>
class A {};

class B : public A<B> {};

class C : public A<C> {};

Here both B and C are straight forward CRTP classes. The plus side here is a much simpler structure, the downside is that if we want C to behave like B but with some extra functionality it would need to duplicate all functions available in B. If done right that could happen by calling a bunch of free functions ... but it could still involve quite a bit of boiler plate and make the final step in using these models less approachable.

See commit ae408c85579

Option 4

Any ideas?

Model Base

The CRTP base class ModelBase looks something like:

template <typename ModelType> class ModelBase {
private:
  // This is nice to have because it makes sure you don't
  // accidentally do something like:
  //     class A : public ModelBase<B>
  ModelBase() {};
  friend ModelType;

public:

  template <typename FeatureType>
  typename std::enable_if<has_possible_fit_impl<ModelType, FeatureType>::value, void>::type
  fit(const FeatureType &feature) {
    derived().fit_(feature);
  }

  /*
   * CRTP Helpers
   */
  ModelType &derived() { return *static_cast<ModelType *>(this); }
  const ModelType &derived() const {
    return *static_cast<const ModelType *>(this);
  }
};

@jbangelo
Copy link
Contributor

If I'm following things correctly, it would seem that Option 1 is the most powerful option.

I'm not sure if this would be a real use case, but if we designed the "middle" classes to be composable then you'd be able to easily chain together your different classes, or more simply mix and match them. Could you expand a bit on what the complicated code that would need to be shared between LeastSquares and GaussianProcess and other "middle" classes? Is that more than just the CRTP "forwarding" calls to the derived class? Could there be potential ways to reduce the shared code?

From a software architecture stand point it would also seem like Option 1 would be preferable. It is most like a linear inheritance tree. Option 2 isn't analogous to multiple inheritance since it's still linear, but the lack of ability for A to use B seems like it's begging to be misused. And Option 3 doesn't provide much of anything that is preferable from what I can see.

@akleeman
Copy link
Contributor Author

@jbangelo thanks for the input! I think I agree about Option 1. The negative point I mentioned about that option is this:

In ModelBase heremodel base needs to be aware that there may be a convert_ method to go from OtherType to FeatureType.

that on it's own isn't a problem, but the middle classes also need to be aware of that fact and need to do some inspection of the it's implementation class. See this method.

So basically, instead of having the ModelBase class aware that some subsequent implementation may provide a convert_ method you also need any layer in between to deal with that possibility. This means the CRTP logic could no longer be contained to one class ModelBase but would instead by required by anything but a final implementation.

@jbangelo
Copy link
Contributor

@akleeman I see what you mean. I think there could be ways to keep that boiler-plate code to a minimum, or maybe not if we want to allow multiple inheritance.

Again, I'm not sure if it makes sense to chain multiple "middle" classes in this situation. If it's not a real use case than option 2 might be the simpler path since it avoids requiring so much boiler plate code at each level.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants