Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace lapack with linalg in MahalanobisDistance #4085

Merged
merged 9 commits into from
Feb 19, 2018
Merged
Show file tree
Hide file tree
Changes from 7 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
1 change: 0 additions & 1 deletion cmake/FindMetaExamples.cmake
Original file line number Diff line number Diff line change
Expand Up @@ -35,7 +35,6 @@ function(get_excluded_meta_examples)

IF(NOT HAVE_LAPACK)
LIST(APPEND EXCLUDED_META_EXAMPLES
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please keep the block, so we can add things there that pop up in the future, just don't append anything

distance/mahalanobis.sg
)
ENDIF()

Expand Down
29 changes: 13 additions & 16 deletions src/shogun/distance/MahalanobisDistance.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -7,14 +7,12 @@

#include <shogun/lib/config.h>

#ifdef HAVE_LAPACK

#include <shogun/lib/common.h>
#include <shogun/io/SGIO.h>
#include <shogun/distance/MahalanobisDistance.h>
#include <shogun/features/Features.h>
#include <shogun/io/SGIO.h>
#include <shogun/lib/common.h>
#include <shogun/mathematics/Math.h>
#include <shogun/mathematics/lapack.h>
#include <shogun/mathematics/linalg/LinalgNamespace.h>

using namespace shogun;

Expand All @@ -38,21 +36,25 @@ CMahalanobisDistance::~CMahalanobisDistance()
bool CMahalanobisDistance::init(CFeatures* l, CFeatures* r)
{
CRealDistance::init(l, r);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this call covers all the checks, so we don't need them again in this call (just convolutes the code)
@vigsterkr both REAL and DENSE is checked. That was the whole point of having the base class here iirc

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would do a cast after the call and add a comment

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i would actually change the design of this and have instead of init returning a boolean the actual typecasted values available... as i dont see really the point of an assertation failure that will just throw an exception and in case of success have a true :)
this function call should actually tell us that yes the types are cool and here you go this is what you wanted anyways

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vinx13 you can ignore most of this... just make sure that you do an ASSERT(CRealDistance::init(l, r)); and add a //FIXME: above with reference to the above comment

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vigsterkr I mean this whole type system in the features is messed up. I would just rely on dynamic_cast everywhere. The bool is also pointless. And the enums are even worse (too many types, I would make those strings or even better drop them)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of the REQUIRED sections it should be just check the return type of CRealDistance::init(l, r) as it checks for everything you need want....


SGMatrix<float64_t> cov;

if ( l == r)
{
mean = ((CDenseFeatures<float64_t>*) l)->get_mean();
icov = ((CDenseFeatures<float64_t>*) l)->get_cov();
cov = ((CDenseFeatures<float64_t>*)l)->get_cov();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while you are changing these lines, could you please replace these explicit casts to dynamic_cast at least plz? :)
or even better do a REQUIRED assertation for the type

}
else
{
mean = ((CDenseFeatures<float64_t>*)l)
->compute_mean((CDotFeatures*)lhs, (CDotFeatures*)rhs);
icov = CDotFeatures::compute_cov((CDotFeatures*) lhs, (CDotFeatures*) rhs);
cov = CDotFeatures::compute_cov((CDotFeatures*)lhs, (CDotFeatures*)rhs);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

another casting

}

SGMatrix<float64_t>::inverse(icov);
auto num_features = cov.num_rows;
chol_cov_L = SGMatrix<float64_t>(num_features, num_features);
chol_cov_d = SGVector<float64_t>(num_features);
chol_cov_p = SGVector<index_t>(num_features);
linalg::ldlt_factor(cov, chol_cov_L, chol_cov_d, chol_cov_p);

return true;
}
Expand Down Expand Up @@ -83,12 +85,8 @@ float64_t CMahalanobisDistance::compute(int32_t idx_a, int32_t idx_b)
for (int32_t i=0; i < diff.vlen; i++)
diff[i] = bvec.vector[i] - diff[i];

SGVector<float64_t> v = diff.clone();
cblas_dgemv(CblasColMajor, CblasNoTrans,
icov.num_rows, icov.num_cols, 1.0, icov.matrix,
diff.vlen, diff.vector, 1, 0.0, v.vector, 1);

float64_t result = cblas_ddot(v.vlen, v.vector, 1, diff.vector, 1);
auto v = linalg::ldlt_solver(chol_cov_L, chol_cov_d, chol_cov_p, diff);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah this is much much from a numerical perspective!

float64_t result = linalg::dot(v, diff);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this could be auto or? :)


if (!use_mean)
((CDenseFeatures<float64_t>*) lhs)->free_feature_vector(avec, idx_a);
Expand All @@ -110,4 +108,3 @@ void CMahalanobisDistance::init()
m_parameters->add(&use_mean, "use_mean", "If distance shall be computed between mean vector and vector from rhs or between lhs and rhs.");
}

#endif /* HAVE_LAPACK */
76 changes: 40 additions & 36 deletions src/shogun/distance/MahalanobisDistance.h
Original file line number Diff line number Diff line change
Expand Up @@ -10,44 +10,45 @@

#include <shogun/lib/config.h>

#ifdef HAVE_LAPACK

#include <shogun/lib/common.h>
#include <shogun/distance/RealDistance.h>

namespace shogun
{
/** @brief class MahalanobisDistance
*
* The Mahalanobis distance for real valued features computes the distance
* between a feature vector and a distribution of features characterized by its
* mean and covariance.
*
* \f[\displaystyle
* D = \sqrt{ (x_i - \mu)^T \Sigma^{-1} (x_i - \mu) }
* \f]
*
* The Mahalanobis Squared distance does not take the square root:
*
* \f[\displaystyle
* D = (x_i - \mu)^T \Sigma^{-1} (x_i - \mu)
* \f]
*
* If use_mean is set to false (which it is by default) the distance is computed
* as
*
* \f[\displaystyle
* D = \sqrt{ (x_i - x_i')^T \Sigma^{-1} (x_i - x_i') }
* \f]
*
* i.e., instead of the mean as reference two vector \f$x_i\f$ and \f$x_i'\f$
* are compared.
*
* @see <a href="en.wikipedia.org/wiki/Mahalanobis_distance">
* Wikipedia: Mahalanobis Distance</a>
*/
class CMahalanobisDistance: public CRealDistance
{
/** @brief class MahalanobisDistance
*
* The Mahalanobis distance for real valued features computes the distance
* between a feature vector and a distribution of features characterized by
* its
* mean and covariance.
*
* \f[\displaystyle
* D = \sqrt{ (x_i - \mu)^T \Sigma^{-1} (x_i - \mu) }
* \f]
*
* The Mahalanobis Squared distance does not take the square root:
*
* \f[\displaystyle
* D = (x_i - \mu)^T \Sigma^{-1} (x_i - \mu)
* \f]
*
* If use_mean is set to false (which it is by default) the distance is
* computed
* as
*
* \f[\displaystyle
* D = \sqrt{ (x_i - x_i')^T \Sigma^{-1} (x_i - x_i') }
* \f]
*
* i.e., instead of the mean as reference two vector \f$x_i\f$ and
* \f$x_i'\f$
* are compared.
*
* @see <a href="http://en.wikipedia.org/wiki/Mahalanobis_distance">
* Wikipedia: Mahalanobis Distance</a>
*/
class CMahalanobisDistance : public CRealDistance
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW why this reformatting?
Pls don't put this into a PR that changes small bits as this one. Makes everything much harder to review

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems the style checker changed other codes than mine. I'm using command like git clang-format-3.8 --commit xxx , is it right ?

{
public:
/** default constructor */
CMahalanobisDistance();
Expand Down Expand Up @@ -141,10 +142,13 @@ class CMahalanobisDistance: public CRealDistance

/** vector mean of the lhs feature vectors */
SGVector<float64_t> mean;
/** inverse of the covariance matrix of lhs feature vectors */
SGMatrix<float64_t> icov;

/** LDLT decomposition of the covariance matrix of lhs feature vectors
*/
SGMatrix<float64_t> chol_cov_L;
SGVector<float64_t> chol_cov_d;
SGVector<index_t> chol_cov_p;
};

} // namespace shogun
#endif /* HAVE_LAPACK */
#endif /* _MAHALANOBISDISTANCE_H__ */
33 changes: 33 additions & 0 deletions tests/unit/distance/MahalanobisDistance_unittest.cc
Original file line number Diff line number Diff line change
@@ -0,0 +1,33 @@
/*
* This software is distributed under BSD 3-clause license (see LICENSE file).
*
* Authors: Wuwei Lin
*/

#include <gtest/gtest.h>
#include <shogun/base/some.h>
#include <shogun/distance/MahalanobisDistance.h>
#include <shogun/features/DenseFeatures.h>
#include <shogun/lib/common.h>

using namespace shogun;

TEST(MahalanobisDistance, compute_distance)
{
// create four points (0,0) (2,10) (2,8) (5,5)
SGMatrix<float64_t> rect(2, 4);
rect(0, 0) = 0;
rect(1, 0) = 0;
rect(0, 1) = 2;
rect(1, 1) = 10;
rect(0, 2) = 2;
rect(1, 2) = 8;
rect(0, 3) = 5;
rect(1, 3) = 5;

auto feature = some<CDenseFeatures<float64_t>>(rect);
auto distance = some<CMahalanobisDistance>(feature, feature);
EXPECT_NEAR(distance->distance(1, 1), 0.0, 1e-10);
EXPECT_NEAR(distance->distance(1, 3), 2.63447126986, 1e-10);
EXPECT_NEAR(distance->distance(2, 3), 2.22834405812, 1e-10);
}