Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modify NSModel to use boost variant. #693

Merged
merged 6 commits into from Jun 20, 2016
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
4 changes: 2 additions & 2 deletions src/mlpack/methods/neighbor_search/neighbor_search.hpp
Expand Up @@ -28,7 +28,7 @@ namespace neighbor /** Neighbor-search routines. These include

// Forward declaration.
template<typename SortPolicy>
class NSModel;
class TrainVisitor;

/**
* The NeighborSearch class is a template class for performing distance-based
Expand Down Expand Up @@ -308,7 +308,7 @@ class NeighborSearch
bool treeNeedsReset;

//! The NSModel class should have access to internal members.
friend class NSModel<SortPolicy>;
friend class TrainVisitor<SortPolicy>;
}; // class NeighborSearch

} // namespace neighbor
Expand Down
210 changes: 189 additions & 21 deletions src/mlpack/methods/neighbor_search/ns_model.hpp
Expand Up @@ -13,12 +13,27 @@
#include <mlpack/core/tree/binary_space_tree.hpp>
#include <mlpack/core/tree/cover_tree.hpp>
#include <mlpack/core/tree/rectangle_tree.hpp>

#include <boost/variant.hpp>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add 'variant' to the list of Boost libraries we require in the documentation and in the root CMakeLists.txt?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a header only library, so we don't need to include it in CMakeList.txt.
I can see that all boost dependencies mentioned in the README file: program_options, math_c99, unit_test_framework, and serialization, are not header only. Should I include variant there anyway? even if it is header only?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, it is header only. I am not sure if it is possible to have a Boost installation where some of the header-only libraries (but not all) are available; I haven't played with b2 (Boost's build system) enough recently to be sure. If it's possible to have a Boost installation with some of the header-only libraries omitted, then we should include variant in there; otherwise, I don't think it's necessary. But I'm not sure whether or not that's possible...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After a quick search, it looks like there is no such option to omit some header files...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, looks like there is nothing special we need to do with our CMake configuration for this change then.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but we don't have to add boost::variant to required libraries?? Even if its header only we need to make sure the headers are present. Or is it the part of basic boost which is included in every boost package?? I mean if I install just boost::math in ubuntu with apt-get do I get boost::variant??

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see a note about this in boost's documentation: [ref]
"Other Packages
RedHat, Debian, and other distribution packagers supply Boost library packages, however you may need to adapt these instructions if you use third-party packages, because their creators usually choose to break Boost up into several packages, reorganize the directory structure of the Boost distribution, and/or rename the library binaries. If you have any trouble, we suggest using an official Boost distribution from SourceForge. "

I am not sure. But it looks like all partial packages depends on libboost-dev package, so they will finally include all headers files.
For example: libboost-program-options-dev --depends on--> libbooost-dev

#include "neighbor_search.hpp"

namespace mlpack {
namespace neighbor {

/**
* Alias template for euclidean neighbor search.
*/
template<typename SortPolicy,
template<typename TreeMetricType,
typename TreeStatType,
typename TreeMatType> class TreeType>
using NSType = NeighborSearch<SortPolicy,
metric::EuclideanDistance,
arma::mat,
TreeType,
TreeType<metric::EuclideanDistance,
NeighborSearchStat<SortPolicy>,
arma::mat>::template DualTreeTraverser>;

template<typename SortPolicy>
struct NSModelName
{
Expand All @@ -37,10 +52,162 @@ struct NSModelName<FurthestNeighborSort>
static const std::string Name() { return "furthest_neighbor_search_model"; }
};

/**
* MonoSearchVisitor executes a monochromatic neighbor search on the given
* NSType. We don't make any difference for different instantiations of NSType.
*/
class MonoSearchVisitor : public boost::static_visitor<void>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I hate to be picky, so I hope I am not giving too much work, but do you think you could add documentation for these classes and their members and member functions? I am not sure everyone who looks through this file will be familiar with the visitor paradigm; there is no need to explain what that is in your comments, but it may be useful to have comments along the lines of "MonoSearchVisitor executes a monochromatic neighbor search on the given NSType", or something like this. Maybe a few more words are useful to explain that better. :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I should add, if you don't want to or are busy doing other things, I can do the documentation here, it will only take a few minutes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rcurtin Sure!! I am here for this! I agree it can be confusing for someone with no knowledge of boost variants. I will add documentation.
Thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in: 6c2c3ca

{
private:
const size_t k;
arma::Mat<size_t>& neighbors;
arma::mat& distances;

public:
template<typename NSType>
void operator()(NSType* ns) const;

MonoSearchVisitor(const size_t k,
arma::Mat<size_t>& neighbors,
arma::mat& distances);
};

/**
* BiSearchVisitor executes a bichromatic neighbor search on the given NSType.
* We use template specialization to differenciate those tree types that
* accept leafSize as a parameter. In these cases, before doing neighbor search,
* a query tree with proper leafSize is built from the querySet.
*/
template<typename SortPolicy>
class BiSearchVisitor : public boost::static_visitor<void>
{
private:
const arma::mat& querySet;
const size_t k;
arma::Mat<size_t>& neighbors;
arma::mat& distances;
const size_t leafSize;

//! Bichromatic neighbor search on the given NSType considering the leafSize.
template<typename NSType>
void SearchLeaf(NSType* ns) const;

public:
//! Alias template necessary for visual c++ compiler.
template<template<typename TreeMetricType,
typename TreeStatType,
typename TreeMatType> class TreeType>
using NSTypeT = NSType<SortPolicy, TreeType>;

//! Default Bichromatic neighbor search on the given NSType instance.
template<template<typename TreeMetricType,
typename TreeStatType,
typename TreeMatType> class TreeType>
void operator()(NSTypeT<TreeType>* ns) const;

//! Bichromatic neighbor search on the given NSType specialized for KDTrees.
void operator()(NSTypeT<tree::KDTree>* ns) const;

//! Bichromatic neighbor search on the given NSType specialized for BallTrees.
void operator()(NSTypeT<tree::BallTree>* ns) const;

BiSearchVisitor(const arma::mat& querySet,
const size_t k,
arma::Mat<size_t>& neighbors,
arma::mat& distances,
const size_t leafSize);
};

/**
* TrainVisitor sets the reference set to a new reference set on the given
* NSType. We use template specialization to differenciate those tree types that
* accept leafSize as a parameter. In these cases, a reference tree with proper
* leafSize is built from the referenceSet.
*/
template<typename SortPolicy>
class TrainVisitor : public boost::static_visitor<void>
{
private:
arma::mat&& referenceSet;
size_t leafSize;

//! Train on the given NSType considering the leafSize.
template<typename NSType>
void TrainLeaf(NSType* ns) const;

public:
//! Alias template necessary for visual c++ compiler.
template<template<typename TreeMetricType,
typename TreeStatType,
typename TreeMatType> class TreeType>
using NSTypeT = NSType<SortPolicy, TreeType>;

//! Default Train on the given NSType instance.
template<template<typename TreeMetricType,
typename TreeStatType,
typename TreeMatType> class TreeType>
void operator()(NSTypeT<TreeType>* ns) const;

//! Train on the given NSType specialized for KDTrees.
void operator()(NSTypeT<tree::KDTree>* ns) const;

//! Train on the given NSType specialized for BallTrees.
void operator()(NSTypeT<tree::BallTree>* ns) const;

TrainVisitor(arma::mat&& referenceSet, const size_t leafSize);
};

/**
* SingleModeVisitor exposes the SingleMode method of the given NSType.
*/
class SingleModeVisitor : public boost::static_visitor<bool&>
{
public:
template<typename NSType>
bool& operator()(NSType* ns) const;
};

/**
* NaiveVisitor exposes the Naive method of the given NSType.
*/
class NaiveVisitor : public boost::static_visitor<bool&>
{
public:
template<typename NSType>
bool& operator()(NSType *ns) const;
};

/**
* ReferenceSetVisitor exposes the referenceSet of the given NSType.
*/
class ReferenceSetVisitor : public boost::static_visitor<const arma::mat&>
{
public:
template<typename NSType>
const arma::mat& operator()(NSType *ns) const;
};

/**
* DeleteVisitor deletes the given NSType instance.
*/
class DeleteVisitor : public boost::static_visitor<void>
{
public:
template<typename NSType>
void operator()(NSType *ns) const;
};

/**
* The NSModel class provides an easy way to serialize a model, abstracts away
* the different types of trees, and also reflects the NeighborSearch API.
*
* @tparam SortPolicy The sort policy for distances; see NearestNeighborSort.
*/
template<typename SortPolicy>
class NSModel
{
public:
//! Enum type to identify each accepted tree type.
enum TreeTypes
{
KD_TREE,
Expand All @@ -52,31 +219,27 @@ class NSModel
};

private:
//! Tree type considered for neighbor search.
TreeTypes treeType;

//! For tree types that accept the maxLeafSize parameter.
size_t leafSize;

// For random projections.
//! For random projections.
bool randomBasis;
arma::mat q;

template<template<typename TreeMetricType,
typename TreeStatType,
typename TreeMatType> class TreeType>
using NSType = NeighborSearch<SortPolicy,
metric::EuclideanDistance,
arma::mat,
TreeType,
TreeType<metric::EuclideanDistance,
NeighborSearchStat<SortPolicy>,
arma::mat>::template DualTreeTraverser>;

// Only one of these pointers will be non-NULL.
NSType<tree::KDTree>* kdTreeNS;
NSType<tree::StandardCoverTree>* coverTreeNS;
NSType<tree::RTree>* rTreeNS;
NSType<tree::RStarTree>* rStarTreeNS;
NSType<tree::BallTree>* ballTreeNS;
NSType<tree::XTree>* xTreeNS;
/**
* nSearch holds an instance of the NeigborSearch class for the current
* treeType. It is initialized every time BuildModel is executed.
* We access to the contained value through the visitor classes defined above.
*/
boost::variant<NSType<SortPolicy, tree::KDTree>*,
NSType<SortPolicy, tree::StandardCoverTree>*,
NSType<SortPolicy, tree::RTree>*,
NSType<SortPolicy, tree::RStarTree>*,
NSType<SortPolicy, tree::BallTree>*,
NSType<SortPolicy, tree::XTree>*> nSearch;

public:
/**
Expand All @@ -99,15 +262,19 @@ class NSModel
bool SingleMode() const;
bool& SingleMode();

//! Expose naiveMode.
bool Naive() const;
bool& Naive();

//! Expose leafSize.
size_t LeafSize() const { return leafSize; }
size_t& LeafSize() { return leafSize; }

//! Expose treeType.
TreeTypes TreeType() const { return treeType; }
TreeTypes& TreeType() { return treeType; }

//! Expose randomBasis.
bool RandomBasis() const { return randomBasis; }
bool& RandomBasis() { return randomBasis; }

Expand All @@ -123,11 +290,12 @@ class NSModel
arma::Mat<size_t>& neighbors,
arma::mat& distances);

//! Perform neighbor search.
//! Perform monochromatic neighbor search.
void Search(const size_t k,
arma::Mat<size_t>& neighbors,
arma::mat& distances);

//! Return a string representation of the current tree type.
std::string TreeName() const;
};

Expand Down