Implement KDE #1301

Merged
merged 155 commits into from Jan 18, 2019
Changes from 99 commits
Commits
db059c4
Add a first implementation of KDE
robertohueso Mar 9, 2018
b99fddf
Merge branch 'master' into kde
robertohueso Mar 9, 2018
099ca31
Merge branch 'master' into kde
robertohueso Apr 6, 2018
245e3c3
Style fix
robertohueso Apr 6, 2018
f973e8d
Add KDE output to file
robertohueso Apr 6, 2018
28d0d76
Add KDE simple test
robertohueso Apr 7, 2018
5dcbcb8
Fix KDE dual-tree algorithm
robertohueso Apr 13, 2018
8908438
Avoid matrix copy in KDE main
robertohueso Apr 25, 2018
6302607
Delete unused variable
robertohueso Apr 26, 2018
164c22d
Delete leafSize parameter for KDE trees
robertohueso Apr 26, 2018
ba9f83d
Improve KDE API
robertohueso May 2, 2018
ead87a5
Handle FirstPointIsCentroid and RearrangesDataset
robertohueso May 3, 2018
df4e030
Fix tree building
robertohueso May 3, 2018
9fa129b
Fix uninitialized pointer
robertohueso May 4, 2018
2cca9f5
Implement relative error tolerance
robertohueso May 4, 2018
8398686
Implement Evaluate(Tree...)
robertohueso May 4, 2018
c1dedcc
Add methods to get and modify KDE parameters
robertohueso May 4, 2018
1bd1eec
Add KDE copy constructor
robertohueso May 4, 2018
ca36323
Add KDE operator=
robertohueso May 4, 2018
e9a652a
Add KDE move constructor
robertohueso May 5, 2018
731d8e8
Remove const requirement from KernelType
robertohueso May 10, 2018
3052579
Use unsafe_col to speed up KDE score
robertohueso May 10, 2018
d172dbf
Handle kernel and metric as KDE member objects
robertohueso May 10, 2018
c3dd7fa
Fix small mistake
robertohueso May 10, 2018
8382083
Add KDE custom kernel and metric constructor
robertohueso May 11, 2018
4824d06
Add KDE breadth-first support
robertohueso May 11, 2018
9c39afb
Fix constructor error
robertohueso Jul 8, 2018
4c9aaff
Add gaussian kernel support in KDE main
robertohueso Jul 8, 2018
9c39fc2
Merge branch 'master' into kde
robertohueso Jul 8, 2018
6b4733d
Fix KDE main typo
robertohueso Jul 8, 2018
18fb714
Add epanechnikov kernel support in KDE main
robertohueso Jul 8, 2018
8734d3c
Add brute force gaussian KDE algorithm
robertohueso Jul 16, 2018
88de6b2
Add gaussian KDE brute force test
robertohueso Jul 16, 2018
ca70157
Generic KDE brute force for all kernels
robertohueso Jul 17, 2018
8595033
Add KDE gaussian ball-tree test
robertohueso Jul 17, 2018
91ad4f7
Add duplicated reference value KDE test
robertohueso Jul 17, 2018
a58a72d
Add duplicated query value KDE test
robertohueso Jul 17, 2018
ae18fff
Add breadth-first KDE test
robertohueso Jul 17, 2018
3b5bd37
Add 1D KDE test
robertohueso Jul 17, 2018
cbee486
Handle empty reference dataset in KDE training
robertohueso Jul 17, 2018
cca7d36
Add empty reference dataset KDE test
robertohueso Jul 17, 2018
6123075
Handle dimension mismatch in KDE evaluation
robertohueso Jul 18, 2018
07f0df1
Add dimension mismatch KDE test
robertohueso Jul 18, 2018
e9efbd6
Handle empty querySet in KDE evaluation
robertohueso Jul 18, 2018
128e176
Add empty querySet KDE test
robertohueso Jul 18, 2018
be84c73
Assert KDE trees have not HasDuplicatedPoints
robertohueso Jul 20, 2018
2dce2ca
Assert KDE trees have UniqueNumDescendants
robertohueso Jul 20, 2018
d00bb33
Add KDEStat as a TreeStatType for KDE
robertohueso Jul 20, 2018
6ca5788
Add EvaluateKernel for KDE rules
robertohueso Jul 20, 2018
f25eb27
Improve KDE dual-tree score using stats
robertohueso Jul 20, 2018
4daecf6
Adjust existing code to KDEStat
robertohueso Jul 20, 2018
8e9573e
Add KDE default constructor
robertohueso Jul 21, 2018
8194527
Add KDE serialization method
robertohueso Jul 21, 2018
bebf37f
Add KDE serialization test
robertohueso Jul 21, 2018
8c0f61b
Prepare estimation vectors on KDE evaluate
robertohueso Jul 22, 2018
cdabad0
Add KDE documentation
robertohueso Jul 22, 2018
dcec680
Improve KDE error tolerance handling
robertohueso Jul 22, 2018
3dcec63
Improve KDE api to fit #1021
robertohueso Jul 22, 2018
dbc368b
Small simplification
robertohueso Jul 24, 2018
5533821
Normalize in KDE module
robertohueso Jul 26, 2018
09448c8
Add KDEModel a KDE api abstraction
robertohueso Jul 26, 2018
51ad93c
Add KDEModel to CMake
robertohueso Jul 26, 2018
b86ec2a
Rewrite KDE main to make use of KDEModel
robertohueso Jul 26, 2018
6002910
Add load/save KDE models
robertohueso Jul 27, 2018
7ad8022
Improve KDERules style
robertohueso Jul 27, 2018
fb0972e
Store centroids in KDEStat
robertohueso Jul 27, 2018
d39f121
Fix HasSelfChildren KDE and improve style
robertohueso Jul 27, 2018
071767d
Add openmp KDE optimization
robertohueso Jul 29, 2018
8d5729d
Improve KDE SerializationTest
robertohueso Sep 16, 2018
fc145c2
Merge branch 'master' into kde
robertohueso Sep 17, 2018
0c22d56
Reuse KDE evaluate
robertohueso Sep 17, 2018
473b38c
Fix style issue
robertohueso Sep 17, 2018
3635fb3
Delete unnecessary warning
robertohueso Sep 17, 2018
4343d09
Avoid copy reference matrix in KDE training
robertohueso Sep 17, 2018
15e0127
Improve KDE api to fit #1021
robertohueso Sep 18, 2018
e97d1bf
Fix memory leak in KDE main
robertohueso Sep 18, 2018
aa2e85b
Improve KDE model docs
robertohueso Sep 19, 2018
f0af9b4
Improve KDE main docs
robertohueso Sep 20, 2018
018ff13
Delete KDE main stdout option
robertohueso Sep 20, 2018
8148562
Delete normalization from KDE module
robertohueso Sep 21, 2018
a42ea53
Fix minor error
robertohueso Sep 21, 2018
02955e1
Add KDEModel visitor specialization
robertohueso Sep 21, 2018
7dbaf03
Add KDE Laplacian Kernel support
robertohueso Sep 21, 2018
e15ef14
Add KDE Spherical Kernel support
robertohueso Sep 21, 2018
e7b7b57
Add KDE Triangular Kernel support
robertohueso Sep 21, 2018
1d1b34a
Add KDE same set support
robertohueso Sep 23, 2018
89f11f8
Add monochromatic KDE main support
robertohueso Sep 23, 2018
ab3e1f0
Change default relative KDE error
robertohueso Sep 24, 2018
f6396da
Use custom traversal for KDE
robertohueso Sep 28, 2018
43849de
Add KDE Octree gaussian test
robertohueso Sep 28, 2018
a26dedd
Add KDE RTree gaussian test
robertohueso Sep 28, 2018
2415b11
Add KDE rules Cover tree support
robertohueso Oct 14, 2018
13de1a3
Add KDE StandardCoverTree gaussian test
robertohueso Oct 14, 2018
9cff9c5
Add KDE main support for Cover-tree, Octree and RTree
robertohueso Oct 15, 2018
843968a
Rewrite KDE dual-tree Score
robertohueso Oct 16, 2018
9b95d01
Improve centroid handling in KDEStat
robertohueso Oct 16, 2018
4654def
Add KDE main tests
robertohueso Oct 17, 2018
68bf18c
Add KDE main output size test
robertohueso Nov 6, 2018
3b1fe74
Add KDE main model reuse test
robertohueso Nov 7, 2018
d023b82
Implement KDE single tree score
robertohueso Nov 8, 2018
93c8191
Fix KDE serialization test evaluation
robertohueso Nov 8, 2018
1700394
Handle KDE kernel normalization using explicit specialization
robertohueso Nov 14, 2018
cc515f6
Add KDE main results without normalzation test
robertohueso Nov 19, 2018
d9a4dc6
Add KDE main results mono test
robertohueso Nov 20, 2018
56bfaa5
Add KDE timers
robertohueso Nov 20, 2018
1f5584b
Add some KDE log information
robertohueso Nov 20, 2018
01a8043
Fix KDE includes
robertohueso Dec 25, 2018
c4e0e1d
Merge master
robertohueso Dec 25, 2018
5ab6234
Fix style issues
robertohueso Dec 25, 2018
1d38bf4
Improve KDE log messages
robertohueso Dec 25, 2018
0bb023a
Refactor KDE main predictions output
robertohueso Dec 29, 2018
dee4468
Improve KDE predictions vector preparation
robertohueso Dec 30, 2018
e57fb1c
Improve KDE EmptyQuerySetTest
robertohueso Dec 30, 2018
c4f501b
Manage KDE normalizers using SFINAE
robertohueso Dec 30, 2018
972580f
Compute centroids in KDEStat constructor
robertohueso Jan 3, 2019
a3f1012
Save unnecessary calculations in KDE rules
robertohueso Jan 3, 2019
bd2e970
Rearrange KDE predictions on evaluation
robertohueso Jan 4, 2019
2465332
Improve KDE KernelNormalizer SFINAE
robertohueso Jan 4, 2019
323fd1c
Add KDE class single-tree support
robertohueso Jan 4, 2019
036a3a4
Unify all KDE constructors
robertohueso Jan 4, 2019
fc5bf64
Adapt KDE tests to the new constructor
robertohueso Jan 4, 2019
21e4b89
Adapt KDEModel to the new constructor
robertohueso Jan 4, 2019
83f3b11
Add KDEModel single-tree support
robertohueso Jan 4, 2019
213c207
Add KDEMain single-tree support
robertohueso Jan 4, 2019
4082f6f
Add GaussianSingleKDEBruteForceTest
robertohueso Jan 4, 2019
565c8ec
Add KDEGaussianSingleKDTreeResultsMain
robertohueso Jan 4, 2019
bb4b175
Fix computing_kde timer
robertohueso Jan 4, 2019
d9cb3ba
Add KDEMainInvalidKernel test
robertohueso Jan 5, 2019
84e81ae
Add KDEMainInvalidTree test
robertohueso Jan 5, 2019
82c3fb5
Add KDEMainInvalidAlgorithm test
robertohueso Jan 5, 2019
27d6d5e
Add KDEMainReferenceAndModel test
robertohueso Jan 5, 2019
1d12b6e
Improve KDE main docs
robertohueso Jan 5, 2019
cc74c27
Add KDEMainInvalidAbsoluteError test
robertohueso Jan 5, 2019
dd3a1f9
Add KDEMainInvalidRelativeError test
robertohueso Jan 5, 2019
83c5a4e
Add EpanechnikovCoverSingleKDETest test
robertohueso Jan 5, 2019
b9e26e2
Add EpanechnikovOctreeSingleKDETest test
robertohueso Jan 5, 2019
cf94b96
Fix KDE tests error tolerance
robertohueso Jan 6, 2019
7380f06
Fix KDE copy constructor
robertohueso Jan 6, 2019
cb45c43
Add KDE CopyConstructor test
robertohueso Jan 6, 2019
2398997
Change KDE template order
robertohueso Jan 6, 2019
658a05a
Adapt KDEModel to new KDE template order
robertohueso Jan 6, 2019
73050d6
Adapt KDE tests to new KDE template order
robertohueso Jan 6, 2019
0a588e3
Adapt KDE main tests to new KDE template order
robertohueso Jan 6, 2019
0ac2843
Add methods to get and modify KDE metric
robertohueso Jan 7, 2019
b6fee24
Fix KDE move constructor
robertohueso Jan 7, 2019
2381230
Add MoveConstructor KDE test
robertohueso Jan 7, 2019
76a1398
Check KDE is trained before evaluation
robertohueso Jan 7, 2019
e119f6d
Add NotTrained KDE test
robertohueso Jan 7, 2019
8cc5a03
Small KDE coding style improvements
robertohueso Jan 7, 2019
7bf036b
KDE style improvements
robertohueso Jan 7, 2019
bc39232
Remove KDE comment
robertohueso Jan 7, 2019
c191e9b
Update KDE author
robertohueso Jan 13, 2019
e3a5eee
Update KDE docs
robertohueso Jan 13, 2019
0d5bb46
Improve KDE docs
robertohueso Jan 16, 2019
20230ae
Improve KDE docs
robertohueso Jan 18, 2019
1 change: 1 addition & 0 deletions src/mlpack/methods/CMakeLists.txt
@@ -69,6 +69,7 @@ set(DIRS
sparse_coding
sparse_svm
svdplusplus
kde
)

foreach(dir ${DIRS})
23 changes: 23 additions & 0 deletions src/mlpack/methods/kde/CMakeLists.txt
@@ -0,0 +1,23 @@
# Define the files we need to compile.
# Anything not in this list will not be compiled into mlpack.
set(SOURCES
kde.hpp
kde_impl.hpp
kde_rules.hpp
kde_rules_impl.hpp
kde_stat.hpp
kde_model.hpp
kde_model_impl.hpp
)

# Add directory name to sources.
set(DIR_SRCS)
foreach(file ${SOURCES})
set(DIR_SRCS ${DIR_SRCS} ${CMAKE_CURRENT_SOURCE_DIR}/${file})
endforeach()
# Append sources (with directory name) to list of all mlpack sources (used at
# the parent scope).
set(MLPACK_SRCS ${MLPACK_SRCS} ${DIR_SRCS} PARENT_SCOPE)

add_cli_executable(kde)
add_python_binding(kde)
256 changes: 256 additions & 0 deletions src/mlpack/methods/kde/kde.hpp
@@ -0,0 +1,256 @@
/**
* @file kde.hpp
* @author Roberto Hueso (robertohueso96@gmail.com)
*
* Kernel Density Estimation.
*
* mlpack is free software; you may redistribute it and/or modify it under the
* terms of the 3-clause BSD license. You should have received a copy of the
* 3-clause BSD license along with mlpack. If not, see
* http://www.opensource.org/licenses/BSD-3-Clause for more information.
*/

#ifndef MLPACK_METHODS_KDE_KDE_HPP
#define MLPACK_METHODS_KDE_KDE_HPP

#include <mlpack/prereqs.hpp>
#include <mlpack/core/metrics/lmetric.hpp>
#include <mlpack/core/tree/binary_space_tree.hpp>

#include "kde_stat.hpp"

namespace mlpack {
namespace kde /** Kernel Density Estimation. */ {

/**
* The KDE class is a template class for performing Kernel Density Estimations.
* In statistics, kernel density estimation is a way to estimate the
* probability density function of a variable in a non-parametric way.
* This implementation performs this estimation using a tree-independent
* dual-tree algorithm. Details about this algorithm are available in KDERules.
*
* @tparam MetricType Metric to use for KDE calculations.
* @tparam MatType Type of data to use.
* @tparam KernelType Kernel function to use for KDE calculations.
* @tparam TreeType Type of tree to use; must satisfy the TreeType policy API.
*/
template<typename MetricType = mlpack::metric::EuclideanDistance,
typename MatType = arma::mat,
typename KernelType = kernel::GaussianKernel,
template<typename TreeMetricType,
typename TreeStatType,
typename TreeMatType> class TreeType = tree::KDTree,
template<typename RuleType> class DualTreeTraversalType =
TreeType<MetricType,
kde::KDEStat,
MatType>::template DualTreeTraverser>
class KDE
{
public:
//! Convenience typedef.
typedef TreeType<MetricType, kde::KDEStat, MatType> Tree;

/**
* Initialize KDE object with the default Kernel and Metric parameters.
* The relative error tolerance is initialized to 0.05 (5%), the absolute
* error tolerance to 0.0, and a depth-first approach is used.
*/
KDE();

/**
* Initialize KDE object using the default Metric parameters and a given
* Kernel bandwidth (<b>only for kernels that require a bandwidth and are
* constructed like kernel(bandwidth)</b>).
*
* @param bandwidth Bandwidth of the kernel.
* @param relError Relative error tolerance of the model.
* @param absError Absolute error tolerance of the model.
*/
KDE(const double bandwidth,
const double relError = 0.05,
const double absError = 0);

/**
* Initialize KDE object using custom instantiated Metric and Kernel objects.
*
* @param metric Instantiated metric object.
* @param kernel Instantiated kernel object.
* @param relError Relative error tolerance of the model.
* @param absError Absolute error tolerance of the model.
*/
KDE(MetricType& metric,
KernelType& kernel,
const double relError = 0.05,
const double absError = 0);

/**
* Construct KDE object as a copy of the given model. This may be
* computationally intensive!
*
* @param other KDE object to copy.
*/
KDE(const KDE& other);

/**
* Construct KDE object taking ownership of the given model.
*
* @param other KDE object to take ownership of.
*/
KDE(KDE&& other);

/**
* Copy a KDE model.
*
* Use std::move if the object to copy is no longer needed.
*
* @param other KDE model to copy.
*/
KDE& operator=(KDE other);

/**
* Destroy the KDE object. If this object created any trees, they will be
* deleted. If you created the trees then you have to delete them yourself.
*/
~KDE();

/**
* Trains the KDE model. It builds a tree using a reference set.
*
* Use std::move if the reference set is no longer needed.
*
* @param referenceSet Set of reference data.
*/
void Train(MatType referenceSet);

/**
* Trains the KDE model. Sets the reference tree to an already created tree.
*
* - If TreeTraits<TreeType>::RearrangesDataset is false then it is possible
* to use an empty oldFromNewReferences vector.
*
* @param referenceTree New already created reference tree.
[Review comment, Member] I'm not sure if "New already created reference tree" makes sense.
[Reply, Member Author] Tried to improve it in 20230ae :)

* @param oldFromNewReferences Permutations of reference points obtained
* during tree generation.
*/
void Train(Tree* referenceTree, std::vector<size_t>* oldFromNewReferences);

/**
* Estimate density of each point in the query set given the data of the
* reference set. The result is stored in an estimations vector.
* Estimations might not be normalized.
*
* - Dimension of each point in the query set must match the dimension of each
* point in the reference set.
*
* - Use std::move if the query set is no longer needed.
*
* @pre The model has to be previously trained.
* @param querySet Set of query points to get the density of.
* @param estimations Object which will hold the density of each query point.
*/
void Evaluate(MatType querySet, arma::vec& estimations);

/**
* Estimate density of each point in the query set given the data of an
* already created query tree. The result is stored in an estimations vector.
* Estimations might not be normalized.
*
* - Dimension of each point in the queryTree dataset must match the dimension
* of each point in the reference set.
*
* - Use std::move if the query tree is no longer needed.
*
* @pre The model has to be previously trained.
* @param queryTree Tree of query points to get the density of.
* @param oldFromNewQueries Mappings of query points to the tree dataset.
* @param estimations Object which will hold the density of each query point.
*/
void Evaluate(Tree* queryTree,
const std::vector<size_t>& oldFromNewQueries,
arma::vec& estimations);

/**
* Estimate density of each point in the reference set given the data of the
* reference set. It does not compute the estimation of a point with itself.
* The result is stored in an estimations vector. Estimations might not be
* normalized.
*
* @pre The model has to be previously trained.
* @param estimations Object which will hold the density of each reference
* point.
*/
void Evaluate(arma::vec& estimations);

//! Get the kernel.
const KernelType& Kernel() const { return *kernel; }

//! Modify the kernel.
KernelType& Kernel() { return *kernel; }

//! Get the reference tree.
Tree* ReferenceTree() { return referenceTree; }

//! Get relative error tolerance.
double RelativeError() const { return relError; }

//! Modify relative error tolerance (0 <= newError <= 1).
void RelativeError(const double newError);

//! Get absolute error tolerance.
double AbsoluteError() const { return absError; }

//! Modify absolute error tolerance (0 <= newError).
void AbsoluteError(const double newError);

//! Check whether reference tree is owned by the KDE model.
bool OwnsReferenceTree() const { return ownsReferenceTree; }

//! Check whether KDE model is trained or not.
bool IsTrained() const { return trained; }

//! Serialize the model.
template<typename Archive>
void serialize(Archive& ar, const unsigned int /* version */);

private:
//! Kernel.
KernelType* kernel;

//! Metric.
MetricType* metric;

//! Reference tree.
Tree* referenceTree;

//! Permutations of reference points.
std::vector<size_t>* oldFromNewReferences;

//! Relative error tolerance.
double relError;

//! Absolute error tolerance.
double absError;

//! If true, the KDE object is responsible for deleting the kernel.
bool ownsKernel;

//! If true, the KDE object is responsible for deleting the metric.
bool ownsMetric;

//! If true, the KDE object is responsible for deleting the reference tree.
bool ownsReferenceTree;

//! If true, the KDE object is trained.
bool trained;

//! Check whether absolute and relative error values are compatible.
void CheckErrorValues(const double relError, const double absError) const;
};

} // namespace kde
} // namespace mlpack

// Include implementation.
#include "kde_impl.hpp"

#endif // MLPACK_METHODS_KDE_KDE_HPP
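
For readers checking what `Evaluate(querySet, estimations)` is meant to approximate, the following is a minimal brute-force sketch in plain C++ with no mlpack or Armadillo dependency. It is not code from this PR. `BruteForceKDE`, `GaussianKernelValue`, `SquaredDistance`, `WithinTolerance` and the `Points` alias are hypothetical names introduced only for illustration. The sketch assumes an unnormalized Gaussian kernel of the form exp(-d^2 / (2 h^2)), averages the kernel sum over the reference points, and assumes the tolerance check is `relError * exact + absError`; none of these choices is confirmed by the header above, which only states that estimations might not be normalized.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// One point per inner vector; all points are expected to share one dimension.
// (Hypothetical alias; the PR itself works with arma::mat.)
using Points = std::vector<std::vector<double>>;

// Unnormalized Gaussian kernel value for a squared distance.
// Assumed form: exp(-d^2 / (2 * h^2)), with h the bandwidth.
double GaussianKernelValue(const double squaredDistance, const double bandwidth)
{
  return std::exp(-0.5 * squaredDistance / (bandwidth * bandwidth));
}

// Squared Euclidean distance between two points of equal dimension.
double SquaredDistance(const std::vector<double>& a,
                       const std::vector<double>& b)
{
  assert(a.size() == b.size()); // Dimension mismatch is a caller error.
  double sum = 0.0;
  for (std::size_t d = 0; d < a.size(); ++d)
  {
    const double diff = a[d] - b[d];
    sum += diff * diff;
  }
  return sum;
}

// Exact O(queries * references) estimate: for every query point, sum the
// kernel over all reference points and divide by the number of references.
// Dividing by the reference count is an assumption of this sketch.
std::vector<double> BruteForceKDE(const Points& references,
                                  const Points& queries,
                                  const double bandwidth)
{
  assert(!references.empty());
  assert(bandwidth > 0.0);

  std::vector<double> estimations(queries.size(), 0.0);
  for (std::size_t i = 0; i < queries.size(); ++i)
  {
    for (std::size_t j = 0; j < references.size(); ++j)
    {
      estimations[i] += GaussianKernelValue(
          SquaredDistance(queries[i], references[j]), bandwidth);
    }
    estimations[i] /= static_cast<double>(references.size());
  }
  return estimations;
}

// Assumed tolerance check: the approximation may deviate from the exact
// value by at most relError * exact + absError.
bool WithinTolerance(const double approx,
                     const double exact,
                     const double relError,
                     const double absError)
{
  return std::abs(approx - exact) <= relError * exact + absError;
}
```

A tree-based result could then be compared point by point against `BruteForceKDE` using `WithinTolerance` with the same `relError` and `absError` that were passed to the model, which is roughly the role the brute-force tests in the commit list appear to play.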