Commit

Documentation for stats dist objects with image problems
eduardojsbarroso committed Aug 4, 2021
1 parent 5c3ee90 commit 70c66ae
Showing 6 changed files with 80 additions and 63 deletions.
4 changes: 3 additions & 1 deletion docs/Makefile.am
@@ -134,7 +134,9 @@ IGNORE_HFILES= \
# Images to copy into HTML directory.
# e.g. HTML_IMAGES=$(top_srcdir)/gtk/stock-icons/stock_about_24.png
HTML_IMAGES= \
$(srcdir)/images/spline_func_knots_evolution.png
$(srcdir)/images/spline_func_knots_evolution.png \
$(srcdir)/images/vkde.png \
$(srcdir)/images/kde.png

# Extra SGML files that are included by $(DOC_MAIN_SGML_FILE).
# e.g. content_files=running.sgml building.sgml changes-2.0.sgml
Binary file added docs/images/kde.png
Binary file added docs/images/vkde.png
59 changes: 33 additions & 26 deletions numcosmo/math/ncm_stats_dist.c
@@ -28,66 +28,73 @@
/**
* SECTION:ncm_stats_dist
* @title: NcmStatsDist
* @short_description: Abstract class for implementing N dimensional probability distributions
* @short_description: Abstract class for implementing N-dimensional probability distributions.
*
* Abstract class to reconstruct an arbitrary N dimensional probability distribution.
* Abstract class to reconstruct an arbitrary N-dimensional probability distribution.
* This class provides the tools to perform a radial basis interpolation
* in a multidimensional function using a radial basis function, and then
* in a multidimensional function using a radial basis function and then
* generates a new sample using the interpolation function as the kernel.
* This method generates a sample that is distributed by the original distribution,
* but in a more simple way, since the used kernels are easier to sample from.
* For more informations about radial basis interpolation,
* but in a more simple way since the used kernels are easier to sample from.
* For more information about radial basis interpolation,
* check [[Radial Basis Function Interpolation, Wilna du Toit](https://core.ac.uk/download/pdf/37320748.pdf)].
* A brief description of the radial basis interpolation method can be found below.
*
* Given a d-dimensional function $g(x): \mathbf{R}^d \rightarrow \mathbf{R}$, a radial basis
* function $\phi(x, \Sigma)$ is used such that
* \begin{align}
* \label{Interpolation_eq}
* s(x) = \sum_i^n \lambda \phi(|x-x_i|, \Sigma_i), \quad x~ \in~ \mathbf{R}
* s(x) = \sum_{i=1}^n \lambda_i \phi(|x-x_i|, \Sigma_i), \quad x~ \in~ \mathbf{R}^d
*. \end{align}
* The matrix lambda is found such that
* The variables $\lambda_i$ represent the weights and are found such that
\begin{align}
* \label{eqnnls}
* \label{eqnnls1}
* s(x_i) = g(x_i)
*, \end{align}
* where $x_i$ are the sample points.
* The values generated by $\phi(|x-x_i|, \Sigma_i)$ generate a symmetric $n \times n$ matrix using a radial basis function.
* This function depend on the norm of the points and on the covariance matrix $\Sigma$ associated to each point.
* Once the Lambda matrix is found, one may use $s(x)$ to sample values from $g(x)$, which is easier to do since $s(x)$ is
* The values generated by $\phi(|x-x_i|, \Sigma_i)$ are displayed in a symmetric $n \times n$ matrix $\Phi$.
* This function depends on the norm of the points and on the covariance matrix $\Sigma$ associated with each point.
* The weights $\lambda_i$ are also organised in a matrix representation such that equation \eqref{eqnnls1} becomes
* \begin{align}
* \label{eqnnls}
* G = \lambda \times \Phi
* ,\end{align}
* where $G$ is a matrix containing all the function values $g(x_i)$. Once the $\lambda$ matrix is found,
* one may use $s(x)$ to sample values from $g(x)$, which is easier to do since $s(x)$ is
* a polynomial function.
*
* We want $s(x)$ to be a probability distribution so we can sample from it. Therefore the Lambda matrix containing the
* weights is seen as the probability density and it must be minimized such that its values are always positive and sum up to one. To solve this problem,
* this algorithm has the tools to solve equation \eqref{eqnnls} for $\lambda$, which is a least squares problem,
* using the NNLS method, which can be found in nnls.c file. Thus, the algorithm is able to randomly choose a kernel $\phi(|x-x_i|, \Sigma_i)$ associated
* this algorithm has the tools to solve equation \eqref{eqnnls} for $\lambda$, which is a least-squares problem,
* using the NNLS method, which can be found in the nnls.c file. Thus, the algorithm can randomly choose a kernel $\phi(|x-x_i|, \Sigma_i)$ associated
* to a probability contained in $\lambda$ and sample a point from it.
*
*
* In this file, the radial basis interpolation function is not defined. One must choose one of the instances of the class, the
* #NcmStatsDistKDEStudentt object or the #NcmStatsDistKDEGauss object, which use a multivariate Student's t function and a Gaussian function respectively.
* In this object, the radial basis interpolation function is not completely defined. One must choose one of the instances of the class, the
* #NcmStatsDistKDEStudentt object or the #NcmStatsDistKDEGauss object, which use a multivariate Student's t function and a Gaussian function, respectively, as the kernel.
* After initializing the desired object for the interpolation function, one may use the methods of this file to generate the interpolation and to
* sample from the new interpolated function.
*
* The user must provide input the values: @over_smooth - ncm_stats_dist_set_over_smooth(), @split_frac - ncm_stats_dist_set_split_frac(),
* The user must provide the input values: @over_smooth - ncm_stats_dist_set_over_smooth(), @split_frac - ncm_stats_dist_set_split_frac(),
* $v(x)$ - ncm_stats_dist_prepare_interp(). The other parameters
* must be inserted when the instance for the #NcmStatsDistKDE or the #NcmStatsDistVKDE object is initialized. To perform a calculation of this class, one
* needs to initialize the class within one of its children (#NcmStatsDistGauss or #NcmStatsDistST), along with the input of a child object of the class
* #NcmStatsDistKernel. For more information about the algorithm, see the description below.
*
* -Since this class does not define what type of kernel will be used in the calculation (the fixed kernel in the #NcmStatsDistKDE class or the variable kernel in #NcmStatsDistVKDE class),
* one cannot comput the sample just using this instance. Also, it must be provided the function to be used as the kernel, which are implemented in the childs from the class #NcmStatsDistKernel.
* When initializing the #NcmStatsDistKDE or #NcmStatsDistVKDE classes, the function to be used as the kernel is defined in the object initialization function.
* -Since this class does not define what type of kernel will be used in the calculation (the fixed kernel in the #NcmStatsDistKDE class or the variable kernel in #NcmStatsDistVKDE class),
* one cannot compute the sample using this instance alone. The function to be used as the kernel must also be provided; it is implemented in the children of the class #NcmStatsDistKernel.
* When initializing the #NcmStatsDistKDE or #NcmStatsDistVKDE classes, the function to be used as the kernel is defined in the object initialization function.
*
* -This class also needs a child object to compute the interpolation matrix $IM$ and the covariance matrices stored in $cov_decomp$ to perform the interpolation,
* which are kernel dependent and therefore also computed by the class child objects.
* -This class also needs a child object to compute the interpolation matrix $IM$ and the covariance matrices stored in @cov_decomp to perform the interpolation,
* which is kernel dependent and therefore also computed by the class child objects.
* -Regarding the kernel types based on the radial basis function, $\phi(|x-x_i|)$, and how the sample points in ncm_stats_dist_sample are generated, see the different implementations of #NcmStatsDistKernel, e.g., #NcmStatsDistKernelGauss and #NcmStatsDistKernelST
* -Regarding the kernel types based on the radial basis function, $\phi(|x-x_i|)$, and how the sample points in ncm_stats_dist_sample() are generated,
* see the different implementations of #NcmStatsDistKernel, e.g., #NcmStatsDistKernelGauss and #NcmStatsDistKernelST
*
* -Regarding how the functions ncm_stats_dist_eval_weights() and ncm_stats_dist_eval_weights_m2lnp() are implemented, see
* the different implementations of #NcmStatsDist, i.e., #NcmStatsDistKDE and #NcmStatsDistVKDE. These objects also
* compute the covariance matrix of each sample point and other objects needed for the least squares problem, when
* computing the weights matrix ($\lambda$).
* -Regarding how the functions ncm_stats_dist_eval() and ncm_stats_dist_eval_m2lnp() are implemented, see
* the different implementations of #NcmStatsDist, i.e., #NcmStatsDistKDE and #NcmStatsDistVKDE. These objects also
* compute the covariance matrix of each sample point and other objects needed for the least-squares problem, when
* computing the weights matrix ($\lambda$).
*
*/

31 changes: 18 additions & 13 deletions numcosmo/math/ncm_stats_dist_kde.c
@@ -28,28 +28,29 @@
/**
* SECTION:ncm_stats_dist_kde
* @title: NcmStatsDistKDE
* @short_description: Abstract class for implementing N dimensional probability distributions with a fixed density estimator kernel.
* @short_description: Abstract class for implementing N-dimensional probability distributions with a fixed density estimator kernel.
*
* Abstract object to reconstruct an arbitrary N dimensional probability distribution.
* Abstract object to reconstruct an arbitrary N-dimensional probability distribution.
* This object provides the complementary tools to perform a radial basis interpolation
* in a multidimensional function using the #NcmStatsDist class.
*
* This object sets the kernel $\phi$ to be used in the radial basis interpolation. This object also implements some
* calculations needed in the #NcmStatsDist class, such as: the covariance matrix of the whole sample and its cholesky decomposition,
* calculations needed in the #NcmStatsDist class, such as the covariance matrix of the whole sample and its Cholesky decomposition,
* the preparation of the interpolation matrix $IM$, the kernel normalization factor, and given a sample vector $\vec{x}$, the distribution
* evaluated in these points. Some of these calculations are explained below.
*
* The #NcmStatsDistKDE class uses one covariance matrix for all the sample points. So, given $n$ points, there is only
* one covariance matrix $\Sigma$ that is used for all the i\textit{th} kernels $\phi(|x-x_i|, \Sigma)$. After the covariance
* matrix is computed, the algorithm computes the cholesky decomposition, that is
* one covariance matrix $\Sigma$ that is used for all the i$th$ kernels $\phi(|x-x_i|, \Sigma)$. After the covariance
* matrix is computed, the algorithm computes the Cholesky decomposition, that is
* \begin{align}
* \Sigma &= AA^T
* ,\end{align}
* where $A$ is a triangular positive-definite matrix and $A^T$ is its transpose. The $A$ matrix is used in the least-squares
* calculation method that is called in the #NcmStatsDist class.
*
* The object also prepares the interpolation matrix to be implemented in the least squares problem, that is, given the relation
* \left[\begin{array}{cccc}
*
* The object also prepares the interpolation matrix to be implemented in the least-squares problem, that is, given the relation
* $\left[\begin{array}{cccc}
* \phi\left(\left\|\mathbf{x}_{1}-\mathbf{x}_{1}\right\|\right) & \phi\left(\left\|\mathbf{x}_{2}-\mathbf{x}_{1}\right\|\right) & \ldots & \phi\left(\left\|\mathbf{x}_{n}-\mathbf{x}_{1}\right\|\right) \\
* \phi\left(\left\|\mathbf{x}_{1}-\mathbf{x}_{2}\right\|\right) & \phi\left(\left\|\mathbf{x}_{2}-\mathbf{x}_{2}\right\|\right) & \ldots & \phi\left(\left\|\mathbf{x}_{n}-\mathbf{x}_{2}\right\|\right) \\
* \vdots & \vdots & & \vdots \\
@@ -60,13 +61,13 @@
* \vdots \\
* \lambda_{n}
* \end{array}\right]=\left[\begin{array}{c}
* f_{1} \\
* f_{2} \\
* g_{1} \\
* g_{2} \\
* \vdots \\
* f_{n}
* ,\end{array}\right]
* this object prepares the first matrix for all the $n$ points in the sample, using the covariance matrix and the defined kernel.
* The #NcmStatsDist class implements the solution for this relation and then one is able to compute the distribution for a given
* g_{n}
* ,\end{array}\right]$
* which is explained in the #NcmStatsDist class, this object prepares the first matrix for all the $n$ points in the sample, using the covariance matrix and the defined kernel.
* The #NcmStatsDist class implements the solution for this relation and then one can compute the distribution for a given
* vector $\vec{x}$ using a method of the #NcmStatsDist class but that is implemented in this object.
*
*
@@ -561,3 +562,7 @@ ncm_stats_dist_kde_get_nearPD_maxiter (NcmStatsDistKDE *sdkde)
return self->nearPD_maxiter;
}

/** ![an inline image](kde.png)
*
* <inlinegraphic fileref="kde.png" format="PNG" scale="98" align="right"/>
**/
49 changes: 26 additions & 23 deletions numcosmo/math/ncm_stats_dist_vkde.c
@@ -28,30 +28,30 @@
/**
* SECTION:ncm_stats_dist_vkde
* @title: NcmStatsDist
* @short_description: Abstract class for implementing N dimensional probability distributions with a variable density estimator kernel.
* @short_description: Abstract class for implementing N-dimensional probability distributions with a variable density estimator kernel.
*
* Abstract object to reconstruct an arbitrary N dimensional probability distribution.
* This object provides the complementary tools to perform a radial basis interpolation
* in a multidimensional function using the #NcmStatsDist class.
*
* This object sets the kernel $\phi$ to be used in the radial basis interpolation. This object also implements some
* calculations needed in the #NcmStatsDist class, such as: the covariance matrix of the whole sample and its cholesky decomposition,
* the preparation of the interpolation matrix $IM$, the kernel normalization factor, and given a sample vector $\vec{x}$, the distributio
* evaluated in these points. Some of these calculations are explained below.
*
* The #NcmStatsDistVKDE uses a different covariance matrix for each sample point. This feature is computed
* in the ncm_stats_dist_vkde_prepare_kernel() function. In this algorithm, one should define the @local_frac parameter, that is,
* the fraction of nearest sample points that will be used to compute each covariance matrix of each
* sample point. This is done by calling the function ncm_stats_dist_vkde_set_local_frac().
* The rest of the calculation follows the same procedure as the #NcmStatsDist and #NcmStatsDistKDE objects,
* using now a different covariance matrix and normalization factor for each kernel. For more information about
* how the #NcmStatsDist class works, check #NcmStatsDist and #NcmStatsDistKDE objects.
*
* The user must provide input the values: @sdk, @CV_type - ncm_stats_dist_vkde_new(), @y - ncm_stats_dist_add_obs(), @split_frac - ncm_stats_dist_set_split_frac(),
* @over_smooth - ncm_stats_dist_set_over_smooth(), @local_Frac - ncm_stats_dist_vkde_set_local_frac(), $v(x)$ - ncm_stats_dist_prepare_interp().
* To see an example of how to use this object and the main functions that are called within each function, check the fluxogram at the end of this documentation,
* where the order of the functions that should be called by the user and some of the functions that the algorithm calls.
*/
* Abstract object to reconstruct an arbitrary N-dimensional probability distribution.
* This object provides the complementary tools to perform a radial basis interpolation
* in a multidimensional function using the #NcmStatsDist class.
*
* This object sets the kernel $\phi$ to be used in the radial basis interpolation. This object also implements some
* calculations needed in the #NcmStatsDist class, such as the covariance matrices of the sample points and their Cholesky decompositions,
* the preparation of the interpolation matrix $IM$, the kernel normalization factors, and given a sample vector $\vec{x}$, the distribution
* evaluated in these points. Some of these calculations are explained below.
*
* The #NcmStatsDistVKDE uses a different covariance matrix for each sample point. This feature is computed
* in the ncm_stats_dist_vkde_prepare_kernel() function. In this algorithm, one should define the @local_frac parameter, that is,
* the fraction of nearest sample points that will be used to compute each covariance matrix of each
* sample point. This is done by calling the function ncm_stats_dist_vkde_set_local_frac().
* The rest of the calculation follows the same procedure as the #NcmStatsDist and #NcmStatsDistKDE objects,
* using now a different covariance matrix and normalization factor for each kernel. For more information about
* how the #NcmStatsDist class works, check #NcmStatsDist and #NcmStatsDistKDE objects.
*
* The user must provide the input values: @sdk, @CV_type - ncm_stats_dist_vkde_new(), @y - ncm_stats_dist_add_obs(), @split_frac - ncm_stats_dist_set_split_frac(),
* @over_smooth - ncm_stats_dist_set_over_smooth(), @local_frac - ncm_stats_dist_vkde_set_local_frac(), $v(x)$ - ncm_stats_dist_prepare_interp().
* To see an example of how to use this object and the main functions that are called within each function, check the flowchart at the end of this documentation,
* which shows the order of the functions that should be called by the user and some of the functions that the algorithm calls.
*/

#ifdef HAVE_CONFIG_H
# include "config.h"
@@ -626,3 +626,6 @@ ncm_stats_dist_vkde_get_local_frac (NcmStatsDistVKDE *sdvkde)
return self->local_frac;
}

/** ![an inline image](vkde.png)
*
* <inlinegraphic fileref="vkde.png" format="PNG" scale="98" align="right"/> **/
