Commit

minor documentation fix
lambday committed Mar 6, 2014
1 parent 5b60cb6 commit e1aa552
Showing 3 changed files with 6 additions and 12 deletions.
8 changes: 4 additions & 4 deletions doc/ipython-notebooks/statistics/mmd_two_sample_testing.ipynb
@@ -91,15 +91,15 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"Shogun implements statistical testing in the abstract class <a href=\"http://www.shogun-toolbox.org/doc/en/latest/classshogun_1_1CTestStatistic.html\">CTestStatistic</a>. All implemented methods will work with this interface at their most basic level. This class offers methods to\n",
+"Shogun implements statistical testing in the abstract class <a href=\"http://www.shogun-toolbox.org/doc/en/latest/classshogun_1_1CHypothesisTest.html\">CHypothesisTest</a>. All implemented methods will work with this interface at their most basic level. This class offers methods to\n",
"\n",
" * compute the implemented test statistic,\n",
" * compute p-values for a given value of the test statistic,\n",
" * compute a test threshold for a given p-value,\n",
" * sample the null distribution, i.e. perform the permutation test or bootstrapping of the null distribution, and\n",
" * perform a full two-sample test, returning either a p-value or a binary rejection decision. This method is most useful in practice. Note that, depending on the test statistic used, it might be faster to call this than to compute the threshold and test statistic separately with the above methods.\n",
" \n",
-"There are special subclasses for testing two distributions against each other (<a href=\"http://www.shogun-toolbox.org/doc/en/latest/classshogun_1_1CTwoDistributionsTestStatistic.html\">CTwoDistributionsTestStatistic</a>), kernel two-sample testing (<a href=\"http://www.shogun-toolbox.org/doc/en/latest/classshogun_1_1CKernelTwoSampleTestStatistic.html\">CKernelTwoSampleTestStatistic</a>), and kernel independence testing (<a href=\"http://www.shogun-toolbox.org/doc/en/latest/classshogun_1_1CKernelIndependenceTestStatistic.html\">CKernelIndependenceTestStatistic</a>), which however mostly differ in internals and constructors."
+"There are special subclasses for testing two distributions against each other (<a href=\"http://www.shogun-toolbox.org/doc/en/latest/classshogun_1_1CTwoSampleTest.html\">CTwoSampleTest</a>, <a href=\"http://www.shogun-toolbox.org/doc/en/latest/classshogun_1_1CIndependenceTest.html\">CIndependenceTest</a>), kernel two-sample testing (<a href=\"http://www.shogun-toolbox.org/doc/en/latest/classshogun_1_1CKernelTwoSampleTest.html\">CKernelTwoSampleTest</a>), and kernel independence testing (<a href=\"http://www.shogun-toolbox.org/doc/en/latest/classshogun_1_1CKernelIndependenceTest.html\">CKernelIndependenceTest</a>), which however mostly differ in internals and constructors."
]
},
{
@@ -295,7 +295,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"Any sub-class of <a href=\"http://www.shogun-toolbox.org/doc/en/latest/classshogun_1_1CTwoDistributionsTestStatistic.html\">CTwoDistributionsTestStatistic</a> can approximate the null distribution using permutation/bootstrapping. This approach is always guaranteed to produce consistent results; however, it might take a long time, since for each sample of the null distribution the test statistic has to be computed for a different permutation of the data. Note that each of the calls below samples from the null distribution. It is wise to choose one method in practice. Also note that we set the number of samples from the null distribution to a low value to reduce runtime. Choose a larger value in practice; it is in fact good to plot the samples."
+"Any sub-class of <a href=\"http://www.shogun-toolbox.org/doc/en/latest/classshogun_1_1CHypothesisTest.html\">CHypothesisTest</a> can approximate the null distribution using permutation/bootstrapping. This approach is always guaranteed to produce consistent results; however, it might take a long time, since for each sample of the null distribution the test statistic has to be computed for a different permutation of the data. Note that each of the calls below samples from the null distribution. It is wise to choose one method in practice. Also note that we set the number of samples from the null distribution to a low value to reduce runtime. Choose a larger value in practice; it is in fact good to plot the samples."
]
},
{
@@ -380,7 +380,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"Now let us visualise the distribution of the MMD statistic under $H_0:p=q$ and $H_A:p\\neq q$. To do so, sample both the null and the alternative distribution. Use the interface of <a href=\"http://www.shogun-toolbox.org/doc/en/latest/classshogun_1_1CTwoDistributionsTestStatistic.html\">CTwoDistributionsTestStatistic</a> to sample from the null distribution (permutations and re-computing of the test statistic are done internally). For the alternative distribution, compute the test statistic for a new sample set of $X$ and $Y$ in a loop. Note that the latter is expensive, as the kernel cannot be precomputed, and infinite data is needed. This is not needed in practice, however, and is done here only for illustration."
+"Now let us visualise the distribution of the MMD statistic under $H_0:p=q$ and $H_A:p\\neq q$. To do so, sample both the null and the alternative distribution. Use the interface of <a href=\"http://www.shogun-toolbox.org/doc/en/latest/classshogun_1_1CTwoSampleTest.html\">CTwoSampleTest</a> to sample from the null distribution (permutations and re-computing of the test statistic are done internally). For the alternative distribution, compute the test statistic for a new sample set of $X$ and $Y$ in a loop. Note that the latter is expensive, as the kernel cannot be precomputed, and infinite data is needed. This is not needed in practice, however, and is done here only for illustration."
]
},
{
6 changes: 0 additions & 6 deletions src/shogun/statistics/LinearTimeMMD.cpp
@@ -190,18 +190,14 @@ void CLinearTimeMMD::compute_statistic_and_variance(
* only once */
CKernel* kernel=m_kernel;
if (multiple_kernels)
-{
SG_DEBUG("using multiple kernels\n");
-}

/* iterate through all kernels for this data */
for (index_t i=0; i<num_kernels; ++i)
{
/* if multiple kernels should be computed, set next kernel */
if (multiple_kernels)
-{
kernel=((CCombinedKernel*)m_kernel)->get_kernel(i);
-}

/* compute kernel matrix diagonals */
kernel->init(p1, p2);
@@ -235,9 +231,7 @@ void CLinearTimeMMD::compute_statistic_and_variance(
}

if (multiple_kernels)
-{
SG_UNREF(kernel);
-}
}

/* clean up streamed data */
4 changes: 2 additions & 2 deletions src/shogun/statistics/LinearTimeMMD.h
@@ -31,7 +31,7 @@ class CFeatures;
* The MMD is the distance of two probability distributions \f$p\f$ and \f$q\f$
* in a RKHS.
* \f[
-* \text{MMD}}[\mathcal{F},p,q]^2=\textbf{E}_{x,x'}\left[ k(x,x')\right]-
+* \text{MMD}[\mathcal{F},p,q]^2=\textbf{E}_{x,x'}\left[ k(x,x')\right]-
* 2\textbf{E}_{x,y}\left[ k(x,y)\right]
* +\textbf{E}_{y,y'}\left[ k(y,y')\right]=||\mu_p - \mu_q||^2_\mathcal{F}
* \f]
@@ -256,7 +256,7 @@ class CLinearTimeMMD: public CKernelTwoSampleTest
/** Number of examples processed at once, i.e. in one burst */
index_t m_blocksize;

-/** If this is true, samples will be mixed between p and q ind any method
+/** If this is true, samples will be mixed between p and q in any method
* that computes the statistic */
bool m_simulate_h0;
};
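
For readers unfamiliar with the quantities touched by this commit, the following is a minimal, hedged sketch of the MMD formula from the `LinearTimeMMD.h` comment and of the permutation-based null sampling described in the notebook text. It does not use Shogun: every function name here (`gaussian_kernel`, `mmd2_biased`, `sample_null`, `p_value`) is hypothetical, the Gaussian kernel and its bandwidth are assumptions, and the estimator is the simple biased quadratic-time one rather than the linear-time estimate that `CLinearTimeMMD` computes.

```python
import math
import random


def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian kernel on scalars (assumed kernel choice, not from the commit)."""
    return math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2))


def mmd2_biased(xs, ys, kernel=gaussian_kernel):
    """Biased estimate of MMD^2 = E[k(x,x')] - 2 E[k(x,y)] + E[k(y,y')]."""
    m, n = len(xs), len(ys)
    k_xx = sum(kernel(a, b) for a in xs for b in xs) / (m * m)
    k_yy = sum(kernel(a, b) for a in ys for b in ys) / (n * n)
    k_xy = sum(kernel(a, b) for a in xs for b in ys) / (m * n)
    return k_xx - 2.0 * k_xy + k_yy


def sample_null(xs, ys, num_samples, rng):
    """Sample the null distribution by permuting the pooled data.

    Under H0 (p = q) the labels are exchangeable, so the statistic is
    recomputed on random splits of the pooled sample.
    """
    pooled = list(xs) + list(ys)
    m = len(xs)
    samples = []
    for _ in range(num_samples):
        rng.shuffle(pooled)
        samples.append(mmd2_biased(pooled[:m], pooled[m:]))
    return samples


def p_value(statistic, null_samples):
    """Fraction of null samples at least as large as the observed statistic."""
    return sum(s >= statistic for s in null_samples) / len(null_samples)


rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(50)]
ys = [rng.gauss(2.0, 1.0) for _ in range(50)]  # shifted mean, so p != q

statistic = mmd2_biased(xs, ys)
null_samples = sample_null(xs, ys, num_samples=100, rng=rng)
print("MMD^2 estimate:", statistic)
print("p-value:", p_value(statistic, null_samples))
```

As the notebook text notes, the number of null samples is kept low here only to reduce runtime; a larger number gives a finer-grained p-value.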
