diff --git a/doc/cookbook/source/examples/statistical_testing/linear_time_mmd.rst b/doc/cookbook/source/examples/statistical_testing/linear_time_mmd.rst
new file mode 100644
index 00000000000..2aef3a8a378
--- /dev/null
+++ b/doc/cookbook/source/examples/statistical_testing/linear_time_mmd.rst
@@ -0,0 +1,80 @@
+===============
+Linear Time MMD
+===============
+
+The linear time MMD implements a nonparametric statistical hypothesis test to reject the null hypothesis that two distributions :math:`p` and :math:`q`, each only observed via :math:`n` samples, are the same, i.e. :math:`H_0:p=q`.
+
+The (unbiased) statistic is given by
+
+.. math::
+
+ \frac{2}{n}\sum_{i=1}^{n/2} k(x_{2i-1},x_{2i}) + k(y_{2i-1},y_{2i}) - k(x_{2i-1},y_{2i}) - k(x_{2i},y_{2i-1}),
+
+where :math:`x_i` denote the samples from :math:`p` and :math:`y_i` those from :math:`q`.
+
+See :cite:`gretton2012kernel` for a detailed introduction.
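The estimator can be sketched outside of Shogun in plain NumPy; the Gaussian kernel, its bandwidth, and the 1-D toy data below are illustrative assumptions, not part of the cookbook listing:

```python
import numpy as np

def gauss_k(a, b, sigma=1.0):
    # elementwise Gaussian kernel for 1-D arrays (illustrative choice)
    return np.exp(-(a - b) ** 2 / (2.0 * sigma ** 2))

def linear_time_mmd2(x, y, sigma=1.0):
    # average the h-statistic h = k(x1,x2) + k(y1,y2) - k(x1,y2) - k(x2,y1)
    # over non-overlapping pairs of samples; O(n) time, O(1) memory if streamed
    n = (min(len(x), len(y)) // 2) * 2
    x1, x2 = x[0:n:2], x[1:n:2]
    y1, y2 = y[0:n:2], y[1:n:2]
    h = (gauss_k(x1, x2, sigma) + gauss_k(y1, y2, sigma)
         - gauss_k(x1, y2, sigma) - gauss_k(x2, y1, sigma))
    return h.mean()

rng = np.random.RandomState(0)
same = linear_time_mmd2(rng.randn(2000), rng.randn(2000))        # p = q: close to 0
diff = linear_time_mmd2(rng.randn(2000), rng.randn(2000) + 1.0)  # p != q: clearly positive
```

For :math:`p=q` the estimate fluctuates around zero, while a mean shift produces a clearly positive value.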
+
+-------
+Example
+-------
+
+Imagine we have samples from :math:`p` and :math:`q`.
+As the linear time MMD is a streaming statistic, we need to pass it :sgclass:`CStreamingFeatures`.
+Here, we use synthetic data generators, but it is possible to construct :sgclass:`CStreamingFeatures` from (large) files.
+We create an instance of :sgclass:`CLinearTimeMMD`, passing it data and the kernel to use,
+
+.. sgexample:: linear_time_mmd.sg:create_instance
+
+An important parameter for controlling the efficiency of the linear time MMD is the block size, i.e. the number of samples that is processed at once. As a guideline, set it as large as memory allows.
+
+.. sgexample:: linear_time_mmd.sg:set_burst
+
+Computing the statistic is done as
+
+.. sgexample:: linear_time_mmd.sg:estimate_mmd
+
+We can perform the hypothesis test by computing a test threshold for a given :math:`\alpha`, or by directly computing a p-value.
+
+.. sgexample:: linear_time_mmd.sg:perform_test_threshold
+
+---------------
+Kernel learning
+---------------
+
+There are various options to learn a kernel.
+All options allow learning a single kernel among a number of provided baseline kernels.
+Furthermore, some of these criteria can be used to learn the coefficients of a convex combination of baseline kernels.
+
+There are different strategies to learn the kernel, see :sgclass:`CKernelSelectionStrategy`.
+
+We specify the desired baseline kernels to consider. Note that the kernel above is not considered in the selection.
+
+.. sgexample:: linear_time_mmd.sg:add_kernels
+
+IMPORTANT: when learning the kernel for statistical testing, this needs to be done on different data than the data used for performing the actual test.
+One way to accomplish this is to manually provide a different set of features for testing.
+In Shogun, it is also possible to automatically split the provided data by specifying the ratio between train and test data, via enabling the train-test mode.
+
+.. sgexample:: linear_time_mmd.sg:enable_train_test_mode
+
+A ratio of 1 means the data is split in half while learning the kernel, and subsequent tests are performed on the second half.
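The train-test logic can be illustrated outside of Shogun with a plain NumPy sketch: among hypothetical baseline Gaussian bandwidths, pick the one maximizing a biased MMD estimate on the first half of the data, then evaluate on the held-out half. Maximizing the raw statistic is only one possible selection criterion; see :sgclass:`CKernelSelectionStrategy` for what Shogun actually offers.

```python
import numpy as np

def mmd2_biased(x, y, sigma):
    # biased quadratic-time MMD^2 with a Gaussian kernel of bandwidth sigma
    k = lambda a, b: np.exp(-np.subtract.outer(a, b) ** 2 / (2.0 * sigma ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2.0 * k(x, y).mean()

rng = np.random.RandomState(0)
x = rng.randn(1000)
y = rng.randn(1000) + 0.5

# split in half: the kernel is learned on the first half only
x_train, x_test = x[:500], x[500:]
y_train, y_test = y[:500], y[500:]

sigmas = [2.0 ** e for e in range(-3, 4)]  # baseline kernels (bandwidths)
best = max(sigmas, key=lambda s: mmd2_biased(x_train, y_train, s))

# the test is then performed on the held-out half with the selected kernel
stat = mmd2_biased(x_test, y_test, best)
```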
+
+We learn the kernel and extract the result; again, see :sgclass:`CKernelSelectionStrategy` for more available strategies. Note that the kernel of the MMD itself is replaced.
+If all kernels have the same type, we can convert the result into that type, for example to extract its parameters.
+
+.. sgexample:: linear_time_mmd.sg:select_kernel_single
+
+Note that in order to extract particular kernel parameters, we need to cast the kernel to its actual type.
+
+Similarly, a convex combination of kernels, in the form of :sgclass:`CCombinedKernel`, can be learned and extracted as
+
+.. sgexample:: linear_time_mmd.sg:select_kernel_combined
+
+We can perform the test on the last learnt kernel.
+Since we enabled the train-test mode, this is automatically done on the held-out test data.
+
+.. sgexample:: linear_time_mmd.sg:perform_test
+
+----------
+References
+----------
+.. bibliography:: ../../references.bib
+ :filter: docname in docnames
diff --git a/doc/cookbook/source/examples/statistical_testing/quadratic_time_mmd.rst b/doc/cookbook/source/examples/statistical_testing/quadratic_time_mmd.rst
new file mode 100644
index 00000000000..1882d19e7ae
--- /dev/null
+++ b/doc/cookbook/source/examples/statistical_testing/quadratic_time_mmd.rst
@@ -0,0 +1,93 @@
+==================
+Quadratic Time MMD
+==================
+
+The quadratic time MMD implements a nonparametric statistical hypothesis test to reject the null hypothesis that two distributions :math:`p` and :math:`q`, only observed via :math:`n` and :math:`m` samples respectively, are the same, i.e. :math:`H_0:p=q`.
+
+The (biased) test statistic is given by
+
+.. math::
+
+ \frac{1}{n^2}\sum_{i=1}^n\sum_{j=1}^n k(x_i,x_j) + \frac{1}{m^2}\sum_{i=1}^m\sum_{j=1}^m k(y_i,y_j) - \frac{2}{nm}\sum_{i=1}^n\sum_{j=1}^m k(x_i,y_j),
+
+where :math:`x_i` denote the :math:`n` samples from :math:`p` and :math:`y_j` the :math:`m` samples from :math:`q`.
+
+
+See :cite:`gretton2012kernel` for a detailed introduction.
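As a reference, the biased statistic can be sketched in plain NumPy (independent of Shogun; the Gaussian kernel and 1-D toy data are illustrative assumptions):

```python
import numpy as np

def gauss_gram(a, b, sigma=1.0):
    # pairwise Gaussian kernel matrix for 1-D samples (illustrative choice)
    return np.exp(-np.subtract.outer(a, b) ** 2 / (2.0 * sigma ** 2))

def mmd2_biased(x, y, sigma=1.0):
    # biased quadratic-time MMD^2: all three terms are full averages
    return (gauss_gram(x, x, sigma).mean()
            + gauss_gram(y, y, sigma).mean()
            - 2.0 * gauss_gram(x, y, sigma).mean())

rng = np.random.RandomState(0)
x = rng.randn(200)        # n samples from p
y = rng.randn(100) + 1.0  # m samples from q (n and m may differ)
stat = mmd2_biased(x, y)  # clearly positive here since p != q
```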
+
+-------
+Example
+-------
+
+Imagine we have samples from :math:`p` and :math:`q`, here in the form of :sgclass:`CDenseFeatures` (64 bit floats aka RealFeatures).
+
+.. sgexample:: quadratic_time_mmd.sg:create_features
+
+We create an instance of :sgclass:`CQuadraticTimeMMD`, passing it the data and the kernel.
+
+.. sgexample:: quadratic_time_mmd.sg:create_instance
+
+We can select multiple ways to compute the test statistic, see :sgclass:`CQuadraticTimeMMD` for details.
+The biased statistic is computed as
+
+.. sgexample:: quadratic_time_mmd.sg:estimate_mmd
+
+There are multiple ways to perform the actual hypothesis test, see :sgclass:`CQuadraticTimeMMD` for details. The permutation version simulates from :math:`H_0` by repeatedly permuting the samples from :math:`p` and :math:`q`. We can perform the test by computing a test threshold for a given :math:`\alpha`, or by directly computing a p-value.
+
+.. sgexample:: quadratic_time_mmd.sg:perform_test
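The permutation idea can be sketched in plain NumPy (independent of Shogun's optimized implementation; the kernel, bandwidth, and toy data are illustrative assumptions):

```python
import numpy as np

def mmd2_biased(x, y, sigma=1.0):
    # biased quadratic-time MMD^2 with a Gaussian kernel (illustrative)
    k = lambda a, b: np.exp(-np.subtract.outer(a, b) ** 2 / (2.0 * sigma ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2.0 * k(x, y).mean()

def permutation_pvalue(x, y, statistic, num_null_samples=200, seed=1):
    # simulate from H0: p = q by repeatedly permuting the pooled samples and
    # recomputing the statistic; the p-value is the fraction of permuted
    # statistics at least as large as the observed one
    rng = np.random.RandomState(seed)
    observed = statistic(x, y)
    pooled = np.concatenate([x, y])
    null_samples = []
    for _ in range(num_null_samples):
        rng.shuffle(pooled)
        null_samples.append(statistic(pooled[:len(x)], pooled[len(x):]))
    return np.mean(np.array(null_samples) >= observed)

rng = np.random.RandomState(0)
p_diff = permutation_pvalue(rng.randn(100), rng.randn(100) + 1.0, mmd2_biased)
p_same = permutation_pvalue(rng.randn(100), rng.randn(100), mmd2_biased)
```

For clearly different distributions the p-value is small; for identical distributions it is roughly uniform on :math:`[0,1]`.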
+
+----------------
+Multiple kernels
+----------------
+
+It is possible to perform all operations (computing statistics, performing tests, etc.) for multiple kernels at once, via the :sgclass:`CMultiKernelQuadraticTimeMMD` interface.
+
+.. sgexample:: quadratic_time_mmd.sg:multi_kernel
+
+Note that the results are now a vector with one entry per kernel.
+Also note that the kernels for single and multiple are kept separately.
+
+---------------
+Kernel learning
+---------------
+
+There are various options to learn a kernel.
+All options allow learning a single kernel among a number of provided baseline kernels.
+Furthermore, some of these criteria can be used to learn the coefficients of a convex combination of baseline kernels.
+
+There are different strategies to learn the kernel, see :sgclass:`CKernelSelectionStrategy`.
+
+We specify the desired baseline kernels to consider. Note that the kernel above is not considered in the selection.
+
+.. sgexample:: quadratic_time_mmd.sg:add_kernels
+
+IMPORTANT: when learning the kernel for statistical testing, this needs to be done on different data than the data used for performing the actual test.
+One way to accomplish this is to manually provide a different set of features for testing.
+In Shogun, it is also possible to automatically split the provided data by specifying the ratio between train and test data, via enabling the train-test mode.
+
+.. sgexample:: quadratic_time_mmd.sg:enable_train_test_mode
+
+A ratio of 1 means the data is split in half while learning the kernel, and subsequent tests are performed on the second half.
+
+We learn the kernel and extract the result; again, see :sgclass:`CKernelSelectionStrategy` for more available strategies.
+Note that the kernel of the MMD itself is replaced.
+If all kernels have the same type, we can convert the result into that type, for example to extract its parameters.
+
+.. sgexample:: quadratic_time_mmd.sg:select_kernel_single
+
+Note that in order to extract particular kernel parameters, we need to cast the kernel to its actual type.
+
+Similarly, a convex combination of kernels, in the form of :sgclass:`CCombinedKernel`, can be learned and extracted as
+
+.. sgexample:: quadratic_time_mmd.sg:select_kernel_combined
+
+We can perform the test on the last learnt kernel.
+Since we enabled the train-test mode, this is automatically done on the held-out test data.
+
+.. sgexample:: quadratic_time_mmd.sg:perform_test_optimized
+
+----------
+References
+----------
+.. bibliography:: ../../references.bib
+ :filter: docname in docnames
+
+:wiki:`Statistical_hypothesis_testing`
diff --git a/doc/cookbook/source/index.rst b/doc/cookbook/source/index.rst
index 30616938fcb..1b979cb0399 100644
--- a/doc/cookbook/source/index.rst
+++ b/doc/cookbook/source/index.rst
@@ -47,6 +47,15 @@ Regression
examples/regression/**
+Statistical Testing
+-------------------
+
+.. toctree::
+ :maxdepth: 1
+ :glob:
+
+ examples/statistical_testing/**
+
Kernels
-------
diff --git a/doc/cookbook/source/references.bib b/doc/cookbook/source/references.bib
index 5cba98e83a3..b32bec37852 100644
--- a/doc/cookbook/source/references.bib
+++ b/doc/cookbook/source/references.bib
@@ -25,7 +25,7 @@ @book{cristianini2000introduction
publisher={Cambridge University Press}
}
@article{fan2008liblinear,
- title={LIBLINEAR: A Library for Large Linear Classification},
+ title={{LIBLINEAR: A Library for Large Linear Classification}},
author={R.E. Fan and K.W. Chang and C.J. Hsieh and X.R. Wang and C.J. Lin},
journal={Journal of Machine Learning Research},
volume={9},
@@ -36,7 +36,18 @@ @book{Rasmussen2005GPM
author = {Rasmussen, C. E. and Williams, C. K. I.},
title = {Gaussian Processes for Machine Learning},
year = {2005},
- publisher = {The MIT Press}
+ publisher = {The MIT Press},
+}
+
+@article{gretton2012kernel,
+ title={A kernel two-sample test},
+ author={Gretton, A. and Borgwardt, K.M. and Rasch, M.J. and Sch{\"o}lkopf, B. and Smola, A.},
+ journal={The Journal of Machine Learning Research},
+ volume={13},
+ number={1},
+ pages={723--773},
+ year={2012},
}
@article{ueda2000smem,
title={SMEM Algorithm for Mixture Models},
@@ -102,6 +113,13 @@ @inproceedings{shalev2011shareboost
pages={1179--1187},
year={2011}
}
+
+@inproceedings{gretton2012optimal,
+ author={Gretton, A. and Sriperumbudur, B. and Sejdinovic, D. and Strathmann, H. and Balakrishnan, S. and Pontil, M. and Fukumizu, K.},
+ booktitle={Advances in Neural Information Processing Systems},
+ title={{Optimal kernel choice for large-scale two-sample tests}},
+ year={2012}
+}
@article{sonnenburg2006large,
title={Large scale multiple kernel learning},
author={S. Sonnenburg and G. R{\"a}tsch and C. Sch{\"a}fer and B. Sch{\"o}lkopf},
diff --git a/doc/ipython-notebooks/statistics/mmd_two_sample_testing.ipynb b/doc/ipython-notebooks/statistical_testing/mmd_two_sample_testing.ipynb
similarity index 62%
rename from doc/ipython-notebooks/statistics/mmd_two_sample_testing.ipynb
rename to doc/ipython-notebooks/statistical_testing/mmd_two_sample_testing.ipynb
index 9169538737c..4b8acceaca5 100644
--- a/doc/ipython-notebooks/statistics/mmd_two_sample_testing.ipynb
+++ b/doc/ipython-notebooks/statistical_testing/mmd_two_sample_testing.ipynb
@@ -11,22 +11,23 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "#### By Heiko Strathmann - heiko.strathmann@gmail.com - github.com/karlnapf - herrstrathmann.de"
+ "#### Heiko Strathmann - heiko.strathmann@gmail.com - github.com/karlnapf - herrstrathmann.de\n",
+ "#### Soumyajit De - soumyajitde.cse@gmail.com - github.com/lambday"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
- "This notebook describes Shogun's framework for statistical hypothesis testing. We begin by giving a brief outline of the problem setting and then describe various implemented algorithms. All the algorithms discussed here are for Kernel two-sample testing with Maximum Mean Discrepancy and are based on embedding probability distributions into Reproducing Kernel Hilbert Spaces( RKHS )."
+ "This notebook describes Shogun's framework for statistical hypothesis testing. We begin by giving a brief outline of the problem setting and then describe various implemented algorithms.\n",
+ "All algorithms discussed here are instances of kernel two-sample testing with the *maximum mean discrepancy*, and are based on embedding probability distributions into Reproducing Kernel Hilbert Spaces (RKHS)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
- "Methods for two-sample testing currently consist of tests based on the *Maximum Mean Discrepancy*. There are two types of tests available, a quadratic time test and a linear time test. Both come in various flavours.\n",
- "Independence testing is currently based in the *Hilbert Schmidt Independence Criterion*."
+ "There are two types of tests available, a quadratic time test and a linear time test. Both come in various flavours."
]
},
{
@@ -39,8 +40,8 @@
"source": [
"%pylab inline\n",
"%matplotlib inline\n",
- "# import all Shogun classes\n",
- "from modshogun import *"
+ "import modshogun as sg\n",
+ "import numpy as np"
]
},
{
@@ -54,7 +55,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "To set the context, we here briefly describe statistical hypothesis testing. Informally, one defines a hypothesis on a certain domain and then uses a statistical test to check whether this hypothesis is true. Formally, the goal is to reject a so-called *null-hypothesis* $H_0$, which is the complement of an *alternative-hypothesis* $H_A$. \n",
+ "To set the context, we here briefly describe statistical hypothesis testing. Informally, one defines a hypothesis on a certain domain and then uses a statistical test to check whether this hypothesis is true. Formally, the goal is to reject a so-called *null-hypothesis* $H_0:p=q$, which is the complement of an *alternative-hypothesis* $H_A$. \n",
"\n",
"To distinguish the hypotheses, a test statistic is computed on sample data. Since sample data is finite, this corresponds to sampling the true distribution of the test statistic. There are two different distributions of the test statistic -- one for each hypothesis. The *null-distribution* corresponds to test statistic samples under the model that $H_0$ holds; the *alternative-distribution* corresponds to test statistic samples under the model that $H_A$ holds.\n",
"\n",
@@ -65,11 +66,11 @@
" * A *type I error* is made when $H_0: p=q$ is wrongly rejected. That is, the test says that the samples are from different distributions when they are not.\n",
" * A *type II error* is made when $H_A: p\\neq q$ is wrongly accepted. That is, the test says that the samples are from the same distribution when they are not.\n",
"\n",
- "A so-called *consistent* test achieves zero type II error for a fixed type I error.\n",
+ "A so-called *consistent* test achieves zero type II error for a fixed type I error, as it sees more data.\n",
"\n",
"To decide whether to reject $H_0$, one could set a threshold, say at the $95\\%$ quantile of the null-distribution, and reject $H_0$ when the test statistic lies below that threshold. This means that the chance that the samples were generated under $H_0$ are $5\\%$. We call this number the *test power* $\\alpha$ (in this case $\\alpha=0.05$). It is an upper bound on the probability for a type I error. An alternative way is simply to compute the quantile of the test statistic in the null-distribution, the so-called *p-value*, and to compare the p-value against a desired test power, say $\\alpha=0.05$, by hand. The advantage of the second method is that one not only gets a binary answer, but also an upper bound on the type I error.\n",
"\n",
- "In order to construct a two-sample test, the null-distribution of the test statistic has to be approximated. One way of doing this for any two-sample test is called *bootstrapping*, or the *permutation* test, where samples from both sources are mixed and permuted repeatedly and the test statistic is computed for every of those configurations. While this method works for every statistical hypothesis test, it might be very costly because the test statistic has to be re-computed many times. For many test statistics, there are more sophisticated methods of approximating the null distribution."
+ "In order to construct a two-sample test, the null-distribution of the test statistic has to be approximated. One way of doing this is called the *permutation test*, where samples from both sources are mixed and permuted repeatedly and the test statistic is computed for each of those configurations. While this method works for every statistical hypothesis test, it might be very costly because the test statistic has to be re-computed many times. Shogun comes with an extremely optimized implementation though. For completeness, Shogun also includes a number of more sophisticated ways of approximating the null distribution."
]
},
{
@@ -83,15 +84,13 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "Shogun implements statistical testing in the abstract class CHypothesisTest. All implemented methods will work with this interface at their most basic level. This class offers methods to\n",
+ "Shogun implements statistical testing in the abstract class CHypothesisTest. All implemented methods will work with this interface at their most basic level. We here focus on CTwoSampleTest. This class offers methods to\n",
"\n",
" * compute the implemented test statistic,\n",
" * compute p-values for a given value of the test statistic,\n",
" * compute a test threshold for a given p-value,\n",
- " * sampling the null distribution, i.e. perform the permutation test or bootstrappig of the null-distribution, and\n",
- " * performing a full two-sample test, and either returning a p-value or a binary rejection decision. This method is most useful in practice. Note that, depending on the used test statistic, it might be faster to call this than to compute threshold and test statistic seperately with the above methods.\n",
- " \n",
- "There are special subclasses for testing two distributions against each other (CTwoSampleTest, CIndependenceTest), kernel two-sample testing (CKernelTwoSampleTest), and kernel independence testing (CKernelIndependenceTest), which however mostly differ in internals and constructors."
+ " * approximate the null distribution, e.g. perform the permutation test, and\n",
+ " * perform a full two-sample test, and either return a p-value or a binary rejection decision. This method is most useful in practice. Note that, depending on the used test statistic, it might be faster to call this than to compute threshold and test statistic separately with the above methods."
]
},
{
@@ -123,7 +122,7 @@
" +\\textbf{E}_{y,y'}\\left[ k(y,y')\\right]\n",
"\\end{align*}\n",
"\n",
- "Note that this formulation does not assume any form of the input data, we just need a kernel function whose feature space is a RKHS, see [2, Section 2] for details. This has the consequence that in Shogun, we can do tests on any type of data (CDenseFeatures, CSparseFeatures, CStringFeatures, etc), as long as we or you provide a positive definite kernel function under the interface of CKernel.\n",
+ "Note that this formulation does not assume any form of the input data, we just need a kernel function whose feature space is a RKHS, see [2, Section 2] for details. This has the consequence that in Shogun, we can do tests on any type of data (CDenseFeatures, CSparseFeatures, CStringFeatures, etc), as long as we or you provide a positive definite kernel function under the interface of CKernel.\n",
"\n",
"We here only describe how to use the MMD for two-sample testing. Shogun offers two types of test statistic based on the MMD, one with quadratic costs both in time and space, and one with linear time and constant space costs. Both come in different versions and with different methods how to approximate the null-distribution in order to construct a two-sample test."
]
@@ -159,11 +158,11 @@
"outputs": [],
"source": [
"# use scipy for generating samples\n",
- "from scipy.stats import norm, laplace\n",
+ "from scipy.stats import laplace, norm\n",
"\n",
- "def sample_gaussian_vs_laplace(n=220, mu=0.0, sigma2=1, b=sqrt(0.5)): \n",
+ "def sample_gaussian_vs_laplace(n=220, mu=0.0, sigma2=1, b=np.sqrt(0.5)): \n",
" # sample from both distributions\n",
- " X=norm.rvs(size=n, loc=mu, scale=sigma2)\n",
+ " X=norm.rvs(size=n)*np.sqrt(sigma2)+mu\n",
" Y=laplace.rvs(size=n, loc=mu, scale=b)\n",
" \n",
" return X,Y"
@@ -179,31 +178,30 @@
"source": [
"mu=0.0\n",
"sigma2=1\n",
- "b=sqrt(0.5)\n",
+ "b=np.sqrt(0.5)\n",
"n=220\n",
"X,Y=sample_gaussian_vs_laplace(n, mu, sigma2, b)\n",
"\n",
"# plot both densities and histograms\n",
- "figure(figsize=(18,5))\n",
- "suptitle(\"Gaussian vs. Laplace\")\n",
- "subplot(121)\n",
- "Xs=linspace(-2, 2, 500)\n",
- "plot(Xs, norm.pdf(Xs, loc=mu, scale=sigma2))\n",
- "plot(Xs, laplace.pdf(Xs, loc=mu, scale=b))\n",
- "title(\"Densities\")\n",
- "xlabel(\"$x$\")\n",
- "ylabel(\"$p(x)$\")\n",
- "_=legend([ 'Gaussian','Laplace'])\n",
- "\n",
- "subplot(122)\n",
- "hist(X, alpha=0.5)\n",
- "xlim([-5,5])\n",
- "ylim([0,100])\n",
- "hist(Y,alpha=0.5)\n",
- "xlim([-5,5])\n",
- "ylim([0,100])\n",
- "legend([\"Gaussian\", \"Laplace\"])\n",
- "_=title('Histograms')"
+ "plt.figure(figsize=(18,5))\n",
+ "plt.suptitle(\"Gaussian vs. Laplace\")\n",
+ "plt.subplot(121)\n",
+ "Xs=np.linspace(-2, 2, 500)\n",
+ "plt.plot(Xs, norm.pdf(Xs, loc=mu, scale=sigma2))\n",
+ "plt.plot(Xs, laplace.pdf(Xs, loc=mu, scale=b))\n",
+ "plt.title(\"Densities\")\n",
+ "plt.xlabel(\"$x$\")\n",
+ "plt.ylabel(\"$p(x)$\")\n",
+ "\n",
+ "plt.subplot(122)\n",
+ "plt.hist(X, alpha=0.5)\n",
+ "plt.xlim([-5,5])\n",
+ "plt.ylim([0,100])\n",
+ "plt.hist(Y,alpha=0.5)\n",
+ "plt.xlim([-5,5])\n",
+ "plt.ylim([0,100])\n",
+ "plt.legend([\"Gaussian\", \"Laplace\"])\n",
+ "plt.title('Samples');"
]
},
{
@@ -222,8 +220,8 @@
"outputs": [],
"source": [
"print \"Gaussian vs. Laplace\"\n",
- "print \"Sample means: %.2f vs %.2f\" % (mean(X), mean(Y))\n",
- "print \"Samples variances: %.2f vs %.2f\" % (var(X), var(Y))"
+ "print \"Sample means: %.2f vs %.2f\" % (np.mean(X), np.mean(Y))\n",
+ "print \"Samples variances: %.2f vs %.2f\" % (np.var(X), np.var(Y))"
]
},
{
@@ -237,7 +235,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "We now describe the quadratic time MMD, as described in [1, Lemma 6], which is implemented in Shogun. All methods in this section are implemented in CQuadraticTimeMMD, which accepts any type of features in Shogun, and use it on the above toy problem.\n",
+ "We now describe the quadratic time MMD, as described in [1, Lemma 6], which is implemented in Shogun. All methods in this section are implemented in CQuadraticTimeMMD, which accepts any type of features in Shogun, and use it on the above toy problem.\n",
"\n",
"An unbiased estimate for the MMD expression above can be obtained by estimating expected values with averaging over independent samples\n",
"\n",
@@ -251,7 +249,7 @@
"\\mmd_b[\\mathcal{F},X,Y]^2=\\frac{1}{m^2}\\sum_{i=1}^m\\sum_{j=1}^mk(x_i,x_j) + \\frac{1}{n^ 2}\\sum_{i=1}^n\\sum_{j=1}^nk(y_i,y_j)-\\frac{2}{mn}\\sum_{i=1}^m\\sum_{j\\neq i}^nk(x_i,y_j)\n",
".$$\n",
"\n",
- "Computing the test statistic using CQuadraticTimeMMD does exactly this, where it is possible to choose between the two above expressions. Note that some methods for approximating the null-distribution only work with one of both types. Both statistics' computational costs are quadratic both in time and space. Note that the method returns $m\\mmd_b[\\mathcal{F},X,Y]^2$ since null distribution approximations work on $m$ times null distribution. Here is how the test statistic itself is computed."
+ "Computing the test statistic using CQuadraticTimeMMD does exactly this, where it is possible to choose between the two above expressions. Note that some methods for approximating the null-distribution only work with one of both types. Both statistics' computational costs are quadratic both in time and space. Note that the method returns $m\\mmd_b[\\mathcal{F},X,Y]^2$ since null distribution approximations work on $m$ times null distribution. Here is how the test statistic itself is computed."
]
},
{
@@ -263,22 +261,25 @@
"outputs": [],
"source": [
"# turn data into Shogun representation (columns vectors)\n",
- "feat_p=RealFeatures(X.reshape(1,len(X)))\n",
- "feat_q=RealFeatures(Y.reshape(1,len(Y)))\n",
+ "feat_p=sg.RealFeatures(X.reshape(1,len(X)))\n",
+ "feat_q=sg.RealFeatures(Y.reshape(1,len(Y)))\n",
"\n",
"# choose kernel for testing. Here: Gaussian\n",
"kernel_width=1\n",
- "kernel=GaussianKernel(10, kernel_width)\n",
+ "kernel=sg.GaussianKernel(10, kernel_width)\n",
"\n",
"# create mmd instance of test-statistic\n",
- "mmd=QuadraticTimeMMD(kernel, feat_p, feat_q)\n",
+ "mmd=sg.QuadraticTimeMMD()\n",
+ "mmd.set_kernel(kernel)\n",
+ "mmd.set_p(feat_p)\n",
+ "mmd.set_q(feat_q)\n",
"\n",
"# compute biased and unbiased test statistic (default is unbiased)\n",
- "mmd.set_statistic_type(BIASED)\n",
+ "mmd.set_statistic_type(sg.ST_BIASED_FULL)\n",
"biased_statistic=mmd.compute_statistic()\n",
"\n",
- "mmd.set_statistic_type(UNBIASED)\n",
- "unbiased_statistic=mmd.compute_statistic()\n",
+ "mmd.set_statistic_type(sg.ST_UNBIASED_FULL)\n",
+ "statistic=unbiased_statistic=mmd.compute_statistic()\n",
"\n",
"print \"%d x MMD_b[X,Y]^2=%.2f\" % (len(X), biased_statistic)\n",
"print \"%d x MMD_u[X,Y]^2=%.2f\" % (len(X), unbiased_statistic)"
@@ -288,7 +289,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "Any sub-class of CHypothesisTest can compute approximate the null distribution using permutation/bootstrapping. This way always is guaranteed to produce consistent results, however, it might take a long time as for each sample of the null distribution, the test statistic has to be computed for a different permutation of the data. Note that each of the below calls samples from the null distribution. It is wise to choose one method in practice. Also note that we set the number of samples from the null distribution to a low value to reduce runtime. Choose larger in practice, it is in fact good to plot the samples."
+ "Any sub-class of CHypothesisTest can approximate the null distribution using permutation/bootstrapping. This approach is always guaranteed to produce consistent results; however, it might take a long time, as the test statistic has to be computed for a different permutation of the data for each sample of the null distribution. Shogun's implementation is highly optimized, exploiting low-level CPU caching and multiple available cores."
]
},
{
@@ -299,18 +300,14 @@
},
"outputs": [],
"source": [
- "# this is not necessary as bootstrapping is the default\n",
- "mmd.set_null_approximation_method(PERMUTATION)\n",
- "mmd.set_statistic_type(UNBIASED)\n",
- "\n",
- "# to reduce runtime, should be larger practice\n",
- "mmd.set_num_null_samples(100)\n",
+ "mmd.set_null_approximation_method(sg.NAM_PERMUTATION)\n",
+ "mmd.set_num_null_samples(200)\n",
"\n",
"# now show a couple of ways to compute the test\n",
"\n",
"# compute p-value for computed test statistic\n",
- "p_value=mmd.compute_p_value(unbiased_statistic)\n",
- "print \"P-value of MMD value %.2f is %.2f\" % (unbiased_statistic, p_value)\n",
+ "p_value=mmd.compute_p_value(statistic)\n",
+ "print \"P-value of MMD value %.2f is %.2f\" % (statistic, p_value)\n",
"\n",
"# compute threshold for rejecting H_0 for a given test power\n",
"alpha=0.05\n",
@@ -318,7 +315,7 @@
"print \"Threshold for rejecting H0 with a test power of %.2f is %.2f\" % (alpha, threshold)\n",
"\n",
"# performing the test by hand given the above results, note that those two are equivalent\n",
- "if unbiased_statistic>threshold:\n",
+ "if statistic>threshold:\n",
" print \"H0 is rejected with confidence %.2f\" % alpha\n",
" \n",
 "if p_value<alpha:\n",
-     "We can use the CCustomKernel class, which allows to precompute a kernel matrix (multithreaded) of a given kernel and store it in memory. Instances of this class can then be used as if they were standard kernels."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "collapsed": false
- },
- "outputs": [],
- "source": [
- "# precompute kernel to be faster for null sampling\n",
- "p_and_q=mmd.get_p_and_q()\n",
- "kernel.init(p_and_q, p_and_q);\n",
- "precomputed_kernel=CustomKernel(kernel);\n",
- "mmd.set_kernel(precomputed_kernel);\n",
- "\n",
- "# increase number of iterations since should be faster now\n",
- "mmd.set_num_null_samples(500);\n",
- "p_value_boot=mmd.perform_test();\n",
- "print \"P-value of MMD test is %.2f\" % p_value_boot"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Now let us visualise distribution of MMD statistic under $H_0:p=q$ and $H_A:p\\neq q$. Sample both null and alternative distribution for that. Use the interface of CTwoSampleTest to sample from the null distribution (permutations, re-computing of test statistic is done internally). For the alternative distribution, compute the test statistic for a new sample set of $X$ and $Y$ in a loop. Note that the latter is expensive, as the kernel cannot be precomputed, and infinite data is needed. Though it is not needed in practice but only for illustrational purposes here."
+ "Now let us visualise the distribution of the MMD statistic under $H_0:p=q$ and $H_A:p\\neq q$. Sample both the null and the alternative distribution for that. Use the interface of CHypothesisTest to sample from the null distribution (permutations and re-computing of the test statistic are done internally). For the alternative distribution, compute the test statistic for a new sample set of $X$ and $Y$ in a loop. Note that the latter is expensive, as the kernel cannot be precomputed and infinite data is needed. It is not needed in practice, and is done here only for illustration purposes."
]
},
{
@@ -388,18 +346,21 @@
"num_samples=500\n",
"\n",
"# sample null distribution\n",
- "mmd.set_num_null_samples(num_samples)\n",
"null_samples=mmd.sample_null()\n",
"\n",
"# sample alternative distribution, generate new data for that\n",
- "alt_samples=zeros(num_samples)\n",
+ "alt_samples=np.zeros(num_samples)\n",
"for i in range(num_samples):\n",
" X=norm.rvs(size=n, loc=mu, scale=sigma2)\n",
" Y=laplace.rvs(size=n, loc=mu, scale=b)\n",
- " feat_p=RealFeatures(reshape(X, (1,len(X))))\n",
- " feat_q=RealFeatures(reshape(Y, (1,len(Y))))\n",
- " mmd=QuadraticTimeMMD(kernel, feat_p, feat_q)\n",
- " alt_samples[i]=mmd.compute_statistic()"
+ " feat_p=sg.RealFeatures(np.reshape(X, (1,len(X))))\n",
+ " feat_q=sg.RealFeatures(np.reshape(Y, (1,len(Y))))\n",
+ " # TODO: reset pre-computed kernel here\n",
+ " mmd.set_p(feat_p)\n",
+ " mmd.set_q(feat_q)\n",
+ " alt_samples[i]=mmd.compute_statistic()\n",
+ "\n",
+ "np.std(alt_samples)"
]
},
{
@@ -428,26 +389,26 @@
"outputs": [],
"source": [
"def plot_alt_vs_null(alt_samples, null_samples, alpha):\n",
- " figure(figsize=(18,5))\n",
+ " plt.figure(figsize=(18,5))\n",
" \n",
- " subplot(131)\n",
- " hist(null_samples, 50, color='blue')\n",
- " title('Null distribution')\n",
- " subplot(132)\n",
- " title('Alternative distribution')\n",
- " hist(alt_samples, 50, color='green')\n",
+ " plt.subplot(131)\n",
+ " plt.hist(null_samples, 50, color='blue')\n",
+ " plt.title('Null distribution')\n",
+ " plt.subplot(132)\n",
+ " plt.title('Alternative distribution')\n",
+ " plt.hist(alt_samples, 50, color='green')\n",
" \n",
- " subplot(133)\n",
- " hist(null_samples, 50, color='blue')\n",
- " hist(alt_samples, 50, color='green', alpha=0.5)\n",
- " title('Null and alternative distriution')\n",
+ " plt.subplot(133)\n",
+ " plt.hist(null_samples, 50, color='blue')\n",
+ " plt.hist(alt_samples, 50, color='green', alpha=0.5)\n",
+ "    plt.title('Null and alternative distribution')\n",
" \n",
" # find (1-alpha) element of null distribution\n",
- " null_samples_sorted=sort(null_samples)\n",
- " quantile_idx=int(num_samples*(1-alpha))\n",
+ " null_samples_sorted=np.sort(null_samples)\n",
+ " quantile_idx=int(len(null_samples)*(1-alpha))\n",
" quantile=null_samples_sorted[quantile_idx]\n",
- " axvline(x=quantile, ymin=0, ymax=100, color='red', label=str(int(round((1-alpha)*100))) + '% quantile of null')\n",
- " _=legend()"
+ " plt.axvline(x=quantile, ymin=0, ymax=100, color='red', label=str(int(round((1-alpha)*100))) + '% quantile of null')\n",
+ "    plt.legend();"
]
},
{
@@ -472,7 +433,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "As already mentioned, bootstrapping the null distribution is expensive business. There exist a couple of methods that are more sophisticated and either allow very fast approximations without guarantees or reasonably fast approximations that are consistent. We present a selection from [2], which are implemented in Shogun.\n",
+ "As already mentioned, permuting the data to access the null distribution is usually the method of choice, due to the efficient implementation in Shogun. There also exist more sophisticated methods that either give very fast approximations without guarantees or reasonably fast approximations that are consistent. We present a selection from [2], all of which are implemented in Shogun.\n",
"\n",
"The first one is a spectral method that is based around the Eigenspectrum of the kernel matrix of the joint samples. It is faster than bootstrapping while being a consistent test. Effectively, the null-distribution of the biased statistic is sampled, but in a more efficient way than the bootstrapping approach. It converges as\n",
"\n",
@@ -482,12 +443,12 @@
"\n",
"where $z_l\\sim \\mathcal{N}(0,2)$ are i.i.d. normal samples and $\\lambda_l$ are Eigenvalues of expression 2 in [2], which can be empirically estimated by $\\hat\\lambda_l=\\frac{1}{m}\\nu_l$ where $\\nu_l$ are the Eigenvalues of the centred kernel matrix of the joint samples $X$ and $Y$. The distribution above can be easily sampled. Shogun's implementation has two parameters:\n",
"\n",
- " * Number of samples from null-distribution. The more, the more accurate. As a rule of thumb, use 250.\n",
+ " * Number of samples from null-distribution. The more, the more accurate.\n",
" * Number of Eigenvalues of the Eigen-decomposition of the kernel matrix to use. The more, the better the results get. However, the Eigen-spectrum of the joint gram matrix usually decreases very fast. Plotting the Spectrum can help. See [2] for details.\n",
"\n",
- "If the kernel matrices are diagonal dominant, this method is likely to fail. For that and more details, see the original paper. Computational costs are much lower than bootstrapping, which is the only consistent alternative. Since Eigenvalues of the gram matrix has to be computed, costs are in $\\mathcal{O}(m^3)$.\n",
+ "If the kernel matrices are diagonally dominant, this method is likely to fail. For that and more details, see the original paper. Computational costs are likely to be larger than permutation testing, due to the efficient implementation of the latter: Eigenvalues of the gram matrix cost $\\mathcal{O}(m^3)$.\n",
"\n",
- "Below, we illustrate how to sample the null distribution and perform two-sample testing with the Spectrum approximation in the class CQuadraticTimeMMD. This method only works with the biased statistic."
+ "Below, we illustrate how to sample the null distribution and perform two-sample testing with the Spectrum approximation in the class CQuadraticTimeMMD. This method only works with the biased statistic."
]
},
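The sampling scheme above is easy to sketch in plain numpy: eigendecompose the centred kernel matrix of the joint samples, rescale the eigenvalues by $1/m$, and draw weighted sums of squared normals with variance 2. This is an illustrative sketch of the idea only, not Shogun's implementation; all names are made up.

```python
import numpy as np

def sample_spectrum_null(K, m, num_null_samples=250, num_eigen=10):
    # centre the kernel matrix of the joint samples X and Y
    n = K.shape[0]
    H = np.eye(n) - np.full((n, n), 1.0 / n)
    K_centred = H.dot(K).dot(H)

    # empirical estimates lambda_l = nu_l / m, largest eigenvalues first
    nu = np.linalg.eigvalsh(K_centred)[::-1]
    lambdas = nu[:num_eigen] / m

    # null samples: sum_l lambda_l * z_l^2 with z_l ~ N(0, 2)
    z = np.sqrt(2) * np.random.randn(num_null_samples, num_eigen)
    return (z ** 2).dot(lambdas)
```

A p-value is then simply the fraction of null samples that exceed the observed biased statistic.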
{
@@ -499,23 +460,24 @@
"outputs": [],
"source": [
"# optional: plot spectrum of joint kernel matrix\n",
- "from numpy.linalg import eig\n",
+ "\n",
+ "# TODO: it would be good if there was a way to extract the joint kernel matrix for all kernel tests\n",
"\n",
"# get joint feature object and compute kernel matrix and its spectrum\n",
"feats_p_q=mmd.get_p_and_q()\n",
"mmd.get_kernel().init(feats_p_q, feats_p_q)\n",
"K=mmd.get_kernel().get_kernel_matrix()\n",
- "w,_=eig(K)\n",
+ "w,_=np.linalg.eig(K)\n",
"\n",
"# visualise K and its spectrum (only up to threshold)\n",
- "figure(figsize=(18,5))\n",
- "subplot(121)\n",
- "imshow(K, interpolation=\"nearest\")\n",
- "title(\"Kernel matrix K of joint data $X$ and $Y$\")\n",
- "subplot(122)\n",
+ "plt.figure(figsize=(18,5))\n",
+ "plt.subplot(121)\n",
+ "plt.imshow(K, interpolation=\"nearest\")\n",
+ "plt.title(\"Kernel matrix K of joint data $X$ and $Y$\")\n",
+ "plt.subplot(122)\n",
"thresh=0.1\n",
- "plot(w[:len(w[w>thresh])])\n",
- "_=title(\"Eigenspectrum of K until component %d\" % len(w[w>thresh]))"
+ "plt.plot(w[:len(w[w>thresh])])\n",
+ "plt.title(\"Eigenspectrum of K until component %d\" % len(w[w>thresh]));"
]
},
{
@@ -540,22 +502,23 @@
"num_eigen=len(w[w>thresh])\n",
"\n",
"# finally, do the test, use biased statistic\n",
- "mmd.set_statistic_type(BIASED)\n",
+ "mmd.set_statistic_type(sg.ST_BIASED_FULL)\n",
"\n",
"#tell Shogun to use spectrum approximation\n",
- "mmd.set_null_approximation_method(MMD2_SPECTRUM)\n",
- "mmd.set_num_eigenvalues_spectrum(num_eigen)\n",
- "mmd.set_num_samples_spectrum(num_samples)\n",
+ "mmd.set_null_approximation_method(sg.NAM_MMD2_SPECTRUM)\n",
+ "mmd.spectrum_set_num_eigenvalues(num_eigen)\n",
+ "mmd.set_num_null_samples(num_samples)\n",
"\n",
"# the usual test interface\n",
- "p_value_spectrum=mmd.perform_test()\n",
+ "statistic=mmd.compute_statistic()\n",
+ "p_value_spectrum=mmd.compute_p_value(statistic)\n",
"print \"Spectrum: P-value of MMD test is %.2f\" % p_value_spectrum\n",
"\n",
- "# compare with ground truth bootstrapping\n",
- "mmd.set_null_approximation_method(PERMUTATION)\n",
+ "# compare with ground truth from permutation test\n",
+ "mmd.set_null_approximation_method(sg.NAM_PERMUTATION)\n",
"mmd.set_num_null_samples(num_samples)\n",
- "p_value_boot=mmd.perform_test()\n",
- "print \"Bootstrapping: P-value of MMD test is %.2f\" % p_value_spectrum"
+ "p_value_permutation=mmd.compute_p_value(statistic)\n",
+ "print \"Permutation: P-value of MMD test is %.2f\" % p_value_permutation"
]
},
{
@@ -595,15 +558,16 @@
"outputs": [],
"source": [
"# tell Shogun to use gamma approximation\n",
- "mmd.set_null_approximation_method(MMD2_GAMMA)\n",
+ "mmd.set_null_approximation_method(sg.NAM_MMD2_GAMMA)\n",
"\n",
"# the usual test interface\n",
- "p_value_gamma=mmd.perform_test()\n",
+ "statistic=mmd.compute_statistic()\n",
+ "p_value_gamma=mmd.compute_p_value(statistic)\n",
"print \"Gamma: P-value of MMD test is %.2f\" % p_value_gamma\n",
"\n",
"# compare with ground truth bootstrapping\n",
- "mmd.set_null_approximation_method(PERMUTATION)\n",
- "p_value_boot=mmd.perform_test()\n",
+ "mmd.set_null_approximation_method(sg.NAM_PERMUTATION)\n",
+ "p_value_permutation=mmd.compute_p_value(statistic)\n",
- "print \"Bootstrapping: P-value of MMD test is %.2f\" % p_value_spectrum"
+ "print \"Permutation: P-value of MMD test is %.2f\" % p_value_permutation"
]
},
@@ -637,32 +601,34 @@
" Z=hstack((X,Y))\n",
" X=Z[:len(X)]\n",
" Y=Z[len(X):]\n",
- " feat_p=RealFeatures(reshape(X, (1,len(X))))\n",
- " feat_q=RealFeatures(reshape(Y, (1,len(Y))))\n",
+ "    feat_p=sg.RealFeatures(np.reshape(X, (1,len(X))))\n",
+ "    feat_q=sg.RealFeatures(np.reshape(Y, (1,len(Y))))\n",
" \n",
" # gamma\n",
- " mmd=QuadraticTimeMMD(kernel, feat_p, feat_q)\n",
- " mmd.set_null_approximation_method(MMD2_GAMMA)\n",
- " mmd.set_statistic_type(BIASED)\n",
+ " mmd=sg.QuadraticTimeMMD(feat_p, feat_q)\n",
+ " mmd.set_kernel(kernel)\n",
+ " mmd.set_null_approximation_method(sg.NAM_MMD2_GAMMA)\n",
+ " mmd.set_statistic_type(sg.ST_BIASED_FULL) \n",
" rejections_gamma[i]=mmd.perform_test(alpha)\n",
" \n",
" # spectrum\n",
- " mmd=QuadraticTimeMMD(kernel, feat_p, feat_q)\n",
- " mmd.set_null_approximation_method(MMD2_SPECTRUM)\n",
- " mmd.set_num_eigenvalues_spectrum(num_eigen)\n",
- " mmd.set_num_samples_spectrum(num_samples)\n",
- " mmd.set_statistic_type(BIASED)\n",
+ " mmd=sg.QuadraticTimeMMD(feat_p, feat_q)\n",
+ " mmd.set_kernel(kernel)\n",
+ " mmd.set_null_approximation_method(sg.NAM_MMD2_SPECTRUM)\n",
+ " mmd.spectrum_set_num_eigenvalues(num_eigen)\n",
+ " mmd.set_num_null_samples(num_samples)\n",
+ " mmd.set_statistic_type(sg.ST_BIASED_FULL)\n",
" rejections_spectrum[i]=mmd.perform_test(alpha)\n",
" \n",
" # bootstrap (precompute kernel)\n",
- " mmd=QuadraticTimeMMD(kernel, feat_p, feat_q)\n",
+ " mmd=sg.QuadraticTimeMMD(feat_p, feat_q)\n",
" p_and_q=mmd.get_p_and_q()\n",
" kernel.init(p_and_q, p_and_q)\n",
- " precomputed_kernel=CustomKernel(kernel)\n",
+ " precomputed_kernel=sg.CustomKernel(kernel)\n",
" mmd.set_kernel(precomputed_kernel)\n",
- " mmd.set_null_approximation_method(PERMUTATION)\n",
+ " mmd.set_null_approximation_method(sg.NAM_PERMUTATION)\n",
" mmd.set_num_null_samples(num_samples)\n",
- " mmd.set_statistic_type(BIASED)\n",
+ " mmd.set_statistic_type(sg.ST_BIASED_FULL)\n",
" rejections_bootstrap[i]=mmd.perform_test(alpha)"
]
},
@@ -701,9 +667,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "So far, we basically had to precompute the kernel matrix for reasonable runtimes. This is not possible for more than a few thousand points. The linear time MMD statistic, implemented in CLinearTimeMMD can help here, as it accepts data under the streaming interface CStreamingFeatures, which deliver data one-by-one.\n",
+ "So far, we basically had to precompute the kernel matrix for reasonable runtimes. This is not possible for more than a few thousand points. The linear time MMD statistic, implemented in CLinearTimeMMD can help here, as it accepts data under the streaming interface CStreamingFeatures, which deliver data one-by-one.\n",
"\n",
- "And it can do more cool things, for example choose the best single (or combined) kernel for you. But we need a more fancy dataset for that to show its power. We will use one of Shogun's streaming based data generator, CGaussianBlobsDataGenerator for that. This dataset consists of two distributions which are a grid of Gaussians where in one of them, the Gaussians are stretched and rotated. This dataset is regarded as challenging for two-sample testing."
+ "It can also do more cool things, for example choose the best single (or combined) kernel for you. But we need a fancier dataset to show its power. We will use one of Shogun's streaming based data generators, CGaussianBlobsDataGenerator, for that. This dataset consists of two distributions which are a grid of Gaussians, where in one of them the Gaussians are stretched and rotated. This dataset is regarded as challenging for two-sample testing."
]
},
{
@@ -722,8 +688,8 @@
"angle=pi/4\n",
"\n",
"# these are streaming features\n",
- "gen_p=GaussianBlobsDataGenerator(num_blobs, distance, 1, 0)\n",
- "gen_q=GaussianBlobsDataGenerator(num_blobs, distance, stretch, angle)\n",
+ "gen_p=sg.GaussianBlobsDataGenerator(num_blobs, distance, 1, 0)\n",
+ "gen_q=sg.GaussianBlobsDataGenerator(num_blobs, distance, stretch, angle)\n",
"\t\t\n",
"# stream some data and plot\n",
"num_plot=1000\n",
@@ -755,7 +721,7 @@
"\n",
"where $ m_2=\\lfloor\\frac{m}{2} \\rfloor$. While the above expression assumes that $m$ data are available from each distribution, the statistic in general works in an online setting where features are obtained one by one. Since only pairs of four points are considered at once, this allows to compute it on data streams. In addition, the computational costs are linear in the number of samples that are considered from each distribution. These two properties make the linear time MMD very applicable for large scale two-sample tests. In theory, any number of samples can be processed -- time is the only limiting factor.\n",
"\n",
- "We begin by illustrating how to pass data to CLinearTimeMMD. In order not to loose performance due to overhead, it is possible to specify a block size for the data stream."
+ "We begin by illustrating how to pass data to CLinearTimeMMD. In order not to lose performance due to overhead, it is possible to specify a block size for the data stream."
]
},
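In numpy, averaging the h-values over non-overlapping sample pairs might look as follows. This is a plain illustration of the formula above under an assumed Gaussian kernel, not Shogun's CLinearTimeMMD, which additionally streams and blocks the data.

```python
import numpy as np

def gaussian_k(a, b, sigma=1.0):
    # Gaussian kernel between row-wise paired samples
    return np.exp(-np.sum((a - b) ** 2, axis=-1) / (2 * sigma ** 2))

def linear_time_mmd(X, Y, sigma=1.0):
    # split both sample sets into non-overlapping pairs (x_2i, x_2i+1)
    m2 = min(len(X), len(Y)) // 2
    x1, x2 = X[:2 * m2:2], X[1:2 * m2:2]
    y1, y2 = Y[:2 * m2:2], Y[1:2 * m2:2]
    # h(x, x', y, y') = k(x,x') + k(y,y') - k(x,y') - k(x',y)
    h = (gaussian_k(x1, x2, sigma) + gaussian_k(y1, y2, sigma)
         - gaussian_k(x1, y2, sigma) - gaussian_k(x2, y1, sigma))
    # unbiased estimate: average of the h-values
    return h.mean()
```

Since every sample is touched exactly once and nothing is stored, the cost is linear in the number of samples.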
{
@@ -769,7 +735,11 @@
"block_size=100\n",
"\n",
"# if features are already under the streaming interface, just pass them\n",
- "mmd=LinearTimeMMD(kernel, gen_p, gen_q, m, block_size)\n",
+ "mmd=sg.LinearTimeMMD(gen_p, gen_q)\n",
+ "mmd.set_kernel(kernel)\n",
+ "mmd.set_num_samples_p(m)\n",
+ "mmd.set_num_samples_q(m)\n",
+ "mmd.set_num_blocks_per_burst(block_size)\n",
"\n",
"# compute an unbiased estimate in linear time\n",
"statistic=mmd.compute_statistic()\n",
@@ -785,7 +755,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "Sometimes, one might want to use CLinearTimeMMD with data that is stored in memory. In that case, it is easy to data in the form of for example CStreamingDenseFeatures into CDenseFeatures."
+ "Sometimes, one might want to use CLinearTimeMMD with data that is stored in memory. In that case, it is easy to convert data in the form of, for example, CStreamingDenseFeatures into CDenseFeatures."
]
},
{
@@ -797,25 +767,21 @@
"outputs": [],
"source": [
"# data source\n",
- "gen_p=GaussianBlobsDataGenerator(num_blobs, distance, 1, 0)\n",
- "gen_q=GaussianBlobsDataGenerator(num_blobs, distance, stretch, angle)\n",
- "\n",
- "# retreive some points, store them as non-streaming data in memory\n",
- "data_p=gen_p.get_streamed_features(100)\n",
- "data_q=gen_q.get_streamed_features(data_p.get_num_vectors())\n",
- "print \"Number of data is %d\" % data_p.get_num_vectors()\n",
+ "gen_p=sg.GaussianBlobsDataGenerator(num_blobs, distance, 1, 0)\n",
+ "gen_q=sg.GaussianBlobsDataGenerator(num_blobs, distance, stretch, angle)\n",
"\n",
- "# cast data in memory as streaming features again (which now stream from the in-memory data)\n",
- "streaming_p=StreamingRealFeatures(data_p)\n",
- "streaming_q=StreamingRealFeatures(data_q)\n",
+ "num_samples=100\n",
+ "print \"Number of data is %d\" % num_samples\n",
"\n",
- "# it is important to start the internal parser to avoid deadlocks\n",
- "streaming_p.start_parser()\n",
- "streaming_q.start_parser()\n",
+ "# retrieve some points, store them as non-streaming data in memory\n",
+ "data_p=gen_p.get_streamed_features(num_samples)\n",
+ "data_q=gen_q.get_streamed_features(num_samples)\n",
"\n",
- "# example to create mmd (note that m can be maximum the number of data in memory)\n",
+ "# example to create mmd (note that num_samples can be at most the number of data points in memory)\n",
"\n",
- "mmd=LinearTimeMMD(GaussianKernel(10,1), streaming_p, streaming_q, data_p.get_num_vectors(), 1)\n",
+ "mmd=sg.LinearTimeMMD(data_p, data_q)\n",
+ "mmd.set_kernel(sg.GaussianKernel(10, 1))\n",
+ "mmd.set_num_blocks_per_burst(100)\n",
"print \"Linear time MMD statistic: %.2f\" % mmd.compute_statistic()"
]
},
@@ -830,7 +796,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "As for any two-sample test in Shogun, bootstrapping can be used to approximate the null distribution. This results in a consistent, but slow test. The number of samples to take is the only parameter. Note that since CLinearTimeMMD operates on streaming features, *new* data is taken from the stream in every iteration.\n",
+ "As for any two-sample test in Shogun, bootstrapping can be used to approximate the null distribution. This results in a consistent, but slow test. The number of samples to take is the only parameter. Note that since CLinearTimeMMD operates on streaming features, *new* data is taken from the stream in every iteration.\n",
"\n",
"Bootstrapping is not really necessary since there exists a fast and consistent estimate of the null-distribution. However, to ensure that any approximation is accurate, it should always be checked against bootstrapping at least once.\n",
"\n",
@@ -848,7 +814,7 @@
"\n",
"A normal distribution with this variance and zero mean can then be used as an approximation for the null-distribution. This results in a consistent test and is very fast. However, note that it is an approximation and its accuracy depends on the underlying data distributions. It is a good idea to compare to the bootstrapping approach first to determine an appropriate number of samples to use. This number is usually in the tens of thousands.\n",
"\n",
- "CLinearTimeMMD allows to approximate the null distribution in the same pass as computing the statistic itself (in linear time). This should always be used in practice since seperate calls of computing statistic and p-value will operator on different data from the stream. Below, we compute the test on a large amount of data (impossible to perform quadratic time MMD for this one as the kernel matrices cannot be stored in memory)"
+ "CLinearTimeMMD allows approximating the null distribution in the same pass as computing the statistic itself (in linear time). This should always be used in practice since separate calls of computing statistic and p-value will operate on different data from the stream. Below, we compute the test on a large amount of data (impossible to perform quadratic time MMD for this one as the kernel matrices cannot be stored in memory)"
]
},
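The h-values needed for the statistic directly yield the variance estimate, so the normal approximation of the null distribution comes almost for free in the same pass. A sketch of the idea in numpy (hypothetical names, Gaussian kernel assumed; in Shogun this corresponds to the NAM_MMD1_GAUSSIAN approximation):

```python
import math
import numpy as np

def gaussian_k(a, b, sigma=1.0):
    return np.exp(-np.sum((a - b) ** 2, axis=-1) / (2 * sigma ** 2))

def linear_mmd_gaussian_test(X, Y, sigma=1.0):
    # compute the h-values of the linear time MMD in one pass
    m2 = min(len(X), len(Y)) // 2
    x1, x2 = X[:2 * m2:2], X[1:2 * m2:2]
    y1, y2 = Y[:2 * m2:2], Y[1:2 * m2:2]
    h = (gaussian_k(x1, x2, sigma) + gaussian_k(y1, y2, sigma)
         - gaussian_k(x1, y2, sigma) - gaussian_k(x2, y1, sigma))
    statistic = h.mean()
    # null approximation: N(0, var(h) / m2), the variance of the mean
    std_null = math.sqrt(h.var(ddof=1) / m2)
    # one-sided p-value under the normal approximation
    p_value = 0.5 * math.erfc(statistic / (std_null * math.sqrt(2)))
    return statistic, p_value
```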
{
@@ -859,10 +825,15 @@
},
"outputs": [],
"source": [
- "mmd=LinearTimeMMD(kernel, gen_p, gen_q, m, block_size)\n",
+ "mmd=sg.LinearTimeMMD(gen_p, gen_q)\n",
+ "mmd.set_kernel(kernel)\n",
+ "mmd.set_num_samples_p(m)\n",
+ "mmd.set_num_samples_q(m)\n",
+ "mmd.set_num_blocks_per_burst(block_size)\n",
+ "\n",
"print \"m=%d samples from p and q\" % m\n",
"print \"Binary test result is: \" + (\"Rejection\" if mmd.perform_test(alpha) else \"No rejection\")\n",
- "print \"P-value test result is %.2f\" % mmd.perform_test()"
+ "print \"P-value test result is %.2f\" % mmd.compute_p_value(mmd.compute_statistic())"
]
},
{
@@ -880,38 +851,38 @@
"\\DeclareMathOperator*{\\argmax}{arg\\,max}$\n",
"Now which kernel do we actually use for our tests? So far, we just plugged in arbritary ones. However, for kernel two-sample testing, it is possible to do something more clever.\n",
"\n",
- "Shogun's kernel selection methods for MMD based two-sample tests are all based around [3, 4]. For the CLinearTimeMMD, [3] describes a way of selecting the *optimal* kernel in the sense that the test's type II error is minimised. For the linear time MMD, this is the method of choice. It is done via maximising the MMD statistic divided by its standard deviation and it is possible for single kernels and also for convex combinations of them. For the CQuadraticTimeMMD, the best method in literature is choosing the kernel that maximised the MMD statistic [4]. For convex combinations of kernels, this can be achieved via a $L2$ norm constraint. A detailed comparison of all methods on numerous datasets can be found in [5].\n",
+ "Shogun's kernel selection methods for MMD based two-sample tests are all based around [3, 4]. For the CLinearTimeMMD, [3] describes a way of selecting the *optimal* kernel in the sense that the test's type II error is minimised. For the linear time MMD, this is the method of choice. It is done via maximising the MMD statistic divided by its standard deviation and it is possible for single kernels and also for convex combinations of them. For the CQuadraticTimeMMD, the best method in the literature is choosing the kernel that maximises the MMD statistic [4]. For convex combinations of kernels, this can be achieved via an $L2$ norm constraint. A detailed comparison of all methods on numerous datasets can be found in [5].\n",
"\n",
- "MMD Kernel selection in Shogun always involves an implementation of the base class CMMDKernelSelection, which defines the interface for kernel selection. If combinations of kernel should be considered, there is a sub-class CMMDKernelSelectionComb. In addition, it involves setting up a number of baseline kernels $\\mathcal{K}$ to choose from/combine in the form of a CCombinedKernel. All methods compute their results for a fixed set of these baseline kernels. We later give an example how to use these classes after providing a list of available methods.\n",
+ "MMD kernel selection in Shogun always involves choosing one of the available kernel selection strategies, as well as setting up a number of baseline kernels to choose from or combine. All methods compute their results for a fixed set of these baseline kernels. We later give an example how to use these strategies after providing a list of available methods.\n",
"\n",
- " * CMMDKernelSelectionMedian Selects from a set CGaussianKernel instances the one whose width parameter is closest to the median of the pairwise distances in the data. The median is computed on a certain number of points from each distribution that can be specified as a parameter. Since the median is a stable statistic, one does not have to compute all pairwise distances but rather just a few thousands. This method a useful (and fast) heuristic that in many cases gives a good hint on where to start looking for Gaussian kernel widths. It is for example described in [1]. Note that it may fail badly in selecting a good kernel for certain problems.\n",
+ " * *KSM_MEDIAN_HEURISTIC*: Selects from a set of CGaussianKernel instances the one whose width parameter is closest to the median of the pairwise distances in the data. The median is computed on a certain number of points from each distribution that can be specified as a parameter. Since the median is a stable statistic, one does not have to compute all pairwise distances but rather just a few thousand. This method is a useful (and fast) heuristic that in many cases gives a good hint on where to start looking for Gaussian kernel widths. It is for example described in [1]. Note that it may fail badly in selecting a good kernel for certain problems.\n",
"\n",
- " * CMMDKernelSelectionMax Selects from a set of arbitrary baseline kernels a single one that maximises the used MMD statistic -- more specific its estimate.\n",
+ " * *KSM_MAXIMIZE_MMD*: Selects from a set of arbitrary baseline kernels a single one that maximises the used MMD statistic -- more specifically, its estimate.\n",
"$$\n",
"k^*=\\argmax_{k\\in\\mathcal{K}} \\hat \\eta_k,\n",
"$$\n",
"where $\\eta_k$ is an empirical MMD estimate for using a kernel $k$.\n",
"This was first described in [4] and was empirically shown to perform better than the median heuristic above. However, it remains a heuristic that comes with no guarantees. Since MMD estimates can be computed in linear and quadratic time, this method works for both methods. However, for the linear time statistic, there exists a better method.\n",
" \n",
- " * CMMDKernelSelectionOpt Selects the optimal single kernel from a set of baseline kernels. This is done via maximising the ratio of the linear MMD statistic and its standard deviation.\n",
+ " * *KSM_MAXIMIZE_POWER*: Selects the optimal single kernel from a set of baseline kernels. This is done via maximising the ratio of the linear MMD statistic and its standard deviation.\n",
"$$\n",
"k^*=\\argmax_{k\\in\\mathcal{K}} \\frac{\\hat \\eta_k}{\\hat\\sigma_k+\\lambda},\n",
"$$\n",
"where $\\eta_k$ is a linear time MMD estimate for using a kernel $k$ and $\\hat\\sigma_k$ is a linear time variance estimate of $\\eta_k$ to which a small number $\\lambda$ is added to prevent division by zero.\n",
- "These are estimated in a linear time way with the streaming framework that was described earlier. Therefore, this method is only available for CLinearTimeMMD. Optimal here means that the resulting test's type II error is minimised for a fixed type I error. *Important:* For this method to work, the kernel needs to be selected on *different* data than the test is performed on. Otherwise, the method will produce wrong results.\n",
+ "These are estimated in a linear time way with the streaming framework that was described earlier. Therefore, this method is only available for CLinearTimeMMD. Optimal here means that the resulting test's type II error is minimised for a fixed type I error. *Important:* For this method to work, the kernel needs to be selected on *different* data than the test is performed on. Otherwise, the method will produce wrong results.\n",
" \n",
- " * CMMDKernelSelectionCombMaxL2 Selects a convex combination of kernels that maximises the MMD statistic. This is the multiple kernel analogous to CMMDKernelSelectionMax. This is done via solving the convex program\n",
+ " * CMMDKernelSelectionCombMaxL2 Selects a convex combination of kernels that maximises the MMD statistic. This is the multiple kernel analogue of CMMDKernelSelectionMax. This is done via solving the convex program\n",
"$$\n",
"\\boldsymbol{\\beta}^*=\\min_{\\boldsymbol{\\beta}} \\{\\boldsymbol{\\beta}^T\\boldsymbol{\\beta} : \\boldsymbol{\\beta}^T\\boldsymbol{\\eta}=\\mathbf{1}, \\boldsymbol{\\beta}\\succeq 0\\},\n",
"$$\n",
"where $\\boldsymbol{\\beta}$ is a vector of the resulting kernel weights and $\\boldsymbol{\\eta}$ is a vector of which each component contains a MMD estimate for a baseline kernel. See [3] for details. Note that this method is unable to select a single kernel -- even when this would be optimal.\n",
"Again, when using the linear time MMD, there are better methods available.\n",
"\n",
- " * CMMDKernelSelectionCombOpt Selects a convex combination of kernels that maximises the MMD statistic divided by its covariance. This corresponds to \\emph{optimal} kernel selection in the same sense as in class CMMDKernelSelectionOpt and is its multiple kernel analogous. The convex program to solve is\n",
+ " * CMMDKernelSelectionCombOpt Selects a convex combination of kernels that maximises the MMD statistic divided by its covariance. This corresponds to \\emph{optimal} kernel selection in the same sense as in class CMMDKernelSelectionOpt and is its multiple kernel analogue. The convex program to solve is\n",
"$$\n",
"\\boldsymbol{\\beta}^*=\\min_{\\boldsymbol{\\beta}} (\\hat Q+\\lambda I) \\{\\boldsymbol{\\beta}^T\\boldsymbol{\\beta} : \\boldsymbol{\\beta}^T\\boldsymbol{\\eta}=\\mathbf{1}, \\boldsymbol{\\beta}\\succeq 0\\},\n",
"$$\n",
- "where again $\\boldsymbol{\\beta}$ is a vector of the resulting kernel weights and $\\boldsymbol{\\eta}$ is a vector of which each component contains a MMD estimate for a baseline kernel. The matrix $\\hat Q$ is a linear time estimate of the covariance matrix of the vector $\\boldsymbol{\\eta}$ to whose diagonal a small number $\\lambda$ is added to prevent division by zero. See [3] for details. In contrast to CMMDKernelSelectionCombMaxL2, this method is able to select a single kernel when this gives a lower type II error than a combination. In this sense, it contains CMMDKernelSelectionOpt."
+ "where again $\\boldsymbol{\\beta}$ is a vector of the resulting kernel weights and $\\boldsymbol{\\eta}$ is a vector of which each component contains a MMD estimate for a baseline kernel. The matrix $\\hat Q$ is a linear time estimate of the covariance matrix of the vector $\\boldsymbol{\\eta}$ to whose diagonal a small number $\\lambda$ is added to prevent division by zero. See [3] for details. In contrast to CMMDKernelSelectionCombMaxL2, this method is able to select a single kernel when this gives a lower type II error than a combination. In this sense, it contains CMMDKernelSelectionOpt."
]
},
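For a single Gaussian kernel, the ratio criterion behind *KSM_MAXIMIZE_POWER* can be sketched in a few lines of numpy: compute the linear-time h-values for each candidate bandwidth and maximise the mean divided by the standard deviation of the estimate. This is a hypothetical standalone sketch; as stressed above, the selection must run on different data than the final test.

```python
import numpy as np

def gaussian_k(a, b, sigma):
    return np.exp(-np.sum((a - b) ** 2, axis=-1) / (2 * sigma ** 2))

def select_width_max_power(X, Y, sigmas, lmbda=1e-5):
    m2 = min(len(X), len(Y)) // 2
    x1, x2 = X[:2 * m2:2], X[1:2 * m2:2]
    y1, y2 = Y[:2 * m2:2], Y[1:2 * m2:2]
    ratios = []
    for sigma in sigmas:
        h = (gaussian_k(x1, x2, sigma) + gaussian_k(y1, y2, sigma)
             - gaussian_k(x1, y2, sigma) - gaussian_k(x2, y1, sigma))
        # criterion: MMD estimate divided by the std of the estimate,
        # with a small lambda against division by zero
        ratios.append(h.mean() / (h.std(ddof=1) / np.sqrt(m2) + lmbda))
    return sigmas[int(np.argmax(ratios))]
```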
{
@@ -925,11 +896,11 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "In order to use one of the above methods for kernel selection, one has to create a new instance of CCombinedKernel append all desired baseline kernels to it. This combined kernel is then passed to the MMD class. Then, an object of any of the above kernel selection methods is created and the MMD instance is passed to it in the constructor. There are then multiple methods to call\n",
+ "In order to use one of the above methods for kernel selection, one has to create a new instance of CCombinedKernel and append all desired baseline kernels to it. This combined kernel is then passed to the MMD class. Then, an object of any of the above kernel selection methods is created and the MMD instance is passed to it in the constructor. There are then multiple methods to call\n",
"\n",
" * *compute_measures* to compute a vector of kernel selection criteria if a single kernel selection method is used. It will return a vector of selected kernel weights if a combined kernel selection method is used. For \\shogunclass{CMMDKernelSelectionMedian}, the method does throw an error.\n",
"\n",
- " * *select\\_kernel* returns the selected kernel of the method. For single kernels this will be one of the baseline kernel instances. For the combined kernel case, this will be the underlying CCombinedKernel instance where the subkernel weights are set to the weights that were selected by the method. \n",
+ " * *select\\_kernel* returns the selected kernel of the method. For single kernels this will be one of the baseline kernel instances. For the combined kernel case, this will be the underlying CCombinedKernel instance where the subkernel weights are set to the weights that were selected by the method. \n",
"\n",
"In order to utilise the selected kernel, it has to be passed to an MMD instance. We now give an example how to select the optimal single and combined kernel for the Gaussian Blobs dataset."
]
@@ -949,22 +920,29 @@
},
"outputs": [],
"source": [
- "sigmas=[2**x for x in linspace(-5,5, 10)]\n",
+ "# mmd instance using streaming features\n",
+ "mmd=sg.LinearTimeMMD(gen_p, gen_q)\n",
+ "mmd.set_num_samples_p(m)\n",
+ "mmd.set_num_samples_q(m)\n",
+ "mmd.set_num_blocks_per_burst(block_size)\n",
+ "\n",
+ "sigmas=[2**x for x in np.linspace(-5, 5, 11)]\n",
"print \"Choosing kernel width from\", [\"{0:.2f}\".format(sigma) for sigma in sigmas]\n",
- "combined=CombinedKernel()\n",
- "for i in range(len(sigmas)):\n",
- " combined.append_kernel(GaussianKernel(10, sigmas[i]))\n",
"\n",
- "# mmd instance using streaming features\n",
- "block_size=1000\n",
- "mmd=LinearTimeMMD(combined, gen_p, gen_q, m, block_size)\n",
+ "for i in range(len(sigmas)):\n",
+ " mmd.add_kernel(sg.GaussianKernel(10, sigmas[i]))\n",
"\n",
"# optimal kernel choice is possible for linear time MMD\n",
- "selection=MMDKernelSelectionOpt(mmd)\n",
+ "mmd.set_kernel_selection_strategy(sg.KSM_MAXIMIZE_POWER)\n",
+ "\n",
+ "# must be set true for kernel selection\n",
+ "mmd.set_train_test_mode(True)\n",
"\n",
"# select best kernel\n",
- "best_kernel=selection.select_kernel()\n",
- "best_kernel=GaussianKernel.obtain_from_generic(best_kernel)\n",
+ "mmd.select_kernel()\n",
+ "\n",
+ "best_kernel=mmd.get_kernel()\n",
+ "best_kernel=sg.GaussianKernel.obtain_from_generic(best_kernel)\n",
"print \"Best single kernel has bandwidth %.2f\" % best_kernel.get_width()"
]
},
@@ -983,10 +961,8 @@
},
"outputs": [],
"source": [
- "alpha=0.05\n",
- "mmd=LinearTimeMMD(best_kernel, gen_p, gen_q, m, block_size)\n",
- "mmd.set_null_approximation_method(MMD1_GAUSSIAN);\n",
- "p_value_best=mmd.perform_test();\n",
+ "mmd.set_null_approximation_method(sg.NAM_MMD1_GAUSSIAN);\n",
+ "p_value_best=mmd.compute_p_value(mmd.compute_statistic());\n",
"\n",
"print \"Bootstrapping: P-value of MMD test with optimal kernel is %.2f\" % p_value_best"
]
@@ -1006,19 +982,20 @@
},
"outputs": [],
"source": [
- "mmd=LinearTimeMMD(best_kernel, gen_p, gen_q, 5000, block_size)\n",
+ "m=5000\n",
+ "mmd.set_num_samples_p(m)\n",
+ "mmd.set_num_samples_q(m)\n",
+ "mmd.set_train_test_mode(False)\n",
"num_samples=500\n",
"\n",
"# sample null and alternative distribution, implicitly generate new data for that\n",
- "null_samples=zeros(num_samples)\n",
+ "mmd.set_null_approximation_method(sg.NAM_PERMUTATION)\n",
+ "mmd.set_num_null_samples(num_samples)\n",
+ "null_samples=mmd.sample_null()\n",
+ "\n",
"alt_samples=zeros(num_samples)\n",
"for i in range(num_samples):\n",
- " alt_samples[i]=mmd.compute_statistic()\n",
- " \n",
- " # tell MMD to merge data internally while streaming\n",
- " mmd.set_simulate_h0(True)\n",
- " null_samples[i]=mmd.compute_statistic()\n",
- " mmd.set_simulate_h0(False)"
+ " alt_samples[i]=mmd.compute_statistic()"
]
},
{
@@ -1103,7 +1080,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
- "version": "2.7.12"
+ "version": "2.7.10"
}
},
"nbformat": 4,
diff --git a/examples/meta/generator/targets/cpp.json b/examples/meta/generator/targets/cpp.json
index d512e30b309..73a224161e0 100644
--- a/examples/meta/generator/targets/cpp.json
+++ b/examples/meta/generator/targets/cpp.json
@@ -86,7 +86,7 @@
"MethodCall": "$object->$method($arguments)",
"StaticCall": "C$typeName::$method($arguments)",
"Identifier": "$identifier",
- "Enum":"$value"
+ "Enum":"$typeName::$value"
},
"Element": {
"Access": {
diff --git a/examples/meta/src/statistical_testing/linear_time_mmd.sg b/examples/meta/src/statistical_testing/linear_time_mmd.sg
new file mode 100644
index 00000000000..97e93642b83
--- /dev/null
+++ b/examples/meta/src/statistical_testing/linear_time_mmd.sg
@@ -0,0 +1,59 @@
+GaussianBlobsDataGenerator features_p()
+GaussianBlobsDataGenerator features_q()
+
+#![create_instance]
+LinearTimeMMD mmd()
+GaussianKernel kernel(10, 1)
+mmd.set_kernel(kernel)
+mmd.set_p(features_p)
+mmd.set_q(features_q)
+mmd.set_num_samples_p(1000)
+mmd.set_num_samples_q(1000)
+real alpha = 0.05
+#![create_instance]
+
+#![set_burst]
+mmd.set_num_blocks_per_burst(1000)
+#![set_burst]
+
+#![estimate_mmd]
+real statistic = mmd.compute_statistic()
+#![estimate_mmd]
+
+#![perform_test]
+real threshold = mmd.compute_threshold(alpha)
+real p_value = mmd.compute_p_value(statistic)
+#![perform_test]
+
+#![add_kernels]
+GaussianKernel kernel1(10, 0.1)
+GaussianKernel kernel2(10, 1)
+GaussianKernel kernel3(10, 10)
+mmd.add_kernel(kernel1)
+mmd.add_kernel(kernel2)
+mmd.add_kernel(kernel3)
+#![add_kernels]
+
+#![enable_train_test_mode]
+mmd.set_train_test_mode(True)
+mmd.set_train_test_ratio(1)
+#![enable_train_test_mode]
+
+#![select_kernel_single]
+mmd.set_kernel_selection_strategy(enum EKernelSelectionMethod.KSM_MAXIMIZE_POWER)
+mmd.select_kernel()
+GaussianKernel learnt_kernel_single = GaussianKernel:obtain_from_generic(mmd.get_kernel())
+real width = learnt_kernel_single.get_width()
+#![select_kernel_single]
+
+#![select_kernel_combined]
+mmd.set_kernel_selection_strategy(enum EKernelSelectionMethod.KSM_MAXIMIZE_POWER, True)
+mmd.select_kernel()
+CombinedKernel learnt_kernel_combined = CombinedKernel:obtain_from_generic(mmd.get_kernel())
+RealVector weights = learnt_kernel_combined.get_subkernel_weights()
+#![select_kernel_combined]
+
+#![perform_test_optimized]
+real statistic_optimized = mmd.compute_statistic()
+real p_value_optimized = mmd.compute_p_value(statistic_optimized)
+#![perform_test_optimized]
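The statistic that `compute_statistic` returns for the linear time MMD averages an h-statistic over disjoint sample pairs. A minimal NumPy sketch of the unbiased linear-time estimate, assuming a Gaussian kernel of the Shogun form exp(-||x-y||^2/width) (function names are illustrative, not Shogun API):

```python
import numpy as np

def gaussian_kernel(X, Y, width=2.0):
    # elementwise kernel between matched rows of X and Y
    return np.exp(-((X - Y) ** 2).sum(axis=-1) / width)

def linear_time_mmd(X, Y, width=2.0):
    # unbiased streaming estimate: average h over disjoint consecutive pairs
    n = (min(len(X), len(Y)) // 2) * 2
    x1, x2 = X[0:n:2], X[1:n:2]
    y1, y2 = Y[0:n:2], Y[1:n:2]
    h = (gaussian_kernel(x1, x2, width) + gaussian_kernel(y1, y2, width)
         - gaussian_kernel(x1, y2, width) - gaussian_kernel(x2, y1, width))
    return float(h.mean())
```

Each sample is touched only once, which is why the estimator works on streaming features and why larger burst sizes only affect efficiency, not the result.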
diff --git a/examples/meta/src/statistical_testing/quadratic_time_mmd.sg b/examples/meta/src/statistical_testing/quadratic_time_mmd.sg
new file mode 100644
index 00000000000..e6ec822ab40
--- /dev/null
+++ b/examples/meta/src/statistical_testing/quadratic_time_mmd.sg
@@ -0,0 +1,71 @@
+CSVFile f_features_p("../../data/two_sample_test_gaussian.dat")
+CSVFile f_features_q("../../data/two_sample_test_laplace.dat")
+
+#![create_features]
+RealFeatures features_p(f_features_p)
+RealFeatures features_q(f_features_q)
+#![create_features]
+
+#![create_instance]
+QuadraticTimeMMD mmd(features_p, features_q)
+GaussianKernel kernel(10, 1)
+mmd.set_kernel(kernel)
+real alpha = 0.05
+#![create_instance]
+
+#![estimate_mmd]
+mmd.set_statistic_type(enum EStatisticType.ST_BIASED_FULL)
+real statistic = mmd.compute_statistic()
+#![estimate_mmd]
+
+#![perform_test]
+mmd.set_null_approximation_method(enum ENullApproximationMethod.NAM_PERMUTATION)
+mmd.set_num_null_samples(200)
+real threshold = mmd.compute_threshold(alpha)
+real p_value = mmd.compute_p_value(statistic)
+#![perform_test]
+
+#![add_kernels]
+GaussianKernel kernel1(10, 0.1)
+GaussianKernel kernel2(10, 1)
+GaussianKernel kernel3(10, 10)
+mmd.add_kernel(kernel1)
+mmd.add_kernel(kernel2)
+mmd.add_kernel(kernel3)
+#![add_kernels]
+
+#![multi_kernel]
+MultiKernelQuadraticTimeMMD mk = mmd.multikernel()
+mk.add_kernel(kernel1)
+mk.add_kernel(kernel2)
+mk.add_kernel(kernel3)
+
+RealVector mk_statistic = mk.compute_statistic()
+RealVector mk_p_value = mk.compute_p_value()
+#![multi_kernel]
+
+#![enable_train_test_mode]
+mmd.set_train_test_mode(True)
+mmd.set_train_test_ratio(1)
+#![enable_train_test_mode]
+
+#![select_kernel_single]
+int num_runs = 1
+int num_folds = 3
+mmd.set_kernel_selection_strategy(enum EKernelSelectionMethod.KSM_CROSS_VALIDATION, num_runs, num_folds, alpha)
+mmd.select_kernel()
+GaussianKernel learnt_kernel_single = GaussianKernel:obtain_from_generic(mmd.get_kernel())
+real width = learnt_kernel_single.get_width()
+#![select_kernel_single]
+
+#![select_kernel_combined]
+mmd.set_kernel_selection_strategy(enum EKernelSelectionMethod.KSM_MAXIMIZE_MMD, True)
+mmd.select_kernel()
+CombinedKernel learnt_kernel_combined = CombinedKernel:obtain_from_generic(mmd.get_kernel())
+RealVector weights = learnt_kernel_combined.get_subkernel_weights()
+#![select_kernel_combined]
+
+#![perform_test_optimized]
+real statistic_optimized = mmd.compute_statistic()
+real p_value_optimized = mmd.compute_p_value(statistic_optimized)
+#![perform_test_optimized]
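Once null samples are available, `compute_p_value` and `compute_threshold` reduce to simple tail computations on the empirical null distribution. A hedged NumPy sketch (function names are illustrative):

```python
import numpy as np

def p_value_from_null(statistic, null_samples):
    # right-tailed p-value: fraction of null samples at or above the observation
    null_samples = np.asarray(null_samples)
    return float((null_samples >= statistic).mean())

def threshold_from_null(alpha, null_samples):
    # test threshold: the (1 - alpha) quantile of the null distribution
    return float(np.quantile(np.asarray(null_samples), 1.0 - alpha))
```

Rejecting when the statistic exceeds the threshold and rejecting when the p-value falls below :math:`\alpha` are two views of the same decision.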
diff --git a/examples/undocumented/libshogun/statistics_hsic.cpp b/examples/undocumented/libshogun/statistics_hsic.cpp
deleted file mode 100644
index 196d9874a90..00000000000
--- a/examples/undocumented/libshogun/statistics_hsic.cpp
+++ /dev/null
@@ -1,172 +0,0 @@
-/*
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 3 of the License, or
- * (at your option) any later version.
- *
- * Written (W) 2012 Heiko Strathmann
- */
-
-#include
-#include
-#include
-#include
-#include
-
-using namespace shogun;
-
-void create_fixed_data_kernel_small(CFeatures*& features_p,
- CFeatures*& features_q, CKernel*& kernel_p, CKernel*& kernel_q)
-{
- index_t m=2;
- index_t d=3;
-
- SGMatrix<float64_t> p(d,2*m);
- for (index_t i=0; i<2*d*m; ++i)
- p.matrix[i]=i;
-
-// p.display_matrix("p");
-
- SGMatrix<float64_t> q(d,2*m);
- for (index_t i=0; i<2*d*m; ++i)
- q.matrix[i]=i+10;
-
-// q.display_matrix("q");
-
- features_p=new CDenseFeatures<float64_t>(p);
- features_q=new CDenseFeatures<float64_t>(q);
-
- float64_t sigma_x=2;
- float64_t sigma_y=3;
- float64_t sq_sigma_x_twice=sigma_x*sigma_x*2;
- float64_t sq_sigma_y_twice=sigma_y*sigma_y*2;
-
- /* shoguns kernel width is different */
- kernel_p=new CGaussianKernel(10, sq_sigma_x_twice);
- kernel_q=new CGaussianKernel(10, sq_sigma_y_twice);
-}
-
-void create_fixed_data_kernel_big(CFeatures*& features_p,
- CFeatures*& features_q, CKernel*& kernel_p, CKernel*& kernel_q)
-{
- index_t m=10;
- index_t d=7;
-
- SGMatrix<float64_t> p(d,m);
- for (index_t i=0; i<d*m; ++i)
- p.matrix[i]=CMath::randn_double();
-
- SGMatrix<float64_t> q(d,m);
- for (index_t i=0; i<d*m; ++i)
- q.matrix[i]=CMath::randn_double();
-
- features_p=new CDenseFeatures<float64_t>(p);
- features_q=new CDenseFeatures<float64_t>(q);
-
- float64_t sigma_x=2;
- float64_t sigma_y=3;
- float64_t sq_sigma_x_twice=sigma_x*sigma_x*2;
- float64_t sq_sigma_y_twice=sigma_y*sigma_y*2;
-
- /* shoguns kernel width is different */
- kernel_p=new CGaussianKernel(10, sq_sigma_x_twice);
- kernel_q=new CGaussianKernel(10, sq_sigma_y_twice);
-}
-
-/** tests the hsic statistic for a single fixed data case and ensures
- * equality with sma implementation */
-void test_hsic_fixed()
-{
- CFeatures* features_p=NULL;
- CFeatures* features_q=NULL;
- CKernel* kernel_p=NULL;
- CKernel* kernel_q=NULL;
- create_fixed_data_kernel_small(features_p, features_q, kernel_p, kernel_q);
-
- index_t m=features_p->get_num_vectors();
-
- CHSIC* hsic=new CHSIC(kernel_p, kernel_q, features_p, features_q);
-
- /* assert matlab result, note that compute statistic computes m*hsic */
- float64_t difference=hsic->compute_statistic();
- SG_SPRINT("hsic fixed: %f\n", difference);
- ASSERT(CMath::abs(difference-m*0.164761446385339)<10E-16);
-
-
- SG_UNREF(hsic);
-}
-
-void test_hsic_gamma()
-{
- CFeatures* features_p=NULL;
- CFeatures* features_q=NULL;
- CKernel* kernel_p=NULL;
- CKernel* kernel_q=NULL;
- create_fixed_data_kernel_big(features_p, features_q, kernel_p, kernel_q);
-
- CHSIC* hsic=new CHSIC(kernel_p, kernel_q, features_p, features_q);
-
- hsic->set_null_approximation_method(HSIC_GAMMA);
- float64_t p=hsic->compute_p_value(0.05);
- SG_SPRINT("p-value: %f\n", p);
-
- // disabled as I think previous inverse_gamma_cdf was faulty
- // now unit test fails. Needs to be investigated statistically
- //ASSERT(CMath::abs(p-0.172182287884256)<10E-15);
-
- SG_UNREF(hsic);
-}
-
-void test_hsic_sample_null()
-{
- CFeatures* features_p=NULL;
- CFeatures* features_q=NULL;
- CKernel* kernel_p=NULL;
- CKernel* kernel_q=NULL;
- create_fixed_data_kernel_big(features_p, features_q, kernel_p, kernel_q);
-
- CHSIC* hsic=new CHSIC(kernel_p, kernel_q, features_p, features_q);
-
- /* do sampling null */
- hsic->set_null_approximation_method(PERMUTATION);
- float64_t p=hsic->compute_p_value(0.05);
- SG_SPRINT("p-value: %f\n", p);
-
- /* ensure that sampling null of hsic leads to same results as using
- * CKernelIndependenceTest */
- CMath::init_random(1);
- float64_t mean1=CStatistics::mean(hsic->sample_null());
- float64_t var1=CStatistics::variance(hsic->sample_null());
- SG_SPRINT("mean1=%f, var1=%f\n", mean1, var1);
-
- CMath::init_random(1);
- float64_t mean2=CStatistics::mean(
- hsic->CKernelIndependenceTest::sample_null());
- float64_t var2=CStatistics::variance(hsic->sample_null());
- SG_SPRINT("mean2=%f, var2=%f\n", mean2, var2);
-
- /* assert than results are the same from bot sampling null impl. */
- ASSERT(CMath::abs(mean1-mean2)<10E-8);
- ASSERT(CMath::abs(var1-var2)<10E-8);
-
- SG_UNREF(hsic);
-}
-
-int main(int argc, char** argv)
-{
- init_shogun_with_defaults();
-
-// sg_io->set_loglevel(MSG_DEBUG);
-
- test_hsic_fixed();
- test_hsic_gamma();
- test_hsic_sample_null();
-
- exit_shogun();
- return 0;
-}
-
diff --git a/examples/undocumented/libshogun/statistics_linear_time_mmd.cpp b/examples/undocumented/libshogun/statistics_linear_time_mmd.cpp
deleted file mode 100644
index 3687cd49d1a..00000000000
--- a/examples/undocumented/libshogun/statistics_linear_time_mmd.cpp
+++ /dev/null
@@ -1,93 +0,0 @@
-/*
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 3 of the License, or
- * (at your option) any later version.
- *
- * Written (W) 2013 Heiko Strathmann
- */
-
-#include
-#include
-#include
-#include
-#include
-#include
-
-using namespace shogun;
-
-void linear_time_mmd()
-{
- /* note that the linear time statistic is designed for much larger datasets
- * so increase to get reasonable results */
- index_t m=1000;
- index_t dim=2;
- float64_t difference=0.5;
-
- /* streaming data generator for mean shift distributions */
- CMeanShiftDataGenerator* gen_p=new CMeanShiftDataGenerator(0, dim);
- CMeanShiftDataGenerator* gen_q=new CMeanShiftDataGenerator(difference, dim);
-
- /* set kernel a-priori. usually one would do some kernel selection. See
- * other examples for this. */
- float64_t width=10;
- CGaussianKernel* kernel=new CGaussianKernel(10, width);
-
- /* create linear time mmd instance */
- index_t blocksize=1000;
- CLinearTimeMMD* mmd=new CLinearTimeMMD(kernel, gen_p, gen_q, m, blocksize);
-
- /* perform test: compute p-value and test if null-hypothesis is rejected for
- * a test level of 0.05 */
- float64_t alpha=0.05;
-
- /* using bootstrapping (not reccomended for linear time MMD, since slow).
- * Also, in practice, use at least 250 iterations */
- mmd->set_null_approximation_method(PERMUTATION);
- mmd->set_num_null_samples(10);
- float64_t p_value_bootstrap=mmd->perform_test();
- /* reject if p-value is smaller than test level */
- SG_SPRINT("bootstrap: p!=q: %d\n", p_value_bootstrap<alpha);
-
- /* using Gaussian approximation of null distribution */
- mmd->set_null_approximation_method(MMD1_GAUSSIAN);
- float64_t p_value_gaussian=mmd->perform_test();
- /* reject if p-value is smaller than test level */
- SG_SPRINT("gaussian approx: p!=q: %d\n", p_value_gaussian<alpha);
-
- index_t num_trials=5;
- SGVector<float64_t> typeIerrors(num_trials);
- SGVector<float64_t> typeIIerrors(num_trials);
- for (index_t i=0; i<num_trials; ++i)
- {
- /* this effectively means that p=q - rejecting is tpye I error */
- mmd->set_simulate_h0(true);
- typeIerrors[i]=mmd->perform_test()>alpha;
- mmd->set_simulate_h0(false);
-
- typeIIerrors[i]=mmd->perform_test()>alpha;
- }
-
- SG_SPRINT("type I error: %f\n", CStatistics::mean(typeIerrors));
- SG_SPRINT("type II error: %f\n", CStatistics::mean(typeIIerrors));
-
- SG_UNREF(mmd);
-}
-
-int main(int argc, char** argv)
-{
- init_shogun_with_defaults();
-// sg_io->set_loglevel(MSG_DEBUG);
-
- linear_time_mmd();
-
- exit_shogun();
- return 0;
-}
-
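The deleted example estimates type I and type II errors by repeated testing, simulating :math:`H_0` via `set_simulate_h0`. The bookkeeping reduces to a rejection rate over trials; a minimal sketch (function name illustrative):

```python
def rejection_rate(p_values, alpha=0.05):
    """Fraction of trials whose p-value falls below the test level.

    Applied to trials simulated under H0 this estimates the type I error;
    applied to trials under H1 it estimates the power, whose complement
    is the type II error.
    """
    p_values = list(p_values)
    return sum(p < alpha for p in p_values) / len(p_values)
```

A well-calibrated test should give a rejection rate close to :math:`\alpha` on H0 trials.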
diff --git a/examples/undocumented/libshogun/statistics_mmd_kernel_selection.cpp b/examples/undocumented/libshogun/statistics_mmd_kernel_selection.cpp
deleted file mode 100644
index a3f5df0764f..00000000000
--- a/examples/undocumented/libshogun/statistics_mmd_kernel_selection.cpp
+++ /dev/null
@@ -1,216 +0,0 @@
-/*
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 3 of the License, or
- * (at your option) any later version.
- *
- * Written (W) 2012 Heiko Strathmann
- */
-
-#include
-#include
-#include
-#ifdef USE_GPL_SHOGUN
-#include
-#include
-#endif //USE_GPL_SHOGUN
-#include
-#include
-#include
-#include
-#include
-#include
-#include
-#include
-#include
-
-using namespace shogun;
-
-void kernel_choice_linear_time_mmd_opt_single()
-{
- /* Note that the linear time mmd is designed for large datasets. Results on
- * this small number will be bad (unstable, type I error wrong) */
- index_t m=1000;
- index_t num_blobs=3;
- float64_t distance=3;
- float64_t stretch=10;
- float64_t angle=CMath::PI/4;
-
- CGaussianBlobsDataGenerator* gen_p=new CGaussianBlobsDataGenerator(
- num_blobs, distance, stretch, angle);
-
- CGaussianBlobsDataGenerator* gen_q=new CGaussianBlobsDataGenerator(
- num_blobs, distance, 1, 1);
-
- /* create kernels */
- CCombinedKernel* combined=new CCombinedKernel();
- float64_t sigma_from=-3;
- float64_t sigma_to=10;
- float64_t sigma_step=1;
- float64_t sigma=sigma_from;
- while (sigma<=sigma_to)
- {
- /* shoguns kernel width is different */
- float64_t width=CMath::pow(2.0, sigma);
- float64_t sq_width_twice=width*width*2;
- combined->append_kernel(new CGaussianKernel(10, sq_width_twice));
- sigma+=sigma_step;
- }
-
- /* create MMD instance */
- CLinearTimeMMD* mmd=new CLinearTimeMMD(combined, gen_p, gen_q, m);
-
- /* kernel selection instance with regularisation term. May be replaced by
- * other methods for selecting single kernels */
- CMMDKernelSelectionOpt* selection=
- new CMMDKernelSelectionOpt(mmd, 10E-5);
-//
- /* select kernel that maximised MMD */
-// CMMDKernelSelectionMax* selection=
-// new CMMDKernelSelectionMax(mmd);
-
-// /* select kernel with width closest to median data distance */
-// CMMDKernelSelectionMedian* selection=
-// new CMMDKernelSelectionMedian(mmd, 10E-5);
-
- /* compute measures.
- * For Opt: ratio of MMD and standard deviation
- * For Max: MMDs of single kernels
- * for Medigan: Does not work! */
- SG_SPRINT("computing ratios\n");
- SGVector<float64_t> ratios=selection->compute_measures();
- ratios.display_vector("ratios");
-
- /* select kernel using the maximum ratio (and cast) */
- SG_SPRINT("selecting kernel\n");
- CKernel* selected=selection->select_kernel();
- CGaussianKernel* casted=CGaussianKernel::obtain_from_generic(selected);
- SG_SPRINT("selected kernel width: %f\n", casted->get_width());
- mmd->set_kernel(selected);
- SG_UNREF(casted);
- SG_UNREF(selected);
-
- mmd->set_null_approximation_method(MMD1_GAUSSIAN);
-
- /* compute tpye I and II error (use many more trials). Type I error is only
- * estimated to check MMD1_GAUSSIAN method for estimating the null
- * distribution. Note that testing has to happen on difference data than
- * kernel selecting, but the linear time mmd does this implicitly */
- float64_t alpha=0.05;
- index_t num_trials=5;
- SGVector<float64_t> typeIerrors(num_trials);
- SGVector<float64_t> typeIIerrors(num_trials);
- for (index_t i=0; i<num_trials; ++i)
- {
- /* this effectively means that p=q - rejecting is tpye I error */
- mmd->set_simulate_h0(true);
- typeIerrors[i]=mmd->perform_test()>alpha;
- mmd->set_simulate_h0(false);
-
- typeIIerrors[i]=mmd->perform_test()>alpha;
- }
-
- SG_SPRINT("type I error: %f\n", CStatistics::mean(typeIerrors));
- SG_SPRINT("type II error: %f\n", CStatistics::mean(typeIIerrors));
-
-
- SG_UNREF(selection);
-}
-
-void kernel_choice_linear_time_mmd_opt_comb()
-{
-#ifdef USE_GPL_SHOGUN
- /* Note that the linear time mmd is designed for large datasets. Results on
- * this small number will be bad (unstable, type I error wrong) */
- index_t m=1000;
- index_t num_blobs=3;
- float64_t distance=3;
- float64_t stretch=10;
- float64_t angle=CMath::PI/4;
-
- CGaussianBlobsDataGenerator* gen_p=new CGaussianBlobsDataGenerator(
- num_blobs, distance, stretch, angle);
-
- CGaussianBlobsDataGenerator* gen_q=new CGaussianBlobsDataGenerator(
- num_blobs, distance, 1, 1);
-
- /* create kernels */
- CCombinedKernel* combined=new CCombinedKernel();
- float64_t sigma_from=-3;
- float64_t sigma_to=10;
- float64_t sigma_step=1;
- float64_t sigma=sigma_from;
- index_t num_kernels=0;
- while (sigma<=sigma_to)
- {
- /* shoguns kernel width is different */
- float64_t width=CMath::pow(2.0, sigma);
- float64_t sq_width_twice=width*width*2;
- combined->append_kernel(new CGaussianKernel(10, sq_width_twice));
- sigma+=sigma_step;
- num_kernels++;
- }
-
- /* create MMD instance */
- CLinearTimeMMD* mmd=new CLinearTimeMMD(combined, gen_p, gen_q, m);
-
- /* kernel selection instance with regularisation term. May be replaced by
- * other methods for selecting single kernels */
- CMMDKernelSelectionCombOpt* selection=
- new CMMDKernelSelectionCombOpt(mmd, 10E-5);
-
- /* maximise L2 regularised MMD */
-// CMMDKernelSelectionCombMaxL2* selection=
-// new CMMDKernelSelectionCombMaxL2(mmd, 10E-5);
-
- /* select kernel (does the same as above, but sets weights to kernel) */
- SG_SPRINT("selecting kernel\n");
- CKernel* selected=selection->select_kernel();
- CCombinedKernel* casted=CCombinedKernel::obtain_from_generic(selected);
- casted->get_subkernel_weights().display_vector("weights");
- mmd->set_kernel(selected);
- SG_UNREF(casted);
- SG_UNREF(selected);
-
- /* compute tpye I and II error (use many more trials). Type I error is only
- * estimated to check MMD1_GAUSSIAN method for estimating the null
- * distribution. Note that testing has to happen on difference data than
- * kernel selecting, but the linear time mmd does this implicitly */
- mmd->set_null_approximation_method(MMD1_GAUSSIAN);
- float64_t alpha=0.05;
- index_t num_trials=5;
- SGVector<float64_t> typeIerrors(num_trials);
- SGVector<float64_t> typeIIerrors(num_trials);
- for (index_t i=0; i<num_trials; ++i)
- {
- /* this effectively means that p=q - rejecting is tpye I error */
- mmd->set_simulate_h0(true);
- typeIerrors[i]=mmd->perform_test()>alpha;
- mmd->set_simulate_h0(false);
-
- typeIIerrors[i]=mmd->perform_test()>alpha;
- }
-
- SG_SPRINT("type I error: %f\n", CStatistics::mean(typeIerrors));
- SG_SPRINT("type II error: %f\n", CStatistics::mean(typeIIerrors));
-
-
- SG_UNREF(selection);
-#endif //USE_GPL_SHOGUN
-}
-
-int main(int argc, char** argv)
-{
- init_shogun_with_defaults();
-// sg_io->set_loglevel(MSG_DEBUG);
-
- /* select a single kernel for linear time MMD */
- kernel_choice_linear_time_mmd_opt_single();
-
- /* select combined kernels for linear time MMD */
- kernel_choice_linear_time_mmd_opt_comb();
-
- exit_shogun();
- return 0;
-}
-
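Both deleted kernel-selection examples repeat the conversion `sq_width_twice = width*width*2`, because Shogun's GaussianKernel is parametrized as exp(-||x-y||^2/width) rather than the textbook exp(-||x-y||^2/(2 sigma^2)). A small sketch of that conversion and of the kernel family the examples build (helper name illustrative):

```python
def shogun_gaussian_width(sigma):
    # Shogun-style width: k(x,y) = exp(-||x-y||^2 / tau) with tau = 2*sigma^2
    return 2.0 * sigma ** 2

# the family from the example: sigma = 2^-3 ... 2^10, one kernel per exponent
widths = [shogun_gaussian_width(2.0 ** e) for e in range(-3, 11)]
```

Forgetting this conversion silently changes every bandwidth by a constant factor, so it is worth isolating in one helper.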
diff --git a/examples/undocumented/libshogun/statistics_quadratic_time_mmd.cpp b/examples/undocumented/libshogun/statistics_quadratic_time_mmd.cpp
deleted file mode 100644
index 5ffe8cdc3c4..00000000000
--- a/examples/undocumented/libshogun/statistics_quadratic_time_mmd.cpp
+++ /dev/null
@@ -1,135 +0,0 @@
-/*
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 3 of the License, or
- * (at your option) any later version.
- *
- * Written (W) 2013 Heiko Strathmann
- */
-
-#include
-#include
-#include
-#include
-#include
-#include
-#include
-
-using namespace shogun;
-
-void quadratic_time_mmd()
-{
- /* number of examples kept low in order to make things fast */
- index_t m=30;
- index_t dim=2;
- float64_t difference=0.5;
-
- /* streaming data generator for mean shift distributions */
- CMeanShiftDataGenerator* gen_p=new CMeanShiftDataGenerator(0, dim);
- CMeanShiftDataGenerator* gen_q=new CMeanShiftDataGenerator(difference, dim);
-
- /* stream some data from generator */
- CFeatures* feat_p=gen_p->get_streamed_features(m);
- CFeatures* feat_q=gen_q->get_streamed_features(m);
-
- /* set kernel a-priori. usually one would do some kernel selection. See
- * other examples for this. */
- float64_t width=10;
- CGaussianKernel* kernel=new CGaussianKernel(10, width);
-
- /* create quadratic time mmd instance. Note that this constructor
- * copies p and q and does not reference them */
- CQuadraticTimeMMD* mmd=new CQuadraticTimeMMD(kernel, feat_p, feat_q);
-
- /* perform test: compute p-value and test if null-hypothesis is rejected for
- * a test level of 0.05 */
- float64_t alpha=0.05;
-
- /* using permutation (slow, not the most reliable way. Consider pre-
- * computing the kernel when using it, see below).
- * Also, in practice, use at least 250 iterations */
- mmd->set_null_approximation_method(PERMUTATION);
- mmd->set_num_null_samples(3);
- float64_t p_value=mmd->perform_test();
- /* reject if p-value is smaller than test level */
- SG_SPRINT("bootstrap: p!=q: %d\n", p_value<alpha);
-
- /* using spectrum method */
- mmd->set_statistic_type(BIASED);
- mmd->set_null_approximation_method(MMD2_SPECTRUM);
- mmd->set_num_eigenvalues_spectrum(3);
- mmd->set_num_samples_spectrum(250);
- p_value=mmd->perform_test();
- /* reject if p-value is smaller than test level */
- SG_SPRINT("spectrum: p!=q: %d\n", p_value<alpha);
-
- /* using gamma method */
- mmd->set_statistic_type(BIASED);
- mmd->set_null_approximation_method(MMD2_GAMMA);
- p_value=mmd->perform_test();
- /* reject if p-value is smaller than test level */
- SG_SPRINT("gamma: p!=q: %d\n", p_value<alpha);
-
- /* compute type I and II errors using permutation */
- mmd->set_null_approximation_method(PERMUTATION);
- mmd->set_num_null_samples(5);
- index_t num_trials=5;
- SGVector<float64_t> type_I_errors(num_trials);
- SGVector<float64_t> type_II_errors(num_trials);
- SGVector<index_t> inds(2*m);
- inds.range_fill();
- CFeatures* p_and_q=mmd->get_p_and_q();
-
- /* use a precomputed kernel to be faster */
- kernel->init(p_and_q, p_and_q);
- CCustomKernel* precomputed=new CCustomKernel(kernel);
- mmd->set_kernel(precomputed);
- for (index_t i=0; i<num_trials; ++i)
- {
- /* this effectively means that p=q - rejecting is tpye I error */
- CMath::permute_vector(inds);
- precomputed->add_row_subset(inds);
- precomputed->add_col_subset(inds);
- type_I_errors[i]=mmd->perform_test()>alpha;
- precomputed->remove_row_subset();
- precomputed->remove_col_subset();
-
- /* on normal data, this gives type II error */
- type_II_errors[i]=mmd->perform_test()>alpha;
- }
- SG_UNREF(p_and_q);
-
- SG_SPRINT("type I error: %f\n", CStatistics::mean(type_I_errors));
- SG_SPRINT("type II error: %f\n", CStatistics::mean(type_II_errors));
-
- /* clean up */
- SG_UNREF(mmd);
- SG_UNREF(gen_p);
- SG_UNREF(gen_q);
-
- /* convienience constructor of MMD was used, these were not referenced */
- SG_UNREF(feat_p);
- SG_UNREF(feat_q);
-}
-
-int main(int argc, char** argv)
-{
- init_shogun_with_defaults();
-// sg_io->set_loglevel(MSG_DEBUG);
-
- quadratic_time_mmd();
-
- exit_shogun();
- return 0;
-}
-
diff --git a/examples/undocumented/python_modular/statistics_hsic.py b/examples/undocumented/python_modular/statistics_hsic.py
deleted file mode 100644
index ba1f3470bc3..00000000000
--- a/examples/undocumented/python_modular/statistics_hsic.py
+++ /dev/null
@@ -1,107 +0,0 @@
-#!/usr/bin/env python
-#
-# This program is free software you can redistribute it and/or modify
-# it under the terms of the GNU General Public License as published by
-# the Free Software Foundation either version 3 of the License, or
-# (at your option) any later version.
-#
-# Written (C) 2012-2013 Heiko Strathmann
-#
-import numpy as np
-from math import pi
-
-parameter_list = [[150,3,3]]
-
-def statistics_hsic (n, difference, angle):
- from modshogun import RealFeatures
- from modshogun import DataGenerator
- from modshogun import GaussianKernel
- from modshogun import HSIC
- from modshogun import PERMUTATION, HSIC_GAMMA
- from modshogun import EuclideanDistance
- from modshogun import Statistics, Math
-
- # for reproducable results (the numpy one might not be reproducible across
- # different OS/Python-distributions
- Math.init_random(1)
- np.random.seed(1)
-
- # note that the HSIC has to store kernel matrices
- # which upper bounds the sample size
-
- # use data generator class to produce example data
- data=DataGenerator.generate_sym_mix_gauss(n,difference,angle)
- #plot(data[0], data[1], 'x');show()
-
- # create shogun feature representation
- features_x=RealFeatures(np.array([data[0]]))
- features_y=RealFeatures(np.array([data[1]]))
-
- # compute median data distance in order to use for Gaussian kernel width
- # 0.5*median_distance normally (factor two in Gaussian kernel)
- # However, shoguns kernel width is different to usual parametrization
- # Therefore 0.5*2*median_distance^2
- # Use a subset of data for that, only 200 elements. Median is stable
- subset=np.random.permutation(features_x.get_num_vectors()).astype(np.int32)
- subset=subset[0:200]
- features_x.add_subset(subset)
- dist=EuclideanDistance(features_x, features_x)
- distances=dist.get_distance_matrix()
- features_x.remove_subset()
- median_distance=np.median(distances)
- sigma_x=median_distance**2
- features_y.add_subset(subset)
- dist=EuclideanDistance(features_y, features_y)
- distances=dist.get_distance_matrix()
- features_y.remove_subset()
- median_distance=np.median(distances)
- sigma_y=median_distance**2
- #print "median distance for Gaussian kernel on x:", sigma_x
- #print "median distance for Gaussian kernel on y:", sigma_y
- kernel_x=GaussianKernel(10,sigma_x)
- kernel_y=GaussianKernel(10,sigma_y)
-
- hsic=HSIC(kernel_x,kernel_y,features_x,features_y)
-
- # perform test: compute p-value and test if null-hypothesis is rejected for
- # a test level of 0.05 using different methods to approximate
- # null-distribution
- statistic=hsic.compute_statistic()
- #print "HSIC:", statistic
- alpha=0.05
-
- #print "computing p-value using sampling null"
- hsic.set_null_approximation_method(PERMUTATION)
- # normally, at least 250 iterations should be done, but that takes long
- hsic.set_num_null_samples(100)
- # sampling null allows usage of unbiased or biased statistic
- p_value_boot=hsic.compute_p_value(statistic)
- thresh_boot=hsic.compute_threshold(alpha)
- #print "p_value:", p_value_boot
- #print "threshold for 0.05 alpha:", thresh_boot
- #print "p_value <", alpha, ", i.e. test sais p and q are dependend:", p_value_boot<alpha
- typeIerrors[i]=mmd.perform_test()>alpha
- mmd.set_simulate_h0(False)
-
- typeIIerrors[i]=mmd.perform_test()>alpha
-
- #print "type I error:", mean(typeIerrors), ", type II error:", mean(typeIIerrors)
-
- return statistic, p_value_boot, p_value_gaussian, null_samples, typeIerrors, typeIIerrors
-
-if __name__=='__main__':
- print('LinearTimeMMD')
- statistics_linear_time_mmd(*parameter_list[0])
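The deleted HSIC example sets Gaussian kernel widths via the median pairwise distance on a 200-point subset. A NumPy sketch of the same median heuristic, restricted to distinct pairs (function name illustrative, not Shogun API):

```python
import numpy as np

def median_heuristic_width(X, max_points=200, seed=1):
    # Shogun-style width from the median heuristic:
    # sigma = median pairwise distance, width = 0.5 * 2 * sigma^2 = sigma^2
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))[:max_points]
    Z = X[idx]
    d = np.sqrt(((Z[:, None, :] - Z[None, :, :]) ** 2).sum(-1))
    pairwise = d[np.triu_indices(len(Z), k=1)]
    return float(np.median(pairwise) ** 2)
```

The median is stable under subsampling, which is why the example only looks at 200 points even for large samples.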
diff --git a/examples/undocumented/python_modular/statistics_mmd_kernel_selection_combined.py b/examples/undocumented/python_modular/statistics_mmd_kernel_selection_combined.py
deleted file mode 100644
index 677a48672f2..00000000000
--- a/examples/undocumented/python_modular/statistics_mmd_kernel_selection_combined.py
+++ /dev/null
@@ -1,115 +0,0 @@
-#!/usr/bin/env python
-#
-# This program is free software you can redistribute it and/or modify
-# it under the terms of the GNU General Public License as published by
-# the Free Software Foundation either version 3 of the License, or
-# (at your option) any later version.
-#
-# Written (C) 2012-2013 Heiko Strathmann
-#
-from numpy import *
-#from pylab import *
-
-parameter_list = [[1000,10,5,3,pi/4, "opt"], [1000,10,5,3,pi/4, "l2"]]
-
-
-def statistics_mmd_kernel_selection_combined(m,distance,stretch,num_blobs,angle,selection_method):
- from modshogun import RealFeatures
- from modshogun import GaussianBlobsDataGenerator
- from modshogun import GaussianKernel, CombinedKernel
- from modshogun import LinearTimeMMD
- try:
- from modshogun import MMDKernelSelectionCombMaxL2
- except ImportError:
- print("MMDKernelSelectionCombMaxL2 not available")
- exit(0)
- try:
- from modshogun import MMDKernelSelectionCombOpt
- except ImportError:
- print("MMDKernelSelectionCombOpt not available")
- exit(0)
-
- from modshogun import PERMUTATION, MMD1_GAUSSIAN
- from modshogun import EuclideanDistance
- from modshogun import Statistics, Math
-
- # init seed for reproducability
- Math.init_random(1)
-
- # note that the linear time statistic is designed for much larger datasets
- # results for this low number will be bad (unstable, type I error wrong)
-
- # streaming data generator
- gen_p=GaussianBlobsDataGenerator(num_blobs, distance, 1, 0)
- gen_q=GaussianBlobsDataGenerator(num_blobs, distance, stretch, angle)
-
- # stream some data and plot
- num_plot=1000
- features=gen_p.get_streamed_features(num_plot)
- features=features.create_merged_copy(gen_q.get_streamed_features(num_plot))
- data=features.get_feature_matrix()
-
- #figure()
- #subplot(2,2,1)
- #grid(True)
- #plot(data[0][0:num_plot], data[1][0:num_plot], 'r.', label='$x$')
- #title('$X\sim p$')
- #subplot(2,2,2)
- #grid(True)
- #plot(data[0][num_plot+1:2*num_plot], data[1][num_plot+1:2*num_plot], 'b.', label='$x$', alpha=0.5)
- #title('$Y\sim q$')
-
- # create combined kernel with Gaussian kernels inside (shoguns Gaussian kernel is
- # different to the standard form, see documentation)
- sigmas=[2**x for x in range(-3,10)]
- widths=[x*x*2 for x in sigmas]
- combined=CombinedKernel()
- for i in range(len(sigmas)):
- combined.append_kernel(GaussianKernel(10, widths[i]))
-
- # mmd instance using streaming features, blocksize of 10000
- block_size=10000
- mmd=LinearTimeMMD(combined, gen_p, gen_q, m, block_size)
-
- # kernel selection instance (this can easily replaced by the other methods for selecting
- # combined kernels
- if selection_method=="opt":
- selection=MMDKernelSelectionCombOpt(mmd)
- elif selection_method=="l2":
- selection=MMDKernelSelectionCombMaxL2(mmd)
-
- # perform kernel selection (kernel is automatically set)
- kernel=selection.select_kernel()
- kernel=CombinedKernel.obtain_from_generic(kernel)
- #print "selected kernel weights:", kernel.get_subkernel_weights()
- #subplot(2,2,3)
- #plot(kernel.get_subkernel_weights())
- #title("Kernel weights")
-
- # compute tpye I and II error (use many more trials). Type I error is only
- # estimated to check MMD1_GAUSSIAN method for estimating the null
- # distribution. Note that testing has to happen on difference data than
- # kernel selecting, but the linear time mmd does this implicitly
- mmd.set_null_approximation_method(MMD1_GAUSSIAN)
-
- # number of trials should be larger to compute tight confidence bounds
- num_trials=5;
- alpha=0.05 # test power
- typeIerrors=[0 for x in range(num_trials)]
- typeIIerrors=[0 for x in range(num_trials)]
- for i in range(num_trials):
- # this effectively means that p=q - rejecting is tpye I error
- mmd.set_simulate_h0(True)
- typeIerrors[i]=mmd.perform_test()>alpha
- mmd.set_simulate_h0(False)
-
- typeIIerrors[i]=mmd.perform_test()>alpha
-
- #print "type I error:", mean(typeIerrors), ", type II error:", mean(typeIIerrors)
-
- return kernel,typeIerrors,typeIIerrors
-
-if __name__=='__main__':
- print('MMDKernelSelectionCombined')
- statistics_mmd_kernel_selection_combined(*parameter_list[0])
- #show()
diff --git a/examples/undocumented/python_modular/statistics_mmd_kernel_selection_single.py b/examples/undocumented/python_modular/statistics_mmd_kernel_selection_single.py
deleted file mode 100644
index ffa291afbde..00000000000
--- a/examples/undocumented/python_modular/statistics_mmd_kernel_selection_single.py
+++ /dev/null
@@ -1,124 +0,0 @@
-#!/usr/bin/env python
-#
-# This program is free software you can redistribute it and/or modify
-# it under the terms of the GNU General Public License as published by
-# the Free Software Foundation either version 3 of the License, or
-# (at your option) any later version.
-#
-# Written (C) 2012-2013 Heiko Strathmann
-#
-from numpy import *
-#from pylab import *
-
-parameter_list = [[1000,10,5,3,pi/4, "opt"], [1000,10,5,3,pi/4, "max"], [1000,10,5,3,pi/4, "median"]]
-
-def statistics_mmd_kernel_selection_single(m,distance,stretch,num_blobs,angle,selection_method):
- from modshogun import RealFeatures
- from modshogun import GaussianBlobsDataGenerator
- from modshogun import GaussianKernel, CombinedKernel
- from modshogun import LinearTimeMMD
- from modshogun import MMDKernelSelectionMedian
- from modshogun import MMDKernelSelectionMax
- from modshogun import MMDKernelSelectionOpt
- from modshogun import PERMUTATION, MMD1_GAUSSIAN
- from modshogun import EuclideanDistance
- from modshogun import Statistics, Math
-
- # init seed for reproducability
- Math.init_random(1)
-
- # note that the linear time statistic is designed for much larger datasets
- # results for this low number will be bad (unstable, type I error wrong)
- m=1000
- distance=10
- stretch=5
- num_blobs=3
- angle=pi/4
-
- # streaming data generator
- gen_p=GaussianBlobsDataGenerator(num_blobs, distance, 1, 0)
- gen_q=GaussianBlobsDataGenerator(num_blobs, distance, stretch, angle)
-
- # stream some data and plot
- num_plot=1000
- features=gen_p.get_streamed_features(num_plot)
- features=features.create_merged_copy(gen_q.get_streamed_features(num_plot))
- data=features.get_feature_matrix()
-
- #figure()
- #subplot(2,2,1)
- #grid(True)
- #plot(data[0][0:num_plot], data[1][0:num_plot], 'r.', label='$x$')
- #title('$X\sim p$')
- #subplot(2,2,2)
- #grid(True)
- #plot(data[0][num_plot+1:2*num_plot], data[1][num_plot+1:2*num_plot], 'b.', label='$x$', alpha=0.5)
- #title('$Y\sim q$')
-
-
- # create combined kernel with Gaussian kernels inside (shoguns Gaussian kernel is
- # different to the standard form, see documentation)
- sigmas=[2**x for x in range(-3,10)]
- widths=[x*x*2 for x in sigmas]
- combined=CombinedKernel()
- for i in range(len(sigmas)):
- combined.append_kernel(GaussianKernel(10, widths[i]))
-
- # mmd instance using streaming features, blocksize of 10000
- block_size=1000
- mmd=LinearTimeMMD(combined, gen_p, gen_q, m, block_size)
-
- # kernel selection instance (this can easily replaced by the other methods for selecting
- # single kernels
- if selection_method=="opt":
- selection=MMDKernelSelectionOpt(mmd)
- elif selection_method=="max":
- selection=MMDKernelSelectionMax(mmd)
- elif selection_method=="median":
- selection=MMDKernelSelectionMedian(mmd)
-
- # print measures (just for information)
- # in case Opt: ratios of MMD and standard deviation
- # in case Max: MMDs for each kernel
- # Does not work for median method
- if selection_method!="median":
- ratios=selection.compute_measures()
- #print "Measures:", ratios
-
- #subplot(2,2,3)
- #plot(ratios)
- #title('Measures')
-
- # perform kernel selection
- kernel=selection.select_kernel()
- kernel=GaussianKernel.obtain_from_generic(kernel)
- #print "selected kernel width:", kernel.get_width()
-
- # compute tpye I and II error (use many more trials). Type I error is only
- # estimated to check MMD1_GAUSSIAN method for estimating the null
- # distribution. Note that testing has to happen on difference data than
- # kernel selecting, but the linear time mmd does this implicitly
- mmd.set_kernel(kernel)
- mmd.set_null_approximation_method(MMD1_GAUSSIAN)
-
- # number of trials should be larger to compute tight confidence bounds
- num_trials=5;
- alpha=0.05 # test power
- typeIerrors=[0 for x in range(num_trials)]
- typeIIerrors=[0 for x in range(num_trials)]
- for i in range(num_trials):
- # this effectively means that p=q - rejecting is tpye I error
- mmd.set_simulate_h0(True)
- typeIerrors[i]=mmd.perform_test()>alpha
- mmd.set_simulate_h0(False)
-
- typeIIerrors[i]=mmd.perform_test()>alpha
-
- #print "type I error:", mean(typeIerrors), ", type II error:", mean(typeIIerrors)
-
- return kernel,typeIerrors,typeIIerrors
-
-if __name__=='__main__':
- print('MMDKernelSelection')
- statistics_mmd_kernel_selection_single(*parameter_list[0])
- #show()
diff --git a/examples/undocumented/python_modular/statistics_quadratic_time_mmd.py b/examples/undocumented/python_modular/statistics_quadratic_time_mmd.py
deleted file mode 100644
index 343a03c4ed2..00000000000
--- a/examples/undocumented/python_modular/statistics_quadratic_time_mmd.py
+++ /dev/null
@@ -1,115 +0,0 @@
-#!/usr/bin/env python
-#
-# This program is free software you can redistribute it and/or modify
-# it under the terms of the GNU General Public License as published by
-# the Free Software Foundation either version 3 of the License, or
-# (at your option) any later version.
-#
-# Written (C) 2012-2013 Heiko Strathmann
-#
-import numpy as np
-
-parameter_list = [[30,2,0.5]]
-
-def statistics_quadratic_time_mmd (m,dim,difference):
- from modshogun import RealFeatures
- from modshogun import MeanShiftDataGenerator
- from modshogun import GaussianKernel, CustomKernel
- from modshogun import QuadraticTimeMMD
- from modshogun import PERMUTATION, MMD2_SPECTRUM, MMD2_GAMMA, BIASED, BIASED_DEPRECATED
- from modshogun import Statistics, IntVector, RealVector, Math
-
- # for reproducable results (the numpy one might not be reproducible across
- # different OS/Python-distributions
- Math.init_random(1)
- np.random.seed(1)
-
- # number of examples kept low in order to make things fast
-
- # streaming data generator for mean shift distributions
- gen_p=MeanShiftDataGenerator(0, dim);
- #gen_p.parallel.set_num_threads(1)
- gen_q=MeanShiftDataGenerator(difference, dim);
-
- # stream some data from generator
- feat_p=gen_p.get_streamed_features(m);
- feat_q=gen_q.get_streamed_features(m);
-
- # set kernel a-priori. usually one would do some kernel selection. See
- # other examples for this.
- width=10;
- kernel=GaussianKernel(10, width);
-
- # create quadratic time mmd instance. Note that this constructor
- # copies p and q and does not reference them
- mmd=QuadraticTimeMMD(kernel, feat_p, feat_q);
-
- # perform test: compute p-value and test if null-hypothesis is rejected for
- # a test level of 0.05
- alpha=0.05;
-
- # using permutation (slow, not the most reliable way. Consider pre-
- # computing the kernel when using it, see below).
- # Also, in practice, use at least 250 iterations
- mmd.set_null_approximation_method(PERMUTATION);
- mmd.set_num_null_samples(3);
- p_value_null=mmd.perform_test();
- # reject if p-value is smaller than test level
-	#print "bootstrap: p!=q: ", p_value_null<alpha;
- precomputed.remove_row_subset();
- precomputed.remove_col_subset();
-
- # on normal data, this gives type II error
- type_II_errors[i]=mmd.perform_test()>alpha;
-
- return type_I_errors,type_I_errors,p_value_null,p_value_spectrum,p_value_gamma,
-
-if __name__=='__main__':
- print('QuadraticTimeMMD')
- statistics_quadratic_time_mmd(*parameter_list[0])
diff --git a/src/interfaces/modular/Preprocessor.i b/src/interfaces/modular/Preprocessor.i
index 0086099701e..fdb05143421 100644
--- a/src/interfaces/modular/Preprocessor.i
+++ b/src/interfaces/modular/Preprocessor.i
@@ -29,9 +29,9 @@
%rename(SortWordString) CSortWordString;
/* Feature selection framework */
-%rename(DependenceMaximization) CDependenceMaximization;
-%rename(KernelDependenceMaximization) CDependenceMaximization;
-%rename(BAHSIC) CBAHSIC;
+#%rename(DependenceMaximization) CDependenceMaximization;
+#%rename(KernelDependenceMaximization) CDependenceMaximization;
+#%rename(BAHSIC) CBAHSIC;
%newobject shogun::CFeatureSelection::apply;
%newobject shogun::CFeatureSelection::remove_feats;
@@ -145,7 +145,3 @@ namespace shogun
%include
%include
-
-%include
-%include
-%include
diff --git a/src/interfaces/modular/Preprocessor_includes.i b/src/interfaces/modular/Preprocessor_includes.i
index 95a101c4f86..35076c98410 100644
--- a/src/interfaces/modular/Preprocessor_includes.i
+++ b/src/interfaces/modular/Preprocessor_includes.i
@@ -25,7 +25,4 @@
#include
#include
-#include
-#include
-#include
%}
diff --git a/src/interfaces/modular/Statistics.i b/src/interfaces/modular/Statistics.i
index e542c6c6fb1..4c63f8b4e58 100644
--- a/src/interfaces/modular/Statistics.i
+++ b/src/interfaces/modular/Statistics.i
@@ -7,45 +7,36 @@
* Written (W) 2012-2013 Heiko Strathmann
*/
+/* These functions return new objects */
+%newobject shogun::CTwoDistributionTest::compute_distance(CDistance*);
+%newobject shogun::CTwoDistributionTest::compute_joint_distance(CDistance*);
+%newobject shogun::CQuadraticTimeMMD::get_p_and_q();
+
/* Remove C Prefix */
%rename(HypothesisTest) CHypothesisTest;
+%rename(OneDistributionTest) COneDistributionTest;
+%rename(TwoDistributionTest) CTwoDistributionTest;
%rename(IndependenceTest) CIndependenceTest;
%rename(TwoSampleTest) CTwoSampleTest;
-%rename(KernelTwoSampleTest) CKernelTwoSampleTest;
+%rename(MMD) CMMD;
%rename(StreamingMMD) CStreamingMMD;
%rename(LinearTimeMMD) CLinearTimeMMD;
+%rename(BTestMMD) CBTestMMD;
%rename(QuadraticTimeMMD) CQuadraticTimeMMD;
-%rename(KernelIndependenceTest) CKernelIndependenceTest;
-%rename(HSIC) CHSIC;
-%rename(NOCCO) CNOCCO;
-%rename(KernelMeanMatching) CKernelMeanMatching;
-%rename(KernelSelection) CKernelSelection;
-%rename(MMDKernelSelection) CMMDKernelSelection;
-%rename(MMDKernelSelectionComb) CMMDKernelSelectionComb;
-%rename(MMDKernelSelectionMedian) CMMDKernelSelectionMedian;
-%rename(MMDKernelSelectionMax) CMMDKernelSelectionMax;
-%rename(MMDKernelSelectionOpt) CMMDKernelSelectionOpt;
-%rename(MMDKernelSelectionCombOpt) CMMDKernelSelectionCombOpt;
-%rename(MMDKernelSelectionCombMaxL2) CMMDKernelSelectionCombMaxL2;
-
+%rename(MultiKernelQuadraticTimeMMD) CMultiKernelQuadraticTimeMMD;
+%rename(KernelSelectionStrategy) CKernelSelectionStrategy;
/* Include Class Headers to make them visible from within the target language */
-%include
-%include
-%include
-%include
-%include
-%include
-%include
-%include
-%include
-%include
-%include
-%include
-%include
-%include
-%include
-%include
-%include
-%include
-%include
+%include
+%include
+%include
+%include
+%include
+%include
+%include
+%include
+%include
+%include
+%include
+%include
+%include
diff --git a/src/interfaces/modular/Statistics_includes.i b/src/interfaces/modular/Statistics_includes.i
index 8cf811edeac..48aa564c620 100644
--- a/src/interfaces/modular/Statistics_includes.i
+++ b/src/interfaces/modular/Statistics_includes.i
@@ -1,22 +1,16 @@
%{
- #include
- #include
- #include
- #include
- #include
- #include
- #include
- #include
- #include
- #include
- #include
- #include
- #include
- #include
- #include
- #include
- #include
- #include
- #include
+ #include
+ #include
+ #include
+ #include
+ #include
+ #include
+ #include
+ #include
+ #include
+ #include
+ #include
+ #include
+ #include
%}
diff --git a/src/shogun/distance/Distance.cpp b/src/shogun/distance/Distance.cpp
index e3a581a9a04..7284d412299 100644
--- a/src/shogun/distance/Distance.cpp
+++ b/src/shogun/distance/Distance.cpp
@@ -56,13 +56,13 @@ bool CDistance::init(CFeatures* l, CFeatures* r)
{
REQUIRE(check_compatibility(l, r), "Features are not compatible!\n");
- //remove references to previous features
- remove_lhs_and_rhs();
-
//increase reference counts
SG_REF(l);
SG_REF(r);
+ //remove references to previous features
+ remove_lhs_and_rhs();
+
lhs=l;
rhs=r;
diff --git a/src/shogun/features/DenseFeatures.cpp b/src/shogun/features/DenseFeatures.cpp
index d88b5cd2fcd..da012aa2031 100644
--- a/src/shogun/features/DenseFeatures.cpp
+++ b/src/shogun/features/DenseFeatures.cpp
@@ -641,14 +641,14 @@ template
CFeatures* CDenseFeatures::shallow_subset_copy()
{
CFeatures* shallow_copy_features=NULL;
-
+
SG_SDEBUG("Using underlying feature matrix with %d dimensions and %d feature vectors!\n", num_features, num_vectors);
SGMatrix shallow_copy_matrix(feature_matrix);
shallow_copy_features=new CDenseFeatures(shallow_copy_matrix);
SG_REF(shallow_copy_features);
if (m_subset_stack->has_subsets())
shallow_copy_features->add_subset(m_subset_stack->get_last_subset()->get_subset_idx());
-
+
return shallow_copy_features;
}
diff --git a/src/shogun/features/streaming/StreamingDenseFeatures.cpp b/src/shogun/features/streaming/StreamingDenseFeatures.cpp
index d47a1ec49d0..1db8f72ac5a 100644
--- a/src/shogun/features/streaming/StreamingDenseFeatures.cpp
+++ b/src/shogun/features/streaming/StreamingDenseFeatures.cpp
@@ -70,6 +70,7 @@ template void CStreamingDenseFeatures::reset_stream()
parser.exit_parser();
parser.init(working_file, has_labels, 1);
parser.set_free_vector_after_release(false);
+ parser.set_free_vectors_on_destruct(false);
parser.start_parser();
}
}
diff --git a/src/shogun/io/streaming/InputParser.h b/src/shogun/io/streaming/InputParser.h
index 73fdec0812f..652d0db3a1d 100644
--- a/src/shogun/io/streaming/InputParser.h
+++ b/src/shogun/io/streaming/InputParser.h
@@ -428,6 +428,7 @@ template
else
example_type = E_UNLABELLED;
+ SG_UNREF(examples_ring);
examples_ring = new CParseBuffer(size);
SG_REF(examples_ring);
@@ -466,7 +467,8 @@ template
}
SG_SDEBUG("creating parse thread\n")
- examples_ring->init_vector();
+ if (examples_ring)
+ examples_ring->init_vector();
#ifdef HAVE_CXX11
parse_thread.reset(new std::thread(&parse_loop_entry_point, this));
#elif defined(HAVE_PTHREAD)
diff --git a/src/shogun/kernel/CombinedKernel.cpp b/src/shogun/kernel/CombinedKernel.cpp
index 31292136f8c..9c39a2c8fb7 100644
--- a/src/shogun/kernel/CombinedKernel.cpp
+++ b/src/shogun/kernel/CombinedKernel.cpp
@@ -811,7 +811,7 @@ CCombinedKernel* CCombinedKernel::obtain_from_generic(CKernel* kernel)
if (kernel->get_kernel_type()!=K_COMBINED)
{
SG_SERROR("CCombinedKernel::obtain_from_generic(): provided kernel is "
- "not of type CGaussianKernel!\n");
+ "not of type CCombinedKernel!\n");
}
/* since an additional reference is returned */
diff --git a/src/shogun/kernel/CustomKernel.h b/src/shogun/kernel/CustomKernel.h
index 613a71a852d..165e9a724d9 100644
--- a/src/shogun/kernel/CustomKernel.h
+++ b/src/shogun/kernel/CustomKernel.h
@@ -550,13 +550,13 @@ class CCustomKernel: public CKernel
*/
SGMatrix get_float32_kernel_matrix()
{
- REQUIRE(!m_row_subset_stack, "%s::get_float32_kernel_matrix(): "
+ REQUIRE(!m_row_subset_stack->has_subsets(), "%s::get_float32_kernel_matrix(): "
"Not possible with row subset active! If you want to"
" create a %s from another one with a subset, use "
"get_kernel_matrix() and the SGMatrix constructor!\n",
get_name(), get_name());
- REQUIRE(!m_col_subset_stack, "%s::get_float32_kernel_matrix(): "
+ REQUIRE(!m_col_subset_stack->has_subsets(), "%s::get_float32_kernel_matrix(): "
"Not possible with collumn subset active! If you want to"
" create a %s from another one with a subset, use "
"get_kernel_matrix() and the SGMatrix constructor!\n",
diff --git a/src/shogun/kernel/ShiftInvariantKernel.h b/src/shogun/kernel/ShiftInvariantKernel.h
index b52544b3277..3f75646a45d 100644
--- a/src/shogun/kernel/ShiftInvariantKernel.h
+++ b/src/shogun/kernel/ShiftInvariantKernel.h
@@ -39,6 +39,11 @@
namespace shogun
{
+namespace internal
+{
+ class KernelManager;
+}
+
/** @brief Base class for the family of kernel functions that only depend on
* the difference of the inputs, i.e. whose values does not change if the
* inputs are shifted by the same amount. More precisely,
@@ -49,6 +54,9 @@ namespace shogun
*/
class CShiftInvariantKernel: public CKernel
{
+
+ friend class internal::KernelManager;
+
public:
/** Default constructor. */
CShiftInvariantKernel();
diff --git a/src/shogun/labels/BinaryLabels.cpp b/src/shogun/labels/BinaryLabels.cpp
index f46890d9718..6f1e93f0484 100644
--- a/src/shogun/labels/BinaryLabels.cpp
+++ b/src/shogun/labels/BinaryLabels.cpp
@@ -149,6 +149,6 @@ CLabels* CBinaryLabels::shallow_subset_copy()
((CDenseLabels*) shallow_copy_labels)->set_labels(shallow_copy_vector);
if (m_subset_stack->has_subsets())
shallow_copy_labels->add_subset(m_subset_stack->get_last_subset()->get_subset_idx());
-
+
return shallow_copy_labels;
}
diff --git a/src/shogun/labels/BinaryLabels.h b/src/shogun/labels/BinaryLabels.h
index 248608483e2..462397596bd 100644
--- a/src/shogun/labels/BinaryLabels.h
+++ b/src/shogun/labels/BinaryLabels.h
@@ -119,7 +119,6 @@ class CBinaryLabels : public CDenseLabels
#ifndef SWIG // SWIG should skip this part
virtual CLabels* shallow_subset_copy();
#endif
-
};
}
#endif
diff --git a/src/shogun/labels/MulticlassLabels.cpp b/src/shogun/labels/MulticlassLabels.cpp
index ef65ea092e0..3133efdbbfd 100644
--- a/src/shogun/labels/MulticlassLabels.cpp
+++ b/src/shogun/labels/MulticlassLabels.cpp
@@ -144,6 +144,6 @@ CLabels* CMulticlassLabels::shallow_subset_copy()
((CDenseLabels*) shallow_copy_labels)->set_labels(shallow_copy_vector);
if (m_subset_stack->has_subsets())
shallow_copy_labels->add_subset(m_subset_stack->get_last_subset()->get_subset_idx());
-
- return shallow_copy_labels;
+
+ return shallow_copy_labels;
}
diff --git a/src/shogun/labels/RegressionLabels.cpp b/src/shogun/labels/RegressionLabels.cpp
index eb85c368526..5870eafd2d2 100644
--- a/src/shogun/labels/RegressionLabels.cpp
+++ b/src/shogun/labels/RegressionLabels.cpp
@@ -35,6 +35,6 @@ CLabels* CRegressionLabels::shallow_subset_copy()
((CDenseLabels*) shallow_copy_labels)->set_labels(shallow_copy_vector);
if (m_subset_stack->has_subsets())
shallow_copy_labels->add_subset(m_subset_stack->get_last_subset()->get_subset_idx());
-
+
return shallow_copy_labels;
}
diff --git a/src/shogun/labels/RegressionLabels.h b/src/shogun/labels/RegressionLabels.h
index 831b69961c8..56698d149ce 100644
--- a/src/shogun/labels/RegressionLabels.h
+++ b/src/shogun/labels/RegressionLabels.h
@@ -69,7 +69,6 @@ class CRegressionLabels : public CDenseLabels
#ifndef SWIG // SWIG should skip this part
virtual CLabels* shallow_subset_copy();
#endif
-
};
}
#endif
diff --git a/src/shogun/machine/BaggingMachine.cpp b/src/shogun/machine/BaggingMachine.cpp
index 5edeef58c7c..ad3aa046082 100644
--- a/src/shogun/machine/BaggingMachine.cpp
+++ b/src/shogun/machine/BaggingMachine.cpp
@@ -76,7 +76,7 @@ SGVector CBaggingMachine::apply_get_outputs(CFeatures* data)
SGMatrix output(data->get_num_vectors(), m_num_bags);
output.zero();
-
+
#pragma omp parallel for
for (int32_t i = 0; i < m_num_bags; ++i)
{
@@ -178,7 +178,7 @@ bool CBaggingMachine::train_machine(CFeatures* data)
labels->remove_subset();
#pragma omp critical
- {
+ {
// get out of bag indexes
CDynamicArray* oob = get_oob_indices(idx);
m_oob_indices->push_back(oob);
diff --git a/src/shogun/multiclass/tree/CARTree.cpp b/src/shogun/multiclass/tree/CARTree.cpp
index cd7c261da1d..ac0a310ed42 100644
--- a/src/shogun/multiclass/tree/CARTree.cpp
+++ b/src/shogun/multiclass/tree/CARTree.cpp
@@ -105,7 +105,7 @@ CMulticlassLabels* CCARTree::apply_multiclass(CFeatures* data)
// apply multiclass starting from root
 	bnode_t* current=dynamic_cast<bnode_t*>(get_root());
-
+
REQUIRE(current, "Tree machine not yet trained.\n");
CLabels* ret=apply_from_current_node(dynamic_cast*>(data), current);
@@ -289,7 +289,7 @@ bool CCARTree::train_machine(CFeatures* data)
void CCARTree::set_sorted_features(SGMatrix& sorted_feats, SGMatrix& sorted_indices)
{
- m_pre_sort=true;
+ m_pre_sort=true;
m_sorted_features=sorted_feats;
m_sorted_indices=sorted_indices;
}
@@ -414,7 +414,7 @@ CBinaryTreeMachineNode* CCARTree::CARTtrain(CFeatures* data, SG
int32_t c_left=-1;
int32_t c_right=-1;
int32_t best_attribute;
-
+
SGVector indices(num_vecs);
if (m_pre_sort)
{
@@ -532,13 +532,13 @@ int32_t CCARTree::compute_best_attribute(const SGMatrix& mat, const S
SGVector& left, SGVector& right, SGVector& is_left_final, int32_t &num_missing_final, int32_t &count_left,
int32_t &count_right, int32_t subset_size, const SGVector& active_indices)
{
- SGVector labels_vec=(dynamic_cast(labels))->get_labels();
+ SGVector labels_vec=(dynamic_cast(labels))->get_labels();
int32_t num_vecs=labels->get_num_labels();
int32_t num_feats;
if (m_pre_sort)
num_feats=mat.num_cols;
else
- num_feats=mat.num_rows;
+ num_feats=mat.num_rows;
int32_t n_ulabels;
SGVector ulabels=get_unique_labels(labels_vec,n_ulabels);
@@ -567,7 +567,7 @@ int32_t CCARTree::compute_best_attribute(const SGMatrix& mat, const S
}
}
}
-
+
SGVector idx(num_feats);
idx.range_fill();
if (subset_size)
@@ -579,7 +579,7 @@ int32_t CCARTree::compute_best_attribute(const SGMatrix& mat, const S
float64_t max_gain=MIN_SPLIT_GAIN;
int32_t best_attribute=-1;
float64_t best_threshold=0;
-
+
SGVector indices_mask;
SGVector count_indices(mat.num_rows);
count_indices.zero();
@@ -603,6 +603,8 @@ int32_t CCARTree::compute_best_attribute(const SGMatrix& mat, const S
{
SGVector feats(num_vecs);
SGVector sorted_args(num_vecs);
+ SGVector temp_count_indices(count_indices.size());
+ memcpy(temp_count_indices.vector, count_indices.vector, sizeof(int32_t)*count_indices.size());
if (m_pre_sort)
{
@@ -708,7 +710,7 @@ int32_t CCARTree::compute_best_attribute(const SGMatrix& mat, const S
if(dupes[j]!=j)
is_left[j]=is_left[dupes[j]];
}
-
+
float64_t g=0;
if (m_mode==PT_MULTICLASS)
g=gain(wleft,wright,total_wclasses);
@@ -806,7 +808,7 @@ int32_t CCARTree::compute_best_attribute(const SGMatrix& mat, const S
count_right=1;
if (m_pre_sort)
{
- SGVector temp_vec(mat.get_column_vector(best_attribute), mat.num_rows, false);
+ SGVector temp_vec(mat.get_column_vector(best_attribute), mat.num_rows, false);
SGVector sorted_indices(m_sorted_indices.get_column_vector(best_attribute), mat.num_rows, false);
int32_t count=0;
for(int32_t i=0;i& mat, const S
if(dupes[i]!=i)
is_left_final[i]=is_left_final[dupes[i]];
}
-
- }
+
+ }
else
{
for (int32_t i=0;i& feats, co
{
Map map_weights(weights.vector, weights.size());
- Map map_feats(feats.vector, weights.size());
+ Map map_feats(feats.vector, weights.size());
float64_t mean=map_weights.dot(map_feats);
total_weight=map_weights.sum();
@@ -1104,7 +1106,7 @@ CLabels* CCARTree::apply_from_current_node(CDenseFeatures* feats, bno
{
int32_t num_vecs=feats->get_num_vectors();
REQUIRE(num_vecs>0, "No data provided in apply\n");
-
+
SGVector labels(num_vecs);
for (int32_t i=0;i* tree)
{
REQUIRE(tree, "Tree not provided for pruning.\n");
-
+
CDynamicObjectArray* trees=new CDynamicObjectArray();
SG_UNREF(m_alphas);
m_alphas=new CDynamicArray();
diff --git a/src/shogun/preprocessor/BAHSIC.h b/src/shogun/preprocessor/BAHSIC.h
deleted file mode 100644
index e58a89e9d1d..00000000000
--- a/src/shogun/preprocessor/BAHSIC.h
+++ /dev/null
@@ -1,92 +0,0 @@
-/*
- * Copyright (c) The Shogun Machine Learning Toolbox
- * Written (w) 2014 Soumyajit De
- * All rights reserved.
- *
- * Redistribution and use in source and binary forms, with or without
- * modification, are permitted provided that the following conditions are met:
- *
- * 1. Redistributions of source code must retain the above copyright notice, this
- * list of conditions and the following disclaimer.
- * 2. Redistributions in binary form must reproduce the above copyright notice,
- * this list of conditions and the following disclaimer in the documentation
- * and/or other materials provided with the distribution.
- *
- * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
- * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
- * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
- * DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
- * ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
- * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
- * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
- * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
- * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- *
- * The views and conclusions contained in the software and documentation are those
- * of the authors and should not be interpreted as representing official policies,
- * either expressed or implied, of the Shogun Development Team.
- */
-
-#ifndef BAHSIC_H__
-#define BAHSIC_H__
-
-#include
-#include
-
-namespace shogun
-{
-
-/** @brief Class CBAHSIC, that extends CKernelDependenceMaximization and uses
- * HSIC [1] to compute dependence measures for feature selection using a
- * backward elimination approach as described in [1]. This class serves as a
- * convenience class that initializes the CDependenceMaximization#m_estimator
- * with an instance of CHSIC and allows only shogun::BACKWARD_ELIMINATION algorithm
- * to use which is set internally. Therefore, trying to use other algorithms
- * by set_algorithm() will not work. Plese see the class documentation of CHSIC
- * and [2] for more details on mathematical description of HSIC.
- *
- * Refrences:
- * [1] Song, Le and Bedo, Justin and Borgwardt, Karsten M. and Gretton, Arthur
- * and Smola, Alex. (2007). Gene Selection via the BAHSIC Family of Algorithms.
- * Journal Bioinformatics. Volume 23 Issue Pages i490-i498. Oxford University
- * Press Oxford, UK
- * [2]: Gretton, A., Fukumizu, K., Teo, C., & Song, L. (2008). A kernel
- * statistical test of independence. Advances in Neural Information Processing
- * Systems, 1-8.
- */
-class CBAHSIC : public CKernelDependenceMaximization
-{
-public:
- /** Default constructor */
- CBAHSIC();
-
- /** Destructor */
- virtual ~CBAHSIC();
-
- /**
- * Since only shogun::BACKWARD_ELIMINATION algorithm is applicable for BAHSIC,
- * and this is set internally, this method is overridden to prevent this
- * to be set from public API.
- *
- * @param algorithm the feature selection algorithm to use
- */
- virtual void set_algorithm(EFeatureSelectionAlgorithm algorithm);
-
- /** @return the preprocessor type */
- virtual EPreprocessorType get_type() const;
-
- /** @return the class name */
- virtual const char* get_name() const
- {
- return "BAHSIC";
- }
-
-private:
- /** Register params and initialize with default values */
- void initialize_parameters();
-
-};
-
-}
-#endif // BAHSIC_H__
diff --git a/src/shogun/preprocessor/DependenceMaximization.cpp b/src/shogun/preprocessor/DependenceMaximization.cpp
index ac636f7fad9..16cb71576bc 100644
--- a/src/shogun/preprocessor/DependenceMaximization.cpp
+++ b/src/shogun/preprocessor/DependenceMaximization.cpp
@@ -31,7 +31,7 @@
#include
#include
#include
-#include
+#include
#include
#include
diff --git a/src/shogun/preprocessor/KernelDependenceMaximization.cpp b/src/shogun/preprocessor/KernelDependenceMaximization.cpp
deleted file mode 100644
index b8292ab166c..00000000000
--- a/src/shogun/preprocessor/KernelDependenceMaximization.cpp
+++ /dev/null
@@ -1,141 +0,0 @@
-/*
- * Copyright (c) The Shogun Machine Learning Toolbox
- * Written (w) 2014 Soumyajit De
- * All rights reserved.
- *
- * Redistribution and use in source and binary forms, with or without
- * modification, are permitted provided that the following conditions are met:
- *
- * 1. Redistributions of source code must retain the above copyright notice, this
- * list of conditions and the following disclaimer.
- * 2. Redistributions in binary form must reproduce the above copyright notice,
- * this list of conditions and the following disclaimer in the documentation
- * and/or other materials provided with the distribution.
- *
- * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
- * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
- * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
- * DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
- * ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
- * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
- * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
- * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
- * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- *
- * The views and conclusions contained in the software and documentation are those
- * of the authors and should not be interpreted as representing official policies,
- * either expressed or implied, of the Shogun Development Team.
- */
-
-#include
-#include
-#include
-
-using namespace shogun;
-
-CKernelDependenceMaximization::CKernelDependenceMaximization()
- : CDependenceMaximization()
-{
- initialize_parameters();
-}
-
-void CKernelDependenceMaximization::initialize_parameters()
-{
- SG_ADD((CSGObject**)&m_kernel_features, "kernel_features",
- "the kernel to be used for features", MS_NOT_AVAILABLE);
- SG_ADD((CSGObject**)&m_kernel_labels, "kernel_labels",
- "the kernel to be used for labels", MS_NOT_AVAILABLE);
-
- m_kernel_features=NULL;
- m_kernel_labels=NULL;
-}
-
-CKernelDependenceMaximization::~CKernelDependenceMaximization()
-{
- SG_UNREF(m_kernel_features);
- SG_UNREF(m_kernel_labels);
-}
-
-void CKernelDependenceMaximization::precompute()
-{
- SG_DEBUG("Entering!\n");
-
- REQUIRE(m_labels_feats, "Features for labels is not initialized!\n");
- REQUIRE(m_kernel_labels, "Kernel for labels is not initialized!\n");
-
- // ASSERT here because the estimator is set internally and cannot
- // be set via public API
- ASSERT(m_estimator);
-
- CFeatureSelection::precompute();
-
- // make sure that we have an instance of CKernelIndependenceTest via
- // proper cast and set this kernel to the estimator
- CKernelIndependenceTest* estimator
- =dynamic_cast(m_estimator);
- ASSERT(estimator);
-
- // precompute the kernel for labels
- m_kernel_labels->init(m_labels_feats, m_labels_feats);
- CCustomKernel* precomputed
- =new CCustomKernel(m_kernel_labels->get_kernel_matrix());
-
- // replace the kernel for labels with precomputed kernel
- SG_UNREF(m_kernel_labels);
- m_kernel_labels=precomputed;
- SG_REF(m_kernel_labels);
-
- // we can safely SG_UNREF the feature object for labels now
- SG_UNREF(m_labels_feats);
- m_labels_feats=NULL;
-
- // finally set this as kernel for the labels
- estimator->set_kernel_q(m_kernel_labels);
-
- SG_DEBUG("Leaving!\n");
-}
-
-void CKernelDependenceMaximization::set_kernel_features(CKernel* kernel)
-{
- // sanity check. using assert here because estimator instances are
- // set internally and cannot be set via public API.
- ASSERT(m_estimator);
- CKernelIndependenceTest* estimator
- =dynamic_cast(m_estimator);
- ASSERT(estimator);
-
- SG_REF(kernel);
- SG_UNREF(m_kernel_features);
- m_kernel_features=kernel;
-
- estimator->set_kernel_p(m_kernel_features);
-}
-
-void CKernelDependenceMaximization::set_kernel_labels(CKernel* kernel)
-{
- // sanity check. using assert here because estimator instances are
- // set internally and cannot be set via public API.
- ASSERT(m_estimator);
- CKernelIndependenceTest* estimator
- =dynamic_cast(m_estimator);
- ASSERT(estimator);
-
- SG_REF(kernel);
- SG_UNREF(m_kernel_labels);
- m_kernel_labels=kernel;
-
- estimator->set_kernel_q(m_kernel_labels);
-}
-
-CKernel* CKernelDependenceMaximization::get_kernel_features() const
-{
- SG_REF(m_kernel_features);
- return m_kernel_features;
-}
-
-CKernel* CKernelDependenceMaximization::get_kernel_labels() const
-{
- SG_REF(m_kernel_labels);
- return m_kernel_labels;
-}
diff --git a/src/shogun/preprocessor/KernelDependenceMaximization.h b/src/shogun/preprocessor/KernelDependenceMaximization.h
deleted file mode 100644
index 9d4159088dd..00000000000
--- a/src/shogun/preprocessor/KernelDependenceMaximization.h
+++ /dev/null
@@ -1,105 +0,0 @@
-/*
- * Copyright (c) The Shogun Machine Learning Toolbox
- * Written (w) 2014 Soumyajit De
- * All rights reserved.
- *
- * Redistribution and use in source and binary forms, with or without
- * modification, are permitted provided that the following conditions are met:
- *
- * 1. Redistributions of source code must retain the above copyright notice, this
- * list of conditions and the following disclaimer.
- * 2. Redistributions in binary form must reproduce the above copyright notice,
- * this list of conditions and the following disclaimer in the documentation
- * and/or other materials provided with the distribution.
- *
- * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
- * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
- * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
- * DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
- * ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
- * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
- * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
- * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
- * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- *
- * The views and conclusions contained in the software and documentation are those
- * of the authors and should not be interpreted as representing official policies,
- * either expressed or implied, of the Shogun Development Team.
- */
-
-#ifndef KERNEL_DEPENDENCE_MAXIMIZATION_H__
-#define KERNEL_DEPENDENCE_MAXIMIZATION_H__
-
-#include
-#include
-
-namespace shogun
-{
-
-class CFeatures;
-class CKernelSelection;
-
-/** @brief Class CKernelDependenceMaximization, that uses an implementation
- * of CKernelIndependenceTest to compute dependence measures for feature
- * selection. Different kernels are used for labels and data. For the sake
- * of computational convenience, the precompute() method is overridden to
- * precompute the kernel for labels and save as an instance of CCustomKernel
- */
-class CKernelDependenceMaximization : public CDependenceMaximization
-{
-public:
- /** Default constructor */
- CKernelDependenceMaximization();
-
- /** Destructor */
- virtual ~CKernelDependenceMaximization();
-
- /** @param kernel the kernel for features (data) */
- void set_kernel_features(CKernel* kernel);
-
- /** @return the kernel for features */
- CKernel* get_kernel_features() const;
-
- /** @param kernel the kernel for labels */
- void set_kernel_labels(CKernel* kernel);
-
- /** @return the kernel for labels */
- CKernel* get_kernel_labels() const;
-
- /**
- * Abstract method which is overridden in the subclasses to set accepted
- * feature selection algorithm
- *
- * @param algorithm the feature selection algorithm to use
- */
- virtual void set_algorithm(EFeatureSelectionAlgorithm algorithm)=0;
-
- /** @return the class name */
- virtual const char* get_name() const
- {
- return "KernelDependenceMaximization";
- }
-
-protected:
- /**
- * Precomputes the kernel on labels and replaces the #m_kernel_labels
- * with an instance of CCustomKernel. Labels features are set via
- * CDependenceMaximization::set_labels call.
- */
- virtual void precompute();
-
- /** The kernel for data (features) to be used in CKernelIndependenceTest */
- CKernel* m_kernel_features;
-
- /** The kernel for labels to be used in CKernelIndependenceTest */
- CKernel* m_kernel_labels;
-
-private:
- /** Register params and initialize with default values */
- void initialize_parameters();
-
-};
-
-}
-#endif // KERNEL_DEPENDENCE_MAXIMIZATION_H__
diff --git a/src/shogun/statistical_testing/BTestMMD.cpp b/src/shogun/statistical_testing/BTestMMD.cpp
new file mode 100644
index 00000000000..d1be47bd3dc
--- /dev/null
+++ b/src/shogun/statistical_testing/BTestMMD.cpp
@@ -0,0 +1,117 @@
+/*
+ * Restructuring Shogun's statistical hypothesis testing framework.
+ * Copyright (C) 2016 Soumyajit De
+ *
+ * This program is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 3 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+using namespace shogun;
+using namespace internal;
+
+CBTestMMD::CBTestMMD() : CStreamingMMD()
+{
+}
+
+CBTestMMD::~CBTestMMD()
+{
+}
+
+void CBTestMMD::set_blocksize(index_t blocksize)
+{
+ get_data_mgr().set_blocksize(blocksize);
+}
+
+void CBTestMMD::set_num_blocks_per_burst(index_t num_blocks_per_burst)
+{
+ get_data_mgr().set_num_blocks_per_burst(num_blocks_per_burst);
+}
+
+const std::function<float32_t(SGMatrix<float32_t>)> CBTestMMD::get_direct_estimation_method() const
+{
+ return mmd::WithinBlockDirect();
+}
+
+float64_t CBTestMMD::normalize_statistic(float64_t statistic) const
+{
+ const DataManager& data_mgr=get_data_mgr();
+ const index_t Nx=data_mgr.num_samples_at(0);
+ const index_t Ny=data_mgr.num_samples_at(1);
+ const index_t Bx=data_mgr.blocksize_at(0);
+ const index_t By=data_mgr.blocksize_at(1);
+ return Nx*Ny*statistic*CMath::sqrt((Bx+By)/float64_t(Nx+Ny))/(Nx+Ny);
+}
+
+const float64_t CBTestMMD::normalize_variance(float64_t variance) const
+{
+ const DataManager& data_mgr=get_data_mgr();
+ const index_t Bx=data_mgr.blocksize_at(0);
+ const index_t By=data_mgr.blocksize_at(1);
+ return variance*CMath::sq(Bx*By/float64_t(Bx+By));
+}
+
+float64_t CBTestMMD::compute_p_value(float64_t statistic)
+{
+ float64_t result=0;
+ switch (get_null_approximation_method())
+ {
+ case NAM_MMD1_GAUSSIAN:
+ {
+ float64_t sigma_sq=compute_variance();
+ float64_t std_dev=CMath::sqrt(sigma_sq);
+ result=1.0-CStatistics::normal_cdf(statistic, std_dev);
+ break;
+ }
+ default:
+ {
+ result=CHypothesisTest::compute_p_value(statistic);
+ break;
+ }
+ }
+ return result;
+}
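For context on the `NAM_MMD1_GAUSSIAN` branch above: under the null, the normalized statistic is approximately Gaussian with zero mean, so the p-value is one minus the Gaussian CDF evaluated at the statistic. A minimal standalone sketch of that computation (`normal_cdf` and `gaussian_p_value` are hypothetical helpers for illustration, not Shogun API):

```cpp
#include <cassert>
#include <cmath>

// CDF of N(0, std_dev^2) at x, via the complementary error function.
double normal_cdf(double x, double std_dev)
{
	return 0.5 * std::erfc(-x / (std_dev * std::sqrt(2.0)));
}

// One-sided p-value of an observed statistic under N(0, variance):
// p = 1 - Phi(statistic / sigma), mirroring the branch above.
double gaussian_p_value(double statistic, double variance)
{
	return 1.0 - normal_cdf(statistic, std::sqrt(variance));
}
```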
+
+float64_t CBTestMMD::compute_threshold(float64_t alpha)
+{
+ float64_t result=0;
+ switch (get_null_approximation_method())
+ {
+ case NAM_MMD1_GAUSSIAN:
+ {
+ float64_t sigma_sq=compute_variance();
+ float64_t std_dev=CMath::sqrt(sigma_sq);
+ result=1.0-CStatistics::inverse_normal_cdf(1-alpha, 0, std_dev);
+ break;
+ }
+ default:
+ {
+ result=CHypothesisTest::compute_threshold(alpha);
+ break;
+ }
+ }
+ return result;
+}
+
+const char* CBTestMMD::get_name() const
+{
+ return "BTestMMD";
+}
diff --git a/src/shogun/statistical_testing/BTestMMD.h b/src/shogun/statistical_testing/BTestMMD.h
new file mode 100644
index 00000000000..03439818c17
--- /dev/null
+++ b/src/shogun/statistical_testing/BTestMMD.h
@@ -0,0 +1,48 @@
+/*
+ * Restructuring Shogun's statistical hypothesis testing framework.
+ * Copyright (C) 2016 Soumyajit De
+ *
+ * This program is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 3 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef B_TEST_MMD_H_
+#define B_TEST_MMD_H_
+
+#include
+
+namespace shogun
+{
+
+class CBTestMMD : public CStreamingMMD
+{
+public:
+ typedef std::function<float32_t(SGMatrix<float32_t>)> operation;
+ CBTestMMD();
+ virtual ~CBTestMMD();
+
+ void set_blocksize(index_t blocksize);
+ void set_num_blocks_per_burst(index_t num_blocks_per_burst);
+
+ virtual float64_t compute_p_value(float64_t statistic);
+ virtual float64_t compute_threshold(float64_t alpha);
+
+ virtual const char* get_name() const;
+private:
+ virtual const operation get_direct_estimation_method() const;
+ virtual float64_t normalize_statistic(float64_t statistic) const;
+ virtual const float64_t normalize_variance(float64_t variance) const;
+};
+
+}
+#endif // B_TEST_MMD_H_
diff --git a/src/shogun/statistics/HypothesisTest.cpp b/src/shogun/statistical_testing/HypothesisTest.cpp
similarity index 50%
rename from src/shogun/statistics/HypothesisTest.cpp
rename to src/shogun/statistical_testing/HypothesisTest.cpp
index d8167fd9e24..9afd853d094 100644
--- a/src/shogun/statistics/HypothesisTest.cpp
+++ b/src/shogun/statistical_testing/HypothesisTest.cpp
@@ -1,6 +1,7 @@
/*
* Copyright (c) The Shogun Machine Learning Toolbox
- * Written (w) 2012-2013 Heiko Strathmann
+ * Written (w) 2012 - 2013 Heiko Strathmann
+ * Written (w) 2014 - 2016 Soumyajit De
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
@@ -28,98 +29,84 @@
* either expressed or implied, of the Shogun Development Team.
*/
-#include
-#include
+#include
#include
#include
+#include
+#include
using namespace shogun;
+using namespace internal;
-CHypothesisTest::CHypothesisTest() : CSGObject()
+struct CHypothesisTest::Self
+{
+ explicit Self(index_t num_distributions);
+ DataManager data_mgr;
+};
+
+CHypothesisTest::Self::Self(index_t num_distributions) : data_mgr(num_distributions)
{
- init();
}
-CHypothesisTest::~CHypothesisTest()
+CHypothesisTest::CHypothesisTest()
{
+ SG_WARNING("An empty instance of this class should not be used! If you are seeing \
+ this error, please contact Shogun developers!\n");
}
-void CHypothesisTest::init()
+CHypothesisTest::CHypothesisTest(index_t num_distributions) : CSGObject()
{
- SG_ADD(&m_num_null_samples, "num_null_samples",
- "Number of permutation iterations for sampling null",
- MS_NOT_AVAILABLE);
- SG_ADD((machine_int_t*)&m_null_approximation_method,
- "null_approximation_method",
- "Method for approximating null distribution",
- MS_NOT_AVAILABLE);
+ self=std::unique_ptr<Self>(new CHypothesisTest::Self(num_distributions));
+}
- m_num_null_samples=250;
- m_null_approximation_method=PERMUTATION;
+CHypothesisTest::~CHypothesisTest()
+{
}
-void CHypothesisTest::set_null_approximation_method(
- ENullApproximationMethod null_approximation_method)
+void CHypothesisTest::set_train_test_mode(bool on)
{
- m_null_approximation_method=null_approximation_method;
+ self->data_mgr.set_train_test_mode(on);
}
-void CHypothesisTest::set_num_null_samples(index_t num_null_samples)
+void CHypothesisTest::set_train_test_ratio(float64_t ratio)
{
- m_num_null_samples=num_null_samples;
+ self->data_mgr.set_train_test_ratio(ratio);
+ self->data_mgr.reset();
}
float64_t CHypothesisTest::compute_p_value(float64_t statistic)
{
- float64_t result=0;
-
- if (m_null_approximation_method==PERMUTATION)
- {
- /* sample a bunch of MMD values from null distribution */
- SGVector<float64_t> values=sample_null();
-
- /* find out percentile of parameter "statistic" in null distribution */
- CMath::qsort(values);
- float64_t i=values.find_position_to_insert(statistic);
-
- /* return corresponding p-value */
- result=1.0-i/values.vlen;
- }
- else
- SG_ERROR("Unknown method to approximate null distribution!\n");
-
- return result;
+ SGVector<float64_t> values=sample_null();
+ std::sort(values.vector, values.vector + values.vlen);
+ float64_t i=values.find_position_to_insert(statistic);
+ return 1.0-i/values.vlen;
}
float64_t CHypothesisTest::compute_threshold(float64_t alpha)
{
- float64_t result=0;
-
- if (m_null_approximation_method==PERMUTATION)
- {
- /* sample a bunch of MMD values from null distribution */
- SGVector<float64_t> values=sample_null();
+ SGVector<float64_t> values=sample_null();
+ std::sort(values.vector, values.vector + values.vlen);
+ return values[index_t(CMath::floor(values.vlen*(1-alpha)))];
+}
- /* return value of (1-alpha) quantile */
- CMath::qsort(values);
- result=values[index_t(CMath::floor(values.vlen*(1-alpha)))];
- }
- else
- SG_ERROR("Unknown method to approximate null distribution!\n");
+bool CHypothesisTest::perform_test(float64_t alpha)
+{
+ auto statistic=compute_statistic();
+ auto p_value=compute_p_value(statistic);
+ return p_value<alpha;
+}
+
+DataManager& CHypothesisTest::get_data_mgr()
+{
+ return self->data_mgr;
}
-bool CHypothesisTest::perform_test(float64_t alpha)
+const DataManager& CHypothesisTest::get_data_mgr() const
{
- float64_t p_value=perform_test();
- return p_value<alpha;
+ return self->data_mgr;
}
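For reviewers: the permutation scheme in `compute_p_value`/`compute_threshold` above (sort the sampled null statistics, locate the observed statistic, return one minus its relative position for the p-value, and the (1-alpha)-quantile for the threshold) can be sketched standalone as follows. `empirical_p_value` and `empirical_threshold` are hypothetical stand-ins for the class methods, not part of this patch:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// p-value: fraction of null samples that are >= the observed statistic,
// computed as 1 - (insert position in the sorted null sample) / n.
double empirical_p_value(std::vector<double> null_samples, double statistic)
{
	std::sort(null_samples.begin(), null_samples.end());
	auto pos = std::lower_bound(null_samples.begin(), null_samples.end(), statistic)
		- null_samples.begin();
	return 1.0 - double(pos) / null_samples.size();
}

// Threshold: the (1 - alpha)-quantile of the sorted null samples,
// indexed via floor(n * (1 - alpha)) as in the patched code.
double empirical_threshold(std::vector<double> null_samples, double alpha)
{
	std::sort(null_samples.begin(), null_samples.end());
	auto idx = std::size_t(std::floor(null_samples.size() * (1.0 - alpha)));
	return null_samples[idx];
}
```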
diff --git a/src/shogun/statistics/HypothesisTest.h b/src/shogun/statistical_testing/HypothesisTest.h
similarity index 52%
rename from src/shogun/statistics/HypothesisTest.h
rename to src/shogun/statistical_testing/HypothesisTest.h
index e9607760887..2346e9ef5ee 100644
--- a/src/shogun/statistics/HypothesisTest.h
+++ b/src/shogun/statistical_testing/HypothesisTest.h
@@ -1,6 +1,7 @@
/*
* Copyright (c) The Shogun Machine Learning Toolbox
- * Written (w) 2012-2013 Heiko Strathmann
+ * Written (w) 2012 - 2013 Heiko Strathmann
+ * Written (w) 2014 - 2016 Soumyajit De
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
@@ -31,40 +32,30 @@
#ifndef HYPOTHESIS_TEST_H_
#define HYPOTHESIS_TEST_H_
+#include
#include
-
#include
namespace shogun
{
-/** enum for different statistic types */
-enum EStatisticType
-{
- S_LINEAR_TIME_MMD,
- S_QUADRATIC_TIME_MMD,
- S_HSIC,
- S_NOCCO
-};
+class CFeatures;
-/** enum for different method to approximate null-distibution */
-enum ENullApproximationMethod
+namespace internal
{
- PERMUTATION,
- MMD2_SPECTRUM_DEPRECATED,
- MMD2_SPECTRUM,
- MMD2_GAMMA,
- MMD1_GAUSSIAN,
- HSIC_GAMMA
-};
-/** @brief Hypothesis test base class. Provides an interface for statistical
+class DataManager;
+
+}
+
+/**
+ * @brief Hypothesis test base class. Provides an interface for statistical
* hypothesis testing via three methods: compute_statistic(), compute_p_value()
* and compute_threshold(). The second computes a p-value for the statistic
- * computed by the first method.
- * The p-value represents the position of the statistic in the null-distribution,
- * i.e. the distribution of the statistic population given the null-hypothesis
- * is true. (1-position = p-value).
+ * computed by the first method. The p-value represents the position of the
+ * statistic in the null-distribution, i.e. the distribution of the statistic
+ * population given the null-hypothesis is true. (1-position = p-value).
+ *
* The third method, compute_threshold(), computes a threshold for a given
* test level which is needed to reject the null-hypothesis.
*
@@ -78,20 +69,50 @@ enum ENullApproximationMethod
class CHypothesisTest : public CSGObject
{
public:
- /** default constructor */
+ /** Default constructor */
CHypothesisTest();
- /** destructor */
+ /** Destructor */
virtual ~CHypothesisTest();
- /** @return test statistic for the given data/parameters/methods */
- virtual float64_t compute_statistic()=0;
+ /**
+ * Method that enables/disables the training-testing mode. If this option
+ * is turned on, then the samples would be split in two pieces: one chunk
+ * would be used for training algorithms and the other chunk would be used
+ * for performing tests. If this option is turned off, the entire data
+ * would be used for performing the test. Before running any training
+ * algorithms, make sure to turn this mode on.
+ *
+ * By default, the training-testing mode is turned off.
+ *
+ * \sa {set_train_test_ratio()}
+ *
+ * @param on Whether to enable/disable the training-testing mode
+ */
+ void set_train_test_mode(bool on);
+
+ /**
+ * Method that specifies the ratio of training-testing data split for the
+ * algorithms. Note that this is NOT the percentage of samples to be used
+ * for training, rather the ratio of the number of samples to be used for
+ * training and that of testing.
+ *
+ * By default, an equal 50-50 split (ratio = 1) is made.
+ *
+ * \sa {set_train_test_mode()}
+ *
+ * @param ratio The ratio of the number of samples to be used for training
+ * and that of testing
+ */
+ void set_train_test_ratio(float64_t ratio);
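A note on the ratio semantics documented above: a ratio r of training to testing samples corresponds to a training fraction of r/(1+r), so the default r=1 yields the 50-50 split. A tiny sketch of that arithmetic (`train_fraction` is a hypothetical helper for illustration, not part of this patch):

```cpp
#include <cassert>

// Convert a train/test ratio r (num_train / num_test) into the
// fraction of all samples used for training: r / (1 + r).
double train_fraction(double ratio)
{
	return ratio / (1.0 + ratio);
}
```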
- /** computes a p-value based on current method for approximating the
- * null-distribution. The p-value is the 1-p quantile of the null-
+ /**
+ * Method that computes a p-value based on current method for approximating
+ * the null-distribution. The p-value is the 1-p quantile of the null-
* distribution where the given statistic lies in.
+ *
* This method depends on the implementation of sample_null method
- * which should be implemented in its sub-classes
+ * which should be implemented by the sub-classes.
*
* @param statistic statistic value to compute the p-value for
* @return p-value parameter statistic is the (1-p) percentile of the
@@ -99,38 +120,24 @@ class CHypothesisTest : public CSGObject
*/
virtual float64_t compute_p_value(float64_t statistic);
- /** computes a threshold based on current method for approximating the
- * null-distribution. The threshold is the value that a statistic has
+ /**
+ * Method that computes a threshold based on current method for approximating
+ * the null-distribution. The threshold is the value that a statistic has
 * to have in order to reject the null-hypothesis.
+ *
* This method depends on the implementation of sample_null method
- * which should be implemented in its sub-classes
+ * which should be implemented by the sub-classes.
*
* @param alpha test level to reject null-hypothesis
- * @return threshold for statistics to reject null-hypothesis
+ * @return Threshold for statistics to reject null-hypothesis
*/
virtual float64_t compute_threshold(float64_t alpha);
- /** Performs the complete two-sample test on current data and returns a
- * p-value.
- *
- * This is a wrapper that calls compute_statistic first and then
- * calls compute_p_value using the obtained statistic. In some statistic
- * classes, it might be possible to compute statistic and p-value in
- * one single run which is more efficient. Therefore, this method might
- * be overwritten in subclasses.
+ /**
+ * Method that performs the complete hypothesis test on current data and
+ * returns a binary answer: whether the null hypothesis is rejected or not.
*
- * The method for computing the p-value can be set via
- * set_null_approximation_method().
- *
- * @return p-value such that computed statistic is the (1-p) quantile
- * of the estimated null distribution
- */
- virtual float64_t perform_test();
-
- /** Performs the complete two-sample test on current data and returns
- * a binary answer wheter null hypothesis is rejected or not.
- *
- * This is just a wrapper for the above perform_test() method that
+ * This is just a wrapper for the above compute_p_value() method that
* returns a p-value. If this p-value lies below the test level alpha,
* the null hypothesis is rejected.
*
@@ -141,42 +148,34 @@ class CHypothesisTest : public CSGObject
*/
bool perform_test(float64_t alpha);
- /** computes the test statistic m_num_null_samples times, exact
- * computation depends on the implementations.
+ /**
+ * Interface for computing the test-statistic for the hypothesis test.
*
- * @return vector of all statistics
+ * @return Test statistic for the given data/parameters/methods
*/
- virtual SGVector<float64_t> sample_null()=0;
+ virtual float64_t compute_statistic()=0;
- /** sets the number of permutation iterations for sample_null()
+ /**
+ * Interface for computing the samples under the null-hypothesis.
*
- * @param num_null_samples how often permutation shall be done
+ * @return Vector of all statistics
*/
- virtual void set_num_null_samples(index_t num_null_samples);
-
- /** sets the method how to approximate the null-distribution
- * @param null_approximation_method method to use
- */
- virtual void set_null_approximation_method(
- ENullApproximationMethod null_approximation_method);
-
- /** returns the statistic type of this test statistic */
- virtual EStatisticType get_statistic_type() const=0;
-
- virtual const char* get_name() const=0;
-
-private:
- /** register parameters and initialize with default values */
- void init();
+ virtual SGVector<float64_t> sample_null()=0;
+ /** @return The name of the class */
+ virtual const char* get_name() const;
protected:
- /** number of iterations for sampling from null-distributions */
- index_t m_num_null_samples;
+ explicit CHypothesisTest(index_t num_distributions);
+ internal::DataManager& get_data_mgr();
+ const internal::DataManager& get_data_mgr() const;
+private:
+ CHypothesisTest(const CHypothesisTest& other)=delete;
+ CHypothesisTest& operator=(const CHypothesisTest& other)=delete;
- /** Defines how the the null distribution is approximated */
- ENullApproximationMethod m_null_approximation_method;
+ struct Self;
+ std::unique_ptr<Self> self;
};
}
-#endif /* HYPOTHESIS_TEST_H_ */
+#endif // HYPOTHESIS_TEST_H_
diff --git a/src/shogun/statistical_testing/IndependenceTest.cpp b/src/shogun/statistical_testing/IndependenceTest.cpp
new file mode 100644
index 00000000000..77b8013a401
--- /dev/null
+++ b/src/shogun/statistical_testing/IndependenceTest.cpp
@@ -0,0 +1,79 @@
+/*
+ * Restructuring Shogun's statistical hypothesis testing framework.
+ * Copyright (C) 2016 Soumyajit De
+ *
+ * This program is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 3 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include
+#include
+#include
+#include
+
+using namespace shogun;
+using namespace internal;
+
+struct CIndependenceTest::Self
+{
+ Self(index_t num_kernels);
+ KernelManager kernel_mgr;
+};
+
+CIndependenceTest::Self::Self(index_t num_kernels) : kernel_mgr(num_kernels)
+{
+}
+
+CIndependenceTest::CIndependenceTest() : CTwoDistributionTest()
+{
+ self=std::unique_ptr<Self>(new Self(IndependenceTest::num_kernels));
+}
+
+CIndependenceTest::~CIndependenceTest()
+{
+}
+
+void CIndependenceTest::set_kernel_p(CKernel* kernel_p)
+{
+ self->kernel_mgr.kernel_at(0)=kernel_p;
+}
+
+CKernel* CIndependenceTest::get_kernel_p() const
+{
+ return self->kernel_mgr.kernel_at(0);
+}
+
+void CIndependenceTest::set_kernel_q(CKernel* kernel_q)
+{
+ self->kernel_mgr.kernel_at(1)=kernel_q;
+}
+
+CKernel* CIndependenceTest::get_kernel_q() const
+{
+ return self->kernel_mgr.kernel_at(1);
+}
+
+const char* CIndependenceTest::get_name() const
+{
+ return "IndependenceTest";
+}
+
+KernelManager& CIndependenceTest::get_kernel_mgr()
+{
+ return self->kernel_mgr;
+}
+
+const KernelManager& CIndependenceTest::get_kernel_mgr() const
+{
+ return self->kernel_mgr;
+}
diff --git a/src/shogun/statistical_testing/IndependenceTest.h b/src/shogun/statistical_testing/IndependenceTest.h
new file mode 100644
index 00000000000..492fd46d998
--- /dev/null
+++ b/src/shogun/statistical_testing/IndependenceTest.h
@@ -0,0 +1,105 @@
+/*
+ * Restructuring Shogun's statistical hypothesis testing framework.
+ * Copyright (C) 2016 Soumyajit De
+ *
+ * This program is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 3 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef INDEPENDENCE_TEST_H_
+#define INDEPENDENCE_TEST_H_
+
+#include
+#include
+
+namespace shogun
+{
+
+class CKernel;
+
+namespace internal
+{
+ class KernelManager;
+}
+
+/**
+ * @brief Provides an interface for performing the independence test.
+ * Given samples \f$Z=\{(x_i,y_i)\}_{i=1}^m\f$ from the joint distribution
+ * \f$\textbf{P}_{xy}\f$, the test checks whether the joint distribution
+ * factorizes as \f$\textbf{P}_{xy}=\textbf{P}_x\textbf{P}_y\f$, i.e. into the
+ * product of the marginals. The null-hypothesis says yes, i.e. no dependence;
+ * the alternative hypothesis says no.
+ *
+ * Abstract base class. Provides all interfaces and implements approximating
+ * the null distribution via permutation, i.e. shuffling the samples from
+ * one distribution repeatedly using subsets while keeping the samples from
+ * the other distribution in their original order.
+ *
+ */
+class CIndependenceTest : public CTwoDistributionTest
+{
+public:
+ /** Default constructor */
+ CIndependenceTest();
+
+ /** Destructor */
+ virtual ~CIndependenceTest();
+
+ /**
+ * Method that sets the kernel to be used for performing the test for the
+ * samples from p.
+ *
+ * @param kernel_p The kernel instance to be used for samples from p
+ */
+ void set_kernel_p(CKernel* kernel_p);
+
+ /** @return The kernel instance that is used for samples from p */
+ CKernel* get_kernel_p() const;
+
+ /**
+ * Method that sets the kernel to be used for performing the test for the
+ * samples from q.
+ *
+ * @param kernel_q The kernel instance to be used for samples from q
+ */
+ void set_kernel_q(CKernel* kernel_q);
+
+ /** @return The kernel instance that is used for samples from q */
+ CKernel* get_kernel_q() const;
+
+ /**
+ * Interface for computing the test-statistic for the hypothesis test.
+ *
+ * @return test statistic for the given data/parameters/methods
+ */
+ virtual float64_t compute_statistic()=0;
+
+ /**
+ * Interface for computing the samples under the null-hypothesis.
+ *
+ * @return vector of all statistics
+ */
+ virtual SGVector<float64_t> sample_null()=0;
+
+ /** @return The name of the class */
+ virtual const char* get_name() const;
+protected:
+ internal::KernelManager& get_kernel_mgr();
+ const internal::KernelManager& get_kernel_mgr() const;
+private:
+ struct Self;
+ std::unique_ptr<Self> self;
+};
+
+}
+#endif // INDEPENDENCE_TEST_H_
diff --git a/src/shogun/statistical_testing/LinearTimeMMD.cpp b/src/shogun/statistical_testing/LinearTimeMMD.cpp
new file mode 100644
index 00000000000..c1e9fac7de5
--- /dev/null
+++ b/src/shogun/statistical_testing/LinearTimeMMD.cpp
@@ -0,0 +1,152 @@
+/*
+ * Restructuring Shogun's statistical hypothesis testing framework.
+ * Copyright (C) 2016 Soumyajit De
+ *
+ * This program is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 3 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+using namespace shogun;
+using namespace internal;
+
+CLinearTimeMMD::CLinearTimeMMD() : CStreamingMMD()
+{
+}
+
+CLinearTimeMMD::CLinearTimeMMD(CFeatures* samples_from_p, CFeatures* samples_from_q) : CStreamingMMD()
+{
+ set_p(samples_from_p);
+ set_q(samples_from_q);
+}
+
+CLinearTimeMMD::~CLinearTimeMMD()
+{
+}
+
+void CLinearTimeMMD::set_num_blocks_per_burst(index_t num_blocks_per_burst)
+{
+ auto& data_mgr=get_data_mgr();
+ auto min_blocksize=data_mgr.get_min_blocksize();
+ if (min_blocksize==2)
+ {
+ // only possible when number of samples from both the distributions are the same
+ auto N=data_mgr.num_samples_at(0);
+ for (auto i=2; i