In [1]:
from pylab import *
from scipy import linalg
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
import plotly.graph_objs as go

init_notebook_mode(connected=True)

Below is a simple way to generate isotropic data from a union of $K$ subspaces. We'll look at both unnormalized and normalized data. You should generate your data like this :)

In [2]:
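D = 100    # ambient dimension
d = 10     # dimension of each subspace
K = 2      # number of subspaces
Nk = 200   # number of points per subspace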

In [3]:
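Ufull = empty((K,D,d))   # one orthonormal basis (D x d) per subspace
X = empty((D,0))         # data matrix, filled one subspace at a time
for kk in range(K):
    Ufull[kk,:,:] = linalg.orth(randn(D,d))   # orthonormal basis of a random d-dimensional subspace
    Xk = Ufull[kk,:,:] @ randn(d,Nk)          # Nk points with i.i.d. N(0,1) coefficients
    X = np.append(X, Xk, axis=1)              # append the new points as columns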
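
As a sanity check on the construction above, here is a minimal self-contained sketch that rebuilds the data with plain NumPy and verifies a few properties we expect: each basis is orthonormal, each block of points lies in its own subspace, and the full data matrix has rank $Kd$. The seed, the `rng` generator, and the use of a list with `np.hstack` instead of repeated `np.append` are choices made for this sketch, not part of the cell above.

```python
import numpy as np
from scipy import linalg

rng = np.random.default_rng(0)   # arbitrary seed, only for reproducibility
D, d, K, Nk = 100, 10, 2, 200

U = np.empty((K, D, d))
blocks = []
for k in range(K):
    # Orthonormal basis of a random d-dimensional subspace of R^D
    U[k] = linalg.orth(rng.standard_normal((D, d)))
    # Nk points with i.i.d. N(0,1) coefficients in that basis
    blocks.append(U[k] @ rng.standard_normal((d, Nk)))
X = np.hstack(blocks)            # D x (K*Nk), columns grouped by subspace

for k in range(K):
    # Basis columns are orthonormal: U^T U = I
    assert np.allclose(U[k].T @ U[k], np.eye(d))
    # Projecting block k onto subspace k leaves it unchanged
    Xk = X[:, k * Nk:(k + 1) * Nk]
    assert np.allclose(U[k] @ (U[k].T @ Xk), Xk)

# Two random d-dimensional subspaces are independent with probability 1,
# so the union should span K*d dimensions
rank = np.linalg.matrix_rank(X)
print(X.shape, rank)
```

Collecting the blocks in a list and stacking once avoids copying the whole matrix on every iteration, which matters only when `K` or `Nk` is large.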

The data above is isotropic within each subspace, since the coefficients with respect to an orthonormal basis are drawn i.i.d. from a $\mathcal{N}(0,1)$ distribution. However, the columns do not have unit norm. For example:

In [4]:
norm(X, axis=0)[0:4]

array([ 4.76825367,  1.67160182,  3.37449801,  1.84221364])

So let's normalize each column to unit norm and check our result.

In [5]:
Xn = X / norm(X, axis=0)
norm(Xn, axis=0)[0:10]

array([ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.])
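
The division above works because no column of `X` is zero, which holds with probability 1 for Gaussian data. For data that may contain zero columns, a guarded version could look like the sketch below. The helper name `normalize_columns` and the threshold `eps` are hypothetical and introduced only for illustration.

```python
import numpy as np

def normalize_columns(X, eps=1e-12):
    """Scale each column of X to unit Euclidean norm.

    Columns with norm <= eps are left unchanged instead of being divided by ~0.
    (Hypothetical helper; eps is an arbitrary illustrative threshold.)
    """
    norms = np.linalg.norm(X, axis=0, keepdims=True)   # shape (1, N)
    safe = np.where(norms > eps, norms, 1.0)           # avoid division by zero
    return X / safe

rng = np.random.default_rng(0)     # arbitrary seed
X_demo = rng.standard_normal((5, 4))
X_demo[:, 2] = 0.0                 # deliberately insert a zero column

Xn_demo = normalize_columns(X_demo)
col_norms = np.linalg.norm(Xn_demo, axis=0)
print(col_norms)
```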

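Now that the columns have unit norm, we can add zero-mean Gaussian noise to them. If the noise added to each column has total variance (expected squared norm) $\sigma^{2}$, the signal-to-noise ratio will be
\begin{equation}
    \text{SNR} = \frac{1}{\sigma^{2}}
\end{equation}
Note that `varn` in the next cell is a per-entry variance, so the expected noise energy per column there is $D$ times `varn`.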

In [22]:
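varn = 0.01                                      # noise variance of each entry
noise = sqrt(varn)*randn(X.shape[0],X.shape[1])  # i.i.d. N(0, varn) noise, same shape as X
Xnoisy = Xn + noise                              # noisy version of the normalized data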
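
It can be useful to measure the noise level empirically. The sketch below assumes SNR is defined as the ratio of average signal energy to average noise energy per column; the notebook does not state which definition it uses, so treat this as one possible reading. It uses stand-in unit-norm columns (`Xn_demo`) rather than the subspace data, since only the column norms matter here.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed, only for reproducibility
D, N = 100, 400
varn = 0.01                      # per-entry noise variance, as in the cell above

# Stand-in for Xn: random columns scaled to unit norm (assumption for this sketch)
Xn_demo = rng.standard_normal((D, N))
Xn_demo /= np.linalg.norm(Xn_demo, axis=0)

noise = np.sqrt(varn) * rng.standard_normal((D, N))

signal_energy = np.sum(Xn_demo**2, axis=0)   # 1 for every column
noise_energy = np.sum(noise**2, axis=0)      # about D * varn on average

# Energy-ratio SNR, compared with the value predicted by D * varn
snr_energy = signal_energy.mean() / noise_energy.mean()
print(snr_energy, 1 / (D * varn))
```

Under this definition, hitting a target SNR means choosing the per-entry variance as `varn = 1 / (D * SNR)`.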