### When To Use Random Projection Instead of PCA
For very high-dimensional datasets (e.g., image data), PCA can be too slow: its computational complexity is polynomial (roughly O(m × n²) + O(n³) for a full SVD on m samples with n features), so training slows down quickly as the number of features grows. In that case we can use random projection instead.

### Random Projection
A random projection from d dimensions to d′ dimensions is a linear transformation represented by a d × d′ matrix R, which is generated by first setting each entry of the matrix to a value drawn from an i.i.d. N(0,1) distribution and then normalizing the columns to unit length. Given a d-dimensional data set represented as an n × d matrix X, where n is the number of data points in X, the mapping X × R results in a reduced-dimension data set X′.
Ref: https://web.engr.oregonstate.edu/~xfern/rpm_icml03.pdf

Note that the code cells below use different letters: m is the number of data points, n is the original number of dimensions, and d is the reduced number of dimensions.
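The definition above can be sketched directly in NumPy. This is only an illustration under assumed sizes: the names `n_points`, `d_in`, `d_out`, and `X_prime` are hypothetical and chosen to avoid clashing with the variables used in the cells that follow.

```python
import numpy as np

rng = np.random.default_rng(0)
n_points, d_in, d_out = 100, 50, 10  # illustrative sizes (assumption)

# Draw every entry of R from N(0, 1), then normalize each column to unit length
R = rng.standard_normal((d_in, d_out))
R /= np.linalg.norm(R, axis=0, keepdims=True)

# Project an n_points x d_in data set down to n_points x d_out
X = rng.standard_normal((n_points, d_in))
X_prime = X @ R
print(X_prime.shape)
```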

In [1]:
from sklearn.random_projection import johnson_lindenstrauss_min_dim

m, epsilon = 5000, 0.1
d = johnson_lindenstrauss_min_dim(m, eps=epsilon)
d

7300
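The value above can be reproduced by hand. As a rough sketch, assuming the bound documented for `johnson_lindenstrauss_min_dim`, d ≥ 4 ln(m) / (ε²/2 − ε³/3), with the result truncated to an integer (the helper name `jl_min_dim` is hypothetical):

```python
import math

def jl_min_dim(m: int, eps: float) -> int:
    """Hypothetical helper: 4 ln(m) / (eps^2/2 - eps^3/3), truncated to an int."""
    return int(4 * math.log(m) / (eps**2 / 2 - eps**3 / 3))

print(jl_min_dim(5000, 0.1))  # 7300
```

Note that the bound depends only on the number of samples m and the tolerance ε, not on the original number of features.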

In [2]:
# random Gaussian projection matrix P of shape [d, n]
import numpy as np

n = 20_000
np.random.seed(42)
P = np.random.randn(d, n) / np.sqrt(d)  # entries have variance 1/d, so distances are roughly preserved
P.shape

(7300, 20000)

In [3]:
X = np.random.randn(m, n)  # fake dataset: m samples, n features
X_reduced = X @ P.T
X_reduced.shape

(5000, 7300)
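The point of such a projection is that pairwise distances are approximately preserved, provided the Gaussian matrix is scaled by 1/√d. A minimal check, using much smaller sizes than above so it runs quickly (all `*_small` names and the `pairwise_distances` helper are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(42)
m_small, n_small, d_small = 20, 2_000, 1_000  # illustrative sizes (assumption)

X_small = rng.standard_normal((m_small, n_small))
# Scale by 1/sqrt(d) so squared distances are preserved in expectation
P_small = rng.standard_normal((d_small, n_small)) / np.sqrt(d_small)
X_small_reduced = X_small @ P_small.T

def pairwise_distances(A):
    """Euclidean distances between all distinct pairs of rows of A."""
    diffs = A[:, None, :] - A[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    i, j = np.triu_indices(len(A), k=1)
    return dists[i, j]

# Ratio of projected distance to original distance for every pair
ratios = pairwise_distances(X_small_reduced) / pairwise_distances(X_small)
print(ratios.min(), ratios.max())
```

Ratios close to 1 mean the projection distorted the distances only slightly. Without the 1/√d scaling, every ratio would be inflated by a factor of about √d.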

### Scikit-Learn's GaussianRandomProjection Class

In [4]:
from sklearn.random_projection import GaussianRandomProjection

gaussian_rnd_proj = GaussianRandomProjection(eps=epsilon, random_state=42)
X_reduced = gaussian_rnd_proj.fit_transform(X)  # same kind of projection as above, with the target dimension derived from eps
X_reduced.shape

(5000, 7300)

### SparseRandomProjection Class: Faster Than The GaussianRandomProjection Class

In [5]:
from sklearn.random_projection import SparseRandomProjection

sparse_rnd_proj = SparseRandomProjection(eps=epsilon, dense_output=True, random_state=42)
X_reduced = sparse_rnd_proj.fit_transform(X)
X_reduced.shape

(5000, 7300)
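The speed-up comes from most entries of the projection matrix being zero. As a rough sketch of the scheme described in the scikit-learn documentation, with s = 1/density and a default density of 1/√n_features, each entry is −√(s/k), 0, or +√(s/k) with probabilities 1/(2s), 1 − 1/s, and 1/(2s), where k is the number of components. The sizes below are illustrative assumptions, and the matrix is stored densely here for simplicity, whereas scikit-learn keeps it as a SciPy sparse matrix.

```python
import numpy as np

rng = np.random.default_rng(42)
n_features, n_components = 2_500, 200  # illustrative sizes (assumption)
s = np.sqrt(n_features)                # 1 / density, with density = 1 / sqrt(n_features)

scale = np.sqrt(s / n_components)
# Each entry is -scale, 0, or +scale with probabilities 1/(2s), 1 - 1/s, 1/(2s)
R_sparse = rng.choice(
    [-scale, 0.0, scale],
    size=(n_components, n_features),
    p=[1 / (2 * s), 1 - 1 / s, 1 / (2 * s)],
)

nonzero_fraction = np.count_nonzero(R_sparse) / R_sparse.size
print(nonzero_fraction)  # close to 1/s = 0.02
```

Each entry still has variance 1/k, matching the scaled Gaussian matrix, which is why the sparse variant preserves distances comparably well while needing far fewer multiplications.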

However, random projection is not always used to reduce the dimensionality of large datasets.