## 8. What strategies do you know (or can think of) in order to make PCA more robust?

## Sparse PCA

Sparse PCA is a variant of Principal Component Analysis (PCA) used in machine learning and multivariate statistical analysis. Like PCA, it reduces the dimensionality of a dataset, but it adds a sparsity constraint so that each component is built from only a few of the input features.

With standard PCA, we can only select the most important overall features, assuming each instance can be rebuilt from the same components, and each component is a dense combination of all input features. With the sparse method, we still use a limited number of components, but without the limitation imposed by a dense projection matrix. The components are instead stored in a sparse matrix, where the number of non-zero elements is low.

Sparse PCA is a variation of the PCA algorithm that enforces some degree of sparsity in the components. The amount of sparsity is typically controlled by a hyperparameter called alpha: a larger alpha penalizes non-zero loadings more strongly and yields sparser components. Compared with classic PCA, sparse PCA performs a finer dimensionality reduction, but it consumes more memory.

Reference: 
- https://thecleverprogrammer.com/2021/05/09/sparse-pca-in-machine-learning/


Example

In [1]:
from sklearn.decomposition import SparsePCA
from sklearn.datasets import load_digits
digits = load_digits()
print(digits.data.shape)

(1797, 64)


In [2]:
# load_digits pixel intensities range from 0 to 16, so divide by 16 to scale to [0, 1]
sparse_pca = SparsePCA(n_components=60, alpha=0.1)
sparse_pca.fit(digits.data / 16)
print(sparse_pca.components_.shape)

(60, 64)
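
The effect of alpha described above can be checked by counting how many loadings end up exactly zero. The following is a minimal sketch, not a tuned setup: the alpha values (0.1 and 2.0), the choice of 5 components, and the helper `zero_fraction` are illustrative assumptions, and classic PCA is included only as a dense baseline.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA, SparsePCA

# Scale pixel intensities (0 to 16 in load_digits) to [0, 1].
X = load_digits().data / 16


def zero_fraction(components):
    """Fraction of loadings that are exactly zero (hypothetical helper)."""
    return float(np.mean(components == 0))


# Classic PCA as a dense baseline.
pca = PCA(n_components=5).fit(X)
zero_frac = {"PCA": zero_fraction(pca.components_)}

# Illustrative alpha values only; useful values depend on the data scale.
for alpha in (0.1, 2.0):
    spca = SparsePCA(n_components=5, alpha=alpha, random_state=0).fit(X)
    zero_frac[f"SparsePCA(alpha={alpha})"] = zero_fraction(spca.components_)

for name, frac in zero_frac.items():
    print(f"{name}: {frac:.1%} zero loadings")
```

On this dataset the larger alpha should leave a clearly higher share of zero loadings than the dense PCA baseline. The exact percentages depend on the scikit-learn version and the data scaling.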


## Incremental PCA

Incremental principal component analysis (IPCA) is typically used as a replacement for principal component analysis (PCA) when the dataset to be decomposed is too large to fit in memory. IPCA builds a low-rank approximation of the input data using an amount of memory that is independent of the number of input samples. Memory use still depends on the number of input features, but changing the batch size allows it to be controlled.

Example

In [8]:
from sklearn.decomposition import IncrementalPCA
from sklearn.datasets import load_digits
digits = load_digits()
print(digits.data.shape)

(1797, 64)


In [9]:
# load_digits pixel intensities range from 0 to 16, so divide by 16 to scale to [0, 1]
incremental_pca = IncrementalPCA(n_components=5, batch_size=10)
incremental_pca.fit(digits.data / 16)
print(incremental_pca.components_.shape)

(5, 64)
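
The example above passes the whole array at once and lets `batch_size` split it internally. When the data really does not fit in memory, `partial_fit` can instead be called once per chunk. The sketch below only simulates that case: the chunks are sliced from an in-memory array, and the choice of nine chunks is an arbitrary assumption. In a real out-of-core setting each chunk would be read from disk.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import IncrementalPCA

# Scale pixel intensities (0 to 16 in load_digits) to [0, 1].
X = load_digits().data / 16

ipca = IncrementalPCA(n_components=5)

# Stand-in for a data stream: nine chunks of about 200 rows each.
# Each chunk must contain at least n_components samples.
for chunk in np.array_split(X, 9):
    ipca.partial_fit(chunk)

# Project the data chunk by chunk as well, so that the full array
# never has to be transformed in one call.
X_reduced = np.vstack([ipca.transform(chunk) for chunk in np.array_split(X, 9)])

print(ipca.components_.shape)
print(X_reduced.shape)
print(ipca.n_samples_seen_)
```

Only one chunk is needed at a time during fitting, which is what keeps memory use independent of the total number of samples.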
