# How to Scikit-Learn

## Unsupervised Learning

### PCA

https://scikit-learn.org/stable/modules/decomposition.html#pca
PCA finds successive orthogonal components, each explaining as much as possible of the remaining variance of the centered data set.
Here is a very simple video on the topic: https://www.youtube.com/watch?v=FgakZw6K1QQ

Here is the scikit-learn documentation:
https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html#sklearn.decomposition.PCA

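    from sklearn.decomposition import PCA

    # n_components can be an int, the string 'mle', or a float in (0, 1)
    pca = PCA(n_components=2)
    pca.fit(X)  # X: array of shape (n_samples, n_features)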

In n_components you can specify:
* an integer: the number of components to keep
* 'mle': lets Minka's MLE algorithm choose the number of components for you (used with svd_solver='full') https://vismod.media.mit.edu/tech-reports/TR-514.pdf
* a float strictly between 0 and 1 (used with svd_solver='full'): the fraction of the total variance that the kept components must explain
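
As a minimal sketch of the three kinds of value, the snippet below fits PCA three times on synthetic data and prints how many components each fit keeps. The data set and the variable names (`latent`, `mixing`, `pca_int`, ...) are assumptions made up for illustration; only `PCA`, `n_components`, `svd_solver` and `n_components_` come from scikit-learn.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data (an assumption for illustration): 200 samples, 5 features,
# generated from 2 latent directions plus a little noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.05 * rng.normal(size=(200, 5))

# An integer keeps exactly that many components.
pca_int = PCA(n_components=2).fit(X)

# 'mle' lets Minka's MLE estimate the number of components.
pca_mle = PCA(n_components="mle", svd_solver="full").fit(X)

# A float in (0, 1) keeps enough components to explain that share of variance.
pca_var = PCA(n_components=0.95, svd_solver="full").fit(X)

# n_components_ holds the number of components actually kept by each fit.
for name, model in [("int", pca_int), ("mle", pca_mle), ("float", pca_var)]:
    print(name, model.n_components_)
```

The fitted attribute `n_components_` is the place to look when the number of components was not given explicitly.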

Useful attributes
* components_ : array, shape (n_components, n_features) -- the principal axes, one per row; each column holds the weight of the corresponding original feature
* explained_variance_ : array, shape (n_components,) -- the variance explained by each component
* explained_variance_ratio_ : array, shape (n_components,) -- the fraction of the total variance explained by each component
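
The attributes can be inspected after fitting, as in this short sketch. The random data and its per-feature scaling are assumptions chosen only so that the components have clearly different variances.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data (an assumption): 4 features with decreasing spread.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4)) * np.array([5.0, 2.0, 1.0, 0.5])

pca = PCA(n_components=2).fit(X)

# One row per component, one column per original feature.
print(pca.components_.shape)

# Variance carried by each component, in decreasing order.
print(pca.explained_variance_)

# Same information as a fraction of the total variance.
print(pca.explained_variance_ratio_)
```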

Some Methods
* fit(X) : fits the model with X
* fit_transform(X) : fits the model AND returns the transformed data
* transform(X) : projects X onto the components of the fitted model
* inverse_transform(X) : maps transformed data back to the original space
* get_covariance() : computes the estimated data covariance matrix $cov \in \mathscr{M}_{n_{features}}$  
$$cov = components^T * S^2 * components + \sigma^2 * I_{n_{features}}$$ 
where $S^2$ contains the explained variances, and $\sigma^2$ is the estimated noise variance (noise_variance_).
* get_precision() : computes the precision matrix (inverse of the covariance)
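
The methods above can be chained as in the following sketch. The synthetic data (two latent directions embedded in four features) is an assumption for illustration; the reconstruction is not exact because only two of the four components are kept.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data (an assumption): 4 features driven by 2 latent directions.
rng = np.random.default_rng(0)
latent = rng.normal(size=(300, 2))
X = latent @ rng.normal(size=(2, 4)) + 0.1 * rng.normal(size=(300, 4))

pca = PCA(n_components=2, svd_solver="full")

# Fit and project in one call, then project again with the fitted model.
X_reduced = pca.fit_transform(X)
X_again = pca.transform(X)

# Map the 2-D representation back to the original 4-D space.
X_back = pca.inverse_transform(X_reduced)

# Model-based covariance and its inverse, both of shape (n_features, n_features).
cov = pca.get_covariance()
precision = pca.get_precision()

print(X_reduced.shape, X_back.shape, cov.shape)
print("mean squared reconstruction error:", np.mean((X - X_back) ** 2))
```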

If you only need the first few components of a large dataset you can use
* svd_solver='randomized' : computes an approximate truncated SVD that only estimates the n_components leading components, which is cheaper than a full decomposition
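
A rough way to see the trade-off is to compare the randomized solver with the full one on the same data. This is only a sketch: the low-rank synthetic data and the choice of 5 components are assumptions, and `random_state` is set so the randomized result is repeatable.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data (an assumption): 100 features driven by 5 latent directions.
rng = np.random.default_rng(0)
latent = rng.normal(size=(1000, 5))
X = latent @ rng.normal(size=(5, 100)) + 0.01 * rng.normal(size=(1000, 100))

# Approximate solver: only estimates the 5 leading components.
pca_rand = PCA(n_components=5, svd_solver="randomized", random_state=0).fit(X)

# Exact solver, for comparison.
pca_full = PCA(n_components=5, svd_solver="full").fit(X)

print(pca_rand.explained_variance_ratio_)
print(pca_full.explained_variance_ratio_)
```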

#### Incremental PCA

For data sets too large to fit in memory, you can process the data in chunks.
IncrementalPCA computes estimates of the components and noise variances from one batch and then updates them with each following batch <br>
https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.IncrementalPCA.html
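
A minimal sketch of batch-wise fitting with `partial_fit` follows. Here the whole array is in memory and split with `np.array_split` purely for illustration; in a real out-of-core setting each batch would be loaded from disk instead.

```python
import numpy as np
from sklearn.decomposition import IncrementalPCA

# Synthetic data (an assumption); stands in for data read chunk by chunk.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))

ipca = IncrementalPCA(n_components=3)

# Feed the model one batch at a time; each call updates the estimates.
for batch in np.array_split(X, 10):
    ipca.partial_fit(batch)

# The fitted model can then transform data like a regular PCA.
X_reduced = ipca.transform(X)
print(X_reduced.shape, ipca.n_samples_seen_)
```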

#### Kernel PCA

You can use a kernel to perform a non-linear projection, which can help with data sets that are not linearly separable:
* kernel : 'linear' | 'poly' | 'rbf' | 'sigmoid' | 'cosine' | 'precomputed' : these are the available kernels <br>
https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.KernelPCA.html#sklearn.decomposition.KernelPCA
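
As an illustration, the sketch below applies an RBF kernel to two concentric circles, a classic data set that a linear projection cannot untangle. The data set parameters and `gamma=10` are assumptions picked for this toy example, not recommended defaults.

```python
import numpy as np
from sklearn.datasets import make_circles
from sklearn.decomposition import KernelPCA

# Two concentric circles: y is 0 for the outer circle, 1 for the inner one.
X, y = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)

# Non-linear projection with an RBF kernel (gamma is an assumed value).
kpca = KernelPCA(n_components=2, kernel="rbf", gamma=10)
X_kpca = kpca.fit_transform(X)

# Compare where each circle lands along the first kernel component.
print(X_kpca.shape)
print("outer circle mean:", np.mean(X_kpca[y == 0, 0]))
print("inner circle mean:", np.mean(X_kpca[y == 1, 0]))
```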


#### Sparse PCA

You can use Sparse PCA to obtain sparse components; the sparsity is enforced through a Lasso ($l_1$) penalty <br>
https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.SparsePCA.html#sklearn.decomposition.SparsePCA
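
A short sketch of the usage, assuming random data and an arbitrary penalty strength `alpha=1.0` (larger values of `alpha` push more component entries to zero). The snippet reports the fraction of exactly-zero entries rather than assuming a particular value.

```python
import numpy as np
from sklearn.decomposition import SparsePCA

# Synthetic data (an assumption for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))

# alpha controls the strength of the l1 penalty on the components.
spca = SparsePCA(n_components=3, alpha=1.0, random_state=0)
X_reduced = spca.fit_transform(X)

# Fraction of component entries that are exactly zero.
zero_fraction = np.mean(spca.components_ == 0)
print(spca.components_.shape, zero_fraction)
```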

#### Truncated SVD

If you have a large sparse data set, use this algorithm: unlike PCA it does not center the data, so the matrix stays sparse and you avoid the Out Of Memory Error that centering could cause (e.g. term count or tf-idf matrices) <br>
https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.TruncatedSVD.html
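
The tf-idf case can be sketched as follows. The four-sentence corpus is a made-up assumption, far smaller than anything where sparsity matters, but it shows that `TruncatedSVD` accepts the sparse matrix returned by `TfidfVectorizer` directly.

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer

# Tiny hypothetical corpus, for illustration only.
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs are pets",
    "logs and mats are objects",
]

# fit_transform returns a sparse matrix of shape (n_documents, n_terms).
tfidf = TfidfVectorizer().fit_transform(docs)

# Reduce the sparse matrix to 2 dimensions without centering it.
svd = TruncatedSVD(n_components=2, random_state=0)
doc_vectors = svd.fit_transform(tfidf)

print(tfidf.shape, doc_vectors.shape)
```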