# Ridge & Lasso & Elastic Net

## Ridge

In [1]:
from sklearn.linear_model import Ridge

Ridge is linear least squares with L2 regularization. It minimizes the objective function:
$$\|y - Xw\|_2^2 + \alpha \|w\|_2^2$$
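As a quick check of the objective above, the following sketch compares the closed-form minimizer, obtained by solving $(X^T X + \alpha I) w = X^T y$, with the coefficients from `Ridge`. The data is synthetic and made up for illustration, and `fit_intercept=False` is assumed so that the estimator solves exactly the problem written above.

```python
import numpy as np
from sklearn.linear_model import Ridge

# Synthetic data, made up for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
w_true = np.array([1.5, -2.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=50)

alpha = 1.0

# Closed-form minimizer of ||y - Xw||^2 + alpha * ||w||^2 (no intercept):
# solve (X^T X + alpha * I) w = X^T y.
w_closed = np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ y)

# fit_intercept=False so the estimator solves the same problem as above.
model = Ridge(alpha=alpha, fit_intercept=False, solver="cholesky")
model.fit(X, y)

print(w_closed)
print(model.coef_)
```

The two printed vectors should agree up to numerical precision.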

#### Parameters
- alpha: regularization strength; must be a positive float. Larger values mean stronger regularization, which reduces the variance of the estimates. If an array is passed, penalties are assumed to be specific to the targets.
- fit_intercept: whether to calculate the intercept for this model. 
- normalize: ignored when fit_intercept is False. If True, the regressors are normalized before regression by subtracting the mean and dividing by the L2 norm. To standardize instead, use sklearn.preprocessing.StandardScaler. Note that this parameter was deprecated in scikit-learn 1.0 and removed in 1.2.
- copy_X: if True X will be copied and not overwritten.
- max_iter: maximum number of iterations for the conjugate gradient solver.
- tol: precision of the solution
- solver:
    - auto: chooses the solver automatically based on the type of data
    - svd: uses a Singular Value Decomposition of X. More stable than cholesky for singular matrices
    - cholesky: uses scipy.linalg.solve to obtain a closed-form solution
    - sparse_cg: uses the conjugate gradient solver. More appropriate than cholesky for large-scale data
    - lsqr: uses the regularized least-squares routine scipy.sparse.linalg.lsqr. It is the fastest and uses an iterative procedure
    - sag, saga: sag uses Stochastic Average Gradient descent and saga is its improved variant. Often faster than the other solvers when both n_samples and n_features are large
- random_state: seed used to shuffle the data when solver is sag or saga

#### Attributes
- coef_: weight vectors
- intercept_: independent term in decision function
- n_iter_: actual number of iterations for each target. Available only for the sag and lsqr solvers; other solvers return None.
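The attributes above can be inspected after fitting. The sketch below uses made-up synthetic data and fits the same model with two solvers to illustrate that `n_iter_` is populated only for an iterative solver such as lsqr.

```python
import numpy as np
from sklearn.linear_model import Ridge

# Synthetic data, made up for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 4))
y = X @ np.array([2.0, 0.0, -1.0, 0.5]) + 3.0 + 0.1 * rng.normal(size=40)

chol = Ridge(alpha=1.0, solver="cholesky").fit(X, y)
lsqr = Ridge(alpha=1.0, solver="lsqr").fit(X, y)

print(chol.coef_.shape)  # one weight per feature: (4,)
print(chol.intercept_)   # independent term, estimated because fit_intercept=True
print(chol.n_iter_)      # None for the cholesky solver
print(lsqr.n_iter_)      # iteration count per target for the lsqr solver
```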

#### Methods
- fit
- get_params
- predict
- score
- set_params
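A minimal sketch of how the methods listed above fit together, using synthetic data and a simple hold-out split chosen only for illustration. `score` returns the coefficient of determination $R^2$ of the prediction.

```python
import numpy as np
from sklearn.linear_model import Ridge

# Synthetic data, made up for illustration only.
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 2))
y = 4.0 * X[:, 0] - 1.0 * X[:, 1] + 0.05 * rng.normal(size=60)
X_train, X_test = X[:45], X[45:]
y_train, y_test = y[:45], y[45:]

model = Ridge(alpha=1.0)
model.fit(X_train, y_train)        # estimate coef_ and intercept_
y_pred = model.predict(X_test)     # predictions for unseen rows
r2 = model.score(X_test, y_test)   # R^2 on the held-out rows

params = model.get_params()        # dict of constructor parameters
model.set_params(alpha=0.5)        # takes effect on the next call to fit
model.fit(X_train, y_train)

print(r2)
print(params["alpha"], model.get_params()["alpha"])
```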

## Lasso

In [3]:
from sklearn.linear_model import Lasso

#### Parameters
- alpha: constant that multiplies the L1 penalty term; a float. Larger values mean stronger regularization and drive more coefficients to exactly zero.
- fit_intercept: whether to calculate the intercept for this model. 
- normalize: ignored when fit_intercept is False. If True, the regressors are normalized before regression by subtracting the mean and dividing by the L2 norm. To standardize instead, use sklearn.preprocessing.StandardScaler. Note that this parameter was deprecated in scikit-learn 1.0 and removed in 1.2.
- precompute: whether to use a precomputed Gram matrix to speed up calculations.
- copy_X: if True X will be copied and not overwritten.
- max_iter: maximum number of iterations.
- tol: tolerance for the optimization
- warm_start: if True, reuse the solution of the previous call to fit as initialization
- positive: forces coefficients to be positive
- selection: if 'random', a random coefficient is updated every iteration rather than looping over features sequentially.
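To illustrate the effect of `alpha` and `positive`, here is a small sketch on synthetic data (made up for illustration) in which only the first two of five features carry signal. A very large `alpha` is assumed to exceed the threshold above which every coefficient is shrunk to zero.

```python
import numpy as np
from sklearn.linear_model import Lasso

# Synthetic data: only the first two features influence y.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.normal(size=100)

weak = Lasso(alpha=0.01).fit(X, y)     # light penalty
strong = Lasso(alpha=100.0).fit(X, y)  # heavy penalty, all weights become zero
nonneg = Lasso(alpha=0.01, positive=True).fit(X, y)  # no negative weights allowed

print(weak.coef_)
print(strong.coef_)
print(nonneg.coef_)
```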

#### Attributes
- coef_: parameter vector of shape (n_features,)
- intercept_
- n_iter_

#### Methods
- fit()
- get_params()
- path(X, y[, l1_ratio, eps, n_alphas, …]): Compute elastic net path with coordinate descent
    - X: training data
    - y: target values
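    - l1_ratio: float between 0 and 1 passed to elastic net; l1_ratio=1 corresponds to the Lasso
    - eps: length of the path; eps=1e-3 means that alpha_min / alpha_max = 1e-3
    - n_alphas: number of alphas along the regularization path
    - alphas: list of alphas where to compute the models
    - precompute: whether to use a precomputed Gram matrix to speed up calculations
    - Xy: np.dot(X.T, y), which can be precomputed. Useful only when the Gram matrix is precomputed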
    - copy_X
    - coef_init: initial coefficient values
    - return_n_iter
    - positive
    - check_input
- predict()
- score()
- set_params()
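A minimal sketch of calling `path` from the list above, on synthetic data made up for illustration. The alpha values are arbitrary choices, and `l1_ratio=1.0` is passed explicitly so that the path uses a pure L1 (Lasso) penalty.

```python
import numpy as np
from sklearn.linear_model import Lasso

# Synthetic data, made up for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(80, 4))
y = 2.0 * X[:, 0] - 1.0 * X[:, 2] + 0.1 * rng.normal(size=80)

# path is a static method, so it is called on the class without fitting.
alphas, coefs, dual_gaps = Lasso.path(
    X, y, l1_ratio=1.0, alphas=[1.0, 0.1, 0.01]
)

print(alphas)       # the alphas at which the models were computed
print(coefs.shape)  # (n_features, n_alphas): one coefficient column per alpha
```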

## Elastic Net

In [5]:
from sklearn.linear_model import ElasticNet

#### Parameters
- alpha
- l1_ratio
- fit_intercept
- normalize
- precompute
- max_iter
- copy_X
- tol
- warm_start
- positive
- random_state
- selection
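The role of `l1_ratio` can be sketched as follows. In scikit-learn's documented objective, $\frac{1}{2n}\|y - Xw\|_2^2 + \alpha \cdot \text{l1\_ratio} \cdot \|w\|_1 + 0.5 \cdot \alpha (1 - \text{l1\_ratio}) \|w\|_2^2$, setting `l1_ratio=1.0` leaves only the L1 term, so the fit should match `Lasso` with the same `alpha`. The data below is synthetic and the parameter values are arbitrary.

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso

# Synthetic data, made up for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.normal(size=100)

# l1_ratio=1.0: pure L1 penalty, equivalent to Lasso with the same alpha.
enet_l1 = ElasticNet(alpha=0.1, l1_ratio=1.0).fit(X, y)
lasso = Lasso(alpha=0.1).fit(X, y)

# l1_ratio=0.5: an even mix of the L1 and L2 penalties.
enet_mix = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X, y)

print(enet_l1.coef_)
print(lasso.coef_)
print(enet_mix.coef_)
```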

#### Attributes
- coef_
- intercept_
- n_iter_

#### Methods
- fit
- get_params
- path
- predict
- score
- set_params