# Coordinate descent

cuML's `Lasso` and `ElasticNet` estimators can both use the coordinate descent solver. Lasso extends linear regression with L1 regularization, and elastic net extends it with a combination of L1 and L2 regularization.
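To make the solver concrete, here is a minimal NumPy sketch of cyclic coordinate descent for the lasso objective `(1/(2n)) * ||y - Xw||^2 + alpha * ||w||_1`. It is an illustration under simplifying assumptions (no intercept, dense host arrays), not cuML's or scikit-learn's actual implementation, and the function names `soft_threshold` and `lasso_coordinate_descent` are hypothetical.

```python
import numpy as np


def soft_threshold(value, threshold):
    """Shrink a scalar toward zero by `threshold` (proximal operator of the L1 norm)."""
    return np.sign(value) * max(abs(value) - threshold, 0.0)


def lasso_coordinate_descent(X, y, alpha, max_iter=1000, tol=1e-10):
    """Cyclic coordinate descent for (1/(2n)) * ||y - Xw||^2 + alpha * ||w||_1.

    Illustrative only: no intercept, dense NumPy inputs.
    """
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    residual = y - X @ w  # kept up to date after every coefficient update
    col_sq_norm = (X ** 2).sum(axis=0) / n_samples

    for _ in range(max_iter):
        max_change = 0.0
        for j in range(n_features):  # "cyclic": visit the features in order
            if col_sq_norm[j] == 0.0:
                continue  # an all-zero column cannot reduce the loss
            # Correlation of feature j with the residual that excludes its own contribution
            rho = X[:, j] @ (residual + X[:, j] * w[j]) / n_samples
            w_new = soft_threshold(rho, alpha) / col_sq_norm[j]
            if w_new != w[j]:
                residual -= X[:, j] * (w_new - w[j])
                max_change = max(max_change, abs(w_new - w[j]))
                w[j] = w_new
        if max_change < tol:  # stop once a full sweep barely moves any coefficient
            break
    return w
```

Each inner step solves the one-dimensional lasso problem for a single coefficient exactly, which is why no step size is needed. The soft-thresholding step is also what drives small coefficients to exactly zero.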

A substantial speedup over scikit-learn can be demonstrated for datasets with many rows and comparatively few columns. Furthermore, on very small datasets the mean squared error (MSE) of cuML's implementation is much smaller than that of the scikit-learn implementation.

The models accept array-like inputs: host data as NumPy arrays, device data as Numba device arrays or other `__cuda_array_interface__`-compliant objects, and cuDF DataFrames.

For information about cuDF, refer to the [cuDF documentation](https://rapidsai.github.io/projects/cudf/en/latest/).

For information about cuML's lasso implementation, refer to the [lasso regression documentation](https://rapidsai.github.io/projects/cuml/en/latest/api.html#lasso-regression).

For information about cuML's elastic net implementation, refer to the [elastic net regression documentation](https://rapidsai.github.io/projects/cuml/en/latest/api.html#elasticnet-regression).

In [None]:
import os

import numpy as np

import pandas as pd
import cudf as gd

from sklearn.datasets import make_regression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

from cuml.linear_model import Lasso as cuLasso
from sklearn.linear_model import Lasso as skLasso

from cuml.linear_model import ElasticNet as cuElasticNet
from sklearn.linear_model import ElasticNet as skElasticNet

## Define Parameters

In [None]:
n_samples = 2**17
n_features = 500

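# Passed as `alpha` (the regularization strength) to every model below;
# despite its name, this is not an optimizer step size.
learning_rate = 0.001
# Coordinate selection rule for the solver: "cyclic" or "random".
algorithm = "cyclic"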
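The `algorithm` value is passed to each model as `selection`, which controls the order in which coefficients are updated. The sketch below illustrates the difference between the two options with a hypothetical helper, `coordinate_order`. For `"random"` it assumes each update draws a coordinate uniformly with replacement, which matches how scikit-learn describes the option; a given library may implement the random rule differently.

```python
import numpy as np


def coordinate_order(n_features, selection="cyclic", seed=0):
    """Return the feature indices visited in one sweep (hypothetical helper)."""
    if selection == "cyclic":
        # Loop over the features sequentially: 0, 1, ..., n_features - 1
        return np.arange(n_features)
    if selection == "random":
        rng = np.random.default_rng(seed)
        # Assumption: each update picks a coordinate uniformly at random,
        # so a feature may be visited more than once per sweep
        return rng.integers(0, n_features, size=n_features)
    raise ValueError(f"unknown selection: {selection!r}")
```

Random selection is often reported to converge faster when the tolerance is loose, while cyclic selection is deterministic and easier to compare across implementations, which suits this notebook.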

## Generate Data

### Host

In [None]:
%%time
X, y = make_regression(n_samples=n_samples, n_features=n_features, random_state=0)

X = pd.DataFrame(X)
y = pd.DataFrame(y)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

### GPU

In [None]:
%%time
X_cudf = gd.DataFrame.from_pandas(X_train)
X_cudf_test = gd.DataFrame.from_pandas(X_test)

# Take the single target column as a 1-D cuDF Series
y_cudf = gd.Series(y_train.values[:, 0])

## Lasso

### Scikit-learn Model

#### Fit

In [None]:
%%time
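# `alpha` is a scalar. `normalize` is omitted because it was removed in
# scikit-learn 1.2; leaving it out matches the old default of False.
ols_sk = skLasso(alpha=learning_rate,
                 fit_intercept=True,
                 max_iter=1000,
                 selection=algorithm,
                 tol=1e-10)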

ols_sk.fit(X_train, y_train)

#### Predict

In [None]:
%%time
predict_sk = ols_sk.predict(X_test)

#### Evaluate

In [None]:
error_sk = mean_squared_error(y_test, predict_sk)

### cuML Model

#### Fit

In [None]:
%%time
ols_cuml = cuLasso(alpha=learning_rate,
                   fit_intercept=True,
                   normalize=False,
                   max_iter=1000,
                   selection=algorithm,
                   tol=1e-10)

ols_cuml.fit(X_cudf, y_cudf)

#### Predict

In [None]:
%%time
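# Copy the predictions to host memory so scikit-learn's metric can consume them
# (`to_numpy()` replaces `to_array()`, which recent cuDF releases no longer provide)
predict_cuml = ols_cuml.predict(X_cudf_test).to_numpy()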

#### Evaluate

In [None]:
error_cuml = mean_squared_error(y_test, predict_cuml)

### Compare Results

In [None]:
print("SKL MSE(y): %s" % error_sk)
print("CUML MSE(y): %s" % error_cuml)
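Raw MSE values depend on the scale of `y`, so the two numbers can be easier to compare as a relative difference. The helper below is a hypothetical convenience, not part of cuML or scikit-learn.

```python
def relative_difference(reference, candidate):
    """Signed difference of `candidate` from `reference`, as a fraction of `reference`."""
    if reference == 0:
        raise ValueError("reference must be non-zero")
    return (candidate - reference) / abs(reference)
```

For example, `relative_difference(error_sk, error_cuml)` is negative when the cuML model has the lower error.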

## Elastic Net

The elastic net model implemented in cuML takes the same parameters as the lasso model, plus one more: `l1_ratio`, which sets the mix of L1 and L2 regularization applied to the model. When `l1_ratio = 0`, only L2 regularization is applied; when `l1_ratio = 1`, only L1 regularization is applied, which is equivalent to lasso. The default is 0.5.
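To make the role of `l1_ratio` concrete, the sketch below evaluates the penalty term alone, assuming the parameterization documented for scikit-learn's `ElasticNet`: `alpha * l1_ratio * ||w||_1 + 0.5 * alpha * (1 - l1_ratio) * ||w||_2^2`. That cuML uses the same scaling is an assumption here, and `elastic_net_penalty` is a hypothetical helper.

```python
import numpy as np


def elastic_net_penalty(w, alpha, l1_ratio=0.5):
    """alpha * l1_ratio * ||w||_1 + 0.5 * alpha * (1 - l1_ratio) * ||w||_2^2."""
    w = np.asarray(w, dtype=float)
    l1_term = alpha * l1_ratio * np.abs(w).sum()
    l2_term = 0.5 * alpha * (1.0 - l1_ratio) * (w ** 2).sum()
    return l1_term + l2_term
```

Setting `l1_ratio=1.0` leaves only the L1 term and `l1_ratio=0.0` leaves only the L2 term, so the parameter interpolates between the lasso penalty and a ridge-style penalty.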

### Scikit-learn Model

#### Fit

In [None]:
%%time
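# `alpha` is a scalar. `normalize` is omitted because it was removed in
# scikit-learn 1.2; leaving it out matches the old default of False.
elastic_sk = skElasticNet(alpha=learning_rate,
                          fit_intercept=True,
                          max_iter=1000,
                          selection=algorithm,
                          tol=1e-10)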

elastic_sk.fit(X_train, y_train)

#### Predict

In [None]:
%%time
predict_elas_sk = elastic_sk.predict(X_test)

#### Evaluate

In [None]:
error_elas_sk = mean_squared_error(y_test, predict_elas_sk)

### cuML Model

#### Fit

In [None]:
%%time
elastic_cuml = cuElasticNet(alpha=learning_rate,
                            fit_intercept=True,
                            normalize=False,
                            max_iter=1000,
                            selection=algorithm,
                            tol=1e-10)

elastic_cuml.fit(X_cudf, y_cudf)

#### Predict

In [None]:
%%time
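# Copy the predictions to host memory so scikit-learn's metric can consume them
# (`to_numpy()` replaces `to_array()`, which recent cuDF releases no longer provide)
predict_elas_cuml = elastic_cuml.predict(X_cudf_test).to_numpy()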

#### Evaluate

In [None]:
error_elas_cuml = mean_squared_error(y_test, predict_elas_cuml)

### Compare Results

In [None]:
print("SKL MSE(y): %s" % error_elas_sk)
print("CUML MSE(y): %s" % error_elas_cuml)