# Ridge Regression

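Ridge Regression applies L2 regularization. When training on a GPU, the `eig` solver is faster, while the `svd` solver is more accurate.
The `svd` solver, however, uses more memory and is comparatively slower.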
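
To make the trade-off between the two solvers concrete, here is a minimal NumPy sketch of the two textbook ways to solve the ridge problem. It is not NCUE's implementation; it assumes that `eig` and `svd` refer to the usual eigendecomposition-based and SVD-based formulations. The function names `ridge_eig` and `ridge_svd` are hypothetical and exist only for this illustration.

```python
import numpy as np


def ridge_eig(X, y, alpha):
    """Solve ridge via an eigendecomposition of the Gram matrix X^T X."""
    gram = X.T @ X                          # (n_features, n_features)
    eigvals, eigvecs = np.linalg.eigh(gram)
    # (X^T X + alpha*I)^-1 X^T y = V diag(1 / (lambda + alpha)) V^T X^T y
    return eigvecs @ ((eigvecs.T @ (X.T @ y)) / (eigvals + alpha))


def ridge_svd(X, y, alpha):
    """Solve ridge via a thin SVD of X itself, without forming X^T X."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # w = V diag(s / (s^2 + alpha)) U^T y
    return Vt.T @ ((s / (s**2 + alpha)) * (U.T @ y))


# Small synthetic problem (illustrative sizes, not the notebook's data)
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
y = X @ np.arange(1.0, 6.0)
alpha = 0.1

w_eig = ridge_eig(X, y, alpha)
w_svd = ridge_svd(X, y, alpha)
# Reference: solve the regularized normal equations directly
w_direct = np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ y)

print(np.allclose(w_eig, w_svd), np.allclose(w_eig, w_direct))
```

The eigendecomposition route only factorizes an `n_features x n_features` matrix, which is why it tends to be faster when there are many more samples than features. Forming `X^T X` squares the condition number of the problem, though, so the SVD route, which factorizes `X` directly, is usually the more numerically robust of the two.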


## 1. Import packages

In [1]:
import cudf
from ncue import make_regression, train_test_split
from ncue.metrics.regression import r2_score
from ncue.linear_model import Ridge as cuRidge
from sklearn.linear_model import Ridge as skRidge

## 2. Define parameters

In [2]:
n_samples = 2**20
n_features = 399

random_state = 23
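
Because memory use matters when choosing a solver, it can help to estimate how large the dense feature matrix is for these parameters. The sketch below is plain arithmetic. The element type is not stated in this notebook, so both 4-byte (`float32`) and 8-byte (`float64`) values are shown as assumptions, and `matrix_gib` is a hypothetical helper.

```python
n_samples = 2**20
n_features = 399


def matrix_gib(rows, cols, bytes_per_value):
    """Size of a dense rows x cols matrix in GiB."""
    return rows * cols * bytes_per_value / 2**30


# Assumed element sizes: 4 bytes (float32) and 8 bytes (float64)
gib_float32 = matrix_gib(n_samples, n_features, 4)
gib_float64 = matrix_gib(n_samples, n_features, 8)

print(f"float32: {gib_float32:.2f} GiB, float64: {gib_float64:.2f} GiB")
```

This counts only the feature matrix itself. The train/test split, the host copies made for scikit-learn, and any solver workspace come on top of it.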

## 3. Generate test data

In [3]:
%%time
X, y = make_regression(n_samples=n_samples, n_features=n_features, random_state=0)

X = cudf.DataFrame.from_gpu_matrix(X)
y = cudf.DataFrame.from_gpu_matrix(y)[0]

X_cudf, X_cudf_test, y_cudf, y_cudf_test = train_test_split(X, y, test_size = 0.2, random_state=random_state)

CPU times: user 3.31 s, sys: 1.13 s, total: 4.43 s
Wall time: 5.65 s


In [4]:
# Copy the data from GPU memory to host RAM so that scikit-learn can use it and the final results can be compared
X_train = X_cudf.to_pandas()
X_test = X_cudf_test.to_pandas()
y_train = y_cudf.to_pandas()
y_test = y_cudf_test.to_pandas()

## 4. Scikit-learn model (CPU)

### Train the model

In [5]:
%%time
# No intercept is fitted; scikit-learn ignores normalize when fit_intercept=False
ridge_sk = skRidge(fit_intercept=False, normalize=True, alpha=0.1)

ridge_sk.fit(X_train, y_train)

CPU times: user 7.15 s, sys: 2.05 s, total: 9.2 s
Wall time: 2.79 s


### Predict

In [6]:
%%time
predict_sk = ridge_sk.predict(X_test)

CPU times: user 396 ms, sys: 826 ms, total: 1.22 s
Wall time: 172 ms


### Evaluate

In [7]:
%%time
r2_score_sk = r2_score(y_cudf_test, predict_sk)

CPU times: user 12.3 ms, sys: 91.9 ms, total: 104 ms
Wall time: 7.94 ms
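
For reference, the score computed here is the coefficient of determination. The sketch below shows the standard definition in NumPy, assuming `r2_score` follows that definition. The helper `r2` is hypothetical and is not the library's code.

```python
import numpy as np


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    return 1.0 - ss_res / ss_tot


y_true = np.array([1.0, 2.0, 3.0, 4.0])

perfect = r2(y_true, y_true)                       # predictions match exactly
baseline = r2(y_true, np.full(4, y_true.mean()))   # always predict the mean

print(perfect, baseline)
```

A score of 1.0 means the predictions reproduce the targets exactly, while a model that always predicts the mean of the targets scores 0.0.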


## 5. NCUE model (GPU)

### Train the model

In [8]:
%%time
ridge_ncue = cuRidge(fit_intercept=False, normalize=True, solver='eig', alpha=0.1)

ridge_ncue.fit(X_cudf, y_cudf)

CPU times: user 459 ms, sys: 709 ms, total: 1.17 s
Wall time: 508 ms


### Predict

In [9]:
%%time
predict_ncue = ridge_ncue.predict(X_cudf_test)

CPU times: user 194 ms, sys: 1.7 ms, total: 196 ms
Wall time: 195 ms


### Evaluate

In [10]:
%%time
r2_score_ncue = r2_score(y_cudf_test, predict_ncue)

CPU times: user 962 µs, sys: 0 ns, total: 962 µs
Wall time: 969 µs


## 6. Compare results (CPU vs. GPU)

In [11]:
print("R^2 score (SKL):  %s" % r2_score_sk)
print("R^2 score (ncue): %s" % r2_score_ncue)

R^2 score (SKL):  1.0
R^2 score (ncue): 1.0
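
Both models reach the same score, so the remaining difference is speed. As a rough comparison, the sketch below divides the wall times recorded in the cell outputs of sections 4 and 5. The dictionary layout and key names are only for illustration, and each figure comes from a single `%%time` run, so the ratios are indicative at best.

```python
# Wall times in seconds, copied from the recorded cell outputs
wall_times = {
    "fit": {"skl": 2.79, "ncue": 0.508},
    "predict": {"skl": 0.172, "ncue": 0.195},
    "r2_score": {"skl": 7.94e-3, "ncue": 969e-6},
}

# Ratio > 1 means the GPU pipeline was faster for that step
speedups = {step: t["skl"] / t["ncue"] for step, t in wall_times.items()}

for step, ratio in speedups.items():
    print(f"{step}: {ratio:.2f}x")
```

In this run, training is roughly 5.5 times faster on the GPU, while prediction is slightly slower (a ratio of about 0.88).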
