# SGD Classification

SGD (Stochastic Gradient Descent) Classification

Stochastic Gradient Descent (SGD) is a very efficient approach to optimization problems with convex loss functions. SGD also performs well on large, sparse data sets (for example, more than 10^5 training samples with more than 10^5 features).
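To make the idea concrete, here is a minimal sketch of an SGD loop for a binary linear classifier with hinge loss and an L2 penalty. It is not scikit-learn's implementation: the names `sgd_hinge` and `predict`, the toy data, and the constant learning rate `eta` are all assumptions chosen for illustration.

```python
import random


def sgd_hinge(samples, labels, alpha=0.01, eta=0.1, epochs=50, seed=0):
    """Fit w, b for the score w.x + b using hinge loss plus an L2 penalty.

    Labels must be -1 or +1. alpha is the penalty strength and eta is a
    constant learning rate (an assumption made for simplicity).
    """
    rng = random.Random(seed)
    w = [0.0] * len(samples[0])
    b = 0.0
    order = list(range(len(samples)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit the samples in a new random order each epoch
        for i in order:
            x, y = samples[i], labels[i]
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            # Gradient step for the L2 penalty (alpha / 2) * ||w||^2.
            w = [wj * (1.0 - eta * alpha) for wj in w]
            # Hinge loss has a nonzero subgradient only when the margin is below 1.
            if margin < 1.0:
                w = [wj + eta * y * xj for wj, xj in zip(w, x)]
                b += eta * y
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the sign of the linear score."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Hypothetical, linearly separable toy data.
toy_xs = [[2.0, 1.0], [1.5, 2.0], [3.0, 2.5],
          [-2.0, -1.0], [-1.5, -2.5], [-3.0, -1.5]]
toy_ys = [1, 1, 1, -1, -1, -1]

w, b = sgd_hinge(toy_xs, toy_ys)
predictions = [predict(w, b, x) for x in toy_xs]
print("weights:", w, "bias:", b)
print("predictions:", predictions)
```

Each update uses a single sample, so the cost of one step does not depend on the size of the data set, which is what makes the method practical for large problems.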

Advantages:
* Efficient and easy to implement


Disadvantages:
* Requires hyperparameter tuning to train efficiently
* Sensitive to feature scaling
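Because SGD is sensitive to feature scaling, features are usually standardized before training. The sketch below shows the idea in plain Python; `fit_standardizer`, `standardize`, and the sample rows are hypothetical names and data used only for illustration. In scikit-learn the same step is typically done with `sklearn.preprocessing.StandardScaler`.

```python
import math


def fit_standardizer(rows):
    """Compute per-column mean and standard deviation from the training rows."""
    n = len(rows)
    n_cols = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(n_cols)]
    stds = []
    for j in range(n_cols):
        var = sum((r[j] - means[j]) ** 2 for r in rows) / n
        stds.append(math.sqrt(var) or 1.0)  # fall back to 1.0 for constant columns
    return means, stds


def standardize(rows, means, stds):
    """Shift and scale each column with the previously fitted statistics."""
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]


# Two features on very different scales (hypothetical values).
train_rows = [[1.0, 1000.0], [2.0, 3000.0], [3.0, 5000.0]]
test_rows = [[2.0, 3000.0]]

means, stds = fit_standardizer(train_rows)
scaled_train = standardize(train_rows, means, stds)
scaled_test = standardize(test_rows, means, stds)  # reuse the training statistics
print(scaled_train)
print(scaled_test)
```

The statistics are fitted on the training rows only and then reused for the test rows, so that no information from the test set leaks into training.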


The main SGD parameters covered here are loss, penalty, and l1_ratio.

loss
* loss="hinge": (soft-margin) linear Support Vector Machine
* loss="modified_huber": smoothed hinge loss
* loss="log": logistic regression (named "log_loss" in scikit-learn 1.1 and later)
* Other options: "squared_hinge", "perceptron", or a regression loss: "squared_loss" (named "squared_error" in scikit-learn 1.0 and later), "huber", "epsilon_insensitive", or "squared_epsilon_insensitive"
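The classification losses listed above can be compared by writing each one as a function of the margin z = y * f(x), where y is -1 or +1 and f(x) is the linear score. The following is a minimal sketch using the standard textbook definitions; the function names are illustrative and are not scikit-learn identifiers.

```python
import math


def hinge(z):
    """Hinge loss: zero once the margin reaches 1."""
    return max(0.0, 1.0 - z)


def squared_hinge(z):
    """Squared hinge loss: like hinge, but penalized quadratically."""
    return max(0.0, 1.0 - z) ** 2


def modified_huber(z):
    """Smoothed hinge: quadratic for z >= -1, linear below that."""
    if z >= -1.0:
        return max(0.0, 1.0 - z) ** 2
    return -4.0 * z


def log_loss(z):
    """Logistic loss, as used by logistic regression."""
    return math.log(1.0 + math.exp(-z))


def perceptron(z):
    """Perceptron loss: penalizes only misclassified points."""
    return max(0.0, -z)


for z in (-2.0, 0.0, 0.5, 2.0):
    print("z={:+.1f} hinge={:.3f} modified_huber={:.3f} log={:.3f} perceptron={:.3f}".format(
        z, hinge(z), modified_huber(z), log_loss(z), perceptron(z)))
```

Hinge and perceptron losses become exactly zero for sufficiently well-classified points, while the logistic loss only approaches zero; this is one reason the choice of loss changes the fitted model.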


penalty
* penalty="l2": L2 norm penalty on coef_.
* penalty="l1": L1 norm penalty on coef_.
* penalty="elasticnet": Convex combination of L2 and L1; (1 - l1_ratio) * L2 + l1_ratio * L1.


l1_ratio
* The Elastic Net mixing parameter, with 0 <= l1_ratio <= 1. l1_ratio=0 corresponds to L2 penalty, l1_ratio=1 to L1. Defaults to 0.15.
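The elastic net formula above, (1 - l1_ratio) * L2 + l1_ratio * L1, can be sketched directly. The helper names and the example weights are hypothetical, and the factor 0.5 in the L2 term is an assumption that follows the common convention of writing the L2 penalty as half the squared norm.

```python
def l1_penalty(weights):
    """Sum of absolute values of the weights."""
    return sum(abs(v) for v in weights)


def l2_penalty(weights):
    """Half the sum of squared weights (the 0.5 factor is an assumed convention)."""
    return 0.5 * sum(v * v for v in weights)


def elastic_net_penalty(weights, l1_ratio=0.15):
    """Convex combination of the L2 and L1 penalties."""
    if not 0.0 <= l1_ratio <= 1.0:
        raise ValueError("l1_ratio must be between 0 and 1")
    return (1.0 - l1_ratio) * l2_penalty(weights) + l1_ratio * l1_penalty(weights)


example_weights = [0.5, -1.0, 0.0, 2.0]  # hypothetical coefficient vector
for ratio in (0.0, 0.15, 1.0):
    print("l1_ratio={:.2f} penalty={:.5f}".format(
        ratio, elastic_net_penalty(example_weights, ratio)))
```

Setting l1_ratio to 0 recovers the pure L2 penalty and setting it to 1 recovers the pure L1 penalty, matching the description of the parameter.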




In [1]:
%matplotlib inline
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings("ignore")

import numpy as np
from sklearn import datasets, model_selection, linear_model, metrics

# Data: synthetic 3-class classification problem
np.random.seed(0)
n_samples = 100000
np_data_xs, np_data_ys = datasets.make_classification(
    n_samples=n_samples, # number of samples
    n_features=10, # number of X features
    n_informative=3, # number of informative features
    n_classes=3, # number of Y classes
    random_state=0) # seed for random number generation
print("data shape: np_data_xs={}, np_data_ys={}".format(np_data_xs.shape, np_data_ys.shape))
np_train_xs, np_test_xs, np_train_ys, np_test_ys = model_selection.train_test_split(
    np_data_xs, np_data_ys, 
    test_size=0.3, shuffle=True, random_state=2)
print("train shape: np_train_xs={}, np_train_ys={}".format(np_train_xs.shape, np_train_ys.shape))
print("test shape: np_test_xs={}, np_test_ys={}".format(np_test_xs.shape, np_test_ys.shape))

# Models
models = [
    linear_model.SGDClassifier()
]

for model in models:
    # Train
    print("model={}".format(model))
    model.fit(np_train_xs, np_train_ys)

    # Predict on the test data
    np_pred_ys = model.predict(np_test_xs)

    # Coefficients and intercepts learned by the linear classifier (one row per class).
    print("coefficient={}".format(model.coef_))
    print("intercept={}".format(model.intercept_))

    # Evaluate: compute accuracy on the test data.
    acc = metrics.accuracy_score(np_test_ys, np_pred_ys)
    print("acc={:.5f}".format(acc))

    cr = metrics.classification_report(np_test_ys, np_pred_ys)
    print("classification_report\n", cr)

data shape: np_data_xs=(100000, 10), np_data_ys=(100000,)
train shape: np_train_xs=(70000, 10), np_train_ys=(70000,)
test shape: np_test_xs=(30000, 10), np_test_ys=(30000,)
model=SGDClassifier(alpha=0.0001, average=False, class_weight=None,
       early_stopping=False, epsilon=0.1, eta0=0.0, fit_intercept=True,
       l1_ratio=0.15, learning_rate='optimal', loss='hinge', max_iter=None,
       n_iter=None, n_iter_no_change=5, n_jobs=None, penalty='l2',
       power_t=0.5, random_state=None, shuffle=True, tol=None,
       validation_fraction=0.1, verbose=0, warm_start=False)
coefficient=[[ 0.86574911 -0.35596145 -0.20190227  0.1375304   0.45958955  0.06000423
  -0.06038292  0.26419997 -0.2317192  -0.17356518]
 [-1.19738988 -0.14076485  0.96455876  0.07127453  0.00822414  0.06779517
   0.11478303 -0.16507975 -0.10597724 -0.25717476]
 [ 0.28599846  0.20450951 -0.95752513  0.0636483  -0.68330475 -0.05369482
  -0.14439814 -0.15918971  0.2032468   0.60044455]]
intercept=[-1.12265416 -1.340428