### Using the OvR & OvO Modules
- OvR = OneVsRest
- OvO = OneVsOne

(1) Load modules & prepare data

In [108]:
## Load modules
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.linear_model import LogisticRegression
import pandas as pd
import numpy as np

In [109]:
# Prepare data
file = '../data/fish.csv'
fishDF = pd.read_csv(file)
fishDF.head()

  Species  Weight  Length  Diagonal   Height   Width
0   Bream   242.0    25.4      30.0  11.5200  4.0200
1   Bream   290.0    26.3      31.2  12.4800  4.3056
2   Bream   340.0    26.5      31.1  12.3778  4.6961
3   Bream   363.0    29.0      33.5  12.7300  4.4555
4   Bream   430.0    29.0      34.0  12.4440  5.1340


(2) Prepare the dataset

(2-1) Split features and target

In [110]:
featureDF = fishDF[fishDF.columns[1:]]
targetDF = fishDF[fishDF.columns[0]]

In [111]:
print(f"featureDF : {featureDF.shape}")
print(f"targetDF : {targetDF.shape}")

featureDF : (159, 5)
targetDF : (159,)


In [112]:
targetDF.unique()

array(['Bream', 'Roach', 'Whitefish', 'Parkki', 'Perch', 'Pike', 'Smelt'],
      dtype=object)

(2-2) Prepare train/test datasets

In [113]:
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(featureDF, targetDF, stratify=targetDF, random_state=11)
print(f'[train dataset] {X_train.shape}, {y_train.shape}')
print(f'[test dataset] {X_test.shape}, {y_test.shape}')

[train dataset] (119, 5), (119,)
[test dataset] (40, 5), (40,)
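
To make the effect of `stratify=targetDF` concrete, here is a minimal standard-library sketch of a stratified split. It is a simplified illustration, not scikit-learn's actual algorithm: the helper `stratified_split`, the toy labels, and the per-class rounding are all assumptions made for this example. The last lines reproduce the size arithmetic for 159 rows, assuming the default `test_size=0.25` and that the test count is rounded up.

```python
import math
import random
from collections import Counter, defaultdict

def stratified_split(labels, test_size=0.25, seed=11):
    """Hypothetical helper: split indices so each class keeps roughly the same share."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test_cls = round(len(indices) * test_size)  # simplified per-class rounding
        test_idx.extend(indices[:n_test_cls])
        train_idx.extend(indices[n_test_cls:])
    return sorted(train_idx), sorted(test_idx)

# Toy labels (not the fish data): 8 samples of class "a", 4 of class "b"
labels = ["a"] * 8 + ["b"] * 4
train_idx, test_idx = stratified_split(labels)
train_counts = Counter(labels[i] for i in train_idx)
test_counts = Counter(labels[i] for i in test_idx)
print(dict(train_counts), dict(test_counts))

# Size arithmetic for the fish split: 25% of 159 rows, test count rounded up
n_test = math.ceil(0.25 * 159)
n_train = 159 - n_test
print(n_train, n_test)
```

Both parts keep the 2:1 class ratio of the toy labels, which is the property stratification is meant to preserve.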


(3) Training

In [114]:
## Create the base estimator shared by OvO and OvR
model = LogisticRegression(solver = 'liblinear')

(3-1) Training with OvO

In [115]:
ovoModel = OneVsOneClassifier(model)
ovoModel.fit(X_train, y_train)

In [116]:
# Inspect the fitted model's attributes
print(f"classes_ : {ovoModel.classes_}")
print(f"feature_names_in : {ovoModel.feature_names_in_}")
print(f"estimators_ : {len(ovoModel.estimators_)}")     # 7*6/2 = 21 pairwise classifiers

classes_ : ['Bream' 'Parkki' 'Perch' 'Pike' 'Roach' 'Smelt' 'Whitefish']
feature_names_in : ['Weight' 'Length' 'Diagonal' 'Height' 'Width']
estimators_ : 21
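
The count of 21 follows from OvO training one binary classifier per unordered pair of classes. A small standard-library sketch, using the class list printed above, enumerates those pairs and checks the closed form n(n-1)/2:

```python
from itertools import combinations

# Class labels as reported by classes_ in the output above
classes = ['Bream', 'Parkki', 'Perch', 'Pike', 'Roach', 'Smelt', 'Whitefish']

# One binary problem per unordered pair of classes
pairs = list(combinations(classes, 2))
n_classes = len(classes)

print(len(pairs))                        # number of pairwise problems
print(n_classes * (n_classes - 1) // 2)  # closed form n(n-1)/2
print(pairs[:3])                         # first few pairs
```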


In [117]:
# Evaluate
print(f"[Train Score] {ovoModel.score(X_train, y_train)}")
print(f"[Test Score] {ovoModel.score(X_test, y_test)}")

[Train Score] 0.957983193277311
[Test Score] 0.925


In [118]:
# Predict
ovoModel.predict(X_test[:2])

array(['Bream', 'Parkki'], dtype=object)

In [119]:
ovoModel.decision_function(X_test[:2])
# Shows the per-class scores behind each prediction, e.g. why the first sample is labeled Bream

array([[ 6.32094951,  5.32872468,  2.32890163,  0.68506766,  3.322758  ,
        -0.33168462,  4.3140798 ],
       [ 4.26849104,  6.32543178,  2.3234672 ,  0.67951149,  5.319289  ,
         0.67104504,  1.85564622]])
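
The scores above can be read by hand. The sketch below copies the two rows and the class order from the outputs, picks the highest-scoring class, and recovers approximate vote counts by rounding. The rounding step rests on an assumption about scikit-learn's implementation: each OvO score is the class's vote count plus a small confidence term (believed to stay below 1/3 in magnitude) that only serves to break ties.

```python
# Class order from classes_ and the two score rows copied from the output above
classes = ['Bream', 'Parkki', 'Perch', 'Pike', 'Roach', 'Smelt', 'Whitefish']
scores = [
    [6.32094951, 5.32872468, 2.32890163, 0.68506766, 3.322758, -0.33168462, 4.3140798],
    [4.26849104, 6.32543178, 2.3234672, 0.67951149, 5.319289, 0.67104504, 1.85564622],
]

predicted = []
vote_rows = []
for row in scores:
    best = max(range(len(row)), key=row.__getitem__)  # index of the highest score
    predicted.append(classes[best])
    # Assumption: score = votes + small tie-breaking term, so rounding gives the votes
    vote_rows.append([round(s) for s in row])

print(predicted)
for votes in vote_rows:
    print(votes, sum(votes))
```

Under that assumption each row's votes sum to 21, one vote per pairwise classifier, and the first sample is labeled Bream because Bream wins all 6 of its pairwise contests.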

In [120]:
ovoModel.decision_function(X_test)

array([[ 6.32094951,  5.32872468,  2.32890163,  0.68506766,  3.322758  ,
        -0.33168462,  4.3140798 ],
       [ 4.26849104,  6.32543178,  2.3234672 ,  0.67951149,  5.319289  ,
         0.67104504,  1.85564622],
       [ 6.32287166,  4.32937271,  2.33045826,  1.04130317,  3.32286917,
        -0.33217509,  5.32135743],
       [-0.31698922,  4.30727888,  6.32181067,  0.69272385,  5.31510032,
         2.68531819,  1.70018312],
       [-0.3193889 ,  4.29876919,  6.3233354 ,  0.70691145,  5.31506751,
         2.68334065,  1.69981813],
       [-0.31992394,  2.70465809,  5.26935084,  1.72912082,  4.30701097,
         6.32315231,  0.69302848],
       [ 4.32872803,  1.33115335,  6.33234265,  3.32967376,  2.31700958,
        -0.33289414,  5.32965875],
       [ 0.69526383,  2.32450671,  6.33001521,  3.30819361,  5.31761676,
        -0.33133912,  4.29676335],
       [ 2.32295666,  1.33062092,  5.33243552,  6.33077265,  3.32026527,
        -0.33289516,  4.32935424],
       [ 2.29564076,  3.3271 ... (output truncated)

(3-2) Training with OvR

In [121]:
ovrModel = OneVsRestClassifier(model)
ovrModel.fit(X_train, y_train)

In [122]:
# Inspect the fitted model's attributes
print(f"classes_ : {ovrModel.classes_}")
print(f"feature_names_in : {ovrModel.feature_names_in_}")
print(f"estimators_ : {len(ovrModel.estimators_)}")     # one binary classifier per class

classes_ : ['Bream' 'Parkki' 'Perch' 'Pike' 'Roach' 'Smelt' 'Whitefish']
feature_names_in : ['Weight' 'Length' 'Diagonal' 'Height' 'Width']
estimators_ : 7
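
OvR needs only 7 estimators because it turns the problem into one "this class vs. everything else" task per class. The sketch below shows that relabeling with plain Python; the short target list `y` is hypothetical and is there only to illustrate the idea.

```python
# Class labels as reported by classes_ in the output above
classes = ['Bream', 'Parkki', 'Perch', 'Pike', 'Roach', 'Smelt', 'Whitefish']

# A few hypothetical target labels, used only for illustration
y = ['Bream', 'Perch', 'Smelt', 'Perch', 'Pike']

# One binary target per class: 1 for that class, 0 for the rest
binary_targets = {c: [1 if label == c else 0 for label in y] for c in classes}

print(len(binary_targets))      # one binary problem per class
print(binary_targets['Perch'])  # Perch vs. rest
```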


In [123]:
# Evaluate
print(f"[Train Score] {ovrModel.score(X_train, y_train)}")
print(f"[Test Score] {ovrModel.score(X_test, y_test)}")

[Train Score] 0.9495798319327731
[Test Score] 0.975
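
Since `score` reports accuracy for classifiers, the numbers for both models can be converted back into counts of correctly classified fish. The sketch below uses the scores and dataset sizes printed earlier (119 training rows, 40 test rows):

```python
# (train accuracy, test accuracy) copied from the outputs above
scores = {
    'OvO': (0.957983193277311, 0.925),
    'OvR': (0.9495798319327731, 0.975),
}
n_train, n_test = 119, 40

correct = {}
for name, (train_acc, test_acc) in scores.items():
    # accuracy * sample count = number of correct predictions
    correct[name] = (round(train_acc * n_train), round(test_acc * n_test))
    print(f"{name}: {correct[name][0]}/{n_train} train, {correct[name][1]}/{n_test} test")
```

On the test set the two strategies differ by only two fish, so with 40 test rows this split alone is thin evidence that one strategy is better.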


In [124]:
# Predict
ovrModel.predict(X_test[:2])

array(['Bream', 'Parkki'], dtype='<U9')

In [125]:
ovrModel.decision_function(X_test[:2])

array([[  1.87053679,   0.13665969,  -7.34472734, -14.76498298,
         -0.86086327, -27.57113603,  -3.90345836],
       [ -1.40152254,   2.39014045,  -2.83220689, -12.23098559,
         -2.55867317, -15.03484394,  -4.32514035]])
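
Unlike the OvO scores, each OvR column comes from a single "class vs. rest" classifier. Assuming each underlying `LogisticRegression` reports the log-odds of its positive class, applying a sigmoid turns a score into that classifier's own probability estimate. The sketch below does this for the two rows copied from the output above; the `sigmoid` helper is defined here for illustration.

```python
import math

# Class order from classes_ and the two score rows copied from the output above
classes = ['Bream', 'Parkki', 'Perch', 'Pike', 'Roach', 'Smelt', 'Whitefish']
scores = [
    [1.87053679, 0.13665969, -7.34472734, -14.76498298, -0.86086327, -27.57113603, -3.90345836],
    [-1.40152254, 2.39014045, -2.83220689, -12.23098559, -2.55867317, -15.03484394, -4.32514035],
]

def sigmoid(z):
    """Map a log-odds value to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

predicted = []
for row in scores:
    probs = [sigmoid(z) for z in row]                  # per-classifier "class vs. rest" probability
    best = max(range(len(row)), key=row.__getitem__)   # highest score wins
    predicted.append(classes[best])
    print(classes[best], [round(p, 3) for p in probs])
```

These per-classifier probabilities are computed independently, so they are not expected to sum to 1 across classes.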

In [126]:
ovrModel.decision_function(X_test)

In [127]:
ovrModel.predict(X_test)

array(['Bream', 'Parkki', 'Bream', 'Perch', 'Perch', 'Smelt', 'Perch',
       'Perch', 'Pike', 'Roach', 'Smelt', 'Perch', 'Roach', 'Smelt',
       'Pike', 'Parkki', 'Bream', 'Perch', 'Perch', 'Bream', 'Bream',
       'Bream', 'Bream', 'Perch', 'Perch', 'Parkki', 'Pike', 'Perch',
       'Perch', 'Bream', 'Roach', 'Pike', 'Perch', 'Bream', 'Perch',
       'Roach', 'Perch', 'Roach', 'Smelt', 'Perch'], dtype='<U9')