
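The F1 score combines precision and recall into a single metric. It is their harmonic mean, so it is high only when both are high.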
* F1 = 2 / {(1/precision) + (1/recall)}
* F1 = 2 * precision * recall / (precision + recall)
* Let's compute the F1 score with the f1_score() API!
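
To illustrate why a harmonic mean is used, the sketch below computes F1 with both formulas in plain Python. The function names `f1_harmonic` and `f1_product` and the sample precision/recall values are illustrative assumptions, not part of the original notebook.

```python
def f1_harmonic(precision, recall):
    """F1 as the harmonic mean: 2 / (1/precision + 1/recall)."""
    if precision == 0 or recall == 0:
        return 0.0  # convention: F1 is 0 when either component is 0
    return 2 / (1 / precision + 1 / recall)


def f1_product(precision, recall):
    """Equivalent form: 2 * precision * recall / (precision + recall)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative values only: a balanced pair and a lopsided pair
# that share the same arithmetic mean (0.5).
balanced = (0.5, 0.5)
lopsided = (0.9, 0.1)

for p, r in (balanced, lopsided):
    print(f"precision={p}, recall={r}, "
          f"arithmetic mean={(p + r) / 2:.2f}, F1={f1_harmonic(p, r):.2f}")
```

Both pairs average to 0.5, but F1 falls to 0.18 for the lopsided pair, so F1 only rewards a model when precision and recall are both reasonably high.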

In [1]:
# Define transform_features()

from sklearn.preprocessing import LabelEncoder

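# Fill missing values
def titanic_fillna(df):
    # Assign the result back instead of calling fillna(inplace=True) on a
    # selected column, which may not update df in recent pandas versions
    df['Age'] = df['Age'].fillna(df['Age'].mean())
    df['Cabin'] = df['Cabin'].fillna('N')
    df['Embarked'] = df['Embarked'].fillna('N')
    df['Fare'] = df['Fare'].fillna(0)
    return df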

# Drop attributes the ML algorithm does not need
def titanic_drop(df):
    df.drop(['PassengerId', 'Name', 'Ticket'], axis=1, inplace=True)
    return df

# label encoding
def titanic_le(df):
    df['Cabin'] = df['Cabin'].str[:1]
    features = ['Cabin', 'Sex', 'Embarked']
    for feature in features:
        le = LabelEncoder()
        le = le.fit(df[feature])
        df[feature] = le.transform(df[feature])
    return df

# Combine the preprocessing functions defined above
def transform_features(df):
    df = titanic_fillna(df)
    df = titanic_drop(df)
    df = titanic_le(df)
    return df

In [2]:
import pandas as pd
import warnings
warnings.filterwarnings('ignore')
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

# Load the raw data, preprocess it, and split it into train/test sets
titanic_df = pd.read_csv('./train.csv')
y_titanic_df = titanic_df['Survived']
X_titanic_df = titanic_df.drop('Survived', axis=1)

# Requires transform_features() to be defined first
X_titanic_df = transform_features(X_titanic_df)
X_train, X_test, y_train, y_test = train_test_split(X_titanic_df, y_titanic_df, 
                                                    test_size=0.2, 
                                                    random_state=11)

lr_clf = LogisticRegression()
lr_clf.fit(X_train, y_train)
pred = lr_clf.predict(X_test)

pred_proba = lr_clf.predict_proba(X_test)

In [3]:
from sklearn.metrics import f1_score

f1 = f1_score(y_test, pred)
print(f"F1 Score : {f1:.4f}")

F1 Score : 0.7805


* This time, let's vary the threshold and compute the evaluation metrics, including the F1 score!
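
Changing the threshold only changes how predicted probabilities are turned into class labels. The sketch below mimics in plain Python what `Binarizer` does (values strictly greater than the threshold become 1, everything else becomes 0). The helper name `apply_threshold` and the probabilities are made up for illustration.

```python
def apply_threshold(probabilities, threshold):
    """Label a sample 1 if its positive-class probability exceeds the threshold."""
    return [1 if p > threshold else 0 for p in probabilities]


# Hypothetical positive-class probabilities for five samples.
probs = [0.35, 0.42, 0.50, 0.58, 0.81]

for t in (0.4, 0.5, 0.6):
    print(f"threshold={t}: {apply_threshold(probs, t)}")
```

Raising the threshold produces fewer positive predictions, which tends to raise precision and lower recall.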

In [4]:
# Define get_clf_eval() to compute all evaluation metrics in one call
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             confusion_matrix)

def get_clf_eval(y_test, pred):
    confusion = confusion_matrix(y_test, pred)
    acc = accuracy_score(y_test, pred)
    precision = precision_score(y_test, pred)
    recall = recall_score(y_test, pred)

    # Add the F1 score
    f1 = f1_score(y_test, pred)

    print(f"Confusion Matrix : \n{confusion}")
    print(f"\nAccuracy  : {acc:.4f}")
    print(f"Precision : {precision:.4f}")
    print(f"Recall    : {recall:.4f}")
    print(f"F1 Score  : {f1:.4f}")

In [5]:
from sklearn.preprocessing import Binarizer

thresholds = [0.4, 0.45, 0.5, 0.55, 0.6]

def get_eval_by_threshold(y_test, pred_proba_c1, thresholds):
    # Iterate over the values in the thresholds list and evaluate at each one
    for custom_threshold in thresholds:
        binarizer = Binarizer(threshold=custom_threshold).fit(pred_proba_c1)
        custom_predict = binarizer.transform(pred_proba_c1)
        print('-'*30)
        print(f"Threshold : {custom_threshold}")
        get_clf_eval(y_test, custom_predict)

get_eval_by_threshold(y_test, pred_proba[:, 1].reshape(-1, 1), thresholds)

------------------------------
Threshold : 0.4
Confusion Matrix : 
[[98 20]
 [10 51]]

Accuracy  : 0.8324
Precision : 0.7183
Recall    : 0.8361
F1 Score  : 0.7727
------------------------------
Threshold : 0.45
Confusion Matrix : 
[[103  15]
 [ 12  49]]

Accuracy  : 0.8492
Precision : 0.7656
Recall    : 0.8033
F1 Score  : 0.7840
------------------------------
Threshold : 0.5
Confusion Matrix : 
[[104  14]
 [ 13  48]]

Accuracy  : 0.8492
Precision : 0.7742
Recall    : 0.7869
F1 Score  : 0.7805
------------------------------
Threshold : 0.55
Confusion Matrix : 
[[109   9]
 [ 15  46]]

Accuracy  : 0.8659
Precision : 0.8364
Recall    : 0.7541
F1 Score  : 0.7931
------------------------------
Threshold : 0.6
Confusion Matrix : 
[[112   6]
 [ 16  45]]

Accuracy  : 0.8771
Precision : 0.8824
Recall    : 0.7377
F1 Score  : 0.8036


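* Among the tested thresholds, the F1 score is highest at 0.6 (0.8036).
* However, recall drops noticeably at that threshold (0.7377, versus 0.8361 at 0.4), so choose it with care.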
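
As a sanity check, the metrics can be recomputed by hand from the confusion matrices printed above. This is a minimal sketch in plain Python; the helper `metrics_from_counts` and the `confusions` dictionary are names introduced here for illustration, and the counts are copied from the output in scikit-learn's `[[TN, FP], [FN, TP]]` layout.

```python
# Confusion-matrix counts copied from the output above: (TN, FP, FN, TP).
confusions = {
    0.40: (98, 20, 10, 51),
    0.45: (103, 15, 12, 49),
    0.50: (104, 14, 13, 48),
    0.55: (109, 9, 15, 46),
    0.60: (112, 6, 16, 45),
}


def metrics_from_counts(tn, fp, fn, tp):
    """Return (precision, recall, f1) for the positive class."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


results = {t: metrics_from_counts(*counts) for t, counts in confusions.items()}

for t, (precision, recall, f1) in results.items():
    print(f"threshold={t:.2f}  precision={precision:.4f}  "
          f"recall={recall:.4f}  F1={f1:.4f}")

# Pick the threshold with the highest F1 among those tested.
best_threshold = max(results, key=lambda t: results[t][2])
print(f"Best F1 at threshold {best_threshold}")
```

Laying the numbers side by side shows the trade-off directly: precision rises and recall falls as the threshold increases, and F1 happens to peak at the largest threshold tested.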