<a href="https://colab.research.google.com/github/hui509/titanic_analysis/blob/main/%E5%B0%88%E9%A1%8C%E5%AF%A6%E4%BD%9C%EF%BD%9C07_%E9%90%B5%E9%81%94%E5%B0%BC%E8%99%9F%E5%AD%98%E6%B4%BB%E9%A0%90%E6%B8%AC_%E5%8F%83%E6%95%B8%E8%AA%BF%E6%95%B4.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 專題實作 #07 鐵達尼號存活預測_參數調整

# **載入Titanic資料集的 `train.csv` 資料集**
（資料網址：https://raw.githubusercontent.com/dsindy/kaggle-titanic/master/data/train.csv）

In [55]:
import pandas as pd
url = 'https://raw.githubusercontent.com/dsindy/kaggle-titanic/master/data/train.csv'
df = pd.read_csv(url)

# **資料概況**

1. Titanic 各欄位的定義
* PassengerId：乘客ID
* Survived：生存狀態（0=No，1=Yes）  
* Pclass：艙等 （1=1st, 2=2nd, 3=3rd）
* Name：姓名  
* Sex：性別  
* Age：年紀
* SibSp：兄弟姐妹/配偶的数量
* Parch：父母/子女的数量
* Ticket：船票號碼
* Fare：船票價格
* Cabin：艙位號碼
* Embarked：登船港口（C=Cherbourg,　Q=Queenstown,　S=Southampton）




2. 基本資料

In [None]:
df.info()

In [None]:
df.describe()

In [None]:
df.isnull().sum()

# **資料探索與分析 (EDA)**

**1. 初步發現：排除 Name、Sex、Ticket、Cabin、Embarked 五項非數值資料下，**
**可見「Fare」與「Survived」的正相關性最高**


In [None]:
# 使用 Pearson 係數計算相關性
cor = df.corr()

In [None]:
# 使用seaborn和matplotlib，視覺化呈現相關性
import seaborn as sns
import matplotlib.pyplot as plt

plt.figure(figsize=(10,8))
sns.heatmap(cor,annot=True,linewidth=0.5,cmap='coolwarm')
plt.title('Correlation Matrix of Titanic')
plt.show()

**2. 進階討論：低票價的乘客，存活機會較低**



In [61]:
# 定義distribution函數，以KDE圖觀察欄位與Survived之間的關係

def distribution(df,var,target,**kwargs):
  # 設定row和col參數
  row = kwargs.get('row',None)
  col = kwargs.get('col',None)

  # 使用seaborn函數製圖
  facet = sns.FacetGrid(df,row=row,col=col,hue=target,aspect=3)
  facet.map(sns.kdeplot,var,fill=True)
  facet.set(xlim=(0,df[var].max()))
  facet.add_legend()

In [None]:
# 觀察Fare和Survived的關係
distribution(df,var='Fare',target='Survived')
plt.show()

**2. 其他發現：**

*   **Age 中，15歲以下的少年存活機率較高**
*   **Sex 中，女性乘客的存活機會高於男性**
*   **Pclass 中，一等艙的乘客有較高的存活機會**
*   **Embarked 中，從 Cherbourg 登船者的存活機會較高**



In [None]:
# 觀察Age和Survived的關係
distribution(df,var='Age',target='Survived')
plt.show()

In [64]:
# 定義categories函數，以長條圖觀察欄位與Survived之間的關係

def categories(df,cat,target,**kwargs):
  # 設定row和col參數
  row = kwargs.get('row',None)
  col = kwargs.get('col',None)

  # 使用seaborn函數製圖
  facet = sns.FacetGrid(df,row=row,col=col)
  facet.map(sns.barplot,cat,target,color='lightblue')
  facet.add_legend()

In [None]:
# 觀察Sex和Survived的關係
categories(df,cat='Sex',target='Survived')
plt.show()

In [None]:
# 觀察Pclass和Survived的關係
categories(df,cat='Pclass',target='Survived')
plt.show()

In [None]:
# 觀察Embarked和Survived的關係
categories(df,cat='Embarked',target='Survived')
plt.show()

# **資料清理與型態轉換**

1. 刪除多餘的欄位
* PassengerId：僅用以辨識乘客，無關生存
* Name：危機時期，比較不可能出現唱名救援
* Ticket：船票編號僅代表購買順序或核對資訊
* Cabin：欄位中的缺失值(687)佔全體(891)中的77%


In [68]:
titanic = df.copy()
titanic.drop(columns=['PassengerId','Name','Ticket','Cabin'],axis=1,inplace=True)

2. 轉換類別欄位：Sex、Embarked
* LabelEncoder：因欄位資料並非有序類別，故不考慮使用
* OneHotEncoder：拆分Embarked為「Embarked_C、Embarked_Q、Embarked_S」


In [69]:
# Sex直接轉換
titanic['Sex'] = titanic['Sex'].replace({'male':1,'female':0})

# Embarked使用OneHotEncoder
titanic = pd.get_dummies(titanic,columns=['Embarked'],dtype=int)

# **資料缺失值處理**

多種策略處理 Age 缺失值
* titanic_del：全部刪除
* titanic_mean：平均數填補缺
* titanic_median：中位數填補
* titanic_mode：眾數填補




In [70]:
# 全部刪除
titanic_del = titanic.copy()
titanic_del.dropna(subset=['Age'],inplace=True)

# 使用平均數填補
titanic_mean = titanic.copy()
titanic_mean['Age'].fillna(titanic_mean['Age'].mean(),inplace=True)

# 使用中位數填補
titanic_median = titanic.copy()
titanic_median['Age'].fillna(titanic_median['Age'].median(),inplace=True)

# 使用眾數填補
titanic_mode = titanic.copy()
titanic_mode['Age'].fillna(titanic_mode['Age'].mode()[0],inplace=True)


# **特徵工程**

1. **船上親屬人數的多寡，是否會影響生存機會？**

In [None]:
#新增FamilySize欄位統整乘客在船上的親屬人數
df['FamilySize'] = df['SibSp']+df['Parch']+1

#使用distribution函數，以KDE圖觀察與Survived之間的關係
distribution(df,var='FamilySize',target='Survived')
plt.show()

In [72]:
# 定義family函數，將親屬人數進行分類
def family(size):
  if size == 1:
    return 'Single'
  elif 2<=size<=4:
    return 'Small'
  else:
    return 'Large'

# 新增Family_Type說明所屬分類：單身、小家庭、大家庭
df['Family_Type'] = df['FamilySize'].map(family)

In [73]:
# 使用OneHotEncoder轉換類別欄位
df_Family_ohe = pd.get_dummies(df['Family_Type'], prefix='Family_')

# 加入不同缺失值處理的變數中，以利後續模型比較
titanic_del = titanic_del.join(df_Family_ohe)
titanic_mean = titanic_mean.join(df_Family_ohe)
titanic_median = titanic_median.join(df_Family_ohe)
titanic_mode = titanic_mode.join(df_Family_ohe)

2. **乘客頭銜隱含社會地位，是否會影響生存機會？**



In [74]:
# 定義extraction函數，抽取乘客頭銜
def extraction(name):
  title = name.split(',')[1].split('.')[0].strip()
  return title

# 新增Raw Title欄位放置抽取出的乘客頭銜
df['Raw Title'] = df['Name'].map(extraction)

In [75]:
# 觀察Name中頭銜的種類
title_set = set()
for name in df['Name']:
  title_set.add(extraction(name))

# 製作title對應的dictionary

title_dict = {
'Col':      'Officer',
'Major':     'Officer',
'Capt':      'Officer',
'Jonkheer':    'Royalty',
'Don':      'Royalty',
'Dona':      'Royalty',
'Sir':      'Royalty',
'the Countess': 'Royalty',
'Lady':      'Royalty',
'Dr':       'Royalty',
'Rev':      'Royalty',
'Mr':       'Mr',
'Ms':       'Ms',
'Miss':      'Miss',
'Mlle':      'Miss',
'Mrs':       'Mrs',
'Mme':       'Mrs',
'Master':     'Master'
}

# 新增Title欄位放置經轉換的頭銜
df['Title'] = df['Raw Title'].map(title_dict)

In [76]:
# 使用OneHotEncoder轉換類別欄位
df_Title_ohe = pd.get_dummies(df['Title'], prefix='Title_')

# 加入不同缺失值處理的變數中，以利後續模型比較
titanic_del = titanic_del.join(df_Title_ohe)
titanic_mean = titanic_mean.join(df_Title_ohe)
titanic_median = titanic_median.join(df_Title_ohe)
titanic_mode = titanic_mode.join(df_Title_ohe)

# **模型分析**

1. 採用 scikit-learn 五種基本模型進行分析與比較
2. 使用 sklearn.model_selection 下的 GridSearchCV( )調整參數

* **邏輯回歸 Logistic Regression**
* **支持向量機 Support Vector Machines**
* **決策樹 Decision Tree Classifier**
* **隨機森林 Random Forest Classifier**
* **K-近鄰演算法 K Nearest Neighbor**

In [77]:
# 載入模型
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier

(1) 以不同模型，評估 titanic_del 的準確度

In [78]:
# 定義特徵欄位X、目標欄位Y，並完成訓練集資料
columns_X = set(titanic_del.columns) - {'Survived'}
columns_y = ['Survived']

train_X = titanic_del[list(columns_X)]
train_y = titanic_del[columns_y]

# Logistic Regression
log = LogisticRegression(random_state=0,max_iter=3000)
scores_del_1 = cross_val_score(log,train_X,train_y.values.ravel(),cv=5,scoring='accuracy').mean()

# SVM
svc = SVC()
scores_del_2 = cross_val_score(svc,train_X,train_y.values.ravel(),cv=5,scoring='accuracy').mean()

# Decision Tree Classifier
decision_tree = DecisionTreeClassifier()
scores_del_3 = cross_val_score(decision_tree,train_X,train_y.values.ravel(),cv=5,scoring='accuracy').mean()

# Random Forest Classifier
random_forest = RandomForestClassifier()
scores_del_4 = cross_val_score(random_forest,train_X,train_y.values.ravel(),cv=5,scoring='accuracy').mean()

# KNN
knn = KNeighborsClassifier()
scores_del_5 = cross_val_score(knn,train_X,train_y.values.ravel(),cv=5,scoring='accuracy').mean()

In [None]:
#超參數調整

from sklearn.model_selection import GridSearchCV

# Logistic Regression
log = LogisticRegression()
clf = GridSearchCV(log,
    {'C': [0.01, 0.1, 1, 10],'solver': ['liblinear','lbfgs']}, cv=5)
scores_del_6 = cross_val_score(log,train_X,train_y.values.ravel(),cv=5,scoring='accuracy').mean()

# SVM
svc = SVC()
clf = GridSearchCV(svc,
    {'C': [0.1,1,10],'kernel':['rbf'],'gamma':[0.05, 0.1]},cv=5,n_jobs=6)
scores_del_7 = cross_val_score(svc,train_X,train_y.values.ravel(),cv=5,scoring='accuracy').mean()

# Decision Tree Classifier
decision_tree = DecisionTreeClassifier()
clf = GridSearchCV(decision_tree,
    {'criterion': ['gini', 'entropy'],
    'max_depth': [4, 6, 8],
    'min_samples_split': [2, 5, 10],
    'min_samples_leaf': [1, 2, 4]},cv=5)
scores_del_8 = cross_val_score(decision_tree,train_X,train_y.values.ravel(),cv=5,scoring='accuracy').mean()

# Random Forest Classifier
random_forest = RandomForestClassifier()
clf = GridSearchCV(random_forest,
    { 'n_estimators': [10, 50],
    'max_depth': [4, 6, 8],
    'min_samples_split': [2, 5, 10],
    'min_samples_leaf': [1, 2, 4],
    'max_features': ['auto', 'sqrt']},cv=5)
scores_del_9 = cross_val_score(random_forest,train_X,train_y.values.ravel(),cv=5,scoring='accuracy').mean()

# K Nearest Neighbor
knn = KNeighborsClassifier()
clf = GridSearchCV(knn,
    {'n_neighbors': [3, 5, 7, 9],
     'weights': ['uniform', 'distance']},cv=5)
scores_del_10 = cross_val_score(knn,train_X,train_y.values.ravel(),cv=5,scoring='accuracy').mean()

(2) 以不同模型，評估 titanic_mean 的準確度

In [80]:
# 定義特徵欄位X、目標欄位Y，並完成訓練集資料
columns_X = set(titanic_mean.columns) - {'Survived'}
columns_y = ['Survived']

train_X = titanic_mean[list(columns_X)]
train_y = titanic_mean[columns_y]

# Logistic Regression
log = LogisticRegression(random_state=0,max_iter=3000)
scores_mean_1 = cross_val_score(log,train_X,train_y.values.ravel(),cv=5,scoring='accuracy').mean()

# SVM
svc = SVC()
scores_mean_2 = cross_val_score(svc,train_X,train_y.values.ravel(),cv=5,scoring='accuracy').mean()

# Decision Tree Classifier
decision_tree = DecisionTreeClassifier()
scores_mean_3 = cross_val_score(decision_tree,train_X,train_y.values.ravel(),cv=5,scoring='accuracy').mean()

# Random Forest Classifier
random_forest = RandomForestClassifier()
scores_mean_4 = cross_val_score(random_forest,train_X,train_y.values.ravel(),cv=5,scoring='accuracy').mean()

# KNN
knn = KNeighborsClassifier()
scores_mean_5 = cross_val_score(knn,train_X,train_y.values.ravel(),cv=5,scoring='accuracy').mean()

In [81]:
#超參數調整

from sklearn.model_selection import GridSearchCV

# Logistic Regression
log = LogisticRegression(random_state=0,max_iter=3000)
clf = GridSearchCV(log,
    {'C': [0.01, 0.1, 1, 10],'solver': ['liblinear','lbfgs']}, cv=5)
scores_mean_6 = cross_val_score(log,train_X,train_y.values.ravel(),cv=5,scoring='accuracy').mean()

# SVM
svc = SVC()
clf = GridSearchCV(svc,
    {'C': [0.1,1,10],'kernel':['rbf'],'gamma':[0.05, 0.1]},cv=5,n_jobs=6)
scores_mean_7 = cross_val_score(svc,train_X,train_y.values.ravel(),cv=5,scoring='accuracy').mean()

# Decision Tree Classifier
decision_tree = DecisionTreeClassifier()
clf = GridSearchCV(decision_tree,
    {'criterion': ['gini', 'entropy'],
    'max_depth': [4, 6, 8],
    'min_samples_split': [2, 5, 10],
    'min_samples_leaf': [1, 2, 4]},cv=5)
scores_mean_8 = cross_val_score(decision_tree,train_X,train_y.values.ravel(),cv=5,scoring='accuracy').mean()

# Random Forest Classifier
random_forest = RandomForestClassifier()
clf = GridSearchCV(random_forest,
    { 'n_estimators': [10, 50],
    'max_depth': [4, 6, 8],
    'min_samples_split': [2, 5, 10],
    'min_samples_leaf': [1, 2, 4],
    'max_features': ['auto', 'sqrt']},cv=5)
scores_mean_9 = cross_val_score(random_forest,train_X,train_y.values.ravel(),cv=5,scoring='accuracy').mean()

# K Nearest Neighbor
knn = KNeighborsClassifier()
clf = GridSearchCV(knn,
    {'n_neighbors': [3, 5, 7, 9],
     'weights': ['uniform', 'distance']},cv=5)
scores_mean_10 = cross_val_score(knn,train_X,train_y.values.ravel(),cv=5,scoring='accuracy').mean()

(3) 以不同模型，評估 titanic_median 的準確度

In [82]:
# 定義特徵欄位X、目標欄位Y，並完成訓練集資料
columns_X = set(titanic_median.columns) - {'Survived'}
columns_y = ['Survived']

train_X = titanic_median[list(columns_X)]
train_y = titanic_median[columns_y]

# Logistic Regression
log = LogisticRegression(random_state=0,max_iter=3000)
scores_median_1 = cross_val_score(log,train_X,train_y.values.ravel(),cv=5,scoring='accuracy').mean()

# SVM
svc = SVC()
scores_median_2 = cross_val_score(svc,train_X,train_y.values.ravel(),cv=5,scoring='accuracy').mean()

# Decision Tree Classifier
decision_tree = DecisionTreeClassifier()
scores_median_3 = cross_val_score(decision_tree,train_X,train_y.values.ravel(),cv=5,scoring='accuracy').mean()

# Random Forest Classifier
random_forest = RandomForestClassifier()
scores_median_4 = cross_val_score(random_forest,train_X,train_y.values.ravel(),cv=5,scoring='accuracy').mean()

# KNN
knn = KNeighborsClassifier()
scores_median_5 = cross_val_score(knn,train_X,train_y.values.ravel(),cv=5,scoring='accuracy').mean()

In [83]:
#超參數調整

from sklearn.model_selection import GridSearchCV

# Logistic Regression
log = LogisticRegression(random_state=0,max_iter=3000)
clf = GridSearchCV(log,
    {'C': [0.01, 0.1, 1, 10],'solver': ['liblinear','lbfgs']}, cv=5)
scores_median_6 = cross_val_score(log,train_X,train_y.values.ravel(),cv=5,scoring='accuracy').mean()

# SVM
svc = SVC()
clf = GridSearchCV(svc,
    {'C': [0.1,1,10],'kernel':['rbf'],'gamma':[0.05, 0.1]},cv=5,n_jobs=6)
scores_median_7 = cross_val_score(svc,train_X,train_y.values.ravel(),cv=5,scoring='accuracy').mean()

# Decision Tree Classifier
decision_tree = DecisionTreeClassifier()
clf = GridSearchCV(decision_tree,
    {'criterion': ['gini', 'entropy'],
    'max_depth': [4, 6, 8],
    'min_samples_split': [2, 5, 10],
    'min_samples_leaf': [1, 2, 4]},cv=5)
scores_median_8 = cross_val_score(decision_tree,train_X,train_y.values.ravel(),cv=5,scoring='accuracy').mean()

# Random Forest Classifier
random_forest = RandomForestClassifier()
clf = GridSearchCV(random_forest,
    { 'n_estimators': [10, 50],
    'max_depth': [4, 6, 8],
    'min_samples_split': [2, 5, 10],
    'min_samples_leaf': [1, 2, 4],
    'max_features': ['auto', 'sqrt']},cv=5)
scores_median_9 = cross_val_score(random_forest,train_X,train_y.values.ravel(),cv=5,scoring='accuracy').mean()

# K Nearest Neighbor
knn = KNeighborsClassifier()
clf = GridSearchCV(knn,
    {'n_neighbors': [3, 5, 7, 9],
     'weights': ['uniform', 'distance']},cv=5)
scores_median_10 = cross_val_score(knn,train_X,train_y.values.ravel(),cv=5,scoring='accuracy').mean()

(4) 以不同模型，評估 titanic_mode 的準確度

In [84]:
# 定義特徵欄位X、目標欄位Y，並完成訓練集資料
columns_X = set(titanic_mode.columns) - {'Survived'}
columns_y = ['Survived']

train_X = titanic_mode[list(columns_X)]
train_y = titanic_mode[columns_y]

# Logistic Regression
log = LogisticRegression(random_state=0,max_iter=3000)
scores_mode_1 = cross_val_score(log,train_X,train_y.values.ravel(),cv=5,scoring='accuracy').mean()

# SVM
svc = SVC()
scores_mode_2 = cross_val_score(svc,train_X,train_y.values.ravel(),cv=5,scoring='accuracy').mean()

# Decision Tree Classifier
decision_tree = DecisionTreeClassifier()
scores_mode_3 = cross_val_score(decision_tree,train_X,train_y.values.ravel(),cv=5,scoring='accuracy').mean()

# Random Forest Classifier
random_forest = RandomForestClassifier()
scores_mode_4 = cross_val_score(random_forest,train_X,train_y.values.ravel(),cv=5,scoring='accuracy').mean()

# KNN
knn = KNeighborsClassifier()
scores_mode_5 = cross_val_score(knn,train_X,train_y.values.ravel(),cv=5,scoring='accuracy').mean()

In [85]:
#超參數調整

from sklearn.model_selection import GridSearchCV

# Logistic Regression
log = LogisticRegression(random_state=0,max_iter=3000)
clf = GridSearchCV(log,
    {'C': [0.01, 0.1, 1, 10],'solver': ['liblinear','lbfgs']}, cv=5)
scores_mode_6 = cross_val_score(log,train_X,train_y.values.ravel(),cv=5,scoring='accuracy').mean()

# SVM
svc = SVC()
clf = GridSearchCV(svc,
    {'C': [0.1,1,10],'kernel':['rbf'],'gamma':[0.05, 0.1]},cv=5,n_jobs=6)
scores_mode_7 = cross_val_score(svc,train_X,train_y.values.ravel(),cv=5,scoring='accuracy').mean()

# Decision Tree Classifier
decision_tree = DecisionTreeClassifier()
clf = GridSearchCV(decision_tree,
    {'criterion': ['gini', 'entropy'],
    'max_depth': [4, 6, 8],
    'min_samples_split': [2, 5, 10],
    'min_samples_leaf': [1, 2, 4]},cv=5)
scores_mode_8 = cross_val_score(decision_tree,train_X,train_y.values.ravel(),cv=5,scoring='accuracy').mean()

# Random Forest Classifier
random_forest = RandomForestClassifier()
clf = GridSearchCV(random_forest,
    { 'n_estimators': [10, 50],
    'max_depth': [4, 6, 8],
    'min_samples_split': [2, 5, 10],
    'min_samples_leaf': [1, 2, 4],
    'max_features': ['auto', 'sqrt']},cv=5)
scores_mode_9 = cross_val_score(random_forest,train_X,train_y.values.ravel(),cv=5,scoring='accuracy').mean()

# K Nearest Neighbor
knn = KNeighborsClassifier()
clf = GridSearchCV(knn,
    {'n_neighbors': [3, 5, 7, 9],
     'weights': ['uniform', 'distance']},cv=5)
scores_mode_10 = cross_val_score(knn,train_X,train_y.values.ravel(),cv=5,scoring='accuracy').mean()

(5) **整理與比較：採用 Logistic Regression 模型分析的 Accuracy 最高**

In [86]:
scores = {'model':['Logistic Regression','SVM','Decision Tree Classifier','Random Forest Classifier','KNN','Logistic Regression(Opt)','SVM(Opt)','Decision Tree Classifier(Opt)','Random Forest Classifier(Opt)','KNN(Opt)'],
        'accuracy_del':[scores_del_1,scores_del_2,scores_del_3,scores_del_4,scores_del_5,scores_del_6,scores_del_7,scores_del_8,scores_del_9,scores_del_10],
        'accuracy_mean':[scores_mean_1,scores_mean_2,scores_mean_3,scores_mean_4,scores_mean_5,scores_mean_6,scores_mean_7,scores_mean_8,scores_mean_9,scores_mean_10],
        'accuracy_median':[scores_median_1,scores_median_2,scores_median_3,scores_median_4,scores_median_5,scores_median_6,scores_median_7,scores_median_8,scores_median_9,scores_median_10],
        'accuracy_mode':[scores_mode_1,scores_mode_2,scores_mode_3,scores_mode_4,scores_mode_5,scores_mode_6,scores_mode_7,scores_mode_8,scores_mode_9,scores_mode_10]}
compared = pd.DataFrame(scores)
compared.sort_values(by='accuracy_del',ascending=False)

Unnamed: 0,model,accuracy_del,accuracy_mean,accuracy_median,accuracy_mode
5,Logistic Regression(Opt),0.815207,0.823784,0.823784,0.82266
0,Logistic Regression,0.813809,0.823784,0.823784,0.82266
8,Random Forest Classifier(Opt),0.799783,0.809215,0.806942,0.805838
3,Random Forest Classifier,0.79279,0.806974,0.796861,0.800207
2,Decision Tree Classifier,0.759135,0.764315,0.76992,0.783385
7,Decision Tree Classifier(Opt),0.759125,0.771038,0.768803,0.782261
4,KNN,0.697577,0.716101,0.711606,0.701494
9,KNN(Opt),0.697577,0.716101,0.711606,0.701494
1,SVM,0.670895,0.675733,0.675733,0.673498
6,SVM(Opt),0.670895,0.675733,0.675733,0.673498


# **結論：**

## **1. 模型分析｜以 GridSearchCV 調整參數後**
> ### ● 各種缺失值處理策略中，仍以平均數填補的方式較為適當
> ### ● Logistic Regression 的 Accuracy 仍位居首位，約為0.82
> ### ● Random Forest Classifier 的 Accuracy 有略為提升
> ### ● 多數模型並沒有明顯差異，參數調整對準確率的影響不大
