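## 1. Download the dataset and inspect its dimensions
The dataset contains 768 records, each with 8 features. The last column is the label: 0 if diabetes did not develop within 5 years, 1 if it did.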

In [2]:
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
import numpy as np

# The dataset can be downloaded from Kaggle: https://www.kaggle.com/datasets/kumargh/pimaindiansdiabetescsv?resource=download
dataset = np.loadtxt("./pima_indians_diabetes.csv", delimiter=",")
data = dataset[:, 0:8]
label = dataset[:, 8]

print("data.shape:", data.shape)
print("label.shape:", label.shape)

data.shape: (768, 8)
label.shape: (768,)
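
The column slicing above can be tried without downloading the file. The following is a minimal sketch that assumes a tiny in-memory CSV with the same layout (8 feature columns followed by one label column). The names `sample_csv`, `sample_data`, and `sample_label` and all values are hypothetical placeholders, not rows from the real dataset.

```python
import io
import numpy as np

# Hypothetical 3-row sample with the same layout as the real file:
# 8 feature columns followed by 1 label column (values are placeholders).
sample_csv = io.StringIO(
    "1,2,3,4,5,6,7,8,0\n"
    "2,3,4,5,6,7,8,9,1\n"
    "3,4,5,6,7,8,9,10,0\n"
)
sample = np.loadtxt(sample_csv, delimiter=",")

sample_data = sample[:, 0:8]   # columns 0..7 -> features
sample_label = sample[:, 8]    # column 8     -> 0/1 label

print("sample_data.shape:", sample_data.shape)
print("sample_label.shape:", sample_label.shape)
```

The same two slices applied to the real file give the `(768, 8)` and `(768,)` shapes printed above.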


## 2. Build the network model

In [22]:
model = Sequential()
model.add(Dense(12, input_dim=8, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(4, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

print(model.summary())

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_3 (Dense)              (None, 12)                108       
_________________________________________________________________
dense_4 (Dense)              (None, 8)                 104       
_________________________________________________________________
dense_5 (Dense)              (None, 4)                 36        
_________________________________________________________________
dense_6 (Dense)              (None, 1)                 5         
Total params: 253
Trainable params: 253
Non-trainable params: 0
_________________________________________________________________
None
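
The `Param #` column in the summary can be reproduced by hand: a fully connected layer has one weight per input-unit pair plus one bias per unit. The sketch below is illustrative only; `dense_params` and `layer_shapes` are hypothetical names, not part of Keras.

```python
def dense_params(n_inputs: int, n_units: int) -> int:
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

# (inputs, units) for each Dense layer, read off the model definition above.
layer_shapes = [(8, 12), (12, 8), (8, 4), (4, 1)]
per_layer = [dense_params(i, u) for i, u in layer_shapes]

print(per_layer)       # [108, 104, 36, 5]
print(sum(per_layer))  # 253
```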


## 3. Compile and train the model

In [23]:
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

history = model.fit(data, label, epochs=150, batch_size=10,
                    validation_split=0.2,  # hold out the last 20% of the data as the validation set
                    verbose=2)

Epoch 1/150
62/62 - 1s - loss: 2.0577 - accuracy: 0.5749 - val_loss: 1.0439 - val_accuracy: 0.6104
Epoch 2/150
62/62 - 0s - loss: 0.8918 - accuracy: 0.6450 - val_loss: 0.9238 - val_accuracy: 0.5974
Epoch 3/150
62/62 - 0s - loss: 0.7947 - accuracy: 0.6384 - val_loss: 0.8354 - val_accuracy: 0.5649
Epoch 4/150
62/62 - 0s - loss: 0.7493 - accuracy: 0.6482 - val_loss: 0.7968 - val_accuracy: 0.6234
Epoch 5/150
62/62 - 0s - loss: 0.7161 - accuracy: 0.6401 - val_loss: 0.7410 - val_accuracy: 0.6558
Epoch 6/150
62/62 - 0s - loss: 0.6857 - accuracy: 0.6629 - val_loss: 0.7100 - val_accuracy: 0.6104
Epoch 7/150
62/62 - 0s - loss: 0.6603 - accuracy: 0.6645 - val_loss: 0.7033 - val_accuracy: 0.5649
Epoch 8/150
62/62 - 0s - loss: 0.6520 - accuracy: 0.6629 - val_loss: 0.7320 - val_accuracy: 0.5195
Epoch 9/150
62/62 - 0s - loss: 0.6490 - accuracy: 0.6775 - val_loss: 0.6952 - val_accuracy: 0.5455
Epoch 10/150
62/62 - 0s - loss: 0.6507 - accuracy: 0.6661 - val_loss: 0.6685 - val_accuracy: 0.6818
Epoch 11/

62/62 - 0s - loss: 0.5145 - accuracy: 0.7492 - val_loss: 0.5912 - val_accuracy: 0.7013
Epoch 84/150
62/62 - 0s - loss: 0.5118 - accuracy: 0.7541 - val_loss: 0.6010 - val_accuracy: 0.6948
Epoch 85/150
62/62 - 0s - loss: 0.5159 - accuracy: 0.7606 - val_loss: 0.6479 - val_accuracy: 0.6753
Epoch 86/150
62/62 - 0s - loss: 0.5294 - accuracy: 0.7362 - val_loss: 0.5712 - val_accuracy: 0.7208
Epoch 87/150
62/62 - 0s - loss: 0.5162 - accuracy: 0.7687 - val_loss: 0.5860 - val_accuracy: 0.7013
Epoch 88/150
62/62 - 0s - loss: 0.5169 - accuracy: 0.7606 - val_loss: 0.5687 - val_accuracy: 0.7078
Epoch 89/150
62/62 - 0s - loss: 0.5218 - accuracy: 0.7508 - val_loss: 0.5969 - val_accuracy: 0.6948
Epoch 90/150
62/62 - 0s - loss: 0.5173 - accuracy: 0.7541 - val_loss: 0.5764 - val_accuracy: 0.7078
Epoch 91/150
62/62 - 0s - loss: 0.5099 - accuracy: 0.7508 - val_loss: 0.5901 - val_accuracy: 0.6948
Epoch 92/150
62/62 - 0s - loss: 0.5126 - accuracy: 0.7638 - val_loss: 0.5762 - val_accuracy: 0.7403
Epoch 93/150
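
The `62/62` step count in the log follows from the split and the batch size. The sketch below is a rough check, assuming the split index is computed as `floor(n * (1 - validation_split))` and that the validation samples are taken from the end of the arrays. That assumption is consistent with the 62 steps shown in the log.

```python
import math

n_samples = 768
validation_split = 0.2
batch_size = 10

# Assumption: training samples = floor(n * (1 - validation_split));
# the remaining samples at the end of the arrays form the validation set.
n_train = math.floor(n_samples * (1 - validation_split))
n_val = n_samples - n_train
steps_per_epoch = math.ceil(n_train / batch_size)

print(n_train, n_val, steps_per_epoch)  # 614 154 62
```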


In [24]:
print("history:", history.history)

history: {'loss': [2.0576703548431396, 0.8917844891548157, 0.7947136163711548, 0.7492638230323792, 0.7160550951957703, 0.685688853263855, 0.6603339314460754, 0.6520437598228455, 0.6490398645401001, 0.6507017016410828, 0.6348869204521179, 0.6338370442390442, 0.6247356534004211, 0.618000328540802, 0.6256620287895203, 0.615386426448822, 0.611086905002594, 0.6023130416870117, 0.6039235591888428, 0.5996115207672119, 0.596635639667511, 0.5958499908447266, 0.593828558921814, 0.602578341960907, 0.5888012647628784, 0.578639030456543, 0.5754804015159607, 0.5784035325050354, 0.5759625434875488, 0.5676376819610596, 0.5692019462585449, 0.5815424919128418, 0.5604845285415649, 0.562128484249115, 0.5707480907440186, 0.5727440714836121, 0.5704574584960938, 0.5564420819282532, 0.5681372284889221, 0.5519600510597229, 0.5561438202857971, 0.5578890442848206, 0.5611892938613892, 0.5518606305122375, 0.5560075044631958, 0.5540204644203186, 0.5452053546905518, 0.538526713848114, 0.5502395629882812, 0.549614965
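
`history.history` is a dictionary with one list per metric and one entry per epoch, so it can be inspected with ordinary list operations. The sketch below uses a hypothetical stand-in dictionary (`example_history`, with made-up values) to show one way to find the epoch with the lowest validation loss.

```python
import numpy as np

# Hypothetical stand-in for history.history: one list per metric,
# one entry per epoch (values are made up for illustration).
example_history = {
    "loss": [0.90, 0.70, 0.60, 0.55],
    "val_loss": [0.95, 0.72, 0.66, 0.69],
    "accuracy": [0.60, 0.65, 0.70, 0.73],
    "val_accuracy": [0.58, 0.66, 0.71, 0.69],
}

# argmin returns a 0-based index; epochs in the log are 1-based.
best_epoch = int(np.argmin(example_history["val_loss"])) + 1
best_val_acc = example_history["val_accuracy"][best_epoch - 1]

print("best epoch by val_loss:", best_epoch)
print("val_accuracy at that epoch:", best_val_acc)
```

With the real object, `example_history` would be replaced by `history.history`.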

## 4. Evaluate and predict

In [25]:
loss, accuracy = model.evaluate(data, label)
print("\nLoss: {:.2}, Accuracy: {:.2%}".format(loss, accuracy))

probabilities = model.predict(data)

# Round each sigmoid output to 0 or 1 with np.round() and flatten (768, 1) to (768,)
predictions = np.round(probabilities).ravel()

# Fraction of predictions that match the true labels, i.e. the accuracy
accuracy = np.mean(predictions == label)
print("Prediction Accuracy: {:.2%}".format(accuracy))


Loss: 0.49, Accuracy: 76.43%
Prediction Accuracy: 76.43%
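
The rounding step can be illustrated on a few values. The sketch below uses hypothetical probabilities and labels (`example_probs`, `example_labels`) shaped like the output of `model.predict`. Note that `np.round` rounds exact halves to the nearest even value, so a probability of exactly 0.5 becomes 0.

```python
import numpy as np

# Hypothetical sigmoid outputs shaped like model.predict's result: (n, 1).
example_probs = np.array([[0.10], [0.50], [0.51], [0.97]])
example_labels = np.array([0.0, 1.0, 1.0, 1.0])

# np.round rounds exact halves to the nearest even value, so 0.5 -> 0.0.
example_preds = np.round(example_probs).ravel()

# Mean of the element-wise matches is the accuracy.
example_accuracy = np.mean(example_preds == example_labels)
print(example_preds)      # [0. 0. 1. 1.]
print(example_accuracy)   # 0.75
```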


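### Note:

Adding one more hidden layer, model.add(Dense(4, activation='relu')), improved the evaluation loss and accuracy from

Loss: 0.54, Accuracy: 74.22%

to

Loss: 0.49, Accuracy: 76.43%

Increasing epochs from 150 to 200 further improved the evaluation loss and accuracy to

Loss: 0.44, Accuracy: 78.78%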