# Lab Experiment 1: Designing a Classification Task with Feedforward Neural Networks
计卓2101 高僖 U202115285
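## Network Design
Three feedforward neural networks were designed, with hidden layer sizes [32,16], [64,32,16], and [128,64,32,16]. Three activation functions, ReLU, Sigmoid, and Tanh, were compared in the experiments.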
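
To make "network size" concrete, the following minimal sketch counts the trainable parameters of each configuration. It assumes 2 input features and 4 output classes, as used in the code at the end of this report; the helper name `dense_param_count` is hypothetical and introduced only for illustration.

```python
def dense_param_count(input_dim, hidden_sizes, output_dim):
    """Count weights and biases of a fully connected (Dense) network."""
    sizes = [input_dim, *hidden_sizes, output_dim]
    # Each Dense layer has fan_in * fan_out weights plus fan_out biases.
    return sum(fan_in * fan_out + fan_out
               for fan_in, fan_out in zip(sizes, sizes[1:]))


# Hidden-layer configurations from this report.
configs = [[32, 16], [64, 32, 16], [128, 64, 32, 16]]

# Assumption: 2 input features and 4 classes, taken from the report's code.
counts = {tuple(c): dense_param_count(2, c, 4) for c in configs}

for config, n_params in counts.items():
    print(config, n_params)
```

Under these assumptions the largest network has roughly sixteen times as many parameters as the smallest one, which is the sense in which it is "more complex".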

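## Other Training Parameters

The Adam optimizer (with TensorFlow's default parameters) was used, with cross-entropy as the loss function. The batch size was 32 and the number of epochs was 10.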
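
As an illustration of the loss, here is a minimal pure-Python sketch of softmax followed by categorical cross-entropy for a single sample. It is not the TensorFlow implementation; the function names and the clipping constant `eps` are assumptions made for this example.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_cross_entropy(one_hot_target, probs, eps=1e-7):
    """Cross-entropy between a one-hot target and predicted probabilities."""
    # eps is an assumed clipping value that avoids log(0).
    return -sum(t * math.log(max(p, eps))
                for t, p in zip(one_hot_target, probs))


target = [0, 1, 0, 0]  # one-hot label for class 1 out of 4 classes

# An uninformed prediction: all four classes equally likely.
uniform = softmax([0.0, 0.0, 0.0, 0.0])
loss_uniform = categorical_cross_entropy(target, uniform)

# A prediction that strongly favors the correct class.
confident = softmax([0.0, 5.0, 0.0, 0.0])
loss_confident = categorical_cross_entropy(target, confident)

print(loss_uniform, loss_confident)
```

For four classes, a uniform prediction gives a loss of ln 4, which is a useful reference point when reading the loss curves.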

## Results and Analysis
### Comparison of Different Hidden Layers

All experiments in this comparison use ReLU as the activation function.

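Hidden layers: [32,16]
<img src="(32, 16)_neuron_configuration_training_plot.png" style="zoom:100%;" />
The training and validation losses after each epoch of mini-batch training are listed in the code run output at the end of this report. <br/>
Final training accuracy: 0.9237654209136963 <br/>
Test accuracy: 0.9275000095367432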


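Hidden layers: [64,32,16]
<img src="(64, 32, 16)_neuron_configuration_training_plot.png" style="zoom:100%;" />
The training and validation losses after each epoch of mini-batch training are listed in the code run output at the end of this report. <br/>
Final training accuracy: 0.9228395223617554 <br/>
Test accuracy: 0.9300000071525574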


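Hidden layers: [128,64,32,16]
<img src="(128, 64, 32, 16)_neuron_configuration_training_plot.png" style="zoom:100%;" />
The training and validation losses after each epoch of mini-batch training are listed in the code run output at the end of this report. <br/>
Final training accuracy: 0.9259259104728699 <br/>
Test accuracy: 0.925000011920929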


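#### Analysis

For this simple fitting task the networks show essentially no overfitting: for all three configurations the test accuracy stays close to the training accuracy. Under the same training settings, the more complex networks fit the data more strongly within the limited number of training iterations.
The final accuracies are nevertheless about the same, and a more complex network can even end up with slightly lower accuracy: the test accuracy of [128,64,32,16] (0.9250) is below that of [64,32,16] (0.9300).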
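
The overfitting claim can be checked directly from the accuracies reported above. The sketch below computes the gap between training and test accuracy for each configuration; a large positive gap would suggest overfitting. The numbers are copied from this report, and the variable names are illustrative.

```python
# (final training accuracy, test accuracy) as reported in this section.
reported = {
    "[32,16]": (0.9237654209136963, 0.9275000095367432),
    "[64,32,16]": (0.9228395223617554, 0.9300000071525574),
    "[128,64,32,16]": (0.9259259104728699, 0.925000011920929),
}

# Generalization gap: training accuracy minus test accuracy.
gaps = {name: train - test for name, (train, test) in reported.items()}

for name, gap in gaps.items():
    print(f"{name}: train - test = {gap:+.4f}")
```

All three gaps are below one percentage point in magnitude, which is consistent with the observation that none of the networks overfits noticeably.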

### Comparison of Different Activation Functions

All experiments in this comparison use hidden layer sizes [64, 32].

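ReLU activation:
<img src="relu_training_plot.png" style="zoom:100%;" />
The training and validation losses after each epoch of mini-batch training are listed in the code run output at the end of this report. <br/>
Final training accuracy: 0.9209876656532288 <br/>
Test accuracy: 0.9024999737739563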

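Sigmoid activation:
<img src="sigmoid_training_plot.png" style="zoom:100%;" />
The training and validation losses after each epoch of mini-batch training are listed in the code run output at the end of this report. <br/>
Final training accuracy: 0.9138888716697693 <br/>
Test accuracy: 0.8999999761581421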

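Tanh activation:
<img src="Tanh_training_plot.png" style="zoom:100%;" />
The training and validation losses after each epoch of mini-batch training are listed in the code run output at the end of this report. <br/>
Final training accuracy: 0.9268518686294556 <br/>
Test accuracy: 0.9075000286102295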


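#### Analysis

Among ReLU, Sigmoid, and Tanh, ReLU trains most efficiently on this fitting task, while the other two train considerably less efficiently. The final accuracies are nevertheless close: Tanh ends with the highest training and test accuracy (0.9269 and 0.9075), followed by ReLU (0.9210 and 0.9025) and Sigmoid (0.9139 and 0.9000).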
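
One commonly cited factor behind differences in training efficiency is the size of each activation's gradient. The sketch below is a general illustration, not a measurement from this experiment: it evaluates the derivatives of the three activations at a few points. The function names are chosen for this example.

```python
import math


def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    """Derivative of the sigmoid: s(x) * (1 - s(x)), at most 0.25."""
    s = sigmoid(x)
    return s * (1.0 - s)


def tanh_grad(x):
    """Derivative of tanh: 1 - tanh(x)^2, at most 1."""
    return 1.0 - math.tanh(x) ** 2


for x in (-4.0, 0.0, 4.0):
    print(f"x={x:+.1f}  relu'={relu_grad(x):.4f}  "
          f"sigmoid'={sigmoid_grad(x):.4f}  tanh'={tanh_grad(x):.4f}")
```

The sigmoid derivative never exceeds 0.25, and both Sigmoid and Tanh saturate for inputs of large magnitude, whereas the ReLU derivative stays at 1 for every positive input. With only two hidden layers and ten epochs the effect is modest, which fits the small accuracy differences reported above.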

## Experiment Code

In [12]:
import numpy as np
import tensorflow as tf
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib auto



Different activation functions

In [13]:
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler


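dataset_path = "dataset.csv"
data = pd.read_csv(dataset_path)

# Shuffle the rows randomly (unseeded, so each run sees a different order)
data = data.sample(frac=1).reset_index(drop=True)

# Two input features and one class label per row
features = data[['data1', 'data2']]
labels = data['label']

# One-hot encode the labels; float dtype so the loss receives numeric targets
labels_onehot = pd.get_dummies(labels, dtype=float)

# Hold out 10% of the data as the test set
X_train, X_test, y_train, y_test = train_test_split(features, labels_onehot, test_size=0.1, random_state=42)

# Fit the scaler on the training split only, then apply it to both splits
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)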

activation_functions = ['relu', 'sigmoid', 'tanh']

for activation_function in activation_functions:
    # Two hidden layers [64, 32] followed by a 4-class softmax output
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation=activation_function, input_shape=(2,)),
        tf.keras.layers.Dense(32, activation=activation_function),
        tf.keras.layers.Dense(4, activation='softmax')
    ])

    # Adam with default settings and categorical cross-entropy loss
    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

    # 10% of the training data is held out for per-epoch validation
    history = model.fit(X_train_scaled, y_train, epochs=10, batch_size=32, validation_split=0.1)

    # Evaluate on the held-out test set and report the last-epoch training accuracy
    test_loss, test_acc = model.evaluate(X_test_scaled, y_test)
    print(f'Test Accuracy ({activation_function}): {test_acc}')
    final_train_accuracy = history.history['accuracy'][-1]
    print(f'Final Training Accuracy ({activation_function} Neuron Configuration): {final_train_accuracy}')

    plt.figure(figsize=(12, 4))

    plt.subplot(1, 2, 1)
    plt.plot(history.history['loss'], label='Training Loss')
    plt.plot(history.history['val_loss'], label='Validation Loss')
    plt.title(f'Training and Validation Loss with {activation_function.capitalize()}')
    plt.xlabel('Epochs')
    plt.ylabel('Loss')
    plt.legend()

    plt.subplot(1, 2, 2)
    plt.plot(history.history['accuracy'], label='Training Accuracy')
    plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
    plt.title(f'Training and Validation Accuracy with {activation_function.capitalize()}')
    plt.xlabel('Epochs')
    plt.ylabel('Accuracy')
    plt.legend()

    plt.savefig(f'{activation_function}_training_plot.png')





Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Test Accuracy (relu): 0.9300000071525574
Final Training Accuracy (relu Neuron Configuration): 0.9231481552124023
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Test Accuracy (sigmoid): 0.9300000071525574
Final Training Accuracy (sigmoid Neuron Configuration): 0.9200617074966431
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Test Accuracy (tanh): 0.9325000047683716
Final Training Accuracy (tanh Neuron Configuration): 0.9243826866149902
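
The preprocessing in the cell above fits the scaler on the training split and reuses the same statistics for the test split. The following pure-Python sketch shows that idea without scikit-learn. The helper names and the tiny data set are made up for illustration. It uses the population standard deviation and falls back to a scale of 1 for constant columns, which matches the default behavior of `StandardScaler`.

```python
import statistics


def fit_standardizer(rows):
    """Compute per-column mean and population standard deviation."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    # Fall back to 1.0 for constant columns to avoid division by zero.
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    return means, stds


def transform(rows, means, stds):
    """Standardize rows with previously fitted statistics."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)]
            for row in rows]


# Made-up data, for illustration only.
train_rows = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
test_rows = [[2.0, 40.0]]

# Statistics come from the training rows only ...
means, stds = fit_standardizer(train_rows)
train_scaled = transform(train_rows, means, stds)
# ... and are reused unchanged for the test rows.
test_scaled = transform(test_rows, means, stds)

print(means, stds)
print(test_scaled)
```

Fitting on the training split alone keeps information about the test data out of the preprocessing step.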


Different hidden layers

In [14]:
import numpy as np
import pandas as pd
import tensorflow as tf
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import matplotlib.pyplot as plt


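dataset_path = "dataset.csv"
data = pd.read_csv(dataset_path)

# Shuffle the rows randomly (unseeded, so each run sees a different order)
data = data.sample(frac=1).reset_index(drop=True)

# Two input features and one class label per row
features = data[['data1', 'data2']]
labels = data['label']

# One-hot encode the labels; float dtype so the loss receives numeric targets
labels_onehot = pd.get_dummies(labels, dtype=float)

# Hold out 10% of the data as the test set
X_train, X_test, y_train, y_test = train_test_split(features, labels_onehot, test_size=0.1, random_state=42)

# Fit the scaler on the training split only, then apply it to both splits
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)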

neuron_configurations = [(32,16), (64,32,16), (128,64,32,16)]

for configuration in neuron_configurations:
    # First hidden layer takes the 2 input features
    model = tf.keras.Sequential()
    model.add(tf.keras.layers.Dense(configuration[0], activation='relu', input_shape=(2,)))

    # Remaining hidden layers, one Dense layer per entry in the configuration
    for neurons in configuration[1:]:
        model.add(tf.keras.layers.Dense(neurons, activation='relu'))

    # 4-class softmax output
    model.add(tf.keras.layers.Dense(4, activation='softmax'))

    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

    # 10% of the training data is held out for per-epoch validation
    history = model.fit(X_train_scaled, y_train, epochs=10, batch_size=32, validation_split=0.1)

    # Evaluate on the held-out test set
    test_loss, test_acc = model.evaluate(X_test_scaled, y_test)
    print(f'Test Accuracy ({configuration} Neuron Configuration): {test_acc}')

    # Training accuracy recorded for the last epoch
    final_train_accuracy = history.history['accuracy'][-1]
    print(f'Final Training Accuracy ({configuration} Neuron Configuration): {final_train_accuracy}')

    plt.figure(figsize=(12, 4))

    plt.subplot(1, 2, 1)
    plt.plot(history.history['loss'], label='Training Loss')
    plt.plot(history.history['val_loss'], label='Validation Loss')
    plt.title(f'Training and Validation Loss with {configuration}')
    plt.xlabel('Epochs')
    plt.ylabel('Loss')
    plt.legend()

    plt.subplot(1, 2, 2)
    plt.plot(history.history['accuracy'], label='Training Accuracy')
    plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
    plt.title(f'Training and Validation Accuracy with {configuration}')
    plt.xlabel('Epochs')
    plt.ylabel('Accuracy')
    plt.legend()

    plt.savefig(f'{str(configuration)}_neuron_configuration_training_plot.png')



Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Test Accuracy ((32, 16) Neuron Configuration): 0.9150000214576721
Final Training Accuracy ((32, 16) Neuron Configuration): 0.9240740537643433
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Test Accuracy ((64, 32, 16) Neuron Configuration): 0.9175000190734863
Final Training Accuracy ((64, 32, 16) Neuron Configuration): 0.9228395223617554
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Test Accuracy ((128, 64, 32, 16) Neuron Configuration): 0.9075000286102295
Final Training Accuracy ((128, 64, 32, 16) Neuron Configuration): 0.9259259104728699
