# Neural Networks

## Preface

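When I saw the link Teacher Wang recommended, https://www.zhihu.com/question/24738573, my thoughts went back to January of this year, when I took part in the US Mathematical Contest in Modeling (MCM). Back then I happened to come across this Springboard article (original link: https://www.springboard.com/blog/beginners-guide-neural-network-in-python-scikit-learn-0-18/ ).
At the time I had only just heard of sklearn. After reading the article I tried sklearn for the first time, and in the end I successfully used sklearn's neural network model in the contest. That article was my introduction to sklearn... Looking back, finding it was a lucky accident, and I am grateful I read it then and got to see how great sklearn and Python are.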

## Revisiting the work I did back then...

### Neural networks

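A neural network is a machine learning framework that tries to mimic the way natural biological neural networks learn: it can be seen as a rough approximation of what the human brain does when it learns. In a biological neural network, neurons are interconnected. Dendrites receive inputs, and based on those inputs the neuron produces an output signal that travels along its axon to another neuron. We will try to model this process with an artificial neural network (ANN), referred to here simply as a neural network. Neural networks are the foundation of deep learning, a subset of machine learning that is responsible for some of today's most exciting technical advances. Building a neural network in Python starts with its most basic form: the single perceptron.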

### The perceptron

<img src="https://ws1.sinaimg.cn/large/006tNc79gy1frkc212534j318g0xajwa.jpg" width="500px">

As the figure above shows, a perceptron has one or more inputs ($a_1...a_k$), one weight per input ($w_1...w_k$), a bias, an activation function, and a single output.
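
To make the description concrete, here is a minimal sketch of a single perceptron in plain Python: it forms the weighted sum of the inputs, adds the bias, and passes the result through an activation function. The function names (`perceptron`, `step`, `sigmoid`) and the numeric values are illustrative assumptions, not something taken from the original article or from sklearn.

```python
import math


def perceptron(inputs, weights, bias, activation):
    """Return activation(sum(w_i * a_i) + bias) for a single perceptron."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    weighted_sum = sum(w * a for w, a in zip(weights, inputs)) + bias
    return activation(weighted_sum)


def step(z):
    """Threshold activation: output 1 when z is non-negative, else 0."""
    return 1 if z >= 0 else 0


def sigmoid(z):
    """Smooth activation that squashes z into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical example values, chosen only for illustration.
a = [1.0, 0.5, -2.0]   # inputs a_1..a_k
w = [0.4, -0.6, 0.1]   # weights w_1..w_k
b = 0.2                # bias

fired = perceptron(a, w, b, step)
smooth = perceptron(a, w, b, sigmoid)
print(fired, round(smooth, 3))
```

Swapping the activation function is the only change needed to go from a hard yes/no output to a graded one, which is why the activation is passed in as a parameter here.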

### Neural networks

<img src="https://ws2.sinaimg.cn/large/006tNc79gy1frkc8nifm9j31f00z219t.jpg" width="500px">

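To build a neural network, we simply stack layers of perceptrons together, producing a multilayer perceptron model. There is an input layer that takes in the data directly and an output layer that produces the result. Any layers in between are called hidden layers, because they do not directly "see" the input features or the output. For a visualization, see the figure above (source: 一天读懂深度学习).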

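> Keep in mind that, by their nature, neural networks tend to run better on GPUs than on CPUs. sklearn is not built for GPU optimization. If you want to move on to GPUs and distributed models, look at other frameworks, such as Google's open-source TensorFlow.

Let's go ahead and actually build a neural network with Python and sklearn!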

### Building the neural network

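For this analysis we will tackle one of the most important topics in life: wine! Joking aside, wine fraud is a very real thing. Let's see whether a neural network in Python can help with this problem. We will use the wine dataset from the UCI Machine Learning Repository. It contains various chemical features of wines that were all grown in the same region of Italy, labeled with three different cultivars. We will try to build a neural network model that classifies which cultivar a wine belongs to based on its chemical features. The data is available from the UCI repository.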

<img src="https://ws1.sinaimg.cn/large/006tNc79gy1frkcitthi6j31520nyti7.jpg" width="500px">

In [1]:
import pandas as pd

In [2]:
wine = pd.read_csv('wine_data.csv', names = ["Cultivator", "Alchol", "Malic_Acid", "Ash", "Alcalinity_of_Ash", "Magnesium", "Total_phenols", "Falvanoids", "Nonflavanoid_phenols", "Proanthocyanins", "Color_intensity", "Hue", "OD280", "Proline"])

In [3]:
wine.head()

| | Cultivator | Alchol | Malic_Acid | Ash | Alcalinity_of_Ash | Magnesium | Total_phenols | Falvanoids | Nonflavanoid_phenols | Proanthocyanins | Color_intensity | Hue | OD280 | Proline |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 14.23 | 1.71 | 2.43 | 15.6 | 127 | 2.8 | 3.06 | 0.28 | 2.29 | 5.64 | 1.04 | 3.92 | 1065 |
| 1 | 1 | 13.2 | 1.78 | 2.14 | 11.2 | 100 | 2.65 | 2.76 | 0.26 | 1.28 | 4.38 | 1.05 | 3.4 | 1050 |
| 2 | 1 | 13.16 | 2.36 | 2.67 | 18.6 | 101 | 2.8 | 3.24 | 0.3 | 2.81 | 5.68 | 1.03 | 3.17 | 1185 |
| 3 | 1 | 14.37 | 1.95 | 2.5 | 16.8 | 113 | 3.85 | 3.49 | 0.24 | 2.18 | 7.8 | 0.86 | 3.45 | 1480 |
| 4 | 1 | 13.24 | 2.59 | 2.87 | 21.0 | 118 | 2.8 | 2.69 | 0.39 | 1.82 | 4.32 | 1.04 | 2.93 | 735 |


In [4]:
wine.describe().transpose()

| | count | mean | std | min | 25% | 50% | 75% | max |
|---|---|---|---|---|---|---|---|---|
| Cultivator | 178.0 | 1.938202 | 0.775035 | 1.0 | 1.0 | 2.0 | 3.0 | 3.0 |
| Alchol | 178.0 | 13.000618 | 0.811827 | 11.03 | 12.3625 | 13.05 | 13.6775 | 14.83 |
| Malic_Acid | 178.0 | 2.336348 | 1.117146 | 0.74 | 1.6025 | 1.865 | 3.0825 | 5.8 |
| Ash | 178.0 | 2.366517 | 0.274344 | 1.36 | 2.21 | 2.36 | 2.5575 | 3.23 |
| Alcalinity_of_Ash | 178.0 | 19.494944 | 3.339564 | 10.6 | 17.2 | 19.5 | 21.5 | 30.0 |
| Magnesium | 178.0 | 99.741573 | 14.282484 | 70.0 | 88.0 | 98.0 | 107.0 | 162.0 |
| Total_phenols | 178.0 | 2.295112 | 0.625851 | 0.98 | 1.7425 | 2.355 | 2.8 | 3.88 |
| Falvanoids | 178.0 | 2.02927 | 0.998859 | 0.34 | 1.205 | 2.135 | 2.875 | 5.08 |
| Nonflavanoid_phenols | 178.0 | 0.361854 | 0.124453 | 0.13 | 0.27 | 0.34 | 0.4375 | 0.66 |
| Proanthocyanins | 178.0 | 1.590899 | 0.572359 | 0.41 | 1.25 | 1.555 | 1.95 | 3.58 |


In [5]:
# 178 data points with 13 features and 1 label column
wine.shape

(178, 14)

### Extract data and labels

In [6]:
X = wine.drop('Cultivator',axis=1)
y = wine['Cultivator']

### Split into training and test sets

In [7]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y)
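
For intuition about what a random hold-out split does, here is a rough plain-Python sketch. It is not scikit-learn's implementation; the helper name `simple_train_test_split` and the `seed` parameter are hypothetical. It assumes a quarter of the rows are held out, which is consistent with the 45 test samples that show up in the evaluation further down.

```python
import math
import random


def simple_train_test_split(rows, test_size=0.25, seed=0):
    """Shuffle row indices and hold out a fraction of them as the test set."""
    n_test = math.ceil(test_size * len(rows))
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)  # seeded so the split is repeatable
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return [rows[i] for i in train_idx], [rows[i] for i in test_idx]


rows = list(range(178))  # stand-in for the 178 wine samples
train_rows, test_rows = simple_train_test_split(rows)
print(len(train_rows), len(test_rows))
```

In the real call, `X` and `y` are shuffled with the same indices so that every feature row stays paired with its label. Passing `random_state` to `train_test_split` plays the role of `seed` here and makes the split reproducible.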

### Data Preprocessing (standardization)

In [8]:
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
# Fit only to the training data
scaler.fit(X_train)
# Now apply the transformations to the data:
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)
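
The reason for fitting on the training data only is that the test set should be scaled with statistics it had no part in computing. Below is a minimal sketch of the same idea for a single feature column, using only the standard library. The helper names `fit_scaler` and `transform` and the sample values are assumptions made for illustration; the sketch uses the population standard deviation, which is my understanding of what `StandardScaler` does.

```python
import statistics


def fit_scaler(column):
    """Learn the mean and population standard deviation of one feature."""
    mean = statistics.mean(column)
    std = statistics.pstdev(column)
    if std == 0:
        std = 1.0  # constant feature: avoid dividing by zero
    return mean, std


def transform(column, mean, std):
    """Apply z = (x - mean) / std using statistics learned elsewhere."""
    return [(x - mean) / std for x in column]


# Hypothetical values for one feature, chosen so the numbers are easy to check.
train_column = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
test_column = [3.0, 11.0]

mean, std = fit_scaler(train_column)              # fit on training data only
train_scaled = transform(train_column, mean, std)
test_scaled = transform(test_column, mean, std)   # reuse the training statistics
print(mean, std, test_scaled)
```

After scaling, the training column has mean 0 and unit variance, while the test column generally does not, and that is expected.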

### Model Training 

In [9]:
from sklearn.neural_network import MLPClassifier

In [10]:
# Parameters: 3 hidden layers with 13 neurons each, at most 500 iterations
mlp = MLPClassifier(hidden_layer_sizes=(13,13,13),max_iter=500)

In [11]:
mlp.fit(X_train,y_train)

MLPClassifier(activation='relu', alpha=0.0001, batch_size='auto', beta_1=0.9,
       beta_2=0.999, early_stopping=False, epsilon=1e-08,
       hidden_layer_sizes=(13, 13, 13), learning_rate='constant',
       learning_rate_init=0.001, max_iter=500, momentum=0.9,
       nesterovs_momentum=True, power_t=0.5, random_state=None,
       shuffle=True, solver='adam', tol=0.0001, validation_fraction=0.1,
       verbose=False, warm_start=False)

### Predictions and Evaluation

In [12]:
predictions = mlp.predict(X_test)

In [13]:
from sklearn.metrics import classification_report,confusion_matrix
print(confusion_matrix(y_test,predictions))

[[11  0  0]
 [ 2 18  0]
 [ 0  0 14]]


### classification_report: precision, recall, f1-score

In [14]:
print(classification_report(y_test,predictions))

             precision    recall  f1-score   support

          1       0.85      1.00      0.92        11
          2       1.00      0.90      0.95        20
          3       1.00      1.00      1.00        14

avg / total       0.96      0.96      0.96        45
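
Each per-class row of this report can be recomputed by hand from the confusion matrix printed earlier, where rows are true classes and columns are predicted classes. The sketch below does that in plain Python. The helper name `per_class_scores` is hypothetical; the f1-score is taken to be the harmonic mean of precision and recall, 2PR / (P + R).

```python
# Confusion matrix copied from the output above (rows: true, columns: predicted).
confusion = [
    [11, 0, 0],
    [2, 18, 0],
    [0, 0, 14],
]
labels = [1, 2, 3]


def per_class_scores(matrix):
    """Return a (precision, recall, f1) tuple for every class."""
    scores = []
    for k in range(len(matrix)):
        true_positive = matrix[k][k]
        predicted_k = sum(row[k] for row in matrix)  # column sum
        actual_k = sum(matrix[k])                    # row sum
        precision = true_positive / predicted_k if predicted_k else 0.0
        recall = true_positive / actual_k if actual_k else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores.append((precision, recall, f1))
    return scores


scores = per_class_scores(confusion)
for label, (p, r, f1) in zip(labels, scores):
    print(f"{label}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

For class 1, for example, 11 of the 13 samples predicted as class 1 are correct, giving a precision of about 0.85, while all 11 true class-1 samples are found, giving a recall of 1.00.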



### The f1-score is the less familiar of these metrics

<img src="https://ws3.sinaimg.cn/large/006tNc79gy1frkdh2eyn0j30vq0kq0uv.jpg" width="500px">

>However, if you do want to extract the MLP weights and biases after training your model, you use its public attributes `coefs_` and `intercepts_`.
`coefs_` is a list of weight matrices, where weight matrix at index i represents the weights between layer i and layer i+1.
`intercepts_` is a list of bias vectors, where the vector at index i represents the bias values added to layer i+1.
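
To show how these two lists fit together, here is a rough plain-Python sketch of a forward pass through such a network. It assumes ReLU in the hidden layers (the `activation='relu'` shown in the fitted model's parameters) and a softmax output layer, which is my understanding of what `MLPClassifier` uses for multiclass problems. The toy weights and the helper names are hypothetical and much smaller than the trained wine model.

```python
import math


def relu(vector):
    return [max(0.0, v) for v in vector]


def softmax(vector):
    peak = max(vector)
    exps = [math.exp(v - peak) for v in vector]  # shift by max for stability
    total = sum(exps)
    return [e / total for e in exps]


def affine(x, weights, biases):
    """Compute x @ weights + biases, with weights shaped (n_in, n_out)."""
    return [
        sum(x[i] * weights[i][j] for i in range(len(x))) + biases[j]
        for j in range(len(biases))
    ]


def forward(x, coefs, intercepts):
    """Hidden layers use ReLU; the final layer uses softmax."""
    activation = x
    last = len(coefs) - 1
    for i, (weights, biases) in enumerate(zip(coefs, intercepts)):
        z = affine(activation, weights, biases)
        activation = softmax(z) if i == last else relu(z)
    return activation


# Hypothetical tiny network: 2 inputs -> 2 hidden units -> 3 classes.
toy_coefs = [
    [[1.0, -1.0], [0.5, 2.0]],            # input -> hidden, shape (2, 2)
    [[1.0, 0.0, -1.0], [0.0, 1.0, 1.0]],  # hidden -> output, shape (2, 3)
]
toy_intercepts = [[0.0, 0.0], [0.0, 0.0, 0.0]]

probabilities = forward([1.0, 1.0], toy_coefs, toy_intercepts)
print([round(p, 3) for p in probabilities])
```

The index of the largest probability is the predicted class position, so the weights and biases alone are enough to reproduce a prediction outside of sklearn.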

In [25]:
len(mlp.coefs_) 

4

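> Why 4? With `hidden_layer_sizes=(13,13,13)` I expected 3, and at first this puzzled me. The count is of connections between consecutive layers, not of hidden layers: the input layer, three hidden layers, and the output layer make five layers, hence four weight matrices.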
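
The shapes behind that count of 4 can be listed by pairing up consecutive layer sizes: 13 input features, the three hidden layers of 13, and 3 output classes. This is a small sketch with a hypothetical helper name, `weight_matrix_shapes`, not an sklearn function.

```python
def weight_matrix_shapes(n_features, hidden_layer_sizes, n_outputs):
    """Return one (n_in, n_out) shape per pair of consecutive layers."""
    layer_sizes = [n_features, *hidden_layer_sizes, n_outputs]
    return list(zip(layer_sizes[:-1], layer_sizes[1:]))


# 13 chemical features, hidden_layer_sizes=(13, 13, 13), 3 cultivars.
shapes = weight_matrix_shapes(13, (13, 13, 13), 3)
print(len(shapes), shapes)
```

On the fitted model, `[w.shape for w in mlp.coefs_]` should show the same four shapes, with the last matrix mapping the final hidden layer to the three classes.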

### Print the first layer's weights

In [26]:
print(mlp.coefs_[0])

[[-0.14017249  0.37151817 -0.37240336 -0.49641431 -0.2900561   0.14385554
  -0.10385999 -0.19775961 -0.08277903  0.2425968   0.38472292  0.55634046
   0.36025732]
 [-0.11552038  0.13818601  0.20629434 -0.14752564  0.07425107 -0.0645282
   0.45696352  0.1465841   0.04621567 -0.12996908 -0.03974458 -0.07578182
   0.34821485]
 [ 0.01780884  0.11766512  0.17623106  0.19465891  0.51614303  0.36036322
  -0.13836559 -0.27614582  0.32339831 -0.19955025  0.42521828  0.44255971
  -0.11468477]
 [ 0.18047039  0.09142575  0.04698464  0.10479112  0.03116768 -0.49037818
   0.15857183  0.01278012 -0.1288964  -0.50898367  0.28545686  0.40415107
   0.28167827]
 [-0.47854362  0.20077224  0.48074776 -0.17144734 -0.00736817  0.46802458
  -0.14801275 -0.07794713 -0.12581422 -0.31764913  0.00883827 -0.43987826
  -0.41259726]
 [ 0.07024488  0.36563502 -0.29457596  0.2709135   0.00460248 -0.25161491
  -0.16777263  0.13194196  0.62601558  0.32437005  0.53418675 -0.38964592
   0.04063399]
 [-0.40144028 -0.233211

### Print the first layer's biases

In [27]:
print(mlp.intercepts_[0])

[ 0.49574975  0.23808403  0.31972499  0.64075982  0.6475908   0.23028299
  0.40724809 -0.36229247 -0.23609277  0.64539785  0.10505051  0.36714228
  0.30151916]
