# <a>Understanding and Implementing Neural Networks from scratch</a>
#    从头开始理解和实现神经网络

<br>

In this kernel, I have explained the intution about neural networks and how to implement neural networks from scratch in python. 
# 在这个内核中，我解释了关于神经网络的直觉以及如何在python中从头实现神经网络。

## Contents  
#  内容
<br>

**<a><i> 1. What are Neural Networks</i></a> **    # 什么是神经网络

**<a><i> 2. Implement a Neural Network - Binary classification</i></a>**   # 实现一个神经网络-二进制分类

**<a><i> 3. Implement a Neural Network - Multiclass classification</i></a>**   #实现一个神经网络-多级分类

**<a><i> 4. What are Deep Neural Networks</i></a> **     #什么是深度神经网络

**<a><i> 5. Convolutional Neural Networks Implementation</i></a>**   # 卷积神经网络实现

![](https://www.pangeanic.com/wp-content/uploads/sites/2/2017/07/neural-network-graph-624x492.jpg)

I would like to thank Andrew NG and deeplearning.ai course for their excellent material
# 我要感谢Andrew NG和deeplearning。人工智能课程为他们提供了优秀的材料



## <a>1. What are Neural Networks </a>

Neural networks are a type of machine learning models which are designed to operate similar to biological neurons and human nervous system. These models are used to recognize complex patterns and relationships that exists within a labelled dataset. They have following properties:
# 神经网络是一种类似于生物神经元和人类神经系统的机器学习模型。这些模型用于识别标记数据集中存在的复杂模式和关系。它们有以下特性:

1. The core architecture of a Neural Network model is comprised of a large number of simple processing nodes called Neurons which are interconnected and organized in different layers.
#  神经网络模型的核心结构是由大量被称为神经元的简单处理节点组成的，这些节点在不同的层中相互连接和组织。 

2. An individual node in a layer is connected to several other nodes in the previous and the next layer. The inputs form one layer are received and processed to generate the output which is passed to the next layer.
#  层中的单个节点连接到上一层和下一层中的其他几个节点。接收并处理来自一层的输入，以生成传递到下一层的输出。

3. The first layer of this architecture is often named as input layer which accepts the inputs, the last layer is named as the output layer which produces the output and every other layer between input and output layer is named is hidden layers.
#  这种体系结构的第一层通常被称为输入层，它接受输入，最后一层被称为输出层，它产生输出，输入和输出层之间的每一层都被称为隐藏层。 

### Key concepts in a Neural Network 
#   神经网络的关键概念

#### A. Neuron:
#       神经元

A Neuron is a single processing unit of a Neural Network which are connected to different other neurons in the network. These connections repersents inputs and ouputs from a neuron. To each of its connections, the neuron assigns a “weight” (W) which signifies the importance the input and adds a bias (b) term.
# 神经元是神经网络的单一处理单元，它与网络中不同的神经元相连。这些连接代替了神经元的输入和输出。神经元给每个连接分配一个“权值”(W)，表示输入的重要性，并添加一个偏差项(b)。 

#### B. Activation Functions 
#       激活函数

The activation functions are used to apply non-linear transformation on input to map it to output. The aim of activation functions is to predict the right class of the target variable based on the input combination of variables. Some of the popular activation functions are Relu, Sigmoid, and TanH. 
# 激活函数用于对输入应用非线性变换，将其映射到输出。激活函数的目的是根据变量的输入组合来预测目标变量的正确类别。常用的激活函数有Relu、Sigmoid和TanH。

#### C. Forward Propagation 
#       正向传播
Neural Network model goes through the process called forward propagation in which it passes the computed activation outputs in the forward direction.
# 神经网络模型通过一个称为正向传播的过程，在这个过程中，它将计算出的激活输出传递给正向。 

Z = W*X + b   
A = g(Z) 

- g is the activation function                # g是激活函数
- A is the activation using the input         # A是使用输入的激活
- W is the weight associated with the input   # W是与输入相关的权值
- B is the bias associated with the node      # B是与节点相关的偏差
 
#### D. Error Computation: 
#        误差计算

The neural network learns by improving the values of weights and bias. The model computes the error in the predicted output in the final layer which is then used to make small adjustments the weights and bias. The adjustments are made such that the total error is minimized. Loss function measures the error in the final layer and cost function measures the total error of the network. 
# 神经网络通过改进权值和偏差值进行学习。该模型计算最后一层预测输出的误差，然后对权重和偏差进行小的调整。这些调整使得总误差最小化。损耗函数测量最后一层的误差，成本函数测量网络的总误差。

Loss = Actual_Value - Predicted_Value   

Cost = Summation (Loss)   

#### E. Backward Propagation:
#        反向传播 

Neural Network model undergoes the process called backpropagation in which the error is passed to backward layers so that those layers can also improve the associated values of weights and bias. It uses the algorithm called Gradient Descent in which the error is minimized and optimal values of weights and bias are obtained. This weights and bias adjustment is done by computing the derivative of error, derivative of weights, bias and subtracting them from the original values. 
# 神经网络模型经历了一个称为反向传播的过程，在这个过程中误差被传递到反向层，这样那些层也可以改进相关的权值和偏差。该算法采用梯度下降法，使误差最小化，获得最优的权值和偏差。这种权值和偏差调整是通过计算误差的导数、权值的导数、偏差并从原始值中减去它们来实现的。
<br>

## <a> 2. Implement a Neural Network - Binary Classification</a>
#          实现一个神经网络-二进制分类  

Lets implement a basic neural network in python for binary classification which is used to classify if a given image is 0 or 1.  
# 让我们实现一个基本的神经网络在python的二进制分类，这是用来分类，如果一个给定的图像是0或1。

In [26]:
from keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPooling2D
from keras.models import Sequential
import pandas as pd 
import numpy as np 
import keras

### 2.1 Dataset Preparation
#       数据集的准备

First step is to load and prepare the dataset
# 第一步是加载和准备数据集

In [35]:
train = pd.read_csv("C:/Users/H/Desktop/AI/nurual network/实验3/1、数据/train.csv")
test = pd.read_csv("C:/Users/H/Desktop/AI/nurual network/实验3/1、数据/test.csv")

# include only the rows having label = 0 or 1 (binary classification)
# 只包含label = 0或1的行(二进制分类)
X = train[train['label'].isin([0, 1])]

# target variable
# 目标变量
Y = train[train['label'].isin([0, 1])]['label']

# remove the label from X
# 从X中移除标签
X = X.drop(['label'], axis = 1)


In [36]:
X.head()

Unnamed: 0,pixel0,pixel1,pixel2,pixel3,pixel4,pixel5,pixel6,pixel7,pixel8,pixel9,...,pixel774,pixel775,pixel776,pixel777,pixel778,pixel779,pixel780,pixel781,pixel782,pixel783
0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
5,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


### 2.2 Implementing a Activation Function 
#        实现激活函数

We will use sigmoid activation function because it outputs the values between 0 and 1 so its a good choice for a binary classification problem
# 我们将使用sigmoid激活函数，因为它输出0到1之间的值，所以它是一个很好的二进制分类问题的选择

In [37]:
# implementing a sigmoid activation function
# 实现一个sigmoid激活函数
def sigmoid(z):
    #########################需要填写，写出sigmoid的计算公式，并赋值给s，sigmoid=1/1+exp(x)
    s = 1.0/(1 + np.exp(-z))
    #########################需要填写，写出sigmoid的计算公式，并赋值给s，sigmoid=1/1+exp(x)
    return s

### 2.3 Define Neural Network Architecture
#        定义神经网络体系结构

Create a model with three layers - Input, Hidden, Output. 
# 创建一个有三层的模型——输入，隐藏，输出。

In [38]:
def network_architecture(X, Y):
    # nodes in input layer
    # 输入层节点
    
    ##########################需要填写，计算每层的个数，输入层个数赋值给n_x,输出层赋值给n_y
    n_x = X.shape[0]
    
    # nodes in hidden layer
    # 隐藏层节点
    n_h = 10  
    
    # nodes in output layer
    # 输出层节点
    n_y = Y.shape[0]
     ##########################需要填写，计算每层的个数，输入层个数赋值给n_x,输出层赋值给n_y
    return (n_x, n_h, n_y)

### 2.4 Define Neural Network Parameters 
#       定义神经网络参数
Neural Network parameters are weights and bias which we need to initialze with zero values. The first layer only contains inputs so there are no weights and bias, but the hidden layer and the output layer have a weight and bias term. (W1, b1 and W2, b2)
# 神经网络参数是权重和偏差，我们需要用零值初始化。第一层只包含输入，所以没有权值和偏差，但是隐含层和输出层有一个权值和偏差项。(W1, b1, W2, b2)

In [39]:
def define_network_parameters(n_x, n_h, n_y):
    W1 = np.random.randn(n_h,n_x) * 0.01      # random initialization
    b1 = np.zeros((n_h, 1))                   # zero initialization
    
    ##############################需要填写，请为输出层的W2和b2赋值
    W2 = np.random.randn(n_y,n_h) * 0.01
    b2 = np.zeros((n_y,1))
    ##############################需要填写，请为输出层的W2和b2赋值
    
    return {"W1": W1, "b1": b1, "W2": W2, "b2": b2}    

### 2.5 Implement Forward Propagation
#       实现向前传播

The hidden layer and output layer will compute the activations using sigmoid activation function and will pass it in the forward direction. While computing this activation, the input is multiplied with weight and added with bias before passing it to the function.
# 隐含层和输出层使用sigmoid激活函数计算激活量，并向前传递。在计算这个激活时，输入与权重相乘，并在传递给函数之前加上偏差。 

In [40]:
def forward_propagation(X, params):
    Z1 = np.dot(params['W1'], X)+params['b1']
    A1 = sigmoid(Z1)
    
    ################################需要填写，请计算Z2和A2
    Z2 = np.dot(params['W2'], A1)+params['b2']
    A2 = sigmoid(Z2)
    ################################需要填写，请计算Z2和A2
    
    return {"Z1": Z1, "A1": A1, "Z2": Z2, "A2": A2}    

### 2.6 Compute the Network Error 
#       计算网络误差

To compute the cost, one straight forward approach is to compute the absolute error among prediction and actual value. But a better loss function is the log loss function which is defines as : 
# 计算成本的一种直接方法是计算预测值与实际值之间的绝对误差。但更好的损失函数是对数损失函数，定义为:
  -Summ ( Log (Pred) * Actual + Log (1 - Pred ) * Actual ) / m

In [41]:
def compute_error(Predicted, Actual):                   #计算误差函数，可以选择不同的误差函数
    logprobs = np.multiply(np.log(Predicted), Actual)+ np.multiply(np.log(1-Predicted), 1-Actual)
    cost = -np.sum(logprobs) / Actual.shape[1] 
    return np.squeeze(cost)

### 2.7 Implement Backward Propagation
#       实现反向传播
In backward propagation function, the error is passed backward to previous layers and the derivatives of weights and bias are computed. The weights and bias are then updated using the derivatives. 
# 在反向传播函数中，将误差反向传递到前一层，计算权值和偏差的导数。然后使用导数更新权重和偏差。 

In [42]:
def backward_propagation(params, activations, X, Y):                ##########反向传播算法，计算导数############
    m = X.shape[1]
    
    # output layer
    # 输出层
    dZ2 = activations['A2'] - Y                      # compute the error derivative  计算误差导数
    dW2 = np.dot(dZ2, activations['A1'].T) / m       # compute the weight derivative 计算权重导数
    db2 = np.sum(dZ2, axis=1, keepdims=True)/m       # compute the bias derivative   计算偏导
    
    # hidden layer
    # 隐藏层
    dZ1 = np.dot(params['W2'].T, dZ2)*(1-np.power(activations['A1'], 2))
    dW1 = np.dot(dZ1, X.T)/m
    db1 = np.sum(dZ1, axis=1,keepdims=True)/m
    
    return {"dW1": dW1, "db1": db1, "dW2": dW2, "db2": db2}

def update_parameters(params, derivatives, alpha = 1.2):           ##########反向传播算法，更新参数############
    # alpha is the model's learning rate
    # 是模型的学习率
    
    params['W1'] = params['W1'] - alpha * derivatives['dW1']
    params['b1'] = params['b1'] - alpha * derivatives['db1']
    
    ####################需要填写，请计算W2和b2更新之后的值
    params['W2'] = params['W2'] - alpha * derivatives['dW2']
    params['b2'] = params['b2'] - alpha * derivatives['db2']
    ####################需要填写，请计算W2和b2更新之后的值
    
    return params

### 2.8 Compile and Train the Model
#       编译和培训模型

Create a function which compiles all the key functions and creates a neural network model. 
# 创建一个编译所有关键函数并创建一个神经网络模型的函数。

In [43]:
def neural_network(X, Y, n_h, num_iterations=100):                      ############编译网络
    n_x = network_architecture(X, Y)[0]
    n_y = network_architecture(X, Y)[2]
    
    params = define_network_parameters(n_x, n_h, n_y)                  ############初始化参数  
    for i in range(0, num_iterations):
        results = forward_propagation(X, params)                       ############前向传播 
        error = compute_error(results['A2'], Y)
        derivatives = backward_propagation(params, results, X, Y)      ############反向传播
        params = update_parameters(params, derivatives)                ############更新参数
    return params

In [44]:
y = Y.values.reshape(1, Y.size)
x = X.T.as_matrix()
model = neural_network(x, y, n_h = 10, num_iterations = 10)            ############生成一个新的网络

  


### 2.9 Predictions 
#       预测

In [45]:
def predict(parameters, X):
    results = forward_propagation(X, parameters)
    print (results['A2'][0])
    predictions = np.around(results['A2'])    
    return predictions

predictions = predict(model, x)
print ('Accuracy: %d' % float((np.dot(y,predictions.T) + np.dot(1-y,1-predictions.T))/float(y.size)*100) + '%')

[0.95650605 0.30714376 0.95650605 ... 0.95650605 0.30714376 0.95650605]
Accuracy: 78%
