# Reviewing 彭亮 (Peng Liang)'s network 1-3 code

## network.py

### The neural network implemented in network.py is fairly simple. Using it takes only two steps: Network([784,30,10]) and SGD. Let's follow them in order:

### 1. Constructing the network

In [2]:
import numpy as np

In [3]:
def __init__(self, sizes):
    """The list ``sizes`` contains the number of neurons in the
    respective layers of the network.  For example, if the list
    was [2, 3, 1] then it would be a three-layer network, with the
    first layer containing 2 neurons, the second layer 3 neurons,
    and the third layer 1 neuron.  The biases and weights for the
    network are initialized randomly, using a Gaussian
    distribution with mean 0, and variance 1.  Note that the first
    layer is assumed to be an input layer, and by convention we
    won't set any biases for those neurons, since biases are only
    ever used in computing the outputs from later layers."""
    self.num_layers = len(sizes)
    self.sizes = sizes
    self.biases = [np.random.randn(y, 1) for y in sizes[1:]] ## every layer except the input layer needs a bias vector
    self.weights = [np.random.randn(y, x)
                    for x, y in zip(sizes[:-1], sizes[1:])]  ## for [784,30,10] this gives matrices of shape (30, 784) and (10, 30)

In [4]:
help(np.random.randn)

Help on built-in function randn:

randn(...) method of mtrand.RandomState instance
    randn(d0, d1, ..., dn)
    
    Return a sample (or samples) from the "standard normal" distribution.
    
    If positive, int_like or int-convertible arguments are provided,
    `randn` generates an array of shape ``(d0, d1, ..., dn)``, filled
    with random floats sampled from a univariate "normal" (Gaussian)
    distribution of mean 0 and variance 1 (if any of the :math:`d_i` are
    floats, they are first converted to integers by truncation). A single
    float randomly sampled from the distribution is returned if no
    argument is provided.
    
    This is a convenience function.  If you want an interface that takes a
    tuple as the first argument, use `numpy.random.standard_normal` instead.
    
    Parameters
    ----------
    d0, d1, ..., dn : int, optional
        The dimensions of the returned array, should be all positive.
        If no argument is given a single Python float is ret

In [5]:
np.random.randn(1,3)

array([[ 1.51348912, -1.21852393, -0.56293872]])

In [6]:
np.random.randn(2,2)

array([[-0.65213249, -0.60574028],
       [-0.6679294 , -0.20503991]])

### The conclusion is clear: np.random.randn(x,y) generates an array with x rows and y columns whose values are drawn from the standard normal distribution
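To tie this back to the constructor, the following sketch builds the same two lists for `sizes = [784, 30, 10]` outside of the class and prints their shapes. The variable names mirror `__init__` above; nothing beyond NumPy is assumed.

```python
import numpy as np

sizes = [784, 30, 10]

# One bias column vector per non-input layer.
biases = [np.random.randn(y, 1) for y in sizes[1:]]
# One weight matrix per pair of adjacent layers, shaped (next_layer, previous_layer).
weights = [np.random.randn(y, x) for x, y in zip(sizes[:-1], sizes[1:])]

bias_shapes = [b.shape for b in biases]
weight_shapes = [w.shape for w in weights]
print(bias_shapes)    # [(30, 1), (10, 1)]
print(weight_shapes)  # [(30, 784), (10, 30)]
```

The (next, previous) ordering is what lets `np.dot(w, activation)` map a (784, 1) input to a (30, 1) hidden activation.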

### 2. Loading the data

#### - The original code imported the cPickle library; Python 3 no longer has it (use pickle instead) and the usage changed slightly. See:
#### https://blog.csdn.net/lanqiu5ge/article/details/25136909

In [7]:
import pickle
import gzip
## gzip ships with Python; its usage also differs between Python 2 and Python 3, see the previous .ipy notebook for details

In [8]:
f = gzip.open('E:/Git-repository/neural-networks-and-deep-learning/data/mnist.pkl.gz', 'rb')
tr_d, va_d, te_d = pickle.load(f,encoding='latin1')
f.close()

#### Differences in using the cPickle package between Python 2 and Python 3: https://blog.csdn.net/xiaojiajia007/article/details/53707180
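The open/load/close sequence can also be written with a `with` block so the file is closed even if unpickling fails. The sketch below is a stand-in, not the real loader: `fake_data` and the temporary file name are hypothetical, used only so the example runs without `mnist.pkl.gz`.

```python
import gzip
import os
import pickle
import tempfile

# Hypothetical stand-in for mnist.pkl.gz: three small (inputs, labels) tuples.
fake_data = (([0.0, 1.0], [3]), ([0.5], [7]), ([0.25], [1]))

path = os.path.join(tempfile.mkdtemp(), "fake_mnist.pkl.gz")
with gzip.open(path, "wb") as f:
    pickle.dump(fake_data, f)

# The with block closes the file even if loading raises.
# encoding="latin1" matters for pickles written by Python 2
# (such as the original mnist.pkl.gz); it is harmless here.
with gzip.open(path, "rb") as f:
    tr_d, va_d, te_d = pickle.load(f, encoding="latin1")

print(tr_d)  # ([0.0, 1.0], [3])
```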

In [9]:
print(type(tr_d))
print(len(tr_d))
print(type(tr_d[0]))
print(type(tr_d[1]))
## tr_d is a tuple made of two numpy.ndarray objects
print(len(tr_d[0]))
print(len(tr_d[1]))
print(type(tr_d[0][0]))
print(type(tr_d[1][0]))
## tr_d[0] holds 50000 numpy.ndarray items; tr_d[1] holds 50000 numpy.int64 values

<class 'tuple'>
2
<class 'numpy.ndarray'>
<class 'numpy.ndarray'>
50000
50000
<class 'numpy.ndarray'>
<class 'numpy.int64'>


In [10]:
training_inputs = [np.reshape(x, (784, 1)) for x in tr_d[0]]

In [11]:
print(tr_d[0][0].shape) ## shape is a tuple; (784,) means a 1-D array with 784 elements

(784,)


In [12]:
temp = np.reshape(tr_d[0][0],(784,1))## reshape the 1-D array of shape (784,) into a (784, 1) column vector
print(temp.shape)
print(type(temp))

(784, 1)
<class 'numpy.ndarray'>


In [13]:
print(tr_d[0][0][:10])
print(type(temp))
print(temp[:10])

[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
<class 'numpy.ndarray'>
[[0.]
 [0.]
 [0.]
 [0.]
 [0.]
 [0.]
 [0.]
 [0.]
 [0.]
 [0.]]


In [14]:
def vectorized_result(j):
    """Return a 10-dimensional unit vector with a 1.0 in the jth
    position and zeroes elsewhere.  This is used to convert a digit
    (0...9) into a corresponding desired output from the neural
    network."""
    e = np.zeros((10, 1)) ## a 10-row, 1-column vector of zeros
    e[j] = 1.0            ## set the given position to 1
    return e

training_results = [vectorized_result(y) for y in tr_d[1]]
training_data = list(zip(training_inputs, training_results))

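#### - Because of a difference between Python 2 and Python 3 (zip returns an iterator in Python 3), wrap the result in list() to materialize the data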
#### https://blog.csdn.net/u012509485/article/details/78203784
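The need for `list()` can be seen with a toy example. In Python 3, `zip` returns a one-shot iterator, so a second pass over it yields nothing, and `len()` or `random.shuffle` cannot be applied to it directly. The `inputs` and `labels` values below are placeholders, not MNIST data.

```python
inputs = ["x0", "x1", "x2"]
labels = [5, 0, 4]

pairs = zip(inputs, labels)      # an iterator in Python 3
first_pass = list(pairs)         # consumes the iterator
second_pass = list(pairs)        # nothing left

print(first_pass)   # [('x0', 5), ('x1', 0), ('x2', 4)]
print(second_pass)  # []

training_data = list(zip(inputs, labels))
print(len(training_data))  # 3; len() on the bare zip object would raise TypeError
```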

### 3. Training the network

In [15]:
import random

In [16]:
temp = [1,2,3,4,5,6,7]
random.shuffle(temp)  ## shuffle randomly reorders the list in place
print(temp)

[2, 3, 5, 1, 4, 7, 6]
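The SGD loop itself is not reproduced in this notebook. A common pattern, sketched here under that assumption, is to shuffle the training data once per epoch and then cut it into consecutive slices that are passed to `update_mini_batch`. The integer list is a hypothetical stand-in for the `(x, y)` pairs.

```python
import random

training_data = list(range(10))   # hypothetical stand-in for the (x, y) pairs
mini_batch_size = 4

random.shuffle(training_data)     # in place; returns None
mini_batches = [training_data[k:k + mini_batch_size]
                for k in range(0, len(training_data), mini_batch_size)]

print([len(batch) for batch in mini_batches])  # [4, 4, 2]
```

The last batch is shorter when the data size is not a multiple of the batch size, which is why `update_mini_batch` divides by `len(mini_batch)` rather than by a fixed size.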


In [20]:
def update_mini_batch(self, mini_batch, eta):
    """Update the network's weights and biases by applying
    gradient descent using backpropagation to a single mini batch.
    The ``mini_batch`` is a list of tuples ``(x, y)``, and ``eta``
    is the learning rate."""
    nabla_b = [np.zeros(b.shape) for b in self.biases]## same shapes as the biases: (30, 1) and (10, 1)
    nabla_w = [np.zeros(w.shape) for w in self.weights]## same shapes as the weights: (30, 784) and (10, 30)
    for x, y in mini_batch:## mini_batch is a slice of training_data with the same structure: x is (784, 1) and y is (10, 1)
        delta_nabla_b, delta_nabla_w = self.backprop(x, y)
        nabla_b = [nb+dnb for nb, dnb in zip(nabla_b, delta_nabla_b)]
        nabla_w = [nw+dnw for nw, dnw in zip(nabla_w, delta_nabla_w)]
    self.weights = [w-(eta/len(mini_batch))*nw
                   for w, nw in zip(self.weights, nabla_w)]
    self.biases = [b-(eta/len(mini_batch))*nb
                   for b, nb in zip(self.biases, nabla_b)]
    
def backprop(self, x, y):
    """Return a tuple ``(nabla_b, nabla_w)`` representing the
    gradient for the cost function C_x.  ``nabla_b`` and
    ``nabla_w`` are layer-by-layer lists of numpy arrays, similar
    to ``self.biases`` and ``self.weights``."""
    nabla_b = [np.zeros(b.shape) for b in self.biases]
    nabla_w = [np.zeros(w.shape) for w in self.weights]
    # feedforward
    activation = x
    activations = [x] # list to store all the activations, layer by layer
    # x is a column vector with 784 rows and 1 column, treated as the output of the input layer
    zs = [] # list to store all the z vectors, layer by layer
    for b, w in zip(self.biases, self.weights):
        z = np.dot(w, activation)+b ## matrix product plus bias; z is (30, 1) for the hidden layer, then (10, 1) for the output layer
        zs.append(z)
        activation = sigmoid(z)
        activations.append(activation)# likewise (30, 1), then (10, 1)
    # backward pass
    ## delta is a (10, 1) column vector: the error of the last layer.
    ## sigmoid_prime is the derivative, sigmoid(z)*(1-sigmoid(z)).
    delta = self.cost_derivative(activations[-1], y) * \
        sigmoid_prime(zs[-1])

    nabla_b[-1] = delta  ## a (10, 1) vector, same shape as the last bias
    nabla_w[-1] = np.dot(delta, activations[-2].transpose())## delta (10, 1) times the transposed hidden-layer output (1, 30): the (10, 30) gradient for the last weights, since dz/dw is the hidden activation
    # Note that the variable l in the loop below is used a little
    # differently to the notation in Chapter 2 of the book.  Here,
    # l = 1 means the last layer of neurons, l = 2 is the
    # second-last layer, and so on.  It's a renumbering of the
    # scheme in the book, used here to take advantage of the fact
    # that Python can use negative indices in lists.
    for l in range(2, self.num_layers):## walk backward starting from the second-to-last layer, using negative indices
        z = zs[-l]
        sp = sigmoid_prime(z)
        delta = np.dot(self.weights[-l+1].transpose(), delta) * sp
        nabla_b[-l] = delta
        nabla_w[-l] = np.dot(delta, activations[-l-1].transpose())
    return (nabla_b, nabla_w)
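One way to gain confidence in the backward pass is a finite-difference check on a tiny network. The sketch below restates the same steps as free functions so it runs on its own. It assumes a quadratic cost, so that `cost_derivative` is `a - y` (that method is not shown in this notebook), and the `[3, 4, 2]` network, the seed, and the helper name `forward_cost` are made up for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)

def forward_cost(weights, biases, x, y):
    """Quadratic cost 0.5 * ||a - y||^2 for one example (assumed cost)."""
    a = x
    for b, w in zip(biases, weights):
        a = sigmoid(np.dot(w, a) + b)
    return 0.5 * np.sum((a - y) ** 2)

def backprop(weights, biases, x, y):
    """Same steps as the method above, written as a free function."""
    activation, activations, zs = x, [x], []
    for b, w in zip(biases, weights):
        z = np.dot(w, activation) + b
        zs.append(z)
        activation = sigmoid(z)
        activations.append(activation)
    nabla_b = [np.zeros(b.shape) for b in biases]
    nabla_w = [np.zeros(w.shape) for w in weights]
    delta = (activations[-1] - y) * sigmoid_prime(zs[-1])
    nabla_b[-1] = delta
    nabla_w[-1] = np.dot(delta, activations[-2].T)
    for l in range(2, len(weights) + 1):
        delta = np.dot(weights[-l + 1].T, delta) * sigmoid_prime(zs[-l])
        nabla_b[-l] = delta
        nabla_w[-l] = np.dot(delta, activations[-l - 1].T)
    return nabla_b, nabla_w

rng = np.random.default_rng(0)
sizes = [3, 4, 2]
biases = [rng.standard_normal((n, 1)) for n in sizes[1:]]
weights = [rng.standard_normal((n, m)) for m, n in zip(sizes[:-1], sizes[1:])]
x = rng.standard_normal((3, 1))
y = np.array([[1.0], [0.0]])

nabla_b, nabla_w = backprop(weights, biases, x, y)

# Central finite difference on a single first-layer weight.
eps = 1e-6
i, j = 1, 2
weights[0][i, j] += eps
cost_plus = forward_cost(weights, biases, x, y)
weights[0][i, j] -= 2 * eps
cost_minus = forward_cost(weights, biases, x, y)
weights[0][i, j] += eps  # restore the original value
numeric = (cost_plus - cost_minus) / (2 * eps)

gap = abs(numeric - nabla_w[0][i, j])
print(gap < 1e-7)
```

If the analytic and numeric values disagree by more than rounding error, the usual suspects are a wrong index in the backward loop or a missing transpose.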

In [21]:
## transpose() should mean taking the transpose, i.e. a matrix operation; so what type is z, exactly?
temp = np.random.randn(1,3)
print(type(temp))
print(temp)
print(temp.transpose())
print(type(training_data[0][0][:10]))

<class 'numpy.ndarray'>
[[-1.96191778  1.16883108  0.37254797]]
[[-1.96191778]
 [ 1.16883108]
 [ 0.37254797]]
<class 'numpy.ndarray'>


In [25]:
## Check how numpy.ndarray behaves under dot products and function application
## 1. Create a numpy.ndarray; more ways to do this can be found by searching (e.g. Baidu)
data1 = [1,2,3]
data_1 = np.array(data1)
print(type(data_1))
print(data_1.shape)
data1_ = np.reshape(data_1,(3,1))
print(type(data1_))
print(data1_.shape)
data1_

<class 'numpy.ndarray'>
(3,)
<class 'numpy.ndarray'>
(3, 1)


array([[1],
       [2],
       [3]])

In [29]:
## 2. The dot product
d1 = [[1,2,3],[4,5,6],[7,8,9]]
d_1 = np.array(d1)
print(d_1)
dot_1 = np.dot(d_1,data1_)
## as shown, this is the same as matrix multiplication
dot_1

[[1 2 3]
 [4 5 6]
 [7 8 9]]


array([[14],
       [32],
       [50]])

In [34]:
## 3. Applying a function, and the ordinary * (multiplication)
def sigmoid(z):
    return 1.0/(1.0+np.exp(-z))
s_1 = sigmoid(d_1)
print(s_1)
c_1 = dot_1* data1_
print(c_1)
s_2 = d_1 + data1_
print(s_2)
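## Summary
# Ordinary array operations (functions, addition, multiplication) are applied to each element separately
# Watch out for the case of a 3x3 matrix plus a 3x1 matrix: the column is broadcast across every column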

[[0.73105858 0.88079708 0.95257413]
 [0.98201379 0.99330715 0.99752738]
 [0.99908895 0.99966465 0.99987661]]
[[ 14]
 [ 64]
 [150]]
[[ 2  3  4]
 [ 6  7  8]
 [10 11 12]]
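The 3x3 plus 3x1 case is NumPy broadcasting. A short sketch, reusing the same numbers, contrasts adding a (3, 1) column with adding a 1-D array of shape (3,), and shows the pitfall of mixing a 1-D array with a column vector. The variable names are illustrative.

```python
import numpy as np

d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
col = np.array([[1], [2], [3]])   # shape (3, 1)
row = np.array([1, 2, 3])         # shape (3,)

plus_col = d + col   # col is stretched across columns: row i gets col[i] added
plus_row = d + row   # row is stretched down rows: column j gets row[j] added

print(plus_col)  # rows: [2 3 4], [6 7 8], [10 11 12]
print(plus_row)  # rows: [2 4 6], [5 7 9], [8 10 12]

# Pitfall: a 1-D array plus a column vector silently becomes a full matrix.
mixed_shape = (row + col).shape
print(mixed_shape)  # (3, 3)
```

This is one reason the inputs are reshaped to (784, 1) before training: keeping every activation and bias as an explicit column vector avoids accidental (n, n) results.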


## network2.py

In [1]:
%lsmagic

Available line magics:
%alias  %alias_magic  %autocall  %automagic  %autosave  %bookmark  %cd  %clear  %cls  %colors  %config  %connect_info  %copy  %ddir  %debug  %dhist  %dirs  %doctest_mode  %echo  %ed  %edit  %env  %gui  %hist  %history  %killbgscripts  %ldir  %less  %load  %load_ext  %loadpy  %logoff  %logon  %logstart  %logstate  %logstop  %ls  %lsmagic  %macro  %magic  %matplotlib  %mkdir  %more  %notebook  %page  %pastebin  %pdb  %pdef  %pdoc  %pfile  %pinfo  %pinfo2  %popd  %pprint  %precision  %profile  %prun  %psearch  %psource  %pushd  %pwd  %pycat  %pylab  %qtconsole  %quickref  %recall  %rehashx  %reload_ext  %ren  %rep  %rerun  %reset  %reset_selective  %rmdir  %run  %save  %sc  %set_env  %store  %sx  %system  %tb  %time  %timeit  %unalias  %unload_ext  %who  %who_ls  %whos  %xdel  %xmode

Available cell magics:
%%!  %%HTML  %%SVG  %%bash  %%capture  %%cmd  %%debug  %%file  %%html  %%javascript  %%js  %%latex  %%perl  %%prun  %%pypy  %%python  %%python2  %%python3  %%rub

In [1]:
from src import network2
from src import mnist_loader

In [3]:
## Loading the data and reshaping are the same as before, so skip them; first look at how the network constructor differs:
net = network2.Network([784,30,10],cost=network2.CrossEntropyCost)

#### 2-1 Dissecting the constructor. Question: does dividing by np.sqrt(30) really shrink the variance?

In [10]:
## The constructor first calls self.default_weight_initializer() to initialize the weights
### np.sqrt
import numpy as np
x = [1,2,3,4]
x1 = np.array(x)
print(x1,np.sqrt(x1)) ## np.sqrt(x): square root of each element of the array
x = [[1,2],[1,1],[2,2]]
x1 = np.array(x)
print(x1)
print(np.sqrt(x1),np.sqrt(30))
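# Summary: dividing standard normal samples by sqrt(30) compresses the x-axis by that factor, so the result is a normal distribution with standard deviation 1/sqrt(30), i.e. variance 1/30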

[1 2 3 4] [1.         1.41421356 1.73205081 2.        ]
[[1 2]
 [1 1]
 [2 2]]
[[1.         1.41421356]
 [1.         1.        ]
 [1.41421356 1.41421356]] 5.477225575051661
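The scaling claim follows from Var(X/c) = Var(X)/c², and it can be checked empirically. The sketch below draws standard normal samples, divides them by `np.sqrt(30)`, and compares standard deviations. The sample size and seed are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
samples = rng.standard_normal(100_000)
scaled = samples / np.sqrt(30)

print(samples.std())   # close to 1
print(scaled.std())    # close to 1/sqrt(30), about 0.18

# The ratio of the two is sqrt(30) up to floating-point rounding,
# because dividing every sample by c divides the standard deviation by c.
ratio = samples.std() / scaled.std()
print(ratio)
```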


#### 2-2 network2.py has save and load functions for persisting the parameters, which is very useful; the data format is JSON, so let's explore it

In [2]:
import json

In [7]:
d_dict = {"first":[1,2,3],"second":(4,5,6)}
print(type(d_dict))
with open("test.json","w") as f1:
    print(type(f1))
    json.dump(d_dict,f1)

<class 'dict'>
<class '_io.TextIOWrapper'>


In [9]:
with open("test.json","r") as f1:
    temp = json.load(f1)
    print(type(temp))
    print(temp)
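### Summary: JSON is just a data storage format, not much to say (note that the tuple came back as a list); I have not yet explored what its advantages are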

<class 'dict'>
{'second': [4, 5, 6], 'first': [1, 2, 3]}
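One detail matters when saving network parameters this way: `json` cannot serialize a `numpy.ndarray` directly, so arrays have to be converted to nested lists first and rebuilt after loading. The sketch below shows that round trip in memory. The dictionary keys and the small weight values are hypothetical and are not claimed to match the exact layout that `network2.save` writes.

```python
import json
import numpy as np

weights = [np.array([[0.5, -1.0], [2.0, 0.25]]), np.array([[1.5, 0.0]])]

# ndarray is not JSON-serializable, so convert to nested lists first.
payload = json.dumps({"sizes": [2, 2, 1],
                      "weights": [w.tolist() for w in weights]})

restored = json.loads(payload)
restored_weights = [np.array(w) for w in restored["weights"]]

print(restored_weights[0].shape)  # (2, 2)
print(all(np.array_equal(a, b) for a, b in zip(weights, restored_weights)))  # True
```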


#### Test whether it converges faster, as the video says, then try saving the parameters so training can continue next time

In [1]:
from src import mnist_loader
from src import network2

tra_d,val_d,tst_d = mnist_loader.load_data_wrapper()
net1 = network2.Network([784,30,10],cost=network2.CrossEntropyCost)## note the usage here: the cost class itself is passed in
net1.SGD(tra_d,20,10,1.0,lmbda=5.0,evaluation_data=val_d,monitor_evaluation_accuracy = True)
net1.save("traing_reslut.json")

Epoch 0 training complete
Accuracy on evaluation data: 9243 / 10000
Epoch 1 training complete
Accuracy on evaluation data: 9445 / 10000
Epoch 2 training complete
Accuracy on evaluation data: 9401 / 10000
Epoch 3 training complete
Accuracy on evaluation data: 9474 / 10000
Epoch 4 training complete
Accuracy on evaluation data: 9507 / 10000
Epoch 5 training complete
Accuracy on evaluation data: 9479 / 10000
Epoch 6 training complete
Accuracy on evaluation data: 9507 / 10000
Epoch 7 training complete
Accuracy on evaluation data: 9508 / 10000
Epoch 8 training complete
Accuracy on evaluation data: 9546 / 10000
Epoch 9 training complete
Accuracy on evaluation data: 9457 / 10000
Epoch 10 training complete
Accuracy on evaluation data: 9553 / 10000
Epoch 11 training complete
Accuracy on evaluation data: 9474 / 10000
Epoch 12 training complete
Accuracy on evaluation data: 9504 / 10000
Epoch 13 training complete
Accuracy on evaluation data: 9501 / 10000
Epoch 14 training complete
Accuracy on evalu

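#### This network does seem to converge a bit faster than the one in network.py; once the parameters are saved, a trained network can be recreated directly from them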

In [None]:
net2 = network2.load("traing_reslut.json")

#### Something to try when there is time: use the trained parameters to predict directly. Not today.
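For reference, predicting with a set of parameters amounts to a forward pass followed by picking the index of the largest output. The sketch below uses randomly initialized weights and a random input as placeholders for a loaded network and an MNIST image, so the predicted digit is meaningless; only the mechanics are illustrated. The free function `feedforward` is written here for the example and is an assumption about how the forward pass works, based on the loop in `backprop`.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def feedforward(weights, biases, a):
    """Propagate a column vector through every layer."""
    for b, w in zip(biases, weights):
        a = sigmoid(np.dot(w, a) + b)
    return a

rng = np.random.default_rng(0)
sizes = [784, 30, 10]
# Placeholder parameters; a real run would use the trained values.
biases = [rng.standard_normal((n, 1)) for n in sizes[1:]]
weights = [rng.standard_normal((n, m)) for m, n in zip(sizes[:-1], sizes[1:])]

x = rng.random((784, 1))                  # placeholder for one image
output = feedforward(weights, biases, x)
predicted_digit = int(np.argmax(output))  # index of the strongest output neuron

print(output.shape)  # (10, 1)
print(predicted_digit)
```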

## network3.py

In [1]:
from src import network3



ImportError: cannot import name 'downsample'