# Chapter 4. Neural Network Learning

Learning means obtaining optimal weight values from the training data by minimizing the value of a loss function.

## 4.1. Learning from data

In the case of image classification:  
Classical machine learning (SVM, KNN, etc.): learns patterns from features extracted from the images. However, the features are still designed by humans.  
Neural network (deep learning): the machine itself learns which features of the images are important and extracts them.

To evaluate generalization ability, we normally split the data into training data and test data: the model learns from the training data only and is then evaluated on the test data. (Generalization ability is the ability to solve problems the machine has never seen before.)
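The split described above can be sketched on toy data as follows. This is a minimal illustration, not the split used later in these notes: the array names (`toy_x`, `toy_t`), the 80/20 ratio, and the random seed are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily for repeatability

# Toy dataset (hypothetical): 10 samples with 2 features each, plus one label per sample
toy_x = np.arange(20).reshape(10, 2)
toy_t = np.arange(10)

# Shuffle the sample indices, then cut them into a training part and a test part
perm = rng.permutation(len(toy_x))
n_train = 8  # assumed 80/20 split
train_idx, test_idx = perm[:n_train], perm[n_train:]

toy_x_train, toy_t_train = toy_x[train_idx], toy_t[train_idx]
toy_x_test, toy_t_test = toy_x[test_idx], toy_t[test_idx]

print(toy_x_train.shape, toy_x_test.shape)
```

The test samples are never used for fitting, so performance on them gives an estimate of how the model behaves on unseen data.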

## 4.2. Loss function

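A neural network finds the optimal weight and bias values by minimizing a loss function. Mean squared error (MSE) and cross entropy error (CEE) are commonly used as the loss function.  
The reason we cannot search for the optimal values using accuracy directly is that the derivative of accuracy with respect to the parameters is 0 at almost every point: a small change in a weight usually leaves the accuracy unchanged, and when the accuracy does change, it jumps discontinuously. A loss function, in contrast, changes continuously, so its gradient indicates how to update the parameters.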
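The point about accuracy can be checked numerically. The sketch below is only an illustration under assumed values: the tiny linear classifier, its weights `W`, the four samples, and the helper names are all hypothetical and do not come from these notes. It nudges one weight by a small amount and compares how accuracy and cross entropy respond.

```python
import numpy as np

def softmax(a):
    a = a - np.max(a, axis=1, keepdims=True)  # shift scores for numerical stability
    e = np.exp(a)
    return e / np.sum(e, axis=1, keepdims=True)

def accuracy(W, x, labels):
    return np.mean(np.argmax(x @ W, axis=1) == labels)

def cross_entropy(W, x, labels):
    y = softmax(x @ W)
    return -np.mean(np.log(y[np.arange(len(labels)), labels] + 1e-7))

# Hypothetical data: 4 samples, 2 features, 2 classes
x = np.array([[1.0, 2.0], [-1.0, 0.5], [0.3, -0.8], [-2.0, -1.0]])
labels = np.array([0, 1, 0, 1])
W = np.array([[0.5, -0.5], [0.2, 0.1]])

# Nudge a single weight by a small amount
W_nudged = W.copy()
W_nudged[0, 0] += 1e-4

acc_change = accuracy(W_nudged, x, labels) - accuracy(W, x, labels)
loss_change = cross_entropy(W_nudged, x, labels) - cross_entropy(W, x, labels)
print(acc_change, loss_change)
```

Here the accuracy does not move at all, so it gives no hint about which direction to update the weight, while the cross entropy shifts slightly and therefore carries a usable gradient.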

- Mean squared error (MSE) for one data point
  
    $E = \frac{1}{2} \sum_k (y_k - t_k)^2$ ($y_k$ is the network output and $t_k$ is the correct label for class $k$)

In [31]:
import numpy as np
def mean_squared_error(y, t):
    return 0.5*np.sum((y-t)**2)

# Example
y = [0.1,0.05,0.1,0.0,0.05,0.1,0.0,0.6,0.0,0.0]
t = [0,0,1,0,0,0,0,0,0,0] #one hot encoding

mean_squared_error(np.array(y), np.array(t))

0.5975
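For comparison, the sketch below evaluates the same function on a hypothetical output (`y_right`, not from the original notes) whose largest value sits at the correct index, next to the output used in the example above, whose largest value sits at index 7.

```python
import numpy as np

def mean_squared_error(y, t):
    return 0.5 * np.sum((y - t) ** 2)

t = np.array([0, 0, 1, 0, 0, 0, 0, 0, 0, 0])  # correct class is index 2

# Hypothetical output whose largest value (0.6) is at the correct index 2
y_right = np.array([0.1, 0.05, 0.6, 0.0, 0.05, 0.1, 0.0, 0.1, 0.0, 0.0])
# Output from the example above: largest value (0.6) is at the wrong index 7
y_wrong = np.array([0.1, 0.05, 0.1, 0.0, 0.05, 0.1, 0.0, 0.6, 0.0, 0.0])

mse_right = mean_squared_error(y_right, t)  # roughly 0.0975
mse_wrong = mean_squared_error(y_wrong, t)  # roughly 0.5975
print(mse_right, mse_wrong)
```

The output that favors the correct class gives the smaller error, which is the behavior a loss function needs.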

- Cross entropy error (CEE) for one data point
  
    $E = -\sum_k t_k \log y_k$ ($\log$ is the natural logarithm, $\log_e$)

In [34]:
def cross_entropy_error(y, t):
    delta = 1e-7  # very tiny value
    # Adding delta keeps np.log from receiving 0, which would give -inf
    return -np.sum(t * np.log(y + delta))

# Example
y = [0.1,0.05,0.1,0.0,0.05,0.1,0.0,0.6,0.0,0.0]
t = [0,0,1,0,0,0,0,0,0,0] #one hot encoding

cross_entropy_error(np.array(y),np.array(t))

2.302584092994546
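Because $t$ is one-hot, only the term for the correct class survives in the sum, so the cross entropy error reduces to $-\log$ of the output assigned to the correct class. The sketch below illustrates this with two assumed outputs for the correct class, 0.6 and 0.1; the helper name `cee_single` is hypothetical.

```python
import numpy as np

def cee_single(y, t, delta=1e-7):
    # Cross entropy for one sample with a one-hot target t
    return -np.sum(t * np.log(y + delta))

t = np.array([0, 0, 1, 0, 0, 0, 0, 0, 0, 0])  # correct class is index 2

# Assumed outputs: 0.6 on the correct class versus 0.1 on the correct class
y_confident = np.array([0.1, 0.05, 0.6, 0.0, 0.05, 0.1, 0.0, 0.1, 0.0, 0.0])
y_unsure = np.array([0.1, 0.05, 0.1, 0.0, 0.05, 0.1, 0.0, 0.6, 0.0, 0.0])

loss_confident = cee_single(y_confident, t)  # close to -log(0.6)
loss_unsure = cee_single(y_unsure, t)        # close to -log(0.1)
print(loss_confident, loss_unsure)
```

The lower the output for the correct class, the larger the error; the outputs for the other classes do not enter the result.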

The formulas above are for a single data point. The loss function over the whole dataset is given below.

- Cross entropy error (CEE) for the whole data  
    $E = -\frac{1}{N}\sum_n\sum_k t_{nk} \log y_{nk}$  
    ($\log$ is the natural logarithm ($\log_e$), $N$ is the number of data points, and $t_{nk}$ is the $k^{th}$ value of the $n^{th}$ data point)

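- Mini-batch  
    Instead of computing the loss over all the data, the network is trained at each step on a small, randomly selected subset of the training data (a mini-batch), and the loss on that subset is used as an approximation of the loss on the whole data.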

In [1]:
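import numpy as np
from sklearn.datasets import fetch_openml

# as_frame=False returns NumPy arrays, so the integer-array indexing used below works
mnist = fetch_openml('mnist_784', as_frame=False)

X = mnist['data']
T = mnist['target'].astype(int)  # labels are loaded as strings; convert them to integers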

x_train = X[:60000]
t_train = T[:60000]
x_test = X[60000:]
t_test = T[60000:]

print(x_train.shape)
print(t_train.shape)



(60000, 784)
(60000,)


In [2]:
train_size = x_train.shape[0]
batch_size = 10
# Pick batch_size random row indices from 0 .. train_size-1
batch_mask = np.random.choice(train_size, batch_size)
x_batch = x_train[batch_mask]
t_batch = t_train[batch_mask]
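Note that `np.random.choice` samples with replacement by default, so the same index can occasionally appear twice in one mini-batch. If distinct samples are wanted, one option is NumPy's `Generator.choice` with `replace=False`. The sketch below runs on toy arrays (`toy_x` and `toy_t` are hypothetical stand-ins for the training data, and the seed is arbitrary).

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed for repeatability

# Toy stand-ins for the training data: 100 samples, 4 features, labels 0..9
toy_x = np.arange(100 * 4).reshape(100, 4)
toy_t = np.arange(100) % 10

batch_size = 10
# replace=False guarantees that the selected indices are all different
batch_mask = rng.choice(toy_x.shape[0], size=batch_size, replace=False)
x_batch = toy_x[batch_mask]
t_batch = toy_t[batch_mask]

print(x_batch.shape, t_batch.shape)
```

For a batch of 10 out of 60000 samples, duplicates are rare either way, so this is a matter of preference rather than a correction.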

- Cross entropy error (CEE) for mini-batch

In [3]:
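def cross_entropy_error_minibatch(y, t):  # t is one-hot encoded
    if y.ndim == 1:  # a single sample: reshape to 2-D so the batch code applies
        y = y.reshape(1, y.size)
        t = t.reshape(1, t.size)

    delta = 1e-7  # small constant that keeps np.log from receiving 0
    batch_size = y.shape[0]
    # Sum over all samples and classes, then average over the batch
    return -np.sum(t * np.log(y + delta)) / batch_size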

In [4]:
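def cross_entropy_error_minibatch_1(y, t):  # t holds integer class labels (not one-hot encoding)
    if y.ndim == 1:  # a single sample: reshape to 2-D so the batch code applies
        y = y.reshape(1, y.size)
        t = t.reshape(1, t.size)

    delta = 1e-7  # small constant that keeps np.log from receiving 0
    batch_size = y.shape[0]
    # y[np.arange(batch_size), t] picks, for each sample, the output of its correct class
    return -np.sum(np.log(y[np.arange(batch_size), t] + delta)) / batch_size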
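The one-hot version and the integer-label version should give the same value for the same data. The sketch below checks this on a made-up batch of two samples and three classes; the function names `cee_one_hot` and `cee_labels` are hypothetical and restate the two formulas in compact form so that the block runs on its own.

```python
import numpy as np

def cee_one_hot(y, t, delta=1e-7):
    # t is one-hot encoded, shape (batch, classes)
    return -np.sum(t * np.log(y + delta)) / y.shape[0]

def cee_labels(y, t, delta=1e-7):
    # t holds integer class labels, shape (batch,)
    n = y.shape[0]
    return -np.sum(np.log(y[np.arange(n), t] + delta)) / n

# Made-up network outputs for 2 samples and 3 classes
y = np.array([[0.1, 0.7, 0.2],
              [0.8, 0.1, 0.1]])
labels = np.array([1, 0])      # correct classes as integers
one_hot = np.eye(3)[labels]    # the same targets, one-hot encoded

loss_a = cee_one_hot(y, one_hot)
loss_b = cee_labels(y, labels)
print(loss_a, loss_b)  # both are roughly -(log 0.7 + log 0.8) / 2
```

The label version avoids multiplying by all the zeros in the one-hot target and simply indexes the outputs that matter.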
