# Neural Networks and Deep Learning Notes

Notes and equations from [neuralnetworksanddeeplearning.com](http://neuralnetworksanddeeplearning.com/)

# Chapter 3

Techniques covered:

- Cross-entropy cost function.
- Regularization methods: L1, L2, dropout, artificially expanding training data.
- Weight initialization.
- Heuristics for hyperparameter selection.

## Cross-Entropy Cost Function

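- People tend to learn slowly when their errors are not well-defined, and quickly when they are decisively wrong.
- Artificial neurons can behave the opposite way: with a quadratic (mean squared error) cost, a sigmoid neuron can have more difficulty learning when it is badly wrong (high error) than when it is only a little wrong (low error).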

### Neuron Learning Slowly

- A neuron learns by changing its weight and bias at a rate determined by the partial derivatives of the cost function.
- Slow learning can therefore be attributed to small partial derivatives.
- The partial derivatives of the cost w.r.t. the weight and the bias both contain the derivative of the sigmoid function as a factor.
- When the neuron's output is close to 1 (or close to 0), the sigmoid curve is nearly flat, so its derivative is small. The partial derivatives of the cost are then small too, and the neuron learns slowly.
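
The effect of the flat sigmoid on the gradient can be checked numerically. The sketch below assumes a single sigmoid neuron with the quadratic cost C = (a - y)^2 / 2, input x = 1 and target y = 0. The helper names are made up for illustration. The first starting point (w = 0.6, b = 0.9) matches the values used in the Keras cell in these notes; the second (w = 2.0, b = 2.0) is an assumed "badly wrong" start chosen for comparison.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    # Derivative of the sigmoid: s * (1 - s), largest at z = 0, tiny when |z| is large.
    s = sigmoid(z)
    return s * (1.0 - s)

def quadratic_cost_gradients(w, b, x=1.0, y=0.0):
    """Return (dC/dw, dC/db) for C = (a - y)^2 / 2 with a = sigmoid(w * x + b)."""
    z = w * x + b
    a = sigmoid(z)
    delta = (a - y) * sigmoid_prime(z)  # common factor in both partial derivatives
    return delta * x, delta

# Compare a mildly wrong start with a badly wrong start (the second is an assumption).
for w, b in [(0.6, 0.9), (2.0, 2.0)]:
    a = sigmoid(w * 1.0 + b)
    dw, db = quadratic_cost_gradients(w, b)
    print(f"w={w}, b={b}: output={a:.3f}, dC/dw={dw:.4f}, dC/db={db:.4f}")
```

The second start has the larger error but the smaller gradient, because its output sits on the flat part of the sigmoid.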

In [9]:
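from keras.models import Sequential
from keras.layers import Dense
import numpy as np
import matplotlib.pyplot as plt

# Five copies of the same training example: input 1, desired output 0.
# Keras expects arrays of shape (samples, features).
X = np.ones((5, 1))
y = np.zeros((5, 1))

np.random.seed(7)
model = Sequential()
# Initial weights are passed as [kernel, bias]: the kernel has shape
# (input_dim, units) = (1, 1) and the bias has shape (units,) = (1,).
# The `weights` constructor argument is assumed to be available (Keras 2.x).
model.add(Dense(1, input_dim=1, activation='sigmoid',
                weights=[np.array([[0.6]]), np.array([0.9])]))
model.compile(loss='mean_squared_error', optimizer='sgd',
              metrics=['mean_squared_error'])

# `epochs` replaces the older `nb_epoch` argument (assumes Keras 2 or later).
fit_result = model.fit(X, y, batch_size=2, epochs=100, verbose=0)

# Plot the mean squared error recorded after each epoch.
plt.title('Mean Squared Error')
plt.plot(fit_result.history['mean_squared_error'])
plt.show()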
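
The same single-neuron experiment can also be sketched in plain NumPy, which makes the update rule explicit and avoids depending on a particular Keras version. This is a rough sketch, not the notebook's method. It assumes the quadratic cost C = (a - y)^2 / 2 and full gradient descent on one example (x = 1, y = 0). The function name, the learning rate of 0.15, the 300 epochs and the second starting point (w = 2.0, b = 2.0) are all assumptions chosen for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_neuron(w, b, x=1.0, y=0.0, eta=0.15, epochs=300):
    """Gradient descent on C = (a - y)^2 / 2 for one sigmoid neuron.

    Returns the list of costs measured before each update.
    """
    costs = []
    for _ in range(epochs):
        a = sigmoid(w * x + b)
        costs.append(0.5 * (a - y) ** 2)
        # (a - y) * sigmoid'(z), using sigmoid'(z) = a * (1 - a)
        delta = (a - y) * a * (1.0 - a)
        w -= eta * delta * x
        b -= eta * delta
    return costs

costs_mild = train_neuron(0.6, 0.9)  # starting values used in the Keras cell
costs_bad = train_neuron(2.0, 2.0)   # assumed "badly wrong" start

# Fraction of the initial cost still remaining after 50 epochs.
print("mild start:", costs_mild[50] / costs_mild[0])
print("bad start: ", costs_bad[50] / costs_bad[0])
```

Plotting `costs_mild` and `costs_bad` with `plt.plot` should show the badly wrong start staying on a plateau for many epochs before its cost begins to drop.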