# Testing the feedforward network

`FeedForward.py` contains a basic feedforward neural network structure. Here I'm just going to test it.

Some imports

In [1]:
import numpy as np

## Test if it loads

In [2]:
import FeedForward as FF

Well that's a win.

## Testing addition

Train an ultra simple network that just needs to do one thing: add two numbers. 

Really, this is what a single neuron does anyway: it computes a weighted sum of its inputs plus a bias. So you can get by with one single node where both weights are 1, the bias is 0, and the activation function is linear.
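
To make that concrete, here is a minimal NumPy sketch of such a node, independent of `FeedForward.py`. The names `weights`, `bias` and `output` are just for illustration, and the weight shape `(1, 2)` is an assumption chosen to mirror the layer printed further down.

```python
import numpy as np

# One linear node: both weights are 1, the bias is 0.
weights = np.array([[1.0, 1.0]])   # shape (1, 2): one node, two inputs
bias = np.array([[0.0]])

# A few input pairs to add.
X = np.array([[2, 23], [19, 13], [6, 2]])

# Linear activation: the output is just the weighted sum plus the bias.
output = (X @ weights.T + bias).ravel()
print(output)  # values: 25, 32, 8
```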

### Create data for addition

In [3]:
def create_addition_data(
    n_samples,
    max_value,
    random_state_init,
):

    rng = np.random.default_rng(random_state_init)

    # random integers in [0, max_value)
    X = rng.integers(max_value, size=(n_samples,2))
    y = X[:,0] + X[:,1]
    
    return X,y

In [4]:
X,y = create_addition_data(n_samples=5000, max_value=30, random_state_init=42)

for (a,b),c in zip(X[:5],y[:5]):
    print(f'{a} + {b} = {c}')

2 + 23 = 25
19 + 13 = 32
12 + 25 = 37
2 + 20 = 22
6 + 2 = 8


In [5]:
mininetwork = FF.Network()

In [6]:
mininetwork.add(FF.Layer(1, n_inputs=2, activation_fn='linear'))

initialising layer 0


In [7]:
print(f'weights: {mininetwork.layers[0].weights}')
print(f'biases:  {mininetwork.layers[0].biases}')

weights: [[-0.36792657  0.99124049]]
biases:  [[-0.78130533]]


In [8]:
mininetwork.train(
    X, y, 
    n_epochs=5,
    batch_size=100,
    learning_rate=0.001,
    objective_fn=FF.L2,
    random_seed=42,
    verbosity=1,
    test_data=(X[:5],y[:5])
)

5000 training samples
Training epoch 1/5
Epoch 1/5: 0.2327
Training epoch 2/5
Epoch 2/5: 0.2112
Training epoch 3/5
Epoch 3/5: 0.2104
Training epoch 4/5
Epoch 4/5: 0.1955
Training epoch 5/5
Epoch 5/5: 0.2109


### Test the network with new data

In [9]:
Xt,yt = create_addition_data(n_samples=10, max_value=50, random_state_init=15)

y_pred = mininetwork.compute_forward(Xt)
for yp, ytt in zip(y_pred, yt):
    print(f'predicted {yp:.2f}, should be {ytt} (diff {ytt-yp:.2f})')

predicted 80.92, should be 80 (diff -0.92)
predicted 75.81, should be 75 (diff -0.81)
predicted 28.88, should be 29 (diff 0.12)
predicted 11.53, should be 12 (diff 0.47)
predicted 50.30, should be 50 (diff -0.30)
predicted 55.42, should be 55 (diff -0.42)
predicted 85.00, should be 84 (diff -1.00)
predicted 40.10, should be 40 (diff -0.10)
predicted 56.43, should be 56 (diff -0.43)
predicted 61.52, should be 61 (diff -0.52)
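
To put a number on how far off these predictions are, the printed values can be summarised with a mean signed difference and a mean absolute difference. This is only a sketch: the arrays below are copied by hand from the output above, and `predicted` and `expected` are illustrative names, not part of `FeedForward.py`.

```python
import numpy as np

# Values copied from the printed test output.
predicted = np.array([80.92, 75.81, 28.88, 11.53, 50.30,
                      55.42, 85.00, 40.10, 56.43, 61.52])
expected = np.array([80, 75, 29, 12, 50, 55, 84, 40, 56, 61])

# Same sign convention as the printout: should-be minus predicted.
residuals = expected - predicted

# A mean far from zero points to a systematic offset rather than noise.
print(f'mean diff:          {residuals.mean():.2f}')
print(f'mean absolute diff: {np.abs(residuals).mean():.2f}')
```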


### Check the weights & bias

The weights should be ~1 and the bias should be ~0.

In [10]:
mininetwork.layers[0].weights

array([[1.02056702, 1.02004817]])

In [11]:
mininetwork.layers[0].biases

array([[-0.71161231]])

Yeah! That seems to work!

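Interesting, though: the bias isn't particularly good. At about -0.71 it has barely moved from its initial value of -0.78, while both weights ended up slightly above 1. It kind of shows in the test data above, where most predictions come out a little too high. So there does seem to be a... well... bias.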
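
One possible reason the bias lags behind, sketched under assumptions: the loop below is plain mini-batch gradient descent on a squared-error loss for a single linear neuron, written directly in NumPy. It is not the code in `FeedForward.py`, and all variable names are made up for illustration; only the starting weights and bias are copied from the printout above. The inputs go up to 29 while the bias only ever sees a constant input of 1, so the weight gradients are much larger than the bias gradient. The weights settle quickly and the bias creeps.

```python
import numpy as np

rng = np.random.default_rng(42)

# Same kind of data as above: pairs of integers in [0, 30) and their sum.
X = rng.integers(30, size=(5000, 2)).astype(float)
y = X[:, 0] + X[:, 1]

# Starting point copied from the initial weights and bias printed earlier.
w = np.array([-0.36792657, 0.99124049])
b = -0.78130533

learning_rate = 0.001
batch_size = 100

for epoch in range(5):
    order = rng.permutation(len(X))          # reshuffle every epoch
    for start in range(0, len(X), batch_size):
        idx = order[start:start + batch_size]
        error = X[idx] @ w + b - y[idx]
        # Gradients of 0.5 * mean(error**2) with respect to w and b.
        w -= learning_rate * (X[idx].T @ error) / len(idx)
        b -= learning_rate * error.mean()
    mse = np.mean((X @ w + b - y) ** 2)
    print(f'epoch {epoch + 1}: mse = {mse:.4f}')

print(f'weights: {w}')
print(f'bias:    {b}')
```

If this sketch is representative of what the network does, scaling the inputs or training for many more epochs would be the obvious things to try next.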

## Test with larger network

Can we do the same simple task with a more complicated network?

Really just checking whether my code seems to do the basic things alright.

### Prepare network

In [12]:
# prep network
network = FF.Network()

# add 5 node layer (also define inputs)
network.add(FF.Layer(5, n_inputs=2))
# add 3 node layer
network.add(FF.Layer(3))
# add 1 node output layer
network.add(FF.Layer(1, activation_fn = 'linear'))

initialising layer 0
initialising layer 1
previous layer: 0
initialising layer 2
previous layer: 1


In [13]:
print(network)

Network  with 3 layers:
  Layer 0: 5 neurons, activation function: <function sigmoid at 0x7f5446f921e0>
  Layer 1: 3 neurons, activation function: <function sigmoid at 0x7f5446f921e0>
  Layer 2: 1 neurons, activation function: <function linear at 0x7f541e69e400>



In [14]:
for layer in network.layers:
    print(layer.weights.shape)

(5, 2)
(3, 5)
(1, 3)
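
These shapes fit a convention where each weight matrix is `(n_neurons, n_inputs)` and the samples are stored as columns. The sketch below pushes a small batch through matrices of those shapes to check that they chain together. It is an assumption about the layout, not a copy of what `FeedForward.py` does, and the activation functions are left out because they do not change any shapes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Weight shapes as printed above; biases assumed to be (n_neurons, 1).
shapes = [(5, 2), (3, 5), (1, 3)]
weights = [rng.normal(size=shape) for shape in shapes]
biases = [np.zeros((shape[0], 1)) for shape in shapes]

# Four samples with two features each, transposed so samples are columns.
X = rng.integers(30, size=(4, 2)).astype(float)
a = X.T                                   # shape (2, 4)

for W, b in zip(weights, biases):
    a = W @ a + b                         # activation omitted on purpose
    print(a.shape)
```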


In [15]:
network.train(
    X, y, 
    n_epochs=10,
    batch_size=100,
    learning_rate=0.001,
    objective_fn=FF.L2,
    random_seed=42,
    verbosity=1,
    test_data=(X[:50],y[:50])
)

5000 training samples
Training epoch 1/10
Epoch 1/10: 22652.6767
Training epoch 2/10
Epoch 2/10: 17898.0325
Training epoch 3/10
Epoch 3/10: 13843.8051
Training epoch 4/10
Epoch 4/10: 10533.2328
Training epoch 5/10
Epoch 5/10: 8146.2901
Training epoch 6/10
Epoch 6/10: 6470.5852
Training epoch 7/10
Epoch 7/10: 5360.8857
Training epoch 8/10
Epoch 8/10: 4594.1763
Training epoch 9/10
Epoch 9/10: 4063.2974
Training epoch 10/10
Epoch 10/10: 3690.1128


In [16]:
Xt,yt = create_addition_data(n_samples=10, max_value=50, random_state_init=15)

y_pred = network.compute_forward(Xt)
for yp, ytt in zip(y_pred, yt):
    print(f'predicted {yp:.2f}, should be {ytt} (diff {ytt-yp:.2f})')

predicted 24.19, should be 80 (diff 55.81)
predicted 24.19, should be 75 (diff 50.81)
predicted 24.19, should be 29 (diff 4.81)
predicted 23.91, should be 12 (diff -11.91)
predicted 24.19, should be 50 (diff 25.81)
predicted 23.94, should be 55 (diff 31.06)
predicted 24.19, should be 84 (diff 59.81)
predicted 24.19, should be 40 (diff 15.81)
predicted 24.18, should be 56 (diff 31.82)
predicted 23.98, should be 61 (diff 37.02)


In [17]:
y_pred = network.compute_forward(Xt)
y_pred[:5]

array([24.18738073, 24.18739068, 24.18531796, 23.90836144, 24.18721845])

Not really impressive, is it?
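
A possible explanation for the nearly constant predictions, offered as a guess rather than something verified against `FeedForward.py`: the raw inputs go up to 49, so the weighted sums feeding the sigmoid layers can easily be large, and a sigmoid is almost flat there. The sketch below uses a hand-written `sigmoid` and made-up pre-activation values `z` to show how little the output and its gradient change once the input gets beyond about 5.

```python
import numpy as np

def sigmoid(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + np.exp(-z))

# Made-up pre-activation values, from small to large.
z = np.array([0.0, 2.0, 5.0, 10.0, 20.0])

activation = sigmoid(z)
# Derivative of the sigmoid, written in terms of its own output.
gradient = activation * (1.0 - activation)

for zi, ai, gi in zip(z, activation, gradient):
    print(f'z = {zi:5.1f}  sigmoid = {ai:.6f}  gradient = {gi:.2e}')
```

When the hidden activations are pinned near 1 like this, the output layer sees almost the same values for every sample, and the gradients flowing back through the hidden layers are tiny.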

### Trying different learning rates

Not that that's likely to do anything much.

In [18]:
for lr in [10**(-x) for x in [2,3,4,5]]:
    
    print(f'\n\nTraining with learning rate {lr}')
    
    network.reset_weights_and_biases()
    network.train(
        X, y, 
        n_epochs=5,
        batch_size=100,
        learning_rate=lr,
        objective_fn=FF.L2,
        random_seed=42,
        verbosity=1,
        test_data=(X[:50],y[:50])
    )
    
    y_pred = network.compute_forward(Xt)
    for yp, ytt in zip(y_pred, yt):
        print(f'predicted {yp:.2f}, should be {ytt} (diff {ytt-yp:.2f})')



Training with learning rate 0.01
5000 training samples
Training epoch 1/5
Epoch 1/5: 4468.9315
Training epoch 2/5
Epoch 2/5: 2790.5700
Training epoch 3/5
Epoch 3/5: 2422.1961
Training epoch 4/5
Epoch 4/5: 2276.7316
Training epoch 5/5
Epoch 5/5: 2478.8653
predicted 30.12, should be 80 (diff 49.88)
predicted 30.12, should be 75 (diff 44.88)
predicted 30.12, should be 29 (diff -1.12)
predicted 27.77, should be 12 (diff -15.77)
predicted 30.12, should be 50 (diff 19.88)
predicted 26.85, should be 55 (diff 28.15)
predicted 30.12, should be 84 (diff 53.88)
predicted 30.12, should be 40 (diff 9.88)
predicted 30.12, should be 56 (diff 25.88)
predicted 30.12, should be 61 (diff 30.88)


Training with learning rate 0.001
5000 training samples
Training epoch 1/5
Epoch 1/5: 19133.2422
Training epoch 2/5
Epoch 2/5: 14841.7215
Training epoch 3/5
Epoch 3/5: 11356.7346
Training epoch 4/5
Epoch 4/5: 8778.9258
Training epoch 5/5
Epoch 5/5: 6974.1732
predicted 17.45, should be 80 (diff 62.55)
predicted

Well... to conclude, you shouldn't try to solve a simple problem with a complicated network. I guess.