[[Neural Networks from Scratch]]

So far, our model validates during training, but we have no good way to test it on a separate dataset or to run a prediction. To begin, we add a new `evaluate` method to the `Model` class:

In [None]:
# Evaluates the model using passed in dataset
def evaluate(self, X_val, y_val, *, batch_size=None):

The method above takes in samples (`X_val`), target outputs (`y_val`), and an optional batch size. First, we calculate the number of steps given the length of the data and the `batch_size` argument.

This is the same as in the `train` method:

In [None]:
# Default value if batch size is not set
validation_steps = 1

# Calculate number of steps
if batch_size is not None:
    validation_steps = len(X_val) // batch_size
    # Integer division rounds down. If some data remain
    # that do not fill a whole batch, add `1` so that
    # this partial batch is included
    if validation_steps * batch_size < len(X_val):
        validation_steps += 1
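
To make the rounding behaviour concrete, the same calculation can be wrapped in a small standalone function and tried on a few sizes. This is only a sketch: `count_steps` is a hypothetical helper name, not part of the `Model` class, and the sample counts are assumed values chosen for illustration.

```python
def count_steps(n_samples, batch_size=None):
    """Return how many batches are needed to cover n_samples."""
    # Without a batch size, the whole dataset is processed in one step
    if batch_size is None:
        return 1

    # Integer division counts only the full batches
    steps = n_samples // batch_size

    # Add one more step for a trailing, partially filled batch
    if steps * batch_size < n_samples:
        steps += 1

    return steps


print(count_steps(10000))       # 1
print(count_steps(256, 128))    # 2   (divides evenly, no extra step)
print(count_steps(10000, 128))  # 79  (78 full batches + 1 partial)
print(count_steps(60000, 128))  # 469 (468 full batches + 1 partial)
```

Assuming a training set of 60,000 samples, 469 steps are numbered 0 to 468, which is consistent with a final training step of 468 at a batch size of 128.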


Then, we move the validation chunk of code out of the `Model` class's `train` method and into `evaluate`.
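
As a rough sketch of what a batch-wise evaluation loop of this kind can look like (this is not the `Model` class's actual code), the standalone function below slices the data into batches and accumulates the number of correct predictions. The names `evaluate_in_batches`, `predict`, and `identity_model` are hypothetical, and `predict` stands in for the model's forward pass.

```python
import numpy as np


def evaluate_in_batches(predict, X_val, y_val, batch_size=None):
    """Return accuracy on (X_val, y_val), processing the data in batches.

    `predict` is a hypothetical stand-in for a model's forward pass:
    it maps a batch of samples to per-class confidences.
    """
    # Same step calculation as above
    steps = 1
    if batch_size is not None:
        steps = len(X_val) // batch_size
        if steps * batch_size < len(X_val):
            steps += 1

    correct = 0
    for step in range(steps):
        if batch_size is None:
            # No batch size: use the full dataset in a single step
            batch_X, batch_y = X_val, y_val
        else:
            # Slicing past the end is safe: the last batch is just shorter
            start = step * batch_size
            batch_X = X_val[start:start + batch_size]
            batch_y = y_val[start:start + batch_size]

        # Predicted class is the index of the highest confidence
        predictions = np.argmax(predict(batch_X), axis=1)
        correct += int(np.sum(predictions == batch_y))

    return correct / len(X_val)


def identity_model(batch):
    """Toy 'model' that returns its input as the confidences."""
    return batch


# Toy data: labels are chosen so the toy model is always right
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(10, 4))
y_demo = np.argmax(X_demo, axis=1)

acc_full = evaluate_in_batches(identity_model, X_demo, y_demo)
acc_batched = evaluate_in_batches(identity_model, X_demo, y_demo, batch_size=3)
print(acc_full, acc_batched)
```

Because correct predictions are counted per sample and divided by the dataset length at the end, the result does not depend on the batch size, even when the last batch is only partially filled.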


##### Training Loop

In [None]:
# Create dataset
X, y, X_test, y_test = create_data_mnist('fashion_mnist_images')

# Shuffle the training dataset 
keys = np.array(range(X.shape[0]))
np.random.shuffle(keys)
X = X[keys]
y = y[keys] 

# Scale and reshape samples
X = (X.reshape(X.shape[0], -1).astype(np.float32) - 127.5) / 127.5
X_test = (X_test.reshape(X_test.shape[0], -1).astype(np.float32) - 127.5) / 127.5

# Instantiate the model
model = Model()

# Add layers
model.add(Layer_Dense(X.shape[1], 128))
model.add(Activation_ReLU())
model.add(Layer_Dense(128, 128))
model.add(Activation_ReLU())
model.add(Layer_Dense(128, 10))
model.add(Activation_Softmax())

# Set loss, optimizer and accuracy objects
model.set(
    loss=Loss_CategoricalCrossentropy(),
    optimiser=Optimiser_Adam(decay=1e-3),
    accuracy=Accuracy_Categorical()
)
		
# Finalise the model
model.finalise()

# Train the model
model.train(X, y, validation_data=(X_test, y_test), epochs=10, batch_size=128, print_every=100)


##### Testing the method:

In [None]:
model.evaluate(X_test, y_test)

Running this, we get:
>>>
...
epoch: 10
step: 0, acc: 0.891, loss: 0.263 (data_loss: 0.263, reg_loss: 0.000), lr: 0.0001915341888527102
step: 100, acc: 0.883, loss: 0.257 (data_loss: 0.257, reg_loss: 0.000), lr: 0.00018793459875963167
step: 200, acc: 0.922, loss: 0.227 (data_loss: 0.227, reg_loss: 0.000), lr: 0.00018446781036709093
step: 300, acc: 0.898, loss: 0.282 (data_loss: 0.282, reg_loss: 0.000), lr: 0.00018112660749864155
step: 400, acc: 0.914, loss: 0.299 (data_loss: 0.299, reg_loss: 0.000), lr: 0.00017790428749332856
step: 468, acc: 0.917, loss: 0.192 (data_loss: 0.192, reg_loss: 0.000), lr: 0.00017577781683951485
training, acc: 0.894, loss: 0.291 (data_loss: 0.291, reg_loss: 0.000), lr: 0.00017577781683951485
validation, acc: 0.874, loss: 0.354
validation, acc: 0.874, loss: 0.354

Next, we can also run evaluation on the training data:

In [None]:
model.evaluate(X, y)

Running this prints:
>>>
validation, acc: 0.895, loss: 0.285

The "validation" label here only means that we evaluated the model; in this case, the evaluation used the training data. Compare this with the training summary printed at the end of the last epoch:
training, acc: 0.894, loss: 0.291 (data_loss: 0.291, reg_loss: 0.000), lr: 0.00017577781683951485
The accuracy and loss values differ because the training summary reports values accumulated over the whole epoch, while the model was still learning. Those epoch means therefore differ from an evaluation on the same training data run after the last epoch has finished.

**Running evaluation on the training data at the end of the training process will return the final accuracy and loss.**
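
The effect can be illustrated with made-up numbers. The sketch below assumes a hypothetical epoch of five steps in which accuracy rises steadily; the values are illustrative only and are not taken from the run above.

```python
import numpy as np

# Hypothetical per-step accuracies within one epoch,
# rising as the model keeps learning
step_accuracies = np.linspace(0.80, 0.90, num=5)

# What an epoch summary reports: the mean accumulated over all steps
accumulated = step_accuracies.mean()

# What an evaluation after the epoch reflects: the model
# as it stands after the final update
final = step_accuracies[-1]

print(f"accumulated: {accumulated:.3f}, final: {final:.3f}")
# accumulated: 0.850, final: 0.900
```

The accumulated mean includes the earlier, weaker steps of the epoch, so it lags behind what the finished model achieves on the same data.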

##### Next Step
[[Saving and Loading Model and Their Parameters]]