[[Neural Networks from Scratch]]

##### Initialise the packages

In [None]:
import micropip

await micropip.install("numpy")
await micropip.install("nnfs")
await micropip.install("matplotlib")

import matplotlib.pyplot as plt
import numpy as np
import nnfs
from nnfs.datasets import sine_data


##### Introduction
We've been working with classification models up to this point, where we try to determine *what something is*. Now we're delving into determining a *specific* value based on an input. For instance, we may want to predict what the temperature will be tomorrow at 0600 or what the price of a car should be in 2032.

For this approach, we'll need a new way to measure loss and a new output layer activation function. The data will also be different: we need training data with scalar target values, not classes.

##### Producing a Sine Graph

In [None]:
nnfs.init()

X, y = sine_data()

plt.plot(X, y)
plt.show()

If we're training on the sine wave pattern above, each input `X` maps to a scalar output `y`.

##### Activation Function for Output Layer
Regression tasks don't use Softmax or Sigmoid at the output. Instead we use a *Linear Activation*, i.e. output = input.


In [None]:
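class Activation_Linear:
	def forward(self, inputs):
		# Linear activation: pass the inputs through unchanged
		self.inputs = inputs
		self.output = inputs

	def backward(self, dvalues):
		# The derivative of y = x is 1, so the gradient is passed on as-is
		self.dinputs = dvalues.copy()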

There is no transformation during the forward pass. The gradient of `y = x` is 1, so during backpropagation the gradient passes through unchanged.
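As a quick sanity check of this behaviour, the sketch below defines the class in a self-contained way and confirms that the forward pass returns the inputs as they are and that the backward pass hands the upstream gradient straight through. The sample arrays are made up for illustration and are not taken from the sine dataset.

```python
import numpy as np


class Activation_Linear:
    def forward(self, inputs):
        # Linear activation: output = input
        self.inputs = inputs
        self.output = inputs

    def backward(self, dvalues):
        # d(output)/d(input) = 1, so the upstream gradient is unchanged
        self.dinputs = dvalues.copy()


# Illustrative values only
layer_output = np.array([[-1.5], [0.0], [2.3]])
upstream_gradient = np.array([[0.1], [-0.2], [0.3]])

activation = Activation_Linear()
activation.forward(layer_output)
activation.backward(upstream_gradient)

print(activation.output.ravel())
print(activation.dinputs.ravel())
```

The `.copy()` in `backward` means later in-place changes to `dinputs` cannot alter the gradient array owned by the next layer.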

##### Loss Functions for Regression
###### Loss Class (the parent object)

In [None]:
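class Loss:
	def calculate(self, output, y):
		# Per-sample losses come from the child class's forward method
		sample_losses = self.forward(output, y)
		# The data loss is the mean over all samples
		data_loss = np.mean(sample_losses)
		return data_loss

	# regularisation_loss(), used in the training loop, is not repeated in this note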


We use two loss functions for Regression:
###### Mean Squared Error (MSE)
$$
MSE = \frac{1}{n}\sum^n_{i=1}(y_i-\tilde{y}_i)^2
$$

In [None]:
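class Loss_MeanSquaredError(Loss):
	def forward(self, y_pred, y_true):
		# Mean of the squared errors across the outputs of each sample
		sample_losses = np.mean((y_true - y_pred) ** 2, axis=-1)
		return sample_losses

	def backward(self, dvalues, y_true):
		samples = len(dvalues)
		outputs = len(dvalues[0])
		# Gradient of the squared error with respect to the predictions
		self.dinputs = -2 * (y_true - dvalues) / outputs
		# Normalise by the number of samples
		self.dinputs = self.dinputs / samples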

MSE punishes large errors harshly because each error is squared.
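To see the effect of the square concretely, here is a small sketch with made-up targets and predictions. It uses standalone functions (`mse_forward` and `mse_backward` are illustrative names, not part of the classes in this note) and compares the analytic gradient against a central finite-difference estimate of the mean batch loss.

```python
import numpy as np


def mse_forward(y_pred, y_true):
    # Per-sample loss: mean of squared errors over the output axis
    return np.mean((y_true - y_pred) ** 2, axis=-1)


def mse_backward(y_pred, y_true):
    # Analytic gradient of the mean batch loss w.r.t. each prediction
    samples, outputs = y_pred.shape
    return -2 * (y_true - y_pred) / outputs / samples


# Illustrative values: errors of 0.5, 0 and 3
y_true = np.array([[0.0], [1.0], [2.0]])
y_pred = np.array([[0.5], [1.0], [5.0]])

per_sample = mse_forward(y_pred, y_true)
analytic = mse_backward(y_pred, y_true)

# Central finite differences on the mean batch loss
eps = 1e-6
numeric = np.zeros_like(y_pred)
for i in range(y_pred.shape[0]):
    bumped_up, bumped_down = y_pred.copy(), y_pred.copy()
    bumped_up[i, 0] += eps
    bumped_down[i, 0] -= eps
    numeric[i, 0] = (
        np.mean(mse_forward(bumped_up, y_true))
        - np.mean(mse_forward(bumped_down, y_true))
    ) / (2 * eps)

print(per_sample)
print(np.allclose(analytic, numeric, atol=1e-6))
```

An error six times larger (3 versus 0.5) produces a per-sample loss 36 times larger (9 versus 0.25), which is why a few badly wrong predictions can dominate the MSE.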

###### Mean Absolute Error (MAE)
$$
MAE = \frac{1}{n}\sum^n_{i=1}|y_i-\tilde{y}_i|
$$

In [None]:
class Loss_MeanAbsoluteError(Loss):
	def forward(self, y_pred, y_true):
		# Mean of the absolute errors across the outputs of each sample
		sample_losses = np.mean(np.abs(y_true - y_pred), axis=-1)
		return sample_losses

	def backward(self, dvalues, y_true):
		samples = len(dvalues)
		outputs = len(dvalues[0])
		# d|y_true - y_pred| / d y_pred = sign(y_pred - y_true)
		self.dinputs = np.sign(dvalues - y_true) / outputs
		# Normalise by the number of samples
		self.dinputs = self.dinputs / samples

MAE is less sensitive to outliers, but its gradient is less smooth: it has a constant magnitude and flips sign at zero error.
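The difference in outlier sensitivity can be illustrated with a few made-up numbers. In this sketch (the helper functions `mse` and `mae` and all values are illustrative assumptions), one prediction is moved far from its target and both losses are recomputed. The last two lines compare gradient magnitudes only, so the result does not depend on the sign convention.

```python
import numpy as np

y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_clean = np.array([1.1, 1.9, 3.2, 3.8])     # small errors everywhere
y_outlier = np.array([1.1, 1.9, 3.2, 14.0])  # last prediction is off by 10


def mse(y_pred, y_true):
    return np.mean((y_true - y_pred) ** 2)


def mae(y_pred, y_true):
    return np.mean(np.abs(y_true - y_pred))


mse_growth = mse(y_outlier, y_true) / mse(y_clean, y_true)
mae_growth = mae(y_outlier, y_true) / mae(y_clean, y_true)
print(f"MSE grows by a factor of {mse_growth:.0f}")
print(f"MAE grows by a factor of {mae_growth:.0f}")

# Gradient magnitudes per prediction: MSE scales with the error,
# MAE is the same for every non-zero error
mse_grad_size = np.abs(2 * (y_true - y_outlier)) / y_true.size
mae_grad_size = np.abs(np.sign(y_true - y_outlier)) / y_true.size
print(mse_grad_size)
print(mae_grad_size)
```

Because the MAE gradient has the same size whether the error is tiny or huge, updates do not shrink as the predictions approach the targets, which is one reason training with MAE can be less stable near the minimum.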

##### Accuracy for Regression
A prediction in Regression counts as "correct" if it's within a precision range of the target. Here that range is the standard deviation of the targets divided by 250.

In [None]:
accuracy_precision = np.std(y) / 250
accuracy = np.mean(np.absolute(predictions - y) < accuracy_precision)
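As a rough illustration of how strict this tolerance is, the sketch below applies the same two lines to stand-in data. It assumes targets that follow one period of a sine wave and uses hypothetical predictions (the targets plus small Gaussian noise), since no trained model exists at this point in the note.

```python
import numpy as np

# Stand-in targets: one sine period (an assumption about the data shape)
X = np.linspace(0, 1, 1000, endpoint=False).reshape(-1, 1)
y = np.sin(2 * np.pi * X)

# Hypothetical predictions: the targets plus small Gaussian noise
rng = np.random.default_rng(0)
predictions = y + rng.normal(scale=0.002, size=y.shape)

# Same calculation as in the cell above
accuracy_precision = np.std(y) / 250
accuracy = np.mean(np.absolute(predictions - y) < accuracy_precision)

print(f"tolerance: {accuracy_precision:.5f}")
print(f"accuracy:  {accuracy:.3f}")
```

For a sine wave the standard deviation is about 0.707, so the tolerance is roughly 0.0028. A larger divisor makes the metric stricter, and a smaller one makes it more forgiving.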


##### Typical Model Architecture

In [None]:
dense1 = Layer_Dense(1, 64)
activation1 = Activation_ReLU()
dense2 = Layer_Dense(64, 64)
activation2 = Activation_ReLU()
dense3 = Layer_Dense(64, 1)
activation3 = Activation_Linear()

loss_function = Loss_MeanSquaredError()
optimiser = Optimiser_Adam()

accuracy_precision = np.std(y) / 250


##### Training Loop

In [None]:
for epoch in range(10001):
    dense1.forward(X)
    activation1.forward(dense1.output)
    dense2.forward(activation1.output)
    activation2.forward(dense2.output)
    dense3.forward(activation2.output)
    activation3.forward(dense3.output)

    data_loss = loss_function.calculate(activation3.output, y)
    regularisation_loss = (
        loss_function.regularisation_loss(dense1)
        + loss_function.regularisation_loss(dense2)
        + loss_function.regularisation_loss(dense3)
    )
    loss = data_loss + regularisation_loss

    predictions = activation3.output
    accuracy = np.mean(np.absolute(predictions - y) < accuracy_precision)

    if not epoch % 100:
        print(f'epoch: {epoch}, acc: {accuracy:.3f}, loss: {loss:.3f} (data_loss: {data_loss:.3f}, reg_loss: {regularisation_loss:.3f}), lr: {optimiser.current_learning_rate}')

    loss_function.backward(activation3.output, y)
    activation3.backward(loss_function.dinputs)
    dense3.backward(activation3.dinputs)
    activation2.backward(dense3.dinputs)
    dense2.backward(activation2.dinputs)
    activation1.backward(dense2.dinputs)
    dense1.backward(activation1.dinputs)

    optimiser.pre_update_params()
    optimiser.update_params(dense1)
    optimiser.update_params(dense2)
    optimiser.update_params(dense3)
    optimiser.post_update_params()


##### Evaluation

In [None]:
X_test, y_test = sine_data()

dense1.forward(X_test)
activation1.forward(dense1.output)
dense2.forward(activation1.output)
activation2.forward(dense2.output)
dense3.forward(activation2.output)
activation3.forward(dense3.output)

plt.plot(X_test, y_test)
plt.plot(X_test, activation3.output)
plt.show()

This compares the true values with the predictions visually.

![[Screenshot_2025-06-06_11-28-23.png]]

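A plot is easier to judge alongside a few numbers. The sketch below is one possible way to summarise a test run. `regression_report` is a hypothetical helper, not part of the code above, and the arrays are stand-ins for `y_test` and `activation3.output`.

```python
import numpy as np


def regression_report(y_true, y_pred, precision_divisor=250):
    """Hypothetical helper: summarise regression quality in three numbers."""
    errors = y_true - y_pred
    # Same tolerance rule as the accuracy calculation used during training
    tolerance = np.std(y_true) / precision_divisor
    return {
        "mse": float(np.mean(errors ** 2)),
        "mae": float(np.mean(np.abs(errors))),
        "accuracy": float(np.mean(np.abs(errors) < tolerance)),
    }


# Stand-in data; in the notebook these would be y_test and activation3.output
y_test_demo = np.sin(2 * np.pi * np.linspace(0, 1, 200, endpoint=False)).reshape(-1, 1)
predictions_demo = y_test_demo + 0.001  # pretend every prediction is off by 0.001

report = regression_report(y_test_demo, predictions_demo)
print(report)
```

With the real model, calling `regression_report(y_test, activation3.output)` after the forward pass would give the same three figures for the curve shown in the plot.
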
##### Next Step
[[Model Object]]