## Neural Networks
### Practical Task: Competitive Rating System

#### Bronwyn Bowles-King

A simple multi-layer perceptron (MLP) is created in this notebook using Python and NumPy. A basic MLP architecture is defined below to evaluate the quality of a basketball player based on five inputs: performance scores from 1 to 100 for the player's speed, jumping, shooting, intelligence and strength. The MLP computes the player's ability and outputs a competitive rating between 0 and 10.

The neural network has the following structure:

- Input layer with 'neurons' (here, simply called units) for performance scores for the attributes speed, jump, shooting, intelligence and strength.
- Hidden layer with five units to receive the input.
- Output layer of one unit, a basketball player's competitive rating.

The notebook follows the instructions given for the task and then tries additional steps to improve the model as it was performing poorly.

#### 0. Import packages and set random seed

In [14]:
import numpy as np

np.random.seed(42) 

#### 1. Define helper functions: Sigmoid and forward propagation

The neuron output formula for MLP forward propagation using the sigmoid activation function is applied here:

$
\text{output} = \text{sigmoid} \left( \sum_i (\text{input}_i \times \text{weight}_i) + \text{bias} \right)
$

This shows how each unit's output in the MLP is computed. Each input is multiplied by its corresponding weight, the bias is added, and the result is passed through the sigmoid activation function. 
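
As a minimal worked example of the formula above, the sketch below computes the output of a single unit with two inputs. The input values, weights and bias are made-up numbers chosen only for illustration; they are not taken from the notebook's network.

```python
import math


def sigmoid(z):
    """Sigmoid activation for a single float."""
    return 1 / (1 + math.exp(-z))


# Hypothetical values, for illustration only
inputs = [0.5, 0.8]
weights = [0.1, 0.2]
bias = 0.05

# Weighted sum: 0.5 * 0.1 + 0.8 * 0.2 + 0.05 = 0.26
weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
unit_output = sigmoid(weighted_sum)

print(round(weighted_sum, 2), round(unit_output, 4))  # 0.26 0.5646
```

A weighted sum close to 0 gives an output close to 0.5, which is worth keeping in mind when the network's weights are small.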

Two Python functions are created below for the sigmoid function and for forward propagation. Thereafter, the main MLP function is created that will call the two helper functions. Within the forward propagation function, the sigmoid is applied at two layers (A1 and A2), in line with the task instructions. However, applying it only once is more effective (see the Appendix). This code is adapted from Njoroge (2024) and Verman (2021).

In [15]:
def sigmoid(z):

    """Computes sigmoid activation function. Helper function for main MLP 
    functions."""

    return 1 / (1 + np.exp(-z))


def forward_prop(score):

    """
    Helper function for player_ratings function. Conducts forward propagation/passes in a layered 
    neural network as follows: 
    
    a.  Receives the input (score) of shape (n_samples, input_size) as sample features. 

    b.  Calculates weighted sum of inputs plus a bias for the first layer (Z1) using dot 
        notation for matrix multiplication. 

    c.  Applies sigmoid function to produce the first layer's activations (A1). 

    d.  With A1 as input, step b and c are repeated for the second layer to calculate 
        Z2. 
        
    e.  Returns the final rating after applying sigmoid function (A2). 
    
    Relies on the external sigmoid function and the global variables W1 and W2 (weights) 
    and b1 and b2 (biases). Note that the sigmoid function is applied twice, at A1 and A2.
    """

    Z1 = np.dot(score, W1) + b1
    A1 = sigmoid(Z1)
    Z2 = np.dot(A1, W2) + b2
    A2 = sigmoid(Z2)

    return A2
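
The steps in the docstring above can also be sketched with the weights and biases passed in as arguments rather than read from global variables. The function name `forward_prop_explicit`, the seed and the parameter values below are assumptions made for this illustration; the notebook itself uses `forward_prop` with the globals defined in section 4.

```python
import numpy as np


def sigmoid(z):
    """Element-wise sigmoid activation."""
    return 1 / (1 + np.exp(-z))


def forward_prop_explicit(scores, W1, b1, W2, b2):
    """Hypothetical variant of forward_prop with explicit parameters."""
    Z1 = np.dot(scores, W1) + b1  # shape (n_samples, 5)
    A1 = sigmoid(Z1)
    Z2 = np.dot(A1, W2) + b2      # shape (n_samples, 1)
    return sigmoid(Z2)


# Illustrative parameters only; the notebook sets its own in section 4
rng = np.random.default_rng(0)
W1 = rng.uniform(0.01, 0.1, size=(5, 5))
b1 = rng.uniform(0.01, 0.1, size=(1, 5))
W2 = rng.uniform(0.01, 0.1, size=(5, 1))
b2 = rng.uniform(0.01, 0.1, size=(1, 1))

# Two players stacked into one (2, 5) batch, normalised to [0, 1]
batch = np.array([[100, 100, 100, 100, 100],
                  [9, 10, 7, 8, 12]]) / 100.0
A2 = forward_prop_explicit(batch, W1, b1, W2, b2)
print(A2.shape)  # (2, 1)
```

Passing the parameters explicitly makes it easier to see which values a result depends on, and the same call handles one player or a whole batch.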


#### 2. Define main function: Player competitive rating

In [16]:
def player_ratings(performance_arrays):

    """
    Predict player ratings from one or more arrays of performance scores.
    Applies sigmoid function with weights and biases. 
    Scores are normalised before being passed through forward propagation.
    
    Parameters:
    -   performance_arrays (list or np.ndarray) as a list or single 2D numpy 
        array, each of shape (1, 5) for player scores.
                                                 
    Returns:
    -   One or a list of floats as predicted player ratings, rounded to 1 
        decimal place.
    """

    # Wrap a single array in a list so the loop below handles both cases
    if isinstance(performance_arrays, np.ndarray):
        performance_arrays = [performance_arrays]
        
    ratings = []
    for i, scores in enumerate(performance_arrays, start=1):
        norm_scores = scores / 100.0  # Normalise input scores [0, 1]
        model_output = forward_prop(norm_scores)  
        
        # Scale output sigmoid value [0, 1] to target rating and round off
        player_rating = float(np.round(model_output[0, 0] * 10, 1))
        ratings.append(player_rating)

        print(f"Competitive rating for Player {i}: {player_rating}")

    return ratings


#### 3. Network size

The input_size is set for the five performance scores (features) for speed, jump, shooting, intelligence and strength received by the input layer. The hidden layer (hidden_size) is set at five for this experiment. One competitive rating is required per player at the end of the process (output_size). Adjusting the hidden_size did not improve the model's performance.

In [17]:
input_size = 5
hidden_size = 5
output_size = 1
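
As a quick piece of arithmetic, the sizes above determine how many adjustable parameters the network has. The sketch below simply counts them, assuming one weight per connection and one bias per hidden or output unit, as in the rest of the notebook.

```python
input_size, hidden_size, output_size = 5, 5, 1

# Input -> hidden: 5 x 5 weights plus 5 biases
hidden_params = input_size * hidden_size + hidden_size
# Hidden -> output: 5 x 1 weights plus 1 bias
output_params = hidden_size * output_size + output_size

total_params = hidden_params + output_params
print(total_params)  # 36
```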

#### 4. Weights and biases

Weights and biases are parameters that an MLP can adjust to model relationships between the input, such as performance scores, and the output, in this case a competitive rating for a sports player, enabling it to generalise from data and make useful predictions.

Weights are numerical values for the strength of the connection between two units in the layers of a neural network. Each input is a feature (speed, ability to jump, etc., in this case) connected to a unit in the hidden layer by this weight. The weights are there to control how much impact each input has on a unit's output. For each connection between the layers (input → hidden and hidden → output), the weights multiply the input value.

Biases are constant values added to inputs at a unit before the function is applied. Biases mean that functions are adapted slightly, which improves the network's ability to model variability and handle low or zero inputs. In the MLP here, the bias is added to the value before it is passed through the sigmoid activation function. 
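
To illustrate the point about zero inputs, the sketch below passes an all-zero score vector through one layer. The weight and bias values are hypothetical and chosen only to make the effect visible: with zero inputs the weights contribute nothing, so the bias alone decides each unit's output.

```python
import numpy as np


def sigmoid(z):
    """Element-wise sigmoid activation."""
    return 1 / (1 + np.exp(-z))


# Hypothetical weights and biases for one layer with three units
W = np.full((5, 3), 0.05)
b = np.array([[-1.0, 0.0, 1.0]])

x_zero = np.zeros((1, 5))  # a player with all-zero scores
Z = np.dot(x_zero, W) + b  # the weights contribute nothing, so Z equals b
A = sigmoid(Z)

print(Z)
print(A)
```

Without a bias, every unit would output exactly 0.5 for this player, whatever the weights were.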

The values are random but are small positive floats (between 0.01 and 0.1) to try to prevent any single value from having a disproportionate effect on the final outcome. However, as we shall see, the weights and biases are mostly ineffective in this experiment, regardless of the size of the values set or whether they are positive or negative. This code is adapted from Sena (2023, 2024).

In [18]:
W1 = np.random.uniform(low=0.01, high=0.1, size=(input_size, hidden_size))
W2 = np.random.uniform(low=0.01, high=0.1, size=(hidden_size, output_size))
b1 = np.random.uniform(low=0.01, high=0.1, size=(1, hidden_size))
b2 = np.random.uniform(low=0.01, high=0.1, size=(1, output_size))

#### 5. Test the sigmoid function-based MLP

The player_ratings function created (according to the task instructions) performs poorly. Regardless of whether perfect (Player 1), good, mediocre, low or all-zero scores are entered, the output is almost the same, at between 5.5 and 5.6. This means that the function cannot distinguish between a good and a poor performance. It can also barely distinguish between a perfect player and a hypothetical 'non-player' with no scores at all, or even a player whose scores are all negative.

This means that something is going wrong in the functions used and this will be investigated further below. 

In [19]:
player1 = np.array([[100, 100, 100, 100, 100]])  # Perfect scores
player2 = np.array([[79, 83, 86, 91, 75]])   # Good 
player3 = np.array([[40, 38, 51, 49, 42]])  # Mediocre
player4 = np.array([[9, 10, 7, 8, 12]])  # Low
player5 = np.array([[0, 0, 0, 0, 0]])  # Non-player
player6 = np.array([[-10, -20, -40, -60, -80]])  # Negative values test

player_ratings([
    player1, player2, player3, 
    player4, player5, player6
])

Competitive rating for Player 1: 5.6
Competitive rating for Player 2: 5.5
Competitive rating for Player 3: 5.5
Competitive rating for Player 4: 5.5
Competitive rating for Player 5: 5.5
Competitive rating for Player 6: 5.5


#### 6. Investigate performance of the sigmoid MLP

The code below (adapted from Nguyen, 2024) shows what happens when a good (Player 2) and a low (Player 4) set of scores are run through the sigmoid MLP. Note that this loop passes the raw scores through the network without dividing them by 100, so the intermediate values are larger than those computed inside player_ratings. Already at A1 (where the sigmoid function is first applied), the first hidden unit outputs ~0.99999901 for the good player and ~0.85499 for the poor player, even though their first scores are 79 and 9. By the time the ratings are rounded off and output in the last stage, they are both the same (5.5). 

In this case, it is the sigmoid function causing the problem because the good and low scores are still considerably different (~13.82435 and ~1.77427, respectively, for the first unit) when they are output as the first layer's weighted sums (Z1), which only apply small, random weights and biases. 

This is a result of how the sigmoid function works. It maps every input to a value between 0 and 1 and flattens out (saturates) for large inputs, so very different weighted sums end up with almost identical activations. The same saturation underlies the vanishing gradient problem, which would also slow down any training. With the small positive weights used here, the output value settles slightly above 0.5 for values run through the sigmoid function twice, regardless of their size (compare the outputs for A2 below). 
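
The narrow range of ratings can also be estimated with simple arithmetic. The sketch below is a rough bound, not part of the original task: it assumes inputs normalised to between 0 and 1 (as in player_ratings), every weight and bias drawn from between 0.01 and 0.1 (as in section 4), and the sigmoid applied at both layers. Because all parameters are positive, the smallest and largest possible ratings occur at the extremes of those ranges.

```python
import math


def sigmoid(z):
    """Sigmoid activation for a single float."""
    return 1 / (1 + math.exp(-z))


w_lo, w_hi = 0.01, 0.1       # range used for weights and biases in section 4
n_inputs, n_hidden = 5, 5

# Hidden layer: all-zero inputs with the smallest bias vs
# all-one inputs with the largest weights and bias
z1_lo = w_lo
z1_hi = n_inputs * w_hi * 1.0 + w_hi
a1_lo, a1_hi = sigmoid(z1_lo), sigmoid(z1_hi)

# Output layer: same reasoning, applied to the hidden activations
z2_lo = n_hidden * w_lo * a1_lo + w_lo
z2_hi = n_hidden * w_hi * a1_hi + w_hi

rating_lo = 10 * sigmoid(z2_lo)
rating_hi = 10 * sigmoid(z2_hi)
print(f"Possible ratings: {rating_lo:.2f} to {rating_hi:.2f}")
```

Under these assumptions, no player with scores between 0 and 100 can be rated much below 5.1 or much above 6.0 before training, whichever random values are drawn.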

This rating system is unfair to good players. In the Appendix, an experiment where the sigmoid function is applied only once is tried. The results improve somewhat, as the competitive ratings for the same players above range from 4.8 to 5.8 for the player with negative scores up to the player with perfect scores. However, the system shown in the Appendix is still not performing well enough, and it is also too simple. It falls short as a neural network, as it has no deep learning aspects. 

A new approach is needed and this will be explored in the next section.

In [20]:
for score in player2, player4:

    Z1 = np.dot(score, W1) + b1
    print("Z1:", Z1)

    A1 = sigmoid(Z1)
    print("A1:", A1)

    Z2 = np.dot(A1, W2) + b2
    print("Z2:", Z2)
    
    A2 = sigmoid(Z2)
    print("A2:", A2, "\n")

Z1: [[13.82435231 22.29924026 28.54350497 20.63626752 17.50646587]]
A1: [[0.99999901 1.         1.         1.         0.99999998]]
Z2: [[0.32517111]]
A2: [[0.58058397]] 

Z1: [[1.77427221 2.288452   3.06610798 2.42176143 2.13733759]]
A1: [[0.85498816 0.90791611 0.95547288 0.91847174 0.89447958]]
Z2: [[0.30173352]]
A2: [[0.57486623]] 
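
The squashing effect can be checked by hand from the printed values. The sketch below copies the first Z1 entry and the Z2 value for each player from the output above and measures how the gap between the two players shrinks at each sigmoid. It uses only the standard library and makes no assumptions beyond those printed numbers.

```python
import math


def sigmoid(z):
    """Sigmoid activation for a single float."""
    return 1 / (1 + math.exp(-z))


# First Z1 entry for Player 2 (good) and Player 4 (low), from the output above
z1_good, z1_low = 13.82435231, 1.77427221
# Z2 values for the same two players, from the output above
z2_good, z2_low = 0.32517111, 0.30173352

gap_z1 = z1_good - z1_low
gap_a1 = sigmoid(z1_good) - sigmoid(z1_low)
gap_a2 = sigmoid(z2_good) - sigmoid(z2_low)

print(f"Gap in Z1: {gap_z1:.3f}")
print(f"Gap in A1: {gap_a1:.3f}")
print(f"Gap in A2: {gap_a2:.4f}")
```

A gap of about 12 in the weighted sums shrinks to well under 1 after the first sigmoid and to a few thousandths after the second.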



#### 7. Build MLP with ReLU function

A Rectified Linear Unit (ReLU) activation function is now applied at the hidden layer, while the sigmoid function is kept for the output layer. ReLU was chosen because it helps mitigate the vanishing gradient problem and does not squash positive values into a narrow range, so differences between the hidden units' weighted sums are preserved. The ReLU formula is:

$
\text{ReLU}(x) = \max(0, x)
$

That is, for any input value x, the output is x itself unless x is less than 0, in which case the output is 0. Positive values therefore pass through unchanged instead of being compressed.
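
A small comparison makes the difference concrete. The sketch below applies ReLU and the sigmoid to the same handful of made-up values (they are not taken from the network) using only the standard library.

```python
import math


def relu(x):
    """ReLU for a single float: negative values become 0."""
    return max(0.0, x)


def sigmoid(x):
    """Sigmoid activation for a single float."""
    return 1 / (1 + math.exp(-x))


values = [-2.0, 0.0, 0.5, 3.0, 20.0]  # illustrative inputs only

relu_out = [relu(v) for v in values]
sigmoid_out = [round(sigmoid(v), 4) for v in values]

print(relu_out)     # [0.0, 0.0, 0.5, 3.0, 20.0]
print(sigmoid_out)
```

ReLU keeps 3.0 and 20.0 clearly apart, whereas the sigmoid maps both to values just below 1.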

In the next large code block, the ReLU function is created, as well as new forward propagation (forward_prop_re) and player competitive rating (player_ratings_rs) functions that build on each other. This code is adapted from Nguyen (2024). The final rating is rounded to three decimal places to make the differences between scores more apparent. 

In [21]:
def relu(z):

    """
    Rectified Linear Unit (ReLU) activation function. Helper function 
    for forward_prop_re.
    """

    return np.maximum(0, z)


def forward_prop_re(score):

    """
    Forward propagation with ReLU in hidden layer and sigmoid at output.
    """    

    Z1 = np.dot(score, W1) + b1
    A1 = relu(Z1)
    Z2 = np.dot(A1, W2) + b2
    A2 = sigmoid(Z2)    

    return A2


def player_ratings_rs(performance_arrays):

    """    
    Predict player ratings from one or more arrays of performance scores.
    Applies both ReLU and sigmoid functions with weights and biases. 
    Scores are normalised before being passed through forward propagation.
    
    Parameters:
    -   performance_arrays (list or np.ndarray) as a list or single 2D numpy 
        array, each of shape (1, 5) for player scores.
                                                 
    Returns:
    -   One or a list of floats as predicted player ratings, rounded to 3 
        decimal places.
    """
    
    if isinstance(performance_arrays, np.ndarray):
        performance_arrays = [performance_arrays]

    ratings = []
    for i, scores in enumerate(performance_arrays, start=1):
        norm_scores = scores / 100.0  
        output = forward_prop_re(norm_scores)
        rating = float(np.round(output[0, 0] * 10, 3))  
        print(f"Competitive rating for Player {i}: {rating}")
        ratings.append(rating)

    return ratings

#### 8. Run MLP with ReLU function and investigate results

The functions above are run using the same network architecture parameters, weights and biases as before (sections 3 and 4). The results show a small improvement in that we can at least tell that a player is good or poor based on the decimals. 

Ignoring the integer 5, a perfect player (Player 1) achieves a rating ending in .39 and a poor player (Player 4) one ending in .256. This is somewhat more useful if one understands that this apparently small difference reflects a very large gap in ability. Yet, the system is still not very successful in distinguishing clearly between players with vastly different abilities, in handling a "non-player" (.242), or even when working with negative scores (.207).

This is a somewhat more workable system, but still unfair to good players who would have to work very hard to see a difference in their rating compared to a player who is significantly less competitive, especially if these scores are rounded off further. The next step is to try training the network.

In [None]:
player_ratings_rs([  
    player1, player2, player3,  # Same performance scores as before
    player4, player5, player6,
])

Competitive rating for Player 1: 5.39
Competitive rating for Player 2: 5.364
Competitive rating for Player 3: 5.307
Competitive rating for Player 4: 5.256
Competitive rating for Player 5: 5.242
Competitive rating for Player 6: 5.207


#### 9. Introduce training to the MLP

To train the MLP, a training loop is added with sample performance scores as inputs or features (X) and target scores as the true labels (y). The gradients needed to update the weights and biases are computed with backpropagation. First, a simple code block is written that will operate as a loss function using the mean squared error (MSE) (mse_loss) (code adapted from Alake, 2024). Then, the training data is added and normalised. 

The learning rate and number of epochs (training rounds) are set in a new training function (train_mlp) (from Stack Overflow, 2025) to give the network time to adjust. Finally, the training loop computes the derivatives (gradients) of the loss with respect to each weight and bias and uses them to update the parameters. 

Ultimately, this gives us very specific updated weights and biases that will work effectively for this network, balancing unwanted effects of other functions in the MLP.
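
The role of the learning rate and the number of epochs can be illustrated with a toy problem that has a single parameter. This sketch is not the notebook's network: the loss (w - 3)^2, the starting value and the target of 3 are all assumptions made for illustration. Only the learning rate and epoch count mirror the defaults used for train_mlp.

```python
learning_rate = 0.01
epochs = 1000

w = 0.0        # hypothetical starting value for the single parameter
target = 3.0   # value that minimises the toy loss (w - target) ** 2

for _ in range(epochs):
    grad = 2 * (w - target)      # derivative of the loss with respect to w
    w -= learning_rate * grad    # step against the gradient

print(round(w, 4))  # 3.0
```

Each step removes a fixed fraction of the remaining error, so a smaller learning rate needs more epochs to reach the same result.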

In [None]:
def mse_loss(y_true, y_predict):

    """
    Compute the mean squared error (MSE) for loss function during MLP training
    """
    
    return np.mean((y_true - y_predict) ** 2)
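
As a small worked example of the loss, the sketch below evaluates the same MSE expression on two made-up ratings. The target and predicted values are illustrative only.

```python
import numpy as np


def mse_loss(y_true, y_predict):
    """Mean squared error, as defined in the cell above."""
    return np.mean((y_true - y_predict) ** 2)


y_true = np.array([[10.0], [5.0]])     # hypothetical target ratings
y_predict = np.array([[9.0], [7.0]])   # hypothetical predicted ratings

# Errors are 1 and -2, so the squared errors are 1 and 4 and their mean is 2.5
loss = mse_loss(y_true, y_predict)
print(loss)  # 2.5
```

Squaring the errors means that the prediction that is off by 2 contributes four times as much to the loss as the one that is off by 1.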

In [None]:
# Training data

X = np.array([
    [100, 100, 100, 100, 100],
    [88, 89, 86, 91, 96],
    [60, 70, 63, 66, 79],
    [40, 44, 49, 55, 56],
    [22, 20, 27, 35, 24],
    [9, 5, 2, 13, 8],
    [0, 0, 0, 0, 0],
    [-10, -60, -30, -50, -20]
])

X_norm = X / 100.0

# True labels

y = np.array([[10], [9], [6.8], [4.9],
    [2.6], [0.7], [0], [-3.4]
])

In [None]:
def train_mlp(X_norm, y, W1, b1, W2, b2, epochs=1000, learning_rate=0.01):
    
    """
    Trains a simple MLP using one hidden layer (ReLU) and a sigmoid output.

    Parameters:
    - X_norm: np.ndarray, shape (n_samples, input_features), normalised inputs
    - y: np.ndarray, shape (n_samples, 1), target ratings (scaled, e.g. 0-10)
    - W1, b1, W2, b2: initial weights and biases (numpy arrays)
    - epochs: int, number of training iterations
    - learning_rate: float, step size for updates

    Returns:
    - Trained W1, b1, W2, b2 as tuple
    """
    
    for epoch in range(epochs):
        
        # Forward propagation
        Z1 = np.dot(X_norm, W1) + b1
        A1 = relu(Z1)
        Z2 = np.dot(A1, W2) + b2
        A2 = sigmoid(Z2)
        
        # Loss calculation
        loss = mse_loss(y, A2 * 10)
        
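        # Backward propagation
        # Note: the loss uses A2 * 10, so the full chain rule adds a factor of 10
        # here. It is left out, which acts like a 10x smaller learning rate.
        dA2 = 2 * (A2 * 10 - y) / y.size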
        dZ2 = dA2 * sigmoid(Z2) * (1 - sigmoid(Z2))
        dW2 = np.dot(A1.T, dZ2)
        db2 = np.sum(dZ2, axis=0, keepdims=True)
        
        dA1 = np.dot(dZ2, W2.T)
        dZ1 = dA1 * (Z1 > 0)  # Derivative of ReLU
        dW1 = np.dot(X_norm.T, dZ1)
        db1 = np.sum(dZ1, axis=0, keepdims=True)
        
        # Parameter updates (weights and biases)
        W1 -= learning_rate * dW1
        b1 -= learning_rate * db1
        W2 -= learning_rate * dW2
        b2 -= learning_rate * db2

    return W1, b1, W2, b2


# Run training loop (the arrays are also updated in place by the -= operations)

W1, b1, W2, b2 = train_mlp(X_norm, y, W1, b1, W2, b2)
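
One way to sanity-check a hand-written backward pass is a finite-difference gradient check. The sketch below is a hedged illustration rather than part of the original task: the data, seed and helper name `loss_fn` are assumptions, and only the gradient for W2 is checked. Note that it uses the full chain rule for the loss on A2 * 10, which includes a factor of 10 that train_mlp's dA2 line does not apply.

```python
import numpy as np


def sigmoid(z):
    """Element-wise sigmoid activation."""
    return 1 / (1 + np.exp(-z))


def loss_fn(X, y, W1, b1, W2, b2):
    """MSE between targets and the ReLU/sigmoid network output scaled to 0-10."""
    A1 = np.maximum(0, np.dot(X, W1) + b1)
    A2 = sigmoid(np.dot(A1, W2) + b2)
    return np.mean((y - 10 * A2) ** 2)


# Illustrative random data and parameters (not the notebook's values)
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(4, 5))
y = rng.uniform(0, 10, size=(4, 1))
W1 = rng.uniform(0.01, 0.1, size=(5, 5))
b1 = rng.uniform(0.01, 0.1, size=(1, 5))
W2 = rng.uniform(0.01, 0.1, size=(5, 1))
b2 = rng.uniform(0.01, 0.1, size=(1, 1))

# Analytic gradient of the loss with respect to W2
A1 = np.maximum(0, np.dot(X, W1) + b1)
A2 = sigmoid(np.dot(A1, W2) + b2)
dA2 = 2 * (10 * A2 - y) * 10 / y.size   # includes the factor of 10
dZ2 = dA2 * A2 * (1 - A2)               # derivative of the sigmoid
dW2 = np.dot(A1.T, dZ2)

# Numerical gradient by central differences, one weight at a time
eps = 1e-6
dW2_numeric = np.zeros_like(W2)
for i in range(W2.shape[0]):
    W2_plus, W2_minus = W2.copy(), W2.copy()
    W2_plus[i, 0] += eps
    W2_minus[i, 0] -= eps
    dW2_numeric[i, 0] = (
        loss_fn(X, y, W1, b1, W2_plus, b2) - loss_fn(X, y, W1, b1, W2_minus, b2)
    ) / (2 * eps)

max_abs_diff = np.max(np.abs(dW2 - dW2_numeric))
print(max_abs_diff)
```

If the two gradients agree to within a small tolerance, the analytic formula is very likely correct; a consistent ratio between them (such as 10) points to a missing constant factor.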

#### 10. Rerun MLP with ReLU after training

All we need to do now is rerun the most recently created function for MLP (player_ratings_rs) for the six test players and the updated weights and biases will be applied. The system is working much better now as our perfect player (Player 1) has a near-perfect competitive rating, while a non-player (Player 5) and one with hypothetical negative scores (Player 6) have a very low rating of less than 1 out of 10. 

The good player (Player 2) gets a suitably good rating and the mediocre (Player 3) and weak player (Player 4) also receive suitable ratings more reflective of their original performance scores. Even when rounded to 1 decimal place as below, the competitive ratings are still acceptable:

- Player 1: 9.4
- Player 2: 8.5
- Player 3: 4.2
- Player 4: 1.0
- Player 5: 0.8
- Player 6: 0.8

Training has been essential to getting this system to work effectively and there are advantages to such a system, including that it is more robust and reliable.

In [26]:
player_ratings_rs([
    player1, player2, player3,
    player4, player5, player6
])

Competitive rating for Player 1: 9.355
Competitive rating for Player 2: 8.519
Competitive rating for Player 3: 4.208
Competitive rating for Player 4: 1.027
Competitive rating for Player 5: 0.774
Competitive rating for Player 6: 0.774


### References

Alake, R. (2024). Loss Functions in Machine Learning Explained. DataCamp. https://www.datacamp.com/tutorial/loss-function-in-machine-learning

HyperionDev. (2025). Neural Networks. Private repository, GitHub.

Nguyen, H. (2024). Code a 2-layer Neural Network from Scratch. Medium. https://medium.com/@hoangngbot/code-a-2-layer-neural-network-from-scratch-33d7db0f0e5f

Njoroge, G. (2024). Implementing Neural Networks from Scratch: A Step-by-Step Guide. Medium. https://medium.com/@njorogeofrey73/implementing-neural-networks-from-scratch-a-step-by-step-guide-478d58f02a59

NumPy Developers. (n.d.). np.random.uniform. https://numpy.org/doc/stable/reference/random/generated/numpy.random.uniform.html

Sena, M. (2023). Building a Perceptron from Scratch: A Step-by-Step Guide with Python. Medium. https://python.plainenglish.io/building-a-perceptron-from-scratch-a-step-by-step-guide-with-python-6b8722807b2e

Sena, M. (2024). A Step-by-Step Guide to Implementing Multi-Layer Perceptrons in Python. Medium. https://python.plainenglish.io/building-the-foundations-a-step-by-step-guide-to-implementing-multi-layer-perceptrons-in-python-51ebd9d7ecbe

Stack Overflow. (2025). Neural Network built from scratch using numpy isn't learning. https://stackoverflow.com/questions/79687024/neural-network-built-from-scratch-using-numpy-isnt-learning

Stack Overflow. (2020). Python Numpy: How does numpy.exp(x) coerce the return value of this sigmoid function into an ndarray? https://stackoverflow.com/questions/50594876/python-numpy-how-does-numpy-expx-coerce-the-return-value-of-this-sigmoid-func

Verman, S. (2021). Logistic Regression From Scratch in Python. Towards Data Science. https://towardsdatascience.com/logistic-regression-from-scratch-in-python-ec66603592e2

W3schools. (n.d.). Python isinstance() Function. https://www.w3schools.com/python/ref_func_isinstance.asp