# Neural Networks example

### Importing Libraries

In [1]:
import numpy as np

### Define the sigmoid activation function

In [2]:
def sigmoid(z):
    return 1 / (1 + np.exp(-z))

### Define the derivative of sigmoid function

In [3]:
def sigmoid_derivative(z):
    # d/dz sigmoid(z) = sigmoid(z) * (1 - sigmoid(z))
    s = sigmoid(z)
    return s * (1 - s)
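
The identity sigmoid'(z) = sigmoid(z) * (1 - sigmoid(z)) can be sanity-checked against a central finite difference. The following is a minimal sketch; `numerical_derivative` and the step size `h` are illustrative choices and are not part of the notebook.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def numerical_derivative(f, z, h=1e-5):
    # Central difference approximation of f'(z); h is an assumed step size.
    return (f(z + h) - f(z - h)) / (2 * h)

z = np.linspace(-3, 3, 7)
s = sigmoid(z)
analytic = s * (1 - s)                       # derivative written via the output s
numeric = numerical_derivative(sigmoid, z)   # derivative estimated numerically

max_error = np.max(np.abs(analytic - numeric))
print(max_error < 1e-8)
```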

### Initialize parameters (weights and biases)

- Initialize the weights (W1 and W2) and biases (b1 and b2) for the neural network.
- Weights are drawn from a standard normal distribution and scaled by 0.01; biases start at zero.

In [4]:
def initialize_parameters(input_size, hidden_size, output_size):
    np.random.seed(0)
    W1 = np.random.randn(hidden_size, input_size) * 0.01
    b1 = np.zeros((hidden_size, 1))
    W2 = np.random.randn(output_size, hidden_size) * 0.01
    b2 = np.zeros((output_size, 1))
    
    parameters = {
        'W1': W1,
        'b1': b1,
        'W2': W2,
        'b2': b2
    }
    
    return parameters
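
The notebook does not say why the weights are scaled by 0.01. One common rationale, offered here as an assumption about intent, is that small pre-activations keep tanh in its near-linear regime, where its derivative 1 - tanh(z)^2 stays close to 1. The sketch below compares small and large weight scales on hypothetical data; the seed, sizes, and the factor 10.0 are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)            # assumed seed, for reproducibility only
X = rng.standard_normal((2, 5))           # 5 hypothetical samples with 2 features
W_small = rng.standard_normal((4, 2)) * 0.01
W_large = rng.standard_normal((4, 2)) * 10.0

# Derivative of tanh evaluated at the hidden pre-activations.
grad_small = 1 - np.tanh(W_small @ X) ** 2
grad_large = 1 - np.tanh(W_large @ X) ** 2

print(grad_small.mean())   # close to 1: units sit in the near-linear regime
print(grad_large.mean())   # typically much smaller: many units are saturated
```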

### Forward propagation

- Perform forward propagation through the neural network.
- Calculate the activations (A1 and A2) in the hidden and output layers.
- Use the tanh activation function for the hidden layer and sigmoid for the output layer.
- Store intermediate values (Z1, A1, Z2, A2) in a cache for later use in backpropagation.

In [5]:
def forward_propagation(X, parameters):
    W1 = parameters['W1']
    b1 = parameters['b1']
    W2 = parameters['W2']
    b2 = parameters['b2']
    
    Z1 = np.dot(W1, X) + b1
    A1 = np.tanh(Z1)  # Hidden layer uses the tanh activation
    Z2 = np.dot(W2, A1) + b2
    A2 = sigmoid(Z2)
    
    cache = {
        'Z1': Z1,
        'A1': A1,
        'Z2': Z2,
        'A2': A2
    }
    
    return A2, cache
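
To make the array shapes in the forward pass concrete, here is a small shape walk-through. It inlines the same computations rather than calling the notebook's functions, and the sizes and seed are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_x, n_h, n_y, m = 2, 4, 1, 5             # hypothetical sizes; m is the batch size
X = rng.standard_normal((n_x, m))
W1, b1 = rng.standard_normal((n_h, n_x)) * 0.01, np.zeros((n_h, 1))
W2, b2 = rng.standard_normal((n_y, n_h)) * 0.01, np.zeros((n_y, 1))

Z1 = W1 @ X + b1                          # (n_h, m); b1 broadcasts across columns
A1 = np.tanh(Z1)                          # hidden activations
Z2 = W2 @ A1 + b2                         # (n_y, m)
A2 = 1 / (1 + np.exp(-Z2))                # sigmoid output, one value per sample

print(Z1.shape, A1.shape, Z2.shape, A2.shape)   # (4, 5) (4, 5) (1, 5) (1, 5)
```

Each column of X is one sample, so every intermediate array keeps m columns.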

### Compute the cross-entropy loss

- Compute the cross-entropy loss between the predicted values (A2) and the true labels (Y).
- This loss measures how well the network is performing and is used to update the parameters during training.

In [6]:
def compute_loss(A2, Y):
    m = Y.shape[1]
    loss = -1/m * np.sum(Y * np.log(A2) + (1 - Y) * np.log(1 - A2))
    return loss
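
If a prediction saturates at exactly 0 or 1, `np.log` returns `-inf` and the loss can become `nan`. A possible guard, sketched below, clips the predictions before taking the logarithm. The name `compute_loss_stable` and the tolerance `eps` are assumptions for illustration, not part of the notebook.

```python
import numpy as np

def compute_loss_stable(A2, Y, eps=1e-12):
    # Clip predictions into [eps, 1 - eps] so log never receives 0.
    m = Y.shape[1]
    A2 = np.clip(A2, eps, 1 - eps)
    return -np.sum(Y * np.log(A2) + (1 - Y) * np.log(1 - A2)) / m

Y = np.array([[1, 0, 1]])
A2 = np.array([[1.0, 0.0, 0.5]])    # two saturated (and correct) predictions
loss = compute_loss_stable(A2, Y)
print(np.isfinite(loss))
```

Only the third sample contributes noticeably here, so the loss is approximately log(2) / 3.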

### Backward propagation

- Perform backward propagation to compute gradients.
- Calculate gradients of the loss with respect to the parameters (dW1, db1, dW2, db2) using the chain rule.
- With a sigmoid output and cross-entropy loss, the output-layer error simplifies to dZ2 = A2 - Y.
- The gradients indicate how much each parameter should be adjusted to reduce the loss.

In [7]:
def backward_propagation(parameters, cache, X, Y):
    m = X.shape[1]
    
    A1 = cache['A1']
    A2 = cache['A2']
    W2 = parameters['W2']
    
    dZ2 = A2 - Y
    dW2 = 1/m * np.dot(dZ2, A1.T)
    db2 = 1/m * np.sum(dZ2, axis=1, keepdims=True)
    dZ1 = np.dot(W2.T, dZ2) * (1 - np.power(A1, 2))  # tanh'(Z1) = 1 - A1**2
    dW1 = 1/m * np.dot(dZ1, X.T)
    db1 = 1/m * np.sum(dZ1, axis=1, keepdims=True)
    
    gradients = {
        'dW1': dW1,
        'db1': db1,
        'dW2': dW2,
        'db2': db2
    }
    
    return gradients
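
A standard way to verify hand-derived gradients is to compare them with finite differences of the loss. The sketch below re-implements the same formulas in compact form so that it runs on its own; the helper names (`forward`, `loss_fn`, `analytic_grads`, `numeric_grad`), the step `h`, and the random test data are all illustrative assumptions.

```python
import numpy as np

def forward(X, p):
    A1 = np.tanh(p['W1'] @ X + p['b1'])
    A2 = 1 / (1 + np.exp(-(p['W2'] @ A1 + p['b2'])))
    return A1, A2

def loss_fn(X, Y, p):
    # Cross-entropy averaged over samples (single output unit).
    _, A2 = forward(X, p)
    return -np.mean(Y * np.log(A2) + (1 - Y) * np.log(1 - A2))

def analytic_grads(X, Y, p):
    # Same chain-rule formulas as the backward pass above.
    m = X.shape[1]
    A1, A2 = forward(X, p)
    dZ2 = A2 - Y
    dZ1 = (p['W2'].T @ dZ2) * (1 - A1 ** 2)
    return {'W1': dZ1 @ X.T / m, 'b1': dZ1.sum(axis=1, keepdims=True) / m,
            'W2': dZ2 @ A1.T / m, 'b2': dZ2.sum(axis=1, keepdims=True) / m}

def numeric_grad(X, Y, p, key, h=1e-6):
    # Central difference on each entry of p[key], restoring the entry afterwards.
    grad = np.zeros_like(p[key])
    for idx in np.ndindex(p[key].shape):
        old = p[key][idx]
        p[key][idx] = old + h
        plus = loss_fn(X, Y, p)
        p[key][idx] = old - h
        minus = loss_fn(X, Y, p)
        p[key][idx] = old
        grad[idx] = (plus - minus) / (2 * h)
    return grad

rng = np.random.default_rng(0)
X = rng.standard_normal((2, 20))
Y = (rng.random((1, 20)) < 0.5).astype(float)
p = {'W1': rng.standard_normal((3, 2)), 'b1': np.zeros((3, 1)),
     'W2': rng.standard_normal((1, 3)), 'b2': np.zeros((1, 1))}

grads = analytic_grads(X, Y, p)
max_diff = max(np.max(np.abs(grads[k] - numeric_grad(X, Y, p, k))) for k in p)
print(max_diff < 1e-6)
```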

### Update parameters using gradient descent

- Update the parameters (W1, b1, W2, b2) using gradient descent.
- Adjust the parameters in the direction that reduces the loss.
- The learning rate (learning_rate) controls the step size for updates.

In [8]:
def update_parameters(parameters, gradients, learning_rate):
    # Build new arrays instead of updating in place, so the caller's
    # parameter dictionary is left unchanged.
    updated_parameters = {
        'W1': parameters['W1'] - learning_rate * gradients['dW1'],
        'b1': parameters['b1'] - learning_rate * gradients['db1'],
        'W2': parameters['W2'] - learning_rate * gradients['dW2'],
        'b2': parameters['b2'] - learning_rate * gradients['db2']
    }
    
    return updated_parameters
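
The effect of the learning rate is easiest to see on a one-dimensional problem. The sketch below applies the same update rule, `w -= learning_rate * gradient`, to the toy function f(w) = (w - 3)^2. The function, the step counts, and the two learning rates are illustrative and unrelated to the network above.

```python
def gradient_descent(lr, steps=100, w=0.0):
    # Minimize f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
    for _ in range(steps):
        grad = 2 * (w - 3)
        w -= lr * grad
    return w

w_small = gradient_descent(0.1)   # each step shrinks the error by a factor of 0.8
w_large = gradient_descent(1.1)   # each step grows the error by a factor of 1.2

print(abs(w_small - 3) < 1e-8)    # True: converges to the minimum at w = 3
print(abs(w_large - 3) > 1e6)     # True: the step size is too large and diverges
```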

### Train the neural network

- Initialize parameters and hyperparameters (input size, hidden size, output size, number of iterations, and learning rate).
- Iterate through training for a specified number of iterations.
- In each iteration, perform forward and backward propagation, compute the loss, and update parameters.
- Print the loss at regular intervals to monitor training progress.

In [9]:
def train_neural_network(X, Y, hidden_size, num_iterations, learning_rate):
    input_size = X.shape[0]
    output_size = Y.shape[0]
    
    parameters = initialize_parameters(input_size, hidden_size, output_size)
    
    for i in range(num_iterations):
        A2, cache = forward_propagation(X, parameters)
        loss = compute_loss(A2, Y)
        gradients = backward_propagation(parameters, cache, X, Y)
        parameters = update_parameters(parameters, gradients, learning_rate)
        
        if i % 100 == 0:
            print(f"Iteration {i}: Loss = {loss:.4f}")
    
    return parameters

### Make predictions using the trained model

- Use the trained model to make predictions on new data.
- Run forward propagation on the new data to get the predicted probabilities.
- Classify predictions based on a threshold of 0.5 (sigmoid output).

In [10]:
def predict(parameters, X):
    A2, _ = forward_propagation(X, parameters)
    predictions = (A2 > 0.5).astype(int)
    return predictions
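
Thresholded predictions can be compared with known labels to get a classification accuracy. The helper `accuracy` below is a hypothetical addition, and the probabilities and labels are made-up values used only to show the computation.

```python
import numpy as np

def accuracy(predictions, Y):
    # Fraction of samples whose predicted class equals the true label.
    return float(np.mean(predictions == Y))

A2 = np.array([[0.9, 0.2, 0.6, 0.4]])     # hypothetical sigmoid outputs
Y = np.array([[1, 0, 0, 0]])              # hypothetical true labels
predictions = (A2 > 0.5).astype(int)      # [[1 0 1 0]]

print(accuracy(predictions, Y))           # 0.75
```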

### Generate sample data

- Generate synthetic data (X and Y) for a 2-class classification problem.
- The data consists of 2 features (X[0] and X[1]) and binary labels (0 or 1).

In [11]:
np.random.seed(1)
X = np.random.randn(2, 400)
Y = ((X[0, :] ** 2 + X[1, :] ** 2) < 1).astype(int)
Y = Y.reshape(1, -1)  # Reshape Y to a 2D row vector of shape (1, 400)
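
It can help to check the shapes and class balance of the generated data. For two independent standard normal features, the squared radius follows a chi-square distribution with 2 degrees of freedom, so the expected fraction of points inside the unit circle is 1 - exp(-1/2), roughly 0.393. The variable names `positive_fraction` and `expected_fraction` below are illustrative.

```python
import numpy as np

np.random.seed(1)
X = np.random.randn(2, 400)
Y = ((X[0, :] ** 2 + X[1, :] ** 2) < 1).astype(int).reshape(1, -1)

print(X.shape, Y.shape)                   # (2, 400) (1, 400)

positive_fraction = Y.mean()              # observed share of class 1
expected_fraction = 1 - np.exp(-0.5)      # theoretical share, about 0.393
print(positive_fraction, expected_fraction)
```

The classes are therefore somewhat imbalanced, with class 0 (outside the circle) being the majority.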

### Train the neural network

- Set the hyperparameters (hidden size, number of iterations, and learning rate).
- Call train_neural_network to train the neural network using the generated data.
- The network learns to classify points inside the unit circle as 1 and points outside it as 0.

In [12]:
hidden_size = 4
num_iterations = 1000
learning_rate = 1.2
trained_parameters = train_neural_network(X, Y, hidden_size, num_iterations, learning_rate)

Iteration 0: Loss = 0.6932
Iteration 100: Loss = 0.6730
Iteration 200: Loss = 0.5470
Iteration 300: Loss = 0.2143
Iteration 400: Loss = 0.1451
Iteration 500: Loss = 0.1251
Iteration 600: Loss = 0.1126
Iteration 700: Loss = 0.1033
Iteration 800: Loss = 0.0961
Iteration 900: Loss = 0.0903


### Make predictions on new data

- Create a new dataset (new_data) for making predictions.
- Call the predict function to classify the new data points with the trained model.

In [13]:
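# Shape (n_features, n_samples): each column is one point, so as written
# both columns are the point (-0.8, 0.8).
new_data = np.array([[-0.8, -0.8], [0.8, 0.8]])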
predictions = predict(trained_parameters, new_data)

print("Predictions for new data:")
print(predictions)

Predictions for new data:
[[0 0]]


- [[0 0]]: both new data points were predicted to belong to class 0.
- In this dataset, class 0 means the point lies outside the unit circle and class 1 means it lies inside.
- Each point has x0^2 + x1^2 = 0.64 + 0.64 = 1.28 > 1, so class 0 agrees with the rule used to generate the labels.
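
Because the network expects inputs of shape (n_features, n_samples), points that are written one per row need to be transposed before prediction. The sketch below builds such an array for two hypothetical points and computes their ground-truth labels from the data-generating rule, which gives a reference to compare model predictions against. The names `points`, `radius_sq`, and `expected` are illustrative.

```python
import numpy as np

points = np.array([[-0.8, -0.8],
                   [ 0.8,  0.8]])         # one hypothetical point per row
new_data = points.T                       # shape (n_features, n_samples)

# Ground truth from the labeling rule: 1 if inside the unit circle, else 0.
radius_sq = np.sum(new_data ** 2, axis=0)
expected = (radius_sq < 1).astype(int).reshape(1, -1)

print(new_data.shape)   # (2, 2)
print(expected)         # [[0 0]]
```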