# Homework #2: Simple Neural Network Implementation using Numpy

Today we will build a simple neural network from scratch in Python using only the numpy library. We will follow the instructions from the following link: https://iamtrask.github.io/2015/07/12/basic-python-network/
Basic Python Network
(Note: The code in the website is written in Python 2, and not Pytho, try writing the code yourself before reverting to the online examplen 
#1: Import Librarit numpy as np


### Define the Sigmoid Function and its Derivative
- Construct a function returning a sigmoid function:
$ \sigma(x) = \frac{1}{1 + e^{-x}} $
- Construct a function returning the derivative of a sigmoid function:
$ \frac{d\sigma(x)}{dx} = \sigma(x)(1 - \sigma(x)) $

### Initialize Weights
Build an array of three weights (3x1 array – think why these dimensions!) and initialize their value randomly. (It is good practice to use weights with normal distribution of $ \mu = 0 $ and  $ \sigma = \frac{1}{3}  $ )

### Training the Neural Network
Create a loop, iterating 1000 times (equal to the desired number of learning steps). For each iteration, calculate the difference between the network prediction and the real value of y. Multiply that difference with the sigmoid derivative and use the dot product of this number with the input layer to update your weights for the next iteration.
- Input and Output Data Sets
``` python
X = np.array([[0, 0, 1],
              [0, 1, 1],
              [1, 0, 1],
              [1, 1, 1]])

y = np.array([[0],
              [0],
              [1],
              [1]])on. 	


### Submission

1. Upload the file to your Google Colab or GitHub
2. Add your code, run it and test for correcrtness
3. Submit a link on moodle to Google Colab or GitHub. Please do not send me the file or a link by mail.
4. Make sure to share the link to your notebook with idan.tobis@gmail.com (or make it public)

# Submission


In [22]:
#imports
import numpy as np


## Define the Sigmoid Function and its Derivative


In [23]:
def sigmoid(x):
    """
    Sigmoid activation function.
    
    Formula: σ(x) = 1 / (1 + e^(-x))
    
    Args:
        x: Input value or array (can be a scalar or numpy array)
    
    Returns:
        The sigmoid of x
    """
    return 1 / (1 + np.exp(-x))


def sigmoid_derivative(x):
    """
    Derivative of the sigmoid function.
    
    Formula: dσ(x)/dx = σ(x) * (1 - σ(x))
    
    Args:
        x: Input value or array (can be a scalar or numpy array)
    
    Returns:
        The derivative of sigmoid at x
    """
    s = sigmoid(x)
    return s * (1 - s)

## Initialize Weights


In [24]:
def initialize_weights(seed=None):
    """
    Initialize weights for the neural network.
    
    Creates a 3x1 array of weights initialized with a normal distribution.
    The dimensions are 3x1 because:
    - Each input sample has 3 features (including bias term)
    - We need one weight per input feature
    - Output is a single value
    
    Args:
        seed: Optional random seed for reproducibility
    
    Returns:
        A 3x1 numpy array of weights with normal distribution (μ=0, σ=1/3)
    """
    if seed is not None:
        np.random.seed(seed)
    
    # Normal distribution: μ = 0, σ = 1/3
    # Shape: (3, 1) - 3 rows, 1 column
    weights = np.random.normal(0, 1/3, size=(3, 1))
    
    return weights

## Training the Neural Network


In [25]:

def train_neural_network(X, y, weights, iterations=1000, print_progress=False):
    """
    Train the neural network using gradient descent.
    
    For each iteration:
    1. Forward pass: Calculate network prediction
    2. Calculate error: Difference between prediction and actual value
    3. Calculate delta: Multiply error by sigmoid derivative
    4. Update weights: Use dot product of input layer with delta
    
    Args:
        X: Input data (n_samples x 3 array)
        y: Target output (n_samples x 1 array)
        weights: Initial weights (3 x 1 array)
        iterations: Number of training iterations (default: 1000)
        print_progress: Whether to print progress every 100 iterations
    
    Returns:
        Trained weights (3 x 1 array)
    """
    for iteration in range(iterations):
        # Forward pass: Calculate network prediction
        # X.dot(weights) gives us the weighted sum for each sample
        # Then apply sigmoid activation
        layer_output = sigmoid(np.dot(X, weights))
        
        # Calculate error: difference between prediction and actual value
        error = y - layer_output
        
        # Calculate delta: multiply error by sigmoid derivative
        # This gives us the gradient direction for weight updates
        delta = error * sigmoid_derivative(layer_output)
        
        # Update weights: use dot product of input layer (X.T) with delta
        # X.T.dot(delta) gives us the gradient for each weight
        weights += np.dot(X.T, delta)
        
        # Optional: Print progress
        if print_progress and (iteration % 100 == 0 or iteration == iterations - 1):
            print(f"Iteration {iteration}: Error = {np.mean(np.abs(error)):.6f}")
    
    return weights

# Testing and using made functions

In [26]:
# Input dataset: 4 samples, each with 3 features (2 inputs + 1 bias term)
X = np.array([[0, 0, 1],
              [0, 1, 1],
              [1, 0, 1],
              [1, 1, 1]])

# Output dataset: 4 samples, each with 1 output value
y = np.array([[0],
              [0],
              [1],
              [1]])

   # Test with a simple value
test_value = 0
print(f"Sigmoid({test_value}) = {sigmoid(test_value)}")
print(f"Sigmoid derivative({test_value}) = {sigmoid_derivative(test_value)}")

# Test with an array
test_array = np.array([-2, -1, 0, 1, 2])
print(f"\nSigmoid({test_array}) = {sigmoid(test_array)}")
print(f"Sigmoid derivative({test_array}) = {sigmoid_derivative(test_array)}")

# Test weight initialization
print("\n" + "="*60)
print("Part 2: Weight Initialization")
print("="*60)
weights = initialize_weights(seed=42)  # Using seed for reproducibility
print(f"Weights shape: {weights.shape}")
print(f"Weights:\n{weights}")
print(f"\nMean: {np.mean(weights):.6f} (should be close to 0)")
print(f"Std: {np.std(weights):.6f} (should be close to {1/3:.6f})")

# Test neural network training
print("\n" + "="*60)
print("Part 3: Training the Neural Network")
print("="*60)
print(f"Input data (X):\n{X}")
print(f"\nTarget output (y):\n{y}")

# Initialize weights for training
weights = initialize_weights(seed=42)
print(f"\nInitial weights:\n{weights}")

# Train the network
print("\nTraining for 1000 iterations...")
trained_weights = train_neural_network(X, y, weights, iterations=1000, print_progress=True)

# Test the trained network
print(f"\nTrained weights:\n{trained_weights}")
predictions = sigmoid(np.dot(X, trained_weights))
print(f"\nPredictions after training:\n{predictions}")
print(f"\nTarget values:\n{y}")
print(f"\nError (absolute difference):\n{np.abs(y - predictions)}")
print(f"\nMean absolute error: {np.mean(np.abs(y - predictions)):.6f}")

Sigmoid(0) = 0.5
Sigmoid derivative(0) = 0.25

Sigmoid([-2 -1  0  1  2]) = [0.11920292 0.26894142 0.5        0.73105858 0.88079708]
Sigmoid derivative([-2 -1  0  1  2]) = [0.10499359 0.19661193 0.25       0.19661193 0.10499359]

Part 2: Weight Initialization
Weights shape: (3, 1)
Weights:
[[ 0.16557138]
 [-0.0460881 ]
 [ 0.21589618]]

Mean: 0.111793 (should be close to 0)
Std: 0.113514 (should be close to 0.333333)

Part 3: Training the Neural Network
Input data (X):
[[0 0 1]
 [0 1 1]
 [1 0 1]
 [1 1 1]]

Target output (y):
[[0]
 [0]
 [1]
 [1]]

Initial weights:
[[ 0.16557138]
 [-0.0460881 ]
 [ 0.21589618]]

Training for 1000 iterations...
Iteration 0: Error = 0.479705
Iteration 100: Error = 0.056228
Iteration 200: Error = 0.028370
Iteration 300: Error = 0.018884
Iteration 400: Error = 0.014132
Iteration 500: Error = 0.011284
Iteration 600: Error = 0.009388
Iteration 700: Error = 0.008037
Iteration 800: Error = 0.007024
Iteration 900: Error = 0.006238
Iteration 999: Error = 0.005615

Tr