## Implementing a custom Dense layer in Python ##
You are provided with a base Layer class that defines the structure of a neural network layer. Your task is to implement a subclass called Dense, which represents a fully connected neural network layer. The Dense class should extend the Layer class and implement the following methods:

**Initialization (\_\_init\_\_):**

Define the layer with a specified number of neurons (n_units) and an optional input shape (input_shape).
Set up placeholders for the layer's weights (W), biases (w0), and optimizers.

**Weight Initialization (initialize):**

Initialize the weights W from a uniform distribution over [-limit, limit], where limit = 1 / sqrt(input_shape[0]), and set the bias w0 to zero.
Initialize optimizers for W and w0.
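
The initialization rule above can be sketched on its own as follows. This is a minimal illustration, not the reference solution: the sizes `n_inputs` and `n_units` are made-up values, and a local generator `rng` is used so the notebook's global seed is left untouched.

```python
import math
import numpy as np

# Hypothetical sizes chosen only for illustration
n_inputs, n_units = 2, 3

rng = np.random.default_rng(0)  # local generator; does not touch the global seed
limit = 1 / math.sqrt(n_inputs)

# Weights drawn uniformly from [-limit, limit), one column per neuron
W = rng.uniform(-limit, limit, size=(n_inputs, n_units))
# Biases start at zero, stored as a row vector so they broadcast over a batch
w0 = np.zeros((1, n_units))

print(W.shape, w0.shape)  # (2, 3) (1, 3)
```

Storing w0 with shape (1, n_units) rather than (n_units,) keeps the bias gradient, which is summed over the batch axis, in the same shape as the bias itself.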

**Parameter Count (parameters):**
Return the total number of trainable parameters in the layer, which includes the parameters in W and w0.
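
As a quick worked example of the count, assuming a hypothetical layer with 2 input features and 3 neurons: W has 2 × 3 = 6 entries and w0 has 3, for 9 trainable parameters in total.

```python
import numpy as np

# Hypothetical layer: 2 input features, 3 neurons
W = np.zeros((2, 3))
w0 = np.zeros((1, 3))

# Every entry of W and w0 is trainable, so the count is the sum of their sizes
n_params = W.size + w0.size
print(n_params)  # 9
```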

**Forward Pass (forward_pass):**
Compute the output of the layer by performing a dot product between the input X and the weight matrix W, and then adding the bias w0.

**Backward Pass (backward_pass):**

Calculate and return the gradient with respect to the input.
If the layer is trainable, update the weights and biases using the optimizer's update rule.

**Output Shape (output_shape):**

Return the shape of the output produced by the forward pass, which should be (self.n_units,).
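
One way to gain confidence in the forward and backward passes described above is a finite-difference check. The sketch below is independent of the Dense class: it assumes the forward pass is Y = X @ W + w0, uses made-up sizes and random data, and defines a hypothetical helper `numerical_grad` to compare the analytic gradients against numerical ones.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: batch of 4 samples, 2 input features, 3 neurons
X = rng.normal(size=(4, 2))
W = rng.normal(size=(2, 3))
w0 = rng.normal(size=(1, 3))
accum_grad = rng.normal(size=(4, 3))  # stands in for dL/dY from the next layer

def loss(X, W, w0):
    # Scalar whose gradient with respect to Y = X @ W + w0 is exactly accum_grad
    return np.sum((X @ W + w0) * accum_grad)

# Analytic gradients for Y = X @ W + w0
grad_input = accum_grad @ W.T                        # shape (4, 2), same as X
grad_W = X.T @ accum_grad                            # shape (2, 3), same as W
grad_w0 = np.sum(accum_grad, axis=0, keepdims=True)  # shape (1, 3), same as w0

def numerical_grad(f, A, eps=1e-6):
    # Central finite differences, perturbing one entry of A at a time
    grad = np.zeros_like(A)
    for idx in np.ndindex(A.shape):
        old = A[idx]
        A[idx] = old + eps
        plus = f()
        A[idx] = old - eps
        minus = f()
        A[idx] = old  # restore the original value
        grad[idx] = (plus - minus) / (2 * eps)
    return grad

f = lambda: loss(X, W, w0)
print(np.allclose(grad_input, numerical_grad(f, X), atol=1e-6))
print(np.allclose(grad_W, numerical_grad(f, W), atol=1e-6))
print(np.allclose(grad_w0, numerical_grad(f, w0), atol=1e-6))
```

Each gradient has the same shape as the quantity it belongs to, which is a cheap sanity check when a transpose ends up in the wrong place.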

**Objective:**
Extend the Layer class by implementing the Dense class to ensure it functions correctly within a neural network framework.

See the details [here](https://www.deep-ml.com/problems/40).

## For an explanation of the backward pass, see [this](https://chatgpt.com/c/68462133-5ce0-8006-a585-969e753b6795)
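
As a brief sketch of the reasoning (standard matrix calculus, not taken from the linked conversation), write the forward pass as $Y = XW + w_0$ and let $G = \partial L / \partial Y$ be the incoming gradient `accum_grad`, where $L$ is the scalar loss. The symbols $Y$, $G$, and $L$ are notation introduced here for illustration. Applying the chain rule entry by entry to $Y_{ij} = \sum_k X_{ik} W_{kj} + (w_0)_j$ gives:

$$
\frac{\partial L}{\partial X} = G\,W^{\top}, \qquad
\frac{\partial L}{\partial W} = X^{\top} G, \qquad
\frac{\partial L}{\partial w_0} = \sum_{i} G_{i,:}
$$

The bias gradient sums over the batch index $i$ because the same bias row is added to every sample.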

In [89]:
import numpy as np
import copy
import math

# DO NOT CHANGE SEED
np.random.seed(42)

# DO NOT CHANGE LAYER CLASS
class Layer(object):

	def set_input_shape(self, shape):
    
		self.input_shape = shape

	def layer_name(self):
		return self.__class__.__name__

	def parameters(self):
		return 0

	def forward_pass(self, X, training):
		raise NotImplementedError()

	def backward_pass(self, accum_grad):
		raise NotImplementedError()

	def output_shape(self):
		raise NotImplementedError()



In [90]:
# Your task is to implement the Dense class based on the above structure
class Dense(Layer):
	def __init__(self, n_units, input_shape=None):
		self.layer_input = None
		self.input_shape = input_shape
		self.n_units = n_units
		self.trainable = True
		self.W = None
		self.w0 = None
		self.W_optimizer = None
		self.w0_optimizer = None
		if input_shape is not None:
			self.set_input_shape(input_shape)
	
	def set_input_shape(self, shape):
		super().set_input_shape(shape)
		self.input_shape = shape

	def initialize(self, optimizer = None):
		limit = 1 / math.sqrt(self.input_shape[0])
		self.W = np.random.uniform( -limit, limit, (self.input_shape[0],  self.n_units))
		self.w0 = np.zeros((1, self.n_units))

		self.W_optimizer = copy.deepcopy(optimizer) if optimizer else None

		self.w0_optimizer = copy.deepcopy(optimizer) if optimizer else None


	def forward_pass(self, X, training=True):
		# Cache the input; backward_pass needs it to compute the weight gradient
		self.layer_input = X
		return X @ self.W + self.w0


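	def backward_pass(self, accum_grad):
		# Gradient w.r.t. the input, computed with the weights used in the forward pass
		grad_input = accum_grad @ self.W.T

		if self.trainable:
			grad_weight = self.layer_input.T @ accum_grad
			grad_bias = np.sum(accum_grad, axis=0, keepdims=True)

			# The optimizer's update() returns the new values, so assign them back
			if self.W_optimizer:
				self.W = self.W_optimizer.update(self.W, grad_weight)
			else:
				# Fallback: plain gradient descent with a fixed step of 0.01
				self.W = self.W - 0.01 * grad_weight

			if self.w0_optimizer:
				self.w0 = self.w0_optimizer.update(self.w0, grad_bias)
			else:
				self.w0 = self.w0 - 0.01 * grad_bias
		return grad_input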

	def parameters(self):
		# Total number of trainable entries in W and w0
		return np.prod(self.W.shape) + np.prod(self.w0.shape)

	def output_shape(self):
		return (self.n_units,)

In [91]:
# Initialize a Dense layer with 3 neurons and input shape (2,)
dense_layer = Dense(n_units=3, input_shape=(2,))

# Define a mock optimizer with a simple update rule
class MockOptimizer:
    def update(self, weights, grad):
        return weights - 0.01 * grad

optimizer = MockOptimizer()

# Initialize the Dense layer with the mock optimizer
dense_layer.initialize(optimizer)

# Perform a forward pass with sample input data
X = np.array([[1, 2]])
output = dense_layer.forward_pass(X)
print("Forward pass output:", output)

# Perform a backward pass with sample gradient
accum_grad = np.array([[0.1, 0.2, 0.3]])
back_output = dense_layer.backward_pass(accum_grad)
print("Backward pass output:", back_output)

Forward pass output: [[ 0.10162127 -0.33551992 -0.64490545]]
Backward pass output: [[ 0.20816524 -0.22928937]]
