## Implement ANN from scratch on a [toy dataset](https://www.geeksforgeeks.org/data-science/toy-dataset-explanation-and-application/)

[Artificial Neural Networks (ANNs)](https://www.analyticsvidhya.com/blog/2021/10/implementing-artificial-neural-networkclassification-in-python-from-scratch/) are typically trained with supervised learning: the dataset contains inputs together with their corresponding outputs, and the goal is to learn a mapping from each input to its output. ANNs can be used for both regression and classification problems.

In [27]:
%pip install scikit-learn

Note: you may need to restart the kernel to use updated packages.





In [28]:
#Importing necessary Libraries
import numpy as np
import pandas as pd

In [29]:
#Loading Dataset
data = pd.read_csv("Churn_Modelling.csv")
print(data.info())

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10000 entries, 0 to 9999
Data columns (total 14 columns):
 #   Column           Non-Null Count  Dtype  
---  ------           --------------  -----  
 0   RowNumber        10000 non-null  int64  
 1   CustomerId       10000 non-null  int64  
 2   Surname          10000 non-null  object 
 3   CreditScore      10000 non-null  int64  
 4   Geography        10000 non-null  object 
 5   Gender           10000 non-null  object 
 6   Age              10000 non-null  int64  
 7   Tenure           10000 non-null  int64  
 8   Balance          10000 non-null  float64
 9   NumOfProducts    10000 non-null  int64  
 10  HasCrCard        10000 non-null  int64  
 11  IsActiveMember   10000 non-null  int64  
 12  EstimatedSalary  10000 non-null  float64
 13  Exited           10000 non-null  int64  
dtypes: float64(2), int64(9), object(3)
memory usage: 1.1+ MB
None


### Generating Matrix of features

The first three columns (RowNumber, CustomerId, and Surname) are identifiers that carry no information about whether a customer will exit, so we drop them. The last column, Exited, is the target and is kept out of the feature matrix as well.

In [30]:
X = data.iloc[:,3:-1].values
print(X)

[[619 'France' 'Female' ... 1 1 101348.88]
 [608 'Spain' 'Female' ... 0 1 112542.58]
 [502 'France' 'Female' ... 1 0 113931.57]
 ...
 [709 'France' 'Female' ... 0 1 42085.58]
 [772 'Germany' 'Male' ... 1 0 92888.52]
 [792 'France' 'Female' ... 1 0 38190.78]]
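
The positional slice `iloc[:, 3:-1]` depends on the column order of the CSV. As a rough sketch, the same selection can be written by column name. The small DataFrame below is a made-up stand-in for `Churn_Modelling.csv` with only a few of its columns, and the names `toy`, `X_pos`, and `X_named` are illustrative.

```python
import pandas as pd

# Tiny stand-in for Churn_Modelling.csv (values are for illustration only)
toy = pd.DataFrame({
    "RowNumber": [1, 2, 3],
    "CustomerId": [101, 102, 103],
    "Surname": ["A", "B", "C"],
    "CreditScore": [619, 608, 502],
    "Geography": ["France", "Spain", "France"],
    "Exited": [1, 0, 1],
})

# Positional slicing, as in the cell above: skip the first 3 columns and the last one
X_pos = toy.iloc[:, 3:-1].values

# Name-based alternative: drop the identifier columns and the target explicitly
X_named = toy.drop(columns=["RowNumber", "CustomerId", "Surname", "Exited"]).values

# Both selections contain the same feature columns
print((X_pos == X_named).all())
```

The name-based form is longer, but it keeps working if columns are reordered or new ones are added to the file.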


In [31]:
#Generating Dependent Variable Vectors
Y = data.iloc[:,-1].values

## Feature Engineering

### Encoding Categorical Variable Gender

In [32]:
from sklearn.preprocessing import LabelEncoder
LE1 = LabelEncoder()
X[:,2] = np.array(LE1.fit_transform(X[:,2]))
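
To see what `LabelEncoder` does to the Gender column, here is a minimal sketch on a toy array (the array and the names `gender`, `le`, and `encoded` are illustrative). The encoder sorts the distinct values and replaces each one with its index in that sorted list.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Toy Gender column (illustrative values)
gender = np.array(["Female", "Male", "Female", "Male"], dtype=object)

le = LabelEncoder()
encoded = le.fit_transform(gender)

print(le.classes_)                    # learned categories, in sorted order
print(encoded)                        # integer code for each row
print(le.inverse_transform(encoded))  # map the codes back to the original labels
```

With only two categories, a single 0/1 column is enough, which is why Gender does not need one-hot encoding.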

### Encoding Categorical Variable Geography

In [33]:
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder
ct = ColumnTransformer(transformers=[('encoder', OneHotEncoder(), [1])], remainder="passthrough")
X = np.array(ct.fit_transform(X))
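
The following sketch shows how the column layout changes after one-hot encoding, using three toy rows. The names `X_toy`, `ct_toy`, and `X_enc` are illustrative, and `sparse_threshold=0` is added here only so that this small example always returns a dense array.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Toy rows: [CreditScore, Geography, Gender (already label-encoded)]
X_toy = np.array([
    [619, "France", 0],
    [608, "Spain", 0],
    [772, "Germany", 1],
], dtype=object)

ct_toy = ColumnTransformer(
    transformers=[("encoder", OneHotEncoder(), [1])],
    remainder="passthrough",
    sparse_threshold=0,  # always return a dense array in this sketch
)
X_enc = np.array(ct_toy.fit_transform(X_toy))

# Categories learned for the Geography column, in sorted order
print(ct_toy.named_transformers_["encoder"].categories_)
# One-hot columns come first, followed by the passthrough columns
print(X_enc)
```

Note that the encoded columns are moved to the front of the matrix, so Geography no longer sits at index 1 after the transformation.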

### Splitting Dataset into Training and Testing Dataset

In [38]:
#Splitting dataset into training and testing dataset
from sklearn.model_selection import train_test_split
X_train,X_test,Y_train,Y_test = train_test_split(X,Y,test_size=0.2,random_state=0)

# Reshape targets to column vectors, e.g. (8000, 1) instead of (8000,),
# so they match the shape of the network output
Y_train = Y_train.reshape(-1, 1)
Y_test = Y_test.reshape(-1, 1)
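
The reshape matters because of NumPy broadcasting. A small sketch with made-up numbers (the names `output`, `y_flat`, and `y_col` are illustrative) shows what happens when a flat label vector is subtracted from a column of predictions:

```python
import numpy as np

output = np.array([[0.9], [0.2], [0.7]])  # network output, shape (3, 1)
y_flat = np.array([1, 0, 1])              # labels as returned by .values, shape (3,)
y_col = y_flat.reshape(-1, 1)             # column vector, shape (3, 1)

# (3, 1) minus (3,) broadcasts to a (3, 3) matrix, which is not what we want
print((output - y_flat).shape)

# (3, 1) minus (3, 1) is the intended element-wise difference
print((output - y_col).shape)
```

No error is raised in the first case, so a mismatch like this can silently corrupt both the gradients and the accuracy computation.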

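### Feature Scaling
- Standardization: rescale each feature to zero mean and unit variance (used here via `StandardScaler`)
- Normalization: rescale each feature to a fixed range such as [0, 1]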

In [39]:
#Performing Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
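
As a sketch of what the scaler computes, the snippet below standardizes random toy data by hand and compares the result with `StandardScaler`. The data and the names `train`, `test`, `mu`, `sigma`, and `sc_toy` are illustrative. The mean and standard deviation come from the training data only and are then reused for the test data, which mirrors `fit_transform` on `X_train` followed by `transform` on `X_test`.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
train = rng.normal(loc=50.0, scale=10.0, size=(100, 2))  # toy training features
test = rng.normal(loc=50.0, scale=10.0, size=(20, 2))    # toy test features

# Standardization: statistics are estimated on the training set only
mu = train.mean(axis=0)
sigma = train.std(axis=0)  # population std (ddof=0), which StandardScaler also uses

train_std = (train - mu) / sigma
test_std = (test - mu) / sigma  # reuse the training statistics

sc_toy = StandardScaler().fit(train)
print(np.allclose(train_std, sc_toy.transform(train)))
print(np.allclose(test_std, sc_toy.transform(test)))

# Min-max normalization, for comparison: rescale each feature to [0, 1]
lo, hi = train.min(axis=0), train.max(axis=0)
train_norm = (train - lo) / (hi - lo)
```

Fitting the scaler on the test set as well would leak information about the test data into preprocessing.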

## Scratch Implementation (Hidden Layer)

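This implementation uses a single hidden layer. A single sigmoid neuron can only learn a linear decision boundary, which is too simple for churn prediction; a hidden layer lets the network combine the inputs into intermediate features ("features of features") and learn non-linear relationships.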
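
For reference, the forward pass of a network with one hidden layer and sigmoid activations can be written as follows. The symbols are chosen to mirror the attribute names in the class below; $m$ (number of samples), $n$ (number of input features), and $h$ (number of hidden units) are notation introduced here for illustration.

```latex
$$
\begin{aligned}
Z_1 &= X W_1 + b_1, & A_1 &= \sigma(Z_1), \\
Z_2 &= A_1 W_2 + b_2, & \hat{y} &= \sigma(Z_2),
\end{aligned}
\qquad
\sigma(z) = \frac{1}{1 + e^{-z}}
$$
```

Here $X$ has shape $m \times n$, $W_1$ is $n \times h$, $b_1$ is $1 \times h$, $W_2$ is $h \times 1$, and $b_2$ is $1 \times 1$, so $\hat{y}$ has shape $m \times 1$.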

In [40]:
class ChurnANN:
    def __init__(self, input_size, hidden_size, learning_rate=0.01):
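        # Weight initialization: He scaling, std = sqrt(2 / fan_in).
        # (He is usually paired with ReLU; Xavier/Glorot is the more common choice for sigmoid.)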
        self.W1 = np.random.randn(input_size, hidden_size) * np.sqrt(2/input_size)
        self.b1 = np.zeros((1, hidden_size))
        self.W2 = np.random.randn(hidden_size, 1) * np.sqrt(2/hidden_size)
        self.b2 = np.zeros((1, 1))
        self.lr = learning_rate

    def sigmoid(self, z):
        return 1 / (1 + np.exp(-z))

    def sigmoid_derivative(self, a):
        # Takes the activation a = sigmoid(z), not z: sigmoid'(z) = a * (1 - a)
        return a * (1 - a)

    def forward(self, X):
        self.z1 = np.dot(X, self.W1) + self.b1
        self.a1 = self.sigmoid(self.z1)
        self.z2 = np.dot(self.a1, self.W2) + self.b2
        self.a2 = self.sigmoid(self.z2)
        return self.a2

    def backward(self, X, y, output):
        m = y.shape[0]
        
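        # Error at output layer. (output - y) is the gradient of the binary
        # cross-entropy loss w.r.t. z2 when the output activation is a sigmoid.
        dz2 = output - y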
        dW2 = np.dot(self.a1.T, dz2) / m
        db2 = np.sum(dz2, axis=0, keepdims=True) / m
        
        # Error at Hidden Layer (Backpropagated)
        dz1 = np.dot(dz2, self.W2.T) * self.sigmoid_derivative(self.a1)
        dW1 = np.dot(X.T, dz1) / m
        db1 = np.sum(dz1, axis=0, keepdims=True) / m
        
        # Update Weights
        self.W1 -= self.lr * dW1
        self.b1 -= self.lr * db1
        self.W2 -= self.lr * dW2
        self.b2 -= self.lr * db2

    def train(self, X, y, epochs=1000):
        for i in range(epochs):
            output = self.forward(X)
            self.backward(X, y, output)
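            if i % 100 == 0:
                # Logged for monitoring only: this is mean squared error, while the
                # update in backward() follows the binary cross-entropy gradient.
                loss = np.mean(np.square(y - output))
                print(f"Epoch {i}, Loss: {loss:.4f}")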
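
The update `dz2 = output - y` in `backward` matches the gradient of the binary cross-entropy loss with a sigmoid output, whereas the value printed in `train` is a mean squared error. The sketch below checks this numerically for a single made-up sample with a central finite difference. The helper names `bce` and `mse` and the values of `z`, `y`, and `eps` are illustrative, and the MSE here includes a factor of 1/2 for convenience.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def bce(z, y):
    # Binary cross-entropy for one sample, as a function of the pre-activation z
    a = sigmoid(z)
    return -(y * np.log(a) + (1 - y) * np.log(1 - a))

def mse(z, y):
    # Squared error for one sample (with a factor 1/2), as a function of z
    return 0.5 * (sigmoid(z) - y) ** 2

z, y, eps = 0.3, 1.0, 1e-6
a = sigmoid(z)

# Central finite differences approximate dLoss/dz
num_bce = (bce(z + eps, y) - bce(z - eps, y)) / (2 * eps)
num_mse = (mse(z + eps, y) - mse(z - eps, y)) / (2 * eps)

# BCE gradient w.r.t. z is a - y
print(np.isclose(num_bce, a - y))
# MSE gradient w.r.t. z carries an extra a * (1 - a) factor
print(np.isclose(num_mse, (a - y) * a * (1 - a)))
```

In other words, the network is effectively trained on cross-entropy, and the printed MSE is a monitoring metric rather than the objective being minimized.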



In [41]:
# Initialize and Train
input_dim = X_train.shape[1]  # 12 here: 3 one-hot Geography columns + 9 other features
model = ChurnANN(input_size=input_dim, hidden_size=6, learning_rate=0.1)
model.train(X_train, Y_train, epochs=2000)



Epoch 0, Loss: 0.5045
Epoch 100, Loss: 0.1547
Epoch 200, Loss: 0.1505
Epoch 300, Loss: 0.1464
Epoch 400, Loss: 0.1428
Epoch 500, Loss: 0.1400
Epoch 600, Loss: 0.1382
Epoch 700, Loss: 0.1369
Epoch 800, Loss: 0.1361
Epoch 900, Loss: 0.1355
Epoch 1000, Loss: 0.1351
Epoch 1100, Loss: 0.1348
Epoch 1200, Loss: 0.1345
Epoch 1300, Loss: 0.1342
Epoch 1400, Loss: 0.1339
Epoch 1500, Loss: 0.1336
Epoch 1600, Loss: 0.1333
Epoch 1700, Loss: 0.1330
Epoch 1800, Loss: 0.1327
Epoch 1900, Loss: 0.1324


In [42]:
# Evaluate
predictions = (model.forward(X_test) > 0.5).astype(int)
accuracy = np.mean(predictions == Y_test)
print(f"Test Accuracy: {accuracy * 100:.2f}%")

Test Accuracy: 82.20%
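
Accuracy alone can be hard to interpret when one class is much more frequent than the other. As a hedged sketch, the snippet below compares accuracy against a majority-class baseline and computes confusion-matrix counts on made-up labels. The arrays `y_true` and `y_pred` are illustrative and are not the model's actual predictions.

```python
import numpy as np

# Toy labels and predictions (made up), shaped (n, 1) like Y_test and predictions
y_true = np.array([0, 0, 0, 0, 1, 1, 0, 0, 1, 0]).reshape(-1, 1)
y_pred = np.array([0, 0, 0, 0, 1, 0, 0, 1, 0, 0]).reshape(-1, 1)

accuracy = np.mean(y_pred == y_true)

# Accuracy obtained by always predicting the most frequent class
majority_baseline = max(np.mean(y_true == 0), np.mean(y_true == 1))

# Confusion-matrix counts
tp = int(np.sum((y_pred == 1) & (y_true == 1)))
fp = int(np.sum((y_pred == 1) & (y_true == 0)))
fn = int(np.sum((y_pred == 0) & (y_true == 1)))
tn = int(np.sum((y_pred == 0) & (y_true == 0)))

precision = tp / (tp + fp)
recall = tp / (tp + fn)

print(f"Accuracy: {accuracy:.2f}, majority baseline: {majority_baseline:.2f}")
print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
print(f"Precision: {precision:.2f}, recall: {recall:.2f}")
```

Running the same computation on `Y_test` and `predictions` would show how much of the reported test accuracy goes beyond simply predicting the majority class.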

