<a href="https://colab.research.google.com/github/mishad01/Data-Science-Machine-Learning/blob/main/Thesis%20topic%20practice/1_(ANN)_Artificial_Neural_Network_using_Weight_Initialization_Tricks.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
#Importing libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
warnings.filterwarnings('ignore')

In [2]:
#Importing dataset
dataset = pd.read_csv('/content/drive/MyDrive/Thesis Practice/Churn_Modelling.csv')
dataset

Unnamed: 0,RowNumber,CustomerId,Surname,CreditScore,Geography,Gender,Age,Tenure,Balance,NumOfProducts,HasCrCard,IsActiveMember,EstimatedSalary,Exited
0,1,15634602,Hargrave,619,France,Female,42,2,0.00,1,1,1,101348.88,1
1,2,15647311,Hill,608,Spain,Female,41,1,83807.86,1,0,1,112542.58,0
2,3,15619304,Onio,502,France,Female,42,8,159660.80,3,1,0,113931.57,1
3,4,15701354,Boni,699,France,Female,39,1,0.00,2,0,0,93826.63,0
4,5,15737888,Mitchell,850,Spain,Female,43,2,125510.82,1,1,1,79084.10,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
9995,9996,15606229,Obijiaku,771,France,Male,39,5,0.00,2,1,0,96270.64,0
9996,9997,15569892,Johnstone,516,France,Male,35,10,57369.61,1,1,1,101699.77,0
9997,9998,15584532,Liu,709,France,Female,36,7,0.00,1,0,1,42085.58,1
9998,9999,15682355,Sabbatini,772,Germany,Male,42,3,75075.31,2,1,0,92888.52,1


### Explanation:
- **`X = dataset.iloc[:, 3:13]`**
  - `iloc` is the Pandas indexer for selecting data by integer position (it is used with square brackets, not called like a method).
  - `:` indicates selection of all rows.
  - `3:13` specifies the columns starting from index **3** up to, but not including, index **13**.
  - This selects the features (independent variables) for machine learning training.

- **`Y = dataset.iloc[:, 13]`**
  - Selects all rows but only the column at index **13**.
  - This is typically the target variable (dependent variable) in supervised machine learning.

### Example Breakdown (for **Churn_Modelling.csv**):
The dataset has the following columns (first row shown):

| RowNumber | CustomerId | Surname | CreditScore | Geography | Gender | Age | Tenure | Balance | NumOfProducts | HasCrCard | IsActiveMember | EstimatedSalary | Exited |
|-----------|------------|---------|-------------|-----------|--------|-----|--------|---------|---------------|-----------|----------------|-----------------|--------|
| 1         | 15634602   | Hargrave| 619         | France    | Female | 42  | 2      | 0.00    | 1             | 1         | 1              | 101348.88       | 1      |

- `X` will select columns from **CreditScore** to **EstimatedSalary** (index 3 to 12).
- `Y` will select the **Exited** column (index 13).

### Purpose:
- `X` typically represents the feature set (input data for model training).
- `Y` is the target variable (output data the model learns to predict).
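
The slicing described above can be checked on a one-row frame. This is a minimal sketch: `demo`, `X_demo`, and `Y_demo` are illustrative names, and the single row is the first record shown in the table above.

```python
import pandas as pd

columns = ["RowNumber", "CustomerId", "Surname", "CreditScore", "Geography",
           "Gender", "Age", "Tenure", "Balance", "NumOfProducts", "HasCrCard",
           "IsActiveMember", "EstimatedSalary", "Exited"]
row = [1, 15634602, "Hargrave", 619, "France", "Female", 42, 2, 0.00,
       1, 1, 1, 101348.88, 1]
demo = pd.DataFrame([row], columns=columns)  # illustrative one-row frame

X_demo = demo.iloc[:, 3:13]  # positions 3..12: the end of the slice is exclusive
Y_demo = demo.iloc[:, 13]    # position 13 only, returned as a Series

print(list(X_demo.columns))  # CreditScore ... EstimatedSalary (10 columns)
print(Y_demo.name)           # Exited

# Label-based slicing with .loc is inclusive on both ends and selects the same block.
same_block = X_demo.equals(demo.loc[:, "CreditScore":"EstimatedSalary"])
print(same_block)
```

Positional slicing breaks silently if the column order changes, so the `.loc` form can be the safer choice when the column names are known.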


In [3]:
# Features: columns 3-12 (CreditScore through EstimatedSalary)
X = dataset.iloc[:, 3:13]
# Target: column 13 (Exited)
Y = dataset.iloc[:, 13]

# Categorical Features (Geography and Gender)

### Explanation:
- **`pd.get_dummies()`**
  - This function converts categorical variables (like **Geography** and **Gender**) into dummy/indicator variables for machine learning models.
  - These dummy variables allow models to interpret categorical data as numeric features.

- **Parameters:**
  - `X['Geography']` and `X['Gender']`: Select the **Geography** and **Gender** columns from the dataset `X`.
  - `drop_first=True`: Drops one dummy variable to avoid the **dummy variable trap** (where all dummy columns add up to one, causing multicollinearity in linear models).

---

### Example Breakdown:
#### Suppose your data looks like:
| Geography | Gender |
|-----------|--------|
| France    | Female |
| Spain     | Male   |
| Germany   | Female |

After running `pd.get_dummies()`:

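#### For Geography (`drop_first=True`):
| Germany | Spain |
|---------|-------|
| 0       | 0     |
| 0       | 1     |
| 1       | 0     |

**France** is dropped as the reference category because it comes first alphabetically. Recent pandas versions display these values as `False`/`True` rather than `0`/`1`.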

#### For Gender (`drop_first=True`):
| Male |
|------|
| 0    |
| 1    |
| 0    |

**Female** is dropped as the reference category.

---

### Why is this necessary?
Most machine learning models need numerical input, so categorical variables must be encoded as dummy variables first. Dropping the first dummy column removes information that the remaining columns already imply; this matters most for linear models, where the redundant column causes multicollinearity.
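
A small sketch of the encoding on the three-row example, assuming a throwaway frame named `demo`. It shows why one column is redundant (each row of the full encoding sums to 1) and which columns survive `drop_first=True`.

```python
import pandas as pd

demo = pd.DataFrame({"Geography": ["France", "Spain", "Germany"],
                     "Gender": ["Female", "Male", "Female"]})

# Full encoding: one column per category, ordered alphabetically.
full = pd.get_dummies(demo["Geography"])
# Reduced encoding: the first category (France) becomes the implicit baseline.
reduced = pd.get_dummies(demo["Geography"], drop_first=True)

row_sums = full.sum(axis=1).tolist()
print(list(full.columns))     # ['France', 'Germany', 'Spain']
print(row_sums)               # [1, 1, 1] -> any one column is implied by the others
print(list(reduced.columns))  # ['Germany', 'Spain']
print(reduced.astype(int))    # a France row is all zeros
```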


In [4]:
geography = pd.get_dummies(X['Geography'], drop_first=True)
gender = pd.get_dummies(X['Gender'], drop_first=True)

In [5]:
geography

Unnamed: 0,Germany,Spain
0,False,False
1,False,True
2,False,False
3,False,False
4,False,True
...,...,...
9995,False,False
9996,False,False
9997,False,False
9998,True,False


### Explanation:
- **`pd.concat()`**
  - This function concatenates DataFrames or Series along a specific axis.
  
- **Parameters:**
  - `[X, geography, gender]`: A list of DataFrames or Series to concatenate.
    - `X`: The original feature set.
    - `geography` and `gender`: Dummy variables created earlier.
  - `axis=1`: Concatenates along columns (horizontally).

---

### Purpose:
This code combines the original features (`X`) with the dummy variables (`geography` and `gender`) to create a complete feature set for model training.

### Example:
If `X` originally looked like:

| CreditScore | Geography | Gender |
|-------------|-----------|--------|
| 619         | France    | Female |
| 700         | Spain     | Male   |

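And `geography` and `gender` looked like:

| Germany | Spain |
|---------|-------|
| 0       | 0     |
| 0       | 1     |

| Male |
|------|
| 0    |
| 1    |

After concatenation (`X = pd.concat([X, geography, gender], axis=1)`), `X` will become:

| CreditScore | Geography | Gender | Germany | Spain | Male |
|-------------|-----------|--------|---------|-------|------|
| 619         | France    | Female | 0       | 0     | 0    |
| 700         | Spain     | Male   | 0       | 1     | 1    |

`pd.concat` only appends the new columns. The original `Geography` and `Gender` text columns remain in `X` until they are dropped in a separate step.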
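
The concatenation step can be sketched end to end on a tiny frame. The names ending in `_demo`, `combined`, and `final` are illustrative; the third row (772, Germany, Male) is borrowed from the dataset preview so that a Germany column appears.

```python
import pandas as pd

X_demo = pd.DataFrame({"CreditScore": [619, 700, 772],
                       "Geography": ["France", "Spain", "Germany"],
                       "Gender": ["Female", "Male", "Male"]})
geography_demo = pd.get_dummies(X_demo["Geography"], drop_first=True)
gender_demo = pd.get_dummies(X_demo["Gender"], drop_first=True)

# axis=1 places the frames side by side, aligned on the row index.
combined = pd.concat([X_demo, geography_demo, gender_demo], axis=1)
print(list(combined.columns))
# ['CreditScore', 'Geography', 'Gender', 'Germany', 'Spain', 'Male']

# The text columns are still present after concat and must be dropped explicitly.
final = combined.drop(columns=["Geography", "Gender"])
print(list(final.columns))
# ['CreditScore', 'Germany', 'Spain', 'Male']
```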

---



In [11]:
X = pd.concat([X,geography,gender],axis=1)
X

Unnamed: 0,Germany,Spain,Male
0,False,False,False
1,False,True,False
2,False,False,False
3,False,False,False
4,False,True,False
...,...,...,...
9995,False,False,True
9996,False,False,True
9997,False,False,False
9998,True,False,True


In [14]:
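# Drop the original text columns now that they are one-hot encoded.
# Do not combine assignment with inplace=True: drop(..., inplace=True) returns None.
# errors='ignore' keeps the cell re-runnable; the original run raised
# KeyError: "['Geography', 'Gender'] not found in axis", most likely because an earlier run had already removed the columns.
X = X.drop(['Geography', 'Gender'], axis=1, errors='ignore')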

In [15]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.2, random_state=0)
X_test


Unnamed: 0,Germany,Spain,Male
9394,True,False,False
898,False,False,False
2398,False,True,False
5906,False,False,True
2343,True,False,True
...,...,...,...
1037,False,False,False
2899,False,False,False
9549,False,True,True
2740,True,False,True


### **Why Do We Scale Features?**
- Different features in your dataset may have very different ranges.
  - Example:
    - `Age`: ranges from 18 to 70
    - `Income`: ranges from 20,000 to 500,000
- Machine learning models (like logistic regression, KNN, and neural networks) often perform better when features are on similar scales.

### **What Is Feature Scaling?**
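Feature scaling brings all the features to a **similar scale** without changing the relative differences between values within a feature. Gradient-based training then tends to converge faster, and distance-based models are not dominated by the features with the largest raw values.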

---

### **What Is `StandardScaler` Doing?**
`StandardScaler` transforms your data by:
1. **Subtracting the Mean** (so the new mean becomes `0`)
2. **Dividing by the Standard Deviation** (so the standard deviation becomes `1`)

This formula is applied to each value:

\[
X_{scaled} = \frac{X - \mu}{\sigma}
\]

Where:
- \(X\) is the original value
- \(\mu\) is the mean of the feature
- \(\sigma\) is the standard deviation

---

### **Example**
Suppose `X_train` has an "Age" feature:

| Age |
|-----|
| 20  |
| 35  |
| 50  |

The mean of this column is 35. `StandardScaler` divides by the population standard deviation, which is about 12.25 here (the sample standard deviation would be 15).

After scaling:

| Scaled Age |
|------------|
| -1.22      |
| 0          |
| 1.22       |
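
The arithmetic for this example can be verified directly. The sketch below applies the formula by hand with NumPy and compares the result with `StandardScaler`, which divides by the population standard deviation (about 12.25 for these three values, not the sample value of 15). The variable names are illustrative.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

age = np.array([[20.0], [35.0], [50.0]])  # one feature, three samples

mu = age.mean()     # 35.0
sigma = age.std()   # NumPy default is the population std: sqrt(150) ~ 12.25
manual = (age - mu) / sigma

scaled = StandardScaler().fit_transform(age)

print(manual.ravel())               # approximately [-1.22, 0.0, 1.22]
print(np.allclose(manual, scaled))  # True: both apply (x - mean) / std
```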

---

### **Why Not Scale `y_train`?**
`y_train` is the target variable, not a feature. Here it is the binary `Exited` label (0 or 1) that the model must predict as-is, so scaling it would only distort the class labels.

---

### **Output of `fit_transform()` and `transform()`**
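After scaling:
- `fit_transform()` learns the mean and standard deviation from `X_train` and scales it; `transform()` applies those same training statistics to `X_test`, so nothing from the test set leaks into preprocessing.
- `X_train` and `X_test` become NumPy arrays with standardized values.
- They will be ready for training your machine learning model.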
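
A short sketch of the difference between the two calls, using made-up one-column arrays named `train` and `test`. The scaler is fitted on the training data only, and the test data is transformed with the stored training mean and standard deviation.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

train = np.array([[20.0], [35.0], [50.0]])  # illustrative training values
test = np.array([[35.0], [65.0]])           # illustrative test values

scaler = StandardScaler()
train_scaled = scaler.fit_transform(train)  # learns mean_ and scale_ from train
test_scaled = scaler.transform(test)        # reuses them; the scaler is not refitted

print(scaler.mean_, scaler.scale_)  # statistics come from the training data only
print(test_scaled.ravel())          # 35 maps to 0; 65 lies about 2.45 std above the mean
```

Because the test values are scaled with training statistics, they are not guaranteed to have mean 0 or standard deviation 1, which is expected.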


In [17]:
#Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

In [18]:
import keras  # Keras library for building neural networks
from keras.models import Sequential  # Sequential model for stacking layers linearly
from keras.layers import Dense  # Fully connected layer for the network
from keras.layers import LeakyReLU, PReLU, ELU  # Activation functions for introducing non-linearity
from keras.layers import Dropout  # Regularization technique to prevent overfitting


In [24]:
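# Initializing the ANN
classifier = Sequential()  # Empty neural network

# Adding the input layer and the first hidden layer (11 input features after encoding)
classifier.add(Dense(units=6, kernel_initializer='he_uniform', activation='relu', input_dim=11))

# Adding the second hidden layer
classifier.add(Dense(units=6, kernel_initializer='he_uniform', activation='relu'))

# Adding the output layer (sigmoid outputs a churn probability)
classifier.add(Dense(units=1, kernel_initializer='glorot_uniform', activation='sigmoid'))

# Compiling the ANN. The original compile() call had no arguments; the optimizer,
# loss, and metric below are assumed, common choices for binary classification.
classifier.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Fitting the ANN to the training set. The original fit() call passed no data and the run failed with
# AttributeError: 'NoneType' object has no attribute 'shape'.
# batch_size and epochs are assumed values, not taken from the original notebook.
classifier.fit(X_train, y_train, batch_size=10, epochs=100)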
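
The two initializers named in the layers above draw weights from a uniform range whose width depends on the layer size. The NumPy sketch below reproduces those ranges for the layer shapes used here; it is an illustration of the standard formulas, not Keras's own code, and the helper function names are hypothetical.

```python
import numpy as np

def he_uniform_limit(fan_in):
    # He uniform: weights drawn from U(-limit, limit) with limit = sqrt(6 / fan_in)
    return np.sqrt(6.0 / fan_in)

def glorot_uniform_limit(fan_in, fan_out):
    # Glorot (Xavier) uniform: limit = sqrt(6 / (fan_in + fan_out))
    return np.sqrt(6.0 / (fan_in + fan_out))

rng = np.random.default_rng(0)

# First hidden layer: 11 inputs -> 6 units, ReLU, he_uniform
limit_hidden = he_uniform_limit(11)
w_hidden = rng.uniform(-limit_hidden, limit_hidden, size=(11, 6))

# Output layer: 6 inputs -> 1 unit, sigmoid, glorot_uniform
limit_out = glorot_uniform_limit(6, 1)
w_out = rng.uniform(-limit_out, limit_out, size=(6, 1))

print(round(limit_hidden, 4), round(limit_out, 4))
```

He initialization is usually paired with ReLU-family activations and Glorot with sigmoid or tanh, which matches the pairing in the layers above.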