<a href="https://colab.research.google.com/github/shfaizan/GenAI/blob/main/Assignment_2_Faizan_Shaikh.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Study the back-propagation algorithm.

From the backpropagation readings, I conclude that backpropagation is the key algorithm for training neural networks: it adjusts the network's weights based on the error observed during training. It operates in two main phases: forward propagation, where the input data moves through the network to generate an output, and backward propagation, where the error at the output is propagated back through the network to determine how each weight should change. Repeating this process lowers the training error; how well the network generalizes still has to be checked on held-out data.

Overall, the backpropagation algorithm has four main steps:

* Forward pass.
* Error calculation.
* Backward pass.
* Weights update.
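
The four steps above can be sketched for the smallest possible case: a single sigmoid neuron with one weight and one bias, trained on one example. The function name `train_step`, the starting values, and the learning rate are illustrative assumptions and are not taken from the assignment material.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_step(w, b, x, target, learning_rate):
    # 1. Forward pass: compute the neuron's output
    a = sigmoid(w * x + b)
    # 2. Error calculation: squared-error loss 0.5 * (target - a)^2
    error = target - a
    loss = 0.5 * error ** 2
    # 3. Backward pass: chain rule, dL/dz = -(target - a) * a * (1 - a)
    delta = -error * a * (1.0 - a)
    grad_w = delta * x
    grad_b = delta
    # 4. Weight update: step against the gradient
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b
    return w, b, loss

# Hypothetical starting point and training example
w, b = 0.5, 0.0
losses = []
for _ in range(200):
    w, b, loss = train_step(w, b, x=1.0, target=1.0, learning_rate=0.5)
    losses.append(loss)

print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
```

The loss recorded on each step is computed before that step's update, the same ordering the NeuralNetwork class later in this notebook uses when it prints its loss.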

## Implement a classifier for the loan data with Decision as the output attribute. Prepare the data as needed. Submit the notebook file.

**Prepare the data:**

* Load the dataset.
* Handle missing values.
* Encode categorical variables.
* Split the dataset into training and testing sets.

**Implementation of a Classifier:**

* Build the neural network with a suitable algorithm. Libraries such as TensorFlow or PyTorch would work; this notebook uses a hand-written NumPy network and scikit-learn's MLPClassifier.

* Train the classifier on the training dataset.

* Evaluate the classifier on the test dataset.


### Imports and Dataset Loading:

In [8]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score, classification_report

# Load the dataset
# I have named it loan.xlsx
data = pd.read_excel('loan.xlsx')


Here, we import necessary libraries for data manipulation, preprocessing, neural network modeling, and evaluation. Then, we load the dataset loan.xlsx into a Pandas DataFrame named data.

### Handling Missing Values and Encoding Categorical Variables:

In [9]:
# Handle missing values (if any)
data = data.dropna()

# Encode categorical variables
label_encoders = {}
categorical_columns = ['Sex', 'Res_status', 'Telephone', 'Occupation', 'Job_status', 'Liab_ref', 'Acc_ref', 'Decision']

for col in categorical_columns:
    le = LabelEncoder()
    data[col] = le.fit_transform(data[col].astype(str))
    label_encoders[col] = le


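This part removes any rows with missing values and encodes the categorical columns, including the target Decision, with LabelEncoder, which replaces each distinct label with an integer code. Each fitted encoder is stored in label_encoders so the codes can be mapped back to the original labels later.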
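
To make the encoding step concrete, here is a minimal pure-Python sketch of the same idea: sort the distinct labels and number them from 0, which is the ordering LabelEncoder also uses. The helper `fit_label_encoder` and the sample values are hypothetical and only illustrate the mapping.

```python
def fit_label_encoder(values):
    # Distinct labels in sorted order; the position in this list is the code
    classes = sorted(set(values))
    to_code = {label: code for code, label in enumerate(classes)}
    return classes, to_code

# Hypothetical sample of the Res_status column
res_status = ["rent", "owner", "owner", "rent"]

classes, to_code = fit_label_encoder(res_status)
encoded = [to_code[v] for v in res_status]   # label -> integer
decoded = [classes[c] for c in encoded]      # integer -> label

print(classes)
print(encoded)
print(decoded)
```

Keeping `classes` around plays the role that label_encoders plays in the notebook: without it, a predicted code could not be turned back into a readable label.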

### Splitting Data and Standardization:

In [11]:
# Separate features and target variable
X = data.drop(columns='Decision')
y = data['Decision']

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Standardize the data
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)


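This part splits the data into features (X) and the target variable (y). Then, it holds out 20% of the rows as a test set using train_test_split(). Finally, it standardizes the features using StandardScaler(): the scaler is fitted on the training set only and then applied to the test set, so no test-set statistics leak into preprocessing.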
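
The fit-on-train, apply-to-test pattern can be sketched for a single column with the standard library. This is an illustration under assumed values, not the notebook's code; it uses the population standard deviation, which is also what StandardScaler uses.

```python
from statistics import fmean, pstdev

def fit_scaler(column):
    # Learn the mean and population standard deviation from training data only
    return fmean(column), pstdev(column)

def transform(column, mean, std):
    # Apply previously learned statistics to any data, train or test
    return [(v - mean) / std for v in column]

# Hypothetical Age values
train_age = [20.0, 30.0, 40.0]
test_age = [50.0]

mean, std = fit_scaler(train_age)
train_scaled = transform(train_age, mean, std)
test_scaled = transform(test_age, mean, std)

print(train_scaled)
print(test_scaled)
```

The scaled training column has mean 0 and standard deviation 1, while the test value is expressed in the training set's units rather than its own.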

### Neural Network Model Definition and Training:

In [13]:
# Define the neural network architecture and necessary functions
class NeuralNetwork:
    def __init__(self, input_size, hidden_size, output_size):
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.output_size = output_size

        # Initialize weights
        self.W1 = np.random.randn(self.input_size, self.hidden_size)
        self.W2 = np.random.randn(self.hidden_size, self.output_size)

    # Sigmoid is used as the activation in both layers. sigmoid() squashes
    # values into (0, 1) in the forward pass; sigmoid_derivative() supplies
    # the slope needed by the backward pass.


    def sigmoid(self, z):
        return 1 / (1 + np.exp(-z))

    def sigmoid_derivative(self, a):
        # Expects the activation a = sigmoid(z), not the pre-activation z
        return a * (1 - a)

    def forward(self, X):
        self.Z1 = np.dot(X, self.W1)
        self.A1 = self.sigmoid(self.Z1)
        self.Z2 = np.dot(self.A1, self.W2)
        self.A2 = self.sigmoid(self.Z2)
        return self.A2

    def backward(self, X, y, output, learning_rate):
        # Output-layer error and its gradient through the sigmoid
        self.error = y - output
        self.output_gradient = self.error * self.sigmoid_derivative(output)

        # Propagate the error back to the hidden layer
        self.A1_error = self.output_gradient.dot(self.W2.T)
        self.A1_gradient = self.A1_error * self.sigmoid_derivative(self.A1)

        # error is (y - output), so += moves the weights down the MSE gradient
        self.W2 += self.A1.T.dot(self.output_gradient) * learning_rate
        self.W1 += X.T.dot(self.A1_gradient) * learning_rate

    def train(self, X, y, learning_rate=0.01, epochs=10000):
        y = y.values.reshape(-1, 1)  # Convert Series to array and reshape
        for epoch in range(epochs):
            output = self.forward(X)
            self.backward(X, y, output, learning_rate)
            if (epoch + 1) % 1000 == 0:
                loss = np.mean(np.square(y - output))
                print(f'Epoch {epoch + 1}, Loss: {loss}')

    def predict(self, X):
        output = self.forward(X)
        return np.round(output)

# Initialize the neural network
input_size = X_train.shape[1]
hidden_size = 10  # You can change the hidden layer size
output_size = 1

nn = NeuralNetwork(input_size, hidden_size, output_size)

# Train the neural network
nn.train(X_train, y_train, learning_rate=0.01, epochs=10000)

Epoch 1000, Loss: 0.10206939770344144
Epoch 2000, Loss: 0.07309702825677016
Epoch 3000, Loss: 0.05451387765249801
Epoch 4000, Loss: 0.04371849339706554
Epoch 5000, Loss: 0.03655970885435705
Epoch 6000, Loss: 0.03253101923231155
Epoch 7000, Loss: 0.029031888314536562
Epoch 8000, Loss: 0.027179945044461827
Epoch 9000, Loss: 0.025950909727851648
Epoch 10000, Loss: 0.025051719122141335


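This part defines a neural network class with methods for initialization, forward pass, backward pass, training, and prediction. Then, it initializes a neural network object (nn) with the specified input, hidden, and output sizes, and trains it on the training data. The printed loss is the mean squared error on the training set, which falls from about 0.102 at epoch 1000 to about 0.025 at epoch 10000. The evaluation in the next section uses scikit-learn's MLPClassifier rather than this class.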
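
One detail of the class is easy to misread: sigmoid_derivative() is called with activations (output and A1), not with pre-activations, and it relies on the identity sigmoid'(z) = a * (1 - a) where a = sigmoid(z). The following standalone check compares that identity with a finite-difference estimate; the helper names and test points are chosen here for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_derivative_from_activation(a):
    # Same formula as the class method: takes a = sigmoid(z)
    return a * (1.0 - a)

def numerical_derivative(f, z, eps=1e-6):
    # Central finite difference as an independent estimate of f'(z)
    return (f(z + eps) - f(z - eps)) / (2 * eps)

results = []
for z in (-2.0, 0.0, 1.5):
    analytic = sigmoid_derivative_from_activation(sigmoid(z))
    numeric = numerical_derivative(sigmoid, z)
    results.append((z, analytic, numeric))
    print(f"z={z:+.1f}  analytic={analytic:.6f}  numeric={numeric:.6f}")
```

Passing z itself instead of sigmoid(z) would give z * (1 - z), which is not the slope of the sigmoid, so the argument must always be an activation.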

### Classifier Model Training and Evaluation:

In [14]:
# Implement the classifier
clf = MLPClassifier(hidden_layer_sizes=(10, 10), max_iter=1000, random_state=42)
clf.fit(X_train, y_train)

# Make predictions
y_pred = clf.predict(X_test)

# Evaluate the classifier
print("Accuracy:", accuracy_score(y_test, y_pred))
print("Classification Report:\n", classification_report(y_test, y_pred))


Accuracy: 0.7325581395348837
Classification Report:
               precision    recall  f1-score   support

           0       0.78      0.65      0.71        43
           1       0.70      0.81      0.75        43

    accuracy                           0.73        86
   macro avg       0.74      0.73      0.73        86
weighted avg       0.74      0.73      0.73        86





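This part initializes and trains a classifier (MLPClassifier) with two hidden layers of 10 units each on the training data. Then, it makes predictions on the test data and evaluates them with the accuracy score and a classification report. On the 86 test rows the accuracy is about 0.73, with class 0 showing higher precision (0.78) and class 1 showing higher recall (0.81).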
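
To show how the numbers in the report relate to each other, the sketch below recomputes them from a confusion matrix. The counts are an assumption: the notebook does not print a confusion matrix, and these values were inferred from the support, recall, and accuracy figures in the report above.

```python
# Rows are true classes, columns are predicted classes.
# Inferred counts (assumption), not output produced by the notebook.
confusion = [[28, 15],
             [8, 35]]

def precision(cm, k):
    # Of everything predicted as class k, the fraction that is truly k
    predicted_k = sum(row[k] for row in cm)
    return cm[k][k] / predicted_k

def recall(cm, k):
    # Of everything that is truly class k, the fraction predicted as k
    return cm[k][k] / sum(cm[k])

def f1(cm, k):
    p, r = precision(cm, k), recall(cm, k)
    return 2 * p * r / (p + r)

total = sum(sum(row) for row in confusion)
accuracy = sum(confusion[k][k] for k in range(len(confusion))) / total

print(f"accuracy: {accuracy:.4f}")
for k in range(len(confusion)):
    print(f"class {k}: precision={precision(confusion, k):.2f} "
          f"recall={recall(confusion, k):.2f} f1={f1(confusion, k):.2f}")
```

With these counts, the classifier wrongly labels 15 rows of class 0 as class 1 and 8 rows of class 1 as class 0, which is why recall is lower for class 0.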

### Displaying Target Variable Mapping and User Input Prediction Function:

In [16]:
# Display the mapping of the target variable
decision_mapping = dict(enumerate(label_encoders['Decision'].classes_))
print("Decision Mapping:", decision_mapping)

# Function to predict decision based on user input
def predict_decision():
    # Gather user input for each feature
    user_data = {}
    for col in X.columns:
        if col in categorical_columns:
            print(f"Enter {col} ({', '.join(label_encoders[col].classes_)}):")
            user_input = input().strip()
            while user_input not in label_encoders[col].classes_:
                print(f"Invalid input. Please enter one of the following: {', '.join(label_encoders[col].classes_)}")
                user_input = input().strip()
            user_data[col] = label_encoders[col].transform([user_input])[0]
        else:
            print(f"Enter {col} (numeric value):")
            user_input = input().strip()
            while not user_input.replace('.', '', 1).isdigit():
                print("Invalid input. Please enter a numeric value.")
                user_input = input().strip()
            user_data[col] = float(user_input)

    # Convert user input into DataFrame
    user_df = pd.DataFrame(user_data, index=[0])

    # Standardize the user input
    user_df_scaled = scaler.transform(user_df)

    # Predict the decision
    prediction = clf.predict(user_df_scaled)
    # Map the numeric prediction back to its label with the stored Decision encoder
    predicted_decision = label_encoders['Decision'].inverse_transform(prediction)

    print(f"The predicted decision is: {predicted_decision[0]}")

# Call the function to predict decision based on user input
predict_decision()


Decision Mapping: {0: 'accept', 1: 'reject'}
Enter Sex (F, M):
M
Enter Age (numeric value):
25
Enter Time_at_address (numeric value):
14
Enter Res_status (owner, rent):
owner
Enter Telephone (given, not_given):
given
Enter Occupation (creative_, driver, executive, guard_etc, labourer, manager, office_st, productio, professio, sales, semi_pro, unemploye):
manager
Enter Job_status (governmen, military, private_s, retired, self_empl, student, unemploye):
private_s
Enter Time_employed (numeric value):
5
Enter Time_bank (numeric value):
14
Enter Liab_ref (f, t):
t
Enter Acc_ref (given, oth_inst_):
given
Enter Home_Expn (numeric value):
145
Enter Balance (numeric value):
2000
The predicted decision is: accept


This part displays the mapping of the target variable and defines a function predict_decision() to predict the decision based on user input. Finally, it calls the predict_decision() function to demonstrate how to use it.

In the example above, I entered the values shown at each prompt, and the classifier predicted accept.
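
The numeric check in predict_decision() removes one dot and then calls isdigit(), so it rejects inputs such as -5 or 1e3. Whether negative values are meaningful for columns like Balance is not stated in the notebook, so the following is only a sketch of a more permissive check; `parse_number` is a hypothetical helper, not part of the original code.

```python
import math

def parse_number(text):
    # Return the input as a float, or None if it is not a finite number
    try:
        value = float(text.strip())
    except ValueError:
        return None
    # float() also accepts "nan" and "inf", which are not usable feature values
    return value if math.isfinite(value) else None

for raw in ["145", "-5", "1e3", "abc", "nan", ""]:
    print(repr(raw), "->", parse_number(raw))
```

In the input loop, the prompt would be repeated while parse_number() returns None, and the returned float would be stored in user_data.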