### Prediction

In [52]:
import tensorflow as tf
from tensorflow.keras.models import load_model
import pickle
import pandas as pd
import numpy as np

In [53]:
## Load the trained model
model=load_model('model.h5')

## Load the encoders and the scaler
## Note: despite its name, label_encoder_geo holds the one-hot encoder for 'Geography'
with open('onehot_encoder_geo.pkl','rb') as file:
    label_encoder_geo=pickle.load(file)

with open('label_encoder_gender.pkl', 'rb') as file:
    label_encoder_gender = pickle.load(file)

with open('scaler.pkl', 'rb') as file:
    scaler = pickle.load(file)



In [54]:
# Example input data
input_data = {
    'CreditScore': 600,
    'Geography': 'France',
    'Gender': 'Male',
    'Age': 40,
    'Tenure': 3,
    'Balance': 60000,
    'NumOfProducts': 2,
    'HasCrCard': 1,
    'IsActiveMember': 1,
    'EstimatedSalary': 50000
}

In [55]:
# One-hot encode 'Geography'
geo_encoded = label_encoder_geo.transform([[input_data['Geography']]]).toarray()
geo_encoded_df = pd.DataFrame(geo_encoded, columns=label_encoder_geo.get_feature_names_out(['Geography']))
geo_encoded_df




|   | Geography_France | Geography_Germany | Geography_Spain |
|---|---|---|---|
| 0 | 1.0 | 0.0 | 0.0 |


In [56]:
input_df = pd.DataFrame([input_data])
input_df

|   | CreditScore | Geography | Gender | Age | Tenure | Balance | NumOfProducts | HasCrCard | IsActiveMember | EstimatedSalary |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 600 | France | Male | 40 | 3 | 60000 | 2 | 1 | 1 | 50000 |


In [57]:
## Encode categorical variables
input_df['Gender']=label_encoder_gender.transform(input_df['Gender'])
input_df

|   | CreditScore | Geography | Gender | Age | Tenure | Balance | NumOfProducts | HasCrCard | IsActiveMember | EstimatedSalary |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 600 | France | 1 | 40 | 3 | 60000 | 2 | 1 | 1 | 50000 |


In [58]:
## Concatenate the one-hot encoded columns
input_df = pd.concat([input_df.drop('Geography', axis=1), geo_encoded_df], axis=1)
input_df

|   | CreditScore | Gender | Age | Tenure | Balance | NumOfProducts | HasCrCard | IsActiveMember | EstimatedSalary | Geography_France | Geography_Germany | Geography_Spain |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 600 | 1 | 40 | 3 | 60000 | 2 | 1 | 1 | 50000 | 1.0 | 0.0 | 0.0 |


In [60]:
## Scale the input data
input_scaled = scaler.transform(input_df)
input_scaled

array([[-0.53598516,  0.91324755,  0.10479359, -0.69539349, -0.25781119,
         0.80843615,  0.64920267,  0.97481699, -0.87683221,  1.00150113,
        -0.57946723, -0.57638802]])

In [61]:
## Predict churn
prediction = model.predict(input_scaled)
prediction

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 147ms/step


array([[0.04899597]], dtype=float32)

In [68]:
prediction_proba = prediction[0][0]
prediction_proba

0.048995968

In [69]:
if prediction_proba > 0.5:
    print('The customer is likely to churn.')
else:
    print('The customer is not likely to churn.')

The customer is not likely to churn.


### Explanation

Loading the Model and Encoders: The notebook first loads the trained model from `model.h5`, then the fitted encoders and scaler from their pickle files. This matters because the encoders and scaler hold what was learned during training, such as the category mappings and the scaling statistics. New input must pass through those same fitted objects; otherwise the model receives values in a form it was never trained on.
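
As a minimal sketch of why the fitted objects are saved and reloaded rather than refitted, the round trip below pickles a small scaler and checks that the restored copy transforms data identically. The toy values and the use of `StandardScaler` are assumptions for illustration; the notebook's `scaler.pkl` may hold a different scaler type.

```python
import pickle

import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy training data (hypothetical values): two features, three rows.
train = np.array([[600.0, 40.0], [700.0, 50.0], [800.0, 30.0]])

# Fit once at training time; the learned statistics live on the object.
scaler = StandardScaler().fit(train)

# Serialize and restore, as the notebook does via scaler.pkl
# (done in memory here so the sketch leaves no files behind).
restored = pickle.loads(pickle.dumps(scaler))

# The restored scaler should transform unseen rows exactly like the original.
new_row = np.array([[650.0, 45.0]])
same = np.allclose(scaler.transform(new_row), restored.transform(new_row))
print(same)
```

Refitting a scaler on the single prediction row instead would discard the training statistics, which is why the pickled object is reused.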

Preparing Input Data: The instructor provides a sample input as a dictionary of customer features, including credit score, geography, gender, age, and tenure. The dictionary is converted to a single-row DataFrame so that the same transformations used in training can be applied to it. This step is needed because the model operates on a numeric array with a fixed set of columns, not on a raw record.
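
One way to make this step more defensive is sketched below: a small helper that checks for missing features and fixes the column order before any encoding happens. The helper name `to_input_frame` and the constant `EXPECTED_COLUMNS` are hypothetical and do not appear in the notebook; the column names come from the sample input.

```python
import pandas as pd

# Column names taken from the sample input in the notebook.
EXPECTED_COLUMNS = [
    "CreditScore", "Geography", "Gender", "Age", "Tenure", "Balance",
    "NumOfProducts", "HasCrCard", "IsActiveMember", "EstimatedSalary",
]


def to_input_frame(record: dict) -> pd.DataFrame:
    """Return a single-row DataFrame with columns in a fixed order.

    Hypothetical helper: raises KeyError if a required feature is missing.
    """
    missing = [col for col in EXPECTED_COLUMNS if col not in record]
    if missing:
        raise KeyError(f"missing features: {missing}")
    return pd.DataFrame([record])[EXPECTED_COLUMNS]


sample = {
    "CreditScore": 600, "Geography": "France", "Gender": "Male", "Age": 40,
    "Tenure": 3, "Balance": 60000, "NumOfProducts": 2, "HasCrCard": 1,
    "IsActiveMember": 1, "EstimatedSalary": 50000,
}
frame = to_input_frame(sample)
print(frame.shape)
```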

Data Transformation: The lecturer explains how the categorical features are converted to numbers: `Geography` is one-hot encoded into three indicator columns, and `Gender` is label encoded into a single integer column. This transformation is necessary because the neural network computes on numeric inputs and cannot consume raw strings.

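Scaling the Data: After encoding, the input is scaled with the scaler fitted during training. Scaling puts the features on a comparable range, which can significantly affect the model's performance. When features differ widely in magnitude (for example, `Balance` in the tens of thousands versus `Tenure` in single digits), gradient-based training tends to converge more slowly and the large-valued features can dominate. At prediction time the input must be scaled in exactly the same way as the training data.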

Making Predictions: With the encoded and scaled input ready, the model outputs a churn probability (about 0.049 for the sample customer). A threshold of 0.5 converts that probability into a decision: above the threshold the customer is predicted to churn, otherwise not.
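
The thresholding step can be wrapped in a small helper, sketched below. The function name `classify_churn` and its labels are hypothetical; the probability value is the one printed in the notebook. The threshold is a tunable choice, and lowering it flags more customers at the cost of more false alarms.

```python
import numpy as np


def classify_churn(probability: float, threshold: float = 0.5) -> str:
    """Map a churn probability to a label (hypothetical helper)."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be in [0, 1]")
    return "churn" if probability > threshold else "no churn"


# Same shape as the model.predict output for one input row: (1, 1).
prediction = np.array([[0.04899597]], dtype=np.float32)
proba = float(prediction[0][0])

print(classify_churn(proba))                  # default cutoff of 0.5
print(classify_churn(proba, threshold=0.03))  # a lower cutoff flags this customer
```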

Together, these steps ensure that the input has the same format, column order, and scale as the data the model was trained on. A trained model produces meaningful predictions only when new inputs pass through the same preprocessing as the training data.
