![Description](card.jpeg)


<div style="background-color:#000000; padding: 20px; border-radius: 5px; box-shadow: 0 1px 3px 0 rgba(0, 0, 0, 0.1); border-left: 4px solid #007acc;">
    <h1 style="font-size:22px; font-family:'Helvetica Neue', Helvetica, Arial, sans-serif; color:#ffffff;"><b>About Dataset</b></h1>
    <p style="font-size:16px; font-family:'Helvetica Neue', Helvetica, Arial, sans-serif; line-height: 1.5em; text-indent: 20px; color:#ffffff; font-style: italic;">
        This Credit Card Eligibility Dataset covers factors that determine or influence a person's eligibility for a credit card. It includes demographic and financial variables such as gender, employment status, family size, total income, education, and occupation. Together, these variables support the evaluation of creditworthiness and eligibility.
    </p>
</div>

| Column Name      | Description                                                                 |
|------------------|-----------------------------------------------------------------------------|
| ID               | An identifier for each individual (customer).                               |
| Gender           | The gender of the individual (binary, 0/1).                                 |
| Own_car          | A binary feature indicating whether the individual owns a car.              |
| Own_property     | A binary feature indicating whether the individual owns a property.         |
| Work_phone       | A binary feature indicating whether the individual has a work phone.        |
| Phone            | A binary feature indicating whether the individual has a phone.             |
| Email            | A binary feature indicating whether the individual has provided an email address. |
| Unemployed       | A binary feature indicating whether the individual is unemployed.           |
| Num_children     | The number of children the individual has.                                  |
| Num_family       | The total number of family members.                                         |
| Account_length   | The length of the individual's account with a bank or financial institution.|
| Total_income     | The total income of the individual.                                         |
| Age              | The age of the individual.                                                  |
| Years_employed   | The number of years the individual has been employed.                       |
| Income_type      | The type of income (e.g., employed, self-employed, etc.).                   |
| Education_type   | The education level of the individual.                                      |
| Family_status    | The family status of the individual.                                        |
| Housing_type     | The type of housing the individual lives in.                                |
| Occupation_type  | The type of occupation the individual is engaged in.                        |
| Target           | The target variable for the classification task, indicating whether the individual is eligible for a credit card (encoded as 1/0). |
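
The binary columns described in the table can be sanity-checked before modeling. The following is a minimal sketch, assuming those columns are encoded as 0/1; the helper name `non_binary_columns` and the small sample frame are illustrative and not part of the original analysis.

```python
import pandas as pd

# Columns the table describes as binary. Treating Gender and Target as
# 0/1 encoded as well is an assumption based on the sample rows.
BINARY_COLUMNS = ["Gender", "Own_car", "Own_property", "Work_phone",
                  "Phone", "Email", "Unemployed", "Target"]

def non_binary_columns(df, columns=BINARY_COLUMNS):
    """Return the listed columns that are present but hold values other than 0/1."""
    bad = []
    for col in columns:
        if col in df.columns and not df[col].dropna().isin([0, 1]).all():
            bad.append(col)
    return bad

# Tiny illustrative frame (hypothetical values, not the real dataset).
sample = pd.DataFrame({
    "Gender": [1, 0, 1],
    "Own_car": [1, 0, 2],   # 2 is deliberately invalid
    "Target": [1, 0, 0],
})
print(non_binary_columns(sample))  # ['Own_car']
```

An empty list means every listed column that exists in the frame contains only 0 and 1.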


<div style="background-color:#000000; padding: 20px; border-radius: 5px; box-shadow: 0 1px 3px 0 rgba(0, 0, 0, 0.1); border-left: 4px solid #007acc;">
    <h1 style="font-size:22px; font-family:'Helvetica Neue', Helvetica, Arial, sans-serif; color:#ffffff;"><b>Aims and Objectives</b></h1>
    <p style="font-size:16px; font-family:'Helvetica Neue', Helvetica, Arial, sans-serif; line-height: 1.5em; text-indent: 20px; color:#ffffff; font-style: italic;">
        The goal of this Credit Card Eligibility Dataset analysis is to investigate the factors influencing a person's eligibility for a credit card and to evaluate their creditworthiness.
    </p>
</div>


<p style="background-color: #000000; font-family: 'Courier New', Courier, monospace; font-size: 28px; text-align: center; color: #FFFFFF; padding: 10px; border-radius: 20px;">
    🔍📈 Data Wrangling
</p>


<p style="background-color: #000000; font-family: 'Courier New', Courier, monospace; font-size: 28px; text-align: center; color: #FFFFFF; padding: 10px; border-radius: 20px;">
    Import Libraries
</p>


In [1]:
# Data handling and visualization
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import tensorflow as tf
# Train/test split and grid search with cross-validation
from sklearn.model_selection import train_test_split, GridSearchCV
# Encoding, scaling, and polynomial feature expansion
from sklearn.preprocessing import LabelEncoder, StandardScaler, PolynomialFeatures
# Column-wise preprocessing, imputation, and pipelines
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
# Feature selection
from sklearn.feature_selection import SelectKBest, SelectFromModel, mutual_info_classif
# Models
from sklearn.ensemble import GradientBoostingClassifier
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
# Evaluation metrics
from sklearn.metrics import (accuracy_score, precision_score, recall_score, f1_score,
                             confusion_matrix, classification_report)
# Suppress warnings to keep notebook output readable
import warnings
warnings.filterwarnings("ignore")



<p style="background-color: #000000; font-family: 'Courier New', Courier, monospace; font-size: 28px; text-align: center; color: #FFFFFF; padding: 10px; border-radius: 20px;">
    Dataset Overview
</p>


In [2]:
import html
import io

import pandas as pd
from IPython.display import display, HTML

# CSS shared by every table the report renders
TABLE_STYLE = """
<style>
    table {
        margin: 0 auto;
        border-collapse: collapse;
        color: #333;
    }
    th, td {
        padding: 8px;
        text-align: left;
    }
    tr:nth-child(even) {background-color: #f2f2f2;}
    tr:hover {background-color: #ddd;}
</style>
"""

def colored_line(color='#ff00ff'):
    """Return an HTML horizontal rule in the given color."""
    return f"<hr style='border: 2px solid {color};'>"

def show_section(title, body_html, heading_color, dash_color):
    """Display a centered heading framed by two colored rules, followed by the body."""
    heading = f"<h2 style='color:{heading_color}; text-align:center;'><strong>{title}</strong></h2>"
    display(HTML(colored_line(dash_color)))
    display(HTML(heading))
    display(HTML(colored_line(dash_color)))
    display(HTML(body_html))

def print_dataset_analysis(df, n_top=5, heading_color='#d35400', line_color='#ff00ff', dash_color='#27ae60'):
    # line_color is accepted for backward compatibility but is not used; the rules use dash_color
    def section(title, body_html):
        show_section(title, body_html, heading_color, dash_color)

    # First and last rows
    section(f"📋 Top {n_top} Rows of the Dataset", TABLE_STYLE + df.head(n_top).to_html(border=0, index=False))
    section(f"🔽 Bottom {n_top} Rows of the Dataset", TABLE_STYLE + df.tail(n_top).to_html(border=0, index=False))

    # Summary statistics for the numeric columns
    section("📊 Dataset Summary Statistics", TABLE_STYLE + df.describe().to_html(border=0))

    # Column dtypes and non-null counts. The text is escaped because df.info() output
    # starts with "<class ...>", which a browser would otherwise parse as a tag.
    buffer = io.StringIO()
    df.info(buf=buffer)
    section("🔍 Dataset Information", f"<pre>{html.escape(buffer.getvalue())}</pre>")

    # Dataset shape
    section("📏 Dataset Shape", f"<p>Rows: {df.shape[0]}, Columns: {df.shape[1]}</p>")

    # Missing values per column
    null_values = df.isnull().sum().to_frame().reset_index()
    null_values.columns = ['Column', 'Missing Values']
    section("❓ Missing Values in Dataset", TABLE_STYLE + null_values.to_html(index=False, border=0))

    # Exact duplicate rows
    section("♻️ Duplicate Rows in Dataset", f"<p>Number of duplicate rows: {df.duplicated().sum()}</p>")

# Load the DataFrame
df = pd.read_csv("dataset.csv")

# Display the report with a custom color scheme
print_dataset_analysis(df, heading_color='#d35400', line_color='#174747', dash_color='#174747')


ID,Gender,Own_car,Own_property,Work_phone,Phone,Email,Unemployed,Num_children,Num_family,Account_length,Total_income,Age,Years_employed,Income_type,Education_type,Family_status,Housing_type,Occupation_type,Target
5008804,1,1,1,1,0,0,0,0,2,15,427500.0,32.868574,12.435574,Working,Higher education,Civil marriage,Rented apartment,Other,1
5008806,1,1,1,0,0,0,0,0,2,29,112500.0,58.793815,3.104787,Working,Secondary / secondary special,Married,House / apartment,Security staff,0
5008808,0,0,1,0,1,1,0,0,1,4,270000.0,52.321403,8.353354,Commercial associate,Secondary / secondary special,Single / not married,House / apartment,Sales staff,0
5008812,0,0,1,0,0,0,1,0,1,20,283500.0,61.504343,0.0,Pensioner,Higher education,Separated,House / apartment,Other,0
5008815,1,1,1,1,1,1,0,0,2,5,270000.0,46.193967,2.10545,Working,Higher education,Married,House / apartment,Accountants,0


ID,Gender,Own_car,Own_property,Work_phone,Phone,Email,Unemployed,Num_children,Num_family,Account_length,Total_income,Age,Years_employed,Income_type,Education_type,Family_status,Housing_type,Occupation_type,Target
5148694,0,0,0,0,0,0,0,0,2,20,180000.0,56.400884,0.542106,Pensioner,Secondary / secondary special,Civil marriage,Municipal apartment,Laborers,1
5149055,0,0,1,1,1,0,0,0,2,19,112500.0,43.360233,7.375921,Commercial associate,Secondary / secondary special,Married,House / apartment,Other,1
5149729,1,1,1,0,0,0,0,0,2,21,90000.0,52.296762,4.711938,Working,Secondary / secondary special,Married,House / apartment,Other,1
5149838,0,0,1,0,1,1,0,0,2,32,157500.0,33.914454,3.627727,Pensioner,Higher education,Married,House / apartment,Medicine staff,1
5150337,1,0,1,0,0,0,0,0,1,13,112500.0,25.15589,3.266323,Working,Secondary / secondary special,Single / not married,Rented apartment,Laborers,1


Statistic,ID,Gender,Own_car,Own_property,Work_phone,Phone,Email,Unemployed,Num_children,Num_family,Account_length,Total_income,Age,Years_employed,Target
count,9709.0,9709.0,9709.0,9709.0,9709.0,9709.0,9709.0,9709.0,9709.0,9709.0,9709.0,9709.0,9709.0,9709.0,9709.0
mean,5076105.0,0.348749,0.3677,0.671542,0.217427,0.287671,0.087548,0.174683,0.422804,2.182614,27.270059,181228.2,43.784093,5.66473,0.132145
std,40802.7,0.476599,0.482204,0.469677,0.412517,0.4527,0.28265,0.379716,0.767019,0.932918,16.648057,99277.31,11.625768,6.342241,0.338666
min,5008804.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,27000.0,20.504186,0.0,0.0
25%,5036955.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2.0,13.0,112500.0,34.059563,0.92815,0.0
50%,5069449.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,2.0,26.0,157500.0,42.741466,3.761884,0.0
75%,5112986.0,1.0,1.0,1.0,0.0,1.0,0.0,0.0,1.0,3.0,41.0,225000.0,53.567151,8.200031,0.0
max,5150479.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,19.0,20.0,60.0,1575000.0,68.863837,43.020733,1.0
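
Because `Target` takes only the values 0 and 1, its mean in the summary statistics is the share of rows labeled 1. The sketch below works through that arithmetic using only the figures reported in the summary (9,709 rows, mean 0.132145); the variable names are illustrative.

```python
# Figures taken from the summary statistics above.
n_rows = 9709
target_mean = 0.132145  # mean of a 0/1 column = fraction of rows equal to 1

positives = round(n_rows * target_mean)
negatives = n_rows - positives

print(f"Target = 1: {positives} rows ({target_mean:.1%})")
print(f"Target = 0: {negatives} rows ({1 - target_mean:.1%})")
# Target = 1: 1283 rows (13.2%)
# Target = 0: 8426 rows (86.8%)
```

With roughly 13% of rows in class 1, a model that always predicts 0 would already reach about 86.8% accuracy, so precision, recall, and F1 are likely to be more informative than accuracy alone. Passing `stratify=y` to `train_test_split` is one way to keep this ratio in both splits.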


Column,Missing Values
ID,0
Gender,0
Own_car,0
Own_property,0
Work_phone,0
Phone,0
Email,0
Unemployed,0
Num_children,0
Num_family,0
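
When a dataset has many columns, a per-column listing of zeros is hard to scan. As a minimal sketch, the helper below reports only the columns that actually contain missing values, along with their percentage; the function name `missing_summary` and the demo frame are hypothetical and not part of the original notebook.

```python
import numpy as np
import pandas as pd

def missing_summary(df):
    """Return count and percentage of missing values for columns that have any."""
    counts = df.isnull().sum()
    summary = pd.DataFrame({
        "Missing Values": counts,
        "Percent": (counts / len(df) * 100).round(2),
    })
    return summary[summary["Missing Values"] > 0]

# Illustrative frame with one gap (hypothetical values, not the real dataset).
demo = pd.DataFrame({
    "Total_income": [427500.0, np.nan, 270000.0, 283500.0],
    "Age": [32.9, 58.8, 52.3, 61.5],
})
print(missing_summary(demo))
```

An empty result means no column has missing values, which matches the zeros listed above for the columns shown.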


In [3]:
!pip install autoviz

Collecting autoviz
  Downloading autoviz-0.1.904-py3-none-any.whl.metadata (14 kB)
Collecting xlrd (from autoviz)
  Downloading xlrd-2.0.1-py2.py3-none-any.whl.metadata (3.4 kB)
Collecting pyamg (from autoviz)
  Downloading pyamg-5.1.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (8.1 kB)
Collecting xgboost<1.7,>=0.82 (from autoviz)
  Downloading xgboost-1.6.2-py3-none-manylinux2014_x86_64.whl.metadata (1.8 kB)
Collecting pandas-dq>=1.29 (from autoviz)
  Downloading pandas_dq-1.29-py3-none-any.whl.metadata (19 kB)
Collecting hvplot>=0.9.2 (from autoviz)
  Downloading hvplot-0.10.0-py3-none-any.whl.metadata (15 kB)
Collecting seaborn>0.12.2 (from autoviz)
  Downloading seaborn-0.13.2-py3-none-any.whl.metadata (5.4 kB)
Collecting nltk (from autoviz)
  Downloading nltk-3.8.1-py3-none-any.whl.metadata (2.8 kB)
Downloading autoviz-0.1.904-py3-none-any.whl (67 kB)