<div style="padding:20px;
            color:white;
            margin:10;
            font-size:170%;
            text-align:left;
            display:fill;
            border-radius:5px;
            background-color:#64A36F;
            overflow:hidden;
            font-weight:700"><span style='color:#CDA63A'>|</span> Introduction</div>





Telco companies need to acquire new customers while keeping customer churn low, because every lost customer is costly. The challenge is twofold: accurately predicting whether an individual customer will churn, and identifying the main factors that drive churn.

<div style="padding:20px;
            color:white;
            margin:10;
            font-size:170%;
            text-align:left;
            display:fill;
            border-radius:5px;
            background-color:#64A36F;
            overflow:hidden;
            font-weight:700"><span style='color:#CDA63A'>|</span> Table of Contents</div>


<a id="toc"></a>
- [1. Set-up](#1)
    - [1.1 Import Libraries](#1.1)
    - [1.2 Import Data](#1.2)
    - [1.3 Understanding data set characteristics](#1.3)
    - [1.4 Identifying dataset attributes](#1.4)
- [2. Exploring data set](#2)
    - [2.1 Quick overview](#2.1)
- [3. Data preprocessing](#3)
    - [3.1 Dealing with missing values](#3.1)
    - [3.2 Dealing with error: could not convert string to float](#3.2)
    - [3.3 Dealing with duplicated values](#3.3)
    - [3.4 Creating numerical and categorical lists](#3.4)

<a id="1.1"></a>
## <b>1.1 <span>Import Libraries</span></b> 

In [1]:
import numpy as np
import pandas as pd

<a id="1.2"></a>
## <b>1.2 <span>Import Data</span></b> 

In [2]:
df_ = pd.read_csv("/kaggle/input/telco-customer-churn/WA_Fn-UseC_-Telco-Customer-Churn.csv")

<a id="1.3"></a>
## <b>1.3 <span>Understanding data set characteristics</span></b> 

Each row of the dataset represents one customer, and each column holds one attribute of that customer. The columns fall into four groups:

* Churn: A single column indicating whether the customer terminated their service within the past month.

* Services: The services each customer has signed up for, including phone, multiple lines, internet, online security, online backup, device protection, tech support, and streaming TV and movies.

* Customer account information: Details of the customer's account, such as tenure, contract type, payment method, paperless billing preference, monthly charges, and total charges.

* Demographic information: The customer's demographic characteristics, such as gender, age range (the SeniorCitizen flag), and whether they have a partner or dependents.
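
The four groups above can be written down as plain lists of column names, which makes it easy to slice the frame by group later. The following is a minimal sketch, not part of the original notebook: the names `COLUMN_GROUPS`, `split_by_group`, and `sample` are hypothetical, the column names come from the attribute table in section 1.4, and the two-row `sample` frame holds made-up values that stand in for `df_`.

```python
import pandas as pd

# Column names taken from the attribute table in section 1.4,
# arranged into the four groups described above.
COLUMN_GROUPS = {
    "target": ["Churn"],
    "services": [
        "PhoneService", "MultipleLines", "InternetService", "OnlineSecurity",
        "OnlineBackup", "DeviceProtection", "TechSupport", "StreamingTV",
        "StreamingMovies",
    ],
    "account": [
        "tenure", "Contract", "PaymentMethod", "PaperlessBilling",
        "MonthlyCharges", "TotalCharges",
    ],
    "demographic": ["gender", "SeniorCitizen", "Partner", "Dependents"],
}


def split_by_group(df, groups=COLUMN_GROUPS):
    """Return {group name: sub-DataFrame}, keeping only columns present in df."""
    return {
        name: df[[col for col in cols if col in df.columns]]
        for name, cols in groups.items()
    }


# Hypothetical two-row sample standing in for df_; the values are made up.
sample = pd.DataFrame({
    "customerID": ["0001-AAAA", "0002-BBBB"],
    "gender": ["Female", "Male"],
    "tenure": [1, 34],
    "InternetService": ["DSL", "Fiber optic"],
    "MonthlyCharges": [29.85, 56.95],
    "Churn": ["No", "Yes"],
})

parts = split_by_group(sample)
for name, part in parts.items():
    print(name, list(part.columns))
```

With the real data, `split_by_group(df_)` would return the same four keys. `customerID` is deliberately left out of every group because it identifies a row rather than describing the customer.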

<a id="1.4"></a>
## <b>1.4 <span>Identifying dataset attributes</span></b> 

<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.15.3/css/all.min.css" />

<table>
  <tr>
    <th>Attribute</th>
    <th>Icon</th>
    <th>Description</th>
  </tr>
  <tr>
    <td>customerID</td>
    <td><i class="fas fa-id-card"></i></td>
    <td>Customer ID</td>
  </tr>
  <tr>
    <td>gender</td>
    <td><i class="fas fa-venus-mars"></i></td>
    <td>Whether the customer is male or female</td>
  </tr>
  <tr>
    <td>SeniorCitizen</td>
    <td><i class="fas fa-user-alt"></i></td>
    <td>Whether the customer is a senior citizen (1, 0)</td>
  </tr>
  <tr>
    <td>Partner</td>
    <td><i class="fas fa-users"></i></td>
    <td>Whether the customer has a partner (Yes, No)</td>
  </tr>
  <tr>
    <td>Dependents</td>
    <td><i class="fas fa-child"></i></td>
    <td>Whether the customer has dependents (Yes, No)</td>
  </tr>
  <tr>
    <td>tenure</td>
    <td><i class="fas fa-hourglass-half"></i></td>
    <td>Number of months the customer has stayed with the company</td>
  </tr>
  <tr>
    <td>PhoneService</td>
    <td><i class="fas fa-phone"></i></td>
    <td>Whether the customer has a phone service (Yes, No)</td>
  </tr>
  <tr>
    <td>MultipleLines</td>
    <td><i class="fas fa-project-diagram"></i></td>
    <td>Whether the customer has multiple lines (Yes, No, No phone service)</td>
  </tr>
  <tr>
    <td>InternetService</td>
    <td><i class="fas fa-wifi"></i></td>
    <td>Customer’s internet service provider (DSL, Fiber optic, No)</td>
  </tr>
  <tr>
    <td>OnlineSecurity</td>
    <td><i class="fas fa-shield-alt"></i></td>
    <td>Whether the customer has online security (Yes, No, No internet service)</td>
  </tr>
  <tr>
    <td>OnlineBackup</td>
    <td><i class="fas fa-hdd"></i></td>
    <td>Whether the customer has online backup (Yes, No, No internet service)</td>
  </tr>
  <tr>
    <td>DeviceProtection</td>
    <td><i class="fas fa-shield-virus"></i></td>
    <td>Whether the customer has device protection (Yes, No, No internet service)</td>
  </tr>
  <tr>
    <td>TechSupport</td>
    <td><i class="fas fa-headset"></i></td>
    <td>Whether the customer has tech support (Yes, No, No internet service)</td>
  </tr>
  <tr>
    <td>StreamingTV</td>
    <td><i class="fas fa-tv"></i></td>
    <td>Whether the customer has streaming TV service (Yes, No, No internet service)</td>
  </tr>
  <tr>
    <td>StreamingMovies</td>
    <td><i class="fas fa-film"></i></td>
    <td>Whether the customer has streaming movies service (Yes, No, No internet service)</td>
  </tr>
  <tr>
    <td>Contract</td>
    <td><i class="fas fa-file-contract"></i></td>
    <td>Indicates the type of contract (Month-to-month, One year, Two year)</td>
  </tr>
  <tr>
    <td>PaperlessBilling</td>
    <td><i class="fas fa-file-invoice-dollar"></i></td>
    <td>Whether the customer has paperless billing (Yes, No)</td>
  </tr>
  <tr>
    <td>PaymentMethod</td>
    <td><i class="fas fa-credit-card"></i></td>
    <td>Indicates the payment method (Electronic check, Mailed check, Bank transfer (automatic), Credit card (automatic))</td>
  </tr>
  <tr>
    <td>MonthlyCharges</td>
    <td><i class="fas fa-dollar-sign"></i></td>
    <td>Indicates the current monthly subscription cost of the customer</td>
  </tr>
  <tr>
    <td>TotalCharges</td>
    <td><i class="fas fa-dollar-sign"></i></td>
    <td>Indicates the total charges paid by the customer so far</td>
  </tr>
  <tr>
    <td>Churn</td>
    <td><i class="fas fa-sign-out-alt"></i></td>
    <td>Indicates whether the customer churned (Yes, No)</td>
  </tr>
</table>
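
The value lists in the table can double as a quick consistency check on the loaded data. The sketch below is one possible way to do this and is not part of the original notebook: `EXPECTED_VALUES`, `unexpected_values`, and `sample` are hypothetical names, only a subset of the columns is covered, and the sample frame contains a deliberate misspelling to show what a finding looks like.

```python
import pandas as pd

# Allowed values copied from the attribute table above (subset of columns).
EXPECTED_VALUES = {
    "SeniorCitizen": {0, 1},
    "Partner": {"Yes", "No"},
    "Dependents": {"Yes", "No"},
    "PhoneService": {"Yes", "No"},
    "MultipleLines": {"Yes", "No", "No phone service"},
    "InternetService": {"DSL", "Fiber optic", "No"},
    "OnlineSecurity": {"Yes", "No", "No internet service"},
    "Contract": {"Month-to-month", "One year", "Two year"},
    "PaperlessBilling": {"Yes", "No"},
    "PaymentMethod": {
        "Electronic check", "Mailed check",
        "Bank transfer (automatic)", "Credit card (automatic)",
    },
}


def unexpected_values(df, expected=EXPECTED_VALUES):
    """Map each checked column to the set of values not listed in `expected`."""
    report = {}
    for col, allowed in expected.items():
        if col not in df.columns:
            continue  # skip columns that this frame does not contain
        extra = set(df[col].dropna().unique()) - allowed
        if extra:
            report[col] = extra
    return report


# Hypothetical sample; "Fibre" is a deliberate misspelling to trigger a finding.
sample = pd.DataFrame({
    "SeniorCitizen": [0, 1, 0],
    "InternetService": ["DSL", "Fibre", "No"],
    "Contract": ["Month-to-month", "Two year", "One year"],
})

print(unexpected_values(sample))
```

Running `unexpected_values(df_)` on the real frame should return an empty dictionary if the data matches the table. Any non-empty entry points at a column worth inspecting before preprocessing.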
