#### SyriaTel Customer Churn Prediction
Customer churn poses a major threat to SyriaTel, a telecom service provider, leading to revenue loss and increased marketing expenses. Retaining existing customers is significantly more cost-effective than acquiring new ones. This project aims to build a predictive model that identifies customers at high risk of churning, allowing SyriaTel to act before they leave.

By analyzing customer usage behavior, billing patterns, and service history, we can develop a churn prediction system that lets the business launch targeted retention efforts, such as personalized offers or proactive service follow-ups.

#### Problem Understanding

Business Challenge: SyriaTel wants to reduce churn by identifying customers likely to cancel their service.

Technical Approach: Build a binary classification model using customer activity, billing data, and service interactions to predict churn risk. The output will guide SyriaTel in taking data-driven actions to retain customers and minimize revenue loss.
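To make the idea of a churn risk score concrete, here is a minimal sketch of a binary classifier that outputs a probability per customer. It runs on synthetic data from `make_classification`, not on the SyriaTel dataset; the 15% positive rate, the feature count, and the choice of logistic regression are all illustrative assumptions rather than the model this project settles on.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: roughly 15% positives to mimic an
# imbalanced churn label (assumed ratio, not taken from the dataset).
X, y = make_classification(
    n_samples=1000, n_features=8, weights=[0.85, 0.15], random_state=42
)

# Stratify so train and test keep the same churn proportion.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Probability of the positive class serves as the churn risk score.
churn_risk = model.predict_proba(X_test)[:, 1]
auc = roc_auc_score(y_test, churn_risk)
print(f"ROC AUC on synthetic data: {auc:.3f}")
```

Ranking customers by `churn_risk` rather than by a hard 0/1 prediction is one way the business could prioritize retention outreach.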

In [2]:

# Import necessary libraries
# Data manipulation and visualization
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Ignore warnings
import warnings
warnings.filterwarnings('ignore')

# Machine learning libraries
from sklearn.model_selection import (
    train_test_split, cross_val_score, GridSearchCV, StratifiedKFold
)
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.feature_selection import SelectKBest, f_classif

# Classifiers
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.tree import DecisionTreeClassifier

# Evaluation metrics
from sklearn.metrics import (
    classification_report, confusion_matrix, roc_auc_score, roc_curve,
    precision_recall_curve, accuracy_score, precision_score, recall_score, f1_score
)

# Plot settings
plt.style.use('default')
sns.set(style='whitegrid', palette='husl')


## **Data Loading and Initial Exploration**

We start by loading the dataset and gaining an initial understanding of its structure before diving into preprocessing or modeling.

Key steps in this section include:
- Loading the data from a CSV file
- Inspecting the dataset shape and column types
- Previewing the first few rows
- Checking for any missing values
- Generating basic descriptive statistics
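The steps listed above can be sketched on a tiny in-memory sample before running them on the full file. The three rows and the column names below are hypothetical placeholders, not values from the actual dataset, and one cell is left empty on purpose so the missing-value check has something to find.

```python
import io

import pandas as pd

# Hypothetical three-row sample standing in for the real CSV;
# column names and values are illustrative assumptions.
sample_csv = io.StringIO(
    "state,account length,total day minutes,customer service calls,churn\n"
    "KS,128,265.1,1,False\n"
    "OH,107,,1,False\n"
    "NJ,137,243.4,0,True\n"
)

# 1. Load the data
sample_df = pd.read_csv(sample_csv)

# 2. Inspect shape and column types
print(sample_df.shape)
print(sample_df.dtypes)

# 3. Preview the first few rows
print(sample_df.head())

# 4. Count missing values per column
missing_per_column = sample_df.isnull().sum()
print(missing_per_column)

# 5. Basic descriptive statistics for numeric columns
print(sample_df.describe())
```

The same calls apply unchanged to the full `df` once it is loaded.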

In [8]:
# Load the dataset (this absolute path is specific to the author's machine; adjust it to your local copy)
df = pd.read_csv("C:/Users/user/Documents/MORINGA/Phase3/Phase3Project/TelcomData.csv")

In [9]:
df.shape

(3333, 21)