# 1. Business Understanding

## 1.1 Background

SyriaTel is one of the largest telecommunications providers in Syria, offering mobile services to a broad range of customers.  
In the highly competitive telecom industry, retaining existing customers is critical for sustaining profitability and market share.  

However, customer churn (subscribers discontinuing SyriaTel’s services) poses a significant business challenge.  
Each lost customer reduces revenue and raises costs, because replacing that customer requires spending on marketing and promotions.

---

## 1.2 Business Objective

The key objective of this project is to develop a **predictive model** that accurately identifies customers likely to churn.  
With such a model, SyriaTel will be able to:

- Take **proactive retention measures** for high-risk customers  
- Design **targeted marketing campaigns**  
- Improve **customer satisfaction** by addressing key pain points  

Ultimately, this will help **reduce churn rates**, save on customer acquisition costs, and maximize long-term profitability.

---

## 1.3 Stakeholders

| Stakeholder | Interest |
|---|---|
| **Executive Management** | Make strategic decisions on customer retention initiatives |
| **Marketing Team** | Design targeted offers and promotions |
| **Customer Service** | Improve customer engagement and support |
| **Data Science Team** | Develop and deploy predictive churn models |

---

## 1.4 Business Problem Statement

- Churn hurts both revenue and brand reputation.
- Acquiring a new customer costs significantly more than retaining an existing one.
- SyriaTel currently **lacks an early warning system** for customer churn.

By predicting churn probability, SyriaTel can intervene before losing the customer.
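
As a minimal sketch of how predicted probabilities could drive intervention, the snippet below ranks customers by churn probability and flags those above a cutoff. The customer IDs, scores, function name `flag_high_risk`, and the 0.6 threshold are all hypothetical; the actual cutoff would be a business decision, not something fixed by this project.

```python
def flag_high_risk(churn_probabilities, threshold=0.5):
    """Return customer IDs with probability >= threshold, highest risk first."""
    ranked = sorted(churn_probabilities.items(), key=lambda kv: kv[1], reverse=True)
    return [customer_id for customer_id, prob in ranked if prob >= threshold]


# Hypothetical model output: customer ID -> predicted churn probability
scores = {"C001": 0.64, "C002": 0.15, "C003": 0.47, "C004": 0.82}

# Assumed cutoff of 0.6, chosen only for illustration
flagged = flag_high_risk(scores, threshold=0.6)
print(flagged)
```

Lowering the threshold flags more customers (more churners caught, but more retention offers spent on customers who would have stayed), which is the trade-off the retention team would need to weigh.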

---

## 1.5 Project Goals

- **Build a classification model** using customer data to predict churn risk.
- **Identify key drivers** (features) contributing to churn.
- **Translate data insights into actionable business recommendations.**

---

## 1.6 Success Metrics

- **Model Performance:** Balance precision (the share of flagged customers who actually churn) and recall (the share of churners who are flagged), summarized with a metric such as the F1-score.
- **Business Impact:** Deliver recommendations that SyriaTel’s marketing and customer service teams can implement.
- **Actionable Insights:** Identify the features that most strongly predict churn, to inform business decisions.
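
To make the model performance metric concrete, here is a small illustrative calculation of precision, recall, and F1 for the churn class. The labels are made up for the example, and the helper `precision_recall_f1` is a hypothetical name; in practice a library routine such as scikit-learn's `f1_score` would likely be used instead.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall and F1 for the positive (churn) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1


# Hypothetical labels: 1 = churned, 0 = stayed
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```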

---

## 1.7 Project Scope and Limitations

- The model will be based on historical customer data available at the time of this project.
- Predictions will be probabilistic: they indicate likelihood, not certainty.
- Results and recommendations are constrained by the quality and depth of the dataset provided.

---

# 2. Data Understanding

## 2.1 Business Context Recap
The objective of this project is to predict customer churn for **SyriaTel**, a telecommunications company.  
Churn refers to customers discontinuing their service, which directly impacts company revenue.  
Understanding the patterns leading to churn helps SyriaTel retain customers through targeted actions.

---

## 2.2 Dataset Overview
The dataset contains records of SyriaTel customers, with attributes describing:

- Demographics (e.g., gender, age group)
- Account information (e.g., contract type, payment method)
- Service usage patterns (e.g., data usage, call minutes)
- Subscription details (e.g., international plan, voicemail plan)
- Customer status (churned or not)
  
---

## 2.3 Initial Data Checks
- Loaded the dataset successfully.
- Previewed the first few rows to understand the structure.
- Performed a `.info()` check to:
  - Verify data types (numerical, categorical)
  - Check for null or missing values
  - Confirm the size of the dataset
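
The checks above could look roughly like the following. The tiny DataFrame is a hypothetical stand-in with illustrative column names; in the project, `df` would instead be loaded from the SyriaTel data file (for example with `pd.read_csv`).

```python
import pandas as pd

# Hypothetical stand-in for the real dataset (column names are illustrative)
df = pd.DataFrame({
    "international_plan": ["no", "yes", "no", "no"],
    "total_day_minutes": [265.1, 161.6, None, 299.4],
    "customer_service_calls": [1, 1, 0, 2],
    "Churn": ["No", "No", "Yes", "No"],
})

print(df.head())            # preview the first rows
df.info()                   # column dtypes and non-null counts
n_rows, n_cols = df.shape   # size of the dataset
print(f"{n_rows} rows, {n_cols} columns")
```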
  
---

## 2.4 Dataset Summary
| Item | Value |
|---|---|
| Number of Rows | *To be filled after `df.shape` output* |
| Number of Columns | *To be filled after `df.shape` output* |
| Target Variable | `Churn` (Binary: Yes/No) |
| Data Types | Numerical, Categorical |
| Missing Values | *To be filled from `df.isnull().sum()` output* |
| Duplicates | *To be checked in Data Preparation* |
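
One possible way to collect the values for this table in a single pass is sketched below. The DataFrame is a hypothetical example (one missing value and one duplicated row were planted on purpose), so the numbers it produces say nothing about the real dataset.

```python
import pandas as pd

# Hypothetical data with one missing value and one exact duplicate row
df = pd.DataFrame({
    "total_day_minutes": [265.1, 161.6, None, 161.6],
    "customer_service_calls": [1, 1, 0, 1],
    "Churn": ["No", "No", "Yes", "No"],
})

summary = {
    "rows": df.shape[0],
    "columns": df.shape[1],
    "dtypes": df.dtypes.astype(str).to_dict(),
    "missing_per_column": df.isnull().sum().to_dict(),
    "duplicate_rows": int(df.duplicated().sum()),
}

for item, value in summary.items():
    print(f"{item}: {value}")
```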

---

## 2.5 Observations & Considerations
- The dataset contains both numerical and categorical features.
- The target variable may be imbalanced, since churners are typically a minority of customers; this will be checked in the EDA phase.
- Understanding variable distributions and relationships will guide feature engineering and model selection.
- No immediate anomalies observed in the preview, but a thorough analysis is planned in the EDA phase.
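
A quick way to validate class balance is to look at the counts and proportions of the target column. The series below is invented for illustration; only the `Churn` column name comes from the summary table above.

```python
import pandas as pd

# Hypothetical target column: 8 retained customers, 2 churners
churn = pd.Series(
    ["No", "No", "No", "Yes", "No", "No", "Yes", "No", "No", "No"],
    name="Churn",
)

counts = churn.value_counts()                 # absolute class counts
shares = churn.value_counts(normalize=True)   # class proportions
minority_share = shares.min()

print(counts)
print(f"Minority class share: {minority_share:.0%}")
```

If the minority share turns out to be small, plain accuracy becomes misleading, which is one reason the success metrics focus on precision, recall, and F1.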

---

## 2.6 Next Steps
- Check for duplicates and missing data in **Data Preparation**.
- Explore distributions, correlations, and target class balance in **EDA**.
- Prepare data for machine learning models by encoding, scaling, and splitting.
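
The encoding, splitting, and scaling steps could be sketched as follows using pandas alone. Everything here is an assumption for illustration: the data, the column names, the 25% hold-out size, and the choice of standardization. A real pipeline would more likely rely on scikit-learn utilities such as `train_test_split` and `StandardScaler`.

```python
import pandas as pd

# Hypothetical data: 8 retained customers and 4 churners
df = pd.DataFrame({
    "international_plan": ["no", "yes", "no", "no", "yes", "no",
                           "no", "yes", "no", "no", "yes", "no"],
    "total_day_minutes": [265.1, 161.6, 243.4, 299.4, 166.7, 223.4,
                          218.2, 157.0, 184.5, 258.6, 129.1, 187.7],
    "Churn": ["No", "No", "No", "Yes", "No", "No",
              "Yes", "No", "No", "Yes", "Yes", "No"],
})

# 1. Encode: binary target to 0/1, categorical feature to an indicator column
y = (df["Churn"] == "Yes").astype(int)
X = pd.get_dummies(
    df.drop(columns="Churn"),
    columns=["international_plan"],
    drop_first=True,
    dtype=int,
)

# 2. Split: hold out 25% of each class so both sets keep the class ratio
test_idx = df.groupby("Churn").sample(frac=0.25, random_state=42).index
X_train, X_test = X.drop(index=test_idx), X.loc[test_idx]
y_train, y_test = y.drop(index=test_idx), y.loc[test_idx]

# 3. Scale: statistics come from the training set only, to avoid leakage
mean = X_train["total_day_minutes"].mean()
std = X_train["total_day_minutes"].std()
X_train = X_train.assign(total_day_minutes=(X_train["total_day_minutes"] - mean) / std)
X_test = X_test.assign(total_day_minutes=(X_test["total_day_minutes"] - mean) / std)

print(X_train.shape, X_test.shape)
```

Splitting before scaling matters: computing the mean and standard deviation on the full dataset would let information from the test set influence training.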
