## Churn Data: Raw Loading & Parquet Export

### By:
jdg

### Date:
2026-02-21

### Description:

Reads the raw Telco Customer Churn CSV and persists it as a Parquet file in the
`02_intermediate` layer. Apart from dropping the `customerID` column, no
transformations are applied; the only goal is to materialise the data in a
typed, efficient format for downstream notebooks.


## 📚 Import libraries

In [1]:
from pathlib import Path

import pandas as pd

## 💾 Load data

In [2]:
RAW_PATH = Path("../../data/01_raw/Churn/Clientes_Telcomunicaciones-Churn.csv")

df = pd.read_csv(RAW_PATH)

print(f"Loaded: {df.shape[0]:,} rows x {df.shape[1]} columns")
df.head()

Loaded: 14,214 rows x 21 columns


,customerID,gender,SeniorCitizen,Partner,Dependents,tenure,PhoneService,MultipleLines,InternetService,OnlineSecurity,...,DeviceProtection,TechSupport,StreamingTV,StreamingMovies,Contract,PaperlessBilling,PaymentMethod,MonthlyCharges,TotalCharges,Churn
0,7590-VHVEG,Female,0.0,Yes,No,1.0,No,No phone service,DSL,No,...,No,No,No,No,Month-to-month,Yes,Electronic check,29.85,29.85,No
1,5575-GNVDE,Male,0.0,No,No,34.0,Yes,No,DSL,Yes,...,Yes,No,No,No,One year,No,Mailed check,56.95,1889.5,No
2,3668-QPYBK,Male,0.0,No,No,2.0,Yes,No,DSL,Yes,...,No,No,No,No,Month-to-month,Yes,Mailed check,53.85,108.15,Yes
3,7795-CFOCW,Male,0.0,No,No,45.0,No,No phone service,DSL,Yes,...,Yes,Yes,No,No,One year,No,Bank transfer (automatic),42.3,1840.75,No
4,9237-HQITU,Female,0.0,No,No,2.0,Yes,No,Fiber optic,No,...,No,No,No,No,Month-to-month,Yes,Electronic check,70.7,151.65,Yes
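
The preview shows `SeniorCitizen` and `tenure` as floats (`0.0`, `1.0`). When pandas reads a CSV, an integer column that contains missing values is typically upcast to `float64`, although the source does not say whether that is the cause here. A minimal sketch of how one might check inferred dtypes and missing-value counts right after loading is shown below. The inline CSV and the `summarise_columns` helper are hypothetical and exist only for illustration.

```python
import io

import pandas as pd

# Hypothetical three-row sample; the values are illustrative, not the real file.
SAMPLE_CSV = io.StringIO(
    "gender,tenure,MonthlyCharges,Churn\n"
    "Female,1,29.85,No\n"
    "Male,,56.95,No\n"
    "Male,2,53.85,Yes\n"
)


def summarise_columns(frame: pd.DataFrame) -> pd.DataFrame:
    """Return the inferred dtype and the missing-value count for each column."""
    return pd.DataFrame(
        {
            "dtype": frame.dtypes.astype(str),
            "n_missing": frame.isna().sum(),
        }
    )


sample = pd.read_csv(SAMPLE_CSV)
summary = summarise_columns(sample)
print(summary)
```

In this sample the blank `tenure` cell is enough to make the whole column `float64`. Running the same helper on the real `df` would show which columns, if any, contain missing values.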


In [3]:
# Remove customerID column before exporting
df = df.drop(columns=["customerID"])

print(f"Exporting: {df.shape[0]:,} rows x {df.shape[1]} columns")
df.head()

Exporting: 14,214 rows x 20 columns


,gender,SeniorCitizen,Partner,Dependents,tenure,PhoneService,MultipleLines,InternetService,OnlineSecurity,OnlineBackup,DeviceProtection,TechSupport,StreamingTV,StreamingMovies,Contract,PaperlessBilling,PaymentMethod,MonthlyCharges,TotalCharges,Churn
0,Female,0.0,Yes,No,1.0,No,No phone service,DSL,No,Yes,No,No,No,No,Month-to-month,Yes,Electronic check,29.85,29.85,No
1,Male,0.0,No,No,34.0,Yes,No,DSL,Yes,No,Yes,No,No,No,One year,No,Mailed check,56.95,1889.5,No
2,Male,0.0,No,No,2.0,Yes,No,DSL,Yes,Yes,No,No,No,No,Month-to-month,Yes,Mailed check,53.85,108.15,Yes
3,Male,0.0,No,No,45.0,No,No phone service,DSL,Yes,No,Yes,Yes,No,No,One year,No,Bank transfer (automatic),42.3,1840.75,No
4,Female,0.0,No,No,2.0,Yes,No,Fiber optic,No,No,No,No,No,No,Month-to-month,Yes,Electronic check,70.7,151.65,Yes
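
Because the cell above reassigns `df`, running it a second time raises a `KeyError`: `customerID` is already gone. One possible way to make the step safe to re-run is to pass `errors="ignore"` to `DataFrame.drop`, sketched here on a hypothetical two-row frame (the IDs are made up).

```python
import pandas as pd

# Hypothetical two-row frame standing in for the loaded churn data.
sample = pd.DataFrame(
    {
        "customerID": ["0000-AAAAA", "1111-BBBBB"],
        "gender": ["Female", "Male"],
        "Churn": ["No", "Yes"],
    }
)

# errors="ignore" skips labels that are not present, so a second run is a no-op.
once = sample.drop(columns=["customerID"], errors="ignore")
twice = once.drop(columns=["customerID"], errors="ignore")

print(list(twice.columns))  # ['gender', 'Churn']
```

The trade-off is that a misspelled column name would also be ignored silently, so the default `errors="raise"` may still be preferable when a missing column should stop the notebook.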


## 👷 Export to intermediate layer

In [4]:
INTERMEDIATE_PATH = Path("../../data/02_intermediate/Churn/churn_raw.parquet")
INTERMEDIATE_PATH.parent.mkdir(parents=True, exist_ok=True)

df.to_parquet(INTERMEDIATE_PATH, index=False)

print(f"Saved to: {INTERMEDIATE_PATH}")
print(f"File size: {INTERMEDIATE_PATH.stat().st_size / 1024:.1f} KB")

Saved to: ../../data/02_intermediate/Churn/churn_raw.parquet
File size: 152.8 KB
