## Churn Data: Raw Loading & Parquet Export

### By:
jdg

### Date:
2026-02-21

### Description:

Reads the raw Telco Customer Churn CSV and persists it as a Parquet file in the
`02_intermediate` layer. Apart from dropping the `customerID` column, no
transformations are applied; the only goal is to materialise the data in a
typed, efficient format for downstream notebooks.


## 📚 Import libraries

In [23]:
from pathlib import Path

import pandas as pd

## 💾 Load data

In [None]:
RAW_PATH = Path("../../data/01_raw/Churn/Clientes_Telcomunicaciones-Churn.csv")

df = pd.read_csv(RAW_PATH)

print(f"Loaded: {df.shape[0]:,} rows x {df.shape[1]} columns")
df.head()

In [None]:
# Remove customerID column before exporting
df = df.drop(columns=["customerID"])

print(f"Exporting: {df.shape[0]:,} rows x {df.shape[1]} columns")
df.head()
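
Because the cell above reassigns `df`, running it a second time in the same session raises a `KeyError`, since `customerID` is already gone. A minimal sketch of a re-runnable variant, assuming the identifier column is named exactly `customerID` as in the cell above; the helper name `drop_identifier`, the column `tenure_months`, and the toy values are hypothetical and only there for illustration:

```python
import pandas as pd


def drop_identifier(frame: pd.DataFrame, column: str = "customerID") -> pd.DataFrame:
    """Return a copy of `frame` without `column`; a no-op if it is already absent."""
    # errors="ignore" suppresses the KeyError that drop() raises for a missing label
    return frame.drop(columns=[column], errors="ignore")


# Toy data (hypothetical values, not taken from the real CSV)
toy = pd.DataFrame({"customerID": ["a-1", "b-2"], "tenure_months": [1, 34]})

once = drop_identifier(toy)
twice = drop_identifier(once)  # second call leaves the frame unchanged

print(list(once.columns))
print(once.equals(twice))
```

The trade-off is that `errors="ignore"` also hides a misspelled column name, so the stricter default may be preferable when the notebook is always run top to bottom.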

## 👷 Export to intermediate layer

In [26]:
INTERMEDIATE_PATH = Path("../../data/02_intermediate/Churn/churn_raw.parquet")
INTERMEDIATE_PATH.parent.mkdir(parents=True, exist_ok=True)

df.to_parquet(INTERMEDIATE_PATH, index=False)

print(f"Saved to: {INTERMEDIATE_PATH}")
print(f"File size: {INTERMEDIATE_PATH.stat().st_size / 1024:.1f} KB")

Saved to: ..\..\data\02_intermediate\Churn\churn_raw.parquet
File size: 152.8 KB


## 📊 Analysis of Results and Conclusions

Describe the results obtained and any conclusions that can be drawn from them.

The analysis of results must relate to the description of the task.

**Note:** An analysis of results does not necessarily lead to conclusions; it may instead lead to ideas or proposals for future work.


*(Fill in after running the notebook)*

Expected observations to document:
- Total rows and columns loaded from the raw CSV
- Confirmation that the Parquet file was written successfully to `02_intermediate`
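
Both observations can be checked programmatically by reading the file back and comparing it with the in-memory frame. A minimal sketch, assuming a Parquet engine (`pyarrow` or `fastparquet`) is installed, which `df.to_parquet` already requires; the helper name `check_round_trip` and the toy frame are hypothetical:

```python
from pathlib import Path

import pandas as pd


def check_round_trip(frame: pd.DataFrame, path: Path) -> bool:
    """Return True if `path` exists and matches `frame` in shape and column names."""
    if not path.exists():
        return False
    reloaded = pd.read_parquet(path)
    same_shape = reloaded.shape == frame.shape
    same_columns = list(reloaded.columns) == list(frame.columns)
    return same_shape and same_columns


# Toy data (hypothetical values, not taken from the real CSV)
toy = pd.DataFrame({"plan": ["basic", "premium"], "monthly_charge": [29.85, 56.95]})

# A path that was never written fails the check without touching the engine
print(check_round_trip(toy, Path("file_that_was_never_written.parquet")))
```

In this notebook the call would be `check_round_trip(df, INTERMEDIATE_PATH)`, placed after the export cell. The check compares only shape and column names, not values or dtypes.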


## 💡 Proposals and Ideas

From the results obtained, what ideas or proposals can be generated to continue the project?


- Proceed to `2-exploration/01_jdg_churn_data_description_20260221.ipynb` to inspect
  data types, missing values, and apply initial fixes.
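
As a preview of that inspection step, data types and missing values can be tabulated per column. A minimal sketch; the helper name `summarise_columns` and the toy frame are hypothetical and not taken from the exploration notebook:

```python
import pandas as pd


def summarise_columns(frame: pd.DataFrame) -> pd.DataFrame:
    """One row per column: dtype, missing-value count, and distinct-value count."""
    return pd.DataFrame(
        {
            "dtype": frame.dtypes.astype(str),
            "n_missing": frame.isna().sum(),
            "n_unique": frame.nunique(dropna=True),
        }
    )


# Toy data (hypothetical values, not taken from the real CSV)
toy = pd.DataFrame(
    {
        "plan": ["basic", None, "premium"],
        "monthly_charge": [29.85, 56.95, 56.95],
    }
)

summary = summarise_columns(toy)
print(summary)
```

Note that `isna()` only counts true nulls; blank or whitespace-only strings in a text column would not show up in `n_missing`.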


## 📖 References

- Dataset description: `data/01_raw/Churn/Clientes-Telecomunicaciones-Chrun-Info.txt`
- Original dataset (do not use): https://www.kaggle.com/blastchar/telco-customer-churn
