### 1. Overview & data choice for healthy vs aged classification

For this notebook I originally planned to use the **NASA PCoE Li-ion Battery Aging Datasets**
(<https://data.nasa.gov/dataset/li-ion-battery-aging-datasets>), which contain full aging
trajectories down to a clear **end-of-life (EoL)** and are widely used for
**state-of-health (SoH)** and **remaining-useful-life (RUL)** studies. At the time of this
work I could not access that dataset, so I instead reuse the SoH-from-EIS dataset from
[**02 – SoH from EIS (Rashid et al.)**](./02_SoH_from_EIS.ipynb) as a **stand-in** to
demonstrate a simple healthy vs aged **classification workflow**.

In many applications, a cell is considered to have reached EoL when its capacity
(and hence its SoH) drops to around **80%** of nominal. The Rashid SoH-from-EIS dataset I
use here covers SoH levels from **100% down to 80%** in 5% steps only; unlike the NASA PCoE
data, it does **not** include cells degraded below 80%. To still illustrate a binary
screening task with the data at hand, I define:

- `healthy` = SoH ≥ 90%  
- `aged`    = SoH < 90%

This threshold is therefore **chosen for demonstration**, not as a universal definition
of EoL, and should be interpreted as “early screening for cells that have started to age”
rather than a hard safety or warranty limit. In practice, I would expect cells aged
**below 80% SoH to develop a noticeably different impedance signature** from anything in
the 80–100% SoH range covered here, which could **substantially change** both the models
and the conclusions. The current notebook should therefore be read as a **workflow
demonstration** under limited data, not as a definitive analysis of fully aged cells.
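
As a minimal sketch of the labelling rule above, the snippet below applies the 90% threshold to the five nominal SoH stages. The helper name `label_soh` and the constant `SOH_THRESHOLD` are hypothetical and only serve this illustration; in the notebook the same rule would be applied to the SoH column of the feature table, where measured SoH values may differ slightly from the nominal stages.

```python
from collections import Counter

# Demonstration threshold from the text above; not a universal EoL definition.
SOH_THRESHOLD = 90.0


def label_soh(soh_percent, threshold=SOH_THRESHOLD):
    """Return 'healthy' for SoH >= threshold, otherwise 'aged'."""
    return "healthy" if soh_percent >= threshold else "aged"


# The five nominal SoH stages (in percent) covered by the dataset.
soh_levels = [100, 95, 90, 85, 80]

labels = {soh: label_soh(soh) for soh in soh_levels}
counts = Counter(labels.values())

print(labels)
print(counts)
```

With this rule, three of the five nominal stages (100, 95, 90%) fall into `healthy` and two (85, 80%) into `aged`, so the classes are only mildly imbalanced if each stage contributes a similar number of spectra.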

---

### 2. Dataset and feature pipeline (from 02 – SoH from EIS)

As in [**02 – SoH from EIS (Rashid et al.)**](./02_SoH_from_EIS.ipynb), I use the public
dataset:

> Rashid, Muhammad; Faraji-Niri, Mona; Sansom, Jonathan; Sheikh, Muhammad;  
> Widanage, Dhammika; Marco, James (2023),  
> **“Dataset for rapid state of health estimation of lithium batteries using EIS and machine learning: Training and validation”**,  
> *Data in Brief*, 48, 109157, doi: 10.1016/j.dib.2023.109157.  
> Original data: **“DIB_Data”**, Mendeley Data, V3, doi: 10.17632/mn9fb7xdx6.3 (CC0 1.0).

Dataset highlights (same as in 02 – SoH from EIS):

- **25 cylindrical Li-ion cells**, aged from SoH 100% down to 80% in 5% steps  
  (100, 95, 90, 85, 80%).
- At each SoH stage, reference performance tests (capacity / SoH) were run alongside
  **electrochemical impedance spectroscopy (EIS)** at multiple **state-of-charge (SoC)**
  and **temperature** conditions.
- Designed specifically to study **fast SoH estimation from EIS with machine learning**.

In this notebook I **reuse the same data loading, cleaning and feature engineering
pipeline** as in 02 – SoH from EIS:

- I load the raw EIS spectra and associated metadata,
- perform the same basic QC and filtering,
- and construct the same engineered impedance feature table (ohmic / low-frequency
  resistances, summary |Z| and phase statistics, and sampled spectral points).
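
To make the kind of feature table described above more concrete, here is a rough sketch of how one spectrum could be reduced to a flat feature row. It is not the pipeline from notebook 02: the function name `eis_features`, the feature names, and the definitions used (ohmic resistance taken as Re(Z) at the point closest to the real axis, low-frequency resistance taken as Re(Z) at the lowest frequency) are assumptions for illustration, and the example spectrum is synthetic.

```python
import cmath
import math
import statistics


def eis_features(freqs_hz, z_ohm, sample_every=2):
    """Summarise one EIS spectrum as a flat feature dict (illustrative only)."""
    if len(freqs_hz) != len(z_ohm) or len(freqs_hz) < 2:
        raise ValueError("need matching frequency and impedance lists with >= 2 points")

    # Sort from high to low frequency so the last point is the low-frequency end.
    pairs = sorted(zip(freqs_hz, z_ohm), key=lambda p: p[0], reverse=True)
    spectrum = [z for _, z in pairs]
    mags = [abs(z) for z in spectrum]
    phases = [math.degrees(cmath.phase(z)) for z in spectrum]

    # Assumption: ohmic resistance ~ Re(Z) at the point closest to the real axis.
    z_cross = min(spectrum, key=lambda z: abs(z.imag))

    feats = {
        "r_ohmic": z_cross.real,
        "r_low_freq": spectrum[-1].real,
        "absz_mean": statistics.fmean(mags),
        "absz_max": max(mags),
        "phase_mean_deg": statistics.fmean(phases),
        "phase_min_deg": min(phases),
    }
    # Sampled spectral points: keep every `sample_every`-th point as Re/Im features.
    for i, z in enumerate(spectrum[::sample_every]):
        feats[f"re_z_{i}"] = z.real
        feats[f"im_z_{i}"] = z.imag
    return feats


# Synthetic spectrum: series resistor R0 plus one parallel R1-C1 element.
R0, R1, C1 = 0.030, 0.020, 1.0  # ohm, ohm, farad (made-up values)
freqs = [1000.0, 100.0, 10.0, 1.0, 0.1]
z_values = [R0 + R1 / (1 + 1j * 2 * math.pi * f * R1 * C1) for f in freqs]

features = eis_features(freqs, z_values)
print({name: round(value, 5) for name, value in features.items()})
```

For this synthetic circuit the ohmic feature lands close to `R0` and the low-frequency feature close to `R0 + R1`, which is the behaviour one would hope for from such summary features; real spectra with inductive tails or diffusion branches would need more careful definitions.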

For details of the data preparation steps, please refer to
[**02 – SoH from EIS (Rashid et al.)**](./02_SoH_from_EIS.ipynb). Here I start directly
from the prepared feature table and focus on the **binary classification** of `healthy`
vs `aged` cells based on those impedance features.
