# 02 - Data Preparation and Transformation

This notebook reshapes the raw battery data retrieved from InfluxDB from long format (one row per field reading) to wide format (one column per field) using a pivot. The wide layout is better suited to training machine learning models.


## Load and Transform Raw Data

Load the raw battery data from CSV and pivot it so that each `_field` value becomes its own column, with one row per (`_time`, `batteryId`) pair. The `_time` column is then renamed to `timestamp`.


In [None]:
import pandas as pd

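# Long format: one row per (_time, batteryId, _field), with the reading in _value.
df = pd.read_csv("./data/battery_raw.csv")
# Wide format: one row per (_time, batteryId), one column per distinct _field.
df_pivot = df.pivot(index=["_time", "batteryId"], columns="_field", values="_value").reset_index()
df_pivot = df_pivot.rename(columns={"_time": "timestamp"})
df_pivot.columns.name = None  # drop the leftover "_field" label on the column index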
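
To make the reshape concrete, here is a minimal sketch on a toy frame. The field names `voltage` and `temperature` and all values are hypothetical, since the real fields depend on what was exported from InfluxDB. It also counts duplicate keys first, because `DataFrame.pivot` raises a `ValueError` when the same (`_time`, `batteryId`, `_field`) combination appears more than once.

```python
import pandas as pd

# Toy long-format frame; "voltage" and "temperature" are hypothetical field names.
long_df = pd.DataFrame({
    "_time": ["2024-01-01T00:00:00Z"] * 2 + ["2024-01-01T00:01:00Z"] * 2,
    "batteryId": ["b1", "b1", "b1", "b1"],
    "_field": ["voltage", "temperature", "voltage", "temperature"],
    "_value": [3.7, 25.0, 3.6, 25.5],
})

# pivot() fails on repeated (index, column) pairs, so count them before reshaping.
key = ["_time", "batteryId", "_field"]
n_duplicates = int(long_df.duplicated(subset=key).sum())

# Same reshape as the cell above, applied to the toy data.
wide_df = (
    long_df.pivot(index=["_time", "batteryId"], columns="_field", values="_value")
    .reset_index()
    .rename(columns={"_time": "timestamp"})
)
wide_df.columns.name = None

print(f"Duplicate keys: {n_duplicates}")
print(wide_df)
```

If the duplicate count is non-zero on real data, one option is `pivot_table` with an explicit aggregation such as `aggfunc="mean"`. Whether averaging is appropriate depends on the data.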

## Save Prepared Data

Save the transformed data to a new CSV file that will be used for model training.


In [None]:
df_pivot.to_csv("./data/battery_data.csv", index=False)
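
CSV stores timestamps as plain text, so a downstream notebook has to parse them again on load. The sketch below shows one way to do that with `parse_dates`. It uses an in-memory buffer and hypothetical sample values rather than the real `battery_data.csv`.

```python
import io

import pandas as pd

# Hypothetical wide-format sample standing in for df_pivot.
wide_df = pd.DataFrame({
    "timestamp": ["2024-01-01T00:00:00Z", "2024-01-01T00:01:00Z"],
    "batteryId": ["b1", "b1"],
    "voltage": [3.7, 3.6],
})

# Write to an in-memory buffer instead of disk to keep the example self-contained.
buffer = io.StringIO()
wide_df.to_csv(buffer, index=False)
buffer.seek(0)

# parse_dates converts the timestamp strings back into datetime values on load.
reloaded = pd.read_csv(buffer, parse_dates=["timestamp"])
print(reloaded.dtypes)
```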

## Preview Prepared Data

Display the shape and the first 10 rows of the prepared data to verify that the transformation succeeded.


In [None]:
print(f"Prepared dataset shape: {df_pivot.shape}")
df_pivot.head(10)
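
Beyond eyeballing the preview, a couple of programmatic checks can help. The sketch below runs on hypothetical sample data with hypothetical field names. It confirms that each (`timestamp`, `batteryId`) pair appears once and counts missing values per column. A pivot leaves `NaN` wherever a field had no reading at a given timestamp.

```python
import pandas as pd

# Hypothetical wide-format sample; one voltage reading is missing on purpose.
wide_df = pd.DataFrame({
    "timestamp": ["2024-01-01T00:00:00Z", "2024-01-01T00:01:00Z", "2024-01-01T00:00:00Z"],
    "batteryId": ["b1", "b1", "b2"],
    "voltage": [3.7, 3.6, None],
    "temperature": [25.0, 25.5, 24.0],
})

# Each (timestamp, batteryId) pair should identify exactly one row after the pivot.
duplicate_rows = int(wide_df.duplicated(subset=["timestamp", "batteryId"]).sum())

# Missing values per column show which fields have gaps.
missing_per_column = wide_df.isna().sum()

print(f"Duplicate rows: {duplicate_rows}")
print(missing_per_column)
```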