# Vehicle Sensor Data: Exploration and Cleaning

**Objective:** Load the raw sensor data from `data/raw/`, inspect it for issues, clean it, and save the processed version to `data/processed/` for the next step in the pipeline.

In [None]:
import pandas as pd
import os

### Step 1: Load Raw Data

In [None]:
# Relative path: resolved against the notebook's working directory
raw_data_path = '../data/raw/sensor_data.csv'
df = pd.read_csv(raw_data_path)

### Step 2: Initial Exploration

In [None]:
print('--- First 5 Rows ---')
print(df.head())
print('\n--- Data Info ---')
df.info()
print('\n--- Missing Values Check ---')
print(df.isnull().sum())
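
The raw counts above can be hard to judge without knowing how large the dataset is. As a minimal sketch, the hypothetical helper `missing_summary` below reports each column's missing count alongside its percentage. The toy frame and its column names (`speed_kmh`, `engine_temp_c`, `vehicle_id`) are illustrative assumptions, not the real sensor schema.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame standing in for the real sensor data;
# the column names are illustrative assumptions.
toy = pd.DataFrame({
    "speed_kmh": [42.0, np.nan, 55.5, 61.0],
    "engine_temp_c": [88.0, 90.5, np.nan, np.nan],
    "vehicle_id": ["A1", "A1", "B2", "B2"],
})


def missing_summary(frame: pd.DataFrame) -> pd.DataFrame:
    """Return the count and percentage of missing values per column."""
    counts = frame.isnull().sum()
    percent = frame.isnull().mean() * 100
    summary = pd.DataFrame({"missing": counts, "percent": percent})
    # Columns with the most gaps come first.
    return summary.sort_values("missing", ascending=False)


summary = missing_summary(toy)
print(summary)
```

Calling `missing_summary(df)` on the loaded data would give the same view for the real columns.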

### Step 3: Data Cleaning
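The initial exploration shows missing values in several sensor columns. A simple strategy for this project is to fill each gap with the mean of its column, which keeps every record in the dataset. Note that only numeric columns are filled this way; missing values in non-numeric columns are left untouched and will still appear in the check that follows.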

In [None]:
# Fill gaps in numeric columns with that column's mean
df = df.fillna(df.mean(numeric_only=True))

print('\n--- Missing Values After Cleaning ---')
print(df.isnull().sum())
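
To make the effect of mean imputation concrete, here is a small sketch on a toy frame. The column names and values are assumptions chosen for illustration only. It shows that numeric gaps are replaced by the column mean, while a gap in a text column is left as it was.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; column names and values are illustrative assumptions.
toy = pd.DataFrame({
    "speed_kmh": [40.0, np.nan, 60.0],
    "engine_temp_c": [np.nan, 90.0, 100.0],
    "status": ["ok", None, "ok"],
})

# Means are computed per numeric column, skipping missing values.
column_means = toy.mean(numeric_only=True)

# fillna with a Series fills each column using the matching index label;
# columns absent from the Series (here "status") are not touched.
filled = toy.fillna(column_means)

print(filled)
print(filled.isnull().sum())
```

In this toy case the missing `speed_kmh` becomes 50.0 and the missing `engine_temp_c` becomes 95.0, while `status` still reports one missing value.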

### Step 4: Save Processed Data
Save the cleaned DataFrame to the processed data folder, ensuring the directory exists first.

In [None]:
processed_folder_path = '../data/processed/'
os.makedirs(processed_folder_path, exist_ok=True) # Ensure the directory exists

processed_file_path = os.path.join(processed_folder_path, 'cleaned_sensor_data.csv')
df.to_csv(processed_file_path, index=False)

print(f'Cleaned data successfully saved to {processed_file_path}')
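
A quick way to confirm the save step worked is to read the file back and compare it with what was written. The sketch below does this for a toy frame in a temporary directory, so it does not touch the project folders. The frame contents are assumptions for illustration, and the same check could be applied to `processed_file_path`.

```python
import os
import tempfile

import pandas as pd

# Hypothetical cleaned data; values are illustrative assumptions.
toy = pd.DataFrame({
    "speed_kmh": [40.0, 50.0, 60.0],
    "engine_temp_c": [95.0, 90.0, 100.0],
})

with tempfile.TemporaryDirectory() as tmp_dir:
    # Mirror the notebook's pattern: create the folder, then write the CSV.
    out_dir = os.path.join(tmp_dir, "processed")
    os.makedirs(out_dir, exist_ok=True)
    out_path = os.path.join(out_dir, "cleaned_sensor_data.csv")
    toy.to_csv(out_path, index=False)

    # Read the file back while the temporary directory still exists.
    reloaded = pd.read_csv(out_path)

# The round trip should preserve the shape and introduce no missing values.
same_shape = reloaded.shape == toy.shape
no_missing = int(reloaded.isnull().sum().sum()) == 0
print(same_shape, no_missing)
```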