# PM2.5 Histogram (PM25 UART Logs)

This notebook loads a CSV produced by `pm25_uart_simpletest.py` and plots a frequency distribution (histogram) of the PM2.5 data.

Notes:
- The **first line** of the CSV is metadata (starts with `# meta,`), so we skip it when reading the table.
- The script logs both **pm25_standard** and **pm25_env**; this notebook plots **pm25_env** by default (change `col` below to switch).
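
Skipping the first row unconditionally would drop the header row if a file ever lacked the metadata line. A minimal sketch of a guard follows. It assumes only what the note above states, that the metadata line starts with `# meta,`. The helper name `split_meta_line` and the sample line are hypothetical, and the layout of the fields after the prefix is not assumed.

```python
import csv

META_PREFIX = "# meta,"  # prefix described in the notes above


def split_meta_line(line, prefix=META_PREFIX):
    """Return the comma-separated fields after the prefix, or None if the prefix is absent."""
    line = line.strip()
    if not line.startswith(prefix):
        return None
    rest = line[len(prefix):]
    # csv.reader copes with quoted fields that themselves contain commas
    return next(csv.reader([rest]))


# Made-up sample line, only to show the call shape
fields = split_meta_line("# meta,example_field,another=1")
print(fields)
```

In the loading cell, the result could decide how many rows to skip, for example `skiprows=1 if split_meta_line(meta_line) is not None else 0`.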

In [None]:
import pathlib
import pandas as pd
import matplotlib.pyplot as plt

In [None]:
# Point this to your CSV file (copied/pulled from the Raspberry Pi)
csv_path = pathlib.Path('pm25_log_YYYYMMDD_HHMMSS.csv')

# Read metadata line
with csv_path.open('r', encoding='utf-8') as f:
    meta_line = f.readline().strip()

print('Metadata:', meta_line)

# Load the actual CSV table (skip the metadata line)
df = pd.read_csv(csv_path, skiprows=1)
df.head()

In [None]:
# Choose which PM2.5 column to histogram
col = 'pm25_env'  # or 'pm25_standard'

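# Coerce non-numeric entries to NaN, then drop them so the statistics and histogram only see valid readings
pm25 = pd.to_numeric(df[col], errors='coerce').dropna()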
print('Count:', len(pm25))
print('Min/Max:', float(pm25.min()), float(pm25.max()))
print('Mean:', float(pm25.mean()))

In [None]:
plt.figure(figsize=(8, 4))
plt.hist(pm25, bins=15)
plt.title(f'PM2.5 Histogram ({col})')
plt.xlabel('PM2.5 (µg/m³)')  # unit assumed from typical PM2.5 sensor output; adjust if your log differs
plt.ylabel('Frequency')
plt.tight_layout()
plt.show()
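
With a fixed `bins=15`, bin boundaries can fall between readings unevenly, which may create artificial gaps or double-height bars when the data are whole numbers. A minimal sketch of an alternative follows, assuming the logged readings are integers (an assumption about the sensor output, not something the CSV guarantees). The helper `integer_bin_edges` is hypothetical and places each bin edge halfway between integers.

```python
import math


def integer_bin_edges(values, width=1):
    """Return bin edges centered on whole numbers, covering min(values) to max(values)."""
    if width < 1 or int(width) != width:
        raise ValueError("width must be a positive integer")
    lo = math.floor(min(values))
    hi = math.ceil(max(values))
    n_bins = (hi - lo) // width + 1
    # Edges sit at x - 0.5 so that each integer reading lands in the middle of its bin
    return [lo - 0.5 + i * width for i in range(n_bins + 1)]


edges = integer_bin_edges([3, 4, 4, 7])
print(edges)
```

The resulting list can be passed straight to the plot, for example `plt.hist(pm25, bins=integer_bin_edges(pm25))`. For a wide range of values, a larger `width` keeps the number of bars manageable.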

## Anything interesting?

Once you collect data, look for:
- A big spike at (or near) a single value (sensor quantization or stable air)
- A long right tail (rare high-particulate events)
- Two clusters (e.g., indoor vs outdoor air sources, or fan/door events)
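
The first two patterns above can be roughly quantified before looking at the plot. The sketch below is one possible approach, not a definitive test: the share of readings at the most common value hints at a spike, and a positive skewness (or a mean well above the median) hints at a right tail. The function name `distribution_hints` is hypothetical, and the sketch does not attempt to detect two clusters, which is usually easier to judge from the histogram itself.

```python
from collections import Counter
import statistics


def distribution_hints(values):
    """Return simple indicators for a spike and a right tail in a set of readings."""
    data = [float(v) for v in values]
    if len(data) < 3:
        raise ValueError("need at least 3 readings")
    mode_value, mode_count = Counter(data).most_common(1)[0]
    mean = statistics.fmean(data)
    std = statistics.pstdev(data)
    # Population skewness (third standardized moment); defined as 0.0 when all readings are equal
    skew = 0.0 if std == 0 else sum((x - mean) ** 3 for x in data) / (len(data) * std ** 3)
    return {
        "mode": mode_value,
        "mode_share": mode_count / len(data),  # fraction of readings at the most common value
        "mean": mean,
        "median": statistics.median(data),
        "skewness": skew,  # > 0 suggests a right tail
    }


# Made-up readings: mostly low values plus one high outlier
hints = distribution_hints([5, 5, 5, 5, 6, 6, 7, 40])
print(hints)
```

The function accepts any iterable of numbers, so the `pm25` series from the earlier cell can be passed in directly as `distribution_hints(pm25)`.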