# 📘 `lowFreqToolKit` Explanation and Test Coverage

This notebook documents the structure, logic, and unit-test coverage of `lowFreqToolKit`, a toolkit for analyzing and visualizing low-frequency meteorological data.

It includes:
- Class and function explanations from the toolkit
- What each function does
- How each function is tested (if applicable)
- References to real test cases in the `test_unit_lowfreq_toolkit.py` file

## 🧱 Class: `lowFreqToolKit`
**File:** `toolkit.py`

This is the main class responsible for managing the low-frequency meteorological data. It inherits from `abstractToolkit` and provides access to two subsystems:
- `analysis`: for processing and statistical operations
- `presentation`: for data visualization

**Main properties:**
- `self._analysis`: instance of `analysis` class
- `self._presentation`: instance of `presenation` class
- `docType`: returns toolkit-specific identifier string

**Test coverage:**
- `test_init_toolkit_structure`: ensures both components exist
- `test_docType`: verifies format of the document type string
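
The composition described above can be sketched in a few lines. This is an illustration only: `SketchToolkit`, `SketchAnalysis`, and `SketchPresentation` are hypothetical stand-ins, not the real classes, and the real constructor arguments are not shown in this notebook.

```python
class SketchAnalysis:
    """Hypothetical stand-in for the analysis layer."""

    def __init__(self, toolkit):
        self.toolkit = toolkit


class SketchPresentation:
    """Hypothetical stand-in for the presentation layer."""

    def __init__(self, toolkit, analysis):
        self.toolkit = toolkit
        self.analysis = analysis


class SketchToolkit:
    """Owns one analysis object and one presentation object."""

    def __init__(self):
        self._analysis = SketchAnalysis(self)
        # The presentation layer reuses the same analysis instance.
        self._presentation = SketchPresentation(self, self._analysis)

    @property
    def analysis(self):
        return self._analysis

    @property
    def presentation(self):
        return self._presentation


toolkit = SketchToolkit()
```

A structure test in the spirit of `test_init_toolkit_structure` would then only need to check that both attributes exist and have the expected types.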

## 🔎 Class: `analysis`
**File:** `analysis.py`

Handles temporal enrichment of data and statistical operations.

**Functions:**
- `addDatesColumns(data, datecolumn)`: adds `yearonly`, `monthonly`, `season`, etc.
  - ✅ `test_add_dates_columns`
- `calcHourlyDist(data, Field, normalization)`: calculates hourly distribution as 2D histogram
  - ✅ `test_calc_hourly_dist_max_normalized`
  - ✅ `test_calc_hourly_dist_density`
- `resampleSecondMoments(...)`: computes second-order statistics like variance and covariance
  - ✅ `test_resample_second_moments`

**Supporting function:** `_calculateCov`: internal helper, tested indirectly.
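
As an illustration of what second-order statistics over resampling windows can look like, the sketch below computes per-window variance and covariance from window means, using cov(x, y) = mean(xy) - mean(x)·mean(y). The function name, column naming, and default window are assumptions, not the signature of `resampleSecondMoments` or `_calculateCov`.

```python
import numpy as np
import pandas as pd


def resample_second_moments(data, field_x, field_y, rule="1h"):
    """Return per-window (biased) variances and covariance of two fields.

    `data` must be indexed by a DatetimeIndex.
    """
    x, y = data[field_x], data[field_y]
    # Window means of the fields and of their pairwise products.
    means = pd.DataFrame(
        {"x": x, "y": y, "xx": x * x, "yy": y * y, "xy": x * y}
    ).resample(rule).mean()
    return pd.DataFrame({
        f"var_{field_x}": means["xx"] - means["x"] ** 2,
        f"var_{field_y}": means["yy"] - means["y"] ** 2,
        f"cov_{field_x}_{field_y}": means["xy"] - means["x"] * means["y"],
    })


# Two hours of synthetic one-minute samples.
rng = np.random.default_rng(0)
index = pd.date_range("2021-01-01", periods=120, freq="min", tz="UTC")
sample = pd.DataFrame(
    {"u": rng.normal(2.0, 1.0, 120), "v": rng.normal(-1.0, 0.5, 120)},
    index=index,
)
moments = resample_second_moments(sample, "u", "v")
```

These are population (biased) estimates; with missing values in only one field, the window means would be taken over different samples, so a stricter version would drop incomplete rows first.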

## 📊 Class: `presenation`
**File:** `presentationLayer.py`

Acts as a bridge for visualization, offering access to:
- `dailyPlots`: instance of `DailyPlots`
- `seasonalPlots`: instance of `SeasonalPlots`

**Test coverage:** Verified implicitly via `toolkit.presentation` initialization in multiple tests.

## 📈 Class: `DailyPlots` (inherits from `Plots`)
Provides daily visualizations for a selected meteorological field.

**Functions:**
- `plotScatter(data, plotField)`: scatter plot of field vs. hour
  - ✅ `test_plotScatter`
  - ✅ `test_plotScatter_matches_data`
  - ✅ `test_plotScatter_empty_dataframe`
  - ✅ `test_plotScatter_with_nan_and_outliers`
  - ✅ `test_plotScatter_WS_field`, `WD_field`
  - ✅ `test_plotScatter_axis_labels`
  - ✅ `test_plotScatter_creates_non_empty_image`

- `dateLinePlot(data, plotField, date)`: line plot for a single day
  - ✅ `test_dateLinePlot`
  - ✅ `test_dateLinePlot_matches_data`

- `plotProbContourf(...)`: 2D probability contour plot
  - ✅ `test_plotProbContourf`
  - ✅ `test_plotProbContourf_distribution_ranges`

**Each function handles datetime parsing, filtering invalid values, and applying customized seaborn or matplotlib settings.**
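
The preprocessing mentioned above (datetime parsing and filtering of invalid values) might look roughly like the following. `prepare_hourly_points` and its `valid_range` parameter are hypothetical names introduced for this sketch; the toolkit's own filtering rules are not documented here.

```python
import numpy as np
import pandas as pd


def prepare_hourly_points(data, field, datetime_column="datetime", valid_range=None):
    """Return a frame with a fractional `hour` column and the cleaned field."""
    out = data[[datetime_column, field]].copy()
    # Unparseable timestamps and non-numeric values become NaT / NaN.
    out[datetime_column] = pd.to_datetime(out[datetime_column], utc=True, errors="coerce")
    out[field] = pd.to_numeric(out[field], errors="coerce")
    out = out.dropna(subset=[datetime_column, field])
    if valid_range is not None:
        low, high = valid_range
        # `between` is inclusive on both ends by default.
        out = out[out[field].between(low, high)]
    stamps = out[datetime_column].dt
    out = out.assign(hour=stamps.hour + stamps.minute / 60.0)
    return out[["hour", field]]


sample = pd.DataFrame({
    "datetime": ["2021-03-01 00:30", "2021-03-01 06:00", "not a date", "2021-03-01 18:15"],
    "RH": [55.0, np.nan, 60.0, 999.0],
})
points = prepare_hourly_points(sample, "RH", valid_range=(0, 100))
```

In this sample only the first row survives: the second has a missing value, the third an unparseable timestamp, and the fourth falls outside the assumed valid range.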

## 🌦️ Class: `SeasonalPlots` (inherits from `Plots`)
**Function:** `plotProbContourf_bySeason(...)`

Generates a 2x2 grid of contour plots, one per season, using the same logic as `plotProbContourf` with the data filtered by the seasons defined in `seasonsdict`.

- ✅ `test_plotProbContourf_bySeason`
- ✅ `test_plotProbContourf_bySeason_basic` (structure check)
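
A rough sketch of the seasonal grid idea follows. The month-to-season mapping in `SEASON_MONTHS` is an assumption (standard meteorological seasons); the toolkit's `seasonsdict` may define them differently, and `plot_by_season` is a hypothetical name.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Assumed season definitions; not taken from the toolkit.
SEASON_MONTHS = {
    "Winter": [12, 1, 2],
    "Spring": [3, 4, 5],
    "Summer": [6, 7, 8],
    "Autumn": [9, 10, 11],
}


def plot_by_season(data, field, datetime_column="datetime", bins=(24, 10)):
    """Draw one max-normalized hour/value contour plot per season on a 2x2 grid."""
    stamps = data[datetime_column].dt
    months, hours = stamps.month, stamps.hour
    value_range = (data[field].min(), data[field].max())
    fig, axes = plt.subplots(2, 2, figsize=(10, 8), sharex=True, sharey=True)
    for ax, (season, season_months) in zip(axes.flat, SEASON_MONTHS.items()):
        ax.set_title(season)
        in_season = months.isin(season_months)
        if not in_season.any():
            continue  # leave the panel empty when a season has no samples
        hist, hour_edges, value_edges = np.histogram2d(
            hours[in_season], data.loc[in_season, field],
            bins=bins, range=[(0, 24), value_range],
        )
        hour_centers = (hour_edges[:-1] + hour_edges[1:]) / 2
        value_centers = (value_edges[:-1] + value_edges[1:]) / 2
        # contourf expects Z with shape (len(y), len(x)), hence the transpose.
        ax.contourf(hour_centers, value_centers, hist.T / hist.max())
    return axes


# One synthetic year of hourly readings.
rng = np.random.default_rng(0)
stamps = pd.date_range("2021-01-01", "2021-12-31 23:00", freq="h", tz="UTC")
sample = pd.DataFrame({"datetime": stamps, "RH": rng.uniform(20, 100, len(stamps))})
axes = plot_by_season(sample, "RH")
```

Sharing the value range across panels keeps the four plots directly comparable.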

## 🧪 Summary of Unit Test Coverage

- Toolkit initialization: ✅
- Analysis functions: ✅
- Plot functions:
  - `plotScatter`: ✅
  - `dateLinePlot`: ✅
  - `plotProbContourf`: ✅
  - `plotProbContourf_bySeason`: ✅
- Error handling: ✅ (empty dataframe, NaN values and outliers)
- Output validation: ✅ (values, axis labels, images)

Together, these tests cover toolkit initialization, the analysis functions, and every plotting function, including edge cases such as empty input, NaN values, and outliers.

## 🚀 Full Example: How to Use `lowFreqToolKit`
This section demonstrates how to initialize and use `lowFreqToolKit`, from loading raw meteorological data through analysis and visualization.

**Important**: The example assumes the data file is located under the directory named by the environment variable `HERA_UNITTEST_DATA`. Set this variable to the root data folder before running the cells, e.g.:

```bash
export HERA_UNITTEST_DATA=/home/ilay/hera_unittest_data
```
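
If the variable is not set, `os.environ[...]` fails with a bare `KeyError`. A small helper can give a clearer message; `resolve_data_path` below is a hypothetical convenience function, not part of the toolkit.

```python
import os
from pathlib import Path


def resolve_data_path(relative_path, env_var="HERA_UNITTEST_DATA"):
    """Join `relative_path` onto the folder named by `env_var`."""
    root = os.environ.get(env_var)
    if not root:
        raise EnvironmentError(
            f"Set {env_var} to the root data folder before running the example."
        )
    return Path(root) / relative_path


# Usage (requires the variable to be set):
# data_path = resolve_data_path("measurements/meteorology/lowfreqdata/YAVNEEL.parquet")
```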


```python
# Step 1: Import modules and resolve the data path from the environment variable
import os
import pandas as pd
from hera.measurements.meteorology.lowfreqdata.toolkit import lowFreqToolKit

# Load the data and parse timestamps as timezone-aware (UTC) datetimes
data_path = os.path.join(os.environ['HERA_UNITTEST_DATA'],
                         'measurements/meteorology/lowfreqdata/YAVNEEL.parquet')
df = pd.read_parquet(data_path)
df["datetime"] = pd.to_datetime(df["datetime"], utc=True)
df.head()
```

```python
# Step 2: Create the toolkit instance
toolkit = lowFreqToolKit(df)
```

### 🔄 Enrich Data with Time Columns (Optional Step)
The toolkit does this internally, but you can also run the step manually if needed.

```python
# Re-parse timestamps; unparseable entries become NaT instead of raising
df["datetime"] = pd.to_datetime(df["datetime"], utc=True, errors="coerce")

try:
    df_enriched = toolkit.analysis.addDatesColumns(df)
    display(df_enriched[["yearonly", "monthonly", "season"]].head())
except Exception as e:
    # Print diagnostics that usually explain a failure in this step
    print("❌ ERROR:", repr(e))
    print("datetime column dtype:", df["datetime"].dtype)
    print("NaT values in datetime:", df["datetime"].isnull().sum())
```
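
For readers who want to see what such an enrichment step amounts to, here is a minimal pandas sketch that derives the three columns displayed above. The function `add_date_columns` and the `MONTH_TO_SEASON` mapping are assumptions for illustration; `addDatesColumns` adds further columns and may define seasons differently.

```python
import pandas as pd

# Assumed meteorological seasons; the toolkit may use other boundaries or labels.
MONTH_TO_SEASON = {
    12: "Winter", 1: "Winter", 2: "Winter",
    3: "Spring", 4: "Spring", 5: "Spring",
    6: "Summer", 7: "Summer", 8: "Summer",
    9: "Autumn", 10: "Autumn", 11: "Autumn",
}


def add_date_columns(data, datetime_column="datetime"):
    """Return a copy of `data` with year, month, and season columns added."""
    stamps = pd.to_datetime(data[datetime_column], utc=True, errors="coerce")
    return data.assign(
        yearonly=stamps.dt.year,
        monthonly=stamps.dt.month,
        season=stamps.dt.month.map(MONTH_TO_SEASON),
    )


sample = pd.DataFrame({
    "datetime": ["2021-01-15 12:00", "2021-04-15 12:00",
                 "2021-07-15 12:00", "2021-10-15 12:00"],
})
enriched = add_date_columns(sample)
```

Because `assign` returns a new frame, the input is left unchanged, which makes the step safe to rerun in a notebook.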

### 📌 Plot 1: Scatter Plot of Relative Humidity (RH)
This plot shows RH values against the hour of day at which they were measured.

```python
ax = toolkit.presentation.dailyPlots.plotScatter(df, plotField='RH')
ax.set_title('Scatter Plot: Relative Humidity over 24 Hours')
```

### 📌 Plot 2: Line Plot for a Specific Date
Visualizes a full day of RH values to detect patterns or anomalies.

```python
# Use the first date present in the data as the sample day
sample_date = df['datetime'].dt.date.astype(str).iloc[0]
ax, line = toolkit.presentation.dailyPlots.dateLinePlot(df, plotField='RH', date=sample_date)
ax.set_title(f'Daily Line Plot: RH on {sample_date}')
```

### 📌 Plot 3: Probability Contour Plot
Displays the likelihood (density) of RH across hours of the day.

```python
CS, CFS, ax = toolkit.presentation.dailyPlots.plotProbContourf(df, plotField='RH')
ax.set_title('Probability Contour of Relative Humidity')
```

### 📌 Plot 4: Seasonal Contour Plots
Compares RH distributions across the 4 meteorological seasons.

```python
ax_grid = toolkit.presentation.seasonalPlots.plotProbContourf_bySeason(df, plotField='RH')
```

## ✅ Explanation of Unit Tests
Here is a summary of the unit tests and what each one checks:

- `test_plotScatter`: Verifies scatter plot runs and returns Axes.
- `test_plotScatter_matches_data`: Ensures plotted Y values exist in raw data.
- `test_plotScatter_WS_field` / `WD_field`: Validates support for other fields.
- `test_plotScatter_axis_labels`: Checks that X and Y labels are correctly set.
- `test_plotScatter_empty_dataframe`: Ensures empty input does not crash the plot.
- `test_plotScatter_with_nan_and_outliers`: Ensures invalid values are filtered.
- `test_plotScatter_creates_non_empty_image`: Confirms that plot image is not empty.
- `test_dateLinePlot`: Verifies line plot runs correctly for a given date.
- `test_dateLinePlot_matches_data`: Ensures exact Y values match original data.
- `test_plotProbContourf`: Verifies 2D histogram plot runs.
- `test_plotProbContourf_distribution_ranges`: Checks axis limits cover all data.
- `test_plotProbContourf_bySeason`: Verifies seasonal plots generate a 2x2 grid.
- `test_plotProbContourf_bySeason_basic`: Structural validation of seasonal grid.
- `test_add_dates_columns`: Ensures temporal columns are added correctly.
- `test_calc_hourly_dist_max_normalized`: Validates max-normalized histograms.
- `test_calc_hourly_dist_density`: Validates density normalization.
- `test_resample_second_moments`: Confirms second moment computations run without error.
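
As an illustration of the "matches data" style of check, the sketch below reads the plotted points back from a Matplotlib scatter and compares them with the input. `scatter_field_by_hour` is a hypothetical stand-in for the function under test; the real tests in `test_unit_lowfreq_toolkit.py` may be structured differently.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the check runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def scatter_field_by_hour(data, field):
    """Hypothetical stand-in for a scatter-plot function under test."""
    clean = data.dropna(subset=[field])
    fig, ax = plt.subplots()
    ax.scatter(clean["datetime"].dt.hour, clean[field])
    ax.set_xlabel("Hour")
    ax.set_ylabel(field)
    return ax


def check_scatter_matches_data(data, field):
    """Assert that every plotted Y value comes from the non-missing input values."""
    ax = scatter_field_by_hour(data, field)
    # A scatter call adds one PathCollection; its offsets are the (x, y) points.
    plotted_y = np.asarray(ax.collections[0].get_offsets())[:, 1]
    assert set(plotted_y) <= set(data[field].dropna())
    assert len(plotted_y) == data[field].notna().sum()
    plt.close(ax.figure)
    return plotted_y


sample = pd.DataFrame({
    "datetime": pd.date_range("2021-06-01", periods=6, freq="h", tz="UTC"),
    "RH": [40.0, 42.5, np.nan, 47.0, 51.5, 55.0],
})
plotted = check_scatter_matches_data(sample, "RH")
```

Closing the figure at the end of each check keeps a long test run from accumulating open figures.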