# NI Water Quality Data API Testing

During development of the [NI Water](https://github.com/andrewbolster/bolster/pull/1009) data source, the upstream data changed underfoot and became disconnected from the [OpenDataNI dataset](https://admin.opendatani.gov.uk/dataset/ni-water-customer-tap-authorised-supply-point-results) that drove the mapping between zones and postcodes.

## API Migration

The NI Water API has been updated from the old `.ashx` endpoint to a new REST API:

- **Old API**: `https://www.niwater.com/water-quality-lookup.ashx?z={zone_code}` (no longer working)
- **New API**: `https://www.niwater.com/api/water-quality/getitem?p={postcode}` (postcode-based lookup)
- **Multi-address**: `https://www.niwater.com/api/water-quality/getitem?z={zone}&p={postcode}` (for postcodes with multiple addresses)
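
To make the two new URL shapes concrete, here is a minimal sketch that builds them with the standard library. `build_water_quality_url` is a hypothetical helper, not part of `bolster`, and how the live API expects the space in a postcode to be encoded is an assumption (here it becomes `+`).

```python
from typing import Optional
from urllib.parse import urlencode

# Base URL taken from the endpoint list above.
BASE_URL = "https://www.niwater.com/api/water-quality/getitem"


def build_water_quality_url(postcode: str, zone_code: Optional[str] = None) -> str:
    """Hypothetical helper: build a lookup URL, optionally pinned to one zone."""
    params = {}
    if zone_code is not None:
        params["z"] = zone_code  # only needed for multi-address postcodes
    params["p"] = postcode
    # urlencode escapes the space in the postcode as "+" (assumed acceptable)
    return f"{BASE_URL}?{urlencode(params)}"


print(build_water_quality_url("BT14 7EJ"))
print(build_water_quality_url("BT12 4PE", zone_code="ZS0101"))
```

In practice the module's own functions hide this detail, so the sketch is only useful for poking at the endpoint by hand.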

This notebook demonstrates how to use the updated `bolster.data_sources.ni_water` module with the new API.

In [None]:
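import sys
from pathlib import Path

# Make the local src directory importable when running from the notebooks folder
src_path = Path("../src").resolve()
sys.path.insert(0, str(src_path))

import pandas as pd

# Import only the functions this notebook uses, rather than a wildcard import
from bolster.data_sources.ni_water import (
    get_postcode_to_water_supply_zone,
    get_water_quality,
    get_water_quality_by_postcode,
    get_water_quality_by_zone,
)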

## Example 1: Get Water Quality by Postcode (Single Address)

Most postcodes have a single water supply zone. The new API allows direct lookup by postcode:

In [None]:
# Get water quality data for a single-address postcode
data = get_water_quality_by_postcode('BT14 7EJ')
print(f"Water Supply Zone: {data['Water Supply Zone']}")
print(f"\nHardness Classification: {data['NI Hardness Classification']}")
print(f"Total Hardness: {data['Total Hardness (mg/l)']} mg/l")
data

## Example 2: Get Water Quality by Postcode (Multiple Addresses)

Some postcodes contain multiple addresses served by different water supply zones. `get_water_quality_by_postcode` detects this and automatically uses the first available zone:

In [None]:
# This postcode has multiple addresses with different zones
# The function will automatically select the first zone (ZS0107)
data_multi = get_water_quality_by_postcode('BT12 4PE')
print(f"Water Supply Zone: {data_multi['Water Supply Zone']}")
print(f"Hardness Classification: {data_multi['NI Hardness Classification']}")
data_multi

## Example 3: Specify Zone Code for Multi-Address Postcodes

If you need a specific zone within a multi-address postcode, you can provide the zone code:

In [None]:
# Get data for a specific zone within the postcode
# BT12 4PE has both ZS0107 and ZS0101 zones available
data_specific = get_water_quality_by_postcode('BT12 4PE', zone_code='ZS0101')
print(f"Water Supply Zone: {data_specific['Water Supply Zone']}")
print(f"Hardness Classification: {data_specific['NI Hardness Classification']}")
data_specific
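
The selection rule from Examples 2 and 3 (use the requested zone if one is given, otherwise fall back to the first available) can be sketched in isolation. `choose_zone` is a hypothetical function written for illustration; the actual logic inside `get_water_quality_by_postcode` may differ, including in how it reports an unknown zone.

```python
from typing import Optional, Sequence


def choose_zone(available_zones: Sequence[str], zone_code: Optional[str] = None) -> str:
    """Hypothetical: pick the requested zone, else the first one listed."""
    if not available_zones:
        raise ValueError("no water supply zones available for this postcode")
    if zone_code is None:
        return available_zones[0]  # default: first available zone
    if zone_code not in available_zones:
        raise ValueError(f"{zone_code} is not one of {list(available_zones)}")
    return zone_code


zones = ["ZS0107", "ZS0101"]  # the two zones reported above for BT12 4PE
print(choose_zone(zones))                      # ZS0107
print(choose_zone(zones, zone_code="ZS0101"))  # ZS0101
```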

## Example 4: Get Water Quality by Zone Code

The zone-based lookup still works, but now internally uses the postcode API by finding a postcode for the zone:

In [None]:
# Get water quality data by zone code (backward compatible)
data_zone = get_water_quality_by_zone('ZS0101')
print(f"Water Supply Zone: {data_zone['Water Supply Zone']}")
print(f"Hardness Classification: {data_zone['NI Hardness Classification']}")
data_zone
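
As a rough illustration of "finding a postcode for the zone", the sketch below scans a postcode-to-zone mapping and returns the first match. The helper name and the sample pairings are made up for the example; the real module may choose its representative postcode differently.

```python
from typing import Mapping, Optional

# Illustrative stand-in for the OpenDataNI mapping; these pairings are invented.
SAMPLE_MAPPING = {
    "BT00 0AA": "ZS0107",
    "BT00 0AB": "ZS0101",
    "BT00 0AC": "ZS0101",
}


def first_postcode_for_zone(mapping: Mapping[str, str], zone_code: str) -> Optional[str]:
    """Hypothetical: return the first postcode mapped to zone_code, or None."""
    return next((pc for pc, zone in mapping.items() if zone == zone_code), None)


print(first_postcode_for_zone(SAMPLE_MAPPING, "ZS0101"))  # BT00 0AB
print(first_postcode_for_zone(SAMPLE_MAPPING, "ZS9999"))  # None
```

Returning `None` for an unknown zone is a design choice of this sketch; raising an exception would be an equally reasonable alternative.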

## Example 5: Postcode to Water Supply Zone Mapping

The postcode-to-zone mapping comes from OpenDataNI and is used to look up zones:

In [None]:
mapping = get_postcode_to_water_supply_zone()
print(f"Total postcodes in mapping: {len(mapping)}")
print(f"Total unique zones: {len(set(mapping.values()))}")
print("\nSample mappings:")
for postcode, zone in list(mapping.items())[:5]:
    print(f"  {postcode} -> {zone}")

In [None]:
# Find all postcodes for a specific zone
zone_postcodes = {code for code, zone in mapping.items() if zone == 'ZS0101'}
print(f"Postcodes in zone ZS0101: {len(zone_postcodes)}")
print(f"Sample postcodes: {sorted(zone_postcodes)[:10]}")
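
If postcodes are needed for many zones, scanning the mapping once per zone gets repetitive. A possible alternative, assuming the mapping behaves like a plain `dict` of postcode to zone, is to invert it in a single pass. `group_postcodes_by_zone` and the sample data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Mapping


def group_postcodes_by_zone(mapping: Mapping[str, str]) -> Dict[str, List[str]]:
    """Hypothetical: invert postcode -> zone into zone -> sorted postcodes."""
    grouped = defaultdict(list)
    for postcode, zone in mapping.items():
        grouped[zone].append(postcode)
    # Sort each list so the output is deterministic
    return {zone: sorted(postcodes) for zone, postcodes in grouped.items()}


# Invented sample data, standing in for get_postcode_to_water_supply_zone()
sample = {"BT00 0AC": "ZS0101", "BT00 0AA": "ZS0107", "BT00 0AB": "ZS0101"}
by_zone = group_postcodes_by_zone(sample)
print(by_zone["ZS0101"])  # ['BT00 0AB', 'BT00 0AC']
```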

## Example 6: Get All Water Quality Data

Retrieve water quality data for all valid supply zones (this takes several minutes):

In [None]:
# Get all water quality data (WARNING: slow operation)
df = get_water_quality()
print(f"Total zones with data: {len(df)}")
print("\nHardness classification distribution:")
print(df['NI Hardness Classification'].value_counts())
df.head()

In [None]:
# View the full dataset
df
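
For intuition about why the full retrieval is slow, here is an offline sketch of one way such an aggregate could be assembled: one lookup per unique zone, skipping zones that return nothing. Everything here is an assumption for illustration (the function names, the `ValueError` used to signal a missing zone, and the invented hardness labels); it is not the implementation of `get_water_quality`.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, List


def collect_zone_records(
    zones: Iterable[str], fetch: Callable[[str], Dict[str, str]]
) -> List[Dict[str, str]]:
    """Hypothetical: call fetch once per unique zone, skipping failures."""
    records = []
    for zone in sorted(set(zones)):  # de-duplicate so each zone is fetched once
        try:
            records.append(fetch(zone))
        except ValueError:
            continue  # assumed failure signal for zones without data
    return records


def fake_fetch(zone: str) -> Dict[str, str]:
    """Offline stand-in for a per-zone lookup; the values are invented."""
    if zone == "ZS0000":
        raise ValueError(f"no data for {zone}")
    hardness = "Soft" if zone.endswith("1") else "Moderately hard"
    return {"Water Supply Zone": zone, "NI Hardness Classification": hardness}


records = collect_zone_records(["ZS0101", "ZS0107", "ZS0000", "ZS0101"], fake_fetch)
print(len(records))  # 2
print(Counter(r["NI Hardness Classification"] for r in records))
```

With a real network call in place of `fake_fetch`, the run time grows with the number of unique zones, which is consistent with the several-minute warning above.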

## Data Fields

Each water quality record includes the following fields:

In [None]:
# Show all available data fields
sample_data = get_water_quality_by_postcode('BT14 7EJ')
print("Available fields:")
for field in sample_data.index:
    print(f"  - {field}")