Sources:

* <a href="https://environment.data.gov.uk/catchment-planning/api/docs" target="_blank">Catchment Data</a>

<br>
<br>

# Preliminaries

## Libraries

In [1]:
import logging
import os
import zipfile
import io
import requests

import pandas as pd
import geopandas as gpd
import folium

import seaborn as sns

<br>

Custom modules

In [2]:
import src.functions.archives
import src.functions.directories
import src.functions.streams

In [3]:
archives = src.functions.archives.Archives()
streams = src.functions.streams.Streams()

<br>

## Logging

In [4]:
logging.basicConfig(level=logging.INFO,
                    format='\n%(message)s\n%(asctime)s.%(msecs)03d',
                    datefmt='%Y-%m-%d %H:%M:%S')
logger = logging.getLogger(__name__)
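
The format string places the message first and the timestamp on the following line, which is why the logged outputs in this notebook read as text followed by a date. A minimal sketch of that rendering, writing one record into a string instead of the console; the `buffer` handler and the logger name `sketch` are illustrative assumptions, not part of the notebook.

```python
import io
import logging

# Capture the log output in memory so the rendered text can be inspected
buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(logging.Formatter(
    fmt='\n%(message)s\n%(asctime)s.%(msecs)03d',
    datefmt='%Y-%m-%d %H:%M:%S'))

# A separate, hypothetical logger so the notebook's own logger is untouched
sketch = logging.getLogger('sketch')
sketch.setLevel(logging.INFO)
sketch.addHandler(handler)
sketch.propagate = False

sketch.info('a message')

# A blank line, the message, then the timestamp with milliseconds
rendered = buffer.getvalue()
```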

<br>

## Settings

Graph Settings

In [5]:
# sns.set() resets the style and context to their defaults, hence a single call
sns.set_theme(style="white", context="poster", font_scale=1.5)

<br>

Paths

In [6]:
root = os.getcwd()
logger.info(root)


J:\library\thirdreading\experiment
2023-06-23 21:20:19.451


<br>

# Exploration

## Shapefile

In [7]:
blob = 'https://environment.data.gov.uk/catchment-planning/England/shapefile.zip'
path = os.path.join(root, 'data', 'mapping', 'island')
archives.directory(path)
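
The cell above defines `blob` and prepares `path`, but the download and extraction step is not shown, and the behaviour of `archives.directory` is not documented here. A minimal sketch of unpacking a ZIP archive that is already held in memory, under the assumption that the archive is meant to be extracted into `path`; the function name `unpack` is hypothetical.

```python
import io
import os
import zipfile


def unpack(content: bytes, directory: str) -> list:
    """Extract an in-memory ZIP archive into `directory`; return the member names."""
    # Create the target directory if it does not exist yet
    os.makedirs(directory, exist_ok=True)
    with zipfile.ZipFile(io.BytesIO(content)) as archive:
        archive.extractall(path=directory)
        return archive.namelist()
```

With the notebook's names, a call might look like `unpack(requests.get(blob, timeout=60).content, path)`, ideally after checking the response with `raise_for_status()`.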

<br>

## GeoJSON

In [8]:
url = 'https://code.highcharts.com/mapdata/countries/gb/gb-all.geo.json'
readings = gpd.read_file(filename=url)

In [9]:
readings.tail().iloc[:, -9:]

| | subregion | woe-name | fips | latitude | woe-label | postal-code | type | name | geometry |
|---|---|---|---|---|---|---|---|---|---|
| 228 | Kent | Medway | UK24 | 51.3541 | | MW | Unitary Authority | Medway | POLYGON ((4977.000 761.000, 4994.000 766.000, ... |
| 229 | Bedfordshire | Luton Borough | UK02 | 51.8763 | | LU | Unitary Authority | Luton | POLYGON ((4318.000 1244.000, 4356.000 1164.000... |
| 230 | Wiltshire | Wiltshire | UK46 | 51.3289 | | WL | Unitary Single-Tier County | Wiltshire | POLYGON ((3343.000 928.000, 3284.000 821.000, ... |
| 231 | The Islands | Shetland Islands | UK83 | 60.3097 | | | Unitary District (city) | Shetland Islands | MULTIPOLYGON (((4780.000 7650.000, 4779.000 76... |
| 232 | | | | | | | | | LINESTRING (4078.000 9677.000, 4078.000 7445.000) |


<br>

# Hydrometry, Elements, Discharges

## Hydrometry Stations

In [10]:
storage = os.path.join(root, 'warehouse', 'hydrometry', 'references')

<br>

The gazetteer of hydrometry stations.

In [11]:
gazetteer = streams.read(uri=os.path.join(storage, 'gazetteer.csv'), header=0)
gazetteer.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8004 entries, 0 to 8003
Data columns (total 16 columns):
 #   Column                 Non-Null Count  Dtype  
---  ------                 --------------  -----  
 0   station_id             8004 non-null   object 
 1   station_guid           6499 non-null   object 
 2   easting                8004 non-null   int64  
 3   northing               8004 non-null   int64  
 4   longitude              8004 non-null   float64
 5   latitude               8004 non-null   float64
 6   catchment_area         649 non-null    float64
 7   station_name           8004 non-null   object 
 8   river_name             3539 non-null   object 
 9   date_opened            7987 non-null   object 
 10  date_closed            1439 non-null   object 
 11  wiski_id               6499 non-null   object 
 12  river_level_tool_id    1780 non-null   float64
 13  rfa_station_id         809 non-null    float64
 14  station_status         7987 non-null   object 
 15  stat

<br>

The measures associated with each station.

In [12]:
sources = streams.read(uri=os.path.join(storage, 'sources.csv'), header=0)
sources = sources[['station_id', 'station_guid', 'segment', 'measure']].copy()
sources.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 41863 entries, 0 to 41862
Data columns (total 4 columns):
 #   Column        Non-Null Count  Dtype 
---  ------        --------------  ----- 
 0   station_id    41863 non-null  object
 1   station_guid  20331 non-null  object
 2   segment       41863 non-null  object
 3   measure       41863 non-null  object
dtypes: object(4)
memory usage: 1.3+ MB


<br>

### Focusing on water integrity stations


In [13]:
logger.info(sources['segment'].unique())


['is_groundwater' 'is_rainfall_station' 'is_sampling_location'
 'is_integrity_station']
2023-06-23 21:20:20.474


<br>

Hence, the records of the integrity stations:

In [14]:
excerpt = sources.loc[sources['segment'] == 'is_integrity_station', :]
excerpt.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 10766 entries, 20485 to 41862
Data columns (total 4 columns):
 #   Column        Non-Null Count  Dtype 
---  ------        --------------  ----- 
 0   station_id    10766 non-null  object
 1   station_guid  0 non-null      object
 2   segment       10766 non-null  object
 3   measure       10766 non-null  object
dtypes: object(4)
memory usage: 420.5+ KB
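
Each row of `sources` is a measure, so a station appears once per measure and the row count of a segment is larger than its number of stations. A minimal sketch of counting both, using a hypothetical miniature frame in place of `sources`; the station identifiers and measure names below are invented for illustration.

```python
import pandas as pd

# Hypothetical miniature stand-in for `sources`: one row per station and measure
frame = pd.DataFrame({
    'station_id': ['S1', 'S1', 'S2', 'R1'],
    'segment': ['is_integrity_station', 'is_integrity_station',
                'is_integrity_station', 'is_rainfall_station'],
    'measure': ['ph', 'nitrate', 'ph', 'rainfall']})

# Number of measure records per segment
records = frame['segment'].value_counts()

# Number of distinct stations per segment
stations = frame.groupby('segment')['station_id'].nunique()
```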


<br>

Excluding the measure details, i.e., focusing on the identifiers and coordinates of the water integrity stations.

In [15]:
points = excerpt[['station_id', 'station_guid', 'segment']].drop_duplicates()
points = gazetteer.merge(points, how='right', on=['station_id', 'station_guid'])
points.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 1505 entries, 0 to 1504
Data columns (total 17 columns):
 #   Column                 Non-Null Count  Dtype  
---  ------                 --------------  -----  
 0   station_id             1505 non-null   object 
 1   station_guid           0 non-null      object 
 2   easting                1505 non-null   int64  
 3   northing               1505 non-null   int64  
 4   longitude              1505 non-null   float64
 5   latitude               1505 non-null   float64
 6   catchment_area         0 non-null      float64
 7   station_name           1505 non-null   object 
 8   river_name             1505 non-null   object 
 9   date_opened            1505 non-null   object 
 10  date_closed            1439 non-null   object 
 11  wiski_id               0 non-null      object 
 12  river_level_tool_id    0 non-null      float64
 13  rfa_station_id         0 non-null      float64
 14  station_status         1505 non-null   object 
 15  stat
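
The merge uses `station_guid` as a key even though that field is entirely null for the integrity stations. It still finds the gazetteer rows because pandas matches null keys against each other, unlike a SQL join. A minimal sketch of that behaviour with hypothetical miniature frames (`left` stands in for `gazetteer`, `right` for `points`; the values are invented):

```python
import pandas as pd

# Hypothetical stand-in for the gazetteer; some stations have no guid
left = pd.DataFrame({
    'station_id': ['A1', 'B2', 'C3'],
    'station_guid': pd.Series([None, 'guid-b', None], dtype='object'),
    'easting': [100, 200, 300]})

# Hypothetical stand-in for the integrity stations; every guid is null
right = pd.DataFrame({
    'station_id': ['A1', 'C3'],
    'station_guid': pd.Series([None, None], dtype='object'),
    'segment': ['is_integrity_station', 'is_integrity_station']})

# Null guids on both sides are treated as equal, so the rows pair up
both = left.merge(right, how='right', on=['station_id', 'station_guid'])

# Joining on the identifier alone avoids relying on null matching
single = left.drop(columns='station_guid').merge(right, how='right', on='station_id')
```

If station identifiers are unique within the gazetteer, joining on `station_id` alone gives the same rows and makes the intent more explicit; whether they are unique is an assumption to check against the data.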

<br>

### Illustrations

In [16]:
centre = gpd.tools.geocode('Great Britain', provider='nominatim', user_agent='spatial.analysis').loc[0, :]
centre

geometry    POINT (-1.9180234948012402 54.31536155)
address               Great Britain, United Kingdom
Name: 0, dtype: object

In [17]:
longitude = centre.geometry.x
latitude = centre.geometry.y

In [18]:
gbr = folium.Map(location=[latitude, longitude], tiles='Stamen Terrain', zoom_start=8, width='65%', height='65%')
gbr

In [19]:
folium.Map(location=[latitude, longitude], tiles='OpenStreetMap', zoom_start=8, width='65%', height='65%')