#### Groundwater data

- This notebook reads in the groundwater measurement data and the groundwater station data, cleans them, and merges them.
- [Source](https://data.cnra.ca.gov/dataset/periodic-groundwater-level-measurements); the data were retrieved using `requests` and stored in a file.
- [Data location](/work/assets/groundwater.csv)

In [None]:
import sys
sys.path.append('..')

In [None]:
from lib.groundwater import GroundwaterDataset



##### Groundwater measurements
https://data.cnra.ca.gov/dataset/periodic-groundwater-level-measurements

- The elevation of the groundwater surface can be calculated by subtracting the depth to groundwater from the ground surface elevation.
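
The subtraction above can be checked against the data. The sketch below is a minimal illustration, assuming `WLM_GSE` holds the ground surface elevation and `GSE_GWE` the depth to groundwater; the two sample rows are copied from the cleaned output in this notebook, and `GWE_CHECK` is a hypothetical column name introduced only for the comparison.

```python
import pandas as pd

# Two rows copied from the cleaned output shown in this notebook.
sample = pd.DataFrame({
    "SITE_CODE": ["320000N1140000W001", "320000N1140000W001"],
    "WLM_GSE": [545.92, 545.92],  # assumed: ground surface elevation
    "GSE_GWE": [409.0, 423.0],    # assumed: depth to groundwater
    "GWE": [136.92, 122.92],      # groundwater elevation as reported
})

# Groundwater elevation = ground surface elevation - depth to groundwater.
# GWE_CHECK is a hypothetical column used only for this comparison.
sample["GWE_CHECK"] = (sample["WLM_GSE"] - sample["GSE_GWE"]).round(2)

print(sample[["GWE", "GWE_CHECK"]])
```

For these two rows the recomputed values agree with the reported `GWE` column.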

> [Water levels in many aquifers](https://pubs.usgs.gov/circ/circ1217/pdf/circ1217_final.pdf) in the United States follow a natural cyclic pattern of seasonal
> fluctuation, typically rising during the winter and spring due to greater precipitation and recharge, then declining during the summer
> and fall owing to less recharge and greater evapotranspiration. Hence, only spring groundwater measurements are used below.
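
A minimal sketch of such a seasonal filter follows. The choice of months is an assumption: months 1 to 4 are inferred from the `MONTH` values visible in the cleaned output in this notebook, and the filter actually used by `GroundwaterDataset` may differ. The function name `keep_spring_measurements` and the sample rows are hypothetical.

```python
import pandas as pd

# ASSUMPTION: months 1-4 are treated as "spring", inferred from the MONTH
# values in the cleaned output; the project's real filter may differ.
SPRING_MONTHS = (1, 2, 3, 4)


def keep_spring_measurements(df, months=SPRING_MONTHS):
    """Return only the rows whose MSMT_DATE falls in the given months."""
    dates = pd.to_datetime(df["MSMT_DATE"])
    return df[dates.dt.month.isin(months)].copy()


# Hypothetical measurements, made up for illustration only.
measurements = pd.DataFrame({
    "SITE_CODE": ["A", "A", "A"],
    "MSMT_DATE": ["2020-03-26", "2020-08-15", "2020-11-02"],
    "GSE_GWE": [380.0, 395.5, 390.0],
})

spring = keep_spring_measurements(measurements)
print(spring)
```

Passing a different `months` tuple adjusts the season without touching the rest of the pipeline.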

In [None]:
groundwater_data = GroundwaterDataset()

In [None]:
groundwater_data.clean_groundwater_data()

Unnamed: 0,GWE,GSE_GWE,MSMT_DATE,SITE_CODE,WLM_GSE,WLM_ID,WLM_RPE,YEAR,MONTH,LONGITUDE,LATITUDE,GSE,COUNTY_NAME,WELL_USE
0,136.92,409.0,2021-04-29 16:00:00,320000N1140000W001,545.92,3016754,545.92,2021,4,-121.755,36.5605,,Monterey,Residential
1,122.92,423.0,2020-04-30 00:00:00,320000N1140000W001,545.92,2688520,545.92,2020,4,-121.755,36.5605,,Monterey,Residential
2,165.92,380.0,2020-03-26 00:00:00,320000N1140000W001,545.92,2688519,545.92,2020,3,-121.755,36.5605,,Monterey,Residential
3,163.92,382.0,2020-02-27 00:00:00,320000N1140000W001,545.92,2688518,545.92,2020,2,-121.755,36.5605,,Monterey,Residential
4,162.92,383.0,2020-01-30 00:00:00,320000N1140000W001,545.92,2688517,545.92,2020,1,-121.755,36.5605,,Monterey,Residential
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
895604,4019.90,40.1,2002-03-13 00:00:00,420171N1214111W001,4060.00,675334,4061.00,2002,3,-121.411,42.0120,4060.0,"Klamath, OR",Other
895605,4019.10,40.9,2002-02-27 00:00:00,420171N1214111W001,4060.00,675333,4061.00,2002,2,-121.411,42.0120,4060.0,"Klamath, OR",Other
895606,4018.50,41.5,2002-02-13 00:00:00,420171N1214111W001,4060.00,675332,4061.00,2002,2,-121.411,42.0120,4060.0,"Klamath, OR",Other
895607,4017.80,42.2,2002-01-30 00:00:00,420171N1214111W001,4060.00,675331,4061.00,2002,1,-121.411,42.0120,4060.0,"Klamath, OR",Other


In [None]:
groundwater_data.draw_mising_data_chart()

In [None]:
groundwater_data.merge_data_plss()

Unnamed: 0,GWE,GSE_GWE,MSMT_DATE,SITE_CODE,WLM_GSE,WLM_ID,WLM_RPE,YEAR,MONTH,LONGITUDE,...,geometry,index_right,OBJECTID,Township,Range,Meridian,Source,Section,MTRS,TownshipRange
182895,90.00,254.00,2012-02-02 00:00:00,344779N1192479W001,344.0,54535,346.00,2012,2,-119.248,...,POINT (-119.24800 35.47790),12667.0,87185.0,T28S,R25E,MDM,BLM,23.0,MDM-T28S-R25E-23,T28S R25E
182896,38.00,306.00,2020-03-11 00:00:00,344779N1192479W001,344.0,2625242,346.00,2020,3,-119.248,...,POINT (-119.24800 35.47790),12667.0,87185.0,T28S,R25E,MDM,BLM,23.0,MDM-T28S-R25E-23,T28S R25E
182897,36.00,308.00,2019-02-01 00:00:00,344779N1192479W001,344.0,2470090,346.00,2019,2,-119.248,...,POINT (-119.24800 35.47790),12667.0,87185.0,T28S,R25E,MDM,BLM,23.0,MDM-T28S-R25E-23,T28S R25E
182898,47.00,297.00,2018-02-06 00:00:00,344779N1192479W001,344.0,2307506,346.00,2018,2,-119.248,...,POINT (-119.24800 35.47790),12667.0,87185.0,T28S,R25E,MDM,BLM,23.0,MDM-T28S-R25E-23,T28S R25E
182899,46.00,298.00,2017-02-08 00:00:00,344779N1192479W001,344.0,2252519,346.00,2017,2,-119.248,...,POINT (-119.24800 35.47790),12667.0,87185.0,T28S,R25E,MDM,BLM,23.0,MDM-T28S-R25E-23,T28S R25E
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
747576,-11.24,15.24,2018-03-30 12:00:00,385330N1213710W005,4.0,2294596,4.46,2018,3,-121.371,...,POINT (-121.37100 38.05450),848.0,11047.0,T02N,R05E,MDM,DGR,1.0,MDM-T02N-R05E-1,T02N R05E
747577,-10.04,14.04,2017-04-06 12:00:00,385330N1213710W005,4.0,2255586,4.46,2017,4,-121.371,...,POINT (-121.37100 38.05450),848.0,11047.0,T02N,R05E,MDM,DGR,1.0,MDM-T02N-R05E-1,T02N R05E
747578,-13.54,17.54,2016-04-15 12:00:00,385330N1213710W005,4.0,2194570,4.46,2016,4,-121.371,...,POINT (-121.37100 38.05450),848.0,11047.0,T02N,R05E,MDM,DGR,1.0,MDM-T02N-R05E-1,T02N R05E
747579,-13.74,17.74,2015-03-23 12:00:00,385330N1213710W005,4.0,2132808,4.46,2015,3,-121.371,...,POINT (-121.37100 38.05450),848.0,11047.0,T02N,R05E,MDM,DGR,1.0,MDM-T02N-R05E-1,T02N R05E


#### The functions below have since become class methods and do not need to be run

In [None]:
import pandas as pd

groundwater_df = pd.read_csv(r"/work/assets/inputs/groundwater/groundwater.csv")
groundwaterstations_df = pd.read_csv(r"/work/assets/inputs/groundwater/groundwater_stations.csv")
print(groundwater_df.shape) # (2530751, 16)
groundwater_df.drop(columns=['Unnamed: 0', '_id', 'WLM_ORG_NAME', 'COOP_ORG_NAME', 'WLM_ACC_DESC', 'WLM_QA_DESC', 'WLM_DESC', 'MSMT_CMT', 'MONITORING_PROGRAM' ], inplace=True)

groundwaterstations_df = groundwaterstations_df.loc[:,['SITE_CODE','LONGITUDE','LATITUDE','GSE','COUNTY_NAME','WELL_USE']].copy()

# create simple year and month columns
groundwater_df['MSMT_DATE'] = pd.to_datetime(groundwater_df.MSMT_DATE)
groundwater_df['YEAR'] = groundwater_df['MSMT_DATE'].dt.year
groundwater_df['MONTH'] = groundwater_df['MSMT_DATE'].dt.month

# Retain only those records that have groundwater measurements
groundwater_df = groundwater_df[~groundwater_df['GSE_GWE'].isnull()] # 2325741 
print(groundwater_df.shape)

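# Keep only spring measurements.
# ASSUMPTION: the original spring filter is not shown in this cell; months 1-4
# are inferred from the MONTH values in the cleaned output above.
spring_months_groundwater = groundwater_df[groundwater_df['MONTH'].isin([1, 2, 3, 4])]

# merge with station data for location info
spring_month_groundwater_location = spring_months_groundwater.merge(groundwaterstations_df, on='SITE_CODE')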
spring_month_groundwater_location.sample()

# drop rows with invalid measurements (a depth of 0 or less)
spring_month_groundwater_location = spring_month_groundwater_location[spring_month_groundwater_location['GSE_GWE'] > 0]



In [None]:
import geopandas as gpd

# Build a GeoDataFrame with point geometry from the longitude/latitude columns
gdf_spring_groundwater = gpd.GeoDataFrame(
    spring_month_groundwater_location, 
    geometry=gpd.points_from_xy(
        spring_month_groundwater_location.LONGITUDE, 
        spring_month_groundwater_location.LATITUDE
    ))

# Set the coordinate reference system to WGS 84 (longitude/latitude)
gdf_spring_groundwater = gdf_spring_groundwater.set_crs('epsg:4326')

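# Spatially join each well point to the PLSS polygon it intersects.
# SJ_subbasin_plss is not defined in the cells shown here; it must already be loaded as a GeoDataFrame.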
spring_groundwater_plss = gdf_spring_groundwater.sjoin(SJ_subbasin_plss, how="left")
spring_groundwater_plss.sample()

# drop the wells that do not fall within a subbasin township-range-section (MTRS)
spring_groundwater_plss = spring_groundwater_plss.dropna(subset=['MTRS'])

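# color the 2019 wells by TownshipRange (explore needs the full GeoDataFrame and the column= argument for this)
spring_groundwater_plss[spring_groundwater_plss['YEAR']==2019].explore(column='TownshipRange')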

In [None]:
# Group wells that had multiple spring measurements in some years and get the average
spring_groundwater_group = spring_groundwater_plss.groupby(['SITE_CODE','MTRS','TownshipRange','COUNTY_NAME','YEAR']).agg({
    'GSE_GWE': ['mean'],
}).reset_index()
#spring_groundwater_group.columns = ['site','MTRS','TownshipRange','county','year','gse_gwe']

In [None]:
# export all measurements for analysis
spring_groundwater_group.to_csv("/work/assets/clean_data/spring_groundwater_levels_clean.csv", index=False)
