#### Groundwater data

- This notebook reads the groundwater measurement data and the groundwater station data, cleans them, and merges them.
- [Source](https://data.cnra.ca.gov/dataset/periodic-groundwater-level-measurements). The data was retrieved using requests and stored in a file.
- [Data location](/work/assets/groundwater.csv)

##### Groundwater measurements
https://data.cnra.ca.gov/dataset/periodic-groundwater-level-measurements

- The elevation of the groundwater surface can be calculated by subtracting the depth to groundwater from the ground surface elevation.
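
As a minimal sketch of that subtraction, assume `GSE` holds the ground surface elevation and `GSE_GWE` holds the depth to groundwater below the ground surface. The values, the units, and the `GWE` column name are made up for illustration and are not taken from the dataset.

```python
import pandas as pd

# Toy values, not taken from the dataset
wells = pd.DataFrame({
    "SITE_CODE": ["A", "B"],
    "GSE": [250.0, 310.5],      # ground surface elevation (assumed feet)
    "GSE_GWE": [42.0, 110.5],   # depth to groundwater below ground surface (assumed)
})

# groundwater surface elevation = ground surface elevation - depth to groundwater
wells["GWE"] = wells["GSE"] - wells["GSE_GWE"]
print(wells)
```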

> [Water levels in many aquifers](https://pubs.usgs.gov/circ/circ1217/pdf/circ1217_final.pdf) in the United States follow a natural cyclic pattern of seasonal
> fluctuation, typically rising during the winter and spring due to greater precipitation and recharge, then declining during the summer
> and fall owing to less recharge and greater evapotranspiration. Hence, below we take only the spring measurements for groundwater.

In [None]:
import pandas as pd

groundwater_df = pd.read_csv(r"/work/assets/inputs/groundwater/groundwater.csv")
groundwaterstations_df = pd.read_csv(r"/work/assets/inputs/groundwater/groundwater_stations.csv")
print(groundwater_df.shape) # (2530751, 16)
groundwater_df.drop(columns=['Unnamed: 0', '_id', 'WLM_ORG_NAME', 'COOP_ORG_NAME', 'WLM_ACC_DESC', 'WLM_QA_DESC', 'WLM_DESC', 'MSMT_CMT', 'MONITORING_PROGRAM' ], inplace=True)

groundwaterstations_df = groundwaterstations_df.loc[:,['SITE_CODE','LONGITUDE','LATITUDE','GSE','COUNTY_NAME','WELL_USE']].copy()

# create simple year and month columns
groundwater_df['MSMT_DATE'] = pd.to_datetime(groundwater_df.MSMT_DATE)
groundwater_df['YEAR'] = groundwater_df['MSMT_DATE'].dt.year
groundwater_df['MONTH'] = groundwater_df['MSMT_DATE'].dt.month

# Retain only those records that have groundwater measurements
groundwater_df = groundwater_df[groundwater_df['GSE_GWE'].notnull()]  # 2325741
print(groundwater_df.shape)  # fewer rows and columns than the (2530751, 16) printed above

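# keep only the spring measurements
# NOTE: the original cell used spring_months_groundwater without defining it;
# the March-May window below is an assumption, adjust it to the intended months
spring_months_groundwater = groundwater_df[groundwater_df['MONTH'].isin([3, 4, 5])]

# merge with station data for location info
spring_month_groundwater_location = spring_months_groundwater.merge(groundwaterstations_df, on='SITE_CODE')
spring_month_groundwater_location.sample()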

# drop rows with invalid measurements (GSE_GWE of 0 or less)
spring_month_groundwater_location = spring_month_groundwater_location[spring_month_groundwater_location['GSE_GWE'] > 0]
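
One way to check the merge with the station table is to ask pandas to validate the key relationship and to flag measurements that found no station. This is only a sketch on made-up rows, assuming each `SITE_CODE` appears once in the station table; the frame names `measurements`, `stations`, and `checked` are illustrative and not part of the notebook.

```python
import pandas as pd

# Made-up rows standing in for the measurement and station tables
measurements = pd.DataFrame({
    "SITE_CODE": ["A", "A", "B", "C"],
    "GSE_GWE": [40.0, 42.0, 110.5, 75.0],
})
stations = pd.DataFrame({
    "SITE_CODE": ["A", "B"],
    "LONGITUDE": [-120.1, -119.8],
    "LATITUDE": [36.9, 36.5],
})

# validate raises MergeError if a SITE_CODE is duplicated in stations;
# indicator adds a _merge column showing which rows found a station
checked = measurements.merge(
    stations, on="SITE_CODE", how="left",
    validate="many_to_one", indicator=True,
)
unmatched = checked[checked["_merge"] == "left_only"]
print(f"{len(unmatched)} measurement(s) without a station record")
```

With the default inner merge, rows like the one for site `C` are dropped without any message, so a check of this kind can show how many measurements are lost at this step.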



In [None]:
import geopandas as gpd

# Follow the routine to create a GeoDataFrame
gdf_spring_groundwater = gpd.GeoDataFrame(
    spring_month_groundwater_location, 
    geometry=gpd.points_from_xy(
        spring_month_groundwater_location.LONGITUDE, 
        spring_month_groundwater_location.LATITUDE
    ))

# Set the coordinate reference system to WGS 84 (longitude/latitude in degrees)
gdf_spring_groundwater = gdf_spring_groundwater.set_crs('epsg:4326')

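# match each well to a PLSS polygon based on longitude/latitude
# (SJ_subbasin_plss is not defined in this section; it is assumed to be a
# GeoDataFrame loaded earlier and to use the same CRS)
spring_groundwater_plss = gdf_spring_groundwater.sjoin(SJ_subbasin_plss, how="left")
spring_groundwater_plss.sample()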

# drop the measurements that did not match any subbasin polygon (no MTRS value)
spring_groundwater_plss = spring_groundwater_plss.dropna(subset=['MTRS'])

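# color the 2019 points by TownshipRange; explore() takes the column name via
# column=, while color= expects a literal color and needs the full GeoDataFrame
spring_groundwater_plss[spring_groundwater_plss['YEAR'] == 2019].explore(column='TownshipRange')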
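
The point-in-polygon matching in this cell can be sketched on toy data. The two square polygons, their `MTRS` labels, and the well coordinates below are invented for illustration; they stand in for the real PLSS layer, which is not shown in this section.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import box

# Two made-up square "sections" standing in for the PLSS polygons
sections = gpd.GeoDataFrame(
    {"MTRS": ["SEC-1", "SEC-2"]},
    geometry=[box(-121.0, 36.0, -120.0, 37.0), box(-120.0, 36.0, -119.0, 37.0)],
    crs="EPSG:4326",
)

# Made-up wells: A and B fall inside a section, C lies outside both
wells = pd.DataFrame({
    "SITE_CODE": ["A", "B", "C"],
    "LONGITUDE": [-120.5, -119.5, -118.0],
    "LATITUDE": [36.5, 36.5, 36.5],
})
wells_gdf = gpd.GeoDataFrame(
    wells,
    geometry=gpd.points_from_xy(wells.LONGITUDE, wells.LATITUDE),
    crs="EPSG:4326",
)

# Left join keeps every well; wells outside all polygons get NaN in MTRS
joined = wells_gdf.sjoin(sections, how="left", predicate="within")
inside = joined.dropna(subset=["MTRS"])
print(inside[["SITE_CODE", "MTRS"]])
```

Both layers share the same CRS here. If the real layers differ, one of them would need reprojecting (for example with `to_crs`) before the join.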

In [None]:
# Group wells that had multiple spring measurements in some years and get the average
spring_groundwater_group = spring_groundwater_plss.groupby(['SITE_CODE','MTRS','TownshipRange','COUNTY_NAME','YEAR']).agg({
    'GSE_GWE': ['mean'],
}).reset_index()
#spring_groundwater_group.columns = ['site','MTRS','TownshipRange','county','year','gse_gwe']
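
Passing a list of functions to `agg` produces two-level column labels, which then appear as two header rows in a CSV export. The sketch below, on made-up rows, contrasts that with named aggregation, which gives flat labels directly. The output name `gse_gwe` is illustrative, borrowed from the commented-out rename.

```python
import pandas as pd

# Made-up readings: site A has two spring measurements in 2019
readings = pd.DataFrame({
    "SITE_CODE": ["A", "A", "B"],
    "YEAR": [2019, 2019, 2019],
    "GSE_GWE": [40.0, 44.0, 110.0],
})

# A list of functions yields two-level (MultiIndex) column labels
nested = (
    readings.groupby(["SITE_CODE", "YEAR"])
    .agg({"GSE_GWE": ["mean"]})
    .reset_index()
)
print(nested.columns.tolist())

# Named aggregation yields flat column labels
flat = readings.groupby(["SITE_CODE", "YEAR"], as_index=False).agg(
    gse_gwe=("GSE_GWE", "mean")
)
print(flat.columns.tolist())
```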

In [None]:
# export all measurements for analysis
spring_groundwater_group.to_csv("/work/assets/clean_data/spring_groundwater_levels_clean.csv", index=False)
