# San Joaquin Valley Township-Range Well Completion Reports Datasets

Related links:
* For the documentation about this dataset, its source, how to download, and the features of interest, please refer to our [Well Completion Reports Dataset](doc/assets/well_completion_reports.md) documentation.
* For an explanation of how the datasets are mapped to Township-Ranges, please refer to our [Public Land Survey System](doc/assets/plss_sanjoaquin_riverbasin.md) documentation.

In [None]:
import sys
sys.path.append('..')

In [None]:
import numpy as np
import altair as alt
from lib.well_completion import WellCompletionReportsDataset
from lib.viz import draw_missing_data_chart, view_trs_side_by_side, display_data_on_map, visualize_seasonality_by_month

In [None]:
alt.data_transformers.disable_max_rows()

DataTransformerRegistry.enable('default')

In [None]:
wcr = WellCompletionReportsDataset()

Loading local datasets. Please wait...
Loading of datasets complete.


Pre-process the well completion report data, geospatial locations and additional elevation data, and combine them into the geospatial map_df dataset. Then overlay the San Joaquin Valley boundaries to keep only the data located in the San Joaquin Valley.

In [None]:
wcr.preprocess_map_df(features_to_keep=["WCRNUMBER", "YEAR", "MONTH", "BOTTOMOFPERFORATEDINTERVAL", "GROUNDSURFACEELEVATION", "STATICWATERLEVEL", "TOPOFPERFORATEDINTERVAL", "TOTALDRILLDEPTH", "TOTALCOMPLETEDDEPTH", "CASINGDIAMETER", "TOTALDRAWDOWN", "WELLYIELD", "USE", "geometry"])
wcr.keep_only_sjv_data()
wcr.map_df

Unnamed: 0,WCRNUMBER,YEAR,MONTH,BOTTOMOFPERFORATEDINTERVAL,GROUNDSURFACEELEVATION,STATICWATERLEVEL,TOPOFPERFORATEDINTERVAL,TOTALDRILLDEPTH,TOTALCOMPLETEDDEPTH,CASINGDIAMETER,TOTALDRAWDOWN,WELLYIELD,USE,geometry
0,WCR2016-015575,2016,1,1176.0,144.75,459.0,975.0,,2200.0,36.0,,330.0,Agriculture,POINT (-119.13278 35.07333)
1,WCR2016-002959,2016,1,1460.0,134.01,410.0,540.0,1509.0,1480.0,,240.00,800.0,Agriculture,POINT (-119.10915 35.07731)
2,WCR2015-002563,2015,7,1180.0,136.03,,440.0,,1200.0,16.0,,,Agriculture,POINT (-119.14529 35.08397)
3,WCR2017-008490,2017,4,160.0,123.70,,430.0,,1015.0,34.0,,,Agriculture,POINT (-119.13750 35.09361)
4,WCR0266449,2014,6,,142.92,351.0,,,2140.0,,,1000.0,Agriculture,POINT (-119.21614 35.08392)
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
14608,WCR2019-000470,2018,9,141.0,80.32,32.0,40.0,141.0,141.0,,0.00,50.0,Agriculture,POINT (-120.97241 38.26662)
14609,WCR2014-004482,2014,6,266.0,77.72,,166.0,,325.0,10.0,,,Agriculture,POINT (-120.93000 38.28917)
14610,WCR2015-012786,2015,2,500.0,98.74,25.0,200.0,,500.0,12.0,,8.0,Domestic,POINT (-120.35110 37.51419)
14611,WCR2020-005670,2020,5,374.0,92.56,40.0,110.0,385.0,379.0,,56.16,30.0,Domestic,POINT (-120.37339 37.51425)


Look at missing data in the dataset

In [None]:
draw_missing_data_chart(wcr.map_df)
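
As a rough numeric complement to the chart, the share of missing values per feature can be computed directly with pandas. This is a minimal sketch on a small toy DataFrame standing in for `wcr.map_df`; the values are illustrative only, and it is not necessarily how `draw_missing_data_chart` works internally.

```python
import numpy as np
import pandas as pd

# Toy stand-in for wcr.map_df (illustrative values, not real data)
toy_df = pd.DataFrame({
    "WCRNUMBER": ["A", "B", "C", "D"],
    "STATICWATERLEVEL": [459.0, np.nan, np.nan, 351.0],
    "WELLYIELD": [330.0, 800.0, np.nan, 1000.0],
})

# Share of missing values per column, from most to least missing
missing_share = toy_df.isna().mean().sort_values(ascending=False)
print(missing_share)
```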

Map of the wells completed in the year 2021 in the San Joaquin Valley

In [None]:
display_data_on_map(wcr.map_df, feature="USE", year=2021, color_scheme="tab10")

Check for seasonality in well completion

In [None]:
visualize_seasonality_by_month(wcr.get_well_count_by_year_month(), feature="WELL_COUNT")
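
The per-month well counts used for the seasonality chart can be approximated with a simple group-by. The sketch below assumes one row per well with `YEAR` and `MONTH` columns, as in the pre-processed data above; the toy rows are made up, and the actual `get_well_count_by_year_month` implementation may differ.

```python
import pandas as pd

# Toy stand-in for the per-well data (illustrative values, not real data)
toy_wells = pd.DataFrame({
    "WCRNUMBER": ["W1", "W2", "W3", "W4", "W5"],
    "YEAR": [2016, 2016, 2016, 2017, 2017],
    "MONTH": [1, 1, 7, 4, 4],
})

# Count the wells completed in every (year, month) pair
well_count_by_month = (
    toy_wells.groupby(["YEAR", "MONTH"])
    .size()
    .reset_index(name="WELL_COUNT")
)
print(well_count_by_month)
```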

Next, for every Township-Range and every year:
1. Compute the average of each selected well feature
2. Count the number of wells, in total and per use category
3. Leave missing averages as NaN (`fill_na_with_zero=False`)

In [None]:
wcr.compute_features_by_township(features_to_average=["BOTTOMOFPERFORATEDINTERVAL", "GROUNDSURFACEELEVATION", "STATICWATERLEVEL", "TOPOFPERFORATEDINTERVAL", "TOTALDRILLDEPTH", "TOTALCOMPLETEDDEPTH", "WELLYIELD"], add_well_count=True, fill_na_with_zero=False)
wcr.map_df

Unnamed: 0,TOWNSHIP_RANGE,YEAR,geometry,BOTTOMOFPERFORATEDINTERVAL_AVG,GROUNDSURFACEELEVATION_AVG,STATICWATERLEVEL_AVG,TOPOFPERFORATEDINTERVAL_AVG,TOTALDRILLDEPTH_AVG,TOTALCOMPLETEDDEPTH_AVG,WELLYIELD_AVG,WELL_COUNT,WELL_COUNT_AGRICULTURE,WELL_COUNT_DOMESTIC,WELL_COUNT_INDUSTRIAL,WELL_COUNT_PUBLIC
0,T01N R02E,2014,"POLYGON ((-121.78743 37.96886, -121.69601 37.9...",197.500000,56.905000,25.00,80.000000,,337.500000,3.000000,2,1,1,0,0
1,T01N R02E,2016,"POLYGON ((-121.78743 37.88191, -121.78743 37.9...",150.000000,20.280000,44.00,130.000000,,150.000000,20.000000,1,0,1,0,0
2,T01N R02E,2017,"POLYGON ((-121.78743 37.96886, -121.69601 37.9...",166.666667,41.086667,31.00,126.666667,210.000000,196.666667,39.333333,3,0,3,0,0
3,T01N R02E,2018,"POLYGON ((-121.78743 37.96886, -121.69601 37.9...",138.000000,69.050000,75.50,118.000000,169.000000,169.000000,4.000000,2,1,1,0,0
4,T01N R02E,2019,"POLYGON ((-121.78743 37.96886, -121.69601 37.9...",360.000000,85.320000,35.00,60.000000,370.000000,370.000000,,2,1,1,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2270,T32S R29E,2017,"POLYGON ((-118.91470 35.09263, -118.91470 35.1...",810.000000,142.850000,447.00,730.000000,,1350.000000,4750.000000,1,1,0,0,0
2271,T32S R29E,2018,"POLYGON ((-118.91470 35.18006, -118.80566 35.1...",1030.000000,140.146667,422.65,525.000000,1108.000000,1096.666667,4750.000000,3,3,0,0,0
2272,T32S R29E,2019,"POLYGON ((-118.91470 35.18006, -118.80566 35.1...",907.500000,122.370000,369.00,565.500000,990.666667,920.000000,1600.000000,3,1,0,0,2
2273,T32S R29E,2020,"POLYGON ((-118.91470 35.09263, -118.91470 35.1...",1090.000000,129.900000,,490.000000,1120.000000,1100.000000,,1,1,0,0,0
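
The aggregation above can be sketched in plain pandas. This is only an illustration under assumptions: the toy rows are invented, each well is assumed to already carry a `TOWNSHIP_RANGE` label, and the real `compute_features_by_township` method may be implemented differently.

```python
import numpy as np
import pandas as pd

# Toy per-well data (illustrative values, not real data)
toy_wells = pd.DataFrame({
    "TOWNSHIP_RANGE": ["T01N R02E", "T01N R02E", "T32S R29E"],
    "YEAR": [2014, 2014, 2017],
    "STATICWATERLEVEL": [20.0, 30.0, np.nan],
    "WELLYIELD": [3.0, np.nan, 4750.0],
    "USE": ["Agriculture", "Domestic", "Agriculture"],
})
keys = ["TOWNSHIP_RANGE", "YEAR"]
features = ["STATICWATERLEVEL", "WELLYIELD"]

# 1. Average each feature per Township-Range and year (NaN values are skipped)
averages = toy_wells.groupby(keys)[features].mean().add_suffix("_AVG")

# 2. Count the wells, in total and per use category
well_count = toy_wells.groupby(keys).size().rename("WELL_COUNT")
use_counts = pd.crosstab(
    [toy_wells["TOWNSHIP_RANGE"], toy_wells["YEAR"]], toy_wells["USE"]
)
use_counts.columns = [f"WELL_COUNT_{use.upper()}" for use in use_counts.columns]

# 3. Combine everything; averages with no observations stay NaN
by_township = pd.concat([averages, well_count, use_counts], axis=1).reset_index()
print(by_township)
```

Because `mean()` skips missing values, an average is NaN only when no well in that Township-Range and year reported the feature.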


Fill in the Township-Ranges that have no data with:
1. a value of 0 for WELL_COUNT and the per-category well counts
2. NaN for the averaged features

In [None]:
wcr.fill_townships_with_no_data(features_to_fill=["WELL_COUNT", "WELL_COUNT_AGRICULTURE", "WELL_COUNT_DOMESTIC", "WELL_COUNT_INDUSTRIAL", "WELL_COUNT_PUBLIC"], feature_value=0)
wcr.fill_townships_with_no_data(features_to_fill=["BOTTOMOFPERFORATEDINTERVAL_AVG", "GROUNDSURFACEELEVATION_AVG", "STATICWATERLEVEL_AVG", "TOPOFPERFORATEDINTERVAL_AVG",
                                                  "TOTALDRILLDEPTH_AVG", "TOTALCOMPLETEDDEPTH_AVG", "WELLYIELD_AVG"], feature_value=np.nan)
wcr.map_df

Unnamed: 0,TOWNSHIP_RANGE,YEAR,geometry,BOTTOMOFPERFORATEDINTERVAL_AVG,GROUNDSURFACEELEVATION_AVG,STATICWATERLEVEL_AVG,TOPOFPERFORATEDINTERVAL_AVG,TOTALDRILLDEPTH_AVG,TOTALCOMPLETEDDEPTH_AVG,WELLYIELD_AVG,WELL_COUNT,WELL_COUNT_AGRICULTURE,WELL_COUNT_DOMESTIC,WELL_COUNT_INDUSTRIAL,WELL_COUNT_PUBLIC
0,T01N R02E,2014,"POLYGON ((-121.78743 37.96886, -121.69601 37.9...",197.500000,56.905000,25.0,80.000000,,337.500000,3.000000,2.0,1.0,1.0,0.0,0.0
1,T01N R02E,2016,"POLYGON ((-121.78743 37.88191, -121.78743 37.9...",150.000000,20.280000,44.0,130.000000,,150.000000,20.000000,1.0,0.0,1.0,0.0,0.0
2,T01N R02E,2017,"POLYGON ((-121.78743 37.96886, -121.69601 37.9...",166.666667,41.086667,31.0,126.666667,210.0,196.666667,39.333333,3.0,0.0,3.0,0.0,0.0
3,T01N R02E,2018,"POLYGON ((-121.78743 37.96886, -121.69601 37.9...",138.000000,69.050000,75.5,118.000000,169.0,169.000000,4.000000,2.0,1.0,1.0,0.0,0.0
4,T01N R02E,2019,"POLYGON ((-121.78743 37.96886, -121.69601 37.9...",360.000000,85.320000,35.0,60.000000,370.0,370.000000,,2.0,1.0,1.0,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
471,T32S R24E,2021,"POLYGON ((-119.44762 35.09378, -119.44762 35.1...",,,,,,,,0.0,0.0,0.0,0.0,0.0
472,T32S R25E,2021,"POLYGON ((-119.34122 35.09371, -119.34122 35.1...",,,,,,,,0.0,0.0,0.0,0.0,0.0
474,T32S R27E,2021,"POLYGON ((-119.12837 35.09439, -119.12837 35.1...",,,,,,,,0.0,0.0,0.0,0.0,0.0
475,T32S R28E,2021,"POLYGON ((-119.02170 35.09292, -119.02170 35.1...",,,,,,,,0.0,0.0,0.0,0.0,0.0
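
One way to picture this filling step is a reindex over every Township-Range and year combination. The sketch below is an assumption-laden illustration: it builds the full grid from the values present in a toy DataFrame, whereas the real `fill_townships_with_no_data` method presumably relies on the complete list of Township-Ranges in the San Joaquin Valley.

```python
import pandas as pd

# Toy aggregated data: T32S R24E has no row for 2021 (illustrative values)
toy = pd.DataFrame({
    "TOWNSHIP_RANGE": ["T01N R02E", "T01N R02E", "T32S R24E"],
    "YEAR": [2020, 2021, 2020],
    "STATICWATERLEVEL_AVG": [25.0, 44.0, 447.0],
    "WELL_COUNT": [2, 1, 1],
})

# Every (Township-Range, year) combination that should exist
full_index = pd.MultiIndex.from_product(
    [toy["TOWNSHIP_RANGE"].unique(), toy["YEAR"].unique()],
    names=["TOWNSHIP_RANGE", "YEAR"],
)

# Reindexing adds the missing combinations with NaN in every column
filled = toy.set_index(["TOWNSHIP_RANGE", "YEAR"]).reindex(full_index).reset_index()

# Counts become 0 for the added rows; averaged features stay NaN
filled["WELL_COUNT"] = filled["WELL_COUNT"].fillna(0)
print(filled)
```

Introducing NaN during the reindex turns the integer counts into floats, which is consistent with the counts appearing as `2.0`, `1.0`, and so on in the output above.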


In [None]:
# We remove the "WELL_COUNT" feature from the dataset, since it is the sum of the per-category well counts (and is thus fully determined by these features)
wcr.prepare_output_from_map_df(unwanted_features=["WELL_COUNT"])
wcr.output_dataset_to_csv(output_filename="../assets/outputs/well_completion_reports.csv")

## Visualizing the Normalized Well Count


Compute the WELL_COUNT normalized within each year using min-max scaling

In [None]:
normalized_df = wcr.return_yearly_normalized_township_feature(feature_name="WELL_COUNT", normalize_method="minmax")
normalized_df

Unnamed: 0,TOWNSHIP_RANGE,YEAR,geometry,BOTTOMOFPERFORATEDINTERVAL_AVG,GROUNDSURFACEELEVATION_AVG,STATICWATERLEVEL_AVG,TOPOFPERFORATEDINTERVAL_AVG,TOTALDRILLDEPTH_AVG,TOTALCOMPLETEDDEPTH_AVG,WELLYIELD_AVG,WELL_COUNT,WELL_COUNT_AGRICULTURE,WELL_COUNT_DOMESTIC,WELL_COUNT_INDUSTRIAL,WELL_COUNT_PUBLIC,WELL_COUNT_NORMALIZED
0,T01N R02E,2014,"POLYGON ((-121.78743 37.96886, -121.69601 37.9...",197.500000,56.905000,25.0,80.000000,,337.500000,3.000000,2.0,1.0,1.0,0.0,0.0,0.037037
1,T01N R02E,2016,"POLYGON ((-121.78743 37.88191, -121.78743 37.9...",150.000000,20.280000,44.0,130.000000,,150.000000,20.000000,1.0,0.0,1.0,0.0,0.0,0.012048
2,T01N R02E,2017,"POLYGON ((-121.78743 37.96886, -121.69601 37.9...",166.666667,41.086667,31.0,126.666667,210.0,196.666667,39.333333,3.0,0.0,3.0,0.0,0.0,0.075000
3,T01N R02E,2018,"POLYGON ((-121.78743 37.96886, -121.69601 37.9...",138.000000,69.050000,75.5,118.000000,169.0,169.000000,4.000000,2.0,1.0,1.0,0.0,0.0,0.071429
4,T01N R02E,2019,"POLYGON ((-121.78743 37.96886, -121.69601 37.9...",360.000000,85.320000,35.0,60.000000,370.0,370.000000,,2.0,1.0,1.0,0.0,0.0,0.048780
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
471,T32S R24E,2021,"POLYGON ((-119.44762 35.09378, -119.44762 35.1...",,,,,,,,0.0,0.0,0.0,0.0,0.0,0.000000
472,T32S R25E,2021,"POLYGON ((-119.34122 35.09371, -119.34122 35.1...",,,,,,,,0.0,0.0,0.0,0.0,0.0,0.000000
474,T32S R27E,2021,"POLYGON ((-119.12837 35.09439, -119.12837 35.1...",,,,,,,,0.0,0.0,0.0,0.0,0.0,0.000000
475,T32S R28E,2021,"POLYGON ((-119.02170 35.09292, -119.02170 35.1...",,,,,,,,0.0,0.0,0.0,0.0,0.0,0.000000
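
Per-year min-max normalization rescales each year's values to the [0, 1] range independently, so that a Township-Range is compared only against the others in the same year. A minimal sketch with toy numbers follows; the helper name `minmax` is hypothetical and the real `return_yearly_normalized_township_feature` method may handle edge cases differently.

```python
import pandas as pd

# Toy well counts for three Township-Ranges over two years (illustrative values)
toy = pd.DataFrame({
    "TOWNSHIP_RANGE": ["A", "B", "C", "A", "B", "C"],
    "YEAR": [2014, 2014, 2014, 2016, 2016, 2016],
    "WELL_COUNT": [2.0, 4.0, 0.0, 1.0, 10.0, 0.0],
})


def minmax(values: pd.Series) -> pd.Series:
    """Rescale values to [0, 1]; hypothetical helper for illustration."""
    span = values.max() - values.min()
    if span == 0:
        # All values identical: avoid dividing by zero
        return values * 0.0
    return (values - values.min()) / span


# Apply the scaling separately within each year
toy["WELL_COUNT_NORMALIZED"] = toy.groupby("YEAR")["WELL_COUNT"].transform(minmax)
print(toy)
```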


### Comparison per year

In [None]:
view_trs_side_by_side(normalized_df, feature="YEAR", value="WELL_COUNT_NORMALIZED", title="Well Counts")

### Map visualization for 2018

In [None]:
display_data_on_map(normalized_df, feature="WELL_COUNT_NORMALIZED", year=2018)

### Map visualization for 2021

In [None]:
display_data_on_map(normalized_df, feature="WELL_COUNT_NORMALIZED", year=2021)
