# Preprocessing of Climate and Environment Data
This notebook loads raster data for climate and environmental variables important to crop production. The data are processed for an area of interest, resampled to a common spatial grid, and exported as a CSV file.

In [1]:
from process_functions import *  # project helpers: reproject_file, crop_to_shape, rescale_map
import numpy as np
import pandas as pd
import rasterio

**Description of Raw Data**
1. Climate data are obtained from the PRISM Climate Group. \
    a. 30-year normals for the period 1991-2020 \
    b. 800 m resolution \
    c. Variables: mean annual temperature, mean total annual precipitation, and mean elevation (used as the common grid reference) \
    d. Data and metadata are available at https://prism.oregonstate.edu/normals/ 
2. Soil data are obtained from the UC Davis compilation of NRCS SSURGO data. \
    a. SSURGO http://www.nrcs.usda.gov/wps/portal/nrcs/detail/soils/survey/?cid=nrcs142p2_053627 \
    b. 800 m resolution \
    c. Variables: soil texture for 0-25 cm and 25-50 cm, plant available water content for 0-50 cm, and organic matter in kg/m² \
    d. Data and metadata are available at https://casoilresource.lawr.ucdavis.edu/soil-properties/download.php
3. Climate data for reference evapotranspiration (ET0) are obtained from TerraClimate. \
    a. ET0 is provided as a 30-year monthly summary for the period 1991-2020 as a compressed netCDF file \
    b. ~4 km resolution \
    c. ET0 is a derived data set and is available at https://www.climatologylab.org/terraclimate.html 
 
    
**Description of Processed Data**
1. Read each unique raster file.
2. Reproject to a common CRS (EPSG:4326).
3. Clip to the outermost bounds of Kansas.
4. Rescale to a common grid (the PRISM 800 m elevation grid).
5. Calculate values for depth and crop layer.
6. Export the arrays as a CSV.

**Read each unique file name and generate lists of file names**

In [2]:
# Soil Files 
paws_050 = "C:/Users/sarahann.USERS/Desktop/code/ks_agro_climate/paws_050.tif"
om_kg_sq_m = "C:/Users/sarahann.USERS/Desktop/code/ks_agro_climate/om_kg_sq_m.tif"
depth = "C:/Users/sarahann.USERS/Desktop/code/ks_agro_climate/soil_depth.tif"
depth_restriction = "C:/Users/sarahann.USERS/Desktop/code/ks_agro_climate/resdept.tif"

# TerraClimate file
pet = "C:/Users/sarahann.USERS/Desktop/code/ks_agro_climate/pet_ks.tif"  # included in the soil file list because it also needs to be reprojected to the common CRS

soil_files = [paws_050, om_kg_sq_m, depth, depth_restriction, pet]
soil_files_str = ['paws_050', 'om_kg_sq_m', 'depth', 'dep_res', 'pet']

# Climate Files
precip = "C:/Users/sarahann.USERS/Desktop/code/ks_agro_climate/PRISM_ppt_30yr_normal_800mM3_annual_bil.bil"
temp = "C:/Users/sarahann.USERS/Desktop/code/ks_agro_climate/PRISM_tmean_30yr_normal_800mM3_annual_bil.bil"
elv = "C:/Users/sarahann.USERS/Desktop/code/ks_agro_climate/PRISM_us_dem_800m_bil.bil"

prism_files = [precip, temp, elv]
prism_files_str = ['precip', 'temp', 'elv']

In [3]:
# Reproject files that are not already in EPSG:4326 and update the file list to the new names

new_crs = 'EPSG:4326'

for path, name in zip(soil_files, soil_files_str):
    reproject_file(path, name, new_crs)

# Update file names
soil_files = ['C:/Users/sarahann.USERS/Desktop/code/ks_agro_climate/' + name + '_wgs84.tif' for name in soil_files_str]
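
The file list is rebuilt by string concatenation after each stage of this notebook (`_wgs84`, then `_wgs84_ks`, then `_wgs84_ks_scaled`). As a minimal sketch, the same naming could be wrapped in one helper. `staged_path` and `DATA_DIR` are hypothetical names, not part of `process_functions`, and the sketch assumes the helper functions keep writing their outputs with these suffixes.

```python
from pathlib import PurePosixPath

# Assumption: directory copied from the paths used in this notebook.
DATA_DIR = PurePosixPath("C:/Users/sarahann.USERS/Desktop/code/ks_agro_climate")

def staged_path(name, suffix, data_dir=DATA_DIR):
    """Build the output path for `name` after a processing stage (hypothetical helper)."""
    return str(data_dir / f"{name}{suffix}.tif")

soil_files_str = ['paws_050', 'om_kg_sq_m', 'depth', 'dep_res', 'pet']

# Equivalent to the list comprehension used after reprojection.
soil_files = [staged_path(name, "_wgs84") for name in soil_files_str]
print(soil_files[0])
```

Changing the suffix argument to `"_wgs84_ks"` or `"_wgs84_ks_scaled"` would cover the later stages without repeating the directory string.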

In [4]:
# Crop to the outermost bounds of Kansas

# Soil data
for path, name in zip(soil_files, soil_files_str):
    crop_to_shape(path, name)

soil_files = ['C:/Users/sarahann.USERS/Desktop/code/ks_agro_climate/' + name + '_wgs84_ks.tif' for name in soil_files_str]  # Update file names

# PRISM data
for path, name in zip(prism_files, prism_files_str):
    crop_to_shape(path, name)

prism_files = ['C:/Users/sarahann.USERS/Desktop/code/ks_agro_climate/' + name + '_wgs84_ks.tif' for name in prism_files_str]
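
Cropping to the outermost bounds amounts to selecting the block of rows and columns that covers a bounding box. How `crop_to_shape` does this is not shown in this notebook; the sketch below only illustrates the arithmetic for a north-up grid. `bounds_to_window` is a hypothetical helper, and the toy grid and bounding box are made-up values, not the real Kansas extent.

```python
import math
import numpy as np

def bounds_to_window(west, south, east, north, origin_x, origin_y, pixel_size):
    """Return (row_start, row_stop, col_start, col_stop) covering the bounding box.

    Assumes a north-up grid whose top-left corner is (origin_x, origin_y)
    and whose pixels are square with side `pixel_size`.
    """
    col_start = math.floor((west - origin_x) / pixel_size)
    col_stop = math.ceil((east - origin_x) / pixel_size)
    row_start = math.floor((origin_y - north) / pixel_size)
    row_stop = math.ceil((origin_y - south) / pixel_size)
    return row_start, row_stop, col_start, col_stop

# Toy grid: 8 rows x 12 columns, top-left corner at (-104, 42), 1-degree pixels.
grid = np.arange(8 * 12, dtype=float).reshape(8, 12)

r0, r1, c0, c1 = bounds_to_window(-102, 37, -95, 40,
                                  origin_x=-104, origin_y=42, pixel_size=1.0)
clipped = grid[r0:r1, c0:c1]
print(clipped.shape)  # (3, 7)
```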

In [5]:
# Use one map as the template for the other maps.
# PRISM elevation is a standard 800 m grid product.
with rasterio.open('C:/Users/sarahann.USERS/Desktop/code/ks_agro_climate/elv_wgs84_ks.tif') as src:
    template = src.read(1)  # the context manager closes the file after reading
print(template.shape)
new_width = template.shape[1]
new_height = template.shape[0]

(362, 896)


In [6]:
# Rescale to a common spatial grid based on the elevation grid

# Soil data
for path, name in zip(soil_files, soil_files_str):
    rescale_map(path, name, new_width, new_height, 'cubic')

soil_files = ['C:/Users/sarahann.USERS/Desktop/code/ks_agro_climate/' + name + '_wgs84_ks_scaled.tif' for name in soil_files_str]  # Update file names

# PRISM data
for path, name in zip(prism_files, prism_files_str):
    rescale_map(path, name, new_width, new_height, 'cubic')

prism_files = ['C:/Users/sarahann.USERS/Desktop/code/ks_agro_climate/' + name + '_wgs84_ks_scaled.tif' for name in prism_files_str]
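
Rescaling maps every cell of the target grid back to a location in the source grid. The notebook passes `'cubic'` to `rescale_map`; the sketch below swaps in nearest-neighbour lookup instead, purely to show the index mapping in plain NumPy. `resample_nearest` is a hypothetical name and is not how `rescale_map` is implemented.

```python
import numpy as np

def resample_nearest(arr, new_height, new_width):
    """Resample a 2-D array to (new_height, new_width) by nearest-neighbour lookup."""
    old_height, old_width = arr.shape
    # Map the centre of each target cell to a source row/column index.
    rows = ((np.arange(new_height) + 0.5) * old_height / new_height).astype(int)
    cols = ((np.arange(new_width) + 0.5) * old_width / new_width).astype(int)
    # Guard against rounding past the last source index.
    rows = np.minimum(rows, old_height - 1)
    cols = np.minimum(cols, old_width - 1)
    return arr[np.ix_(rows, cols)]

coarse = np.array([[1.0, 2.0],
                   [3.0, 4.0]])
fine = resample_nearest(coarse, 4, 4)
print(fine)
```

Each coarse cell is repeated as a 2 x 2 block in `fine`. Cubic resampling would instead blend neighbouring cells, which gives smoother fields but can also produce values that were not in the source.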

In [7]:
# Elevation was only used as a template and is not retained for further processing
prism_files.remove('C:/Users/sarahann.USERS/Desktop/code/ks_agro_climate/elv_wgs84_ks_scaled.tif')
prism_files_str.remove('elv')

In [8]:
# Generate a single string and file name list 
strings = soil_files_str + prism_files_str 
files = soil_files + prism_files 

In [9]:
# Open each file and read band 1 into a dictionary keyed by variable name
variables = {}

for s, f in zip(strings, files):
    with rasterio.open(f) as src:  # close each file after reading
        variables[s] = src.read(1)

**Data and Variable Cleaning**
1. Generate a depth variable that is the smallest of the restricted depth, the actual depth, and 150 cm
2. Generate a crop layer variable and a non-crop layer variable

In [10]:
# np.where(condition, x, y) returns x where the condition holds and y elsewhere.
# Negative values (such as the -9999 nodata value) are treated as missing.
variables['dep_res'] = np.where(variables['dep_res'] < 0, np.nan, variables['dep_res'])
variables['depth'] = np.where(variables['depth'] < 0, np.nan, variables['depth'])

# Take the shallower of restricted depth and soil depth, then cap at 150 cm (max effective root zone).
variables['depth_adj'] = np.where(variables['dep_res'] < variables['depth'], variables['dep_res'], variables['depth'])
variables['depth_adj'] = np.where(variables['depth_adj'] > 150, 150, variables['depth_adj'])

variables.pop('depth')  # remove from the dictionary
variables.pop('dep_res')  # remove from the dictionary

array([[      nan,       nan,       nan, ...,       nan,       nan,
              nan],
       [      nan, 35.64532 , 47.12787 , ...,       nan,       nan,
              nan],
       [      nan, 35.64532 , 57.482594, ...,       nan,       nan,
              nan],
       ...,
       [      nan,       nan,       nan, ...,       nan,       nan,
              nan],
       [      nan,       nan,       nan, ...,       nan,       nan,
              nan],
       [      nan,       nan,       nan, ...,       nan,       nan,
              nan]], dtype=float32)
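
To make the NaN behaviour of the depth adjustment concrete, here is the same sequence of `np.where` calls on a toy four-cell array (the values are invented for illustration). Because any comparison with NaN is false, a missing restricted depth falls back to the soil depth, while a missing soil depth stays missing.

```python
import numpy as np

# Toy inputs in cm; -9999 marks nodata.
dep_res = np.array([-9999.0, 40.0, 200.0, 60.0])
depth = np.array([100.0, 80.0, 180.0, -9999.0])

# Negative values become NaN.
dep_res = np.where(dep_res < 0, np.nan, dep_res)
depth = np.where(depth < 0, np.nan, depth)

# Shallower of the two depths, then cap at 150 cm.
depth_adj = np.where(dep_res < depth, dep_res, depth)
depth_adj = np.where(depth_adj > 150, 150, depth_adj)

print(depth_adj)  # [100.  40. 150.  nan]
```

If a missing value on either side should fall back to the other layer, `np.fmin(dep_res, depth)` returns the non-missing value in both cases.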

In [11]:
# Set nodata values (-9999) to NaN. The mask covers every value <= 0, so zeros are set to NaN as well.
for k in variables:
    mask = (variables[k] <= 0)
    variables[k][mask] = np.nan
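
The `<= 0` mask removes the nodata value but also any true zero or negative value. If zeros should be kept, one option is to mask only the nodata value itself. The sketch below assumes -9999 is the nodata value for every layer (the notebook mentions it only in a comment) and uses a hypothetical helper, `mask_nodata`. It casts to float first because integer arrays cannot hold NaN.

```python
import numpy as np

def mask_nodata(arr, nodata=-9999):
    """Return a float copy of `arr` with cells equal to `nodata` set to NaN."""
    out = arr.astype('float32')  # astype returns a new array; the input is untouched
    out[out == nodata] = np.nan
    return out

layer = np.array([-9999, 0, 12], dtype='int16')
masked = mask_nodata(layer)
print(masked)  # zero is kept, only -9999 becomes NaN
```

Exact matching assumes the nodata value survives the earlier steps unchanged. Cells interpolated near nodata regions may no longer equal -9999, and the broader `<= 0` mask does catch those.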

**Export Data as a CSV**

In [12]:
# One column per variable; each array is flattened row by row
df = pd.DataFrame({k: v.ravel() for k, v in variables.items()})
df.to_csv('variables.csv')
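
Flattening walks each array row by row, so a CSV row index can be converted back to a grid position. A short sketch, assuming the (362, 896) template shape printed earlier and NumPy's default row-major order:

```python
import numpy as np

height, width = 362, 896  # template shape from the elevation grid

# Flat index of every cell, in the same order the flattened arrays use.
flat_index = np.arange(height * width)

# Recover the grid row and column for each flat index.
rows, cols = np.unravel_index(flat_index, (height, width))

print(rows[896], cols[896])  # 1 0 : the first cell of the second grid row
```

Adding `rows` and `cols` as columns before the export would let the CSV be reshaped back into maps later.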