<span style="color: purple">

Load in stored variables:

</span>

In [1]:
%store -r siskiyou_forest_gdf padres_forest_gdf
%store -r siskiyou_soil_ph_da padres_soil_ph_da
%store -r siskiyou_srtm_da padres_srtm_da
%store -r ave_annual_pr_das_list

<span style="color: purple">

Import packages:

</span>

In [2]:
# Import necessary packages
import matplotlib.pyplot as plt # Overlay pandas and xarray plots
import rioxarray as rxr # Work with raster data (adds the .rio accessor)
from tqdm.notebook import tqdm # Progress bars on loops
import xarray as xr # Work with DataArrays

## STEP 3: HARMONIZE DATA

<link rel="stylesheet" type="text/css" href="./assets/styles.css"><div class="callout callout-style-default callout-titled callout-task"><div class="callout-header"><div class="callout-icon-container"><i class="callout-icon"></i></div><div class="callout-title-container flex-fill">Try It</div></div><div class="callout-body-container callout-body"><p>Make sure that the grids for all your data match each other. Check
out the <a
href="https://corteva.github.io/rioxarray/stable/examples/reproject_match.html#Reproject-Match"><code>ds.rio.reproject_match()</code>
method</a> from <code>rioxarray</code>. Make sure to use the data source
that has the highest resolution as a template!</p></div></div>

> **Warning**
>
> If you are reprojecting data as you need to here, the order of
> operations is important! Recall that reprojecting will typically tilt
> your data, leaving narrow sections of the data at the edge blank.
> However, to reproject efficiently it is best for the raster to be as
> small as possible before performing the operation. We recommend the
> following process:
>
> 1. Crop the data, leaving a buffer around the final boundary
> 2. Reproject to match the template grid (this will also crop any leftovers off the image)

In [3]:
# add names to soil and elevation DataArrays
siskiyou_soil_ph_da.name = 'Siskiyou Soil pH'
siskiyou_srtm_da.name = 'Siskiyou Elevation (m)'
padres_soil_ph_da.name = 'Los Padres Soil pH'
padres_srtm_da.name = 'Los Padres Elevation (m)'

In [4]:
# create a list of all DataArrays
das_list = [
    siskiyou_soil_ph_da,
    siskiyou_srtm_da,
    padres_soil_ph_da,
    padres_srtm_da,
    # the first 16 average annual precipitation DataArrays
    *ave_annual_pr_das_list[:16]
    ]

In [5]:
# Define siskiyou bounds
siskiyou_bounds = tuple(siskiyou_forest_gdf
                        .total_bounds)

# Add buffer to siskiyou bounds
# (buffer is in degrees, since the bounds are longitude/latitude)
buffer = .025
(siskiyou_xmin, siskiyou_ymin,
 siskiyou_xmax, siskiyou_ymax) = siskiyou_bounds
siskiyou_bounds_buffer = (siskiyou_xmin-buffer,
                          siskiyou_ymin-buffer,
                          siskiyou_xmax+buffer,
                          siskiyou_ymax+buffer)

# Define padres bounds
padres_bounds = tuple(padres_forest_gdf
                      .total_bounds)

# Add buffer to padres bounds
(padres_xmin, padres_ymin,
 padres_xmax, padres_ymax) = padres_bounds
padres_bounds_buffer = (padres_xmin-buffer,
                        padres_ymin-buffer,
                        padres_xmax+buffer,
                        padres_ymax+buffer)
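
The two buffer calculations above follow the same pattern, so they could be folded into a small helper. The sketch below is one way to do that. `buffer_bounds` is a hypothetical name (it is not part of geopandas or rioxarray), and it assumes the bounds are ordered `(xmin, ymin, xmax, ymax)`, as returned by `total_bounds`, in the same units as the buffer.

```python
def buffer_bounds(bounds, buffer):
    """Expand (xmin, ymin, xmax, ymax) outward by `buffer` on every side.

    Hypothetical helper for illustration; `buffer` must be in the same
    units as `bounds` (degrees for longitude/latitude data).
    """
    xmin, ymin, xmax, ymax = bounds
    return (xmin - buffer, ymin - buffer, xmax + buffer, ymax + buffer)


# Example with made-up bounds (not the real forest boundaries)
example_bounds = (-124.0, 42.0, -123.0, 43.0)
example_buffered = buffer_bounds(example_bounds, 0.025)
print(example_buffered)
```

In this notebook the call would look like `buffer_bounds(siskiyou_forest_gdf.total_bounds, buffer)`, and likewise for `padres_forest_gdf`.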

<span style="color: purple">
All DataArrays will be cropped to the appropriate national forest boundary, plus a small buffer.

All DataArrays will be reprojected to match the grid of the soil pH DataArray for the same forest, since the soil data has the highest resolution (30 m). The SRTM DataArrays also have a 30 m resolution, but I will reproject them as well so that every layer shares exactly the same grid.

Resolutions:

* Soil data - 30 m resolution

* SRTM data - 30 m resolution

* MACA Climate data - either [4 or 6 km](https://climate.northwestknowledge.net/MACA/gallery_info.php)

### Crop and Reproject SRTM and MACA Climate DataArrays to the soil pH DataArrays
</span>
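
To compare the resolutions listed above on a common scale, it can help to convert grid spacing in degrees to an approximate ground distance. The sketch below is a rough approximation only. It assumes about 111.32 km per degree of latitude, that the 30 m grids use 1 arc-second (1/3600 degree) pixels, and that the MACA grids use 1/24 or 1/16 degree pixels. Those pixel sizes are assumptions, not values read from the DataArrays.

```python
# Approximate meters per degree of latitude (assumed constant; it varies slightly)
METERS_PER_DEGREE_LAT = 111_320


def degrees_to_meters(pixel_size_deg):
    """Approximate north-south ground size of a pixel given its size in degrees."""
    return pixel_size_deg * METERS_PER_DEGREE_LAT


# Assumed pixel sizes in degrees (not read from the DataArrays)
assumed_pixel_sizes = {
    'Soil pH / SRTM (1 arc-second)': 1 / 3600,
    'MACA (1/24 degree)': 1 / 24,
    'MACA (1/16 degree)': 1 / 16,
}

approx_meters = {
    name: degrees_to_meters(size) for name, size in assumed_pixel_sizes.items()
}

for name, meters in approx_meters.items():
    print(f'{name}: about {meters:,.0f} m')

# The template should be the grid with the smallest pixels
finest = min(approx_meters, key=approx_meters.get)
print(f'Finest grid: {finest}')
```

On the real data, `da.rio.resolution()` reports the pixel size in the units of the DataArray's CRS, which is the value to compare when choosing a template.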

In [6]:
# see what bounds of some of the DataArrays are before reprojecting and matching
print(siskiyou_soil_ph_da.rio.bounds())
print(siskiyou_srtm_da.rio.bounds())
print(ave_annual_pr_das_list[0].rio.bounds())

print(padres_soil_ph_da.rio.bounds())
print(padres_srtm_da.rio.bounds())
print(ave_annual_pr_das_list[8].rio.bounds())

(-124.41638888889439, 41.88055555556612, -123.30833333334934, 42.886666666667736)
(-124.44152777777778, 41.855138888888874, -123.28347222222224, 42.91152777777778)
(-124.4180042560284, 41.8753080368042, -123.29300953791692, 42.916958808898926)
(-121.8491666666681, 34.39138888891351, -118.74250000003074, 36.404166666672296)
(-121.87430555555555, 34.36625, -118.71763888888896, 36.42930555555556)
(-121.87637807210287, 34.3754301071167, -118.70975779215495, 36.41706562042236)


In [6]:
# empty list for reprojected DataArrays
reproj_das_list = []

# for each DataArray in das_list,
for da in tqdm(das_list):
    # if 'Siskiyou' is in the name of the DataArray
    if 'Siskiyou' in da.name:
        # crop the da to the buffered Siskiyou bounds
        s_cropped_da = da.rio.clip_box(*siskiyou_bounds_buffer)
        # reproject and match the cropped DataArray to the siskiyou_soil_ph_da
        s_reproj_da = s_cropped_da.rio.reproject_match(siskiyou_soil_ph_da)
        # add the reprojected DataArray to reproj_das_list
        reproj_das_list.append(s_reproj_da)
    # if 'Padres' is in the name of the DataArray
    elif 'Padres' in da.name:
        # crop the da to the buffered Los Padres bounds
        p_cropped_da = da.rio.clip_box(*padres_bounds_buffer)
        # reproject and match the cropped DataArray to the padres_soil_ph_da
        p_reproj_da = p_cropped_da.rio.reproject_match(padres_soil_ph_da)
        # add the reprojected DataArray to reproj_das_list
        reproj_das_list.append(p_reproj_da)

# check reproj_das_list
# should have 20 cropped & reprojected DataArrays
reproj_das_list

  0%|          | 0/20 [00:00<?, ?it/s]

[<xarray.DataArray 'Siskiyou Soil pH' (y: 3622, x: 3989)> Size: 58MB
 array([[4.900687 , 4.9083242, 4.9083242, ..., 6.064574 , 6.0503283,
               nan],
        [4.8496294, 5.064386 , 5.0651016, ..., 6.1050434, 5.9562616,
               nan],
        [4.8344727, 4.8875594, 4.83582  , ..., 5.9791656, 6.0053306,
               nan],
        ...,
        [      nan,       nan,       nan, ..., 5.9569807, 5.984298 ,
               nan],
        [      nan,       nan,       nan, ..., 5.9737034, 6.02246  ,
               nan],
        [      nan,       nan,       nan, ..., 5.9786186, 5.9872603,
               nan]], dtype=float32)
 Coordinates:
     band         int64 8B 1
     spatial_ref  int64 8B 0
   * x            (x) float64 32kB -124.4 -124.4 -124.4 ... -123.3 -123.3 -123.3
   * y            (y) float64 29kB 42.89 42.89 42.89 42.89 ... 41.88 41.88 41.88
 Attributes:
     AREA_OR_POINT:  Area
     _FillValue:     nan,
 <xarray.DataArray 'Siskiyou Elevation (m)' (y: 3622, x: 3989)>

In [8]:
# check bounds of cropped & reprojected siskiyou DataArrays

# original soil
print(siskiyou_soil_ph_da.rio.bounds())
# reproj siskiyou soil
print(reproj_das_list[0].rio.bounds())
# reproj siskiyou elev
print(reproj_das_list[1].rio.bounds())
# reproj siskiyou precip CanESM2 2050
print(reproj_das_list[4].rio.bounds())

# check bounds of cropped & reprojected padres DataArrays
# original soil
print(padres_soil_ph_da.rio.bounds())
# reproj padres soil
print(reproj_das_list[2].rio.bounds())
# reproj padres elev
print(reproj_das_list[3].rio.bounds())
# reproj padres precip CanESM2 2050
print(reproj_das_list[12].rio.bounds())

(-124.41638888889439, 41.88055555556612, -123.30833333334934, 42.886666666667736)
(-124.41638888889439, 41.88055555556612, -123.30833333334934, 42.886666666667736)
(-124.41638888889439, 41.88055555556612, -123.30833333334934, 42.886666666667736)
(-124.41638888889439, 41.88055555556612, -123.30833333334934, 42.886666666667736)
(-121.8491666666681, 34.39138888891351, -118.74250000003074, 36.404166666672296)
(-121.8491666666681, 34.39138888891351, -118.74250000003074, 36.404166666672296)
(-121.8491666666681, 34.39138888891351, -118.74250000003074, 36.404166666672296)
(-121.8491666666681, 34.39138888891351, -118.74250000003074, 36.404166666672296)
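
Rather than comparing the printed bounds by eye, the check can be automated. The sketch below is a minimal example using plain tuples copied from the outputs above. `bounds_match` is a hypothetical helper, and the tolerance is an arbitrary choice.

```python
import math


def bounds_match(reference, others, tol=1e-9):
    """Return True if every bounds tuple in `others` equals `reference` within `tol`."""
    return all(
        math.isclose(ref_value, value, abs_tol=tol)
        for other in others
        for ref_value, value in zip(reference, other)
    )


# Template (soil pH) bounds copied from the printed output above
siskiyou_template_bounds = (-124.41638888889439, 41.88055555556612,
                            -123.30833333334934, 42.886666666667736)
# The three reprojected Siskiyou bounds printed above are identical to the template
siskiyou_reproj_bounds = [siskiyou_template_bounds] * 3

# Original (pre-reprojection) SRTM bounds, which should not match the template
siskiyou_srtm_original = (-124.44152777777778, 41.855138888888874,
                          -123.28347222222224, 42.91152777777778)

reproj_ok = bounds_match(siskiyou_template_bounds, siskiyou_reproj_bounds)
original_ok = bounds_match(siskiyou_template_bounds, [siskiyou_srtm_original])
print(reproj_ok, original_ok)
```

With the real data, the same check could be run as `bounds_match(siskiyou_soil_ph_da.rio.bounds(), [da.rio.bounds() for da in reproj_das_list if 'Siskiyou' in da.name])`. Matching bounds alone do not guarantee matching shapes, so comparing `da.shape` as well is a reasonable extra check.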


<span style='color:purple'>

Check Plots:

*The plotting code below is currently commented out to save time; each of the Padres plots takes about 3 minutes.*

</span>

In [9]:
# # Plot reprojected Siskiyou Elevation
# reproj_das_list[1].plot()

# # Plot Siskiyou National Forest boundary on reproj_das_list[1] plot
# siskiyou_forest_gdf.boundary.plot(ax = plt.gca(), color='black')

# plt.title('Siskiyou National Forest & Surrounding Area Elevation Reprojected & '
#           'Matched to the Siskiyou Soil pH DataArray')
# plt.show()

In [10]:
# # plot reprojected Siskiyou Average Annual Precipitation (mm), 2036-2065, CanESM2
# reproj_das_list[4].plot()

# # Plot Siskiyou National Forest boundary on reproj_das_list[4] plot
# siskiyou_forest_gdf.boundary.plot(ax = plt.gca(), color='black')

# plt.title('Siskiyou National Forest & Surrounding Area Climate Reprojected & '
#           'Matched to the Siskiyou Soil pH DataArray')
# plt.show()

In [11]:
# # Plot reprojected Padres Elevation
# reproj_das_list[3].plot()

# # Plot Padres National Forest boundary on reproj_das_list[3] plot
# padres_forest_gdf.boundary.plot(ax = plt.gca(), color='black')

# plt.title('Los Padres National Forest & Surrounding Area Elevation Reprojected & '
#           'Matched to the Los Padres Soil pH DataArray')
# plt.show()

In [12]:
# # plot reprojected Padres Average Annual Precipitation (mm), 2036-2065, CanESM2
# reproj_das_list[12].plot()

# # Plot Padres National Forest boundary on reproj_das_list[12] plot
# padres_forest_gdf.boundary.plot(ax = plt.gca(), color='black')

# plt.title('Los Padres National Forest & Surrounding Area Climate Reprojected & '
#           'Matched to the Los Padres Soil pH DataArray')
# plt.show()

<span style='color: purple'>

Store reprojected DataArrays:

*I've gotten a MemoryError a couple of times while running this last step, even after restarting the kernel. I'm not sure why it works sometimes but not others.*

</span>

In [7]:
%store reproj_das_list

Stored 'reproj_das_list' (list)
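
One possible explanation for the intermittent MemoryError is the sheer size of the list. The sketch below estimates its in-memory footprint from the template bounds printed earlier. It assumes 1 arc-second pixels (consistent with the 3622 x 3989 Siskiyou shape shown above), float32 values in every DataArray, and 10 DataArrays per forest. Treat the result as a rough estimate, not a measurement.

```python
PIXEL_SIZE_DEG = 1 / 3600   # assumed 1 arc-second pixels
BYTES_PER_VALUE = 4         # assumed float32 for every DataArray
ARRAYS_PER_FOREST = 10      # soil pH + elevation + 8 precipitation layers (assumed)

# Template (soil pH) bounds copied from the printed output earlier
template_bounds = {
    'Siskiyou': (-124.41638888889439, 41.88055555556612,
                 -123.30833333334934, 42.886666666667736),
    'Los Padres': (-121.8491666666681, 34.39138888891351,
                   -118.74250000003074, 36.404166666672296),
}


def grid_shape(bounds, pixel_size):
    """Return (rows, columns) of a grid covering `bounds` at `pixel_size`."""
    xmin, ymin, xmax, ymax = bounds
    n_rows = round((ymax - ymin) / pixel_size)
    n_cols = round((xmax - xmin) / pixel_size)
    return n_rows, n_cols


total_bytes = 0
shapes = {}
for forest, bounds in template_bounds.items():
    n_rows, n_cols = grid_shape(bounds, PIXEL_SIZE_DEG)
    shapes[forest] = (n_rows, n_cols)
    array_bytes = n_rows * n_cols * BYTES_PER_VALUE
    total_bytes += array_bytes * ARRAYS_PER_FOREST
    print(f'{forest}: {n_rows} x {n_cols}, '
          f'about {array_bytes / 1e6:.0f} MB per array')

print(f'Estimated total: about {total_bytes / 1e9:.1f} GB')
```

Under these assumptions the list holds several gigabytes of pixel values, all of which `%store` has to serialize, so a failure when available memory is tight would not be surprising.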
