## Preparing variables

### Prepare temporal statistics (trend, mean, standard deviation) at the CO2 data pixel resolution for several variables, including:
- Climate: temperature, precipitation
- Greening: ndvi, evi, fpar, lai
- Land cover: individual types, total forest, shrub+savanna, grassland, cropland, grassland+cropland+CropNatMosiac

- Load geometries, co2flux amp trends, and LCC data
- For a chosen ROI and a given LC (or LC group), calculate the temporal LC percent cover trend
- Correlate 

### Notes

#### Applying the same CRS to all data
All rasters should use the same CRS, here the common WGS84 (EPSG:4326). If the data are already in WGS84 but the CRS property is not set, assign it with:  
`mydata.rio.write_crs("epsg:4326", inplace=True)`  
Otherwise, reproject the data to WGS84 (this returns a new object, so assign the result):  
`mydata = mydata.rio.reproject("EPSG:4326")`  
Additional datasets can then be matched to the grid of the first using:  
`mydata2 = mydata2.rio.reproject_match(mydata)`  

#### Checking whether and how the missing-data value is set
`mydata.rio.nodata` or `mydata.rio.encoded_nodata` will show the fill value if it is set. After loading with `masked=True`, the original fill value is found in `encoded_nodata`.  
`mydata.rio.set_nodata(-9999, inplace=True)` sets the nodata attribute without modifying the data.  
`mydata.rio.write_nodata(-9999, inplace=True)` writes the nodata value to the metadata (the `_FillValue` attribute, or the encoding when `encoded=True`). It does not replace values in the array itself.  

Note that the reproject_match method from above will modify the nodata value of mydata2 to match that of mydata.  

Use the following to mask the missing data:  
```
nodata = raster.rio.nodata
raster = raster.where(raster != nodata)
raster.rio.write_nodata(nodata, encoded=True, inplace=True)
```
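To illustrate why the masking step matters, here is a minimal sketch on a plain NumPy array rather than a rioxarray object. The fill value `-9999` and the tiny 2x2 array are made up for the example; `np.where` stands in for the `raster.where(...)` call above.

```python
import numpy as np

nodata = -9999.0  # assumed fill value, for illustration only
raster = np.array([[1.0, 2.0],
                   [nodata, 4.0]])

# Without masking, the fill value is treated as a real measurement
mean_unmasked = raster.mean()

# Same idea as raster.where(raster != nodata): keep valid cells, NaN elsewhere
masked = np.where(raster != nodata, raster, np.nan)

# NaN-aware statistics then ignore the missing cell
mean_masked = np.nanmean(masked)

print(mean_unmasked, mean_masked)
```

Any statistic computed before masking (mean, trend, standard deviation) is contaminated by the fill value, so masking should come before the temporal statistics are calculated.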

In [8]:
# Import packages
# See: https://www.earthdatascience.org/courses/use-data-open-source-python/hierarchical-data-formats-hdf/open-MODIS-hdf4-files-python/
import os
import warnings
import numpy.ma as ma
from shapely.geometry import mapping, box
import geopandas as gpd
import earthpy as et
import earthpy.spatial as es
import earthpy.plot as ep
from rasterio.crs import CRS
import rasterio

import xarray as xr
from osgeo import gdal
import rioxarray as rio
import pandas as pd
import numpy as np
import cartopy.crs as ccrs
import matplotlib.pyplot as plt
import folium

In [9]:
# Load the co2 data ----
file_co2amp_trend = '../data/co2invSeasAmpTrend.nc'
co2amp = rio.open_rasterio(file_co2amp_trend)
co2amp.rio.write_crs(4326, inplace=True)

### Preparing LC data

In [10]:

# Index for MCD12C1 Land_Cover_Type_1_Percent: IGBP land cover types
lcIndex = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]
lcNames = ['Water', 'ENForest', 'EBForest', 'DNForest', 'DBForest', 
    'MixForest', 'ClosedShrub', 'OpenShrub', 'WoodySavanna',
    'Savanna', 'Grassland', 'PermWetland', 'Cropland',
    'Urban', 'CropNatMosiac', 'PermSnowIce', 'Barren']

In [70]:
# Load the low resolution LC data

# Path to files
path = '/Users/moyanofe/BigData/GeoSpatial/LandCover/LandCover_MODIS_MCD12/MCD12C1_proc'
file_in = 'MCD12C1.A2001-2021.061.LCtype1.All.lr.nc'
filepath_in = os.path.join(path, file_in)
ds_lc = rio.open_rasterio(filepath_in, masked=True)

### Calculating trends

The `linear_trend` function from `xarrayutils.utils` returns an xarray dataset with variables including the slope, intercept, and p-value.
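As a rough picture of what a per-pixel linear trend computes, the sketch below fits a least-squares line through time for every pixel of a `(time, y, x)` NumPy array. It is not the `xarrayutils` implementation: `pixel_trend` and `cube` are hypothetical names, and the regression against the time-step index (0, 1, 2, ...) is an assumption, under which annual data gives a slope in units per year.

```python
import numpy as np

def pixel_trend(cube):
    """Least-squares slope and intercept per pixel for a (time, y, x) array.

    Hypothetical helper for illustration; assumes evenly spaced time steps
    and no missing values.
    """
    nt = cube.shape[0]
    t = np.arange(nt, dtype=float)       # time-step index: 0, 1, 2, ...
    flat = cube.reshape(nt, -1)          # one column per pixel
    slope, intercept = np.polyfit(t, flat, deg=1)
    return slope.reshape(cube.shape[1:]), intercept.reshape(cube.shape[1:])

# Synthetic example: 21 annual steps, two pixels
years = 21
t = np.arange(years)
cube = np.empty((years, 1, 2))
cube[:, 0, 0] = 10 + 0.5 * t   # cover increasing by 0.5 percent per year
cube[:, 0, 1] = 30.0           # constant cover

slope, intercept = pixel_trend(cube)
print(slope)
print(intercept)
```

Unlike this sketch, a full regression also provides a p-value per pixel, which is useful for masking out non-significant trends.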


In [74]:
# Calculate trends and intercept for each LC type
from xarrayutils.utils import linear_trend

dict_lcTrends = dict()

for name in lcNames:
    da_lc = ds_lc[name]
    # Fit a linear trend along the time dimension
    lc_trend = linear_trend(da_lc, 'time')
    lc_trend.slope.attrs['units'] = 'percent/y'
    lc_trend.slope.attrs['long_name'] = 'Trend in land cover'
    lc_trend.rio.write_crs(4326, inplace=True)
    dict_lcTrends[name] = lc_trend

# Calculate trends for grouped LC types
lcNamesNew = ['Forest', 'Shrub', 'GrassCrop']
# Forest: ENForest, EBForest, DNForest, DBForest and WoodySavanna (MixForest is not included)
ds_lc['Forest'] = ds_lc['ENForest'] + ds_lc['EBForest'] + ds_lc['DNForest'] + ds_lc['DBForest'] + ds_lc['WoodySavanna']
ds_lc['Shrub'] = ds_lc['ClosedShrub'] + ds_lc['OpenShrub']
ds_lc['GrassCrop'] = ds_lc['Grassland'] + ds_lc['Cropland'] + ds_lc['CropNatMosiac']

for name in lcNamesNew:
    da_lc = ds_lc[name]
    # Fit a linear trend along the time dimension
    lc_trend = linear_trend(da_lc, 'time')
    lc_trend.slope.attrs['units'] = 'percent/y'
    lc_trend.slope.attrs['long_name'] = 'Trend in land cover'
    lc_trend.rio.write_crs(4326, inplace=True)
    dict_lcTrends[name] = lc_trend