# MODIS composites and zonal stats for SOC-D 


### The following downloads a set of MODIS rasters and averages them through time to produce the 5-year May average required for the SOC-D project, which is then used to attribute Corine land cover polygons.

The polygons are just subsets for demo purposes.

### This notebook processes the monthly 1km product.

Note that the input rasters are monthly products that have already been averaged by NASA/USGS, so the step below averages values that are themselves averages.

**Please register with AppEEARS so that you can use your username and password in the workflow.**

https://lpdaacsvc.cr.usgs.gov/appeears/

**The first part uses a function I have written to access the AppEEARS API and download the desired data within an AOI geojson.**

**The second stacks and composites the rasters, then attributes the vector polygons using functions from my own library, geospatial_learn.**

https://github.com/Ciaran1981/geospatial-learn

The dependencies for this are numerous and take a long time to install, so for brevity...

...I have provided the necessary functions locally. Provided you work in this folder, you can import them directly. The files are ```appears_down.py``` and ```shape.py```. This reduces our dependencies considerably.

The locally based functions are imported below.


In [None]:
from src.appears_down import appears_download
from src.shape import zonal_stats
#from geospatial_learn.shape import zonal_stats
import src.raster as rs
# from geospatial_learn import raster as rs
import os
import pandas as pd

### Part 1

Data download. We use a polygon to demarcate the area of interest. To this end we have an AOI geojson, ```aoi.geojson```, which covers part of Wales.

We use the AppEEARS API (wrapped in a function for you in the file ```appears_down.py```).

**Remember to register to obtain a username and password you can use with this.**


In [None]:
# the download folder
out_dir = 'modis1km'
# exist_ok avoids an error if the cell is re-run
os.makedirs(out_dir, exist_ok=True)
# the vars we need for the function
aoi = "aoi.geojson"
user = 'your name'
password='your password'
start_date="05-01"
end_date="05-31"
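
If you would rather not save credentials in the notebook, one option is to read them from environment variables and fall back to a prompt. The following is a minimal sketch; the variable names `APPEEARS_USER` and `APPEEARS_PASS` and the helper `get_credentials` are made up for illustration and are not something `appears_download` requires.

```python
import os
from getpass import getpass


def get_credentials(env=os.environ, prompt=True):
    """Return (user, password) from the environment, prompting if absent.

    APPEEARS_USER / APPEEARS_PASS are hypothetical names chosen for this sketch.
    """
    user = env.get("APPEEARS_USER")
    password = env.get("APPEEARS_PASS")
    if prompt and not user:
        user = input("AppEEARS username: ")
    if prompt and not password:
        # getpass hides the typed password
        password = getpass("AppEEARS password: ")
    return user, password
```

Calling `user, password = get_credentials()` would then replace the two hard-coded strings above.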

Please use...
```python 
appears_download?
```
...to query the docstring of the function, where the params are explained.

Of most importance here are:
```python
product_layer=[('MOD13A3.006', '_1_km_monthly_NDVI')], recurring=True
```
**The products**

The AppEEARS API requires the NASA/USGS code for the product, ```MOD13A3.006```, and the layer, ```_1_km_monthly_NDVI```. This allows us to download a subset of the product, and the same approach can be applied to Landsat bands etc.

**The search**

The parameters below stipulate that we search from the start to the end of May in each year from 2000 to 2018, and only within the month of May. If the ```recurring``` param were switched to False, we'd get all the imagery from May 2000 through May 2018 rather than only that within the month itself.

```python 
start_date, end_date, year_range=[2000, 2018], recurring=True

```
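
To make the effect of a recurring search concrete, the sketch below expands the `MM-DD` bounds into one date window per year. It only mirrors the behaviour described above; `recurring_windows` is a hypothetical helper and not the implementation in `appears_down.py`.

```python
from datetime import date


def recurring_windows(start_date, end_date, year_range):
    """Expand 'MM-DD' bounds into one (start, end) date pair per year.

    Illustrative only: mirrors the described behaviour of recurring=True.
    """
    start_month, start_day = (int(p) for p in start_date.split("-"))
    end_month, end_day = (int(p) for p in end_date.split("-"))
    first_year, last_year = year_range
    return [
        (date(y, start_month, start_day), date(y, end_month, end_day))
        for y in range(first_year, last_year + 1)
    ]


windows = recurring_windows("05-01", "05-31", [2000, 2018])
print(len(windows))  # number of yearly windows
print(windows[0])    # first window
```

With `recurring=False` the equivalent would be a single window running from the first start date to the last end date.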


Whilst processing, a repeating status message will appear (pending > processing > done), followed by some joblib output.

In [None]:
appears_download(aoi, user, password, start_date, end_date, 
                 out_dir, year_range=[2000, 2018], product_layer=[('MOD13A3.006', '_1_km_monthly_NDVI')],
                 recurring=True)

### Part 2: Temporal composite and zonal stats

It's always useful to have the whole time series, so we download everything and simply filter based on the metadata.

The following uses the downloaded data and metadata to sort through the files and weed out some errors associated with this data layer.

To this end we use pandas to filter the metadata associated with the download.


In [None]:
# Apologies - this is on the ugly side

df = pd.read_csv(os.path.join(out_dir,'MOD13A3-006-Statistics.csv'))
# change to datetime format
df['Date'] = pd.to_datetime(df['Date'])

# get the 08-12 only images
df1 = df.loc[(df['Date'] > "2008-01-01") & (df['Date'] <= "2012-06-01")]
# Get the file paths
fileList1 = [os.path.join(out_dir, f+'.tif') for f in df1["File Name"]]
# fix the file names
fileList1 = [f.replace("MOD13A3_006", "MOD13A3.006") for f in fileList1]
fileList1.sort()

# get the 14-18 only images
df2 = df.loc[(df['Date'] > "2014-01-01") & (df['Date'] <= "2018-06-01")]
# Get the file paths
fileList2 = [os.path.join(out_dir, f+'.tif') for f in df2["File Name"]]
# fix the file names
fileList2 = [f.replace("MOD13A3_006", "MOD13A3.006") for f in fileList2]
fileList2.sort()



Sanity checks!

In [None]:
df1

In [None]:
df2
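
Beyond eyeballing the tables, it can be worth confirming that every path built above points at a real file before stacking. The following is a minimal sketch; `missing_files` is a hypothetical helper and not part of `geospatial_learn`.

```python
import os


def missing_files(paths):
    """Return the paths that do not exist on disk, preserving order."""
    return [p for p in paths if not os.path.isfile(p)]
```

Both `missing_files(fileList1)` and `missing_files(fileList2)` should return an empty list; anything else suggests a file name mismatch or an incomplete download.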

### Phew - finally we process rasters to composites, the steps of which are:

1. Stack the layers
2. Warp the raster to ETRS89-LAEA ('EPSG:3035')
3. Calculate the depth-wise stat
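
The depth-wise stat in step 3 amounts to reducing the stack along its band axis. Below is a minimal NumPy sketch of that idea on a toy array. It assumes a fill value of -3000 marks missing pixels (please check the metadata of your own download), and it is not how `rs.stat_comp` is implemented; `bandwise_mean` is a hypothetical name used only here.

```python
import numpy as np

NODATA = -3000  # assumed fill value; confirm against your files


def bandwise_mean(stack, nodata=NODATA):
    """Mean through axis 0 of a (bands, rows, cols) array, ignoring nodata."""
    masked = np.ma.masked_equal(np.asarray(stack, dtype="float64"), nodata)
    # Pixels that are nodata in every band stay nodata in the result
    return masked.mean(axis=0).filled(nodata)


# Two "years" of a 2 x 2 raster, with some missing pixels
stack = np.array([
    [[2000, NODATA], [4000, NODATA]],
    [[4000, 6000], [NODATA, NODATA]],
])
composite = bandwise_mean(stack)
print(composite)
```

Masking before averaging matters because a single fill value left in the stack would otherwise drag the mean of that pixel well below any plausible NDVI.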


In [None]:
# 2008-12
outRas1 = os.path.join(out_dir, "ModisMay08-12.tif")
rs.stack_ras(fileList1, outRas1)

# warp & reproj it for later
repro1 = outRas1[:-4]+"epsg3035.tif"
rs._quickwarp(outRas1, repro1, proj='EPSG:3035')

# Output path for the temporal composite
compRas1 = os.path.join(out_dir, "ModisMeanMay08-12.tif")

# Finally, average through the bands 
rs.stat_comp(repro1, compRas1, bandList=[1], stat='mean')

# 2014-18 - REPEAT as for above
outRas2 = os.path.join(out_dir, "ModisMay14-18.tif")
rs.stack_ras(fileList2, outRas2 )

repro2 = outRas2[:-4]+"epsg3035.tif"
rs._quickwarp(outRas2, repro2, proj='EPSG:3035')

compRas2 = os.path.join(out_dir, "ModisMeanMay14-18.tif")

rs.stat_comp(repro2, compRas2, bandList=[1], stat='mean')

### Zonal Stats

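Lastly we add the zonal stats using an all-touched strategy, in which every pixel a polygon touches contributes to its statistic. Subsets of the Corine land cover maps are included in the repo as ```corine12.gpkg``` and ```corine18.gpkg```.

Should you wish to exclude pixels that only touch a polygon's border, add...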

```python
all_touched=False
```
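
To see why this flag matters with 1 km pixels, here is a toy sketch comparing the two selection rules on a small grid. It assumes the GDAL-style convention in which, without all-touched, only pixels whose centre falls inside the polygon are kept; whether `zonal_stats` behaves exactly this way internally is an assumption. The zone is a simple rectangle and `zone_mean` is a hypothetical helper.

```python
def zone_mean(grid, zone, all_touched=True):
    """Mean of grid cells selected by a rectangular zone (xmin, ymin, xmax, ymax).

    Cell (row, col) covers x in [col, col + 1] and y in [row, row + 1].
    Toy illustration only; real polygons need a proper rasteriser.
    """
    xmin, ymin, xmax, ymax = zone
    values = []
    for row, line in enumerate(grid):
        for col, value in enumerate(line):
            if all_touched:
                # keep any cell that overlaps the zone at all
                keep = col < xmax and col + 1 > xmin and row < ymax and row + 1 > ymin
            else:
                # keep only cells whose centre falls inside the zone
                keep = xmin <= col + 0.5 < xmax and ymin <= row + 0.5 < ymax
            if keep:
                values.append(value)
    return sum(values) / len(values) if values else None


grid = [
    [1, 1, 1],
    [1, 9, 1],
    [1, 1, 1],
]
zone = (0.8, 0.8, 2.2, 2.2)  # slightly larger than the centre cell
touched = zone_mean(grid, zone, all_touched=True)
centres = zone_mean(grid, zone, all_touched=False)
print(touched, centres)
```

For polygons that are small relative to the pixel size, the all-touched mean is diluted by neighbouring pixels, whereas the centre-based rule can leave very small polygons with no pixels at all.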

In [None]:
# Now we add the zonal stat to each polygon in corine
inShp1 = "corine12.gpkg"
inShp2 = "corine18.gpkg"

# stats
stats = zonal_stats(inShp1, compRas1, 1, "May2008_12ndvi")
stats = zonal_stats(inShp2, compRas2, 1, "May2014_18ndvi")
