## Multi-dimensional Analysis with xarray

### Questions
- How do I work with multidimensional data like NetCDF files? 

### Objectives
- Learn how to use xarray to work concisely with multidimensional data

### Introduction
Xarray is an open source Python package that extends the labeled data functionality of pandas to N-dimensional array-like datasets. It has a similar API to NumPy and pandas, and supports both Dask and NumPy arrays.

Xarray data structures can hold data read from formats such as netCDF and GeoTIFF. This notebook uses xarray to illustrate a simple NDVI calculation from GeoTIFFs.

In [None]:
import os
import json
import rasterio
import requests

import matplotlib.pyplot as plt
%matplotlib inline

In [None]:
import xarray as xr 

Recall that we are interested in land use in the State of Pará in Brazil, where extensive logging and illegal deforestation are taking place. The Landsat tile we will look at is Path 227, Row 065. The file we will access is dated 8 June 2020, and we will extract the NIR band, the red band, and the metadata file from the AWS S3 bucket.


In [None]:
# Open path to file on s3 bucket with rasterio
print('Landsat on AWS:')
filepath = 'http://landsat-pds.s3.amazonaws.com/c1/L8/227/065/LC08_L1TP_227065_20200608_20200626_01_T1/LC08_L1TP_227065_20200608_20200626_01_T1_B4.TIF'
with rasterio.open(filepath) as src:
    print(src.profile)

In [None]:
date = '2020-06-08'
url = 'http://landsat-pds.s3.amazonaws.com/c1/L8/227/065/LC08_L1TP_227065_20200608_20200626_01_T1/'
redband = 'LC08_L1TP_227065_20200608_20200626_01_T1_B{}.TIF'.format(4)
nirband = 'LC08_L1TP_227065_20200608_20200626_01_T1_B{}.TIF'.format(5)
mtlfile = 'LC08_L1TP_227065_20200608_20200626_01_T1_{}.json'.format('MTL')

with rasterio.open(url+redband) as src:
    profile = src.profile
    oviews = src.overviews(1) # list of overviews from biggest to smallest
    oview = oviews[1]  # Use second-highest resolution overview
    print('Decimation factor= {}'.format(oview))
    red = src.read(1, out_shape=(1, int(src.height // oview), int(src.width // oview)))

plt.imshow(red)
plt.colorbar()
plt.title('{}\nRed {}'.format(redband, red.shape))
plt.xlabel('Column #')
plt.ylabel('Row #')
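
The overview read above shrinks the image by an integer decimation factor along each axis. As a rough sketch of that arithmetic, the helper below computes the output shape for a given factor. The scene dimensions, the list of factors, and the name `decimated_shape` are illustrative assumptions, not values read from the file.

```python
def decimated_shape(height, width, factor):
    """Hypothetical helper: rows and columns after integer decimation by `factor`."""
    return (height // factor, width // factor)


# Illustrative values only; the real ones come from src.height, src.width
# and src.overviews(1) on the opened GeoTIFF.
height, width = 7761, 7601
factors = [3, 9, 27, 81]

for factor in factors:
    rows, cols = decimated_shape(height, width, factor)
    print('factor {:>2}: {} x {} pixels'.format(factor, rows, cols))
```

Reading a coarser overview moves far fewer pixels over the network, which is why it is convenient for a quick preview plot.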

In [None]:
# Check whether the red band GeoTIFF is internally tiled and inspect its block (tile) shapes

with rasterio.open(url+redband) as src:
    print(src.is_tiled)
    print(src.block_shapes)
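
The tile size matters when choosing chunk sizes for lazy loading: chunks that are a whole multiple of the tile size avoid reading the same tile for two different chunks. The sketch below works through that arithmetic under assumed values. The scene dimensions are illustrative, and `chunk_grid` and `is_tile_aligned` are hypothetical helper names.

```python
import math


def chunk_grid(height, width, chunk):
    """Number of square chunks of side `chunk` needed along y and x."""
    return (math.ceil(height / chunk), math.ceil(width / chunk))


def is_tile_aligned(chunk, tile):
    """True when every chunk edge falls on a tile boundary."""
    return chunk % tile == 0


# Illustrative scene size and tile size; check src.block_shapes for the real tile size.
height, width, tile = 7761, 7601, 512

for chunk in (512, 1000, 1024):
    print(chunk, chunk_grid(height, width, chunk), is_tile_aligned(chunk, tile))
```

Larger chunks mean fewer tasks to schedule but more memory per task, so the choice is a trade-off rather than a fixed rule.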

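Open the red and NIR bands as chunked xarray DataArrays. The GeoTIFFs are stored in 512 × 512 tiles; the chunks below are 1024 × 1024 pixels, a whole multiple of the tile size.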

In [None]:
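# xarray was imported as xr; note that xr.open_rasterio is deprecated in recent
# xarray releases in favor of rioxarray.open_rasterio
red = xr.open_rasterio(url+redband, chunks={'band': 1, 'x': 1024, 'y': 1024})
nir = xr.open_rasterio(url+nirband, chunks={'band': 1, 'x': 1024, 'y': 1024})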
red

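Inspecting the DataArray above, it has three dimensions (band, y, and x), similar to axes in NumPy and pandas, and coordinate indexes with the same names (band, y, and x). Because it is a DataArray rather than a Dataset, it holds a single array and has no separate data variables.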
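
To make the dimension and coordinate structure concrete, here is a minimal sketch on a small synthetic array. The values, coordinate labels, and the name `toy_band` are made up for illustration and are not taken from the Landsat scene.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for a single-band raster: 1 band, 3 rows, 4 columns.
data = np.arange(12, dtype='float32').reshape(1, 3, 4)
da = xr.DataArray(
    data,
    dims=('band', 'y', 'x'),
    coords={'band': [1], 'y': [30.0, 20.0, 10.0], 'x': [0.0, 10.0, 20.0, 30.0]},
    name='toy_band',
)

print(da.dims)   # dimension names
print(da.sizes)  # length of each dimension

# Drop the length-1 band dimension to get a 2-D (y, x) array.
band2d = da.squeeze('band', drop=True)

# Select a pixel by coordinate label rather than by integer position.
pixel = band2d.sel(x=20.0, y=20.0)
print(float(pixel))
```

Label-based selection with `.sel` is the main practical difference from a plain NumPy array, where the same pixel would have to be addressed by row and column index.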

## Calculate NDVI

$$NDVI = \frac{NIR - Red}{NIR + Red}$$

In [None]:
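# Cast to float first: the bands are unsigned integers, so nir - red would wrap around where red > nir
red, nir = red.astype('float32'), nir.astype('float32')
ndvi = (nir - red) / (nir + red)
ndvi2d = ndvi.squeeze()  # drop the length-1 band dimension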
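
Two numerical details are worth checking when computing NDVI from raw band values. Assuming the bands are stored as unsigned 16-bit integers, subtracting them directly wraps around wherever Red exceeds NIR, and pixels where both bands are zero (for example, fill values outside the scene) divide by zero. The sketch below shows both effects on a tiny made-up array using plain NumPy; `ndvi_from_bands` is a hypothetical helper, not part of xarray or rasterio.

```python
import numpy as np


def ndvi_from_bands(nir, red):
    """Hypothetical helper: float NDVI, with NaN where NIR + Red == 0."""
    nir = np.asarray(nir, dtype='float32')
    red = np.asarray(red, dtype='float32')
    total = nir + red
    out = np.full(total.shape, np.nan, dtype='float32')
    # Divide only where the denominator is non-zero; other cells stay NaN.
    np.divide(nir - red, total, out=out, where=total != 0)
    return out


# Made-up digital numbers, stored as unsigned 16-bit integers.
red = np.array([[1000, 3000], [0, 500]], dtype='uint16')
nir = np.array([[3000, 1000], [0, 500]], dtype='uint16')

wrapped = nir - red  # integer arithmetic: 1000 - 3000 wraps around
ndvi = ndvi_from_bands(nir, red)

print(wrapped)
print(ndvi)
```

Casting to a floating-point type before the subtraction keeps NDVI within its expected range of -1 to 1.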

In [None]:
plt.figure()
im = ndvi2d.compute().plot.imshow(cmap='BrBG', vmin=-0.5, vmax=1)
plt.axis('equal')
plt.show()