# Collie

Collie is a town in the South West region of Western Australia, 213 kilometres south of the state capital, Perth, and 59 kilometres inland from the regional city and port of Bunbury. It is near the junction of the Collie and Harris Rivers, in the middle of dense jarrah forest and the only coalfields in Western Australia. At the 2016 census, Collie had a population of 7,192.

Collie is mainly known as a coal-producing centre, but also offers industrial, agricultural and aquaculture tourism industries. Muja Power Station is located east of the town, and to its west is the Wellington Dam, a popular location for fishing, swimming and boating.

<img src="data/Collie_Satellite.png" alt="drawing" width="400" align="left"/>

Western Collieries Limited in 1950. (Source: Premier Coal)
<img src="data/western_colleries.jpeg" alt="drawing" width="400" align="left"/>


### Task:

You are the owner of Westenviro, a consulting company in Perth specialised in environmental impact studies and landscape rehabilitation plans. You have been contacted by the Collie local council regarding the imminent closure of the local coal mine. They want a study of the impact that this mine has had on the area over recent years and of potential ways of rehabilitating the area over the coming years.

Back in your office, you want to have a look at the effect that this open-cut mine has had on the area. You have used the DEA Fractional Cover product in the past and decide to compare the product generated from Landsat 5 in the early 2000s with the current situation captured by Landsat 8.

Fractional Cover represents the proportion of the land surface that is bare soil (BS), covered by photosynthetic vegetation (PV), or non-photosynthetic vegetation (NPV). The green (PV) fraction includes leaves and grass, the non-photosynthetic fraction (NPV) includes branches, dry grass and dead leaf litter, and the bare soil (BS) fraction includes bare soil or rock. You expect to see an increase in the bare soil fraction over the past years.

In [None]:
%matplotlib inline

import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
import xarray as xr
import numpy as np
import os
import sys

#modules for datacube
import datacube
from datacube.utils import masking

# Import external functions from dea-notebooks
sys.path.append('./Scripts')
import DEAPlotting, DEADataHandling

#ignore datacube warnings (needs to be last import statement)
import warnings
warnings.filterwarnings('ignore', module='datacube')

### Setting up

We create the Datacube object and list the products currently available on the DEA whose names contain the string `fc`, which is the code that indicates Fractional Cover.

In [None]:
dc = datacube.Datacube(app='dc-FC')
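# List every product in the datacube, keeping only the name and description
products = dc.list_products()
display_columns = ['name', 'description']
dc_products = products[display_columns]
# Keep the products whose name contains 'fc' (Fractional Cover)
dc_products[dc_products['name'].str.contains("fc")]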

### Collie in 2004

You define the query that will gather the Fractional Cover images for the region east of Collie, where the mine is located, for the first months of 2004. You specify the Landsat 5 derived product `ls5_fc_albers`.

In [None]:
query = {
        'lat': (-33.3, -33.4),
        'lon': (116.2, 116.3),
        'time':('2004-01-01', '2004-03-31')
        }

collie_04 = dc.load(product='ls5_fc_albers', **query)

collie_04

### Visualising the region

After performing the request you plot the bare soil component of the 5 images returned by DEA.

In [None]:
collie_04.BS.plot(col='time', col_wrap=5, cmap='copper')

### Deciding on a representative sample

After this visual inspection you realise the first 3 images are probably affected by clouds and decide to use the 4th as your representative sample of the state of the region in 2004.

> Tip: Notice that we are selecting just one temporal element, but we use the `slice()` notation so we don't lose the temporal dimension in the resulting dataset. We'll need this temporal information later in the tutorial.
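
The difference described in the tip can be checked on a small synthetic dataset. This is a minimal sketch: the dataset `demo` and its values are invented stand-ins for a Fractional Cover load, not real DEA data.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for a Fractional Cover load: 5 time steps of a 2x2 BS band
times = np.array(
    ["2004-01-05", "2004-01-21", "2004-02-06", "2004-02-22", "2004-03-09"],
    dtype="datetime64[ns]",
)
demo = xr.Dataset(
    {"BS": (("time", "y", "x"), np.zeros((5, 2, 2)))},
    coords={"time": times},
)

# Integer index: 'time' is no longer a dimension of the result
by_integer = demo.isel(time=3)

# Slice index: 'time' is kept as a dimension of length 1
by_slice = demo.isel(time=slice(3, 4))

print(by_integer.BS.dims)
print(by_slice.BS.dims)
```

Keeping `time` as a dimension matters later because functions that combine datasets along `time` expect that dimension to be present in each input.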

In [None]:
collie_04 = collie_04.isel(time=slice(3,4))

### Collie in 2019

You repeat the operation using Landsat 8 images `ls8_fc_albers` over the same period of 2019 to minimise the seasonal changes. 

In [None]:
query = {
        'lat': (-33.3, -33.4),
        'lon': (116.2, 116.3),
        'time':('2019-01-01', '2019-03-31')
        }

collie_19 = dc.load(product='ls8_fc_albers', **query)

collie_19

In [None]:
collie_19.BS.plot(col='time', col_wrap=5, cmap='copper')

In [None]:
collie_19 = collie_19.isel(time=slice(4,5))

### Visualising the 3 fractions together

In [None]:
def plot_fc_fractions(ds):
    #set up our images on a grid using gridspec
    plt.figure(figsize=(12,8))
    gs = gridspec.GridSpec(2,2) # set up a 2 x 2 grid of 4 images for better presentation

    ax1=plt.subplot(gs[0,0])
    ds.PV.plot(cmap='gist_earth_r')
    ax1.set_title('PV')

    ax2=plt.subplot(gs[1,0])
    ds.BS.plot(cmap='Oranges')
    ax2.set_title('BS')

    ax3=plt.subplot(gs[0,1])
    ds.NPV.plot(cmap='copper')
    ax3.set_title('NPV')

    ax4=plt.subplot(gs[1,1])
    ds.UE.plot(cmap='magma')
    ax4.set_title('UE')

    plt.tight_layout()
    plt.show()
    
plot_fc_fractions(collie_04)

In [None]:
plot_fc_fractions(collie_19)

### First visual assessment

Comparing the fractions for the two dates, you can see that the area has changed drastically over the last 15 years. The mine seems to have extended over the western part of the region, where the BS fraction has increased significantly and the PV fraction has dropped noticeably.

Happy with these results you decide to work on your data a little bit more to try to get some numerical evidence of the changes.

You have seen that the UE component, which corresponds to Unmixing Error, has high values in some areas of your image. The residual error is defined as the Euclidean norm of the residual vector of the calculation that produces the fractions. High values express less confidence in the fractional components, so you decide to get rid of these pixels, which might introduce errors in your calculations.
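
Masking by an error threshold can be sketched with the xarray `.where()` method. The scene below is synthetic and the cut-off `ue_threshold` is a hypothetical value chosen only so the example has something to mask; it is not a recommended threshold for the Collie data.

```python
import numpy as np
import xarray as xr

# Synthetic 2x3 scene; all values are invented for illustration only
scene = xr.Dataset(
    {
        "PV": (("y", "x"), np.array([[60.0, 55.0, 10.0], [70.0, 20.0, 65.0]])),
        "UE": (("y", "x"), np.array([[5.0, 8.0, 40.0], [6.0, 35.0, 7.0]])),
    }
)

ue_threshold = 30.0  # hypothetical cut-off, not a recommended value

# .where() keeps pixels where the condition holds and sets the rest to NaN
confident = scene.where(scene.UE <= ue_threshold)

# mean() skips NaN by default, so masked pixels do not contribute
print(float(scene.PV.mean()), float(confident.PV.mean()))
```

In this toy scene the two masked pixels have low PV, so the mean PV rises once they are removed. On real data the direction of the change depends on which surfaces carry the high error.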

### Filtering suspicious pixels

You are interested in getting some numbers for the PV fraction, but you realise that the water bodies in the area introduce a false photosynthetic signal and you want to filter those out.

1. Can you find a suitable value for UE to mask the signal from the water bodies in this area? _Replace the `?` symbol with the right value._
2. Can you compute the average PV fraction in 2004 and compare it with the one in 2019?

In [None]:
plt.figure(figsize=(12,8))
#collie_04.UE.where(collie_04.UE<=?).plot(cmap='gist_earth_r')

### Your code goes here

### Filtering fractional cover scenes using WOfS feature layers (WOFLs) 

There is not much going on today in the office and you decide to experiment a little more with these data. You've heard of other people using the WOfS quality layers to filter other products and decide to give it a go here to create masks for removing areas of water.

- For more information on WOfS, see the [Introduction_to_WOfS](https://github.com/GeoscienceAustralia/dea-notebooks/blob/master/02_DEA_datasets/Introduction_to_WOfS.ipynb) notebook.

You can load the WOfS feature layers (WOFLs) on the same spatial grid as the Fractional Cover data by passing the `like` argument to `dc.load()`.

In [None]:
wofls_04 = dc.load(product = 'wofs_albers', like=collie_04)

### Displaying the values in wofs
WOfS uses [bit flags](http://datacube-core.readthedocs.io/en/latest/dev/api/masking.html) to flag pixels as 'wet' or otherwise.
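
As a rough illustration of how bit flags work in general, the sketch below packs two conditions into single integer pixel values and tests them with bitwise operations. The bit positions `CLOUD` and `WET` are assumptions made for this example; the actual WOfS definitions are the ones reported by `masking.describe_variable_flags()`.

```python
import numpy as np

# Hypothetical flag layout, for illustration only; the real definitions
# come from masking.describe_variable_flags()
CLOUD = 1 << 6  # bit 6 -> value 64
WET = 1 << 7    # bit 7 -> value 128

# Four example pixels: nothing set, cloud only, wet only, wet and cloud
pixels = np.array([0, 64, 128, 192], dtype=np.uint8)

# A bit is set when the bitwise AND with its mask is non-zero
is_wet = (pixels & WET) != 0
is_cloud = (pixels & CLOUD) != 0

# "Clear and wet" means the wet bit is set and the cloud bit is not
clear_wet = is_wet & ~is_cloud
print(clear_wet)
```

Helpers such as `masking.make_mask()` hide this arithmetic behind named flags, which is why the notebook never manipulates the bits directly.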

In [None]:
masking.describe_variable_flags(wofls_04, with_pandas=True)

### Plotting the water mask

In [None]:
wetwofl = masking.make_mask(wofls_04, wet=True)
wetwofl.water.isel(time=0).plot()

In [None]:
unwofld_04 = collie_04.where(wetwofl.water==False)
unwofld_04.PV.isel(time=0).plot()

Can you follow a similar method to filter the water out of the 2019 image?

In [None]:
### Your code goes here


### Plotting the components as an RGB

In [None]:
DEAPlotting.rgb(unwofld_04, bands=['BS','PV','NPV'], index=0, index_dim='time')
DEAPlotting.rgb(unwofld_19, bands=['BS','PV','NPV'], index=0, index_dim='time')

### Concatenating data from different sensors

You want to analyse the changes in the Photosynthetic Vegetation fraction and decide that it would be convenient to merge this variable from both datasets into a single one. Looking at the xarray documentation, you learn about the `xr.concat()` function and decide to apply it here.
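
Before applying it to the real data, the behaviour of `xr.concat()` can be tried on two synthetic arrays. The helper `fake_pv` and its values are hypothetical and exist only to build one-time-step inputs that resemble the PV variable.

```python
import numpy as np
import xarray as xr


def fake_pv(date, value):
    """Build a one-time-step 2x2 PV array (hypothetical helper, synthetic data)."""
    return xr.DataArray(
        np.full((1, 2, 2), value),
        dims=("time", "y", "x"),
        coords={"time": np.array([date], dtype="datetime64[ns]")},
        name="PV",
    )


pv_a = fake_pv("2004-03-01", 60.0)
pv_b = fake_pv("2019-03-01", 45.0)

# Both inputs already have a 'time' dimension, so concat stacks along it
pv_demo = xr.concat([pv_a, pv_b], dim="time")

print(pv_demo.dims, pv_demo.sizes["time"])
```

This works cleanly because each input kept its `time` dimension and the two arrays share the same `y` and `x` shape.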

In [None]:
pv = xr.concat((collie_04.PV,collie_19.PV), dim='time')

pv

### Visualising the temporal changes

Using xarray functionality to compute statistics and plot data, you confirm your suspicion that the region is considerably less green than it used to be.
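
To put a number on such a change, one option is to reduce each scene to its spatial mean and subtract the first value from the last. The sketch below does this on an invented two-date array; `pv_toy` and its values are assumptions for illustration, not results for Collie.

```python
import numpy as np
import xarray as xr

# Synthetic PV percentages for two dates over a 2x2 area (invented values)
pv_toy = xr.DataArray(
    np.array(
        [
            [[60.0, 70.0], [50.0, 60.0]],  # first date
            [[40.0, 50.0], [30.0, 40.0]],  # second date
        ]
    ),
    dims=("time", "y", "x"),
    coords={"time": np.array(["2004-03-01", "2019-03-01"], dtype="datetime64[ns]")},
)

# Averaging over both spatial dimensions leaves one value per time step
series = pv_toy.mean(dim=["x", "y"])

# Change between the last and first time step, in percentage points
change = float(series.values[-1] - series.values[0])
print(series.values, change)
```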

In [None]:
pv.mean(dim=['x','y']).plot()

### Exercise

Can you try adding a few more points to this graph for the years between 2004 and 2019? You want to find out how fast this mine in Collie has expanded over the last 15 years.