# Pixel Drill for Indices outputs to CSV - Multi-site

| Authors:  | Bex Dunn|
|----------|----------------|
| Created: | March 6, 2019 |
| Last edited: | Oct 8, 2019 |

**Requirements:**

Run the following commands from the command line before launching Jupyter notebooks from the same terminal, so that the required libraries and paths are set:

`module use /g/data/v10/public/modules/modulefiles` 

`module load dea`

If you find an error or bug in this notebook, please either create an 'Issue' in the GitHub repository, or fix it yourself and create a 'Pull' request to contribute the updated notebook back into the repository (see the repository [README](https://github.com/GeoscienceAustralia/dea-notebooks/blob/master/README.rst) for instructions on creating a Pull request).

__Background:__ Data from the [Landsat](https://landsat.usgs.gov/about-landsat) 5, 7 and 8 satellite missions are accessible through [Digital Earth Australia](http://www.ga.gov.au/about/projects/geographic/digital-earth-australia) (DEA).

__What does this notebook do?:__ This notebook takes a supplied CSV of site points, runs a pixel drill through Landsat surface reflectance at each point, calculates NDVI and Tasseled Cap wetness and greenness, and outputs a CSV of values and a plot of each index for each site.

**Tags**: :index:`Landsat`, :index:`Landsat5`, :index:`Landsat7`, :index:`Landsat8`, :index:`pixeldrill`, :index:`DEAPlotting`, :index:`datacube.utils.geometry`, :index:`query`, :index:`Scripts`, :index:`tasseled_cap`, :index:`NDVI`, :index:`DEADataHandling`, :index:`load_clearlandsat`

Import some modules

In [5]:
import datacube
import datetime
import fiona
import geopandas as gpd
import numpy as np
import pandas as pd
import rasterio.mask
import rasterio.features
import shapely
import seaborn as sns
import sys
import xarray as xr

import matplotlib.dates as mdates
import matplotlib.gridspec as gridspec
import matplotlib.pyplot as plt

from datacube.storage import masking
from datacube.utils import geometry

sys.path.append('../../10_Scripts')
import DEADataHandling, DEAPlotting, TasseledCapTools, BandIndices

dc = datacube.Datacube(app='pixel drill')

%load_ext autoreload

%autoreload 2

#set up file to open 

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


In [6]:
inpath = '/g/data/r78/rjd547/Melbourne_water/50 more points to check.csv'
### Convert csv latitude and longitude values into a geopandas GeoDataFrame

sites = pd.read_csv(inpath, delimiter=",")
#turn csv geometry into geopandas dataframe geometry using lambda functions and shapely

In [7]:
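# NOTE: judging by the output file name below, this CSV's 'lat' column holds longitudes
# (about 145.58) and its 'long' column holds latitudes (about -38.28), hence the apparent swap
sites['x'] = sites['lat']
sites['y'] = sites['long']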
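Because the pixel drill is driven entirely by these two columns, it can be worth checking that they are in the expected order before building point geometries. The sketch below is an illustration rather than part of the original notebook: `looks_like_lon_lat` is a hypothetical helper, and the default bounding box is a rough extent for Australia assumed for this example.

```python
def looks_like_lon_lat(x, y, lon_range=(110.0, 155.0), lat_range=(-45.0, -9.0)):
    """Return True if x falls in the longitude range and y in the latitude range.

    The default ranges are a rough, assumed extent for Australia.
    """
    return (lon_range[0] <= x <= lon_range[1]) and (lat_range[0] <= y <= lat_range[1])


# Coordinates of the first site, taken from the output file name in this notebook
x, y = 145.57951719999997, -38.27652174

print(looks_like_lon_lat(x, y))  # x is a longitude, y is a latitude
print(looks_like_lon_lat(y, x))  # the swapped order fails the check
```

A check like this could be applied to every row of `sites` before the loop, so that a mislabelled CSV is caught before any data is loaded.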

In [8]:
sites['geometry']=sites.apply(lambda z: shapely.geometry.Point(z.x, z.y), axis=1)
sites = gpd.GeoDataFrame(sites)
sites.head()

#define an output location
output_loc = '/g/data/r78/rjd547/Melbourne_water/points/point_'

### run this for multiple sites

for site in range(0, 1):  # use range(len(sites)) to run every site
    print(f'running for index {site}')
    lon = sites['x'][site]
    lat = sites['y'][site]
    query = {'lat': lat,
             'lon': lon}
    #print(query)
    outfilename = f'{output_loc}{lat}_{lon}.csv'
    print(outfilename)

    try:
        ls578 = DEADataHandling.load_clearlandsat(dc=dc, query=query, product='nbart', ls7_slc_off=True)
    except Exception as err:
        # report why the load failed, then move on to the next site
        print(f'could not load Landsat data for site {site}: {err}')
        continue
    ### Calculate NDVI

running for index 0
/g/data/r78/rjd547/Melbourne_water/points/point_-38.27652174_145.57951719999997.csv
Loading ls5
    Loading 208 filtered ls5 timesteps
Loading ls7
    Loading 257 filtered ls7 timesteps
Loading ls8
    Loading 143 filtered ls8 timesteps
Combining and sorting ls5, ls7, ls8 data
    Replacing invalid -999 values with NaN (data will be coerced to float64)
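The notebook description mentions NDVI, but no cell shown here computes it. As a minimal sketch of the arithmetic only, NDVI for a single pixel is (NIR - red) / (NIR + red). The function name `ndvi` and the sample reflectance values below are hypothetical; in the notebook the same formula would be applied to the near-infrared and red bands of `ls578` (the band names `nir` and `red` are an assumption), or through the `BandIndices` script, whose interface is not shown here.

```python
import math


def ndvi(nir, red):
    """Normalised Difference Vegetation Index for one pixel: (NIR - red) / (NIR + red)."""
    if math.isnan(nir) or math.isnan(red):
        return math.nan  # propagate missing observations
    total = nir + red
    if total == 0:
        return math.nan  # avoid dividing by zero
    return (nir - red) / total


# Hypothetical surface reflectance values for one clear observation
print(ndvi(3000.0, 1000.0))
```

Returning NaN for missing or zero-sum inputs keeps cloudy or invalid timesteps in the series without producing misleading index values.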


In [9]:
### Calculate the tasseled cap indices
tci = TasseledCapTools.thresholded_tasseled_cap(ls578, wetness_threshold=-350, drop=True, drop_tc_bands=False)

In [10]:
tci

<xarray.Dataset>
Dimensions:                 (time: 608, x: 1, y: 1)
Coordinates:
  * y                       (y) float64 -4.249e+06
  * x                       (x) float64 1.195e+06
  * time                    (time) datetime64[ns] 1987-05-27T23:27:59.500000 ... 2019-09-09T00:03:41
Data variables:
    greenness               (time, y, x) float64 nan 501.9 153.1 ... 935.1 nan
    greenness_thresholded   (time, y, x) float64 nan nan nan ... nan 935.1 nan
    brightness              (time, y, x) float64 nan 3.352e+03 ... 3.874e+03 nan
    brightness_thresholded  (time, y, x) float64 nan nan nan nan ... nan nan nan
    wetness                 (time, y, x) float64 nan -871.7 ... -1.629e+03 nan
    wetness_thresholded     (time, y, x) float64 nan nan 160.4 ... nan nan nan
Attributes:
    crs:      EPSG:3577
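The `*_thresholded` variables in the dataset above appear to keep only values that pass a threshold and set the rest to NaN: with `wetness_threshold=-350`, a wetness of -871.7 becomes NaN while 160.4 is kept. The sketch below reproduces that behaviour for a plain list. It is an illustration, not the `TasseledCapTools` implementation; the helper name `threshold_values` is hypothetical, and the strict `>` comparison is an assumption.

```python
import math


def threshold_values(values, threshold):
    """Keep values strictly above `threshold`; replace everything else with NaN."""
    return [v if (not math.isnan(v) and v > threshold) else math.nan for v in values]


# First three wetness values from the dataset printed above
wetness = [math.nan, -871.7, 160.4]
print(threshold_values(wetness, threshold=-350))
```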

In [14]:
tci = tci.drop(['greenness_thresholded','brightness_thresholded','wetness_thresholded'])

In [27]:
#drop the length-1 x and y dimensions
tci = tci.squeeze()

In [56]:
#make a dataframe for writing to csv
tci_df = tci.to_dataframe()
tci_df = tci_df.drop(columns=['y','x'])
#show only rows with at least 2 non-NaN values; dropna returns a copy, so tci_df itself is unchanged
tci_df.dropna(axis='index', thresh=2)

| time | greenness | brightness | wetness |
|------|-----------|------------|---------|
| 1987-07-30 23:29:25.500 | 501.8768 | 3352.1549 | -871.6782 |
| 1987-08-31 23:30:20.500 | 153.1092 | 680.1554 | 160.3808 |
| 1987-10-18 23:31:22.500 | 1520.0638 | 3474.7177 | -917.0949 |
| 1987-12-21 23:32:29.500 | 857.6801 | 2801.1277 | -741.1382 |
| 1988-03-26 23:33:51.500 | 1254.6336 | 3421.4741 | -1033.1302 |
| ... | ... | ... | ... |
| 2019-03-08 23:57:00.000 | 1072.9993 | 5164.4219 | -1935.2524 |
| 2019-03-17 00:03:02.000 | 1392.5752 | 4901.8552 | -1692.4041 |
| 2019-04-02 00:02:59.000 | 1522.3535 | 4123.4779 | -1334.7667 |
| 2019-06-21 00:03:18.500 | 1376.6666 | 3222.1653 | -725.4150 |


In [57]:
tci_df

| time | greenness | brightness | wetness |
|------|-----------|------------|---------|
| 1987-05-27 23:27:59.500 | NaN | NaN | NaN |
| 1987-07-30 23:29:25.500 | 501.8768 | 3352.1549 | -871.6782 |
| 1987-08-31 23:30:20.500 | 153.1092 | 680.1554 | 160.3808 |
| 1987-09-16 23:30:44.500 | NaN | NaN | NaN |
| 1987-10-02 23:31:05.500 | NaN | NaN | NaN |
| ... | ... | ... | ... |
| 2019-07-07 00:03:22.500 | NaN | NaN | NaN |
| 2019-07-23 00:03:26.000 | NaN | NaN | NaN |
| 2019-08-08 00:03:33.000 | NaN | NaN | NaN |
| 2019-08-24 00:03:37.000 | 935.1258 | 3873.6255 | -1628.8586 |


In [38]:
!ls /g/data/r78/rjd547/Melbourne_water/more_points/
!head /g/data/r78/rjd547/Melbourne_water/more_points/point_-38.27652174_145.57951719999997.csv

point_-38.27652174_145.57951719999997.csv
time,x,y,greenness,brightness,wetness
1987-07-30 23:29:25.500,1195362.5,-4248512.5,501.8768000000001,3352.1549,-871.6782
1987-08-31 23:30:20.500,1195362.5,-4248512.5,153.10920000000002,680.1554,160.38079999999994
1987-10-18 23:31:22.500,1195362.5,-4248512.5,1520.0638,3474.7177,-917.0949
1987-12-21 23:32:29.500,1195362.5,-4248512.5,857.6801000000003,2801.1277,-741.1382
1988-03-26 23:33:51.500,1195362.5,-4248512.5,1254.6336,3421.4741,-1033.1302
1988-04-11 23:33:56.500,1195362.5,-4248512.5,1393.0037000000002,3059.3094000000006,-962.186
1988-05-29 23:34:20.000,1195362.5,-4248512.5,1028.4586,2470.2125000000005,-1044.9011
1988-06-14 23:34:25.500,1195362.5,-4248512.5,1113.3312,2586.3375000000005,-1071.2801
1988-07-16 23:34:32.500,1195362.5,-4248512.5,1045.3794,3243.4834,-1056.9989
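The file listed above contains only timesteps with data, but the cell that writes it is not shown. In pandas, `dropna(thresh=2)` keeps rows with at least two non-NaN values, so something like `tci_df.dropna(thresh=2).to_csv(outfilename)` would be one way to produce such a file. The sketch below shows the same filter-then-write logic using only the standard library, so it can be run outside the DEA environment. The helper `write_rows` and its `min_valid` parameter are hypothetical names, and an in-memory buffer stands in for the real output file.

```python
import csv
import io
import math


def write_rows(rows, fieldnames, out, min_valid=2):
    """Write dict rows as CSV, skipping rows with fewer than `min_valid` non-NaN index values."""
    writer = csv.DictWriter(out, fieldnames=fieldnames)
    writer.writeheader()
    written = 0
    for row in rows:
        # count the non-NaN values in every column except the timestamp
        values = [row[name] for name in fieldnames if name != 'time']
        n_valid = sum(1 for v in values if not math.isnan(v))
        if n_valid >= min_valid:
            writer.writerow(row)
            written += 1
    return written


# Two timesteps taken from the dataframe shown earlier in this notebook
rows = [
    {'time': '1987-05-27 23:27:59.500', 'greenness': math.nan, 'brightness': math.nan, 'wetness': math.nan},
    {'time': '1987-07-30 23:29:25.500', 'greenness': 501.8768, 'brightness': 3352.1549, 'wetness': -871.6782},
]

buffer = io.StringIO()  # stands in for open(outfilename, 'w', newline='')
n_written = write_rows(rows, ['time', 'greenness', 'brightness', 'wetness'], buffer)
print(n_written)
print(buffer.getvalue())
```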


In [19]:
!ls /g/data/r78/rjd547/Melbourne_water/points/*.csv

running for index 0
/g/data/r78/rjd547/Melbourne_water/points/point_-38.27652174_145.57951719999997.csv
Loading ls5
    Loading 208 filtered ls5 timesteps
Loading ls7
    Loading 257 filtered ls7 timesteps
Loading ls8
    Loading 142 filtered ls8 timesteps
Combining and sorting ls5, ls7, ls8 data
    Replacing invalid -999 values with NaN (data will be coerced to float64)


ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()