# Time Series Algorithm for Sentinel-2 Cloud Mask

## Overview
This product is a time series cloud and cloud shadow detection algorithm for Sentinel-2 surface reflectance data. It models time series of indices derived from surface reflectance and calculates an abnormality coefficient for each pixel in the time series. It does not rely on predefined training data to build complex models with many rule sets; such models often work well on data similar to the training data but return poor results on data that differ from it. Instead, it identifies cloud and cloud shadow by detecting local abnormalities in the temporal and spatial contexts of the abnormality coefficients.


## Required Modules 

### DEA modules


The program requires functions from the DEA Datacube API. To load the DEA module, open a terminal and run the following commands:

`module use /g/data/v10/public/modules/modulefiles`

`module load dea`

### Other Python modules 


## Input Data



## Workflow and Functions of the Program  

### Data loading




### Cloud and Cloud shadow detection



## Output Data

The program produces a cloud mask with the same dimensions as the input time series data. Each Sentinel-2 pixel is classified as one of four distinct categories:
 

- No observation ---> 0
- Clear ---> 1
- Cloud ---> 2
- Cloud shadow ---> 3
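
As a minimal sketch, the four mask codes can be decoded and tallied with NumPy. The `MASK_LABELS` dictionary and the `count_categories` helper are illustrative names that are not part of the program, and the small array only stands in for one time slice of the mask.

```python
import numpy as np

# Mask codes as listed above; the dictionary name is illustrative.
MASK_LABELS = {0: "No observation", 1: "Clear", 2: "Cloud", 3: "Cloud shadow"}

def count_categories(mask):
    """Return the number of pixels in each of the four categories."""
    counts = np.bincount(np.asarray(mask, dtype=np.uint8).ravel(), minlength=4)
    return {MASK_LABELS[code]: int(counts[code]) for code in MASK_LABELS}

# A made-up 2 x 3 slice of a cloud mask, for illustration only.
example_mask = np.array([[0, 1, 1],
                         [2, 3, 1]], dtype=np.uint8)
category_counts = count_categories(example_mask)
print(category_counts)
```

Counting per category is a quick sanity check on a finished mask, for example to see what fraction of a scene was flagged as cloud.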



**Function ID: F** 

**Function Name:**

**Description:** 


This is the template for function description

**Input:**

**Return:**

**called by:**


**Function ID:** 

F1

**Function Name:**

load_s2_nbart_ts

**Description:** 


This function loads Sentinel-2 surface reflectance data from the DEA database. The spatial coverage is specified by a rectangular bounding box. The temporal coverage is specified by the start date and the end date of the time series data.

**Input:**

The latitude and longitude of the top-left and bottom-right corners of a rectangular bounding box, and the start date and the end date of the time series data

**Return:**
An xarray dataset containing 6 spectral bands of time series surface reflectance data and a tsmask variable initialised to 0

**called by:**
Main Program

In [7]:
## Function F1

def load_s2_nbart_ts(dc, lat_top, lat_bottom, lon_left, lon_right, start_of_epoch, end_of_epoch):
    
    # Define spatial and temporal coverage 
    
    newquery={'x': (lon_left, lon_right),
          'y': (lat_top, lat_bottom),
          'time': (start_of_epoch, end_of_epoch),
          'output_crs': 'EPSG:3577',
          'resolution': (-20, 20)} 
    
    # Names of targeted spectral bands
    
    allbands=['nbart_blue', 'nbart_green', 'nbart_red', 'nbart_nir_2', 'nbart_swir_2', 'nbart_swir_3']

    new_bandlabels = ['blue','green', 'red', 'nir', 'swir1', 'swir2']

    # Load S2 data using Datacube API
    s2_ds = DEADataHandling.load_clearsentinel2(dc=dc, query=newquery, sensors=('s2a', 's2b'), product='ard',
                                    bands_of_interest=allbands, mask_pixel_quality=False, 
                                    mask_invalid_data=False, masked_prop=0.0)

    # Rename spectral band names to new band labels
 
    rndic=dict(zip(allbands, new_bandlabels))
    s2_ds = s2_ds.rename(rndic)
    
    # Add tsmask dataarray to the dataset
    
    s2_ds['tsmask']=s2_ds['blue']
    s2_ds['tsmask'].values=np.zeros((s2_ds['time'].size, s2_ds['y'].size,s2_ds['x'].size), dtype=np.uint8)
    

    
    
    return s2_ds

**Function ID: F2** 

**Function Name:**

**Description:** 


This function loads pre-saved Sentinel-2 surface reflectance data, specified by the name of the data file

**Input:**

**Return:**

**called by:**




**Function ID: F6** 

**Function Name:**

findnbsa(vsa, k, N, vss, dws_flags, vtsmask)

**Description:** 


This function finds the mean value of the neighbouring pixels of a specified segment in a time series

**Input:**

- vsa: 1D array, time series
- k: location of the specified segment
- N: length of the segment
- vss: length of the time series
- dws_flags: flags indicating that a pixel is either a non-shadow pixel or a water pixel
- vtsmask: time series of cloud/shadow labels

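**Return:**

Mean value of the 2N nearest neighbouring pixels that are labelled clear and flagged in dws_flags, or 0 if fewer than 2N such pixels are found

**called by:**
Function F5, testpair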
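
The neighbour search can be sketched in a compact form. This is an illustrative rewrite under stated assumptions, not the program's own code: `neighbour_mean` is a hypothetical name, and the single `eligible` list stands in for the combined `vtsmask == 1` and `dws_flags` test.

```python
def neighbour_mean(values, k, n, eligible):
    """Mean of the 2*n nearest eligible samples around the segment values[k:k+n].

    Returns 0.0 when fewer than 2*n eligible samples exist, as findnbsa does.
    """
    # Candidate indices on each side of the segment, nearest first.
    left = [i for i in range(k - 1, -1, -1) if eligible[i]]
    right = [i for i in range(k + n, len(values)) if eligible[i]]

    picked = []
    # Alternate left, right, left, ... and keep drawing from the remaining
    # side once the other side runs out.
    while len(picked) < 2 * n and (left or right):
        if left:
            picked.append(left.pop(0))
        if right and len(picked) < 2 * n:
            picked.append(right.pop(0))

    if len(picked) < 2 * n:
        return 0.0
    return sum(values[i] for i in picked) / (2 * n)

# Made-up series with a bright spike at index 2.
series = [0.10, 0.12, 0.50, 0.11, 0.09]
clear = [True] * len(series)
mean_around_spike = neighbour_mean(series, k=2, n=1, eligible=clear)
print(mean_around_spike)
```

For the spike at index 2 the helper averages the samples at indices 1 and 3, so the spike is compared with its immediate clear neighbours rather than with the whole series.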

In [8]:
def findnbsa(vsa, k, N, vss, dws_flags, vtsmask):
    
    
    cc=0
    mvd=[0,0]
    
    lpt=k
    rpt=k+N-1
    
    dr=0
    mid=0.0
    
    while cc<2*N:
        
        
        if (dr==0):
            if (mvd[0]==0):
                while True:
                    lpt -= 1
                    if (lpt<0):
                        mvd[dr]=1
                        dr=1
                        break
                    elif (vtsmask[lpt]==1 and dws_flags[lpt]==1):
                        mid += vsa[lpt]
                        cc += 1
                        dr=1
                        break
            else:
                dr=1
        else: 
            if (mvd[1]==0):
                while True:
                    rpt += 1
                    if (rpt==vss):
                        mvd[dr]=1
                        dr=0
                        break
                    elif (vtsmask[rpt]==1 and dws_flags[rpt]==1):
                        mid += vsa[rpt]
                        cc += 1
                        dr=0
                        break
            else:
                dr=0

        #print(dr, lpt, rpt, mvd[0], mvd[1])
        if (mvd[0]==1 and mvd[1]==1):
            break
            
    
    if (cc<2*N):
        return 0
    else:
        return mid/(2*N)
    
    
        
    


**Function ID: F5** 

**Function Name:**

testpair(sa, dwi, N, tsmask)

**Description:** 


This function identifies cloud and shadow pixels in a time series by comparing the mean value of each segment with the mean value of its neighbours

**Input:**

- sa: 1D numpy array, time series of the mean surface reflectance value of the 6 spectral bands
- dwi: 1D numpy array, time series of MNDWI, the modified normalised difference water index
- N: length of the segments to be tested
- tsmask: time series of cloud/shadow labels

**Return:**

None. tsmask is updated in place

**called by:**
Function ID: F4, perpixel_filter(s2_ds, y, x)
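
The comparison that decides each label can be sketched as a standalone function. The threshold values are the ones set in `testpair`; the function name `classify_segment` and the constant names are illustrative, and the handling of a zero segment mean is an assumption that mirrors NumPy's divide-by-zero result.

```python
CLOUD_SPIKE = 0.42    # cspkthd in testpair
SHADOW_SPIKE = 0.42   # sspkthd in testpair
CLOUD_MIN = 0.10      # cloudthd in testpair
SHADOW_MAX = 0.055    # shadowthd in testpair

def classify_segment(segment_mean, neighbour_mean):
    """Return 2 (cloud), 3 (cloud shadow) or 1 (clear) for one segment."""
    if neighbour_mean <= 0:
        return 1  # no usable neighbours, so the label stays clear

    if segment_mean > neighbour_mean:
        # Relative rise above the neighbours.
        rise = (segment_mean - neighbour_mean) / neighbour_mean
        if rise > CLOUD_SPIKE and segment_mean > CLOUD_MIN:
            return 2
    elif neighbour_mean > segment_mean:
        # Relative drop below the neighbours, measured against the segment.
        # Assumption: a zero segment mean counts as an infinite drop.
        if segment_mean != 0:
            drop = (neighbour_mean - segment_mean) / segment_mean
        else:
            drop = float("inf")
        if drop > SHADOW_SPIKE and segment_mean < SHADOW_MAX:
            return 3
    return 1

print(classify_segment(0.30, 0.12))  # bright spike
print(classify_segment(0.03, 0.08))  # dark dip
print(classify_segment(0.13, 0.12))  # small change
```

A segment must pass both a relative test (the spike size) and an absolute test (the brightness or darkness threshold) before it is relabelled, which keeps small fluctuations on dark or bright surfaces from being flagged.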

In [9]:
#

def testpair(sa, dwi, N, tsmask):
    
    
  
    cspkthd = 0.42
    sspkthd = 0.42

    cloudthd = 0.10
    shadowthd = 0.055


    
    
    validx=np.where(tsmask==1)[0]
    
    
    vss=validx.size
    if vss<3*N:
        return
   
    ## Filter out invalid, cloud, shadow points in time series
    vsa=sa[validx]
    vdwi=dwi[validx]
    vtsmask=tsmask[validx]
    chmarker=np.zeros(vss, dtype=np.int8)
    
    dws_flags=np.logical_or(vsa>shadowthd, vdwi>0)
    
   # print(vss)
    #print(vsa)
   # print(vtsmask)
    
    
        
    numse=vss-N+1
    
    msa=np.zeros(numse, dtype=np.float32)
    
    #calculate mean values of the time series segments
    
    if (N==1):
        msa=vsa
        
    else:
        for i in np.arange(numse):
            msa[i]=vsa[i:i+N].sum()/N

            
    #print(msa)        
    
    #sort the time series of segment means
    sts = np.argsort(msa)
    
    # reverse the order from ascending to descending, so that sts contains index number of msa array, from 
    #highest values to the lowest
    
    sts = sts[::-1]
    
    cc=0
    for k in sts:
    
        if (chmarker[k]==0):
            m2=msa[k]
            mid=findnbsa(vsa, k, N, vss, dws_flags, vtsmask)
            
            # 
            if (m2>mid and mid>0):
                if ((m2-mid)/mid> cspkthd and m2>cloudthd):
                    vtsmask[k:k+N]=2
                    chmarker[k:k+N]=1
                    
            elif (mid>m2 and mid>0):
                
                if ((mid-m2)/m2> sspkthd and m2<shadowthd):
                    vtsmask[k:k+N]=3
                    chmarker[k:k+N]=1

            
            cc=cc+1
            #print(cc, vss)
             
    #print(k, validx[k], msa[k])
    
    
    #print(msa)
    #print(sts)
    
    #print(vtsmask)
    tsmask[validx]=vtsmask
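
The segment means and the brightest-first ordering used in `testpair` can also be written without an explicit loop. The following is a sketch of one equivalent formulation with `np.convolve`, not the program's own code; `segment_means` is a hypothetical helper name and the series is made up.

```python
import numpy as np

def segment_means(series, n):
    """Mean of every run of n consecutive samples (len(series) - n + 1 values)."""
    series = np.asarray(series, dtype=np.float32)
    window = np.ones(n, dtype=np.float32) / n
    # 'valid' keeps only positions where the window fits entirely in the series.
    return np.convolve(series, window, mode="valid")

series = [0.10, 0.12, 0.50, 0.48, 0.11]
means = segment_means(series, 2)

# Indices of the segments from the highest mean to the lowest, as in testpair.
order = np.argsort(means)[::-1]
print(means, order)
```

Visiting segments from brightest to darkest means the strongest cloud candidates are tested first, before their neighbours are considered.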

**Function ID: F4** 

**Function Name:**

perpixel_filter

**Description:** 


This function performs time series cloud/shadow detection for one pixel

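**Input:**

- s2_ds: xarray dataset holding the surface reflectance time series and the tsmask variable
- y, x: row and column indices of the pixel


**Return:**

None. The cloud/shadow time series of the pixel in s2_ds['tsmask'] is updated in place


**called by:**
Function F3, cloud_mask_filter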
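
The function derives several indices from the six bands. As a minimal sketch, three of them (`sa`, `mndwi` and `msavi`) are computed here for a single observation using the same formulas as the code cell below; `spectral_indices` is a hypothetical helper and the reflectance values are made up.

```python
import math

def spectral_indices(blue, green, red, nir, swir1, swir2):
    """Compute three of the indices used in perpixel_filter for one observation."""
    # Mean reflectance over the six bands.
    sa = (blue + green + red + nir + swir1 + swir2) / 6
    # Modified normalised difference water index.
    mndwi = (green - swir1) / (green + swir1)
    # Modified soil-adjusted vegetation index.
    msavi = (2 * nir + 1 - math.sqrt((2 * nir + 1) ** 2 - 8 * (nir - red))) / 2
    return {"sa": sa, "mndwi": mndwi, "msavi": msavi}

# Made-up reflectances, already scaled to the 0..1 range.
idx = spectral_indices(blue=0.03, green=0.05, red=0.04,
                       nir=0.30, swir1=0.15, swir2=0.08)
print(idx)
```

In this example the negative MNDWI and the clearly positive MSAVI are what one would expect for a vegetated land pixel rather than water.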


In [10]:
## Function ID: F4

def perpixel_filter(s2_ds, y, x):
    
    tsmask = s2_ds['tsmask'].values[:, y, x]
    
    tsmask[:] = 1
    
    scale=10000.0
    ivd=-999/scale
    
    # copy time series spectral data from the data set, scale the data to float32 and <1.0
    
    blue = s2_ds['blue'].values[:, y, x].copy().astype(np.float32) / scale
    
    tsmask[blue==ivd]=0
    
    green = s2_ds['green'].values[:, y, x].copy().astype(np.float32) / scale
    
    tsmask[green==ivd]=0

    red = s2_ds['red'].values[:, y, x].copy().astype(np.float32) / scale
    
    tsmask[red==ivd]=0
    
    nir = s2_ds['nir'].values[:, y, x].copy().astype(np.float32) / scale
    
    tsmask[nir==ivd]=0
        
    swir1 = s2_ds['swir1'].values[:, y, x].copy().astype(np.float32) / scale
    
    tsmask[swir1==ivd]=0
    
    swir2 = s2_ds['swir2'].values[:, y, x].copy().astype(np.float32) / scale
    
    tsmask[swir2==ivd]=0
    
    # Add section
    
    validx=np.where(tsmask==1)[0]
    
    #print(validx)
    
    nb=6
    
    # calculate indices
    
    #
    sa=(blue+green+red+nir+swir1+swir2)/nb
    mndwi=(green-swir1)/(green+swir1)
    msavi=(2*nir+1-np.sqrt((2*nir+1)*(2*nir+1)-8*(nir-red)))/2
    wbi=(red-blue)/blue
    rgm=red+blue
    grbm=(green-(red+blue)/2)/((red+blue)/2)
    
    maxcldthd=0.45
    
    
    tsmask[sa>maxcldthd]=2
    
    
    
    #print(sa)
    #print(tsmask)
    testpair(sa, mndwi, 1, tsmask)
    testpair(sa, mndwi, 1, tsmask)
    testpair(sa, mndwi, 1, tsmask)
    testpair(sa, mndwi, 2, tsmask)
    testpair(sa, mndwi, 2, tsmask)
    testpair(sa, mndwi, 3, tsmask)
    testpair(sa, mndwi, 1, tsmask)
    
    

    shdthd = 0.05

    dwithd=-0.05

    landcloudthd=-0.38

    avithd=0.06
    wtdthd=-0.2


    
    for i, lab in enumerate(tsmask):        
        if (lab==3 and mndwi[i]>dwithd and sa[i]< shdthd): #water pixel, not shadow
            tsmask[i] = 1
            
        if (lab==2 and mndwi[i]<landcloudthd): # bare ground, not cloud
            tsmask[i] = 1
        
        if (lab==3 and msavi[i]<avithd and mndwi[i]>wtdthd): # water pixel, not shadow
            tsmask[i] = 1

        if (lab==1 and wbi[i]<-0.02 and rgm[i]>0.06 and rgm[i]<0.29 and mndwi[i]<-0.1 and grbm[i] < 0.2): # thin cloud
            tsmask[i] = 2
            

    
    #print(tsmask)
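
The final loop of `perpixel_filter` relabels some cloud and shadow detections as clear. The three rules that do this can be sketched as one function, with the thresholds copied from the code above. `refine_label` is an illustrative name, and the fourth rule (thin cloud) is left out of this sketch.

```python
def refine_label(label, sa, mndwi, msavi):
    """Apply the three rules that turn a false cloud/shadow label back to clear."""
    if label == 3 and mndwi > -0.05 and sa < 0.05:
        return 1  # dark water pixel, not shadow
    if label == 2 and mndwi < -0.38:
        return 1  # bare ground, not cloud
    if label == 3 and msavi < 0.06 and mndwi > -0.2:
        return 1  # water pixel, not shadow
    return label

print(refine_label(3, sa=0.03, mndwi=0.10, msavi=0.20))   # shadow over water
print(refine_label(2, sa=0.30, mndwi=-0.50, msavi=0.10))  # bright bare ground
print(refine_label(2, sa=0.30, mndwi=0.00, msavi=0.10))   # cloud is kept
```

Each rule tests the label assigned by the time series step, so a pixel that one rule has already reset to clear is not re-examined by the others.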

**Function ID:** 

F3

**Function Name:**

cloud_mask_filter

**Description:** 


This function detects cloud and cloud shadows using time series of Sentinel-2 surface reflectance data



**Input:**

 An xarray dataset containing 6 spectral bands of time series surface reflectance data

**Return:**

None. The tsmask variable in the input xarray dataset is updated in place with cloud/shadow mask values

**called by:**
Main program
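
The in-place update relies on NumPy view semantics: slicing one pixel's time series out of a 3D array with basic indexing returns a view, so writing to the slice changes the parent array. The sketch below illustrates this on a plain NumPy array standing in for `s2_ds['tsmask'].values`. It assumes the data are held in memory as a NumPy array; this would not hold for lazily loaded (for example dask-backed) data.

```python
import numpy as np

# Stand-in for s2_ds['tsmask'].values with shape (time, y, x).
cube = np.zeros((4, 3, 3), dtype=np.uint8)

pixel_series = cube[:, 1, 2]   # basic indexing returns a view, not a copy
pixel_series[:] = 1            # mark every time step as clear
pixel_series[2] = 2            # flag one time step as cloud

# The parent array sees both writes.
shares_memory = np.shares_memory(pixel_series, cube)
print(shares_memory, cube[:, 1, 2])
```

This is why the per-pixel function does not need to return anything: the labels it writes land directly in the dataset's mask.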

In [24]:
def cloud_mask_filter(s2_ds):
    
    irow = s2_ds['y'].size
    icol = s2_ds['x'].size
    
    for y in np.arange(irow):
        print(y+1, irow, (y+1)/float(irow)*100.0, '%')
        for x in np.arange(icol):
           
            perpixel_filter(s2_ds, y, x)
    

In [12]:
## Start Main program

## Import modules 

import datacube
import sys
import numpy as np
import time
import os
import rasterio
import xarray as xr

import DEADataHandling






In [13]:
## Specify input parameters

#lat_top, lat_bottom, lon_left, lon_right =  -35.144, -35.505, 148.985, 149.284
lat_top, lat_bottom, lon_left, lon_right =  -35.244, -35.344, 149.055, 149.155
start_of_epoch, end_of_epoch = '2017-01-01', '2018-12-31'
    



In [14]:
## Load surface reflectance data

dc = datacube.Datacube(app='load_clearsentinel')


## Call F1
s2_ds = load_s2_nbart_ts(dc, lat_top, lat_bottom, lon_left, lon_right, start_of_epoch, end_of_epoch)

## Divide the whole data into small chunks, use xarray new(v0.16) map_blocks function
## 



Loading s2a pixel quality
    Loading 70 filtered s2a timesteps
Loading s2b pixel quality
    Loading 54 filtered s2b timesteps
Combining and sorting s2a, s2b data


In [25]:
## Call time series cloud / shadow detection for each chunk  

cloud_mask_filter(s2_ds)


1 613 0.1631321370309951 %
2 613 0.3262642740619902 %
3 613 0.48939641109298526 %
...
487 613 79.44535073409462 %

In [None]:
## Apply spatial cloud filter


## Output cloud/shadow mask results




## End Main program


In [26]:
s2_ds

In [27]:

tsmask=s2_ds['tsmask']
tsmask[:,2,2]

In [None]:
## Function Id: F1
## Description:
## Input:
## Return:
## Called by:

