## Growth stage classification by random forest, for a chosen date ('search date') over a specified area ('extent')

1. Specify the parameters 'search date' and 'extent' (i.e., a polygon, e.g. Lampung)

    a. 'extent' is used in the same way as in vegetable classification (open question: what format is the extent? GeoJSON?)

    b. 'search date' is given as a date string (e.g., '30Apr2018')
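
As a minimal sketch of handling the date parameter, the string can be parsed into a `datetime` before any search is done. The helper name `parse_search_date` is hypothetical, and the compact `'%Y%m%d'` format is only an assumption based on the '20190430' style dates mentioned in this notebook.

```python
import datetime

def parse_search_date(date_str, fmt='%d%b%Y'):
    """Parse a search-date string such as '30Apr2018' into a datetime.

    Hypothetical helper; '%d%b%Y' matches the dateStr used in this notebook.
    """
    return datetime.datetime.strptime(date_str, fmt)

# Format used by dateStr in the parameter cell
search_date = parse_search_date('30Apr2018')

# Compact numeric format (assumed alternative, e.g. '20180430')
compact_date = parse_search_date('20180430', fmt='%Y%m%d')
```

Parsing once at the start keeps later date arithmetic (such as subtracting the 12-day interval) independent of the input string format.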

2. Search for the corresponding S1A SAR data, then download and preprocess it.

    This part is the same as in vegetable classification. The only difference is that we download and preprocess only the S1A SAR data that are close to the search date ('dateStr'), e.g., S1A data on '20190430' and S1A data on '20190418'.

   Q1: From the data used in vegetable classification, we take 2 samples. Is that correct?
   Q2: Do the images need to be stacked or not? Does that depend on the search date?
   Q3: Can the search date be arbitrary, or must it follow a sequential pattern?
   
   Then we obtain 2 files with the img extension:
   
   img_current, and 
   img_previous 
   
   The code produces the variables img_current and img_previous, where each image has two channels, VH and VV,
    
    e.g.:
    - img_current[:,:,0] refers to the VH channel, while img_current[:,:,1] refers to the VV channel on the current date
    - img_previous[:,:,0] refers to the VH channel, while img_previous[:,:,1] refers to the VV channel on the previous date
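
As a minimal sketch of this two-channel layout, the following builds a tiny synthetic image and reads the channels back. The pixel values and the 3 x 4 size are made up for illustration; with NumPy's zero-based indexing, channel index 0 holds VH and index 1 holds VV.

```python
import numpy as np

rows, cols = 3, 4  # tiny synthetic size, for illustration only
vh = np.full((rows, cols), 0.02)  # made-up VH backscatter values
vv = np.full((rows, cols), 0.10)  # made-up VV backscatter values

# Stack along a third axis: channel 0 = VH, channel 1 = VV
img_current = np.stack([vh, vv], axis=-1)

vh_channel = img_current[:, :, 0]  # first channel (VH)
vv_channel = img_current[:, :, 1]  # second channel (VV)
```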
   

In [250]:
# Imports and data folder
from __future__ import print_function  # must come before all other imports
import os, datetime
from datetime import timedelta, date
from scipy.io import loadmat, savemat
import numpy as np
import pandas as pd
import time
from pathlib import Path
from PIL import Image
from spectral import *
from spectral.io import envi  # explicit import of the ENVI reader used below
import tifffile as tiff
from tqdm import tqdm
home = os.getcwd()
# os.path.join avoids mixing raw-string backslashes with escaped ones
folder = os.path.join(home, 'Validation_of_products_2018', 'S1A_timeseries',
                      'Lampung_S1A_timeseries_2018Anual_Medium.data')

In [251]:
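# Helper functions

def nearest_date(items, pivot):
    # Return the date in items that is closest to pivot
    return min(items, key=lambda x: abs(x - pivot))

def get_list_date(files):
    # The acquisition date is the last '_'-separated token before the extension
    temp = files.split('.')[0].split('_')
    return datetime.datetime.strptime(temp[-1], '%d%b%Y')

def find_files(list_date, files, ext, j, dateStr, daysInterval):
    # Current acquisition: the one closest to the search date
    date_curr = nearest_date(list_date[ext[j]], datetime.datetime.strptime(str(dateStr), '%d%b%Y'))
    # Previous acquisition: closest to daysInterval days earlier; using
    # nearest_date avoids a ValueError when no file has that exact date
    date_prev = nearest_date(list_date[ext[j]], date_curr - datetime.timedelta(daysInterval))
    idx_curr = list_date[ext[j]].index(date_curr)
    idx_prev = list_date[ext[j]].index(date_prev)
    return files[ext[j]][idx_curr], files[ext[j]][idx_prev]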

def get_img_file(img_file, hdr_file, val):
    # val selects the acquisition: 0 = current, 1 = previous
    hdr_vh = os.path.join(folder, hdr_file[0][val])
    img_vh = os.path.join(folder, img_file[0][val])
    hdr_vv = os.path.join(folder, hdr_file[1][val])
    img_vv = os.path.join(folder, img_file[1][val])
    l = [[hdr_vh, img_vh], [hdr_vv, img_vv]]
    # Fail early with a clear message if any header or image is missing
    for path in (hdr_vh, img_vh, hdr_vv, img_vv):
        if not Path(path).is_file():
            raise IOError('Missing file: ' + path)
    # Image size is read from the VH header; VV is assumed to have the same size
    get_info = envi.read_envi_header(hdr_vh)
    d1 = int(get_info['lines'])
    d2 = int(get_info['samples'])
    im_subset = np.zeros((d1, d2, 2))
    for ix in tqdm(range(len(l))):
        get_img = envi.open(l[ix][0], l[ix][1])
        img_open = get_img.open_memmap()  # read-only access is enough here
        im_subset[:, :, ix] = img_open[:d1, :d2, 0]
    return im_subset
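
The date-matching logic of the helpers above can be checked without any SAR data. The sketch below runs it on hypothetical file names; the `Sigma0_VH_<date>.img` naming pattern and the helper `date_from_name` are assumptions made for illustration, not taken from the actual folder listing.

```python
import datetime

def nearest_date(items, pivot):
    # Return the date in items that is closest to pivot
    return min(items, key=lambda x: abs(x - pivot))

def date_from_name(name):
    # Hypothetical helper: date is the last '_'-separated token before the extension
    token = name.split('.')[0].split('_')[-1]
    return datetime.datetime.strptime(token, '%d%b%Y')

# Hypothetical file names, 12 days apart
names = ['Sigma0_VH_06Apr2018.img',
         'Sigma0_VH_18Apr2018.img',
         'Sigma0_VH_30Apr2018.img']
dates = [date_from_name(n) for n in names]

# A search date that does not coincide with any acquisition
search = datetime.datetime(2018, 4, 28)
date_curr = nearest_date(dates, search)
date_prev = nearest_date(dates, date_curr - datetime.timedelta(days=12))

file_curr = names[dates.index(date_curr)]
file_prev = names[dates.index(date_prev)]
```

Here the search date falls between acquisitions, so the current image is the nearest one and the previous image is the one closest to 12 days before it.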

In [252]:
# Parameters
dateStr = '30Apr2018'
ext1 = 'img'
ext2 = 'hdr'
splitStr       = '_'
daysInterval   = 12
time_formatStr = "%d-%b-%Y"
bandName = ['Sigma0_VH','Sigma0_VV']
folder = home + r'/Validation_of_products_2018/S1A_timeseries/Lampung_S1A_timeseries_2018Anual_Medium.data'
ext = ['vh', 'vv']
# Presumably 4 files per acquisition (.img and .hdr for VH and VV);
# the list comprehension and // behave the same in Python 2 and 3
numfiles = len([f for f in os.listdir(folder) if "Sigma0_" in f]) // 4

In [253]:
# Make Dictionary:
img = {
    'vh': [f for f in os.listdir(folder) if f.endswith('.' + 'img') and "_VH" in f], 
    'vv': [f for f in os.listdir(folder) if f.endswith('.' + 'img') and "_VV" in f]}
hdr = {
    'vh': [f for f in os.listdir(folder) if f.endswith('.' + 'hdr') and "_VH" in f],
    'vv': [f for f in os.listdir(folder) if f.endswith('.' + 'hdr') and "_VV" in f]}

date_img = {
    'vh': [get_list_date(i) for i in img[ext[0]]], 
    'vv': [get_list_date(i) for i in img[ext[1]]] }

date_hdr = {
    'vh': [get_list_date(i) for i in hdr[ext[0]]], 
    'vv': [get_list_date(i) for i in hdr[ext[1]]] }

In [254]:
img_file = [find_files(date_img, img, ext, j, 
                       dateStr, daysInterval) for j in range(len(date_img))]
hdr_file = [find_files(date_hdr, hdr, ext, j, 
                       dateStr, daysInterval) for j in range(len(date_hdr))]

In [255]:
img_current = get_img_file(img_file, hdr_file, 0)   # 0 = current acquisition
img_previous = get_img_file(img_file, hdr_file, 1)  # 1 = previous acquisition

100%|██████████| 2/2 [00:00<00:00,  7.61it/s]
100%|██████████| 2/2 [00:00<00:00,  7.59it/s]


3. Extract features

    Here, we extract 6 features:
    
    - VH_img_current, 
    
    - VV_img_current,
    
    - VH_img_current/VV_img_current, 
    
    - VH_img_current-VH_img_previous,
    
    - VV_img_current-VV_img_previous, and
    
    - (VH_img_current/VV_img_current)-(VH_img_previous/VV_img_previous)
   
   The result is the array img_features, with one channel per feature.

In [257]:
img_features = np.zeros((img_current.shape[0], img_current.shape[1], 6))

In [262]:
img_features[:,:,0] = img_current[:,:,0]
img_features[:,:,1] = img_current[:,:,1]
img_features[:,:,2] = img_current[:,:,0]/img_current[:,:,1]
img_features[:,:,3] = img_current[:,:,0]-img_previous[:,:,0]
img_features[:,:,4] = img_current[:,:,1]-img_previous[:,:,1]
img_features[:,:,5] = img_current[:,:,0]/img_current[:,:,1]-\
                      img_previous[:,:,0]/img_previous[:,:,1]
[d1,d2,d3] = img_features.shape
# where d1, d2, d3 refer to the row, column, and feature dimensions, and d3 = 6
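
The VH/VV ratio divides by zero wherever the VV channel is 0 (for example in no-data areas). As a minimal sketch, the six features can be wrapped in a function with a guarded division. The names `safe_ratio` and `extract_features` are hypothetical, and filling zero-denominator pixels with 0 is an assumption, not necessarily the behaviour wanted here.

```python
import numpy as np

def safe_ratio(num, den):
    """Element-wise num/den; pixels where den == 0 are set to 0 (assumed fill)."""
    out = np.zeros_like(num, dtype=float)
    np.divide(num, den, out=out, where=den != 0)
    return out

def extract_features(img_current, img_previous):
    """Build the 6-channel feature image from two (rows, cols, 2) images."""
    vh_c, vv_c = img_current[:, :, 0], img_current[:, :, 1]
    vh_p, vv_p = img_previous[:, :, 0], img_previous[:, :, 1]
    ratio_c = safe_ratio(vh_c, vv_c)
    ratio_p = safe_ratio(vh_p, vv_p)
    return np.stack([vh_c, vv_c, ratio_c,
                     vh_c - vh_p, vv_c - vv_p,
                     ratio_c - ratio_p], axis=-1)

# Synthetic 1 x 2 images (made-up values); the second pixel has VV = 0
cur = np.array([[[2.0, 4.0], [1.0, 0.0]]])
prev = np.array([[[1.0, 2.0], [3.0, 1.0]]])
feats = extract_features(cur, prev)
```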

4. Train a random forest classifier and run the classification

   You can use 'sklearn.ensemble.RandomForestClassifier'.

In [16]:
from sklearn.ensemble import RandomForestClassifier as rff

In [17]:
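# Example: placeholder training data set (all zeros)
import numpy as np
import pandas as pd

num_cls = 4            # number of classes
num_obs_each_cls = 30  # training observations per class

train_labels = np.zeros((num_cls * num_obs_each_cls, 1))
train_features = np.zeros((num_cls * num_obs_each_cls, 6))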

In [13]:
pd.DataFrame(train_features)

Unnamed: 0,0,1,2,3,4,5
0,0.0,0.0,0.0,0.0,0.0,0.0
1,0.0,0.0,0.0,0.0,0.0,0.0
2,0.0,0.0,0.0,0.0,0.0,0.0
3,0.0,0.0,0.0,0.0,0.0,0.0
4,0.0,0.0,0.0,0.0,0.0,0.0
5,0.0,0.0,0.0,0.0,0.0,0.0
6,0.0,0.0,0.0,0.0,0.0,0.0
7,0.0,0.0,0.0,0.0,0.0,0.0
8,0.0,0.0,0.0,0.0,0.0,0.0
9,0.0,0.0,0.0,0.0,0.0,0.0
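
The training and classification step itself might look like the sketch below. All data here are synthetic and made up purely so that the example runs: class k is drawn around the value k, and the feature image is drawn around 2. The forest settings (`n_estimators=50`) are arbitrary assumptions. Note that the labels are kept as a 1-D vector, which is the shape scikit-learn expects for `y`.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.RandomState(0)
num_cls, num_obs_each_cls, num_feat = 4, 30, 6

# Synthetic training set: class k has all features centred at k (made-up data)
train_labels = np.repeat(np.arange(num_cls), num_obs_each_cls)
train_features = rng.normal(loc=train_labels[:, None], scale=0.1,
                            size=(num_cls * num_obs_each_cls, num_feat))

clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(train_features, train_labels)

# Synthetic feature image standing in for img_features (d1 x d2 x 6)
d1, d2 = 5, 7
img_features = rng.normal(loc=2.0, scale=0.1, size=(d1, d2, num_feat))

# Flatten pixels to rows, predict one class per pixel, reshape to the image grid
pixels = img_features.reshape(-1, num_feat)
class_map = clf.predict(pixels).reshape(d1, d2)
```

Flattening to a `(d1*d2, 6)` matrix lets the classifier treat every pixel as one observation; the final reshape restores the map layout.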


In [15]:
325/24

13

In [None]:
17 files in the time series

Cropmap of size 1000*1000 needs 9 hours
Cropmap of size 2000*2000 needs 1 day 1 hour
