# Data-flow for Experiments

After testing in the previous notebooks, we wrap up here with a concrete plan of experiments.

## Input Data: NOAA GridSat in East Asia

### Raw Data
The details of the data are described in the [official document](https://developers.google.com/earth-engine/datasets/catalog/NOAA_CDR_GRIDSAT-B1_V2).

The downloaded NOAA GridSat-B1 images are stored in netCDF4 format (.nc files). The main variable, brightness temperature, is stored as int16 under the name 'irwin_cdr', with a scale factor of 0.01 and an offset of 200. Missing values are flagged as -31999.
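
As a minimal sketch of how the packed values map back to brightness temperature, the snippet below applies the scale factor, offset, and missing-value flag quoted above. The function name `decode_irwin_cdr` and the sample array are hypothetical. Note that some readers (for example netCDF4-python with its default settings) may already apply this unpacking when the variable is read, in which case this step is not needed.

```python
import numpy as np

SCALE_FACTOR = 0.01     # from the description above
ADD_OFFSET = 200.0      # from the description above
MISSING_FLAG = -31999   # from the description above


def decode_irwin_cdr(raw):
    """Unpack int16 counts into brightness temperature; missing values become NaN.

    Hypothetical helper, assuming the usual netCDF convention
    value = raw * scale_factor + add_offset.
    """
    raw = np.asarray(raw, dtype=np.int16)
    temperature = raw.astype(np.float64) * SCALE_FACTOR + ADD_OFFSET
    temperature[raw == MISSING_FLAG] = np.nan
    return temperature


# Made-up packed values, only to exercise the function.
packed = np.array([[0, 10000], [-31999, 5000]], dtype=np.int16)
decoded = decode_irwin_cdr(packed)
```

Keeping missing pixels as NaN (rather than a large negative number) avoids them skewing later statistics such as the maximum used for rescaling.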

### Preprocessing
1. Read raw data: 70°S to 69.93°N, 180°W to 179.94°E, at 0.07° intervals. shape=(2000, 5143)
2. Crop to the East Asia region (100-160°E, 0-60°N) -> shape=(858,858)
3. Resize the cropped data to a (2^N, 2^N) domain for easier processing (using OpenCV, `cv2`).
    - cv2.resize to (512,512) -> shape=(512,512)
    - cv2.resize to (256,256) -> shape=(256,256)
4. Rescale the brightness temperature values to (0,1) by dividing by the maximum value.
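
The crop (step 2) and rescale (step 4) can be sketched with NumPy alone, as below. This is an illustration under stated assumptions, not the project's preprocessing script: the helper names are hypothetical, rows are assumed to run south to north and columns west to east as listed in step 1, and a random array stands in for a decoded image. The resize of step 3 is left out to keep the sketch free of extra dependencies; in practice a call such as `cv2.resize(cropped, (256, 256))` would sit between the two functions.

```python
import numpy as np

# Grid coordinates in hundredths of a degree; integers avoid float rounding
# at the crop boundaries. Values follow step 1 (0.07-degree spacing).
LAT_CENTI = -7000 + 7 * np.arange(2000)    # -70.00 ... 69.93
LON_CENTI = -18000 + 7 * np.arange(5143)   # -180.00 ... 179.94


def crop_east_asia(field):
    """Keep grid points within 0-60N and 100-160E (hypothetical helper)."""
    rows = np.flatnonzero((LAT_CENTI >= 0) & (LAT_CENTI <= 6000))
    cols = np.flatnonzero((LON_CENTI >= 10000) & (LON_CENTI <= 16000))
    return field[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]


def rescale_by_max(field):
    """Divide by the largest non-missing value so the maximum becomes 1."""
    return field / np.nanmax(field)


# Synthetic stand-in for one decoded brightness-temperature image.
rng = np.random.default_rng(0)
full = rng.random((2000, 5143), dtype=np.float32) * 140.0 + 180.0

cropped = crop_east_asia(full)
scaled = rescale_by_max(cropped)
```

Dividing by the maximum maps the data into the interval (0, 1] rather than stretching it to cover the full range; a min-max scaling would be a different design choice.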


## Output Data: Rainfall in Taiwan Area

The maximum rainfall over an area (the Taipei basin) is categorized into *no-rain*, *precipitation lower than 5% (<10mm/hr)*, and *precipitation higher than 5% (>=10mm/hr)*.
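
A minimal sketch of this three-way labelling is shown below. The 10 mm/hr threshold comes from the text; the function name, the integer class codes, and the assumption that *no-rain* means exactly 0 mm/hr are illustrative choices, not part of the original plan.

```python
import numpy as np

HEAVY_RAIN_THRESHOLD = 10.0  # mm/hr, from the categories above


def categorize_rainfall(max_rain_mm_per_hr, no_rain_threshold=0.0):
    """Map maximum hourly rainfall to 0 (no-rain), 1 (<10 mm/hr), 2 (>=10 mm/hr).

    Hypothetical helper; `no_rain_threshold` is an assumed cut-off.
    """
    rain = np.asarray(max_rain_mm_per_hr, dtype=float)
    labels = np.full(rain.shape, 1, dtype=int)       # default: light rain
    labels[rain <= no_rain_threshold] = 0            # no-rain
    labels[rain >= HEAVY_RAIN_THRESHOLD] = 2         # heavy rain
    return labels


# Made-up rainfall values, only to exercise the function.
labels = categorize_rainfall([0.0, 0.5, 9.9, 10.0, 42.0])
```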



## Methods

Features are extracted (dimension reduction) with each of the methods below and then fed into a generalized linear model (logistic regression).
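
The overall flow can be sketched as follows, using scikit-learn's `IncrementalPCA` as a stand-in for any of the feature extractors and `LogisticRegression` as the classifier. This is an illustration on synthetic data, not the content of the scripts listed below: the image size, number of components, batch size, and random labels are all assumptions chosen to keep the example small.

```python
import numpy as np
from sklearn.decomposition import IncrementalPCA
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic stand-ins: 120 small "images" and three rainfall classes.
n_samples, height, width = 120, 16, 16
images = rng.random((n_samples, height, width))
labels = rng.integers(0, 3, size=n_samples)

# Flatten each image into one feature row.
flat = images.reshape(n_samples, -1)

# Fit the PCA in batches, as one would when the data does not fit in memory.
batch_size = 40
ipca = IncrementalPCA(n_components=8, batch_size=batch_size)
for start in range(0, n_samples, batch_size):
    ipca.partial_fit(flat[start:start + batch_size])

# Reduced features feed the logistic regression.
features = ipca.transform(flat)
clf = LogisticRegression(max_iter=1000)
clf.fit(features, labels)
predicted = clf.predict(features)
```

With random labels the classifier has nothing real to learn; the point is only the shape of the pipeline: images, then low-dimensional features, then class predictions.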

### Principal Component Analysis
> python utils/ipca_transform_preproc_gridsatb1.py -i \[PATH_TO_DATA\] -o \[PREFIX_FOR_OUTPUT\] -m \[PATH_TO_MODEL\] 


### Convolutional Auto-Encoder
> python utils/cae_encode_preproc_gridsatb1.py -i \[PATH_TO_DATA\] -o \[PREFIX_FOR_OUTPUT\] -m \[PATH_TO_MODEL\] 


### Variational Auto-Encoder
> python utils/cvae_encode_preproc_gridsatb1.py -i \[PATH_TO_DATA\] -o \[PREFIX_FOR_OUTPUT\] -m \[PATH_TO_MODEL\] 



### Pre-trained ResNet50

- [ResNet50 for BigEarthNet](https://tfhub.dev/google/remote_sensing/bigearthnet-resnet50/1)
> python ../utils/pretrained_encode_preproc_gridsatb1.py -i ../data/256/ -o rn50bigearth -m "https://tfhub.dev/google/remote_sensing/bigearthnet-resnet50/1" -b 128


- [Feature vectors of images with ResNet 50](https://tfhub.dev/tensorflow/resnet_50/feature_vector/1)
> python ../utils/pretrained2_encode_preproc_gridsatb1.py -i ../data/256/ -o rn50bigearth -b 128

