# SDB Module

## Preface

This note describes how to use the SDB module available in [sdb_gui](https://github.com/rifqiharrys/sdb_gui).

## Setup and Preparation

### SDB GUI source code download

Download the source code from [sdb_gui](https://github.com/rifqiharrys/sdb_gui) and extract it to your local drive.

### Python and packages installation

There are many ways to install Python and the required packages, but I prefer [Miniconda](https://docs.anaconda.com/miniconda/) because of its small installation size. Refer to https://docs.anaconda.com/miniconda/install/ for how to install Miniconda. After conda is installed, open Anaconda Prompt and create a new environment with the `conda create` command below. If you have already followed the installation instructions in the [README.md](https://github.com/rifqiharrys/sdb_gui/blob/main/README.md), you can skip this step and continue to installing Jupyter Lab.

```bash
    # Replace <ENV_NAME> with a name for your environment
    conda create --name <ENV_NAME>
```
To get the latest package versions, install them from conda-forge. To make sure conda uses it, add conda-forge as the priority channel.

```bash
    conda config --add channels conda-forge
```
Then activate your new environment and install Python 3.12 and the packages with the commands below.
```bash
    conda activate <ENV_NAME>
    conda install python=3.12 numpy scipy pandas xarray rioxarray geopandas scikit-learn matplotlib
```

To run this notebook, you need Jupyter Lab or Jupyter Notebook. I use Jupyter Lab; to install it with conda, run the command below.
```bash
    conda install jupyterlab
```

Now that the software setup is complete, you can open this notebook in a browser, in VS Code, or in another tool of your choice.
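
Before opening the notebook, it can help to confirm that the packages are importable from the active environment. The following is a minimal sketch, not part of sdb_gui: the helper name `missing_packages` is hypothetical, and the list assumes the usual import names for the packages installed above (scikit-learn is imported as `sklearn`).

```python
from importlib.util import find_spec

# Import names of the packages installed above (scikit-learn imports as sklearn)
REQUIRED = ['numpy', 'scipy', 'pandas', 'xarray', 'rioxarray',
            'geopandas', 'sklearn', 'matplotlib']


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print('Missing packages:', ', '.join(missing))
else:
    print('All required packages are available.')
```

If anything is reported as missing, check that the right environment is active before reinstalling.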

### Import

In [None]:
# Add sdb module to path if you're using the source code
import sys
sys.path.append('../')

In [None]:
# Import sdb module
import sdb

### Dataset Information
Put your satellite image and depth sample in one directory, and optionally use a separate directory for the output data. Then enter the information the processing needs. Six pieces of information are required: the image filename, the depth sample filename, the depth header name, the depth direction (positive up or positive down), the header name of the column that distinguishes train from test data, and the name of the training group in that column (if any). Enter each value under the matching key in the dictionary.

If you have another dataset to test, create another dictionary with the same structure so you can switch between datasets easily. For this reason, I wrote a function that builds the input and output file locations, gathers the other file-specific information, and returns everything in a single dictionary.

In [None]:
# Input and output directory
dir_in = 'input/'
dir_out = 'output/'

## File information dictionary
# Pulau Karang Bongkok & Pulau Semak Daun
file_1 = {
    'img': 'image.tif',
    'sample': 'depth_sample.shp',
    'depth_header': 'Z_Koreksi',
    'depth_direction': 'down', # positive down
    'train_header': 'note',
    'train_group': 'train'
}

# Kalimantan Selatan (Pulau Laut)
file_2 = {
    'img': 'kalsel_mosaic_band_stack_raster_crop_2.tif',
    'sample': 'Kalsel_Merge_Pasut.shp',
    'depth_header': 'MSL',
    'depth_direction': 'up', # positive up
    'train_header': 'Note',
    'train_group': 'train'
}

# Morotai
file_3 = {
    'img': 'sentinel_morotai_4_bands.tif',
    'sample': 'sbes_morotai_sdb.shp',
    'depth_header': 'MSL', # not yet verified
    'depth_direction': 'down', # positive down
    'train_header': 'Note',
    'train_group': 'train'
}

def input_metadata(location_dict: dict, input_dir: str, output_dir: str) -> dict:
    """
    Build the metadata dictionary used by the rest of the notebook.

    Combines the input and output directories with the file names in
    location_dict (structured like the dictionaries above), prints the
    result, and returns it.
    """

    import os
    import pprint

    # os.path.join works whether or not the directories end with a slash
    metadata_dict = {
        'image_location': os.path.join(input_dir, location_dict['img']),
        'sample_location': os.path.join(input_dir, location_dict['sample']),
        # The output raster keeps the image name with a 'DEM_' prefix
        'new_image_location': os.path.join(output_dir, 'DEM_' + location_dict['img']),
        'depth_header': location_dict['depth_header'],
        'depth_direction': location_dict['depth_direction'],
        'train_header': location_dict['train_header'],
        'train_group': location_dict['train_group']
    }

    pprint.pp(metadata_dict)

    return metadata_dict

# Insert metadata dictionary here
input_dict = input_metadata(file_1, dir_in, dir_out) # Change dict name to change input dataset
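
Because every dataset dictionary must follow the same structure, a quick check can catch a missing key before any data is read. This is a minimal sketch, not part of sdb_gui: the helper `check_location_dict` is hypothetical, and it assumes `'up'` and `'down'` are the only valid depth directions, since those are the only values used in this notebook.

```python
# The six keys every dataset dictionary in this notebook provides
REQUIRED_KEYS = {'img', 'sample', 'depth_header', 'depth_direction',
                 'train_header', 'train_group'}


def check_location_dict(location_dict):
    """Raise ValueError if a dataset dictionary is incomplete or inconsistent."""
    missing = REQUIRED_KEYS - location_dict.keys()
    if missing:
        raise ValueError(f'Missing keys: {sorted(missing)}')
    # Assumption: only 'up' and 'down' are meaningful depth directions
    if location_dict['depth_direction'] not in ('up', 'down'):
        raise ValueError("depth_direction must be 'up' or 'down'")


# Example dictionary with the same structure as the ones above
example = {
    'img': 'image.tif',
    'sample': 'depth_sample.shp',
    'depth_header': 'Z_Koreksi',
    'depth_direction': 'down',
    'train_header': 'note',
    'train_group': 'train'
}

check_location_dict(example)  # passes silently when the dictionary is valid
```

Calling the check on a dictionary right before passing it to `input_metadata` makes it easier to trace a typo in a key name.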

## Processing

### Read Data

In [None]:
# Read geotiff
image = sdb.read_geotiff(input_dict['image_location'])
image

In [None]:
# Read shapefile
sample = sdb.read_shapefile(input_dict['sample_location'])
sample