# Tutorial for Using Split-Raster for Deep Learning

In this demo, we split a large image into small tiles, a common preprocessing step for deep learning and computer vision tasks. The package can also be used to tile large images for other applications.

For example, suppose we have a large image of size 1000-by-1000 and want to split it into 256-by-256 tiles. The `SplitRaster` package generates 16 image tiles of 256x256 pixels, automatically padding the edges. You can adjust the tile size and the overlap between tiles for your own application.
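The tile count above follows from simple grid arithmetic, which the sketch below reproduces. It is not the package's code: `tile_grid` is a hypothetical helper, and it assumes the stride is `crop_size * (1 - repetition_rate)` and that the image is padded just enough for the last window to cover the edge. With no overlap this matches the shapes the package reports in the log further down.

```python
import math


def tile_grid(height, width, crop_size, repetition_rate=0.0):
    """Estimate the tile count and padded shape for a sliding-window split.

    Hypothetical helper for illustration; the stride and padding rules are
    assumptions, not taken from the splitraster source code.
    """
    if not 0 <= repetition_rate < 1:
        raise ValueError("repetition_rate must be in [0, 1)")

    # Distance between the top-left corners of neighbouring tiles.
    stride = int(crop_size * (1 - repetition_rate))

    def windows_along(length):
        # Number of windows needed so the last one reaches the image edge.
        return max(1, math.ceil((length - crop_size) / stride) + 1)

    n_rows = windows_along(height)
    n_cols = windows_along(width)

    # Size of the padded image that holds all windows exactly.
    padded_height = (n_rows - 1) * stride + crop_size
    padded_width = (n_cols - 1) * stride + crop_size

    return n_rows * n_cols, (padded_height, padded_width), stride


print(tile_grid(1000, 1000, 256))  # 1000x1000 example from the text
print(tile_grid(5000, 5000, 256))  # 5000x5000 image used in this demo
```

Under these assumptions, the 1000x1000 image is padded to 1024x1024 and yields 4 x 4 = 16 tiles, and the 5000x5000 image is padded to 5120x5120 and yields 20 x 20 = 400 tiles.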

Set up your local or cloud environment for this demo.

```Bash
conda create -n split_raster_py39 python=3.9 -y
conda activate split_raster_py39
conda install gdal -y
conda install ipykernel -y
pip install --upgrade pip
pip install splitraster
``` 

This demo uses Python 3.9, but the package is compatible with Python 3.7, 3.8, 3.9, 3.10, and 3.11.

In [11]:
# Clean the output folder
!rm -rf ../data/processed/RGB_TIF
!rm -rf ../data/processed/GT_TIF
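
The `!rm -rf` commands above rely on a Unix shell. If the notebook runs somewhere that shell is unavailable (for example, on Windows), a portable alternative can be sketched with the standard library. `clean_output_dir` is a hypothetical helper name, not part of `splitraster`.

```python
import shutil
from pathlib import Path


def clean_output_dir(path):
    """Delete an output folder and its contents; do nothing if it is absent.

    Returns True if a folder was removed, False otherwise.
    """
    target = Path(path)
    if target.is_dir():
        shutil.rmtree(target)
        return True
    return False


# Example usage with the folders from this demo:
# clean_output_dir("../data/processed/RGB_TIF")
# clean_output_dir("../data/processed/GT_TIF")
```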


In [12]:
from splitraster import geo

input_image_path = "../data/raw/TIF/RGB5k.tif"
gt_image_path = "../data/raw/TIF/GT5k.tif"

save_path = "../data/processed/RGB_TIF"
save_path_gt = "../data/processed/GT_TIF"

crop_size = 256
repetition_rate = 0 # <----- change this value to 0.5 for 50% overlap
overwrite = True # <----- change this value to False for no overwrite demo

n = geo.split_image(input_image_path, save_path, crop_size,
                   repetition_rate=repetition_rate, overwrite=overwrite)
print(f"{n} tiles sample of {input_image_path} are added at {save_path}")


n = geo.split_image(gt_image_path, save_path_gt, crop_size,
                   repetition_rate=repetition_rate, overwrite=overwrite)
print(f"{n} tiles sample of {gt_image_path} are added at {save_path_gt}")

Input Image File Shape (D, H, W):(3, 5000, 5000)
crop_size=256, stride=256
Padding Image File Shape (D, H, W):(3, 5120, 5120)


Generating: 100%|██████████| 400/400 [00:00<00:00, 933.96img/s]


400 tiles sample of ../data/raw/TIF/RGB5k.tif are added at ../data/processed/RGB_TIF
Input Image File Shape (D, H, W):(1, 5000, 5000)
crop_size=256, stride=256
Padding Image File Shape (D, H, W):(1, 5120, 5120)


Generating: 100%|██████████| 400/400 [00:00<00:00, 1179.87img/s]

400 tiles sample of ../data/raw/TIF/GT5k.tif are added at ../data/processed/GT_TIF





In [14]:
# !ls ../data/processed/RGB_TIF
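
Instead of listing the folder by eye, the image and label tiles can be cross-checked programmatically. The sketch below assumes that matching image and label tiles share the same file name (ignoring the extension); this naming rule is an assumption, so verify it against your own output first. `compare_tile_folders` is a hypothetical helper, not part of `splitraster`.

```python
from pathlib import Path


def compare_tile_folders(image_dir, label_dir):
    """Compare two tile folders by file name, ignoring extensions.

    Assumes an image tile and its label tile share the same stem.
    """
    image_stems = {p.stem for p in Path(image_dir).iterdir() if p.is_file()}
    label_stems = {p.stem for p in Path(label_dir).iterdir() if p.is_file()}
    return {
        "paired": len(image_stems & label_stems),
        "images_only": sorted(image_stems - label_stems),
        "labels_only": sorted(label_stems - image_stems),
    }


# Example usage with the folders from this demo:
# report = compare_tile_folders("../data/processed/RGB_TIF",
#                               "../data/processed/GT_TIF")
# print(report)
```

If the naming assumption holds, both `images_only` and `labels_only` should be empty and `paired` should equal the tile count printed by `split_image`.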

## Random Sampling Code

If you want to create a small dataset for early-stage exploration, use the random sampling code. The following code generates 20 random pairs of image and label tiles (500x500) from the 5000x5000 image.
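
The idea behind paired random sampling can be sketched in a few lines: draw a random window that fits inside the image, then cut the same window from both the image and its label so the pair stays aligned. This illustrates the concept only, not the internals of `geo.random_crop_image`; `random_windows` and its `seed` parameter are hypothetical.

```python
import random


def random_windows(height, width, crop_size, crop_number, seed=None):
    """Return `crop_number` random (top, left, bottom, right) windows.

    Every window is `crop_size` x `crop_size` and lies fully inside an
    image of shape (height, width). Hypothetical helper for illustration.
    """
    if crop_size > height or crop_size > width:
        raise ValueError("crop_size must not exceed the image size")

    rng = random.Random(seed)
    windows = []
    for _ in range(crop_number):
        # Pick the top-left corner so the window cannot leave the image.
        top = rng.randint(0, height - crop_size)
        left = rng.randint(0, width - crop_size)
        windows.append((top, left, top + crop_size, left + crop_size))
    return windows


# Same settings as the demo: 20 windows of 500x500 in a 5000x5000 image.
for top, left, bottom, right in random_windows(5000, 5000, 500, 3, seed=0):
    # Applying one window to both arrays keeps the image and label aligned,
    # e.g. image[:, top:bottom, left:right] and label[:, top:bottom, left:right].
    print(top, left, bottom, right)
```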

In [6]:
# Clean the output folder
!rm -rf ../data/processed/Rand/RGB_TIF
!rm -rf ../data/processed/Rand/GT_TIF


In [7]:
from splitraster import geo
input_image_path = "../data/raw/TIF/RGB5k.tif"
gt_image_path = "../data/raw/TIF/GT5k.tif"

input_save_path = "../data/processed/Rand/RGB_TIF"
gt_save_path = "../data/processed/Rand/GT_TIF"

n = geo.random_crop_image(input_image_path, input_save_path,
                          gt_image_path, gt_save_path,
                          crop_size=500, crop_number=20,
                          img_ext='.png', label_ext='.png', overwrite=True)

print(f"{n} sample pairs of {input_image_path, gt_image_path} are added at {input_save_path, gt_save_path}.")

Generating: 100%|██████████| 20/20 [00:00<00:00, 309.09img/s]

20 sample pairs of ('../data/raw/TIF/RGB5k.tif', '../data/raw/TIF/GT5k.tif') are added at ('../data/processed/Rand/RGB_TIF', '../data/processed/Rand/GT_TIF').





In [8]:
!ls ../data/processed/Rand/RGB_TIF

0001.png 0004.png 0007.png 0010.png 0013.png 0016.png 0019.png
0002.png 0005.png 0008.png 0011.png 0014.png 0017.png 0020.png
0003.png 0006.png 0009.png 0012.png 0015.png 0018.png
