# Tile Generation Tutorial

Welcome to the tile generation tutorial!

Because a whole slide image is too large to feed directly into a deep learning model, a slide is often divided into a set of small tiles that are used for training instead. For tile-based whole slide image analysis, generating tiles and labels is an important and laborious step. With the LUNA tiling CLIs and tutorials, you can generate tile labels and get your data ready for downstream analysis. In this notebook, we will see how to generate tiles and labels using the LUNA tiling CLIs. Here are the main steps we will review:

1. Load slides
2. Generate tiles, labels
3. Collect tiles for model training
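
To make the idea of dividing a slide into tiles concrete, here is a minimal sketch that enumerates the top-left corners of non-overlapping square tiles. The function name `tile_grid` and the slide dimensions are hypothetical and chosen only for illustration; this is not how the LUNA CLIs are implemented.

```python
def tile_grid(slide_width, slide_height, tile_size):
    """Return (x, y) top-left corners of all full tiles that fit in the slide."""
    return [
        (x, y)
        for y in range(0, slide_height - tile_size + 1, tile_size)
        for x in range(0, slide_width - tile_size + 1, tile_size)
    ]


# Hypothetical slide dimensions in pixels, for illustration only
corners = tile_grid(slide_width=1000, slide_height=600, tile_size=128)
print(len(corners), corners[:3])
```

Partial tiles at the right and bottom edges are dropped in this sketch; a real tiler may pad or keep them instead.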

Throughout this notebook, we will use different method parameter files. Please refer to the example parameter files in the `configs` directory to follow these steps.


In [None]:
import os
HOME = os.environ['HOME']

In [None]:
env DATASET_URL=file:///$HOME/vmount/PRO_12-123/

We'll first walk through each CLI step manually, then run the same steps in parallel using the `LunaCliClient`.

To start, we generate tiles of size 128 at 20x magnification from a slide image and save them.

In [None]:
!generate_tiles \
file:~/vmount/PRO_12-123/data/toy_data_set/01OV002-bd8cdc70-3d46-40ae-99c4-90ef77.svs \
--tile_size 128 --requested_magnification 20 \
-o ~/vmount/PRO_12-123/tiling/test/tiles
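
The relationship between `--tile_size` and `--requested_magnification` can be sketched as follows. This assumes the slide was scanned at a native magnification of 40x, which is a hypothetical value that depends on the scanner; the helper `full_resolution_tile_size` is illustrative and not part of LUNA.

```python
def full_resolution_tile_size(tile_size, requested_magnification, native_magnification):
    """Pixels a tile spans at native resolution when sampled at a lower magnification."""
    scale = native_magnification / requested_magnification
    return int(round(tile_size * scale))


# Assumed native magnification of 40x (hypothetical)
print(full_resolution_tile_size(128, 20, 40))
```

Under that assumption, a 128-pixel tile requested at 20x covers a 256-pixel square of the full-resolution image.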


In [None]:
!detect_tissue \
~/vmount/PRO_12-123/data/toy_data_set/01OV002-bd8cdc70-3d46-40ae-99c4-90ef77.svs \
~/vmount/PRO_12-123/tiling/test/tiles \
--requested_magnification 2 \
--filter_query "otsu_score > 0.1" \
-o ~/vmount/PRO_12-123/tiling/test/detect
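
The `--filter_query` expression reads like a pandas `DataFrame.query` string, the same syntax used with `.query(...)` later in this notebook. Here is a minimal sketch of that kind of filter, assuming the tissue detection step yields one row per tile with an `otsu_score` column. The table, tile addresses, and scores below are made up for illustration.

```python
import pandas as pd

# Hypothetical tile table; addresses and scores are invented for illustration
tiles = pd.DataFrame(
    {
        "address": ["x1_y1", "x2_y1", "x3_y1", "x4_y1"],
        "otsu_score": [0.00, 0.05, 0.35, 0.90],
    }
).set_index("address")

# Keep only the tiles whose score exceeds the threshold
kept = tiles.query("otsu_score > 0.1")
print(kept.index.tolist())
```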

In [None]:
!label_tiles \
../PRO_12-123/data/toy_data_set/table/ANNOTATIONS ~/vmount/PRO_12-123/tiling/test/detect \
-o ~/vmount/PRO_12-123/tiling/test/label
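
Labeling a tile amounts to asking which annotated region it overlaps. Here is a rough sketch of that idea, using axis-aligned boxes in place of the real annotation polygons and assigning the label with the largest overlap. The helpers `intersection_area` and `label_tile` and the annotation boxes are hypothetical; the actual rule used by `label_tiles` may differ.

```python
def intersection_area(a, b):
    """Overlap area of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(width, 0) * max(height, 0)


def label_tile(tile_box, annotations):
    """Return (label, area) of the annotation overlapping the tile most, or (None, 0)."""
    best_label, best_area = None, 0
    for label, box in annotations:
        area = intersection_area(tile_box, box)
        if area > best_area:
            best_label, best_area = label, area
    return best_label, best_area


# Hypothetical annotations; real ones are polygons, boxes keep the sketch short
annotations = [("tumor", (0, 0, 200, 200)), ("stroma", (150, 0, 400, 200))]
print(label_tile((128, 0, 256, 128), annotations))
```

A tile that overlaps no annotation gets no label and an area of zero, which is the kind of row a filter such as `intersection_area > 0` would drop.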

In [None]:
!save_tiles \
~/vmount/PRO_12-123/data/toy_data_set/01OV002-bd8cdc70-3d46-40ae-99c4-90ef77.svs \
~/vmount/PRO_12-123/tiling/test/label \
--num_cores 16 --batch_size 200 --dataset_id PRO_TILES \
-o ~/vmount/PRO_12-123/tiling/test/saved_tiles

In [None]:
from luna.common.utils import LunaCliClient

def pipeline(slide_id, input_slide, input_annotations):
    client = LunaCliClient("~/vmount/PRO_12-123/tiling", slide_id)

    # Register the inputs under the tags used by the steps below
    client.bootstrap("slide", input_slide)
    client.bootstrap("annotations", input_annotations)

    # Generate 128-pixel tiles at 20x
    client.configure("generate_tiles", "slide",
        tile_size=128,
        requested_magnification=20
    ).run("source_tiles")

    # Keep tiles with otsu_score > 0.1, evaluated at 2x
    client.configure("detect_tissue", "slide", "source_tiles",
        filter_query="otsu_score > 0.1",
        requested_magnification=2
    ).run("detected_tiles")

    # Attach annotation labels to the detected tiles
    client.configure("label_tiles", "annotations", "detected_tiles").run("labled_tiles")

    # Save the labeled tiles under the PRO_TILES_LABLED dataset
    client.configure("save_tiles", "slide", "labled_tiles",
        num_cores=16, batch_size=200, dataset_id='PRO_TILES_LABLED'
    ).run("saved_tiles")

In [None]:
from concurrent.futures import ThreadPoolExecutor
import pandas as pd

df_slides = pd.read_parquet("../PRO_12-123/data/toy_data_set/table/SLIDES/slide_ingest_PRO_12-123.parquet")
        
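with ThreadPoolExecutor(5) as pool:
    futures = []
    for index, row in df_slides.iterrows():
        print(index)
        futures.append(
            pool.submit(pipeline, index, row.slide_image, "../PRO_12-123/data/toy_data_set/table/ANNOTATIONS")
        )

    # Re-raise any exception from a worker thread instead of dropping it silently
    for future in futures:
        future.result()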

In [None]:
import pandas as pd
df_tiles = pd.read_parquet("~/vmount/PRO_12-123/datasets/PRO_TILES_LABLED/").query("intersection_area > 0")
print (df_tiles['regional_label'].value_counts())
df_tiles
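
Before training, it can help to check how unevenly the labels are distributed and, if needed, draw a class-balanced sample. The sketch below uses a small made-up table in place of `df_tiles` and downsamples every class to the size of the smallest one. It shows one possible approach, not a step prescribed by LUNA.

```python
import pandas as pd

# Hypothetical labeled tile table; the counts are invented for illustration
df = pd.DataFrame(
    {"regional_label": ["tumor"] * 6 + ["stroma"] * 3 + ["fat"] * 2}
)

# Downsample every class to the size of the smallest one
n_per_class = int(df["regional_label"].value_counts().min())
balanced = df.groupby("regional_label").sample(n=n_per_class, random_state=0)
print(balanced["regional_label"].value_counts().to_dict())
```

Fixing `random_state` makes the sample reproducible across runs.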

Congratulations! You now have 2120 tumor, 860 stroma, and 751 fat tile images and labels ready to train your model.