# Example of patch extraction from point list


*Note*: This notebook assumes that `tiatoolbox` is already installed. If it is not, you can install it in your Python environment by following the guidelines at https://github.com/TIA-Lab/tiatoolbox, or you can install the stable release by running the cells below.

In [None]:
!apt-get -y install libopenjp2-7-dev libopenjp2-tools openslide-tools

In [None]:
!pip install tiatoolbox

Welcome to tiatoolbox. In this example we show how to use tiatoolbox to extract patches from an image or a whole slide image (WSI), given a list of points of interest. This tool is useful in several applications; one practical example is patch extraction for training classification models.

We start by importing the libraries required to run this notebook.

In [None]:
from tiatoolbox.tools import patchextraction
from tiatoolbox.utils.misc import imread
from tiatoolbox.utils.misc import read_point_annotations
import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl

mpl.rcParams['figure.dpi'] = 300 # for high resolution figure in notebook

For this tutorial, we use a sample image from the MoNuSeg dataset for which the nuclei centroid locations have already been extracted. The sample image and the point list file come with tiatoolbox and can be read as follows:

In [None]:
# Read the sample image and the list of nuclei centroids
input_img = imread("../tests/data/TCGA-HE-7130-01Z-00-DX1.png")
centroids_list = read_point_annotations(
    "../tests/data/sample_patch_extraction.csv"
)

print('Size of input image is: {}'.format(input_img.shape))
print('And there are {} point annotations for this image'.format(
    centroids_list.shape[0]))
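
Point files for this workflow hold one point per row as x, y and class. The following is a minimal sketch of writing and re-reading such a file with the Python standard library only. The header row and the column names `x`, `y`, `class` are assumptions made for illustration; check the tiatoolbox documentation for what `read_point_annotations` actually expects.

```python
import csv
import io

# Hypothetical points: (x, y, class) for three nuclei.
points = [(120.5, 64.0, 0), (33.0, 210.25, 1), (87.0, 15.5, 1)]

# Write the points as CSV text (an in-memory buffer stands in for a file).
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["x", "y", "class"])  # assumed header, see note above
writer.writerows(points)
csv_text = buffer.getvalue()

# Read the text back and convert each field to its numeric type.
rows = list(csv.DictReader(io.StringIO(csv_text)))
parsed = [(float(r["x"]), float(r["y"]), int(r["class"])) for r in rows]
```

To write a real file instead, replace the `io.StringIO` buffer with `open("points.csv", "w", newline="")`, where `points.csv` is a placeholder name.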

The function `read_point_annotations` returns a pandas DataFrame containing the points in (x, y, class) format, which is also the format in which point files should be saved. To get a better view of the data, we can display the image with the centroids overlaid on it:

In [None]:
# overlay nuclei centroids on image and plot
plt.imshow(input_img)
plt.scatter(np.array(centroids_list)[:,0], np.array(centroids_list)[:,1], s=1)
plt.axis('off')
plt.show()

As the figure above shows, each nucleus is marked with a dot. To train a nuclei classifier, we need to extract a patch for every nucleus and save the patches in different folders according to their class in the list. This is easily done with the `PointsPatchExtractor` class, a patch extractor that yields patches from `input_img` one by one, based on `centroids_list`. In general, a patch extractor instance is created with the `get_patch_extractor` function, as follows:

In [None]:
patch_extractor = patchextraction.get_patch_extractor(
    input_img=input_img,  # input image path, numpy array, or WSI object
    labels=np.array(centroids_list)[300:350, :],  # points: path to a csv/json file, numpy array, or pandas DataFrame
    method_name="point",  # also supports "fixedwindow" and "variablewindow"
    patch_size=(32, 32),  # size of the patch extracted around each centroid
)

As you can see, `patchextraction.get_patch_extractor` accepts several arguments:

- `input_img`: The image to extract patches from. In this tutorial we read the image first and passed the array to the function, but you can also pass the image path and the function will read the image (or WSI) automatically.
- `labels`: The list of points around which patches are extracted. In this tutorial we loaded the point list as a pandas DataFrame, converted a slice of it to a numpy array, and passed that to the function, but you can also pass the path of a csv or json file and the function will load the point list automatically.
- `method_name`: This important argument specifies the type of patch extractor to build. Because we want patches around centroid points, we use the `point` option. Currently, `fixedwindow` and `variablewindow` are also supported. Please refer to the documentation for more information.
- `patch_size`: The size of each extracted patch, here `(32, 32)`.
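
To make the idea behind the `point` method concrete, here is a minimal NumPy sketch of cutting a fixed-size window centred on one point. It is not tiatoolbox's implementation: the helper name `crop_patch_around_point`, the rounding rule, the (width, height) reading of `patch_size`, and the zero padding at the image border are all assumptions made for illustration, and the point is assumed to lie inside the image.

```python
import numpy as np


def crop_patch_around_point(img, x, y, patch_size=(32, 32)):
    """Cut a (height, width) window centred on (x, y) from an image array.

    Hypothetical helper: patch_size is read as (width, height) and regions
    outside the image are filled with zeros.
    """
    width, height = patch_size
    left = int(round(x)) - width // 2
    top = int(round(y)) - height // 2
    # Pad every side by one full patch so windows that hang over the
    # border can still be sliced without negative indices.
    pad_spec = [(height, height), (width, width)] + [(0, 0)] * (img.ndim - 2)
    padded = np.pad(img, pad_spec, mode="constant")
    top_p, left_p = top + height, left + width  # shift into padded coordinates
    return padded[top_p:top_p + height, left_p:left_p + width]


# Toy image with two marked pixels standing in for nuclei centroids.
toy_img = np.zeros((64, 64, 3), dtype=np.uint8)
toy_img[20, 30] = 255  # row 20, column 30, i.e. point (x=30, y=20)
toy_img[0, 0] = 7      # point in the top-left corner

patch = crop_patch_around_point(toy_img, x=30, y=20, patch_size=(8, 8))
corner = crop_patch_around_point(toy_img, x=0, y=0, patch_size=(8, 8))
```

Note that points are given as (x, y) while NumPy indexes arrays as `[row, column]`, so `y` selects the row and `x` the column.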

As mentioned above, `patch_extractor` behaves like a generator. It is designed this way to stay efficient with large point lists and to avoid excessive RAM usage.

To extract patches with `patch_extractor`, you can use a for loop as shown below, where we extract the first 16 patches from `input_img` based on the points selected from `centroids_list`.
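
The effect of this design can be illustrated with a plain Python generator. The sketch below is a toy stand-in, not the tiatoolbox class: `lazy_patches` and its `log` argument are hypothetical names, and strings take the place of image patches. It shows that work happens only when an item is requested, and that iteration resumes where it last stopped.

```python
def lazy_patches(points, log):
    """Yield one placeholder 'patch' per point, recording when work happens."""
    for x, y in points:
        log.append((x, y))  # this line runs only when the item is requested
        yield "patch@({}, {})".format(x, y)


points = [(10, 12), (40, 8), (25, 30), (5, 5)]
computed = []
extractor = lazy_patches(points, computed)

# Creating the generator does no work yet.
computed_before = list(computed)

# Requesting two items processes exactly two points.
first_two = [next(extractor), next(extractor)]
computed_after_two = list(computed)

# A later loop continues from the third point; earlier items are not repeated.
remaining = list(extractor)
```

Because only one item exists at a time, memory use does not grow with the length of the point list.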

In [None]:
# Plot the first 16 patches in a 4x4 grid
for i, patch in enumerate(patch_extractor, start=1):
    plt.subplot(4, 4, i)
    plt.imshow(patch)
    plt.axis('off')
    if i >= 16:  # show only the first 16 patches
        break
plt.show()


You can also call the `__next__` method (equivalently, the built-in `next(patch_extractor)`) to get the next patch from `patch_extractor`, provided you have not reached the end of the point list:

In [None]:
this_patch = patch_extractor.__next__()
plt.imshow(this_patch)
plt.axis('off')
plt.show()
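
The goal stated earlier, saving each patch in a folder named after its class, can be sketched as follows. This is a sketch under stated assumptions rather than a tiatoolbox feature: `save_patches_by_class` is a hypothetical helper, the class labels are assumed to be integers taken from the third column of the point list, and patches are assumed to arrive in the same order as the points. `np.save` is used only to keep the example free of extra dependencies; an image writer such as `matplotlib.pyplot.imsave` could be swapped in.

```python
import tempfile
from pathlib import Path

import numpy as np


def save_patches_by_class(patches, class_labels, out_dir):
    """Save each patch under out_dir/class_<label>/ and return the paths."""
    out_dir = Path(out_dir)
    saved = []
    for idx, (patch, label) in enumerate(zip(patches, class_labels)):
        class_dir = out_dir / "class_{}".format(int(label))
        class_dir.mkdir(parents=True, exist_ok=True)
        target = class_dir / "patch_{:05d}.npy".format(idx)
        np.save(target, np.asarray(patch))
        saved.append(target)
    return saved


# Toy stand-ins for the patch iterator and the class column.
toy_patches = (np.full((32, 32, 3), i, dtype=np.uint8) for i in range(4))
toy_classes = [0, 1, 1, 0]

with tempfile.TemporaryDirectory() as tmp:
    saved_paths = save_patches_by_class(toy_patches, toy_classes, tmp)
    relative_paths = sorted(p.relative_to(tmp).as_posix() for p in saved_paths)
    reloaded = np.load(saved_paths[2])  # third patch, filled with the value 2
```

In this notebook, the patches would come from a newly created patch extractor and the labels from the matching rows of the point list, for example `np.array(centroids_list)[300:350, 2]` if the class is stored in the third column.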