| [![back](../../media/navigation/back.png)](../../exercises/tissue-classification-segmentation/ex2.html) | [![home](../../media/navigation/home.png)](../../index.html) | [![next](../../media/navigation/next.png)](../../exercises/tissue-classification-segmentation/ex4.html) |
| :---: | :---: | :---: |
| Ex 4.2: Threshold-based tissue segmentation | • | Ex 4.4: Semi-automatic segmentation using SAM |

# 4. Tissue segmentation and classification

## 4.3 Using an N-class pixel classifier

In the last exercise, we extracted our objects with a simple thresholder. Thresholders are very limited, though: they only look at intensity, and they can only separate two classes (true/false, +/-, FG/BG, ...).

We would like to have a way to segment areas in our tissue based on features more subtle than just the intensity.
We would also like to have the possibility to classify more than two types of tissue.


### 4.3.1 Epithelium area in colorectal cancer

#### Goals:

- The goal of this exercise is to measure the area of epithelium in images of colorectal cancers.
- To do so, you will need to:
    - Find the polygon corresponding to the whole tissue and give it the `Tumor` classification using a thresholder.
    - Make a training image from a set of representative chunks of image.
    - Train a pixel classifier able to differentiate the epithelium from the rest of the tissue.
    - Apply the pixel classifier to find epithelium detections within the colon annotations.
    - Export the measurements to a TSV.

#### Required data:

| **Folder** | Description | Location | License |
| --- | --- | --- | --- |
| Colorectal Cancer IHC HE | Whole slide images of colorectal cancer stained with hematoxylin and eosin | [DOI: 10.18710/DIGQGQ](https://dataverse.no/dataset.xhtml?persistentId=doi:10.18710/DIGQGQ) | CC0 1.0 |

### A. Locate the tissue on the slides

- Before starting this step, make sure that you have the `Tumor` class available in your class list. It is part of the built-in classes of QuPath so it should already be present for everybody.
- Using the protocol described in [Ex 2: Threshold-based tissue segmentation](../../exercises/tissue-classification-segmentation/ex2.html#A.-Locate-the-lungs-tissue), create the threshold classifier that segments the tissue on your slides. Make sure that you split objects into separate polygons so that you can filter them by area, and give a name to each part so that you can tell which line of the final TSV corresponds to which object.
- Using the workflow recorder, apply the thresholder to all the images of your dataset and clean up the results where it's required:
    - Remove the undesired parts (bubbles, folded tissue, ...)
    - Merge fragmented tissue into unique polygons
    - Give a name to each chunk to recognize them later (In the annotation list: right-click > Set properties > Name)
- At this point, each image that you have should contain several polygons classified as `Tumor`.
- If you gave names to your annotations but they don't show up in the viewer, try clicking the ![QP toggle name](../../media/qp-icons/show-names.png) "toggle names display" button.

![segmented colon tumor](../../media/tissue-classification/segmented-colon-tumor.png)
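The area filtering mentioned above is done inside QuPath, but the idea behind it can be sketched in a few lines of Python. This is only an illustration under stated assumptions: polygons are given as lists of (x, y) vertices in pixels, and the names `polygon_area`, `keep_large_polygons`, `tissue_1`, `debris` as well as the `min_area` threshold are hypothetical and not part of QuPath.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def polygon_area(vertices: List[Point]) -> float:
    """Area of a simple polygon using the shoelace formula.

    The result is in squared units of the input coordinates
    (px^2 if the vertices are expressed in pixels).
    """
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the polygon
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def keep_large_polygons(polygons: Dict[str, List[Point]],
                        min_area: float) -> Dict[str, float]:
    """Return {name: area} for the polygons at least `min_area` large."""
    kept = {}
    for name, vertices in polygons.items():
        area = polygon_area(vertices)
        if area >= min_area:
            kept[name] = area
    return kept


# Hypothetical example: one real piece of tissue and one tiny fragment.
polygons = {
    "tissue_1": [(0, 0), (100, 0), (100, 50), (0, 50)],  # 100 x 50 rectangle
    "debris": [(0, 0), (3, 0), (0, 4)],                  # small triangle
}
print(keep_large_polygons(polygons, min_area=100))  # {'tissue_1': 5000.0}
```

Naming each polygon, as recommended above, is what makes it possible to match the kept entries with the lines of the exported table later on.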

### B. Create a training image

- The pixel classifier that we will use later is based on a machine learning algorithm, so we need to show it a representative sample of data.
- We can't simply search through our whole project and hope to find an image representative of the entire dataset.
- To address this need, we will create a new image that is a patchwork of representative chunks selected throughout our project.
- In order to do that, make a checklist of the different textures that you can find in your images, for example:
    - Where the nuclei are very dense
    - Where there are almost no nuclei
    - Where you have small/huge crypts
    - Where the staining is lighter/darker than usual
    - ...
- Once you have identified all the textures, for each of them:
    - Find an example area where this texture is present
    - Activate your ![QP rectangle](../../media/qp-icons/rectangle-selection.png) rectangle selection tool and make a rectangle around this example
    - Give the `Region*` class to this newly created rectangle

**Note:**

Don't forget to include a couple of normal/clean areas in your `Region*` rectangles, it would be a shame if the classifier only behaved well on unusual cases!

- Once you have made a few rectangles containing examples across your images, we will use them to create our training image summarizing most of what we can find in our dataset.
- To do that:
    - Save your project.
    - In the top-bar menu, go to: "Classify" > "Training images" > "Create training image...".
    - If you used the `Region*` class for your example rectangles, you don't have to edit the settings, you can just validate.
    - In your list of images, a new "Sparse image" should have appeared.

![sparse image](../../media/tissue-classification/training-image.png)

**Note:**

Don't forget to do the color deconvolution for all the images of this project, including the training image ("Sparse image")!!!

### C. Train the pixel classifier

- In QuPath, training a pixel classifier based on machine learning (K-nn, random forest, ...) is an iterative process. We start by providing very few examples and look at what the classifier understands. Based on its errors, we add a few more examples and check how much they help. We repeat this as many times as it takes to get a decent result.
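To make this loop more concrete, here is a toy sketch in Python. It is not what QuPath does internally: it assumes that every pixel is described by a single number, uses a 1-nearest-neighbour rule as the classifier, and lets a list of known labels play the role of your eyes spotting mistakes in the preview. All the names (`predict`, `train_iteratively`, `pixels`) and values are hypothetical.

```python
from typing import List, Tuple

Example = Tuple[float, str]  # (feature value of a pixel, class name)


def predict(examples: List[Example], value: float) -> str:
    """1-nearest-neighbour: return the class of the closest example."""
    return min(examples, key=lambda ex: abs(ex[0] - value))[1]


def train_iteratively(pixels: List[Example],
                      first_examples: List[Example],
                      max_rounds: int = 10) -> List[Example]:
    """Add one corrected mistake per round until no mistake is left."""
    examples = list(first_examples)
    for _ in range(max_rounds):
        # Pixels that the current set of examples classifies wrongly.
        mistakes = [(v, c) for v, c in pixels if predict(examples, v) != c]
        if not mistakes:
            break
        # "Place a point" on the first mistake we see.
        examples.append(mistakes[0])
    return examples


# Toy data: low values are regular tissue, high values are epithelium.
pixels = [(0.10, "Other"), (0.20, "Other"), (0.35, "Other"),
          (0.45, "Epithelium"), (0.60, "Epithelium"), (0.90, "Epithelium")]

# First iteration: only one example per class.
final_examples = train_iteratively(
    pixels, [(0.10, "Other"), (0.90, "Epithelium")])
print(len(final_examples))  # 4
```

In this toy run, correcting the first mistake (0.45) creates a new one nearby (0.35), which is fixed in the next round. This is why we add a few points at a time and look at the result again before adding more.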

**Note:**

For pixel classification, examples can be provided using any selection tool and any type of annotation. That said, we make the general assumption that, in an image, two pixels located side by side (or very close to one another) carry very similar information. Based on that, we usually restrict ourselves to ![QP point selection](../../media/qp-icons/points-selection.png) point selections to provide examples. It makes it easier to see what the classifier uses in the end, and makes the training phase faster as the number of pixels to process is kept under control.

Also keep in mind that QuPath represents points by a little circle, but this is only for visualization. A point is an (x, y) coordinate and has no area: the pixels enclosed in the circle don't matter, only the one under the very center of the circle does.
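As a rough order of magnitude, the sketch below compares the number of training pixels provided by a few points with the number provided by a single area annotation. It is a simplification: the function names are hypothetical, and `downsample` is only an assumed stand-in for the working resolution of the classifier.

```python
def pixels_from_points(n_points: int) -> int:
    """Each point contributes exactly one training pixel."""
    return n_points


def pixels_from_rectangle(width: int, height: int, downsample: int = 1) -> int:
    """Every pixel enclosed in an area annotation is a training pixel.

    `width` and `height` are in full-resolution pixels; `downsample` is a
    hypothetical factor standing in for the classifier resolution.
    """
    return (width // downsample) * (height // downsample)


print(pixels_from_points(30))              # 30
print(pixels_from_rectangle(200, 200))     # 40000
print(pixels_from_rectangle(200, 200, 4))  # 2500
```

Even a small rectangle brings in thousands of nearly identical pixels, while 30 well-placed points bring in 30 distinct ones.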

#### a. Place your first set of examples

- It is now time to start our first iteration of the training process by creating our first set of examples.
- Before starting, make sure that you have the `Epithelium`, `Other` and `Ignore*` classes in your classes list.
- In the following instructions, we will use ![QP point annotations](../../media/qp-icons/points-selection.png) point annotations to make our examples, but feel free to experiment with any type of annotation. You can mix all types of annotations if you wish.
- Go to your "Sparse image" and click on the ![QP point annotations](../../media/qp-icons/points-selection.png) point annotation tool. You should now see a new floating window allowing you to create/edit point annotations.
- In this exercise, we will try to isolate:
    - the epithelium (including the crypts) (as `Epithelium`)
    - the regular tissue (as `Other`)
    - the enclosed background (as `Ignore*`)
- We don't have to take care of the non-enclosed background because we already eliminated it with our thresholder.
- In the floating window, you can click on "Add" three times (once for each item enumerated above). Each time you click, a new empty points cloud should appear in the list just above the "Add" button.
- Click on the first points cloud in the list, then on the `Epithelium` class, and bind them using "Set selected". Repeat this operation with the second points cloud and `Other`, as well as the third and `Ignore*`. If you did things correctly, the icon of each points cloud should now have the color of the class you bound it with, and the name of the class should be indicated next to it.
- Starting with the epithelium+crypt tissue, we will provide a few examples:
    - Limit yourself to 20 to 30 examples.
    - Place some points on all your patches.
    - Save often.
    - Don't place points close to the connection between two patches.
    - Don't place points next to the black background.
    - Place some points at the boundaries of each area.
- Repeat this step for the regular tissue and the enclosed background.
- Save your project.

**Warning:**

During the training process, reprocessing everything is a rather heavy operation that causes QuPath to crash very often. Saving your project after placing points, removing points, tweaking settings, ... should be a reflex.

#### b. Instantiate the pixel classifier

- In the top bar menu, go to "Classify" > "Pixel classification" > "Train pixel classifier...".
- A new floating window should appear with everything you need to configure our new pixel classifier.
- You can start by setting the **classification algorithm** to "Random trees". The other algorithms are not robust enough for what we are going to do (you are free to try them out though; the instructions don't change).
- We are looking for huge objects, nothing that depends on a very fine texture or on filaments, so we can limit our working **resolution** to "Moderate" or "High".

**Tip:**

In the step described below, we will use the "Live prediction", which keeps reprocessing everything every time you edit the settings or modify your examples. If you have a "modest" computer, zoom in a lot before activating it to avoid refreshing the whole image at once and crashing QuPath.

#### c. Choose the features and the radii

- To determine what's present at a pixel, the classifier builds a vector (a collection of numbers).
- To compute the values contained in this vector, N filters (Gaussian, Weighted deviation, ...) are applied at M different radii (2, 4, 8, ...) to each of your C channels. The vector therefore contains C×M×N values. Such a vector is computed for every single example pixel you provided.
- As you may expect, it is a heavy task, so we will try to avoid asking for useless filters and/or useless radii. The goal is to only keep the filters and radii that highlight what we are interested in.
- To start, you can:
    - Click on the "Edit" button at the **Classifier** line and activate the "Calculate variable importance".
    - Open the features list by clicking on the "Edit" button at the **Features** line, and only take the Hematoxylin and Eosin channels.
    - Activate the "Live prediction" to see in real time the effect of what you do.
- Now, we will iteratively choose our filters and radii:
    - Open the features list by clicking on the "Edit" button at the "Features" line.
    - In the "Scale", try to add/remove a few neighboring radii (whichever you want, but don't go above 12 if you don't want to kill your PC)
    - In the "Features", try to add/remove some filters (whichever you want)
    - Click the "OK" button (that should close the "Select features" window)
    - In the lower right corner of the "Train pixel classifier" window, you should have a dropdown menu containing "Show classification" for now. Click on it. It contains the result of every filter for every channel for all radii.
    - For a given filter, look at all the channels and all the radii. If the result is plain gray all the time (in the main viewer), it means that this filter is useless and you can remove it.
    - If the result is plain gray only for some radii, remove these radii from your list.
    - If everything present on the slide is highlighted or if the result is chaotic, the filter is just as useless as if it was plain gray, you can remove it (see examples below).

| Original image | Empty / chaotic | Interesting |
| :---: | :---: | :---: |
| ![original](../../media/tissue-classification/rtrees-original.png) | ![empty-chaos](../../media/tissue-classification/rtrees-chaos-nothing.gif) | ![interesting](../../media/tissue-classification/rtrees-interesting.gif) |
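In QuPath, this check is done by eye in the viewer. To give an idea of what "plain gray" means numerically, here is a minimal Python sketch: a filter output whose values barely vary across the image carries no information. The function name `is_plain_gray`, the `tolerance` value and the two small arrays are hypothetical, and the sketch does not try to detect the "chaotic" case.

```python
from statistics import pstdev
from typing import List


def is_plain_gray(feature_map: List[List[float]],
                  tolerance: float = 1e-3) -> bool:
    """True if the filter output is (almost) the same value everywhere."""
    values = [v for row in feature_map for v in row]
    return pstdev(values) < tolerance


# A filter output that is uniformly gray: nothing is highlighted.
flat = [[0.5, 0.5, 0.5, 0.5] for _ in range(3)]
# A filter output with two clearly distinct regions.
structured = [[0.0, 0.0, 1.0, 1.0] for _ in range(3)]

print(is_plain_gray(flat))        # True
print(is_plain_gray(structured))  # False
```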

- If you struggle to find your features and radii, you can try the following settings (they should be a decent starting point):
    - Resolution: "High"
    - Features window:
        - Channels: "Hematoxylin", "Eosin"
        - Scales: 4.0, 8.0, 12.0
        - Features: Gaussian, Laplacian of Gaussian, Weighted deviation, Gradient magnitude
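With the settings suggested above, the size of the vector built for each pixel can be worked out with the C×M×N rule given earlier. The Python sketch below only does this counting; the way each entry is named is a hypothetical format chosen for readability, not the one QuPath uses.

```python
from itertools import product

channels = ["Hematoxylin", "Eosin"]                       # C = 2
scales = [4.0, 8.0, 12.0]                                 # M = 3
features = ["Gaussian", "Laplacian of Gaussian",
            "Weighted deviation", "Gradient magnitude"]   # N = 4

# One value per (channel, radius, filter) combination.
feature_names = [
    f"{channel} | {feature} | radius={scale}"
    for channel, scale, feature in product(channels, scales, features)
]

vector_length = len(feature_names)
print(vector_length)     # 24
print(feature_names[0])  # Hematoxylin | Gaussian | radius=4.0
```

Adding a single extra radius would bring the vector to 2×4×4 = 32 values per pixel, which is why useless filters and radii are worth removing.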

#### d. Refine the classification

- You are now done choosing your filters and radii; it is time to add more examples to refine the classification.
- Make sure that the dropdown menu is back to "Show classification".
- Look where your classifier made mistakes and add a point there.
- You can use the ![QP toggle preview](../../media/qp-icons/show-preview.png) "toggle classification" button to show/hide the preview or use the slider beside it to make it semi-transparent.
- Try to keep the number of points for each class approximately in the same range.
- Every time you notice an improvement:
    - Save your project
    - Save your classifier by giving it a name and clicking "Save" (in the "Train pixel classifier" window)
- Once you are satisfied with the result, save the classifier a last time, save your project, and you can now close the "Train pixel classifier" window.
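The advice about keeping a similar number of points per class can be checked with a quick count. The Python sketch below assumes that you have the class of each point available as a list of names; the function name `underrepresented`, the `max_ratio` threshold of 2 and the counts are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def underrepresented(labels: List[str],
                     max_ratio: float = 2.0) -> Dict[str, int]:
    """Return the classes that have far fewer points than the largest one."""
    counts = Counter(labels)
    if not counts:
        return {}
    largest = max(counts.values())
    return {name: n for name, n in counts.items() if largest / n > max_ratio}


# Hypothetical point counts after a few refinement rounds.
labels = ["Epithelium"] * 28 + ["Other"] * 25 + ["Ignore*"] * 9
print(underrepresented(labels))  # {'Ignore*': 9}
```

Here, `Ignore*` has about three times fewer points than `Epithelium`, so it would be the class to look at when placing the next examples.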