Skip to content

embl-cba/cats

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Example movies

Example images

Overview

CATS is a big image data compatible Fiji plugin for trainable image segmentation. The code is partly based on Fiji's Trainable Weka Segmentation (TWS) plugin.

Similar to Fiji's TWS and the stand-alone software ilastik, CATS learns an image segmentation from user drawn annotations by computing image features and feeding them into a Random Forest classifier. Again similar to TWS and ilastik, the image features of CATS are based on the Eigenvalues of the Hessian matrix and the Structure Tensor, which provide rotationally invariant information about edges and ridges in an image's gray-value distribution.

Deep features

In order to improve context learning, CATS however also computes "Deep Eigenvalue Features", such as, e.g.:

HS( Bin3( SM( Bin3( HL( image ) ) ) ) )

, with abbreviations

HS = Largest Eigenvalue of Hessian Matrix
HS = Smallest Eigenvalue of Hessian Matrix
SM = Middle Eigenvalue of Structure Tensor
Bin3 = 3x3x3 Binning 

, where the alteration between image feature computation and image binning is inspired by the architecture of deep convolutional neural networks (DCNN) as, e.g., used in the popular 3D U-Net.

Similar to ilastik, but in contrast to the TWS PlugIn, CATS internally uses a tiling strategy for image segmentation and thus can be applied to arbitrarily large image data sets.

Furthermore, CATS supports:

  • 2D and 3D images
  • Time-lapse image data
  • Multi-channel image data
  • Anisotropic image calibration
  • Cluster computing (Slurm)
  • Label review

Installation

CATS runs as a PlugIn within Fiji.

  • Please install Fiji
  • Within Fiji, please enable the following Update Sites:
    • CATS
    • ImageScience
    • BigDataProcessor
      • Improves the performance of segmenting images larger than the RAM of your computer (see below).

Starting CATS

CATS can be found within Fiji's menu tree:

  • [Fiji > Plugins > Segmentation > CATS > CATS]

Using CATS

Input Image Data

As it is common in Fiji, CATS operates on the currently active image window. Thus, before CATS can be started, one must open an image. As CATS supports multi-channel 2D and 3D time-lapse data any image can be used as input.

Big Image Data

For analyzing "big" image data (i.e., data that exceeds the available RAM) ImageJ's VirtualStack functionality should be used; see

Input Image Calibration

Upon start-up CATS asks the user to confirm the image calibration. It is very important that this information is fully correct.

Specifically, please pay attention:

  • Are the number of z-planes and frames correct?
    • Sometimes z and t are mixed up...
  • Are the voxel sizes in x,y, and z correct?
    • CATS is able to compute anisotropic image features!

Results Image Setup

Next, CATS is providing the user with the opportunity to allocate the results image (containing the pixel probability maps) either in RAM or on disk. We generally recommend using the disk, because (i) there is hardly any performance loss, (ii) results are always saved, and (iii) data larger than RAM can be handled. The data format for storing the temporary pixel probabilities are single plane Tiff files; once the segmentation is finished there are multiple options for export (s.b.).

Tip

The following folder structure proved to work well:

  • my-segmentation-project
    • input-image
      • put your input image data here
    • probabilities
      • containing the temporary pixel probability maps generated by CATS, i.e. many Tiff files
    • training
      • containing training data, i.e. text files ending with ".ARFF"
    • export
      • containing exported probabilities, e.g. Hdf5 or Tiff stacks

Training and applying a classifier

To train a pixel classifier, the freehand line tool from ImageJ is used to draw labels on the image. One can assign labels to a certain class by either

  • clicking the respective button in the "Labels" window, or
  • typing respective the keyboard short-cuts 1,2,...

Once a couple of labels have been added one can click [Train Classifier], which will compute image features and train a random forest classifier. Every time the training is finished a file dialog will prompt the user to save the training data to disk. This ensures that the valuable training data, containing all the manually drawn labels is always safe.

Subsequently, one can use either ImageJ's freehand line or the rectangle selection tool to select a region of interest and click [Apply Classifier]. The Classification Range selection can be used to further specify the extend of the region to be classified.

Keyboard shortcuts

  • [P] Toggles pixel probability overlay
  • [R] Toggles results (segmentation) overlay
  • [U] Toggles uncertainty overlay

Actions

There are several actions available, which can be executed by clicking the [Execute Action] button.

Add class

Adds a new pixel probability class.

Change class names

Changes the class names.

Change class colors

Changes the colors of the pixel probability overlays.

Export results

Presents several options to export the current pixel probabilities.

Review labels

When adding many labels in a large data set, it is very possible to make mistakes, i.e. by accident assign a label to a wrong class. The review labels action allows to browse through all added labels and remove false labels, using ImageJ's ROI Manager user interface.

Save classifier

Saves the current pixel classifier to disk such that it can be reused, e.g. using the Java API of CATS or using the "Apply classifier on cluster" action (s. b.).

Load classifier

Loads a classifier such that it can be applied on the current image.

Change feature settings

Shows a user interface for changing the image feature settings.

Apply classifier on slurm cluster

Applies the current cluster by sending jobs to a slurm managed computer cluster. In addition to access to such a cluster, this also requires that all data (input image, output probabilities, and a classifier) are stored in a location that is accessible to the cluster nodes.

Training a classifier on multiple data sets

Let's say you have three different files: em00.tif, em01.tif, and em02.tif; and want to train a classifier on all of them. This is how you can do it.

First train on the first image, this will result in an em00.tif.ARFF file on your disk. Close CATS and close the first image.

Now open the second image, e.g., em01.tif, and start CATS again, on the second image. Next, within CATS choose Load labels and point to the labels file from the previous image, i.e. em00.tif.ARFF. If you closely inspect the log window you will find this message:

Loaded instances relation name: em00.tif
does not match image name: em01.tif
Labels will thus not be populated.

This means CATS realised that the labels in this file belong to another image and will thus not show the labels as ROIs on this image. However, the labels file also contains training data from which a classifier can be trained and CATS will do so.

If you now click Apply classifier you will see how well the training data on the first image works on this image. Most likely there will be mistakes in the classification. Thus, you can now, as usual, paint labels in the regions where improvements are needed.

Now, when you select Train classifier it will save a new file on disk: em01.tif.ARFF, containing the training data specific to this image. However, also a window will pop up with the title Instances selection where you can select either/or training instances from the current or previous image. The point is that CATS now has access to two different training data sets and would like to know from which one (or both) it should train a classifier. Typically you would select both. The classifier you have now is trained on both training data. That means if you now choose to Save classifier, this will contain the training information of all selected data sets.

To combine training data from more data sets (e.g. em02.tif) you can simply add as many as you want by the Load labels action.

Instances selection

image

Data management

If you follow above workflow for training on multiple data sets, your data on disk might look somewhat like this:

image

image

The main idea is that you have one .ARFF file per image data set, but you can combine those .ARFF files into one combined classifier.

Further information

Continue working on an existing project

  • Open your input image in ImageJ
  • Start CATS
  • Choose the corresponding probabilities folder
  • Use the "Load labels" action to load your previous annotations (stored in a ".ARFF" text file).

Logging

Information about what is happening is printed into ImageJ's log window. In addition, when you chose to save your classification results to disk (see above), another folder with the ending "--log" will be automatically created next to your results folder. The content of the logging window will be constantly written into a file in this folder.

Cite

This github repository can be cited (registered at ZENODO):