Stain Normalizer #186

drbeh · 2021-12-18T01:24:54Z

This PR adds a stain extractor and normalizer for Hematoxylin and Eosin (H&E), implemented in CuPy.

GPUtester · 2021-12-18T01:24:57Z

Can one of the admins verify this patch?

quasiben · 2021-12-18T04:01:50Z

add to allowlist

quasiben · 2021-12-18T04:05:50Z

Thanks for adding this, @drbeh. However, I'm trying to find out whether there is an equivalent of this in scikit-image. Do you know if there is?

drbeh · 2021-12-19T04:23:59Z

Hi @quasiben, I am not aware of an equivalent to this stain normalizer. The most relevant thing in scikit-image that I know of is separate_stains, which is related to the stain extractor part of this PR, but as far as I know it is based on different methods.

It might be useful to see how we can connect these two and maybe leverage skimage for improving our stain normalizer.

grlee77 · 2022-03-02T13:24:44Z

There has been a recent, related discussion on just having a normalization to Hematoxylin & Eosin in scikit-image at:
https://github.com/scikit-image/scikit-image/discussions/6065#discussioncomment-2278816

drbeh · 2022-03-24T19:45:29Z

@gigony @quasiben taking a closer look at scikit-image, I realized that although the functionality of this PR is closely related to stain separation in skimage, it is orthogonal/complementary to it.

In skimage (@grlee77, please correct me if I'm wrong), the current functionality starts from a given stain matrix (hed_from_rgb) and separates the stains in an RGB image using skimage.color.rgb2hed.
Here, however, we first extract the stain matrix from the image (HEStainExtractor) and then use this matrix to normalize the RGB image (StainNormalizer).

grlee77 · 2022-03-24T20:08:40Z

Yes, in scikit-image there is a fixed matrix that gets applied to do the stain normalization. The user could also apply their own matrix or use one of the other predefined ones with skimage.color.separate_stains/skimage.color.combine_stains. My understanding is that the same values are not ideal across different labs/experiments and need to be adjusted to some extent? (my research background is in medical imaging / radiology as opposed to microscopy)

grlee77 · 2022-03-24T20:12:26Z

Ah, I see there is a reference to a Macenko et al. paper in the docstrings. I will take a look at that to understand the method proposed here a bit better.

grlee77 · 2022-05-09T18:53:49Z

python/cucim/src/cucim/core/operations/color/stain_normalizer.py

+        absorbance = absorbance[cp.all(absorbance > self.beta, axis=1)]
+        if len(absorbance) == 0:


Suggested change

-        absorbance = absorbance[cp.all(absorbance > self.beta, axis=1)]
-        if len(absorbance) == 0:
+        absorbance = absorbance[:, cp.all(absorbance > self.beta, axis=0)]
+        if absorbance.size == 0:

Based on the prior line, it looks like this method expects an image with channels along the first axis.

I have been refactoring things while testing locally. I think we can support a channel_axis argument to make it clear which axis of the input is expected to hold the channels.

It is not clear from the Macenko et al. publication whether this should ideally be all(absorbance > self.beta), as here, or whether some other reduction such as the mean or max absorbance should be used.

drbeh · 2022-05-12

It seems that everything they used for this filtering is based on heuristics. Even the proposed value for beta doesn't have any scientific justification.
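To make the axis question above concrete, here is a minimal sketch of the two indexing expressions on a channels-first array. It uses NumPy as a CPU stand-in for CuPy (the boolean indexing behaves the same way). The function name `filter_absorbance`, the sample values, and the threshold of 0.15 are assumptions for illustration, not code from this PR.

```python
import numpy as np  # CPU stand-in for CuPy; the indexing semantics match


def filter_absorbance(absorbance, beta):
    """Drop pixels that are not above `beta` in every channel.

    `absorbance` is assumed to be channels-first with shape (3, n_pixels).
    (Hypothetical helper, not the PR's implementation.)
    """
    # Reduce over the channel axis (0) to get one boolean per pixel ...
    keep = np.all(absorbance > beta, axis=0)
    # ... and use it to select columns (pixels), not rows (channels).
    return absorbance[:, keep]


# Four made-up pixels; columns are pixels, rows are R, G, B absorbance.
absorbance = np.array(
    [
        [0.50, 0.05, 0.40, 0.90],
        [0.60, 0.70, 0.10, 0.80],
        [0.30, 0.20, 0.50, 0.70],
    ]
)
beta = 0.15  # illustrative threshold

filtered = filter_absorbance(absorbance, beta)
print(filtered.shape)  # (3, 2): pixels 0 and 3 survive

# Reducing over axis=1 instead yields one boolean per channel, so the
# indexing drops whole channels rather than individual pixels.
per_channel_mask = np.all(absorbance > beta, axis=1)
print(absorbance[per_channel_mask].shape)  # (1, 4)
```

With channels along the first axis, the axis=1 form does not raise an error, which is why the mismatch is easy to miss.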

grlee77 · 2022-05-09T18:59:11Z

python/cucim/src/cucim/core/operations/color/stain_normalizer.py

+        """
+        # calculate absorbance
+        image = image.astype(cp.float32, copy=False) + 1.0
+        absorbance = -cp.log(


Macenko et al. use log10 rather than log in the publication, but the existing open source implementations I found all use the natural log. I guess either is valid, and it only affects the absolute scale of the beta threshold. Empirically, computation of log is better optimized than log10, so we should probably keep this as is.

drbeh · 2022-05-12

You are right! It only affects beta, and as long as we use the matching inverse (exp for log and 10** for log10), both are fine.
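A small sketch of why the choice of base only rescales the threshold: absorbance computed with the natural log is ln(10) times the log10 absorbance, so dividing beta by ln(10) selects the same pixels. NumPy is used as a CPU stand-in for CuPy, and the transmittance values and the beta of 0.15 are made up for illustration.

```python
import numpy as np  # CPU stand-in for CuPy

LN10 = np.log(10.0)

# Hypothetical transmittance values I / I0 in (0, 1].
transmittance = np.array([0.9, 0.5, 0.1, 0.02])

# Absorbance under the two conventions; a_ln == LN10 * a_log10.
a_ln = -np.log(transmittance)
a_log10 = -np.log10(transmittance)

# Rescaling the threshold by ln(10) keeps the same pixels.
beta_ln = 0.15  # illustrative threshold on the natural-log scale
beta_log10 = beta_ln / LN10
mask_ln = a_ln > beta_ln
mask_log10 = a_log10 > beta_log10

# Each convention needs its own inverse to get back to transmittance.
back_from_ln = np.exp(-a_ln)
back_from_log10 = 10.0 ** (-a_log10)

print(np.array_equal(mask_ln, mask_log10))
print(np.allclose(back_from_ln, back_from_log10))
```

Mixing the two, for example applying exp to a log10 absorbance, would not recover the original intensities.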

grlee77 · 2022-05-09T19:09:45Z

I agree that this Macenko method is highly cited and seems like a good starting point. I see that in determining the stain vectors, least squares is used here, which is readily available in CuPy.

Other, related techniques use different solutions, but those can potentially be added later.

For example, there is a technique based on non-negative matrix factorization here:
https://github.com/InsightSoftwareConsortium/ITKColorNormalization

The Macenko method is also implemented (under an MIT license) here: https://github.com/Peter554/StainTools
It differs from the one in this PR in that the normalization uses LASSO (via the SPAMS library) instead of least squares. I don't think we have an equivalent solver on the GPU yet (at least it is not in CuPy or cuCIM, but I haven't fully checked other RAPIDS projects).

I will update this PR with specific suggestions later today after testing a bit more with some actual images (such as the ones used for the demo in the ITKColorNormalization repo).

drbeh · 2022-05-12T14:25:29Z

Closing this in favor of #273

closes #96

This PR resumes work that was started in #186. Given the large overall refactoring, it was not feasible to make the suggestions there as individual comments. Overall the approach is the same Macenko method that was proposed in #186. I spent quite a bit of time refactoring for performance and to separate out some aspects so that it will be easier to add additional related methods in the future. I find about 3x improvement for the case here vs. the one in #186.

A summary of the changes relative to #186:

### Enhancements

- Added a `channel_axis` argument that can be used to specify which axis of the input array corresponds to color channels.
- More descriptive docstrings.

### General refactoring

- Now also provides a function-based interface to more closely match the typical `cucim.skimage` style. The existing class-based interface was kept as well. There is a small amount of redundancy in providing both, so we should decide if this is worth it.
- Aspects like conversion to/from absorbance units were separated out into separate functions.

### Performance Related Changes

- Used `cupy.fuse` to fuse multiple kernel operations needed for absorbance calculations into a single GPU kernel. This gives ~4x improvement in conversions to/from absorbance space.
- Added an `image_type` argument that defaults to 'intensity', but can be set to 'absorbance' to indicate that the image is already in absorbance space. This is used to avoid redundant conversions during stain normalization.
- Added a `_covariance` function that is a simplified and optimized version of `cupy.cov`. It runs 4x faster for me for the float32 test case I tried on a roughly size 2000x2000 image.
- Added a second 'ortho' method aside from least-squares for estimating the raw concentrations. The `_complement_stain_matrix` helper adds a third column that is orthogonal to the two estimated stain vectors, so that a standard matrix inverse can be used. This is much faster in practice than calling `cupy.linalg.lstsq` and gives identical results for almost all voxels in test images. A tiny fraction of voxels differed in uint8 intensity by a magnitude of 1, but this is likely just due to differences in the rounding of finite precision floating point values. This approach is based on the one used by HistomicsTK software (Apache 2.0 licensed).

### Test Changes

- Refactored to use `pytest.mark.parametrize` instead of adding a dependency on [parameterized](https://github.com/wolever/parameterized).
- There is an increase by 1 in some expected pixel values in the "expected" result for some test cases due to replacement of casting to integers (floor operation) with the use of rounding upon conversion of `float32` back to `uint8` during color normalization.

Authors:
- Behrooz Hashemian (https://github.com/drbeh)
- Gregory Lee (https://github.com/grlee77)

Approvers:
- Gigon Bae (https://github.com/gigony)
- https://github.com/jakirkham
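The 'ortho' idea described above can be sketched as follows. This is a rough illustration under stated assumptions, not the code from #273: it uses NumPy as a CPU stand-in for CuPy, the helper name `complement_stain_matrix_sketch` is hypothetical, and the two stain vectors and the noise level are made-up illustrative values. Because the added third column is orthogonal to both stain vectors, the first two rows of the inverse solution should coincide with the least-squares solution up to floating-point error.

```python
import numpy as np  # CPU stand-in for CuPy


def complement_stain_matrix_sketch(stains):
    """Append a unit column orthogonal to both stain vectors.

    `stains` has shape (3, 2); the result has shape (3, 3).
    (Hypothetical re-creation of the idea, not the PR's helper.)
    """
    third = np.cross(stains[:, 0], stains[:, 1])
    third = third / np.linalg.norm(third)
    return np.column_stack([stains, third])


# Illustrative H&E stain vectors (columns), normalized to unit length.
stains = np.array([[0.65, 0.07], [0.70, 0.99], [0.29, 0.11]])
stains = stains / np.linalg.norm(stains, axis=0)

rng = np.random.default_rng(0)
true_conc = rng.uniform(0.0, 2.0, size=(2, 1000))
# Noise pushes the absorbance off the plane spanned by the two stains.
absorbance = stains @ true_conc + 0.01 * rng.standard_normal((3, 1000))

# Least-squares estimate of the two stain concentrations.
conc_lstsq = np.linalg.lstsq(stains, absorbance, rcond=None)[0]

# 'ortho' estimate: invert the completed 3x3 matrix, keep the first two rows.
full = complement_stain_matrix_sketch(stains)
conc_ortho = (np.linalg.inv(full) @ absorbance)[:2]

# The two estimates are expected to agree up to floating-point error.
print(np.allclose(conc_lstsq, conc_ortho))
```

The inverse of the 3x3 matrix is computed once and applied with a single matrix product, which is one plausible reason for the speedup reported above.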

drbeh added 2 commits December 18, 2021 01:23

Implement H&E statin extractor and normalizer

74c4059

Implement unittests

0c679cb

gigony self-assigned this Jan 11, 2022

gigony mentioned this pull request Feb 10, 2022

[FEA] Add Stain Normalization operation (in Python) #96

Closed

grlee77 mentioned this pull request Apr 9, 2022

Enhance gallery example on color separation of stained tissues scikit-image/scikit-image#6333

Open

grlee77 reviewed May 9, 2022

View reviewed changes

grlee77 mentioned this pull request May 11, 2022

Stain Normalizer (v2) #273

Merged

drbeh closed this May 12, 2022
