DesignNotes

What is this document for?

This is a design and discussion doc for Coloc 2, the next-generation colocalization analysis tool for Fiji/ImageJ2. It needs a bunch of tidying up and use of wiki markup.

Here we describe the features of old/deprecated/current implementations of ImageJ1 colocalization analysis plugins (Colocalization Threshold, Colocalization Test, JACoP, etc.). We also keep a wish list of desired features for Coloc 2, the next-generation colocalization analysis tool for Fiji/ImageJ2, which uses the new imglib generic image data container.

Instructions for Editing: It's a wiki, go for it. Sign up for a GitHub account to be able to edit. If we get spammed, we can restrict it to contributors only.

To Do list for Coloc 2 plugin:

Merge with JACoP?

Combine forces with the JACoP developers, as there is much duplicated functionality and there are lots of good ideas in JACoP. We should merge the two projects into one. Dan should contact the developers (see http://imagejdocu.tudor.lu/doku.php?id=plugin:analysis:jacop_2.0:just_another_colocalization_plugin:start#changelog) and also Olivier Burri at EPFL, who wrote code for JACoP automation:

Date: Fri, 10 Jul 2015 14:23:37 +0000 From: Burri Olivier olivier.burri@EPFL.CH Subject: Re: JaCoP Tools

Hi Patrizia, hi all,

So as a quick step by step

Use Fiji to have access to the update sites
Install JaCoP http://rsb.info.nih.gov/ij/plugins/track/jacop.html
Install ActionBar http://imagejdocu.tudor.lu/doku.php?id=plugin:utilities:action_bar:start
Subscribe to the PTBIOP Update site by going to the following menu Help - Update - Manage Update Sites Check the box next to PTBIOP

After restarting ImageJ, the Plugins Menu should have a BIOP Folder where you will find, among others JaCoP Tools

I made a quick documentation to help you get started https://drive.switch.ch/public.php?service=files&t=7c814b8369803ddbf07279596a5186ed

Standardisation of output format:

What goes in the PDF (or other format) output file? And how:

Images:
Max projections of both z stacks (thumbnails, a visual cue as to what image it was). If there is a mask or an ROI in use, then we should show only those pixels in the max projection images.
Mask channel, if any (Max Intensity Projection (MIP)? Average projection? 3D off-axis render?)
2D histogram / fluorogram / scatterplot:
1. log scale - Fire LUT, 256x256 bins
2. show thresholds and linear regression
Coloc map: some options...
average or sum projection of “colocalized” pixels - thumbnail image.
magenta/green merge, plus coloc map as white overlay:
1. all pixels above both thresholds white (255)
2. or use the mean intensity of the 2 pixel values as greyscale
Text:
Table of numerical results:
1. Pearson's r (global)
2. Pearson's r (coloc, which is Pearson's r for pixels above both thresholds)
3. Pearson's r below thresholds (approximately 0 means the auto threshold method worked)
Table of warnings/checks:
1. did the auto threshold method work, i.e. is Pearson's r below thresholds close to zero?
2. % zero-zero pixels (should be small?)
3. intercept far from 0
4. regression slope far from 1 (could cause numerical problems in auto threshold?)
5. populations of pixel values for each channel are far from a Gaussian distribution, meaning assumptions for Pearson's r do not hold?
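
To make the three Pearson rows in the results table concrete, here is a rough sketch in Python (not Coloc 2 code). It assumes exclusive thresholds (>) and assumes "below thresholds" means every pixel that is not above both thresholds; the function names are made up for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson's r for two equal-length sequences; None if it is undefined."""
    n = len(xs)
    if n < 2:
        return None
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None  # a constant channel would mean division by zero
    return sxy / math.sqrt(sxx * syy)

def pearson_report(ch1, ch2, t1, t2):
    """Global r, r above both thresholds, and r for the remaining pixels."""
    pairs = list(zip(ch1, ch2))
    # Assumption: exclusive thresholds, "above" means above BOTH thresholds.
    above = [(a, b) for a, b in pairs if a > t1 and b > t2]
    below = [(a, b) for a, b in pairs if not (a > t1 and b > t2)]

    def r(selection):
        return pearson([a for a, _ in selection], [b for _, b in selection])

    return {"global": r(pairs), "above": r(above), "below": r(below)}
```

A "below" value near zero is the check listed in the warnings table; a None result is a natural place to raise a warning to the user.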

Standardisation of threshold calculation methods

Lots of folks want to use manual thresholds, which is dangerous because they are not robust against variations in image/staining/dye/laser/lamp intensity, detector zero offset, illumination power, etc. Costes' autothreshold is a nice method, but sometimes folks think they need something a little less strict, as they feel it sets the thresholds too high, or vice versa. Perhaps, to appease folks and nudge them towards Costes' autothreshold and away from manually set thresholds, we could have some other simple autothresholds based on, e.g.:
10% of the image mean or mode value
Mean minus 2 or 3 standard deviations
or some other histogram method like Otsu etc.
Is this discussed in the literature, e.g. in the Costes paper?
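
For discussion, a minimal Python sketch of what such simple autothresholds could look like. Nothing here is from an existing plugin: the function names, the default fraction of 10% and k = 2 are illustrative assumptions, and the Otsu part is a plain textbook version for non-negative integer pixel values.

```python
import statistics
from collections import Counter

def simple_thresholds(pixels, fraction=0.10, k=2.0):
    """Candidate thresholds from basic image statistics (illustrative only)."""
    mean = statistics.fmean(pixels)
    sd = statistics.pstdev(pixels)
    mode = Counter(pixels).most_common(1)[0][0]
    return {
        "fraction_of_mean": fraction * mean,
        "fraction_of_mode": fraction * mode,
        "mean_minus_k_sd": mean - k * sd,
    }

def otsu_threshold(pixels):
    """Otsu: pick t that maximises between-class variance (pixels <= t vs > t)."""
    hist = Counter(pixels)
    total = len(pixels)
    sum_all = sum(value * count for value, count in hist.items())
    best_t, best_var = None, -1.0
    w0 = 0      # number of pixels in the lower class
    sum0 = 0    # intensity sum of the lower class
    for t in sorted(hist):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w1 == 0:
            break  # upper class is empty, nothing left to split
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t
```

Note that "mean minus k standard deviations" can go negative for skewed images, which is one more reason to report the threshold that was actually used.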

Bugs/Missing Features:

GUI

Proper GUI (probably done in Swing?)

Make a proper GUI to replace the prototype SingleWindowDisplay GUI. It could look a bit like JACoP, with user warnings for missing required info, to encourage proper use.

Multichannel images in a hyperstack or composite image: input image support shouldn’t need to split channels, just choose the right ones.
Numerical / text results in a normal ImageJ results window: checkbox in GUI (or make it the default behaviour) so results go into a normal ImageJ results window, for scriptability.
Progress bar for everything (esp. the Costes statistical significance test, since that's the slow bit).
Need axes markers on the 2D scatterplot, or show the user if the axes are not on the same scale, e.g. ch1 goes 0-255 and ch2 goes 0-4095, so the slope is not what it seems from looking at the plots.
Calibration bar next to scatterplots, so the reader sees how many pixels are in the bins.
2D histogram needs auto thresholds and regression line shown (toggle on and off); should these be line ROIs as overlays? Maybe put the straight-line equation in the display also, so when the gradient is not 1 but looks like it, this is clear.
Also need a 2D histogram accessible as a plain image for further processing...

Single Window Display - prototype GUI (currently in use)
Prevent window flicker on run.
Copy results button: pasting into Apple Mail hangs.
Proper pack() so the log checkbox text is not falling out of the GUI.
Maybe get a better name, like Colocalization Analysis (Pixel Intensity Based).
Test if images are actually identical and warn the user (causes a numerical problem, division by 0, in Pearson/Manders?)
Write docs on Fiji wiki (Dan)

Algorithms

Spearman's rank correlation coefficient (http://en.wikipedia.org/wiki/Spearman_correlation) to find non-linear correlations; biology probably has some. Our images hardly ever have Gaussian intensity distributions, which is an assumption for Pearson's r. Does Spearman get round that? Johannes Schindelin added it (or was it Tom Kazimiers?)
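
As a reminder of what Spearman does, a small Python sketch (not the Coloc 2 implementation): it is Pearson's r computed on the ranks of the pixel values, with tied values sharing the mean of their ranks. A monotonic but non-linear relation therefore scores 1.

```python
import math

def average_ranks(values):
    """Ranks 1..n; tied values all get the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over the run of values equal to values[order[i]]
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(xs, ys):
    """Plain Pearson's r (assumes neither input is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def spearman(xs, ys):
    """Spearman's rho: Pearson's r of the rank-transformed data."""
    return pearson(average_ranks(xs), average_ranks(ys))
```

For example, with ch2 equal to the cube of ch1, spearman gives 1 while pearson gives less than 1.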

Kendall's tau - Johannes Schindelin added it?

Fixed Bugs:

Fixed: Update image result so it's not black until you choose another result image.

Fixed: Thresholded Manders can give values > 1, this must be a bug (Fixed). Use dataset Btn2deltaNLSGFP_Sis1mCh9_RDR_D3D.dv or an area of the clown’s face that is bright.

Notes from the literature:

Manders et al 1993, J Microscopy, Vol 169, pt 3, March 1993, pp 375-382
Re: Pearson's coefficient (cf. Costes below): Quote "(Pearsons) It provides information about the similarity of shape without regard to the average intensity of the signals"
Manders' coefficients: these are calculated for each channel under this condition: a pixel's intensity is added to the sum for that channel only if it is non-zero in the other channel. Note, this definition means that pixels with zero red are added to the red sum if their green value is non-zero. One doesn’t apply a 2-channel threshold simultaneously here, as one does for thresholded Manders’ a la Costes. But since adding 0 does nothing, no problem.
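
The definition above, as a minimal Python sketch (function name made up; returning None for an all-zero channel is my choice, not something from the paper):

```python
def manders(ch1, ch2):
    """Original (unthresholded) Manders' M1 and M2 as described in the notes.

    A pixel's ch1 intensity counts towards M1 only if ch2 is non-zero there,
    and vice versa for M2. The denominators are the total channel intensities.
    """
    total1 = sum(ch1)
    total2 = sum(ch2)
    if total1 == 0 or total2 == 0:
        return None, None  # an empty channel: coefficients are undefined
    m1 = sum(a for a, b in zip(ch1, ch2) if b > 0) / total1
    m2 = sum(b for a, b in zip(ch1, ch2) if a > 0) / total2
    return m1, m2
```

With ch1 = [0, 5, 5, 0] and ch2 = [3, 0, 4, 0], the pixel (0, 3) contributes 0 to the M1 numerator, which is the "adding 0 does nothing" case.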

Costes et al 2004, Biophysical Journal 86(6) 3993–4003
Quote "We approximate this linear behavior by a least-square fit in the two-dimensional histogram based on orthogonal regression. The slope derived from the least-square fit is directly proportional to the Pearson correlation coefficient r, and therefore takes into account the overall correlation present in the image."
(Dan) Not quite sure what the point of this statement is. Remember here that Pearson’s r does not depend on alpha (the ratio of the mean intensities of the 2 channels)! But if the gradient of the orthogonal regression is proportional to Pearson’s r, what is the constant of proportionality here? Can anyone comment? I made the mistake before of saying Pearson's r IS dependent on the slope of the regression, which it is not, and was corrected by at least Jose Vina from SVI (Huygens).
(Pete B) I also don’t see how this quote can be correct. The slope is dependent upon the relative intensities of each channel: multiply a channel by a constant and the slope changes, but Pearson’s correlation coefficient stays the same.
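
Pete B's point is easy to check numerically. The Python sketch below assumes "orthogonal regression" means the usual total least squares fit (the closed-form slope used here is the textbook one, not taken from the Costes paper or any plugin) and needs a non-zero covariance.

```python
import math

def pearson_and_orthogonal_slope(xs, ys):
    """Return (Pearson's r, slope of the orthogonal regression line)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    r = sxy / math.sqrt(sxx * syy)
    # Total least squares slope; assumes sxy != 0.
    slope = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)
    return r, slope

ch1 = [1, 2, 3, 4, 5]
ch2 = [2, 1, 4, 3, 6]
r_a, slope_a = pearson_and_orthogonal_slope(ch1, ch2)
# Multiply channel 2 by a constant: r is unchanged, the slope is not.
r_b, slope_b = pearson_and_orthogonal_slope(ch1, [10 * v for v in ch2])
```

Note also that the orthogonal slope does not simply scale by the same constant as the channel, which is part of why a scale-invariant regression is attractive.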

Thresholded Manders' tM1 and tM2 use the Costes threshold(s). Unlike the original M1 and M2, these ARE calculated using BOTH channels' thresholds according to the text: Quote "Hence, our approach for identifying colocalized pixels proceeds by successively classifying pixels as being colocalized if their intensities, IR, IG are both above the threshold pair: T (Tgreen) and aT + b (Tred), respectively."

But in equation 6, I see no mention of BOTH thresholds in each of the M1 and M2 calculations; the equations only use the threshold for that channel, so it's a bit confusing here.

Thresholds can be represented as 2 threshold values, one for each channel, or, since we have the regression fit line (y = ax + b) coefficients a and b, as a single value with ThreshCh2 = a*ThreshCh1 + b,
since the thresholds are found by moving along the regression line! We don't need to store the Ch2 threshold, as it's readily calculated from ThreshCh1, a and b, which, once calculated, never change for a pair of images.
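
A hedged Python sketch of the "both thresholds" reading of the text, with the Ch2 threshold derived from the Ch1 threshold and the regression line. Exclusive thresholds and the use of the total channel intensity as denominator are assumptions made for illustration; as noted above, the paper's equation 6 may be read differently.

```python
def ch2_threshold(thresh_ch1, a, b):
    """Ch2 threshold on the regression line y = a*x + b."""
    return a * thresh_ch1 + b

def thresholded_manders(ch1, ch2, thresh_ch1, a, b):
    """tM1, tM2 using pixels above BOTH thresholds (assumed reading)."""
    thresh_ch2 = ch2_threshold(thresh_ch1, a, b)
    # Colocalized pixels: both intensities strictly above their thresholds.
    coloc = [(x, y) for x, y in zip(ch1, ch2)
             if x > thresh_ch1 and y > thresh_ch2]
    # Assumption: denominators are the total intensity of each channel.
    tm1 = sum(x for x, _ in coloc) / sum(ch1)
    tm2 = sum(y for _, y in coloc) / sum(ch2)
    return tm1, tm2, thresh_ch2
```

Only thresh_ch1, a and b need to be stored; thresh_ch2 is recomputed on demand.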

Adler et al 2008 (Dan)
Pearson's correlation coefficient is very sensitive to noise. That's bad news for coloc analysis of fluorescence microscopy images, since confocal images especially can be pretty noisy, so Pearson's r is almost always underestimated, and small real differences may be lost.
From http://www.colocalize.com/: RBNCC stands for Replicate Based Noise Corrected Correlation, and a patent has been applied for.
A detailed description of RBNCC can be found in: Adler, J., Pagakis, S. & Parmryd, I. Replicate Based Noise Corrected Correlation for Accurate Measurements of Colocalization. Journal of Microscopy 230, 121-133 (2008).
Mathematical proof of why and how RBNCC works can be found in: Bergholm, F., Adler, J. & Parmryd, I. Analysis of bias in the apparent correlation coefficient between image pairs corrupted by severe noise. Journal of Mathematical Imaging and Vision 37, 204-219 (2010).

Adler et al 2010 (Dan)
Don't bother reporting MOC - it's not much use vs Pearson's r. A comparison of correlation coefficients argues that three correlation coefficients, due to their inadequacy, should be abandoned: Adler, J. & Parmryd, I. Quantifying colocalization by correlation: the Pearson correlation coefficient is superior to the Mander's overlap coefficient. Cytometry Part A (2010).

Demandolx and Davoust 1996
Colour code the coloc spatial map to show the ch1/ch2 pixel intensity ratio. They used an HSB colour space:
H(x,y) = 100 * arctan(Ch1(x,y) / Ch2(x,y))
B(x,y) = Kb * (G(x,y) + R(x,y)), where Kb = 255 / (Rmax + Gmax)
Could do it in a rainbow or whatever LUT for the ratio, and use the mean of the 2 pixel values for brightness.
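
A quick Python sketch of the hue/brightness mapping as written in these notes. The units of the arctangent are not stated here, so radians are assumed; atan2 is used only to avoid dividing by zero when Ch2 is 0 (for non-negative intensities it gives the same value as arctan(Ch1/Ch2) whenever Ch2 > 0). The function name is made up.

```python
import math

def hue_brightness(ch1, ch2, ch1_max, ch2_max):
    """Per-pixel hue and brightness for the ratio colour map.

    ch1, ch2: pixel intensities (non-negative); ch1_max, ch2_max: channel maxima.
    """
    kb = 255.0 / (ch1_max + ch2_max)
    hue = 100.0 * math.atan2(ch1, ch2)   # assumption: angle in radians
    brightness = kb * (ch1 + ch2)
    return hue, brightness
```

Equal intensities in both channels land in the middle of the hue range (100 * pi/4), and a pixel at both channel maxima gets brightness 255.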

Colocalization Colormap
An automated method to quantify and visualize colocalized fluorescent signals. Frédéric Jaskolski, Christophe Mulle, Olivier J. Manzoni. https://sites.google.com/site/colocalizationcolormap/home
nMDP and Icorr measurements? What advantages do these have?

Features of ImageJ1/deprecated/current plugins (not Coloc_2)

JACoP:

-The Pearson correlation includes 0-0 pairs.
-Problem: in the Costes P-value test, it DOES shuffle existing image info, but the randomised image is created destructively, so the info is not the same as in the original image?
-No manual thresholds for Pearson's, only Costes auto threshold (this is likely a good thing).
-Thresholds in JACoP are exclusive (>).
-JACoP v.2 calculates M1 as the sum of ch1 pixels > ch1 threshold that overlap with ch2 pixels > ch2 threshold, divided by the sum of ch1 pixels > ch1 threshold.
-No thresholded ICQ.
-For K1, K2, and Manders' overlap, JACoP will only include pixels that are above threshold for both channels when thresholds are applied. This includes the numerator values (A*B) and the denominator values (A^2 and B^2). Problem: for A^2 and B^2, all pixels above threshold in the single channel should be counted; only counting pixels above threshold in both channels gives an artificially high overlap coefficient.

Colocalization Threshold:

It seems to work faster than JACoP.
I tried it with a very noisy image and it could not calculate any threshold (JACoP did).
16-bit data is now working properly for the coloc map in the latest Fiji version (overflow fixed in creation of RGB image).
Bug? Pearson's correlation coefficient is calculated in several places independently, but apparently the same way :-( The commented-out line // N=Ncoloc; at line 720 seems to be a bug. Should this not be commented out? Why is it? The docs say otherwise: at http://www.macbiophotonics.ca/imagej/colour_analysis.htm#6.3 (Colocalisation Threshold) it clearly states, quote: "Pearson’s for image above thresholds – Rcoloc Returns Pearson’s correlation coefficient for pixels where both Ch1 and Ch2 are above their respective threshold (yellow area in Scatterplot 1)." As pointed out to Dan by Al MacLeod from Perkin Elmer. This is done correctly in the current Coloc_2 version, as per Costes, personal communication.

Colocalisation Test:

Calculates the P-value of Pearson's coefficient for the Costes, Fay and Van Steensel methods of image randomisation.
Problem: as per the name of the method in the GUI selection, "Costes Approximation", it doesn't shuffle existing image info, but creates a new random image with "similar" but not the same info as the original image. This is similar to what Imaris does, but it's not what Costes describes.
Problem: the old version only works for 8-bit images, but the current Fiji version does work for 16-bit images.

Colocalization Finder: is a fun toy to explore the different parts of the scatterplot - it's a nice feature. But setting thresholds manually is the way to a subjectivity disaster, since you get what you think you want, not what is there.

ICA plugin:

-The Pearson correlation DOES NOT include 0-0 pairs.
-Can have manual thresholds for Pearson's.
-Thresholded ICQ available.
-Thresholds in ICA are inclusive (>=).
-ICA calculates M1 as the sum of all ch1 pixels that overlap with ch2 pixels > ch2 threshold, divided by the sum of all ch1 pixels; the minimum threshold ('use threshold' unchecked) is 1. This sounds wrong? Comments? - I agree, JACoP v.2 seems to be appropriate for M1 & M2; I believe it was corrected to its current state from this method (Bryan).

PSC Colocalization plugin:

-provides Pearson and Spearman coefficients.
-seems to only work on single slices, not stacks.

http://www.nature.com/nprot/journal/v3/n4/full/nprot.2008.31.html

CDA Algorithm for Colocalization:(Veronica)

To quantify true and random colocalization based on image correlation spectroscopy in combination with Manders Colocalization Coefficients.

http://imagejdocu.tudor.lu/doku.php?id=plugin:analysis:confined_displacement_algorithm_determines_true_and_random_colocalization_:start
You need 5 images before analysis: raw images, segmented images, a segmented image of the compartment where you want to analyze colocalization.
Slow calculation.
(Dan) I wonder what the difference is here from the older methods of just translating the whole image (was it Fay or Van Steensel or both? Need to check).
    (Dan) By radial shift do they mean that the image is translated in x, and wraps around to the left when it falls off the right side?

Colocalization ColorMap (Dan)

http://sites.google.com/site/colocalizationcolormap/home Colour the image at each pixel by its correlation value. Nice alternative to a grey scale average intensity for both channels coloc map. Or by its product of differences from mean per channel

List of discrepancies between plugins sharing the "same" feature:

From Bryan:

The following are some differences I believe exist:

ICA allows you to place a threshold on ICQ, JACoP does not.
ICA does not include pixels less than one when calculating the mean for each channel (n = total pixels - zero pixels), while JACoP does (n = total pixels).
ICA does not include zero-zero pairs in calculation of the ICQ ((positive ICQ pairs)/(total pairs - zero/zero pairs)), JACoP does (positive ICQ pairs/total pairs)
JACoP normalizes the intensity range of each pixel to 0-1 ((A[i] - Amin)/Amax) prior to calculating the ICQ value for each pair; this will skew your data slightly if Amin is not 0.
Different use of exclusive (>) and inclusive (>=) thresholds: JACoP is exclusive, ICA is inclusive.
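
To see how much the zero-handling differences in Bryan's list can matter, here is an illustrative Python sketch. It is not code from either plugin: a single made-up flag, skip_zeros, switches between "count everything" (the JACoP-like behaviour described above) and "ignore zero pixels in the means and drop zero-zero pairs" (the ICA-like behaviour). ICQ is taken as the fraction of pixel pairs whose product of differences from the channel means is positive, minus 0.5.

```python
def icq(ch1, ch2, skip_zeros=False):
    """Intensity correlation quotient with two zero-handling conventions."""
    pairs = list(zip(ch1, ch2))
    if skip_zeros:
        # ICA-like: means over non-zero pixels only, zero-zero pairs dropped.
        nonzero1 = [a for a in ch1 if a > 0]
        nonzero2 = [b for b in ch2 if b > 0]
        mean1 = sum(nonzero1) / len(nonzero1)
        mean2 = sum(nonzero2) / len(nonzero2)
        pairs = [(a, b) for a, b in pairs if not (a == 0 and b == 0)]
    else:
        # JACoP-like: every pixel counts towards the means and the total.
        mean1 = sum(ch1) / len(ch1)
        mean2 = sum(ch2) / len(ch2)
    positive = sum(1 for a, b in pairs if (a - mean1) * (b - mean2) > 0)
    return positive / len(pairs) - 0.5
```

For ch1 = [0, 0, 2, 4] and ch2 = [0, 0, 1, 3] the two conventions give 0.25 and 0.5 respectively, from the same data.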

Next Gen Coloc Tool, Coloc 2, for ImageJ2/imglib: a Wish List / Design:

Speed

It needs to be able to process gigabytes of data in a reasonable time (less than the time taken to collect the images).

Multithreaded where appropriate
The Costes P-value statistical test is a same-instruction, multiple-data problem (each randomised image), and so is parallelisable at that level.

Use efficient algorithms
Costes P-value: split the image into PSF-sized blocks, make a list of their identities, then scramble the order of the list, e.g. with Java's list shuffle function. This should be faster than the implementations that are in JACoP and Coloc Test. This adheres to the idea in the Costes paper that the image data should be scrambled spatially, but otherwise be the same. Implementations that potentially overwrite destructively the same part of the output image can't contain the "same" data as the input image, since that which is overwritten is lost. The implementation must use the image data in the original image, not just generate random but similar-looking data, as done by Imaris(?) and Coloc Threshold(?), else you might as well simply use Gaussian-distributed white noise (like Imaris?). That's not what the Costes paper describes.
Use the rearranged Pearson's correlation equation from the Coloc Threshold plugin, as it should be faster and more numerically precise than the original form of the equation. Done.
Make sure image order doesn't matter for Costes' auto thresholds calculation. Commutativity test - Done.
(Pete B) Be sure to use at least double precision (64-bit) in this case. Because the Colocalisation_Threshold method involves summing pixels and their squares, you can easily overshoot the range that can be stored accurately with 32-bit floating point, even when only using integers. At around 2^24 (16 777 216), 32-bit floating point can only store every second integer. This has caused me problems before...
Auto threshold speed: use a method like bisection (as in Coloc Threshold) to do the iterations when finding the thresholds below which Pearson's r is close to 0, instead of starting at the highest intensity levels and moving down with the thresholds one intensity level each iteration (obviously slow - but make the default method - Done 2015).
Maybe use a different method to find the regression line, instead of orthogonal regression. Pete B suggests a scale-invariant method where the means and ranges of pixel intensities don't really need to be the same, as orthogonal regression expects? Least Products Regression?
(Pete B) He does! If that is possible. The problem with orthogonal regression is that a really bright and a less bright channel will have unequal contributions to the slope of any fit. This is an unwelcome extra influencing factor on the results, since the relative brightness of channels shouldn’t affect colocalization measurements (although the fact that dimmer channels are also noisier is sadly unavoidable... but that is a different issue).

When and how to use thresholds / auto thresholds (as per Costes)
Should all methods use thresholds and no thresholds, or just some doing one, the other or both?
Should one use inclusive or exclusive thresholds? Coloc 2 will use exclusive thresholds always! (Check this... there is code to choose already implemented!) This holds regardless of whether it's a lower or upper threshold. The same must be true for the "in range" thresholds: values defining the range should NOT be included. But there are tricky cases, e.g. for the Costes auto threshold you look below the thresholds for Pearson's r = 0, but then look above the thresholds for thresholded M1... here maybe we get it wrong. One of these calculations should include the threshold-level pixels, probably the thresholded M1 and thresholded M2. Not sure if we got that right; need to check.
Should we subtract the auto thresholds from the images before we measure Manders' coefficients, to remove the intensity offset that kills proportionality (but not correlation)? (No, just measure in pixels above thresholds and report that and Manders' coloc.) Better to remove that before we even start??? - but how? Give a warning when there is a detectable zero offset.
Perhaps rescale the image data into float, with the images having a mean intensity of 1 and hopefully the same or similar range? Then the least squares regression will be more valid? We might avoid some numerical problems in the Costes auto threshold.
(Pete B) Potentially. But I think it also depends on what the offset of the regression is, i.e. how good the background subtraction has been prior to normalizing the means. Of course, for extremely unbalanced channels any reasonable normalization effort will probably help, even with imperfect background subtraction. But we can expect (somewhat) different results for the same coefficients when comparing with software that does no normalization, so we need to be very clear about this. A more robust approach to the regression, which doesn’t need extra normalization, would be better.
(Dan) Doesn't this also assume the mean intensity means something, i.e. that it's a Gaussian distribution of intensities? But it's usually not! cf. Spearman correlation.
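
Pete B's warning about 32-bit floats in the Speed notes can be reproduced in a few lines of Python. This is only a demonstration: Python floats are 64-bit, so the helper below (a made-up name) round-trips through the struct module to mimic storing a running sum in a 32-bit float.

```python
import struct

def to_float32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def sum_float32(values):
    """Accumulate a sum, rounding the running total to 32-bit each step."""
    total = 0.0
    for v in values:
        total = to_float32(total + v)
    return total

# Start the running sum at 2^24 and add ten pixels of intensity 1.
values = [2.0 ** 24] + [1.0] * 10
sum32 = sum_float32(values)   # stuck at 16777216.0
sum64 = sum(values)           # 16777226.0 in double precision
```

Every "+1" is rounded away once the total reaches 2^24, so sums of pixels (and even more so sums of squares) from large images need 64-bit accumulators.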

What methods / statistics to include/output

Pearson's - full image (no thresholds)
Pearson's - coloc (above/within thresholds)
Spearman - full image
Spearman with thresholds?
Kendall's tau
Manders' split coefficients M1 and M2
Thresholded split Manders' coefficients tM1 and tM2
Manders' overlap coefficient MOC (is it useful? - no!) Not the most important but… why not? It is easy to understand: 0.3 = 30% colocalization (considering the Ch1:Ch2 ratio is close to 1). But, according to Adler et al (see above), it's pretty much worthless compared to Pearson's r.
K1 and K2 from Manders (are they useful?) Not the most important but… why not? I (Bryan?) find them useful. If they are not similar then the intensity between both channels is not well balanced (one channel brighter than the other).
Relative brightness of channels should not affect Pearson or Manders' coefficients, as these are insensitive to the relative brightness of the channels? Right? (Veronica + Dan) Right! So actually it's not critical to have the channels very similar in mean brightness... right? So long as they are both well in range.
ICQ (done) with its statistical significance test (TODO?)
RBNCC (Replicate Based Noise Corrected Correlation), Adler et al: correct Pearson's r for underestimation due to noise, see Adler et al 2008 above (TODO - need to be able to accept 4 input images!). Can tM1 and tM2 be corrected for noise the same way? (But a patent has been applied for... perhaps not a problem, as it's an old idea from Spearman anyway?)
Mutual information / entropy? As suggested by Stephan Preibisch / Johannes Schindelin

Costes P-Value test

Randomisation should be parallelizable and as fast as or faster than in the Coloc Test plugin (which is not the right way anyhow?).
Spearman rank and Kendall tau rank correlation coefficients (look at Wikipedia).
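
A minimal Python sketch of the block-shuffling idea from the Speed notes: list the blocks, shuffle the list, and copy each source block into a fresh output image so nothing is overwritten destructively. It assumes a 2D image stored as a list of rows whose size is an exact multiple of the block size (real code would need to deal with edge blocks and 3D); names are made up.

```python
import random

def shuffle_blocks(image, block, rng):
    """Return a copy of image with its block x block tiles randomly permuted."""
    height, width = len(image), len(image[0])
    if height % block or width % block:
        raise ValueError("image size must be a multiple of the block size")
    # Top-left corners of all tiles, and a shuffled copy of that list.
    origins = [(r, c) for r in range(0, height, block)
               for c in range(0, width, block)]
    sources = origins[:]
    rng.shuffle(sources)
    out = [[0] * width for _ in range(height)]
    # Each destination tile receives exactly one source tile.
    for (dr, dc), (sr, sc) in zip(origins, sources):
        for i in range(block):
            for j in range(block):
                out[dr + i][dc + j] = image[sr + i][sc + j]
    return out

image = [[r * 4 + c for c in range(4)] for r in range(4)]
shuffled = shuffle_blocks(image, 2, random.Random(0))
```

Because the tiles are permuted rather than sampled, the shuffled image contains exactly the same pixel values as the input, which is the property the Costes test needs. Each randomised image is independent of the others, so this is also where the work could be split across threads.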

Object based methods

As in JACoP. These require robust object segmentation, which is an art in itself and very dependent on the nature of the images. Maybe provide tools for coloc/overlap analysis of binary/segmented/labelled images?

Use ImgLib

(Actually need to use imglib2 - done.) The next generation of ImageJ (aka ImageJ2, see http://imagejdev.org/) is using imglib as its core; see http://pacific.mpi-cbg.de/wiki/index.php/Imglib. imglib is a generic image container library, meaning you can implement an algorithm once, then it will work for whatever image data types you designed it for. One does not need to make a new implementation for 8-bit, 16-bit, 32-bit float, etc.

Imglib uses cursors for traversing the data and getting values from it. It is n dimensional!

GUI

Multiple pane/tab? As per JACoP, this helps the user through the correct setting of parameters etc.
The "red options must be checked before OK" idea from JACoP is good; it should be explicit which measurements the 'red options' apply to.
Output
Make results as a PDF page suitable for submission as supp. info with little modification, including all input image thumbnails, scatter plots, merged images, coloc maps, parameters and results/statistics.
Multiple output options would be beneficial: text, spreadsheet, SQL/database file

About the results:

What do you think of including a Ch1:Ch2 ratio calculation as in the Manders’ coefficients plugin? If this value is not close to 1 then the Pearson and Overlap values will be affected (also the Spearman coefficient?).
-This seems necessary; as you mentioned, it is critical for interpretation of the results.
(Pete B) Pearson’s isn’t affected though?

Maybe this is not very useful, but I have found people who prefer color scatterplots rather than B&W ones. The scatterplot from Colocalization Threshold plugin is nice enough, and it shows you the thresholds applied to the image.

An image with colocalized pixels (like the one in Colocalization Threshold plugin) would also be appreciated.

Background Correction:

What do you think about implementing a background correction method in the plugin? I apply one with these plugins (usually the rolling ball algorithm) before image processing because most authors recommend it.
(Dan) Rolling ball is not a good plan here. One should know the CCD offset or the PMT offset from the image histogram, or the mean of a blank/background area of the image. Then subtract that. Rolling ball can be way too harsh: one might lose real info from the bottom end of the intensity range.
I think the autothresholding method of Costes should take care of this, but it can be confused by high and low offsets and intensity clipping at both ends of the range.
(Pete B) Don’t think Costes’ should be confused by offsets; if it is, then it is an implementation bug (Dan agrees). Though if the channels are rescaled for Costes’ benefit (e.g. division by the mean) then the offsets matter, because they adjust how much each channel is scaled. And this does affect Costes’ threshold. With regard to clipping, maybe a clipped background is desirable at times, when the clipping represents a good background subtraction (applied to data that was acquired unclipped, of course)? And might the inclusion of many background pixels cause Costes' threshold to be less accurate, because the influence of the brighter, colocalizing pixels on the fit is reduced? I think relying on Costes’ alone to identify the background may be dangerous. Background subtraction is then a good idea, but I vote for putting it in a separate plugin, and then suggesting its use if Coloc 2 spots too much background in the images. This is both to avoid over-complicating the implementation and dialog options for Coloc 2, and because background subtraction is a commonly required task not specific to colocalization. There should be a range of options, e.g.

rolling ball
subtract image minimum (optionally after smoothing)
subtract mean of selected ROI (optionally + k standard deviations)
subtract histogram mode (would this work when the background dominates?)
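
As an example of one of these options, a small Python sketch of "subtract mean of selected ROI (optionally + k standard deviations)". The function name is hypothetical, and clipping negative results to zero is an assumption made here for illustration; whether to clip is exactly the kind of choice discussed above.

```python
import statistics

def subtract_roi_background(pixels, roi_pixels, k=0.0):
    """Subtract (ROI mean + k * ROI standard deviation) from every pixel.

    pixels: intensities of the image; roi_pixels: intensities inside a
    user-selected background ROI. Returns (corrected pixels, offset used).
    """
    offset = statistics.fmean(roi_pixels) + k * statistics.pstdev(roi_pixels)
    # Assumption: negative values are clipped to zero after subtraction.
    corrected = [max(0.0, p - offset) for p in pixels]
    return corrected, offset
```

Returning the offset alongside the corrected data makes it easy to report what was subtracted in the results output.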

Region of interest

It seems that a biologically relevant ROI is absolutely key to using these methods (esp. % pixel or intensity measures), to keep the results specific to the biology, without killing them with background junk. Should a ROI or mask image choice be forced on the user, so they don't blindly throw in the whole image with all its background to interfere with the maths? We can use the ROI manager and a 2D or 3D mask image (well done Tom), as well as a selected ImageJ ROI in either input image.

STUFF that needs to be put in the right place above

EMPTY right now
