# Tissue classification using MAP-MRF #670

Closed
wants to merge 107 commits into
from

## Conversation

Projects
None yet
9 participants
Contributor

### villalonreina commented Jun 26, 2015

This is an implementation of MAP-MRF (Maximum A Posteriori - Markov Random Field) segmentation using the Iterated Conditional Modes (ICM) minimization algorithm. There are three inputs: the grey-level T1 brain image, its corresponding mask, and an initial three-class segmentation of the T1. For now we have hard-coded paths, and I can share the images with whoever wants to test the code. This code is still experimental and needs to be extended to estimate the partial volumes of the tissue classes.

### villalonreina and others added some commits Jun 8, 2015

 first commit PVE 
 e4b0086 
 Modified the ICM function and added code to the Otsu threshold function 
 06e38b4 
 Modified the ICM code 
 8f2e852 
 Corrected stats for classes 
 0de50d9 
 Updated test 
 b5bf3b7 
 Merge pull request #1 from Garyfallidis/pve 
Correcting the stats for classes
 f868551 
 Fixed the energy function 
 5139d32 
 Changed likelihood and corrected style 
 b0a7a4f 
 RF: changed variable in gibbs energy 
 3735c9a 
 Reducing beta 
 f4b5109 
Member

### matthew-brett commented Jun 26, 2015

 Can you put the images into their own repo, so we can add them as a submodule for testing?
Member

### Garyfallidis commented Jun 26, 2015

 I believe that Julio cannot share these images right now. I mean he cannot share those specific images publicly. We can try to change them with some existing ones in the following days. Until then he can give you a dropbox link so you can have a look.

### MrBago reviewed Jun 26, 2015

 """ mu = np.zeros(3) std = np.zeros(3)

#### MrBago Jun 26, 2015

Contributor

You have `nclass` as an argument but then you hard-code 3 here.

#### MrBago Jun 26, 2015

Contributor

Try this:

```python
import numpy as np

def seg_stats(input_image, seg_image, ddof=0):
    img = input_image.ravel()
    seg = seg_image.ravel()
    if not np.issubdtype(seg.dtype, np.integer):
        seg = seg.astype(np.int64)

    # Per-label counts, sums and sums of squares in a single pass.
    seg_vol = np.bincount(seg)
    sum_img = np.bincount(seg, img)
    sum_sq = np.bincount(seg, img ** 2)

    mean = sum_img / seg_vol
    var = (sum_sq - seg_vol * mean ** 2) / (seg_vol - ddof)
    std = np.sqrt(var)
    # Label 0 is background; drop it.
    return mean[1:], std[1:]
```

Contributor

### omarocegueda commented Jun 26, 2015

 For testing purposes, we may stack a few copies of this coronal slice of a T1-weighted image: https://github.com/nipy/dipy/blob/master/dipy/data/t1_coronal_slice.npy
Member

### Garyfallidis commented Jun 26, 2015

 Thx @omarocegueda :)
Contributor

### MrBago commented Jun 26, 2015

It seems like most of the functions are getting their own files; is there a reason most of this isn't in one file?

### arokem reviewed Jun 27, 2015

 It is intended to work on a masked T1 brain image. It is intended to give threshold values for CSF/WM and WM/GM. Copied from scikit-image to remove dependency.

#### arokem Jun 27, 2015

Member

Please don't forget to copy over the license from scikit-image as well

#### samuelstjean Jun 29, 2015

Contributor

There is also some stuff you can probably reuse from http://nipy.org/dipy/reference/dipy.segment.html#dipy.segment.mask.median_otsu

### omarocegueda reviewed Jun 27, 2015

```python
    return energytotal

def log_likelihood(img, mu, var, index, label):
```

#### omarocegueda Jun 27, 2015

Contributor

This is actually the negative log-likelihood, right?

#### villalonreina Jun 27, 2015

Contributor

Yes it is. My understanding is that this is the one you minimize. The positive one is the one that you would maximize, is that right?

Thanks

#### omarocegueda Jun 27, 2015

Contributor

Yes, that's right, I just thought that something like neg_log_likelihood would be a better name for this function

### omarocegueda reviewed Jun 27, 2015

```python
    return loglike

def gibbs_energy(seg, index, label, beta):
```

Contributor

#### villalonreina Jun 27, 2015

Contributor

This is great. Yes I saw this somewhere else, but first wanted to try the simpler version of it to test it out. Thanks a lot.

### omarocegueda reviewed Jun 27, 2015

```python
segmented = np.zeros(dataimg.shape)
while True:
```

#### omarocegueda Jun 27, 2015

Contributor

With `for N in range(niter):` you no longer need the `break` at lines 60-61.

### omarocegueda reviewed Jun 27, 2015

```python
energy = 0
if label == seg[index[0] - 1, index[1], index[2]]:
```

#### omarocegueda Jun 27, 2015

Contributor

I think you need to check whether you are at the boundary of the image and not accumulate the energy in that case; otherwise the index will wrap around to the other side of the volume in Python (not the desired behavior) and may cause a memory access violation in Cython (I think we will definitely need to cythonize).
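A sketch of the suggested guard, in pure Python for clarity (the signature mirrors the function under review, but this is illustrative, not the PR's code):

```python
import numpy as np

def gibbs_energy(seg, index, label, beta):
    """Ising energy of `label` at voxel `index`, skipping out-of-bounds
    neighbors instead of letting negative indices wrap around."""
    energy = 0.0
    # Six-connected neighborhood offsets.
    offsets = [(-1, 0, 0), (1, 0, 0), (0, -1, 0),
               (0, 1, 0), (0, 0, -1), (0, 0, 1)]
    for dx, dy, dz in offsets:
        x, y, z = index[0] + dx, index[1] + dy, index[2] + dz
        # Accumulate nothing at the image boundary.
        if not (0 <= x < seg.shape[0] and 0 <= y < seg.shape[1]
                and 0 <= z < seg.shape[2]):
            continue
        energy += -beta if seg[x, y, z] == label else beta
    return energy
```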

Contributor

### omarocegueda commented Jun 27, 2015

I think we definitely need to code this in Cython. Can we define a base class, like ImageSegmenter, that has a method segment(image) and returns the segmented image? Then we can derive FASTImageSegmenter(ImageSegmenter), which contains this particular implementation of the Maximum A Posteriori estimation of the HMRF with ICM (ImageSegmenter doesn't sound good, any ideas?). I agree with @MrBago: I think you can put most of this code into one single file.
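A sketch of the class hierarchy being proposed (class names are taken from the comment; the method bodies are placeholders, not dipy's actual API):

```python
class ImageSegmenter:
    """Hypothetical base class: subclasses implement segment(image)."""

    def segment(self, image):
        raise NotImplementedError


class FASTImageSegmenter(ImageSegmenter):
    """Would hold the MAP-HMRF/ICM implementation discussed in this PR."""

    def __init__(self, nclasses, beta):
        self.nclasses = nclasses
        self.beta = beta

    def segment(self, image):
        # Placeholder for the ICM/EM loop.
        raise NotImplementedError
```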

### omarocegueda added some commits Jun 27, 2015

 Merge branch 'pve' of https://github.com/villalonreina/dipy into vill… 
…alonreina-pve
 8051df0 
 RF: refactor the ICM algorithm into a single class 
 cb45ebc 
Contributor

### samuelstjean commented Jun 29, 2015

As a late reply to all the comments at the top about a T1 image (isn't there a way to reply to a specific comment, or is this a linear discussion?): there is already one in the Stanford dataset, see this example http://nipy.org/dipy/examples_built/linear_fascicle_evaluation.html. It seems to be at 2 mm iso instead of the more common 1 mm, but at least anyone can play with it easily and get the same results you would have.

### omarocegueda reviewed Jul 1, 2015

```python
else:
    energy = energy + beta
if label == seg[index[0] + 1, index[1], index[2]]:
```

#### omarocegueda Jul 1, 2015

Contributor

Just curious: here, the Ising/Potts potential is -beta if the labels are equal and +beta if the labels are different. I remember that I have seen this "symmetric" version before, but it is not the standard Ising/Potts potential, which should be -beta*\delta(x_{i}-x_{j}). In other words: it should be zero if the labels are different instead of +beta, as appears in Zhang's paper, first paragraph of page 49. Any reason why we prefer the symmetric version here?
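For concreteness, here is a sketch of the two potentials being compared (helper names are mine, not the PR's). Note that the symmetric version equals the standard one with beta doubled plus a constant beta per clique, so the two induce the same model up to a rescaling of beta:

```python
def potts_standard(xi, xj, beta):
    # Standard Ising/Potts clique potential (Zhang et al., p. 49):
    # V = -beta * delta(x_i - x_j): -beta if labels agree, 0 otherwise.
    return -beta if xi == xj else 0.0

def potts_symmetric(xi, xj, beta):
    # "Symmetric" variant used in this PR: -beta if equal, +beta otherwise.
    return -beta if xi == xj else beta
```

Since potts_symmetric(a, b, beta) == 2 * potts_standard(a, b, beta) + beta for all label pairs, the symmetric form only rescales and shifts the energy.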

#### villalonreina Jul 1, 2015

Contributor

Hi Omar,

Thanks for your comment. Can you point me to where exactly in the Zhang paper they mention the "non-symmetric" version?

Thanks

#### omarocegueda Jul 1, 2015

Contributor

Sure! It's the 6th line, first paragraph: $V_{C}(x) = -\delta(x_{i}-x_{j})$ (the Kronecker delta)

#### omarocegueda Jul 1, 2015

Contributor

Sorry, sixth line, first paragraph, page 49

#### villalonreina Jul 1, 2015

Contributor

Yes, you are right. I completely missed that one. Thanks!

#### villalonreina Jul 1, 2015

Contributor

Hi Omar, one question regarding the Delta function. Do you know if anyone has implemented it in Dipy?

#### omarocegueda Jul 1, 2015

Contributor

I don't know, but it would be something like

```python
return 1 if x == 0 else 0
```

Contributor

### villalonreina commented Jul 1, 2015

Hi all, I got confused when trying to follow the algorithm proposed in the Zhang 2001 paper (http://www.ncbi.nlm.nih.gov/pubmed/11293691). I am not sure which equation(s) go into each step of the EM algorithm. So, for step E should I use the function for Q (equation 24), and then in step M maximize equations 25 and 26? This is the list of steps I am thinking about:

1. Estimate label parameters for L in range(1, 3) from the input segmentation
2. First loop of ICM, which will give me a segmentation (\hat{x}) and P(\ell | N_i) (the probability of \ell given the neighborhood)
3. EM loop to estimate the parameters mu and sigma
4. Back to ICM

Does this sound reasonable? What do you think?
Contributor

### omarocegueda commented Jul 1, 2015

Yes, Q in equation 24 is the function that needs to be maximized with respect to the parameters (in this case the mu and sigma for each class). The good news is that the authors already did it for us! They did the math and found that the mu and sigma that maximize Q are given by equations 25 and 26.

Note that the key quantities you need to compute in preparation for eqs. 25 and 26 are given by eq. 27: for each voxel i (with coordinates (x, y, z), say), and for each label k (in equation 27 they use \ell instead of k, but I will use k because it reads better in plain text), you need to compute $P^{(t)}(k | y_{i})$; let's denote that P[i, k]. In order to compute P[i, k] you will need another quantity first, the second factor in the numerator of eq. 27, which is $P^{(t)}(k | x_{N_{i}})$; let's denote this quantity Q[i, k]. It is the probability that the label of voxel i is $k$ given that its neighbors are known (remember that you already know all the labels, because that's the output of ICM).

You can find how to compute Q[i, k] in equation 2.18 of Stan Z. Li's book "Markov Random Field Modeling in Image Analysis" (page 27). In that equation V1 doesn't concern us, and V2 is, in our case, the potential of the Ising/Potts model (-beta or +beta, depending on the neighboring voxels having the same or different label, respectively). The following pseudocode may be clearer; it computes Q[i, k] for a single voxel i and for all labels k:

```
initialize Q[i, k] = 0 for all labels k
normalization = 0
for each label k:
    for each neighbor j of voxel i:
        Q[i, k] += PottsIsing(k, x_j)
    Q[i, k] = exp(Q[i, k])
    normalization += Q[i, k]
for each label k:
    Q[i, k] /= normalization
```

Here PottsIsing(a, b) = -beta if a == b, else beta, and x_j is the label of voxel j (remember, the output from ICM). From these Q you can compute P[i, k] using eq. 27 and then use that to compute the new mus and sigmas from eqs. 25 and 26.

In summary: you initialize the mu and sigma for all classes using median_otsu, and then iterate the following two steps many times (we can discuss when to stop later): run ICM to find the labels using your current mus and sigmas, then update mu and sigma according to eqs. 25 and 26 using the current labels. I hope it helps.
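The neighborhood-probability step above could look like this in Python for a single voxel (helper names are mine; I use the Gibbs weight exp(-energy) so that agreeing neighborhoods get the higher probability, which is my assumption about the intended sign convention):

```python
import numpy as np

def potts_ising(a, b, beta):
    # Symmetric potential as defined in the discussion above.
    return -beta if a == b else beta

def label_probs(neighbor_labels, nclasses, beta):
    """Q[i, :] for one voxel i, given the current labels of its neighbors
    (the ICM output). Returns a probability vector over the labels."""
    q = np.zeros(nclasses)
    for k in range(nclasses):
        energy = sum(potts_ising(k, xj, beta) for xj in neighbor_labels)
        q[k] = np.exp(-energy)       # lower energy -> higher weight
    return q / q.sum()               # normalize so the labels sum to 1
```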
Contributor

### villalonreina commented Jul 1, 2015

Hi Omar, I am trying to implement equation 27 from the same paper and I am not sure about g^t(y_i) and P(y_i). I am assuming that g^t(y_i) comes from equation 11 and that P(y_i) comes from equation 23. Is that right? Thanks!
Contributor

### omarocegueda commented Jul 1, 2015

Hi Julio, exactly: $g^{(t)}(y_{i}; \theta_{k})$ is in this case the Normal density (eq. 11) with parameters $\theta_{k}$, i.e. the mu and sigma corresponding to class k (denoted \ell in the paper).

Regarding $p(y_{i})$, note that it cannot be given by eq. 23, because eq. 23 is conditioned on the knowledge of $\theta$, while $p(y_{i})$ in eq. 27 is not. Here $p(y_{i})$ is the probability that intensity $y_{i}$ is observed at voxel $i$ (without knowing anything about the parameters $\theta$). In theory you could obtain $p(y_{i})$ by integrating $p(y_{i} | \theta)$ (eq. 23) over all possible values of $\theta$ (uncountably many), but a much simpler way is to observe that $p(y_{i})$ is just a normalization constant that ensures that the probabilities $p(k | y_{i})$ (eq. 27), summed over all classes k, equal 1. You can see this, for example, in Bishop's book, page 61, eq. 2.76 (reference [9] in the paper; the ebook is here: http://cs.du.edu/~mitchell/mario_books/Neural_Networks_for_Pattern_Recognition_-_Christopher_Bishop.pdf).

So, in summary: you can ignore $p(y_{i})$ in eq. 27, compute only the numerator of eq. 27 for all classes $\ell$, and divide all of them by their sum.
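A small sketch of that normalization trick, assuming Gaussian class models (the function name and arguments are illustrative, not the PR's API):

```python
import numpy as np

def posterior_label_probs(y_i, mus, sigmas, prior):
    """P(k | y_i) via eq. 27: Gaussian likelihood per class times the
    neighborhood prior, normalized over classes so p(y_i) never needs
    to be computed explicitly."""
    g = np.exp(-(y_i - mus) ** 2 / (2.0 * sigmas ** 2)) \
        / (np.sqrt(2.0 * np.pi) * sigmas)
    numer = g * prior
    return numer / numer.sum()   # the denominator plays the role of p(y_i)
```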

### samuelstjean reviewed Nov 20, 2015

```python
def seg_stats(self, input_image, seg_image, nclass):
    r""" Mean and standard variation for N desired tissue classes
```

#### samuelstjean Nov 20, 2015

Contributor

you probably mean standard deviation

Member

### Garyfallidis commented Dec 13, 2015

 ping @villalonreina :)

### omarocegueda reviewed Feb 12, 2016

```python
mu = np.zeros(nclass)
std = np.zeros(nclass)
for i in range(0, nclass):
```

#### omarocegueda Feb 12, 2016

Contributor

This algorithm is Theta(nclass * N^3). We can compute everything in Theta(N^3).

#### villalonreina Apr 28, 2016

Contributor

Hi Omar,

Thanks for your comment. What exactly do you mean by "Theta(N^3)"? Which specific line are you referring to?

Thanks!

#### omarocegueda Jun 16, 2016

Contributor

For each label k=0, 1, ..., K-1 you are extracting all voxel values in input_image which are marked as k in seg_image. This operation needs to iterate over all voxels, which has complexity Theta(N^3) for an image of size NxNxN. Since you are doing this once for each label, the full complexity is Theta(K*N^3).

#### omarocegueda Jun 16, 2016

Contributor

Instead of doing that you can initialize mu[...] and std[...] to zero. Then iterate over each voxel (x, y, z), grab its label s = seg_image[x, y, z] and its value v = input_image[x, y, z], and update only the entries in mu and std corresponding to label s: mu[s] += v, std[s] += v*v. Since you sweep the full volume only once and update only one entry of mu and std per voxel, the full complexity is just Theta(N^3).

#### omarocegueda Jun 16, 2016

Contributor

Of course, at the end you have the sum of the values and the sum of their squares (not the mean and std), but from that information and the number of voxels for each label you can directly compute the means and the standard deviations.
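A minimal sketch of the single-pass accumulation described above (the function name and the explicit Python loop are my own; a real implementation would be vectorized or cythonized):

```python
import numpy as np

def seg_stats_single_pass(input_image, seg_image, nclass):
    """Per-label mean and std in one sweep over the volume: Theta(N^3)."""
    mu = np.zeros(nclass)
    m2 = np.zeros(nclass)      # running sum of squares per label
    count = np.zeros(nclass)
    for v, s in zip(input_image.ravel(), seg_image.ravel()):
        count[s] += 1
        mu[s] += v
        m2[s] += v * v
    # Guard against empty labels, then turn sums into mean/std.
    n = np.maximum(count, 1)
    mu /= n
    std = np.sqrt(m2 / n - mu ** 2)
    return mu, std
```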

Member

### Garyfallidis commented Mar 8, 2016

 Hey guys and especially @villalonreina and @omarocegueda, any progress with this PR? We need to move on for the next release. The clock is ticking!
Member

### Garyfallidis commented Apr 26, 2016

 @villalonreina any progress with the validation? Keep it up!

### villalonreina added some commits Apr 29, 2016

 Added tolerance value as a percentage to stop iterations of the ICM l… 
…oop. Changed the tissue_classification example to reflect the absence of number of iterations as an input.
 57ef121 
 Merge remote-tracking branch 'nipy-dipy/master' into pve 
Need to update pve branch with latest commits to nipy-dipy/master
 578e965 
 Included convergence criterion (tolerance) as an input to function cl… 
…assify.
 df47d21 
 Fixed and modified the test file that was erroring. Made small change… 
…s to the example file to include the tolerance value for the tissue classification
 caeee84 

### coveralls commented Jun 16, 2016 • edited

Changes unknown when pulling caeee84 on villalonreina:pve into nipy:master.

## Current coverage is 80.41%

No coverage report found for master at c5103fd.

### omarocegueda reviewed Jun 16, 2016

```python
mu, sigma = com.initialize_param_uniform(image, nclasses)
sigmasq = sigma ** 2
npt.assert_equal(mu, np.array([0., 0.25, 0.5, 0.75]))
```

#### omarocegueda Jun 16, 2016

Contributor

In general, we should not compare floating-point values for equality; we should use `assert_array_almost_equal` instead.

### omarocegueda reviewed Jun 16, 2016

```python
npt.assert_equal(sigmasq, np.array([1.0, 1.0, 1.0, 1.0]))
neglogl = com.negloglikelihood(image, mu, sigmasq, nclasses)
npt.assert_equal((neglogl[100, 100, 1, 0] != neglogl[100, 100, 1, 1]),
```

#### omarocegueda Jun 16, 2016

Contributor

Sorry, I don't understand what is the purpose of these tests. Why is it important to verify that the negative log-likelihood is different at these voxels?

#### villalonreina Jun 25, 2016

Contributor

Hi Omar. So basically I am trying to test whether the neg-loglikelihood is different for that one voxel across the different tissue types (the last axis of the array). Do you think the neg-loglikelihood will invariably be different across the tissue types?

### omarocegueda reviewed Jun 16, 2016

```python
PLN = icm.prob_neighborhood(initial_segmentation, beta, nclasses)
npt.assert_equal(PLN.all() >= 0.0, True)
npt.assert_equal(PLN.all() <= 1.0, True)
```

#### omarocegueda Jun 16, 2016 • edited

Contributor

For example, these tests really are meaningful! But I think the ones above (those that check for inequality) are unlikely to catch any bugs.

### omarocegueda reviewed Jun 16, 2016

```python
npt.assert_equal(PLN.all() <= 1.0, True)
if beta == 0.0:
    npt.assert_equal((PLN[50, 50, 1, 0] == 0.25), True)
```

#### omarocegueda Jun 16, 2016 • edited

Contributor

We can put all these coordinates in a list and just iterate over it. And we should use ...almost_equal instead of equal to compare floating point values

### omarocegueda reviewed Jun 17, 2016

 """ Created on Wed Apr 20 15:57:42 2016 @author: jvillalo

#### omarocegueda Jun 17, 2016

Contributor

We should remove this auto-generated code by spyder

### omarocegueda reviewed Jun 17, 2016

 """ import numpy as np #import nibabel as nib

#### omarocegueda Jun 17, 2016

Contributor

Remove import if unnecessary.

### omarocegueda reviewed Jun 17, 2016

```python
print('t1.shape (%d, %d, %d)' % t1.shape)

"""
Once we have fetched the T1 volume, we proceed to denoise it with the NLM algorithm.
```

#### omarocegueda Jun 17, 2016 • edited

Contributor

Why is this necessary? Note that if we denoise the volume first, then the effect of the regularization parameter will be less noticeable. Also this process is quite heavy, people may just want to see segmentation results quickly.

#### villalonreina Jun 25, 2016

Contributor

Ok, yes, I agree. I got rid of the denoising of the T1; it made the example complicated and the run time longer.

### omarocegueda reviewed Jun 17, 2016

```python
final_segmentation = np.empty_like(image)
initial_segmentation = seg_init.copy()
for i in range(100):
```

#### omarocegueda Jun 17, 2016

Contributor

We should not hard-code the number of iterations. I think users will expect this to be an input parameter.

### omarocegueda reviewed Jun 17, 2016

```python
self.energies.append(energy)
self.energies_sum.append(energy[energy > -np.inf].sum())
if i % 10 == 0 and i != 0:
```

#### omarocegueda Jun 17, 2016

Contributor

Why do we only check convergence every 10 iterations?

### omarocegueda reviewed Jun 17, 2016

```python
self.segmentations.append(final_segmentation)
self.pves.append(PVE)
self.energies.append(energy)
self.energies_sum.append(energy[energy > -np.inf].sum())
```

#### omarocegueda Jun 17, 2016

Contributor

Sorry, I don't understand what the stop criterion is, could you please comment on this?

#### villalonreina Jun 25, 2016

Contributor

Convergence is tested on the total energy given by ICM at every loop. Every 10 iterations (it could be 20, 25 or 30), a percentage of the total range of change (the tolerance) is compared to the range of change of the last 5 iterations. If the range of change of the last 5 iterations is smaller than that fraction of the full range of change, the loop stops. Note that the range of change of the last 5 iterations is compared to the total range of change as the loop continues in multiples of 10 (10, 20, 30, etc.). Shall we change it to be tested every 20 iterations?
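The criterion described above might be sketched like this (the function name and exact windowing are my guesses at the intent, not the PR's code):

```python
def converged(energies, tolerance, window=5):
    """Stop when the energy change over the last `window` iterations falls
    below `tolerance` percent of the total change accumulated so far."""
    total_change = abs(energies[-1] - energies[0])
    recent_change = abs(energies[-1] - energies[-window])
    if total_change == 0:
        return True   # nothing has changed at all
    return 100.0 * recent_change / total_change < tolerance
```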

### omarocegueda reviewed Jun 17, 2016

```python
    pass

def classify(self, image, nclasses, beta, tolerance):
```

#### omarocegueda Jun 17, 2016

Contributor

tolerance documentation missing in the docstring

### omarocegueda reviewed Jun 17, 2016

```python
beta : float,
    smoothing parameter, the higher this number the smoother
    the output will be.
max_iter : float,
```

#### omarocegueda Jun 17, 2016

Contributor

max_iter is currently not an input parameter (I think it should be)

### omarocegueda reviewed Jun 20, 2016

```cython
    return PLN

cdef void _initialize_maximum_likelihood(double[:,:,:,:] nloglike,
                                         double[:,:,:] seg) nogil:
```

#### omarocegueda Jun 20, 2016

Contributor

seg should be of an integral type (int or short), not double

### omarocegueda reviewed Jun 20, 2016

```cython
cdef void _icm_ising(double[:,:,:,:] nloglike, double beta,
                     double[:,:,:] seg, double[:,:,:] energy,
                     double[:,:,:] new_seg) nogil:
```

#### omarocegueda Jun 20, 2016

Contributor

Same here, seg and new_seg don't need to be double, they should be of an integral type

Member

### Garyfallidis commented Jul 2, 2016

 Hello @villalonreina. Ready with the rebase and the new PR?
Member

### Garyfallidis commented Jul 2, 2016 • edited

 How did the squashing/interactive rebase go?
Member

### Garyfallidis commented Sep 8, 2016

 This PR is replaced by #1115