Output cord segmentation is empty on image with reasonable contrast #1740
Comments
I experience the same problem with |
Investigating further, the issue with |
@jcohenadad : Yes it is certainly caused by this line: |
So, this volume seems to be already normalized: the intensities are zero-centered, so a z-score normalization was probably applied. Our models assume the input is *not* normalized, because normalization is applied as part of the pipeline; if we feed already-normalized data, the algorithm normalizes it a second time, shifting the distribution depending on the normalization scheme used. Could they send the original data without pre-processing? We had the same issue with the marmoset dataset, where the data arrived with bias correction, etc. As a side note, I'm actively working on a better normalization (both for the model input and inside the model) as part of the domain adaptation work, so once that is done we can change the way we do normalization. Right now, however, it's risky to change it just for this subject, because it would mean retraining and re-validating the entire pipeline. I spoke with @charleygros: he uses a different normalization in his case, so he will not have the same problems that I have on |
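The double-normalization problem described above can be sketched with NumPy. The training-set statistics below are illustrative values, not the actual SCT model parameters; the point is only to show how z-scoring a volume that was *already* z-scored shifts the distribution the model sees far away from what it was trained on.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical training-set statistics used by the model's preprocessing
# (illustrative, not the real SCT values).
train_mean, train_std = 500.0, 120.0

# Raw scanner intensities for a new volume.
raw = rng.normal(loc=480.0, scale=110.0, size=100_000)

# Case 1: raw data -> model preprocessing (the expected situation).
expected = (raw - train_mean) / train_std

# Case 2: the site already z-scored the volume before sending it,
# and the model preprocessing is then applied on top of that.
already_normalized = (raw - raw.mean()) / raw.std()
double_normalized = (already_normalized - train_mean) / train_std

# 'expected' is roughly zero-centered with std ~1; 'double_normalized'
# is shifted to about -500/120 with a tiny std, so the model sees
# intensities far outside its training distribution.
print(expected.mean(), expected.std())
print(double_normalized.mean(), double_normalized.std())
```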
@perone two things
|
Okay, I'll work on finding something that is robust to this and will not cause issues when applied to data that is already normalized. 👍
If we are worried that those modifications will impact the results in a large dataset, |
@perone maybe you could use what is already implemented there for safe intensity rescaling, and to avoid code duplication in SCT. The code is currently not working (as of 2018-05-18 10:25), but @charleygros is working on it.
So, we can use the same thing; however, the model has to be trained with that same normalization. There is also an interaction with the model's internal normalization (batchnorm, etc.), especially when using multiple distributions (multiple centers, etc.). What I'm doing now is testing different ways of doing the initial normalization together with different ways of normalizing the network's internal representations, so as to improve generalization to data from other centers. I want to be sure of that; otherwise we might fix the issue for one center and break it for another. I would also have to retrain the model, validate, update the code and the model, and test everything, so I don't want to do all of this without making sure it will improve other cases too. Right now I can't just change the normalization, because the model wasn't trained with it and would simply give poor results.
@perone I don't understand this argument. So far your function has been working fine with uint16 input, right? If that's the case, then how could a rescaling + type change (float --> uint16), which produces an intensity distribution similar to "unprocessed data that is uint16 in the first place", be problematic? I don't understand what this has to do with the way the model has been trained; to me this is a completely separate issue. Maybe I am missing something?
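The rescaling + type change being proposed here could be sketched as follows. This is a hypothetical illustration, not SCT code: the function name, the min-max mapping, and the target range are all assumptions about what "rescale and cast to uint16" would mean.

```python
import numpy as np

def rescale_to_uint16(data: np.ndarray) -> np.ndarray:
    """Linearly map arbitrary float intensities onto the uint16 range.

    Hypothetical sketch of the rescaling+changeType idea discussed
    in this thread; not the actual SCT implementation.
    """
    dmin, dmax = float(data.min()), float(data.max())
    if dmax == dmin:
        # Flat volume: avoid division by zero, return all zeros.
        return np.zeros(data.shape, dtype=np.uint16)
    scaled = (data - dmin) / (dmax - dmin)   # map to [0, 1]
    return (scaled * 65535.0).astype(np.uint16)

# A zero-centered float volume, like the one reported in this issue.
vol = np.random.default_rng(1).normal(0.0, 1.0, size=(4, 4, 4))
out = rescale_to_uint16(vol)
print(out.dtype, out.min(), out.max())
```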
I think we're probably discussing two different things. If we look at the file
What I was referring to is that this volume has a float data type but was normalized to be zero-centered (the intensities are all around 0.0). Now, the |
@perone My understanding is that the mean/std of this particular dataset is very different from the average mean/std of your trained dataset, right? So my suggestion was to rescale the input data at the beginning of |
The first phrase is correct: the mean/std is very different for this dataset. However, we don't expect a particular range but a zero-centered, unit-variance distribution normalized with the training-set parameters, and that normalization (using parameters estimated on the training set) is where the problem lies. For instance, if I remove the pre-processing now, it won't work (the distribution of this volume is very different than the other trained volumes, and is actually not centered/unit-variance). However, if I normalize this volume to be zero-centered and unit-variance (it wasn't before; I really don't know why the intensities are like that in this volume), it yields good results.

However, if we change the normalization to that scheme, it might work for this case but not for others. That's why we need to retrain the model with another normalization that is robust to this, maybe per-slice, but I need to test, compare results, and validate; otherwise we might end up solving the problem for some cases and not for others. My proposal is: let me finish testing the different normalizations for the domain adaptation, and then I can use the same normalization in the

PS: The network will always expect floats. I don't think we should use uint16 for anything here; even if the volume uses integers, we should use the |
@perone ok, I get it about the range. I also get it that you would like to work on another intensity normalization method, based on the discussion we had previously this week. I agree this has to change, so it's great that you are looking into it. What I still don't understand however, is why you don't want to standardize all input images at the beginning of your pipeline, without touching the model. You mentioned that "it might work for this case, but not for others", but unless we try it, we'll never know. Maybe it is a good fix for now. The reason I am pushing for that route, is that the deadline for the GM challenge is coming in a few days, and we need to have a stable version of SCT ready for it. My feeling is that your investigations will take more than 1-2 days, which is why I would like to at least try the standardization. If it works, then great! We'll have a working SCT to present to the community, and you can continue working on better normalization strategies for the coming weeks/months. Now about the question: "how can we make sure it will not break with other images?". I'm working on it (#1750, #1747, working branch: |
Ah, ok, I didn't know about the GM challenge deadline! I'll work on a PR to change the normalization of the model then, and submit it soon. There is also another alternative: instead of changing it, we can add a flag that switches the normalization. Which do you think is better: a flag that can be used, or changing the current normalization we have?
How about you try with systematic standardization, we test it on the large database, and if we are happy we won't need the flag (it might be confusing for users)?
Okay, by systematic standardization you mean volume-wise standardization, i.e., making each volume zero-centered and unit-variance, right? Just to be sure.
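The volume-wise standardization being agreed on here can be sketched in a few lines. This is a minimal illustration of the idea, not the SCT implementation; the function name and the flat-volume guard are assumptions.

```python
import numpy as np

def standardize_volume(data: np.ndarray) -> np.ndarray:
    """Volume-wise standardization: subtract the volume's own mean and
    divide by its own standard deviation, giving zero mean and unit
    variance regardless of the original intensity scale.

    Sketch of the 'systematic standardization' discussed above;
    not the actual SCT code.
    """
    std = data.std()
    if std == 0:
        # Flat volume: just center it to avoid division by zero.
        return data - data.mean()
    return (data - data.mean()) / std

# Works the same for raw scanner intensities (e.g. mean ~900) and for
# data a site has already normalized: the output distribution matches.
vol = np.random.default_rng(2).normal(900.0, 35.0, size=(8, 8, 8))
z = standardize_volume(vol)
```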
While we are working on this issue, a workaround is to multiply by 1000 before running deepseg:
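The command itself is not shown above, but the effect of the workaround can be sketched with NumPy (file I/O with nibabel omitted; the array values are illustrative). Scaling a zero-centered volume by 1000 widens its dynamic range so it looks more like unnormalized scanner data before segmentation is run.

```python
import numpy as np

# A z-scored volume like the one in this issue: intensities around 0.
vol = np.random.default_rng(3).normal(0.0, 1.0, size=(16, 16, 16)).astype(np.float32)

# Workaround: multiply by 1000 so the dynamic range resembles typical
# unnormalized scanner data before running the deep segmentation.
scaled = vol * 1000.0
```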
also see: https://sourceforge.net/p/spinalcordtoolbox/discussion/help/thread/66475b3e/ |
Description
The function runs successfully, but the resulting segmentation image is empty.
Steps to Reproduce
Data: sct_testing/issues/20180517_issue1740
Run:
State of spinalcordtoolbox