Centralized all deep learning segmentation tasks with new function "sct_deepseg" and refactored sct_download_data #2639
Conversation
scripts/sct_deepseg.py
Outdated
# This command-line tool is the interface for the deepseg API that performs segmentation using deep learning from the
# ivadomed package.
You can write
-# This command-line tool is the interface for the deepseg API that performs segmentation using deep learning from the
-# ivadomed package.
+"""
+This command-line tool is the interface for the deepseg API that performs segmentation using deep learning from the
+ivadomed package.
+"""

To get pretty module-level docstrings. (python treats leading comments as module docstrings as well, but retains the #s, so I think using a triple-string is more idiomatic.)
indeed-- fixed here 021f205
spinalcordtoolbox/deepseg/core.py
Outdated
    :param param_deepseg: class ParamDeepseg: Segmentation parameters.
    :return: fname_out: str: Output filename.
    """
    nii_seg = segment_volume(folder_model, fname_image)
note: import ivadomed as imed
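A minimal sketch of what that note implies (the exact module path for segment_volume is an assumption, inferred from the ivadomed utils.py link referenced further down in this thread):

import ivadomed as imed  # alias the ivadomed package, as suggested in the note above

# hypothetical call site: folder_model and fname_image come from the surrounding function
nii_seg = imed.utils.segment_volume(folder_model, fname_image)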
spinalcordtoolbox/deepseg/models.py
Outdated
MODELS = {
    'cord-t2star': {'url': 'https://osf.io/8nsa6/',
The other day we said that the version is "encoded" in the url --> is it correct? (just checking if we do not need a version or date key)
fixed in c03c647
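For illustration, an entry where the version is carried by the URL itself could look like the following (values copied from the uqueensland entry shown later in this thread; purely illustrative):

MODELS = {
    'uqueensland_mice_sc': {
        # the model version is encoded in the download URL's query string
        'url': 'https://osf.io/nu3ma/download?version=2',
        'description': 'Cord segmentation on mouse MRI. Data from University of Queensland.',
    },
}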
spinalcordtoolbox/deepseg/models.py
Outdated
""" | ||
Check if model (.pt file) is installed under SCT directory | ||
""" | ||
if os.path.exists(os.path.join(self.folder, self.name + '.pt')): |
Should we test if the .json is also present? or could we reasonably assume that?
i was thinking about that. My reasoning here was that the json file is not "mandatory" to execute a segmentation. Only the model is, which is why I didn't check for it. Open for discussion.
My reasoning here was that the json file is not "mandatory" to execute a segmentation.
ah! actually i am wrong. It is mandatory! See: https://github.com/neuropoly/ivado-medical-imaging/blob/master/ivadomed/utils.py#L624.
Oh well, I will check for it then ;-)
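A minimal sketch of such a check, reusing the attribute names from the excerpt above (illustrative only, not necessarily what the final commit implements):

def is_installed(self):
    """
    Check if both the model (.pt) and its metadata (.json) are present under the SCT directory.
    """
    # assumes `import os` at module level, as in the excerpt above
    path_pt = os.path.join(self.folder, self.name + '.pt')
    path_json = os.path.join(self.folder, self.name + '.json')
    return os.path.exists(path_pt) and os.path.exists(path_json)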
Done in b469daa
Great !
spinalcordtoolbox/deepseg/models.py
Outdated
    'uqueensland_mice_sc':
        {'url': 'https://osf.io/nu3ma/download?version=2',
         'description': 'Cord segmentation on mouse MRI. Data from University of Queensland.'},
    'uqueensland_mice_gm':
According to the convention this should be
-'uqueensland_mice_gm':
+'mice_uqueensland_gm':
fixed in 57b99e3
Minor comment: I think the field
@jcohenadad: Please find below the terminal output I got. I followed the installation process you described in the first post of this PR.
indeed-- i added it manually-- we will need to add it to all models (along with the other postproc relevant fields)
spinalcordtoolbox/deepseg/core.py
Outdated
# Postprocessing
metadata = sct.deepseg.models.get_metadata(folder_model)
postproc = PostProcessing(param, metadata)
@kousu is it overkill to create a class for this, or should we simply define each postprocessing function with nii, param and metadata as input parameters?
I think a class doesn't offer anything here. A class is useful if you need to keep some state around for a while, but here you want to use the data immediately and get out.
I think what I would do is write it as
def postprocess(nii_seg, param, metadata):
    options = {**DEFAULTS, **metadata, **{k: v for k, v in param.__dict__.items() if v is not None}}
    if options['threshold']:
        nii_seg = threshold(nii_seg, options['threshold'])
    if options['keep_largest_object']:
        nii_seg = keep_largest_object(nii_seg)
    if options['fill_holes']:
        nii_seg = fill_holes(nii_seg)
    return nii_seg
or, if you take my suggestion about dropping ParamDeepseg, then
def postprocess(nii_seg, param, metadata):
    options = {**DEFAULTS, **metadata, **param}
    if options['threshold']:
        nii_seg = threshold(nii_seg, options['threshold'])
    if options['keep_largest_object']:
        nii_seg = keep_largest_object(nii_seg)
    if options['fill_holes']:
        nii_seg = fill_holes(nii_seg)
    return nii_seg
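As a tiny worked example of the merge semantics (values are illustrative only): later dicts win, so a user-supplied param overrides the model metadata, which in turn overrides the defaults.

DEFAULTS = {'threshold': 0.9, 'keep_largest_object': True, 'fill_holes': True}
metadata = {'threshold': 0.5}     # from the model's json file
param = {'fill_holes': False}     # from the user's CLI flags
options = {**DEFAULTS, **metadata, **param}
# options == {'threshold': 0.5, 'keep_largest_object': True, 'fill_holes': False}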
the "merge" approach is brilliant. Currently trying to implement it.
implemented in bea7a3c
spinalcordtoolbox/deepseg/core.py
Outdated
        nii_seg = imed.postprocessing.fill_holes(nii_seg)
    else:
        logger.warning("Algorithm 'fill holes' can only be run on binary segmentation. Skipping.")
        return nii_seg
This return is tabbed in one level too deep.
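In other words, something like this (the condition is abridged; is_binary is a hypothetical stand-in for the real check):

    if is_binary(nii_seg):  # hypothetical condition, abridged from the real code
        nii_seg = imed.postprocessing.fill_holes(nii_seg)
    else:
        logger.warning("Algorithm 'fill holes' can only be run on binary segmentation. Skipping.")
    return nii_seg  # dedented: return the segmentation whether or not holes were filled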
fixed in bea7a3c
spinalcordtoolbox/deepseg/core.py
Outdated
"Neither I nor S is present in code: {}, for affine matrix: {}".format(code, affine)) | ||
nii_seg = imed.postprocessing.keep_largest_object_per_slice(nii_seg, axis=axis_infsup) | ||
else: | ||
logger.warning("Algorithm 'keep largest object' can only be run on binary segmentation. Skipping.") |
Could this warning be an exception? "Errors should never pass silently. Unless explicitly silenced." advocates the zen of python.
well, technically it is not an error but just a warning that this step will be skipped. If we stop the program here, i anticipate that all users who try the flag "-thr 0" will see a crash, and since most users don't read their crash report, they will wrongly think that "-thr 0" doesn't work... so i'd rather give them an output and inform the small percentage of users who actually read reports that some algorithms have not run because of -thr 0.
@charleygros do you have any opinion?
I would agree that this is more a warning than an error, and would then agree to skip it.
Another approach would be to call the thresholding with a default value (eg 10e-3) in case the object is not binary (ie thr=0) when entering keep_largest_object, and then use the mask_prediction function to remove the small objects from the soft pred based on the binary output of keep_largest.
Another approach would be to call the thresholding with a default value (eg 10e-3) in case the object is not binary (ie thr=0) when entering in keep_largest_object and then use the mask_prediction function to remove the small objects from the soft pred based on the binary output of keep_largest.
yup i like this idea!
This should be implemented in ivadomed though, issue opened: ivadomed/ivadomed#205
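A rough sketch of that idea (the function names are taken from this discussion and assumed to live in imed.postprocessing; the signatures are assumptions, not verified ivadomed API):

def keep_largest_object_soft(nii_seg, default_thr=10e-3):
    """
    If the prediction is soft (thr=0), binarize it with a small default threshold, keep the
    largest object on the binary mask, then mask the soft prediction with that binary output.
    """
    nii_bin = imed.postprocessing.threshold_predictions(nii_seg, default_thr)  # assumed signature
    nii_bin = imed.postprocessing.keep_largest_object(nii_bin)                 # assumed signature
    return imed.postprocessing.mask_prediction(nii_seg, nii_bin)               # assumed signature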
spinalcordtoolbox/deepseg/core.py
Outdated
# TODO: This if/elif below is ugly. Cannot think of something better for now...
do_process = DEFAULT_KEEP_LARGEST_OBJECT
if self.param.keep_largest_object is True:
    do_process = True
elif self.param.keep_largest_object is None:
    if 'keep_largest_object' in self.metadata:
        do_process = self.metadata.keep_largest_object
if do_process:
I thought about what possible values everything could take here. It looks like you want self.param taking precedence, then self.metadata, then falling back to DEFAULT_, with None or not in ... meaning "undefined, so try the next option".
I would say you want do_process = param or metadata or default, except that will misinterpret False as "continue to fallback" instead of "stop here".
I rewrote this expression as a function I could test exhaustively with all the possible values; I made one simplification that I don't think would change the meaning -- to represent 'keep_largest_object' not in self.metadata as metadata is None.
from pprint import pprint

def do_process(param, metadata, default):
    if param is True:
        return True
    elif param is None:
        if metadata is not None:
            return metadata
    return default

pprint(
    [(param, metadata, default, "=>", do_process(param, metadata, default))
     for param in [True, False, None]
     for metadata in [True, False, None]
     for default in [True, False]])
this gave me this truth table:
[(True, True, True, '=>', True),
(True, True, False, '=>', True),
(True, False, True, '=>', True),
(True, False, False, '=>', True),
(True, None, True, '=>', True),
(True, None, False, '=>', True),
(False, True, True, '=>', True),
(False, True, False, '=>', False),
(False, False, True, '=>', True),
(False, False, False, '=>', False),
(False, None, True, '=>', True),
(False, None, False, '=>', False),
(None, True, True, '=>', True),
(None, True, False, '=>', True),
(None, False, True, '=>', False),
(None, False, False, '=>', False),
(None, None, True, '=>', True),
(None, None, False, '=>', False)]
In all cases where param is True, it's True (of course, that's written right into the logic). But when it's False it skips metadata and uses default instead.
Is that what you really meant? param=False and param=None both mean "continue", but metadata=False means "stop" yet metadata=None means "continue"? I think that's weird. I think the actual logic you want is
def do_process(param, metadata, default):
    if param is True:
        return True
    elif param is False:
        return False
    elif param is None:
        if metadata is not None:
            return metadata
    return default
which can be written shorter as
def do_process(param, metadata, default):
    if param is not None:
        return param
    if metadata is not None:
        return metadata
    return default
The truth table for this is
[(True, True, True, '=>', True),
(True, True, False, '=>', True),
(True, False, True, '=>', True),
(True, False, False, '=>', True),
(True, None, True, '=>', True),
(True, None, False, '=>', True),
(False, True, True, '=>', False),
(False, True, False, '=>', False),
(False, False, True, '=>', False),
(False, False, False, '=>', False),
(False, None, True, '=>', False),
(False, None, False, '=>', False),
(None, True, True, '=>', True),
(None, True, False, '=>', True),
(None, False, True, '=>', False),
(None, False, False, '=>', False),
(None, None, True, '=>', True),
(None, None, False, '=>', False)]
The other tricky thing I see is that 'undefined' is represented in two different ways: for self.param, it's represented by None; for self.metadata it's represented by an actual missing value. What should happen if someone set self.metadata['keep_largest_object'] = None? Or if someone did del self.param.keep_largest_object?
If you could change the representation this would be easier: delete ParamDeepseg and just tell users to pass a dictionary -- you could even write def __init__(self, metadata, **params). Without that, we're going to have to accept some baseline ugliness.
Anyway, if you accept my changing of your logic above and the need to deal with different representations, you can rewrite this with the ternary hook:
-# TODO: This if/elif below is ugly. Cannot think of something better for now...
-do_process = DEFAULT_KEEP_LARGEST_OBJECT
-if self.param.keep_largest_object is True:
-    do_process = True
-elif self.param.keep_largest_object is None:
-    if 'keep_largest_object' in self.metadata:
-        do_process = self.metadata.keep_largest_object
-if do_process:
+if (self.param.keep_largest_object
+        if self.param.keep_largest_object is not None
+        else self.metadata['keep_largest_object']
+        if 'keep_largest_object' in self.metadata
+        else DEFAULT_KEEP_LARGEST_OBJECT):
Python has a shortcut for the if .. in .. else part, dict.get(...[, default]):
-# TODO: This if/elif below is ugly. Cannot think of something better for now...
-do_process = DEFAULT_KEEP_LARGEST_OBJECT
-if self.param.keep_largest_object is True:
-    do_process = True
-elif self.param.keep_largest_object is None:
-    if 'keep_largest_object' in self.metadata:
-        do_process = self.metadata.keep_largest_object
-if do_process:
+if (self.param.keep_largest_object
+        if self.param.keep_largest_object is not None
+        else self.metadata.get('keep_largest_object',
+                               DEFAULT_KEEP_LARGEST_OBJECT)):
You could also map None => missing directly, and then make use of dict.get():
-# TODO: This if/elif below is ugly. Cannot think of something better for now...
-do_process = DEFAULT_KEEP_LARGEST_OBJECT
-if self.param.keep_largest_object is True:
-    do_process = True
-elif self.param.keep_largest_object is None:
-    if 'keep_largest_object' in self.metadata:
-        do_process = self.metadata.keep_largest_object
-if do_process:
+if ({k: v for k, v in self.param.__dict__.items() if v is not None}).get('keep_largest_object', self.metadata.get('keep_largest_object', DEFAULT_KEEP_LARGEST_OBJECT)):
Or what about viewing it explicitly as a merge? Apparently in Python 3.9 you will be able to write defaults | metadata | param, but in the meantime we have this {**defaults, **metadata, **param} syntax:
DEFAULTS = {
    'threshold': 0.9,
    'keep_largest_object': True,
    'fill_holes': True,
}

class PostProcessing:
    """
    Deals with post-processing of the segmentation. Consider param (i.e. user's flag) with more priority than
    metadata (i.e. from model's json file).
    """
    def __init__(self, param, metadata):
        """
        :param param: dict: Defined by user's parameter
        :param metadata: dict: From model's json metadata
        """
        # param overrides metadata overrides defaults
        self.param = {**DEFAULTS,
                      **metadata,
                      **param}

    [...]

    def keep_largest_object(self, nii_seg):
        """
        Only keep largest object
        """
        if self.param['keep_largest_object']:
            [...]
merged strategy implemented in bea7a3c
spinalcordtoolbox/deepseg/core.py
Outdated
def threshold(self, nii_seg):
    """
    Threshold the prediction. For no prediction, set 'threshold' to 0.
Is this a typo?
-Threshold the prediction. For no prediction, set 'threshold' to 0.
+Threshold the prediction. For no threshold, set 'threshold' to 0.
Yes-- Fixed in ce8a184
spinalcordtoolbox/deepseg/core.py
Outdated
    Threshold the prediction. For no prediction, set 'threshold' to 0.
    """
    if self.param.threshold:
You can use threshold = None to mean "no threshold" instead of letting any falsey value -- namely 0.0 -- trigger this. That would be more uniform with the other inputs.
Editorializing: It's a remnant of C's limited type system that we have this instinct to repurpose the meanings of specific values.
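A minimal sketch of that suggestion, written against the merged-dict version of param (the threshold_predictions call is an assumed ivadomed helper, used here only for illustration):

# None (rather than a falsey 0.0) means "no threshold requested"
if self.param['threshold'] is not None:
    nii_seg = imed.postprocessing.threshold_predictions(nii_seg, self.param['threshold'])  # assumed signature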
no longer relevant with the current implementation (ce8a184)
spinalcordtoolbox/download.py
Outdated
# if urls is not a list, make it one
if not isinstance(urls, (list, tuple)):
    urls = [urls]
This should be a bit more reliable:
-# if urls is not a list, make it one
-if not isinstance(urls, (list, tuple)):
-    urls = [urls]
+# if urls is not a list, make it one
+if isinstance(urls, str):
+    urls = [urls]
indeed-- fixed in 88a04b5
scripts/sct_deepseg.py
Outdated
args = parser.parse_args(args=None if sys.argv[1:] else ['--help'])
# TODO: instead of assigning each args param, we could pass args while instanciating ParamDeepseg(args), and the
# class would deal with assigning arguments to each field.
Instead of using ParamDeepseg, you could do something like:
-args = parser.parse_args(args=None if sys.argv[1:] else ['--help'])
-# TODO: instead of assigning each args param, we could pass args while instanciating ParamDeepseg(args), and the
-# class would deal with assigning arguments to each field.
+args = parser.parse_args(args=None if sys.argv[1:] else ['--help'])
+args = {k: v for k in vars(args).items() if v is not None}
+# separate out the segmentation `param` args from the top level args
+input = args.pop('i')
+list_models = args.pop('list_models')
+install_model = args.pop('install_model')
+install_default_models = args.pop('install_default_models')
+...
You'd have to remove all the default values/make sure they're set to None; default=param_default.keep_largest_object is an extra layer of defaults which might accidentally override a model's metadata even when the user doesn't ask for it.
actually it fails because v is not defined
You have to watch out because the argparse names don't match the postprocess() names. You have args.thr vs param.threshold and args.keep_largest vs param.keep_largest_object. You can use dest to fixup the naming, or just change the names of the flags in the CLI.
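For instance, a sketch of that dest= fixup (flag spellings here are taken from this thread and may not match the final CLI):

# leave default=None so that model metadata / DEFAULTS are not accidentally overridden
parser.add_argument('-thr', dest='threshold', type=float, default=None,
                    help="Binarization threshold. Set to 0 for no threshold.")
parser.add_argument('-keep-largest', dest='keep_largest_object', type=int, default=None,
                    help="Keep only the largest connected object.")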
You could be explicit about which subset is destined for algorithm parameters instead of using .pop() after the fact to pull off the other args:
params = ['keep_largest_object', 'fill_holes', 'threshold']
params = {k: vars(args)[k] for k in params if vars(args)[k] is not None}
Oh sorry!! Let me try that again:
-args = parser.parse_args(args=None if sys.argv[1:] else ['--help'])
-# TODO: instead of assigning each args param, we could pass args while instanciating ParamDeepseg(args), and the
-# class would deal with assigning arguments to each field.
+args = parser.parse_args(args=None if sys.argv[1:] else ['--help'])
+args = {k: v for k, v in vars(args).items() if v is not None}
+# separate out the segmentation `param` args from the top level args
+input = args.pop('i')
+list_models = args.pop('list_models')
+install_model = args.pop('install_model')
+install_default_models = args.pop('install_default_models')
+...
@kousu how about not "popping" fields from args, and just passing args inside deepseg.core(args), because the first two arguments (fname_input and path_model) are already included in args.
Are you worried that we would be passing a bunch of irrelevant fields inside deepseg.core?
Travis crashes for two reasons:
An important subtlety: disables pip's progress bar on Travis. This is a small change but it does mean Travis logs are not quite comparable after this point.
The build is failing at
This is because does not parse pip VCS URLs. My best suggestion is to remove it. My second best suggestion is to edit that code to parse VCS URLs; but that seems tricky because the docs don't even document the
This is good now. Let it in!
EDIT ..no it's not. CI is failing now.
We need to figure out why CI is broken.
Fixes recent crashes in CI due to new ivadomed 1.1 version
yess! CI is now passing. Issue was caused by ivadomed being recently bumped to 1.1, and this change created compatibility issues with the current models (good thing our CI tests are sensitive 🙏). Issue was fixed by updating the URLs, which are now pointing to up-to-date models.
With the need for more custom deep learning segmentation models and architectures, as well as new tasks (e.g. multi-class segmentation, CSF segmentation, etc.), it becomes more convenient to have a single CLI function, called sct_deepseg, that calls a specific model. One notable feature of this new function is that it will rely on the actively-maintained ivadomed package. Another notable change is that we will now rely on the Pytorch framework (vs. Tensorflow).

For now, this new function will not replace sct_deepseg_X, but will be complementary to the existing functions, so we can gather feedback from users before making these functions deprecated. The "big" change will likely be associated with a major release (e.g. 5.0.0).

Changes include:
- sct_deepseg that can point to a specific model. Fixes "Centralize deep-learning segmentation tools into a single entry point and module" #2626
- $SCT_DIR/models
- sct.__models_dir__
- sct.utils: add_suffix(), extract_fname(), tmp_create()
- sct_deepseg -download-default-models
- requirements.txt
- sct_download_data, and move some functions under spinalcordtoolbox.download
Todo in subsequent PRs:
- threshold: Populate deepseg models and fill metadata fields #2695

Test this branch on a development ivadomed version