Separation of functionality in pyradiomics and pyradiomicsbatch command line tools #203

Closed
fedorov opened this issue Feb 16, 2017 · 15 comments

Comments

@fedorov
Collaborator

fedorov commented Feb 16, 2017

@Radiomics/developers: is there a good reason to have these two tools, as opposed to a single pyradiomics command line tool that supports both a single input and a directory?

Among other things, having a single processing script might help make things more straightforward with Docker deployment.

@JoostJM
Collaborator

JoostJM commented Feb 17, 2017

@fedorov, I made 2 scripts at first because they represent two types of pyradiomics usage, which differ mainly in how the input and output are provided. I think it shouldn't be much of a problem to create 1 script. The main issue would be how to handle the fact that pyradiomics requires two separate input files (representing one combination), whereas pyradiomicsbatch requires only 1 file (a csv file listing the combinations).

@fedorov
Collaborator Author

fedorov commented Feb 17, 2017

We could just have different command line flags, but perhaps you want to simplify by allowing usage when no options are needed. Alternatively, we could make the script automatically detect whether the input is a directory or file. I don't have strong preference, just wanted to discuss this.

@JoostJM
Collaborator

JoostJM commented Feb 17, 2017

@fedorov, currently the input files are positional arguments, with the output file optional (and parameterized) in pyradiomics, and positional and required in pyradiomicsbatch. Automatically detecting folders would not work, as the input for the batch is also a file, not a directory. We could check the extension instead, with .csv pointing to batch processing and anything else to single image processing.
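The extension check proposed above could look roughly like the following. This is only a minimal sketch: `is_batch_input` is a hypothetical helper name, not part of pyradiomics, and it assumes that batch inputs always carry a `.csv` extension.

```python
from pathlib import Path


def is_batch_input(input_path: str) -> bool:
    """Return True when the input looks like a batch CSV file (hypothetical helper)."""
    # Only the file extension is inspected; the file itself is never opened.
    return Path(input_path).suffix.lower() == ".csv"


# Example file names are made up for illustration.
for candidate in ("cases.csv", "brain1_image.nrrd"):
    mode = "batch" if is_batch_input(candidate) else "single"
    print(candidate, "->", mode)
```

A check like this is cheap, but it would misclassify a CSV file saved under another extension, so an explicit flag or subcommand may still be the safer choice.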

@fedorov
Collaborator Author

fedorov commented Feb 22, 2017

As discussed at the meeting, this is postponed for further discussion and for another release.

@fedorov
Collaborator Author

fedorov commented Feb 22, 2017

Todo for myself - rebase after v1.1.0 is out

@alannavial

Could you please provide an example input file for the pyradiomicsbatch command? The formatting for the batch file is confusing. It makes sense that you need to provide a path to the image and mask along with a patient ID to identify the separate files. However, I don't understand what you are supposed to put for 2) sequence name (image identifier) and 3) reader (segmentation identifier).

"The input file for batch processing is a CSV file where each row represents one combination of an image and a segmentation and contains 5 elements: 1) patient ID, 2) sequence name (image identifier), 3) reader (segmentation identifier), 4) path/to/image, 5) path/to/mask."

@JoostJM
Collaborator

JoostJM commented Feb 28, 2017

@alannavial, this is because a patient ID alone is not enough. A patient can have several sequences (images, e.g. multiparametric MRI, multiple phase CT) and each image can have multiple segmentations (different structures, different readers). To ensure each extraction has a unique identifier, the identifier is composed of patient-sequence-segmentation. These are the first three elements of each line and are not 'active' fields; they are just copied to the output. Only elements 4 and 5 (image and label location) are used for the extraction. If you don't have separate readers or sequences, you can fill in anything you want (I usually use "N/A" in these cases). However, the code expects 5 elements, so for the moment, make sure you don't omit the sequence and reader.

I will update the batch processing to be more flexible (i.e. copy every line, and use the last two fields as image and mask location). This will be part of a new release.
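To make the format concrete, here is a small sketch of what such a batch file and the more flexible reading described above might look like. The patient IDs, sequence names, reader names and paths are invented, the sketch assumes there is no header row, and `read_cases` is a hypothetical helper rather than the pyradiomicsbatch implementation.

```python
import csv
import io

# Made-up example rows: patient ID, sequence, reader, image path, mask path.
EXAMPLE_CSV = """\
patient-01,T2,reader-A,/data/patient-01/t2.nrrd,/data/patient-01/t2_readerA_mask.nrrd
patient-01,T2,reader-B,/data/patient-01/t2.nrrd,/data/patient-01/t2_readerB_mask.nrrd
patient-02,N/A,N/A,/data/patient-02/ct.nrrd,/data/patient-02/ct_mask.nrrd
"""


def read_cases(handle):
    """Yield (identifier_fields, image_path, mask_path) for each non-empty row."""
    for row_number, row in enumerate(csv.reader(handle), start=1):
        if not row:
            continue  # skip blank lines
        if len(row) < 2:
            raise ValueError(f"row {row_number}: need at least an image and a mask path")
        # Leading fields are copied through untouched; only the last two are 'active'.
        *identifier_fields, image_path, mask_path = [field.strip() for field in row]
        yield identifier_fields, image_path, mask_path


cases = list(read_cases(io.StringIO(EXAMPLE_CSV)))
for identifier_fields, image_path, mask_path in cases:
    print("-".join(identifier_fields), image_path, mask_path)
```

The first two rows show one image segmented by two readers, and the third shows the "N/A" placeholders for a case without separate sequences or readers.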

@alannavial

Hi @JoostJM, thank you for clearing that up. My main confusion was what you meant by the terms reader and sequence. I think it would be best to provide examples of what would go in these fields to make it clearer. As it is, I'm still unsure what would go in the reader (segmentation identifier) field.

Also as an additional question, have you looked into adapting your toolbox to read DICOM-RT file formats? Most institutions seem to be using DICOM-RT more commonly now.

@JoostJM
Collaborator

JoostJM commented Feb 28, 2017

@alannavial, simplest example for reader: the filename of the mask.

As to your additional question, we are looking into this, but there is no support yet. It is possible to build an extension for 3D Slicer which enables use of pyradiomics via the Slicer interface. This could potentially be combined with other Slicer modules which can read DICOM-RT. However, this has not been tested yet.

@JoostJM
Collaborator

JoostJM commented Feb 14, 2018

@fedorov As previously discussed, I'm going to take a second look at the commandline scripts with the goal of having 1 entry point with different subcommands (like the git commandline tools).
Currently I'm thinking about the following. Do you have any additions/comments?

  • pyradiomics: general entry point, only provides some information on the possible tools, maybe an XML description for use in SlicerCLIs?
  • pyradiomics single: extract features for 1 set of image + segmentation
  • pyradiomics batch: extract features for multiple sets of image + segmentation (supplied in .csv input; "batch mode")
  • pyradiomics voxel: extract voxel-based parameter maps for 1 set of image + segmentation (after merging #337, "ADD: Add voxel-based extraction")
  • pyradiomics voxel-batch: extract voxel-based parameter maps in batch mode (after merging #337, "ADD: Add voxel-based extraction")
  • pyradiomics model: apply a model to 1 set of image + segmentation (after merging #338, "Add PyRadiomics Models")
  • pyradiomics model-batch: apply a model in batch mode (after merging #338, "Add PyRadiomics Models")
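For illustration, a git-style layout like the one listed above can be sketched with argparse subparsers. This is only a sketch of the proposal, not the actual pyradiomics command line: only the `single` and `batch` subcommands are shown, and the argument names `image`, `mask` and `cases_csv` are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with one subcommand per operating mode (illustrative only)."""
    parser = argparse.ArgumentParser(prog="pyradiomics")
    subparsers = parser.add_subparsers(dest="command")

    single = subparsers.add_parser("single", help="extract features for 1 image + segmentation")
    single.add_argument("image")
    single.add_argument("mask")

    batch = subparsers.add_parser("batch", help="extract features for cases listed in a .csv file")
    batch.add_argument("cases_csv")
    return parser


def main(argv=None) -> int:
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.command is None:
        # Bare entry point: only show information on the available tools.
        parser.print_help()
        return 0
    print("selected subcommand:", args.command)
    return 0


main(["batch", "cases.csv"])
```

Because each subcommand owns its positional arguments, `single` can require an image and a mask while `batch` requires only the csv file.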

@fedorov
Collaborator Author

fedorov commented Feb 19, 2018

A few ideas to consider/discuss:

  • instead of using "single/batch/model", maybe better to use "--operation [single|batch|model]"? This would also match better the capabilities of the Slicer CLI XML
  • how about, instead of adding the "voxel" mode, assuming that a voxel map is requested when the label is not specified, and logging warnings for those features that require the presence of the label?
  • I really like the idea of merging pyradiomics and pyradiomicsbatch, since this will reduce duplication and simplify maintenance

@JoostJM
Collaborator

JoostJM commented Feb 20, 2018

> instead of using "single/batch/model", maybe better to use "--operation [single|batch|model]"? This would also match better the capabilities of the Slicer CLI XML

The main issue I see here is that --operation identifies it as an optional argument, whereas it's arguably the most important one. Moreover, the different operating modes also have different commandline requirements. For example: in single mode, you specify the image and mask directly, but in batch mode, you specify a csv file that lists the respective cases.

> how about instead of adding the "voxel" mode, assume that voxel map is requested when label is not specified, and log warnings for those features that required presence of the label?

A labelmap is also required in voxel-based extraction (for now; later we can make it optional). There are two reasons for this: 1) especially in large images, you don't want to perform a voxel-based extraction on the whole image, as this is much too computationally intensive, and 2) in voxel-based extraction, it is possible to mask the kernel with the ROI, ensuring the features are still only calculated on the ROI intensities.

> I really like the idea of merging pyradiomics and pyradiomicsbatch, since this will reduce duplication and simplify maintenance

+1

@JoostJM
Collaborator

JoostJM commented Feb 20, 2018

Alternatively, I think I can tweak it a bit to have similar command lines for single and batch: I can make the labelmap argument optional and 'detect' whether to operate in batch mode by checking if the Image argument (which I will rename to "Input") is a csv file.

In that case we can also use the --operation argument you suggested: if omitted, extraction is segment-based (segment). The full list of options would be --operation=[segment|voxel|model].
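A rough sketch of this alternative, assuming a positional `input`, an optional positional `mask`, and `--operation` defaulting to `segment`. The argument names and the `parse` helper are illustrative guesses, not the interface that was eventually merged.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="pyradiomics")
    parser.add_argument("input", help="image file, or a .csv file listing cases (batch mode)")
    parser.add_argument("mask", nargs="?", default=None, help="labelmap; omitted in batch mode")
    parser.add_argument("--operation", choices=["segment", "voxel", "model"], default="segment")
    return parser


def parse(argv):
    """Parse arguments and flag batch mode when the input is a .csv file."""
    parser = build_parser()
    args = parser.parse_args(argv)
    args.batch = Path(args.input).suffix.lower() == ".csv"
    if not args.batch and args.mask is None:
        # parser.error prints a usage message and exits with status 2.
        parser.error("a mask is required when the input is a single image")
    return args


for argv in (["cases.csv"], ["image.nrrd", "mask.nrrd", "--operation", "voxel"]):
    parsed = parse(argv)
    print(parsed.input, "batch" if parsed.batch else "single", parsed.operation)
```

The trade-off is that the mask can no longer be declared as required by the parser itself, so the check has to be done after parsing.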

@JoostJM JoostJM added this to the PyRadiomics 2.0 Release milestone Feb 20, 2018
@JoostJM JoostJM added this to In Progress in PyRadiomics build/installation Feb 20, 2018
@fedorov
Collaborator Author

fedorov commented Feb 20, 2018

Indeed, I didn't think about those points you raised. I agree with your points.

I suggest we should not optimize the command line parameters to deal with the limitations of Slicer CLI. The main goal should be to simplify the process for the users. Maybe it is indeed better to have separate command line tools rather than overloading one and making it too complicated to understand. Good discussion topic for tomorrow's call.

@JoostJM
Collaborator

JoostJM commented Mar 13, 2018

Fixed by #347

@JoostJM JoostJM closed this as completed Mar 13, 2018
PyRadiomics build/installation automation moved this from In Progress to Done Mar 13, 2018
Labels
Development