
Enhance C GLSZM #257

Merged
merged 3 commits into AIM-Harvard:master from update-c-glszm on May 29, 2017

Conversation

@JoostJM (Collaborator) commented May 22, 2017

Enhance the C extension algorithm for calculating GLSZM. Instead of searching over the entire image for voxels to process when growing a region, instantiate a stack (max size equal to the number of voxels) to hold the indices of voxels that have to be processed for the current region.

When an unprocessed voxel belonging to the ROI is found, a region is started by pushing the index of that voxel onto the stack. A loop then pops the last added voxel from the stack and checks its neighbours. If a neighbour is unprocessed and belongs to the same region (same gray level), its index is also pushed onto the stack. The loop runs until the stack is exhausted (i.e. no new neighbours are available, indicating the region is complete). The number of pops is counted and equals the size of the zone.

Reset the data type for the mask from signed char back to char, as negative values are no longer needed.
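The stack-based region growing described above can be sketched in Python for readability (the real implementation is a C extension working on 3D data; this 2D, 8-connected version with hypothetical names is only illustrative):

```python
def zone_sizes(image, offsets=((0, 1), (0, -1), (1, 0), (-1, 0),
                               (1, 1), (1, -1), (-1, 1), (-1, -1))):
    """Return (gray_level, zone_size) tuples using an explicit stack,
    instead of re-scanning the whole image for each growing region.
    Voxels outside the ROI are marked with None."""
    rows, cols = len(image), len(image[0])
    processed = [[False] * cols for _ in range(rows)]
    zones = []
    for r in range(rows):
        for c in range(cols):
            if processed[r][c] or image[r][c] is None:
                continue
            gray = image[r][c]
            stack = [(r, c)]          # max size bounded by the voxel count
            processed[r][c] = True    # mark on push, so nothing enters twice
            size = 0
            while stack:              # loop until the region is complete
                cr, cc = stack.pop()
                size += 1             # each pop grows the zone by one voxel
                for dr, dc in offsets:
                    nr, nc = cr + dr, cc + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and not processed[nr][nc]
                            and image[nr][nc] == gray):
                        processed[nr][nc] = True
                        stack.append((nr, nc))
            zones.append((gray, size))
    return zones
```

Each voxel is pushed and popped at most once, so the scan is linear in the number of voxels, which matches the profiling improvement reported below.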

Profiling results (previous → new C extension implementation):

- 5 test cases, normal mask: 56 ms → 6 ms
- brain1 test case, full mask: 1340015 ms → 4855 ms

Performance of the GLSZM calculation is now comparable to that of GLRLM (approx. 2.5x slower than GLCM).

[profiling screenshot: profiling_full_mask_new]

cc @Radiomics/developers
Fixes #256

@JoostJM (Collaborator, Author) commented May 22, 2017

This algorithm change does require more memory, as it instantiates a temporary array (size equal to the number of segmented voxels) to hold the found indices.

@pieper (Contributor) commented May 22, 2017

Big win! Nice work.

I don't suspect the memory will be a constraining factor in most use cases, but somewhere in the documentation we should have a list of practical considerations about estimated resource requirements (expected time and memory) for various processing scenarios and input types.

@JoostJM (Collaborator, Author) commented May 22, 2017

> I don't suspect the memory will be a constraining factor

I don't think so either. Especially if you consider it's just one array, equal in size to the ROI. Alternative performance enhancements, such as multithreading, would require at least as much extra memory, if not more.

@fedorov (Collaborator) commented May 22, 2017

Nice! Should we ask Dan to test this branch with the data that was problematic for him in the past?

@pieper (Contributor) commented May 22, 2017

Good idea: @blezek can you have a look?

Oh, and by the way, @JoostJM have you re-run some datasets to confirm the results match? Would be good to do a few real-world examples in case there are edge cases.

@JoostJM (Collaborator, Author) commented May 22, 2017 via email

@pieper (Contributor) commented May 22, 2017

I know it's not possible to run exhaustive testing, but if it's just a matter of re-running an existing script with the updated code, it would be a useful sanity check. In some cases changing the implementation might lead to slightly different results due to rounding errors or whatever, but here I think we'd expect identical results, so checking should be easy.

@blezek (Collaborator) commented May 23, 2017

@pieper, I took a look at @JoostJM's code changes. For my testing I was simply looking at some of the whole brain segmentations from the http://www.oasis-brains.org/ study. Not terribly representative of what we might wish to do with pyradiomics... I was gearing up to look at the ADNI hippocampus as a test.

Very interesting results. If I comment out line 29 of shape.py, pyradiomics runs nearly instantly. With the call to SetComputeFeretDiameter included, it takes 25 minutes! I would give a +1 to not computing the Feret diameter. Without it, the run took only 3 seconds, and the results were identical except for "original_shape_Maximum3DDiameter": 176.6918221084383 vs "original_shape_Maximum3DDiameter": 0.

@JoostJM (Collaborator, Author) commented May 23, 2017

@blezek, I found similar results (#256). I'll take a look at maximum 3D diameter and see if I can optimize.

@fedorov (Collaborator) commented May 23, 2017

As a related issue, I think it may be misleading to users of the library that, although we create the impression that calculation of individual features is possible, in reality this depends on the feature class. Most of the shape features are calculated in the constructor, and it is not possible for the user to disable, for example, the Feret diameter calculation.

Should we consider passing the list of features that should be calculated to the feature class constructor?
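A minimal sketch of that suggestion (hypothetical class and feature names, not the actual pyradiomics API): the constructor receives the features to compute, so expensive precomputations like the Feret/maximum diameter run only when actually requested.

```python
class ShapeFeatures:
    """Illustrative only: a feature class whose constructor takes the
    enabled features, skipping costly precomputation for anything else."""

    def __init__(self, voxel_count, enabled_features=('Volume',)):
        self.enabled = set(enabled_features)
        self.voxel_count = voxel_count
        self._results = {}
        # Only run the computations the user asked for.
        if 'Volume' in self.enabled:
            self._results['Volume'] = self._compute_volume()
        if 'Maximum3DDiameter' in self.enabled:
            self._results['Maximum3DDiameter'] = self._compute_max_diameter()

    def _compute_volume(self):
        return self.voxel_count  # placeholder: unit voxel spacing assumed

    def _compute_max_diameter(self):
        # stands in for the expensive Feret-diameter computation
        return 0.0

    def execute(self):
        return dict(self._results)
```

With this shape, omitting 'Maximum3DDiameter' from `enabled_features` would avoid the 25-minute computation blezek measured above.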

@pieper (Contributor) commented May 23, 2017

Also for reference here's the ITK Feret diameter calculation. I'm guessing it could be optimized.

https://github.com/InsightSoftwareConsortium/ITK/blob/9543d93dc85db3d67c6085664cffc08d946d2da2/Modules/Filtering/LabelMap/include/itkShapeLabelMapFilter.hxx#L390-L462

@JoostJM (Collaborator, Author) commented May 23, 2017 via email

@jcfr (Collaborator) commented May 23, 2017

> Also for reference here's the ITK Feret diameter calculation. I'm guessing it could be optimized.

@thewtex Just to keep you in the loop, no action item from your side.

@JoostJM (Collaborator, Author) commented May 24, 2017

> Also for reference here's the ITK Feret diameter calculation. I'm guessing it could be optimized.

This looks very similar to what I implemented in C. The main difference is that I also calculate the maximum 2D diameters in each direction and require the input to have 3 dimensions. (Latest profiling: +/- 2 mins for fully masked brain1.)

In essence, it compares all possible pairs of voxels on the border, which is on the order of numVox² comparisons. The PyRadiomics implementation compared only voxels on the upper bound (for either z, y, or x) with voxels on the lower bound (for either z, y, or x), which requires significantly fewer calculations. However, there are edge cases where the maximum diameter is NOT between a voxel on the lower bound and a voxel on the upper bound.
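The exhaustive pairwise comparison can be sketched as follows (illustrative Python, not the actual C extension; `border_voxels` and `spacing` are assumed inputs, with voxels given as (z, y, x) index tuples):

```python
from itertools import combinations
import math

def max_3d_diameter(border_voxels, spacing=(1.0, 1.0, 1.0)):
    """Brute-force maximum 3D diameter: the largest Euclidean distance
    between any pair of border voxels. Checks all numVox*(numVox-1)/2
    pairs, so it is O(numVox^2) but has no missed edge cases."""
    best = 0.0
    for (z1, y1, x1), (z2, y2, x2) in combinations(border_voxels, 2):
        d = math.sqrt(((z1 - z2) * spacing[0]) ** 2 +
                      ((y1 - y2) * spacing[1]) ** 2 +
                      ((x1 - x2) * spacing[2]) ** 2)
        if d > best:
            best = d
    return best
```

Restricting one side of each pair to bound voxels only, as the old PyRadiomics implementation did, shrinks the search but can miss the true maximum when it lies between two interior-bound voxels.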

Let's discuss this at the hangout today.

@JoostJM (Collaborator, Author) commented May 24, 2017

Latest profiling (brain1, full mask)

[profiling screenshot]

Add a C extension function to calculate maximum diameters in 2D and 3D.

Additionally, python implementation of maximum diameter can return incorrect values in some edge cases.
After discussion, we decided to remove support for python calculation of maximum diameters and have PyRadiomics return NaN when a user tries to compute these features in full-python mode.
@JoostJM JoostJM merged commit 75baf7b into AIM-Harvard:master May 29, 2017
@JoostJM JoostJM deleted the update-c-glszm branch May 29, 2017 06:51
@JoostJM JoostJM added this to Implemented in PyRadiomics Performance Feb 20, 2018
JoostJM added a commit to JoostJM/pyradiomics that referenced this pull request Oct 2, 2018
Related to pypa/wheel#257.

Newest version of wheel now expects the requirements object returned by `pkg_resources.Requirement.parse(req)` to have an url attribute.
Force the `pkg_resources` version to at least 0.3.0 to ensure this is the case.
JoostJM added a commit to JoostJM/pyradiomics that referenced this pull request Oct 3, 2018
Related to pypa/wheel#257.

Newest version of wheel now expects the requirements object returned by `pkg_resources.Requirement.parse(req)` to have an url attribute.
Change the installation of setuptools to ensure correct setuptools version is used.