#120 Lung Segmentation (Hounsfield based) #133

Merged
merged 2 commits into from Sep 26, 2017

WGierke commented Sep 21, 2017

I refactored the current identification algorithm, which is based on the approach of Julian de Wit and uses Hounsfield values between -1000 and 400 to segment lungs.
As a byproduct, the algorithm already saves images of pre-processed scans and the related lung masks. I refactored the structure so that other algorithms can access the segmentation method more easily as well.
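
For readers unfamiliar with the Hounsfield window mentioned above, here is a minimal sketch of clipping a scan to the -1000 to 400 HU range and rescaling it to [0, 1]. The helper name `normalize_hu` and the constant names are assumptions made for illustration, not the code in this PR, and the actual segmentation involves further steps beyond this windowing.

```python
import numpy as np

# Window bounds in Hounsfield units, taken from the PR description.
MIN_HU = -1000.0
MAX_HU = 400.0


def normalize_hu(image):
    """Clip a scan to [MIN_HU, MAX_HU] and rescale it to [0, 1].

    Hypothetical helper for illustration only.
    """
    image = np.asarray(image, dtype=np.float32)
    clipped = np.clip(image, MIN_HU, MAX_HU)
    return (clipped - MIN_HU) / (MAX_HU - MIN_HU)


# Values at or below -1000 HU map to 0.0, values at or above 400 HU to 1.0.
scan = np.array([[-2000, -1000], [-300, 1200]])
normalized = normalize_hu(scan)
```

Clipping first keeps outliers (for example padding values outside the scanner's field of view) from stretching the intensity range.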

Reference to official issue

This addresses #120.

Motivation and Context

Every nodule identification algorithm is supposed to somehow segment the lung first. Thus, it makes sense to have one separate function that is shared among the different identification algorithms.

How Has This Been Tested?

I ran the nodule identification test, which saved the segmented lungs as PNG files to data/extracted/USER_ID/. From these files I assembled the following GIF, which shows how well the lung was segmented.
[Image: result1]
I'm very open to suggestions for more "code-heavy" tests ;)

CLA

  • I have signed the CLA; if other committers are in the commit history, they have signed the CLA as well
EXTRACTED_IMAGE_DIR = "data/extracted/"
TARGET_VOXEL_MM = 1.00

target_dir = EXTRACTED_IMAGE_DIR + patient_id + "/"

lamby Sep 22, 2017

Contributor

os.path.join??
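
A minimal sketch of what this suggestion might look like for the path above. The `patient_id` value is a placeholder chosen for illustration; the rest mirrors the names in the diff.

```python
import os

EXTRACTED_IMAGE_DIR = "data/extracted/"
patient_id = "USER_ID"  # placeholder value for illustration

# os.path.join inserts a separator only where one is needed, so it does
# not matter whether the base directory ends with a slash.
target_dir = os.path.join(EXTRACTED_IMAGE_DIR, patient_id)
```

Note that the joined path has no trailing slash, so any later string concatenation onto `target_dir` would also need to become an `os.path.join` call.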

for i in range(image.shape[0]):
    patient_dir = target_dir
    if not os.path.exists(patient_dir):
        os.mkdir(patient_dir)

lamby Sep 22, 2017

Contributor

I think there is mkdirs with some kind of ignore flag? :)
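
The function being alluded to is most likely `os.makedirs` with `exist_ok=True` (available since Python 3.2). A minimal sketch, using a temporary directory as a stand-in for the real output location:

```python
import os
import tempfile

# Throwaway base directory so the example has no side effects elsewhere.
base_dir = tempfile.mkdtemp()
patient_dir = os.path.join(base_dir, "extracted", "USER_ID")

# Creates missing intermediate directories and does not raise if the
# target directory already exists.
os.makedirs(patient_dir, exist_ok=True)
os.makedirs(patient_dir, exist_ok=True)  # second call changes nothing
```

This replaces the separate `os.path.exists` check and also avoids the race between checking and creating.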

if not invert_order:
    image = numpy.flipud(image)

for i in range(image.shape[0]):

lamby Sep 22, 2017

Contributor

Typically one uses enumerate instead
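
A minimal sketch of the `enumerate` form, assuming the loop body needs both the slice index and the slice itself. The array shape and the file-name pattern are made up for illustration.

```python
import numpy as np

# Stand-in for a scan with shape (slices, height, width).
image = np.zeros((3, 4, 4))

slice_names = []
# Iterating over the first axis yields one 2D slice at a time;
# enumerate supplies the running index alongside it.
for i, slice_2d in enumerate(image):
    # Hypothetical file-name pattern, zero-padded to four digits.
    slice_names.append("slice_{:04d}.png".format(i))
```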

slices = []
for s in os.listdir(src_dir):
    try:
        slices.append(dicom.read_file(src_dir + '/' + s))

lamby Sep 22, 2017

Contributor

os.path.join etc

@WGierke WGierke referenced this pull request Sep 22, 2017

Closed

Lung Segmentation #120

0 of 3 tasks complete

reubano commented Sep 22, 2017

whoa... that gif is awesome!

isms commented Sep 22, 2017

@WGierke Simple (dumb?) idea for testing accuracy here: what if you used this segmentation to create a mask for every single image in the LIDC-IDRI data and then looked at what % of identified nodules fell inside the mask vs. outside the mask?

NB, this wouldn't be part of the ordinary test suite because it assumes that all the images are available but it would be helpful to know for science 🔬

WGierke commented Sep 22, 2017

@isms Thanks for your input. If I understand you correctly, the issue is that the current identification algorithm by De Wit only considers tissue inside this mask, so it cannot find nodules outside the lung at all. I could rewrite the identification algorithm to consider the whole image and not just the lung, but what would we expect to happen in that case? Detecting nodules only in the area that corresponds to the lung would be nice. However, if the algorithm detects nodules in areas that do not belong to the lung, the reason could also be that the model simply wasn't trained to distinguish between lung and non-lung tissue on its own. In that case the model itself would be at fault, so such detections would not prove that the lung segmentation is wrong.

isms commented Sep 23, 2017

@WGierke Actually, I was only remarking that if the goal is to see how well this segments lung vs not lung, and we assume that all labeled lung nodules are actually inside the lungs, seeing how many of the labeled nodules are inside these segments would tell us more about how good the lung/not-lung segmentation is.

Does that make sense?
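
The check described here can be sketched with plain NumPy, under the assumption that each labeled nodule is reduced to an integer (z, y, x) voxel centroid and the segmentation is a boolean volume. The function name `fraction_inside_mask` and the toy data are hypothetical and are not the test that was later added to the PR.

```python
import numpy as np


def fraction_inside_mask(mask, centroids):
    """Return the share of centroids that land on a True voxel of mask.

    mask: boolean array of shape (z, y, x).
    centroids: iterable of integer (z, y, x) voxel coordinates.
    """
    centroids = np.asarray(list(centroids), dtype=int)
    if centroids.size == 0:
        return float("nan")  # no nodules, so the share is undefined
    z, y, x = centroids.T
    return float(mask[z, y, x].mean())


# Toy data: a 4x4x4 volume whose "lung" is the central 2x2x2 block.
mask = np.zeros((4, 4, 4), dtype=bool)
mask[1:3, 1:3, 1:3] = True
nodules = [(1, 1, 1), (2, 2, 2), (0, 0, 0), (2, 1, 2)]
coverage = fraction_inside_mask(mask, nodules)
```

A value close to 1.0 over many scans would suggest the mask rarely cuts off annotated nodules; it says nothing about how much non-lung tissue the mask includes.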

@isms isms assigned WGierke and unassigned isms Sep 23, 2017

WGierke commented Sep 24, 2017

@isms Oh alright, sorry, now I got it. Yes, that definitely makes sense. I'll add that :)

@WGierke WGierke force-pushed the WGierke:120_lung_segmentation branch from 0f0e8e1 to c4d936c Sep 25, 2017

WGierke commented Sep 25, 2017

@isms I added a test that checks whether the annotations of the local LIDC images are inside the segmented lung masks. I'm using pylidc for that which makes it easier to query attributes of the LIDC images (like annotations). However, the library has to be installed after the other requirements, otherwise it throws some ModuleNotFoundErrors.
Update: I contacted the author and he fixed it in the new version. Now, pylidc can be installed like any other dependency.

@WGierke WGierke force-pushed the WGierke:120_lung_segmentation branch from 21b1a71 to 5de82d8 Sep 25, 2017

@WGierke WGierke force-pushed the WGierke:120_lung_segmentation branch from 5de82d8 to ad98632 Sep 25, 2017

isms commented Sep 25, 2017

@WGierke Excellent, thanks for the footwork there.

Pending @lamby's satisfaction with the requested changes and @reubano's review, I'm in favor of merging 👍

@isms isms requested a review from reubano Sep 25, 2017

for s in os.listdir(src_dir):
    try:
        slices.append(dicom.read_file(os.path.join(src_dir, s)))
    except Exception as e:

reubano Sep 25, 2017

Contributor

This is the only thing that really caught my eye. Specific exceptions are usually best. But in this case, the error is being logged, so it isn't that big of a deal. If you twisted my arm, I'd probably say something like this may be a bit cleaner (if SpecificException is in fact the only expected exception).

try:
    dicom_slice = dicom.read_file(os.path.join(src_dir, s))
except SpecificException:
    logging.error("{} is not a valid DICOM".format(s))
else:
    slices.append(dicom_slice)

WGierke Sep 25, 2017

Contributor

Good point, thanks! I replaced the twisted code by your suggestion ;)

lamby commented Sep 26, 2017

LGTM :)

@reubano reubano merged commit c448e44 into drivendataorg:master Sep 26, 2017

2 checks passed

concept-to-clinic/cla @WGierke has signed the CLA.
continuous-integration/travis-ci/pr The Travis CI build passed

@isms isms unassigned lamby and reubano Oct 17, 2018