This repository has been archived by the owner on Feb 22, 2020. It is now read-only.

Nodules augmentation #294

Open
1 task
vessemer opened this issue Jan 23, 2018 · 8 comments
Comments

@vessemer
Contributor

This issue is dedicated to one possible improvement to the current pipeline of the grt123 algorithm.
As noted earlier, the authors used under- and oversampling w.r.t. nodule diameters, combined with hard negative mining, to train on imbalanced data. The data were augmented on each iteration to preserve generalisation capability.

For both networks data augmentation is used to artificially increase the amount of data on which they can be trained.

The augmentation described in their pipeline consists of simple affine transformations, which were described and implemented in PR #132. Although that is good enough to achieve strong results, I think there is room for investigation and a potential gap to fill. My proposal is to classify nodules by their appearance and proximity to other structures: juxtapleural, which sit on the boundary (pleural surface) of the lungs; juxtavascular, which appear on blood vessels and typically grow most rapidly [1]; and isolated, which lie inside the lung. The three types are depicted below, in that order.
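One way the three categories could be told apart automatically is sketched below. It is only an illustration: the function names, the `margin` parameter and the contact rule (a nodule counts as juxtapleural if its dilated mask leaves the lung mask, juxtavascular if it touches the vessel mask) are assumptions made for this example, not part of the grt123 pipeline or of #138.

```python
import numpy as np


def dilate(mask, iterations=1):
    """Binary dilation with a face-connected neighbourhood, using NumPy only."""
    out = mask.astype(bool)
    for _ in range(iterations):
        padded = np.pad(out, 1, mode="constant", constant_values=False)
        grown = out.copy()
        for axis in range(out.ndim):
            for shift in (0, 2):  # neighbour at index - 1 and index + 1
                index = [slice(1, -1)] * out.ndim
                index[axis] = slice(shift, shift + out.shape[axis])
                grown |= padded[tuple(index)]
        out = grown
    return out


def classify_nodule(nodule_mask, lung_mask, vessel_mask, margin=1):
    """Return 'juxtapleural', 'juxtavascular' or 'isolated' (hypothetical rule)."""
    grown = dilate(nodule_mask, margin)
    # Reaching outside the lung field is treated as contact with the pleura.
    if np.any(grown & ~lung_mask.astype(bool)):
        return "juxtapleural"
    # Otherwise, touching a segmented vessel makes the nodule juxtavascular.
    if np.any(grown & vessel_mask.astype(bool)):
        return "juxtavascular"
    return "isolated"
```

The masks are assumed to be boolean arrays of the same shape (2D slices or 3D volumes both work); `margin` is the contact tolerance in voxels.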

Then, since we can segment nodules well, we can crop them out and, based on the vessel and lung segmentation (described in #138), find appropriate spots to place them, with the aim of artificially enlarging the dataset and thereby improving the generalisation capability of the grt123 model.
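The crop-and-place step could look roughly like the sketch below. It is a minimal illustration under stated assumptions: `paste_nodule` and `free_mask` are hypothetical names, `free_mask` is assumed to be derived from the lung and vessel segmentation and to mark voxels where a nodule may be placed, and the destination patch is assumed to be at least as large as the nodule's bounding box.

```python
import numpy as np


def paste_nodule(src_patch, nodule_mask, dst_patch, free_mask, rng):
    """Copy the masked nodule voxels from src_patch into a random valid spot of dst_patch.

    Returns the augmented patch and the mask of the pasted nodule,
    or (unchanged copy, None) if no valid spot exists.
    """
    nodule_mask = nodule_mask.astype(bool)
    free_mask = free_mask.astype(bool)

    # Tight bounding box around the segmented nodule.
    coords = np.argwhere(nodule_mask)
    lo, hi = coords.min(axis=0), coords.max(axis=0) + 1
    box = tuple(slice(int(a), int(b)) for a, b in zip(lo, hi))
    crop, crop_mask = src_patch[box], nodule_mask[box]
    size = [int(s) for s in hi - lo]

    # Collect every offset where the whole nodule footprint lies on free voxels.
    extents = tuple(d - s + 1 for d, s in zip(dst_patch.shape, size))
    candidates = []
    for offset in np.ndindex(*extents):
        window = tuple(slice(o, o + s) for o, s in zip(offset, size))
        if np.all(free_mask[window][crop_mask]):
            candidates.append(window)
    if not candidates:
        return dst_patch.copy(), None

    window = candidates[rng.integers(len(candidates))]
    out = dst_patch.copy()
    out[window][crop_mask] = crop[crop_mask]  # writes through the slice view

    new_mask = np.zeros(dst_patch.shape, dtype=bool)
    new_mask[window] = crop_mask
    return out, new_mask
```

A call would be along the lines of `paste_nodule(src, mask, dst, free, np.random.default_rng(0))`. The sketch copies intensities as they are; blending the nodule border into the surrounding tissue is left out and would probably matter in practice.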

Any thoughts will be highly appreciated!

Acceptance criteria

  • at least 3-fold cross-validation should be performed, reporting log loss and CPM.
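The evaluation asked for above might be wired up as in the sketch below. The helper names are made up for this example. CPM is read here as the LUNA16 competition performance metric, i.e. the mean sensitivity at 1/8, 1/4, 1/2, 1, 2, 4 and 8 false positives per scan; that reading is an assumption, since the issue does not define it.

```python
import numpy as np


def kfold_indices(n_samples, n_folds=3, seed=0):
    """Yield (train_idx, val_idx) pairs for a shuffled k-fold split."""
    order = np.random.default_rng(seed).permutation(n_samples)
    folds = np.array_split(order, n_folds)
    for i in range(n_folds):
        train = np.concatenate([f for j, f in enumerate(folds) if j != i])
        yield train, folds[i]


def log_loss(y_true, y_prob, eps=1e-15):
    """Binary cross-entropy averaged over samples."""
    y_true = np.asarray(y_true, dtype=float)
    p = np.clip(np.asarray(y_prob, dtype=float), eps, 1 - eps)
    return float(-np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p)))


def cpm(fps_per_scan, sensitivity):
    """Mean sensitivity at 1/8 ... 8 FPs per scan, interpolated from FROC points.

    fps_per_scan must be sorted in increasing order.
    """
    targets = [0.125, 0.25, 0.5, 1, 2, 4, 8]
    return float(np.mean(np.interp(targets, fps_per_scan, sensitivity)))
```

Splitting by scan (or patient) rather than by individual patch is probably the safer choice, so that patches from one scan never end up in both the training and the validation fold.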
@vessemer
Contributor Author

I'm going to work on this issue in a while,
but first, I'm looking for public opinion and cooperation :)

@reubano
Contributor

reubano commented Jan 25, 2018

I get the part about segmenting nodules based on type (1 of the 3 described). I don't get how that leads to better data augmentation though.

@vessemer
Contributor Author

@reubano, suppose we have two patches, one with a nodule and one without:

[image: init]

If we can segment out the nodule carefully, it can be cropped and inserted into a free spot, like this:

[image: transformed]

@reubano
Contributor

reubano commented Jan 25, 2018

I see. So you mean to say that knowing which of the three types of nodules we are dealing with leads to better nodule cropping?

@WGierke
Contributor

WGierke commented Jan 25, 2018

Are you sure this does not introduce too much bias? How do we know that nodules can actually occur at the positions where we're planting them?

@vessemer
Contributor Author

@WGierke, that was my concern too, though we could address it by extracting probability maps of possible locations, or just try it as is.
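A probability map of plausible locations could be built roughly as follows. This is a sketch under assumptions: the function names are hypothetical, nodule centroids are assumed to be already mapped onto a common grid (for example, coordinates normalised to the lung bounding box), and `smoothing` is an illustrative additive term that keeps unseen cells from getting exactly zero probability.

```python
import numpy as np


def location_prior(centroids, shape, smoothing=1.0):
    """Turn observed nodule centroids (grid coordinates) into a probability map."""
    counts = np.full(shape, smoothing, dtype=float)
    for c in centroids:
        counts[tuple(c)] += 1.0
    return counts / counts.sum()


def sample_location(prior, allowed_mask, rng):
    """Draw one grid position from the prior, restricted to allowed cells."""
    weights = prior * allowed_mask
    total = weights.sum()
    if total == 0:
        return None  # nothing allowed has non-zero probability
    flat = rng.choice(weights.size, p=(weights / total).ravel())
    return tuple(int(i) for i in np.unravel_index(flat, prior.shape))
```

Here `allowed_mask` would come from the lung and vessel segmentation, so an inserted nodule lands only where nodules of its type have been observed and where the anatomy leaves room for it.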

@vessemer
Contributor Author

@reubano, I meant that knowing which of the three types of nodules we are dealing with leads to better nodule insertion :)

@reubano
Contributor

reubano commented Jan 25, 2018

ahhh ok, makes sense!
