Nodules augmentation #294

Open
vessemer opened this Issue Jan 23, 2018 · 8 comments

vessemer commented Jan 23, 2018

This issue is dedicated to one possible improvement to the current pipeline of the grt123 algorithm.
As mentioned before, its authors combined under- and oversampling with respect to nodule diameter with hard negative mining in order to train on imbalanced data. The data were augmented on each iteration to preserve generalisation capability.

For both networks data augmentation is used to artificially increase the amount of data on which they can be trained.

The augmentation described in their pipeline consists of simple affine transformations, which were described and implemented in PR #132. Although that is good enough to achieve strong results, I think there is room for investigation and a potential gap to fill. My proposal is to classify nodules by their appearance and proximity to other structures: juxtapleural, which sit on the pleural surface of the lungs; juxtavascular, which appear on blood vessels and typically grow most rapidly [1]; and isolated, which lie within the lung. These three types are depicted below, respectively.

Then, since we can segment nodules well, we can crop them out and, based on the vessel and lung segmentation (described in #138), find appropriate spots to place them, with the aim of artificially enlarging the dataset and thereby improving the generalisation capability of the grt123 model.
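To make the idea of "appropriate spots" concrete, here is a minimal sketch of how candidate locations for each nodule type might be derived from binary lung and vessel masks. The function names, the `margin` parameter, and the rule used for each type are assumptions made for illustration; they are not part of grt123 or of the segmentation in #138.

```python
import numpy as np


def _neighbours(mask):
    """Yield the six face-adjacent shifts of a 3D boolean mask (zero-padded)."""
    padded = np.pad(mask, 1, constant_values=False)
    for axis in range(3):
        for shift in (-1, 1):
            yield np.roll(padded, shift, axis=axis)[1:-1, 1:-1, 1:-1]


def erode(mask):
    """Shrink a boolean mask by one voxel (6-connectivity)."""
    out = mask.copy()
    for shifted in _neighbours(mask):
        out &= shifted
    return out


def dilate(mask):
    """Grow a boolean mask by one voxel (6-connectivity)."""
    out = mask.copy()
    for shifted in _neighbours(mask):
        out |= shifted
    return out


def candidate_spots(lung_mask, vessel_mask, margin=2):
    """Return one boolean mask of plausible nodule centres per type.

    Hypothetical rules: juxtapleural spots lie in a thin shell just inside
    the lung surface, juxtavascular spots lie on vessels inside the lung,
    and isolated spots lie at least `margin` voxels from both.
    """
    lung = lung_mask.astype(bool)
    vessels = vessel_mask.astype(bool) & lung
    core, near_vessel = lung, vessels
    for _ in range(margin):
        core = erode(core)
        near_vessel = dilate(near_vessel)
    return {
        "juxtapleural": lung & ~core & ~near_vessel,
        "juxtavascular": vessels,
        "isolated": core & ~near_vessel,
    }
```

A real implementation would probably measure distances in millimetres rather than voxels, since CT spacing is usually anisotropic.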

Any thoughts will be highly appreciated!

Acceptance criteria

  • At least 3-fold cross-validation should be performed, reporting log loss and CPM.
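As a rough illustration of the log loss part of this criterion, the sketch below runs k-fold cross-validation using only the standard library. The `fit` callable is a hypothetical stand-in for training the model, and CPM is not computed here because it requires FROC analysis over detections per scan.

```python
import math
import random


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(y_true)


def k_fold_indices(n_samples, k=3, seed=0):
    """Shuffle the indices once and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(samples, labels, fit, k=3, seed=0):
    """Return the held-out log loss of each fold.

    `fit(train_samples, train_labels)` is a hypothetical callable that
    trains a model and returns a function mapping one sample to P(nodule).
    """
    losses = []
    for held_out in k_fold_indices(len(samples), k, seed):
        held = set(held_out)
        train = [i for i in range(len(samples)) if i not in held]
        predict = fit([samples[i] for i in train], [labels[i] for i in train])
        probs = [predict(samples[i]) for i in held_out]
        losses.append(log_loss([labels[i] for i in held_out], probs))
    return losses
```

For CT data the folds should probably be split by patient rather than by nodule, so that patches from one scan never appear in both the training and the held-out set; the index-level split above does not enforce that.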

vessemer commented Jan 23, 2018

I'm going to work on this issue in a while,
but first I'm looking for opinions and collaboration :)

reubano commented Jan 25, 2018

I get the part about segmenting nodules based on type (1 of the 3 described). I don't get how that leads to better data augmentation though.

vessemer commented Jan 25, 2018

@reubano, suppose we have two patches, one with a nodule and one without:

[Image: init]

If we can segment the nodule carefully, it can be cropped and inserted into a free spot, like this:

[Image: transformed]
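The crop-and-insert step could look roughly like the NumPy sketch below. It assumes a 3D source patch, a binary segmentation of its nodule, and a nodule-free target patch; `paste_nodule` and its arguments are hypothetical names, not existing project code.

```python
import numpy as np


def paste_nodule(source, nodule_mask, target, centre):
    """Copy the voxels under `nodule_mask` from `source` into a copy of `target`.

    The nodule's bounding box is centred on `centre` (z, y, x) in `target`.
    Raises ValueError if the mask is empty or the box does not fit.
    """
    nodule_mask = nodule_mask.astype(bool)
    coords = np.argwhere(nodule_mask)
    if coords.size == 0:
        raise ValueError("nodule_mask is empty")

    # Crop the tight bounding box around the segmented nodule.
    lo, hi = coords.min(axis=0), coords.max(axis=0) + 1
    box = tuple(slice(a, b) for a, b in zip(lo, hi))
    crop, crop_mask = source[box], nodule_mask[box]

    # Work out where that box lands in the target patch.
    start = np.asarray(centre) - (hi - lo) // 2
    stop = start + (hi - lo)
    if (start < 0).any() or (stop > np.asarray(target.shape)).any():
        raise ValueError("nodule does not fit at the requested centre")
    dest = tuple(slice(a, b) for a, b in zip(start, stop))

    # Overwrite only the nodule voxels; the surrounding tissue is untouched.
    out = target.copy()
    out[dest] = np.where(crop_mask, crop, out[dest])
    return out
```

This is a hard paste, so the nodule border may show a visible seam; some blending at the mask edge would likely be needed before using the result for training.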

reubano commented Jan 25, 2018

I see. So you mean to say that knowing which of three types of nodules we are dealing with leads to better nodule cropping?

WGierke commented Jan 25, 2018

Are you sure this does not introduce too much bias? How do we know that nodules can also occur at the positions we're basically planting them?

vessemer commented Jan 25, 2018

@WGierke, that was my concern too, though we could address it by extracting probability maps of possible locations, or just try it as is.
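One way the probability-map idea might be sketched: count where annotated nodule centres fall, restrict the counts to allowed voxels, normalise, and sample insertion points from the result. This assumes all centres are already expressed in a common coordinate frame (for example after registering the scans), which is a strong assumption; both function names are hypothetical.

```python
import numpy as np


def location_probability_map(nodule_centres, shape, allowed_mask):
    """Empirical probability of a nodule centre per voxel, zero outside `allowed_mask`."""
    counts = np.zeros(shape, dtype=float)
    for centre in nodule_centres:
        counts[tuple(centre)] += 1.0
    counts[~allowed_mask.astype(bool)] = 0.0
    total = counts.sum()
    if total == 0:
        raise ValueError("no nodule centres fall inside the allowed region")
    return counts / total


def sample_location(prob_map, rng):
    """Draw one (z, y, x) voxel index with probability given by `prob_map`."""
    flat = prob_map.ravel()
    index = rng.choice(flat.size, p=flat)
    return tuple(int(i) for i in np.unravel_index(index, prob_map.shape))
```

With few annotated nodules the raw counts would be very sparse, so some smoothing of the map would probably be needed in practice.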

vessemer commented Jan 25, 2018

@reubano, I meant that knowing which of the three types of nodule we are dealing with leads to better nodule insertion :)

reubano commented Jan 25, 2018

ahhh ok, makes sense!
