cnn4brca

Using Convolutional Neural Networks (CNN) for Semantic Segmentation of Breast Cancer Lesions (BRCA). Master's thesis documents. Bibliography, experiments and reports.

Most articles in the Bibliography folder were obtained directly from the authors or via agreements with my home institution. Please check for possible copyright restrictions before using them.

Contact info:

Erick Cobos Tandazo
a01184587@itesm.mx

Usage

Data set

  1. You can obtain the BCDR database online (Moura et al.). I used the BCDR-D01 data set, which has around 70 patients (~300 digital mammograms) with breast masses and their lesion outlines. fileOrganization has some info on how the images are organized.

  2. To obtain the masks (from the outlines provided in the database), you can use createMasks.m. It reads the mammogram info from two files provided in the database: sample bcdr_d01_img.csv and sample bcdr_d01_outlines.csv.

    Output should look like this:

  3. Use prepareDB to enhance the contrast of the mammograms and downsample them to a manageable size (a 2 cm x 2 cm area of the mammogram is represented by 128 x 128 pixels).

    Output looks like this:

  4. Finally, divide the data set into training, validation and test sets by patient. For each set, produce a .csv with image and label filenames like this.
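
The split in step 4 could be sketched as below. This is only an illustration, not the script used for the thesis: the function names, the 15%/15% fractions, the seed and the example filenames are all assumptions. The one point it is meant to show is that the split is done by patient, so all mammograms of a patient land in the same set.

```python
import csv
import random


def split_patients(patient_ids, val_fraction=0.15, test_fraction=0.15, seed=0):
    """Shuffle the unique patient ids and split them into train/val/test lists.

    The fractions and the seed are illustrative defaults (assumptions).
    """
    ids = sorted(set(patient_ids))
    random.Random(seed).shuffle(ids)
    n_val = int(round(len(ids) * val_fraction))
    n_test = int(round(len(ids) * test_fraction))
    val = ids[:n_val]
    test = ids[n_val:n_val + n_test]
    train = ids[n_val + n_test:]
    return train, val, test


def rows_for_set(records, patients):
    """Keep the (image, label) filename pairs whose patient belongs to `patients`.

    `records` is a list of (patient_id, image_filename, label_filename) tuples.
    """
    keep = set(patients)
    return [[image, label] for patient, image, label in records if patient in keep]


def write_set_csv(path, rows):
    """Write one 'image,label' row per line to a .csv file."""
    with open(path, "w", newline="") as csv_file:
        csv.writer(csv_file).writerows(rows)
```

For example, `write_set_csv("training.csv", rows_for_set(records, train))` would produce the file for the training set; the file name is hypothetical.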

Training

  1. Install TensorFlow.
  2. Run train or train_with_val_split to train networks. These train the network defined in model_v3, a fully convolutional network with 10 layers (900K parameters) that uses dilated convolutions and is modelled on ResNet. Training is done image by image (no batches, but the cost is computed at every one of the thousands of pixels in each image) and uses dropout, among other things. Note: the code was written for TensorFlow 0.11.0, so it would need to be modified to work in TensorFlow 1.0.
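
To illustrate what a dilated convolution does, here is a minimal pure-Python sketch. It is not the code in model_v3, which uses TensorFlow; the function name and the list-of-lists image format are assumptions made for the example. The kernel taps are spaced `dilation` pixels apart, so a 3x3 kernel with dilation 2 covers a 5x5 area without adding parameters.

```python
def dilated_conv2d(image, kernel, dilation=1):
    """'Valid' 2-D cross-correlation with kernel taps `dilation` pixels apart.

    `image` and `kernel` are lists of rows (lists of numbers).
    """
    k_h, k_w = len(kernel), len(kernel[0])
    # Extent covered by the kernel once gaps are inserted between its taps.
    span_h = (k_h - 1) * dilation + 1
    span_w = (k_w - 1) * dilation + 1
    out_h = len(image) - span_h + 1
    out_w = len(image[0]) - span_w + 1
    if out_h <= 0 or out_w <= 0:
        raise ValueError("image is smaller than the dilated kernel")

    output = []
    for row in range(out_h):
        out_row = []
        for col in range(out_w):
            total = 0.0
            for i in range(k_h):
                for j in range(k_w):
                    # Step through the image in jumps of `dilation` pixels.
                    pixel = image[row + i * dilation][col + j * dilation]
                    total += kernel[i][j] * pixel
            out_row.append(total)
        output.append(out_row)
    return output
```

With dilation 1 this reduces to an ordinary convolution layer's sliding window; larger dilations widen the receptive field, which helps a segmentation network see more context at every pixel.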

Evaluation

  1. You can use compute_metrics or compute_FROC to compute evaluation metrics or the FROC curve, respectively.
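
As a rough idea of what pixel-wise evaluation of a segmentation involves, the sketch below compares a predicted binary mask against the ground-truth mask. The choice of metrics (sensitivity, specificity, intersection over union) and the function name are assumptions for illustration; check compute_metrics for the metrics actually reported.

```python
def pixel_metrics(predicted, truth):
    """Return sensitivity, specificity and IoU for two binary masks.

    Both masks are lists of rows containing 0/1 (or False/True) values.
    """
    tp = fp = tn = fn = 0
    for pred_row, true_row in zip(predicted, truth):
        for pred, true in zip(pred_row, true_row):
            if pred and true:
                tp += 1  # lesion pixel correctly detected
            elif pred and not true:
                fp += 1  # background pixel marked as lesion
            elif not pred and true:
                fn += 1  # lesion pixel missed
            else:
                tn += 1  # background pixel correctly left out

    def ratio(numerator, denominator):
        # Undefined ratios (e.g. no lesion pixels at all) are returned as NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "iou": ratio(tp, tp + fp + fn),
    }
```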

You are invited to check the code for more details; I tried to document it nicely.