This is the companion code for the paper "Segmenting Microarrays with Deep Neural Networks".

Installation

To use this code, you'll need to

  • Install Caffe, along with its Python interface. Be warned that without a CUDA-capable GPU, training the network and segmenting images will be unfeasibly slow.
  • Install a variety of other Python modules. This is most simply done with pip install -r requirements.txt.
  • Download the experimental (real) and simulated datasets from Lehmussola et al.'s website. The experimental images should be placed in sources/experimental and the simulated .mat files in the sources/simulated folder.
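Assuming a standard Python/pip setup, the installation steps above look roughly like this (the dataset files themselves must be fetched from Lehmussola et al.'s website by hand):

```shell
# Install the Python dependencies listed in requirements.txt.
pip install -r requirements.txt

# Make sure the folders the pipelines expect are in place.
mkdir -p sources/experimental sources/simulated
# (copy the experimental images into sources/experimental,
#  and the simulated .mat files into sources/simulated)
```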

Usage

This project contains two 'pipelines', one for processing the experimental data and one for processing the simulated data. We recommend you interact with them from an interactive console, such as the one provided by Spyder.

Experimental Pipeline. The experimental pipeline consists of the experimental_training and experimental_evaluation modules. Running the experimental_training.make_training_files() function will read the hand-labellings from sources/labels along with the corresponding images from sources/experimental and construct two LMDB files that contain all the data needed to train the classifier. This will take about an hour on a modern (2015) machine. The classifier can then be trained by calling

caffe train -solver sources/definition/experimental_solver.prototxt

from the project directory. Once the classifier is trained (a few hours), it can be used to segment the experimental images by calling experimental_evaluation.score_experimental_images(), which will process each image in turn and store the results in temporary/scores/experimental_scores.hdf5. This can take a day or more to finish. Once finished, calling experimental_evaluation.measure_all() will use the scores from the .hdf5 file to measure the expression ratio of each spot in each image. Finally, the functions correlations and mean_absolute_errors in experimental_evaluation can be called on the expression ratios in order to calculate the results reported in the paper.
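For intuition, the two reported metrics can be sketched on made-up numbers (the arrays below are stand-ins for the measured and ground-truth expression ratios; the authoritative implementations are correlations and mean_absolute_errors in experimental_evaluation):

```python
import numpy as np

# Made-up stand-ins for measured vs. ground-truth expression ratios.
measured = np.array([0.9, 1.1, 2.0, 0.5])
truth = np.array([1.0, 1.0, 2.1, 0.4])

# Pearson correlation between the two sets of ratios.
correlation = np.corrcoef(measured, truth)[0, 1]

# Mean absolute error of the measured ratios.
mae = np.abs(measured - truth).mean()

print(correlation, mae)
```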

Simulated Pipeline. The simulated pipeline consists of the simulated_training and simulated_evaluation modules. Running the simulated_training.make_training_files() function will read the datafiles from sources/simulated and construct two LMDB files that contain all the data needed to train the classifier. This will take about an hour on a modern (2015) machine. The classifier can then be trained by calling

caffe train -solver sources/definition/simulated_solver.prototxt

from the project directory. Once the classifier is trained (a few hours), it can be used to segment the simulated images by calling simulated_evaluation.score_simulated_images(), which will process each image in turn and store the results in temporary/scores/simulated_scores.hdf5. This can take a day or more to finish. Once finished, calling simulated_evaluation.measure_all() will use the scores from the .hdf5 file to measure the expression ratio of each spot in each image. Finally, the functions calculate_error_rate and calculate_discrepency_distance in simulated_evaluation can be called on the expression ratios in order to calculate the results reported in the paper.
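The error rate is, at heart, the fraction of mis-labelled pixels; a toy illustration on hand-made masks (calculate_error_rate in simulated_evaluation is the real implementation, which may differ in detail):

```python
import numpy as np

# Toy predicted and ground-truth segmentation masks (1 = spot, 0 = background).
predicted = np.array([[1, 1, 0],
                      [0, 1, 0],
                      [0, 0, 0]])
truth = np.array([[1, 0, 0],
                  [0, 1, 0],
                  [0, 0, 1]])

# Pixel-wise error rate: the fraction of pixels whose labels disagree.
error_rate = (predicted != truth).mean()
print(error_rate)  # 2 of the 9 pixels disagree
```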

Interrupting training early. In both pipelines, you can stop training before it completes if you don't mind losing a small amount of performance. In that case, set the MODEL_PATH variable in simulated_evaluation or experimental_evaluation to point towards one of the snapshots that can be found in temporary/models.
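For example (the snapshot filename below is hypothetical; check temporary/models for the actual names, which follow Caffe's usual prefix_iter_N.caffemodel convention):

```python
# Hypothetical snapshot name -- substitute a file that actually exists
# in temporary/models on your machine.
MODEL_PATH = 'temporary/models/experimental_iter_50000.caffemodel'
```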
