Implementation of CODEXProcessing on GPU #1

Merged
merged 12 commits into from Apr 19, 2018

Collaborator

@eric-czech eric-czech commented Apr 19, 2018

Hi @nsamusik (cc: @ashstatic),

I was working on getting the CODEXProcessor part of the pipeline onto GPUs (or at least most of it) and thought I'd share some results of that:

  • I put together implementations of each CODEXProcessor step as individual operations mostly representing TensorFlow graphs and found, in particular, that moving drift compensation onto a GPU makes a big difference
  • For the tonsil dataset, here is how long it takes to run the whole thing (drift compensation, cropping, best focus, deconvolution, etc.) on a 2xGPU Windows 10 machine:
    • CODEXProcessor w/ Microvolution - ~50 minutes
    • This project w/ TensorFlow - ~15 minutes
      • Should be ~10 minutes on Linux, though: one step, applying the translations from drift compensation, had to run on CPU because CUDA on Windows is missing an operation that TensorFlow needs for it
  • For best focal plane selection (best_focus.py), I used the deep learning model from this Google Research blog post, Using Deep Learning to Facilitate Scientific Image Analysis. Here's an example I made while testing it, showing how it scores images as they are blurred more and more (lower scores = better focus/quality):

img_example

  • I also tried orchestrating the tasks with Dask, which made it easy to use multiple GPUs and should, in theory, make it easy to distribute the whole process across a cluster. There might well be better tools out there for this, but it was super simple to integrate and is surprisingly good at automatically profiling the whole application as part of its built-in dashboard. Here's what it shows for the CODEX steps:

dask_profiling

Apparently, it can also be used to report progress, which could be nice in a distributed cluster.

Please let me know if you have any thoughts or suggestions; otherwise I'll try to wrap this up and document a script to run the whole CODEXProcessor component, like before with the deconvolution script.

Thanks!

@nsamusik nsamusik merged commit 55ed39a into hammerlab:master Apr 19, 2018
@nsamusik
Collaborator

Hi @eric-czech, @ashstatic,

Eric, truly amazing job! I think the easiest way to move forward with it would be if you could pack it into a Python script with the following parameters:

  1. Path to the input folder that contains Experiment.json and the image stacks as separate subfolders
  2. name of the current image stack subfolder (typically reg###_X##_Y##). This subfolder contains the full image stack stored as single TIFF files - this one will be created by the ImageFormatter, our software that transforms images from multiple microscope formats into uniform TIFF format. The names will be C###_Z###_T###.tif (channel, z, timepoint (cycle)). Your script will simply have to open that stack and
  3. absolute path to the output folder where the resulting stack will be written in the same C###_Z###_T###.tif format
  4. zero-based index of the GPU device to use for this Python process - this will allow the parent application (CODEXServer)
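The four parameters above could be wired into a command-line entry point roughly as follows. This is only a sketch: the flag names (`--input-dir`, `--stack-name`, `--output-dir`, `--gpu`) and the helper functions are hypothetical and not taken from the project, and the processing steps themselves are left out. Pinning the process to one device through the `CUDA_VISIBLE_DEVICES` environment variable is one common way to do this, assumed here rather than confirmed by the thread.

```python
import argparse
import os
import re

# Matches the C###_Z###_T###.tif naming (channel, z, timepoint/cycle).
TILE_RE = re.compile(r"^C(\d+)_Z(\d+)_T(\d+)\.tif$")


def build_parser():
    """Build the argument parser; all flag names are hypothetical."""
    p = argparse.ArgumentParser(description="Process one CODEX image stack (sketch)")
    p.add_argument("--input-dir", required=True,
                   help="Folder containing Experiment.json and the stack subfolders")
    p.add_argument("--stack-name", required=True,
                   help="Stack subfolder, typically reg###_X##_Y##")
    p.add_argument("--output-dir", required=True,
                   help="Absolute path where the resulting stack is written")
    p.add_argument("--gpu", type=int, default=0,
                   help="Zero-based index of the GPU to use for this process")
    return p


def parse_tile_name(filename):
    """Return (channel, z, cycle) as ints, or None if the name does not match."""
    m = TILE_RE.match(filename)
    return tuple(int(g) for g in m.groups()) if m else None


def main(argv=None):
    args = build_parser().parse_args(argv)
    if not os.path.isabs(args.output_dir):
        raise SystemExit("--output-dir must be an absolute path")
    # Assumption: restrict this process to one GPU before any GPU library is imported.
    os.environ["CUDA_VISIBLE_DEVICES"] = str(args.gpu)
    stack_dir = os.path.join(args.input_dir, args.stack_name)
    tiles = sorted(f for f in os.listdir(stack_dir) if parse_tile_name(f))
    print(f"{len(tiles)} images found in {stack_dir}")
    # The processing steps (drift compensation, best focus, ...) would go here.
    return args
```

A parent process could then launch one such process per GPU, calling `main()` from the usual `if __name__ == "__main__":` guard and passing a different `--gpu` index to each.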

We are planning to call this Python script from the command line. The parent process will be CODEXServer, a small and simple image processing/storage server that we are writing together with @vishal266. It will accept the raw images over the network, process them, segment/quantify the results, store them on an internal RAID | local NAS | Cloud, and serve them to the CODEXViewer FIJI plugin.

Here's the repo:
https://github.com/nolanlab/CODEXServer

Does that sound reasonable? Let me know your thoughts!

Nikolay

@nsamusik
Collaborator

@eric-czech, as for the best focusing, it looks amazing. The only thing I am curious about is whether it could get thrown off by bright but irrelevant objects (like pieces of dust or tissue debris) in the image field of view.

Also, could we talk at some point about the image segmentation? I have been trying to improve the segmentation accuracy using deep learning and TF. For now I have implemented some basic feature enhancement using a ResNet model in tflearn/tensorflow, but I am not sure how to proceed with some downstream steps, i.e. what's the most efficient way to turn this into cell objects, and how to use TF gradient descent optimization to figure out the single-cell intensities. Perhaps you could help us think about it?

image

@eric-czech
Collaborator Author

eric-czech commented Apr 20, 2018

Thanks @nsamusik !

Yep, I'll work on a CLI with those arguments (which all sound reasonable). Do you have datasets in S3 or elsewhere from ImageFormatter with the new naming convention that can be shared in a public project like this? No worries if not -- I'll just keep using simulated experiment data.

re Focus Classifier: It works by averaging predictions from 84 pixel patches, so I haven't tested it enough to say how fragile it is on a single patch. Regardless, as long as there aren't anomalies in every patch, it should offer at least some resilience to that.
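To illustrate the patch-averaging idea, here is a minimal sketch of scoring each z-plane by the mean of its per-patch scores and picking the plane with the lowest score (lower = better focus). It assumes "84 pixel patches" means 84x84-pixel patches. The patch scorer is passed in as a function, and `neg_variance` is only a hypothetical stand-in, not the deep learning model; all function names here are illustrative.

```python
from statistics import mean


def iter_patches(image, size):
    """Yield non-overlapping size x size patches from a 2D list of pixel rows."""
    rows, cols = len(image), len(image[0])
    for r in range(0, rows - size + 1, size):
        for c in range(0, cols - size + 1, size):
            yield [row[c:c + size] for row in image[r:r + size]]


def image_score(image, score_patch, size=84):
    """Average the per-patch scores into one score for the image."""
    scores = [score_patch(p) for p in iter_patches(image, size)]
    if not scores:
        raise ValueError("image is smaller than one patch")
    return mean(scores)


def best_focus_plane(z_stack, score_patch, size=84):
    """Return the index of the z-plane with the lowest averaged score."""
    scores = [image_score(plane, score_patch, size) for plane in z_stack]
    return min(range(len(scores)), key=scores.__getitem__)


def neg_variance(patch):
    """Stand-in scorer (NOT the deep learning model): sharper patches vary more."""
    values = [v for row in patch for v in row]
    mu = mean(values)
    return -mean([(v - mu) ** 2 for v in values])
```

Because the image score is a plain mean, a single anomalous patch shifts it by only 1/N of that patch's deviation, where N is the number of patches. That is the resilience argument above, and it weakens as anomalies cover more of the field of view.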

re Segmentation: Nice! What data did you use to train the resnet? Or was it something pre-trained on a similar problem?

I definitely have some thoughts on what I've seen for getting instance/object segmentation from pixel classification, so should we schedule a call sometime next week? I also had a question about CODEXServer.

@nsamusik
Collaborator

nsamusik commented Apr 20, 2018 via email


2 participants