Implementation of CODEXProcessing on GPU #1

eric-czech · 2018-04-19T12:11:57Z

I was working on getting the CODEXProcessor part of the pipeline onto GPUs (or at least most of it) and thought I'd share some results of that:

I put together implementations of each CODEXProcessor step as individual operations, mostly expressed as TensorFlow graphs, and found in particular that moving drift compensation onto a GPU makes a big difference.
For the tonsil dataset, here is how long it takes to run the whole thing (drift compensation, cropping, best focus, deconvolution, etc.) on a 2xGPU Windows 10 machine:
- CODEXProcessor w/ Microvolution - ~50 minutes
- This project w/ TensorFlow - ~15 minutes
  - Should be ~10 minutes on Linux, though. One step, applying the translations from drift compensation, had to run on CPU because CUDA on Windows is missing an operation that TensorFlow needs for it.
For best focal plane selection (best_focus.py), I used the deep learning model from the Google Research blog post "Using Deep Learning to Facilitate Scientific Image Analysis". Here's an example I made while testing it, showing how it scores images as they are blurred more and more (lower scores = better focus/quality):

I also tried orchestrating the tasks with Dask, which made it easy to use multiple GPUs and should, in theory, make it easy to distribute the whole process across a cluster. There may well be better tools out there for this, but it was super simple to integrate and is surprisingly good at automatically profiling the whole application as part of its built-in dashboard. Here's what it shows for the CODEX steps:

Apparently, it can also be used to report progress, which could be nice in a distributed cluster.

Please let me know if you have any thoughts or suggestions, otherwise I'll try to wrap this up and document a script to run the whole CODEXProcessor component like before with the deconvolution script.

Thanks!

nsamusik · 2018-04-19T23:19:22Z

Hi @eric-czech, @ashstatic,

Eric, truly amazing job! I think the easiest way to move forward with it would be if you could pack it into a Python script with the following parameters:

- Path to the input folder that contains Experiment.json and the image stacks as separate subfolders
- Name of the current image stack subfolder (typically reg###_X##_Y##). This subfolder contains the full image stack stored as single TIFF files - this one will be created by the ImageFormatter, our software that transforms images from multiple microscope formats into uniform TIFF format. The names will be C###_Z###_T###.tif (channel, z, timepoint (cycle)). Your script will simply have to open that stack and
- Absolute path to the output folder where the resulting stack will be written in the same C###_Z###_T###.tif format
- Zero-based index of the GPU device to use for this Python process - this will allow the parent application (CODEXServer)
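A minimal sketch of how a script could accept the four parameters above, using only the standard library. The flag names (`--input-dir`, `--stack-name`, `--output-dir`, `--gpu`) and the `configure` helper are hypothetical, and pinning the process to one device through the `CUDA_VISIBLE_DEVICES` environment variable is an assumption about how GPU selection might be done, not something settled in this thread.

```python
import argparse
import os


def build_parser():
    """Parser for the four parameters listed above (flag names are hypothetical)."""
    parser = argparse.ArgumentParser(description="Process one CODEX image stack")
    parser.add_argument("--input-dir", required=True,
                        help="Folder with Experiment.json and the stack subfolders")
    parser.add_argument("--stack-name", required=True,
                        help="Image stack subfolder, e.g. reg001_X01_Y01")
    parser.add_argument("--output-dir", required=True,
                        help="Absolute path where the processed stack is written")
    parser.add_argument("--gpu", type=int, default=0,
                        help="Zero-based index of the GPU for this process")
    return parser


def configure(argv=None):
    """Parse arguments, validate them, and pin this process to one GPU."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if not os.path.isabs(args.output_dir):
        parser.error("--output-dir must be an absolute path")
    # Assumption: this must be set before the GPU framework is imported,
    # so that only the requested device is visible to this process.
    os.environ["CUDA_VISIBLE_DEVICES"] = str(args.gpu)
    # Full path of the stack subfolder to read from.
    args.stack_dir = os.path.join(args.input_dir, args.stack_name)
    return args
```

With this shape, a parent process could start one Python process per GPU and pass a different `--gpu` value to each.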

We are planning to call this Python script from the command line. The parent process will be CODEXServer, a small and simple image processing/storage server that we are writing together with @vishal266. It will accept the raw images over the network, process them, segment/quantify the results, store them on an internal RAID, a local NAS, or the cloud, and serve them to the CODEXViewer FIJI plugin.

Here's the repo:
https://github.com/nolanlab/CODEXServer

Does that sound reasonable? Let me know your thoughts!

Nikolay

nsamusik · 2018-04-19T23:25:18Z

@eric-czech, as for the best focusing, it looks amazing, the only thing that I am curious about is whether it could get thrown off by some bright but irrelevant objects (like pieces of dust or tissue debris) in the image field of view.

Also, could we talk at some point about the image segmentation? I have been trying to improve the segmentation accuracy using deep learning and TF. For now I have implemented some basic feature enhancement using a ResNet model in tflearn/tensorflow, but I am not sure how to proceed with some downstream steps, i.e. what's the most efficient way to turn this into cell objects, and how to use TF gradient descent optimization to figure out the single-cell intensities. Perhaps you could help us think about it?

eric-czech · 2018-04-20T19:21:32Z

Thanks @nsamusik !

Yep, I'll work on some CLI with those arguments (which all sound reasonable). Do you have datasets in S3 or elsewhere from ImageFormatter with the new naming convention that can be shared in a public project like this? No worries if not -- I'll just keep using simulated experiment data.

re Focus Classifier: It works by averaging predictions from 84-pixel patches. I haven't tested it enough to say how fragile it is on a single patch, but as long as there aren't anomalies in every patch, the averaging certainly offers at least some resilience to that.
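To make the patch-averaging idea concrete, here is a rough sketch in plain Python. It assumes non-overlapping square patches and takes the per-patch scorer as an argument; `score_patch` is a hypothetical stand-in for the model, and the function names are illustrative rather than what best_focus.py actually does.

```python
from statistics import mean

PATCH = 84  # patch edge length in pixels, as mentioned above


def iter_patches(image, size=PATCH):
    """Yield non-overlapping size x size patches from a 2D list of pixel rows."""
    rows, cols = len(image), len(image[0])
    for r in range(0, rows - size + 1, size):
        for c in range(0, cols - size + 1, size):
            yield [row[c:c + size] for row in image[r:r + size]]


def image_score(image, score_patch, size=PATCH):
    """Average the per-patch scores; score_patch stands in for the model."""
    return mean(score_patch(patch) for patch in iter_patches(image, size))


def best_focus_plane(planes, score_patch, size=PATCH):
    """Return the index of the z-plane with the lowest mean score.

    Lower scores are treated as better focus, matching the description above.
    """
    scores = [image_score(plane, score_patch, size) for plane in planes]
    return min(range(len(scores)), key=scores.__getitem__)
```

Because the image-level score is a mean over patches, a single patch dominated by dust or debris shifts the result by only its share of the total, which is the resilience described above.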

re Segmentation: Nice! What data did you use to train the ResNet? Or was it something pre-trained on a similar problem?

I definitely have some thoughts on what I've seen for getting instance/object segmentation based on pixel classification, so should we schedule a call for sometime next week? I had a question about CODEXServer too.

nsamusik · 2018-04-20T20:43:13Z

Hi Eric,

Here's the example formatter output. Right now it's slightly out of sync with what I said, but I think we will eventually change it to just T###_C###_Z###, where the numbers will represent timepoint (cycle), channel, z-plane. The point is that if you open it in ImageJ as an image sequence in alphanumeric order, you can then easily convert it into a hyperstack in the default CZT order.

https://drive.google.com/file/d/1B-ggmrPKKFfjL2SyFMev8yt7JgyM_2cy/view?usp=sharing

My next week is free! When would you like to talk?

Nikolay
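As a small illustration of the proposed T###_C###_Z### naming, the sketch below parses the indices and sorts files by (timepoint, channel, z). It assumes each ### is a three-digit, zero-padded number and a `.tif` extension; the helper names are hypothetical.

```python
import re

# Assumption: three-digit, zero-padded indices, e.g. T001_C002_Z003.tif
NAME_RE = re.compile(r"^T(\d{3})_C(\d{3})_Z(\d{3})\.tif$")


def parse_name(filename):
    """Return (timepoint, channel, z) as integers, or raise ValueError."""
    match = NAME_RE.match(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename}")
    t, c, z = (int(group) for group in match.groups())
    return t, c, z


def sequence_order(filenames):
    """Sort file names by (timepoint, channel, z)."""
    return sorted(filenames, key=parse_name)
```

With fixed-width zero padding, this numeric ordering coincides with plain alphanumeric ordering of the names, which is what makes the files convenient to load as an image sequence.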


Eric Czech and others added 12 commits April 17, 2018 11:20

Dask experiments

1196821

Testing dask for multi-gpu

8dce672

Implementation for drift compensation and initial tests with google net for best focus

3ba518b

Testing pipeline with multiple gpus

d5e29a9

Fixing gpu setting in workers

24d475c

First working full pipeline

1333c06

Testing full run

96609dc

Testing file loader queue

26be455

Merge branch 'pipeline.v1' of https://github.com/eric-czech/codex into pipeline.v1

cfae5a5

Testing pipline with tile prefetch thread

6c5b910

Dask dashboard tests

df1a9f9

Merge branch 'pipeline.v1' of https://github.com/eric-czech/codex into pipeline.v1

28a6f85

nsamusik merged commit 55ed39a into hammerlab:master Apr 19, 2018
