Image processing workflow #806

Merged 3 commits on Aug 2, 2016

gheinrich
Contributor

This adds data and view extensions to train image processing networks in DIGITS. This may be used for de-noising, super-resolution, segmentation, etc.

The image processing data extension creates datasets for which both the input and the label are images.

The image view extension displays the network output as an image.

Only works with Caffe for now. Torch wrappers need to be updated to deal with image labels (currently only scalar or vector labels are supported).

@pansk
pansk commented Jun 2, 2016

👍 for image labels in torch.
Is it possible also to have multiple image labels, or multiple image sources?

E.g., providing a normal map and a diffuse map to produce a shaded image (multiple inputs), or providing a single image to produce both a super-resolution image and a segmentation map (multiple outputs).

As far as you know, can this be done by creating the DB manually, and adding multi-channel images (for labels, sources, or both)?

@gheinrich
Contributor Author

Hi @pansk, we are using Caffe Datum objects to store data in LMDB. We may have one LMDB for features (inputs) and one LMDB for labels. The Datum format is used for both Caffe and Torch. We support only one Datum for the features and only one Datum for the labels. Datum objects require your data to be either actual ".png/.jpg/..." images ("encoded" case) or any 3D (Channels x Height x Width) tensor ("unencoded" case).

You can store multiple images in your unencoded Datum if you put them side by side across the channel dimension. For example you can store 2 RGB images by constructing a 6-channel tensor.
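To illustrate the channel-stacking idea, here is a minimal NumPy sketch. It assumes both images are stored as Height x Width x Channels arrays of the same size; the helper name `stack_images_chw` and the sample arrays are hypothetical and are not part of DIGITS or Caffe.

```python
import numpy as np


def stack_images_chw(*images):
    """Stack HxWxC images along the channel axis and return a CxHxW array."""
    sizes = {img.shape[:2] for img in images}
    if len(sizes) != 1:
        raise ValueError("all images must share the same height and width")
    # Concatenate along the last (channel) axis, then move channels first.
    stacked = np.concatenate(images, axis=-1)
    return np.transpose(stacked, (2, 0, 1))


# Two hypothetical 4x5 RGB images, standing in for e.g. a normal map and a diffuse map.
normal_map = np.zeros((4, 5, 3), dtype=np.uint8)
diffuse_map = np.full((4, 5, 3), 255, dtype=np.uint8)

tensor = stack_images_chw(normal_map, diffuse_map)
print(tensor.shape)  # (6, 4, 5)

# The original images can be recovered by slicing the channel dimension.
first, second = tensor[:3], tensor[3:]
```

The resulting 6 x Height x Width array has the Channels x Height x Width layout described above for the unencoded case. Writing it into a Datum and an LMDB is left out here, since that step depends on the Caffe Python bindings being installed.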

Given this Datum limitation, what I'm planning to do for the moment in Torch is to accept anything that can be stored in Datum objects. Later on we might consider using HDF5 for those generic datasets and extend support to N-dimensional tensors, though that is a longer-term project.

Would that work for you?

@gheinrich gheinrich force-pushed the dev/image-processing branch 2 times, most recently from fb6277d to e51aba6 Compare June 8, 2016 08:17
from bs4 import BeautifulSoup
import json
import numpy as np
Member

Major nitpick here, but PEP8 likes to separate standard library imports and 3rd party imports:

Imports should be grouped in the following order:

  1. standard library imports
  2. related third party imports
  3. local application/library specific imports

You should put a blank line between each group of imports.
https://www.python.org/dev/peps/pep-0008/#imports

I've been trying to follow that format in our code since #501.

Contributor Author

Actually, I am not sure I get the difference between a standard library import and a 3rd party import. Is it correct to say that numpy and PIL.Image are 3rd party imports and json, os, tempfile are standard library imports?

Member

I think what they're distinguishing between are packages that come with a standard Python install (e.g. apt-get install python) vs. add-on packages (e.g. pip install Flask). At least, that's been my interpretation. I'm open to push-back if you think it's dumb.

import json
import os
import tempfile

from bs4 import BeautifulSoup
import numpy as np
import PIL.Image

from digits import extensions
import digits.test_views
from digits.utils import constants

Contributor Author

I get it, thanks!

@gheinrich gheinrich force-pushed the dev/image-processing branch 2 times, most recently from 427dea0 to 2deb27c Compare July 4, 2016 11:55
@gheinrich gheinrich force-pushed the dev/image-processing branch 2 times, most recently from f50bc5c to 6f14123 Compare July 28, 2016 10:14
To be used with networks where the input and the output are images
@gheinrich
Contributor Author

Rebased and updated according to comments:
#830 (comment)
#830 (comment)

@lukeyeager lukeyeager self-assigned this Aug 2, 2016
@gheinrich gheinrich merged commit cd5a050 into NVIDIA:master Aug 2, 2016
@gheinrich gheinrich deleted the dev/image-processing branch November 30, 2016 16:49
@m5061125
m5061125 commented Mar 9, 2017

Hi, gheinrich.
I am using DIGITS 5 for medical segmentation. So far it works fine for 2D images. I am wondering whether DIGITS supports 3D image classification or segmentation. I know that Caffe supports N-D convolution and pooling when fed data in the HDF5 format. Please let me know if DIGITS already supports it, thanks.
