Improve and polish pycaffe #816

Merged

shelhamer
Member

@shelhamer
Member Author

@longjon please take a look. In 3cac223 I added the preprocessing option dicts as members on the C++ side. Let me know what you think.

@shelhamer
Member Author

Let's not merge until #525 is sorted out, and we should probably address #613 too. @longjon if you have a chance to help, just push to my fork to update the PR.

@shelhamer
Member Author

@longjon a97a41b settles #525:

  • pycaffe represents images as single in [0, 255]
  • input feature scaling is done after mean subtraction
  • image resizing behaves (with special cases for performance)
    • images (RGB or intensity) are resized by normalizing / denormalizing to the whims of skimage.transform.resize
    • general K channel images are resized by scipy.ndimage.zoom
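
The normalize / resize / denormalize round trip for the RGB or intensity case can be sketched as follows. This is a minimal illustration, not the code in the PR: `resize_fn` is a hypothetical parameter standing in for `skimage.transform.resize`, `nearest_resize` is a NumPy-only placeholder so the sketch runs without skimage, and the guard for constant images is an added assumption.

```python
import numpy as np


def nearest_resize(im, new_dims):
    """Hypothetical stand-in for skimage.transform.resize (nearest neighbor)."""
    rows = np.arange(new_dims[0]) * im.shape[0] // new_dims[0]
    cols = np.arange(new_dims[1]) * im.shape[1] // new_dims[1]
    return im[rows][:, cols]


def resize_via_unit_range(im, new_dims, resize_fn=nearest_resize):
    """Resize an H x W x K image by way of the [0, 1] range."""
    im_min, im_max = im.min(), im.max()
    if im_max == im_min:
        # Assumption: a constant image has nothing to interpolate.
        return np.full((new_dims[0], new_dims[1], im.shape[-1]), im_min,
                       dtype=np.float32)
    im_std = (im - im_min) / (im_max - im_min)           # normalize to [0, 1]
    resized_std = resize_fn(im_std, new_dims)            # resize in [0, 1]
    resized = resized_std * (im_max - im_min) + im_min   # denormalize
    return resized.astype(np.float32)


im = np.random.RandomState(0).uniform(0, 255, (4, 6, 3)).astype(np.float32)
out = resize_via_unit_range(im, (8, 12))
print(out.shape, out.dtype)
```

The output keeps the original value range and comes back as `np.float32`, whatever range the resizer itself expects.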

@shelhamer
Member Author

I want to break the interface by changing Net.set_mean to take an array instead of a filename because that was a limiting mistake. @longjon thoughts?

@longjon
Contributor

longjon commented Aug 1, 2014

Yes please. That interface in particular seems okay to break because it should go away when we finally have unified input preprocessing.

(I'll have a closer look at the rest of this soon.)

@shelhamer
Member Author

@longjon please review and merge. Further input preprocessing optimization can be a follow-up. I'd like to have the preprocessing fix in and release the fix to master soon since it's not the nicest bug.

@shelhamer
Member Author

Note the build is fine -- Travis just times out downloading CUDA.

    else:
        # ndimage interpolates anything but more slowly.
        scale = tuple(np.array(new_dims)
                      / np.array(im.shape[:2], dtype=np.float32))
Contributor

Not sure what dtype=np.float32 is doing here... won't it just be promoted to float64 anyway? E.g.,

$ ipython --no-banner

In [1]: import numpy as np

In [2]: (np.array(5) / np.array(4, dtype=np.float32)).dtype
Out[2]: dtype('float64')
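
The same promotion applies to the non-scalar arrays in the snippet under review. A quick check, assuming Python 3 division semantics and illustrative values (`new_dims` and `im_shape` below are made up, not taken from the PR):

```python
import numpy as np

new_dims = (256, 256)       # illustrative target size
im_shape = (480, 640, 3)    # illustrative H x W x K image shape

with_f32 = np.array(new_dims) / np.array(im_shape[:2], dtype=np.float32)
without = np.array(new_dims) / np.array(im_shape[:2])

# An integer array divided by a float32 array promotes to float64,
# so the explicit float32 dtype does not change the result type.
print(with_f32.dtype, without.dtype)
```

Under Python 2 without `from __future__ import division`, dividing two integer arrays floor-divides, so some float conversion is still needed there; the point is only that `float32` in particular buys nothing.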

Member Author

Right. This doesn't belong and I'll drop it.

On Monday, August 4, 2014, longjon notifications@github.com wrote:

In python/caffe/io.py:

@@ -40,7 +43,18 @@ def resize_image(im, new_dims, interp_order=1):
     Give
     im: resized ndarray with shape (new_dims[0], new_dims[1], K)
     """
-    return skimage.transform.resize(im, new_dims, order=interp_order)
+    if im.shape[-1] == 1 or im.shape[-1] == 3:
+        # skimage is fast but only understands {1,3} channel images in [0, 1].
+        im_min, im_max = im.min(), im.max()
+        im_std = (im - im_min) / (im_max - im_min)
+        resized_std = resize(im_std, new_dims, order=interp_order)
+        resized_im = resized_std * (im_max - im_min) + im_min
+    else:
+        # ndimage interpolates anything but more slowly.
+        scale = tuple(np.array(new_dims)
+                      / np.array(im.shape[:2], dtype=np.float32))


@longjon
Contributor

longjon commented Aug 5, 2014

Looks good except as noted. There were a couple kinda awkward things I noticed that existed before this PR:

  • some might consider set_* functions unpythonic, since Python has properties and direct access is always overridable
  • the mean wrangling in Detector.configure_crop duplicates Net.deprocess
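
The first point can be illustrated with a small sketch. `Preprocessor` below is a hypothetical class invented for this example and is not part of pycaffe; it shows how a property keeps plain attribute syntax while still allowing the validation that a `set_*` method would otherwise carry.

```python
import numbers


class Preprocessor(object):
    """Hypothetical holder for one preprocessing option (not pycaffe API)."""

    def __init__(self):
        self._input_scale = None

    @property
    def input_scale(self):
        return self._input_scale

    @input_scale.setter
    def input_scale(self, value):
        # Validation lives in the setter, so callers use plain assignment.
        if value is not None and not isinstance(value, numbers.Real):
            raise ValueError("input_scale must be a real number or None")
        self._input_scale = value


pre = Preprocessor()
pre.input_scale = 0.5      # reads like direct access, still validated
print(pre.input_scale)
```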

In addition, I'm not totally sure why we have mnist_words.txt, since it's basically the identity. I suppose it does serve as a simple example, so that's fine.

These things said, I'm happy to merge any PR that is a strict improvement, and I'd rather merge something incremental sooner than make everything just right eventually.

define `Net.{mean, input_scale, channel_swap}` on the boost::python side
so that the members always exist. drop ugly initialization logic.

With the right input processing, the actual image classification output
is sensible.

- filter visualization example's top prediction is "tabby cat"
- net surgery fully-convolutional output map is better

Fix incorrect class names too.
@shelhamer
Member Author

Rebased to address comments and hold off on integrating @petewarden's changes -- they will be included in a follow-up once this fix is in.

@longjon please take a last look and merge.

- reorder channels (for instance color to BGR)
- subtract mean
- transpose dimensions to K x H x W
Contributor

Should raw scale be noted here?

- load an image as [0,1] single / np.float32 according to Python convention
- fix input scaling during preprocessing:
  - scale input for preprocessing by `raw_scale` e.g. to map an image
    to [0, 255] for the CaffeNet and AlexNet ImageNet models
  - scale feature space by `input_scale` after mean subtraction
  - switch examples to raw scale for ImageNet models
  - fix BVLC#525
- preserve type after resizing.
- resize 1, 3, or K channel images with special casing between
  skimage.transform (1 and 3) and scipy.ndimage (K) for speed
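
Taken together, the scaling steps in this commit message amount to roughly the following. This is a NumPy-only sketch under assumptions, not the pycaffe implementation: the function name `preprocess_sketch`, its signature, the layout of `mean`, and the position of the channel swap relative to the scaling steps are all illustrative.

```python
import numpy as np


def preprocess_sketch(im, raw_scale=None, channel_swap=None, mean=None,
                      input_scale=None):
    """Map an H x W x K image in [0, 1] to a K x H x W network input."""
    out = np.asarray(im, dtype=np.float32).copy()
    if raw_scale is not None:
        out *= raw_scale                     # e.g. 255 maps [0, 1] to [0, 255]
    if channel_swap is not None:
        out = out[:, :, list(channel_swap)]  # e.g. (2, 1, 0) for RGB -> BGR
    if mean is not None:
        out -= mean                          # mean is in the raw-scaled, swapped space
    if input_scale is not None:
        out *= input_scale                   # feature scaling after mean subtraction
    return out.transpose((2, 0, 1))          # H x W x K -> K x H x W


im = np.zeros((2, 2, 3), dtype=np.float32)
im[..., 0] = 1.0                             # pure red image in [0, 1]
net_in = preprocess_sketch(im, raw_scale=255, channel_swap=(2, 1, 0),
                           mean=np.float32(100.0), input_scale=0.5)
print(net_in.shape)
```

Keeping `raw_scale` and `input_scale` as separate parameters is what lets the mean be expressed in the raw pixel range while the network still receives rescaled features.
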
shelhamer added a commit that referenced this pull request Aug 6, 2014
@shelhamer shelhamer merged commit 52d7a48 into BVLC:dev Aug 6, 2014
@shelhamer shelhamer deleted the pycaffe-labels-grayscale-attrs-examples branch August 6, 2014 06:24
mitmul pushed a commit to mitmul/caffe that referenced this pull request Sep 30, 2014
…ttrs-examples

Improve and polish pycaffe
RazvanRanca pushed a commit to RazvanRanca/caffe that referenced this pull request Nov 4, 2014
…ttrs-examples

Improve and polish pycaffe