Tensorflow Object Detection API for 1-channel grayscale image #3369

Closed
baimukashev opened this issue Feb 12, 2018 · 10 comments

Comments

@baimukashev

System information

  • What is the top-level directory of the model you are using:
  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow):
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04):
  • TensorFlow installed from (source or binary):
  • TensorFlow version (use command below):
  • Bazel version (if compiling from source):
  • CUDA/cuDNN version:
  • GPU model and memory:
  • Exact command to reproduce:

Describe the problem

Is there any way to use the pre-trained models in the TensorFlow Object Detection API, which were trained on RGB images, with single-channel grayscale (depth) images?

Source code / logs

@mttang

mttang commented Feb 12, 2018

You could convert a single channel grayscale image to a 3 channel RGB image to use the pre-trained RGB models. Alternatively, you could train a grayscale model yourself by using convert_to_grayscale in models/research/object_detection/protos/image_resizer.proto.
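
A minimal sketch of the first option, assuming the grayscale image is already loaded as a 2-D NumPy array. The helper name `gray_to_rgb_batch` is hypothetical and not part of the Object Detection API. It copies the single channel three times and adds a batch dimension, so the result matches an input of shape (?, ?, ?, 3).

```python
import numpy as np


def gray_to_rgb_batch(gray):
    """Replicate a 2-D grayscale image into a (1, H, W, 3) batch."""
    if gray.ndim != 2:
        raise ValueError("expected a 2-D (height, width) array, got shape %s" % (gray.shape,))
    rgb = np.stack([gray, gray, gray], axis=-1)  # (H, W) -> (H, W, 3)
    return np.expand_dims(rgb, axis=0)           # (H, W, 3) -> (1, H, W, 3)


# Stand-in for a real depth image: 480 rows (height) by 640 columns (width).
depth = np.random.randint(0, 256, size=(480, 640), dtype=np.uint8)
batch = gray_to_rgb_batch(depth)
print(batch.shape)  # (1, 480, 640, 3)
```

The three channels are identical, so the model sees a gray-looking RGB image. Whether weights trained on color photos transfer well to depth data is not something this sketch can tell you.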

@vinay0410

vinay0410 commented Apr 13, 2018

Hi @mttang,
I did that, but when I try to run inference on a single grayscale (depth) image using the provided ipynb notebook, I get this error: ValueError: Cannot feed value of shape (1, 480, 640) for Tensor u'image_tensor:0', which has shape '(?, ?, ?, 3)'

@cmbowyer13

Have you figured out a workaround for grayscale images using the Object Detection API, @vinay0410?

@richknight

@mttang - how do you convert a single-channel grayscale image to an RGB image to use with one of the well-known models?

@renandepadua

renandepadua commented Jul 5, 2018

@richknight you can use OpenCV's applyColorMap function. I do that to apply the common models to IR images. https://docs.opencv.org/2.4/modules/contrib/doc/facerec/colormaps.html

@TwistedHardware
Contributor

Hi @vinay0410,

ValueError: Cannot feed value of shape (1, 480, 640) for Tensor u'image_tensor:0', which has shape '(?, ?, ?, 3)'

The expected shape of the tensor is (BATCH_SIZE, HEIGHT, WIDTH, CHANNELS), and you are passing (BATCH_SIZE, HEIGHT, WIDTH) without a channel dimension.

When you reshape your images, you need to keep them as rank-3 images, meaning (480, 640, 1) instead of (480, 640). To do that in numpy:

import numpy as np
img = np.random.random((480, 640))  # <-- this is a rank-2 image
img = img.reshape(img.shape[0], img.shape[1], 1)  # <-- this makes it a rank-3 image

This changes your rank-2 image of shape 480 x 640 to a rank-3 image of shape 480 x 640 x 1.

@Arturexe

Arturexe commented Aug 16, 2018

I managed to get it to work with the help of the Stack Overflow community. Take a look: https://stackoverflow.com/questions/51872412/tensorflow-numpy-image-reshape

The model is taken from Sentdex's tutorial on YouTube: https://www.youtube.com/watch?v=srPndLNMMpk&list=PLQVvvaa0QuDcNK5GeCQnxYnSSaar2tpku&index=6

@Zrufy

Zrufy commented Aug 5, 2020

@vinay0410 how did you solve that error? I'm stuck with this problem!

@denisb411

Hi @vinay0410,

ValueError: Cannot feed value of shape (1, 480, 640) for Tensor u'image_tensor:0', which has shape '(?, ?, ?, 3)'

The expected shape of the tensor is (BATCH_SIZE, HEIGHT, WIDTH, CHANNELS), and you are passing (BATCH_SIZE, HEIGHT, WIDTH) without a channel dimension.

When you reshape your images, you need to keep them as rank-3 images, meaning (480, 640, 1) instead of (480, 640). To do that in numpy:

import numpy as np
img = np.random.random((480, 640))  # <-- this is a rank-2 image
img = img.reshape(img.shape[0], img.shape[1], 1)  # <-- this makes it a rank-3 image

This changes your rank-2 image of shape 480 x 640 to a rank-3 image of shape 480 x 640 x 1.

I can't see this working as the expected shape has 3 channels. Anyone tested it?
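
One way to reason about this without running the model: a fed array fits a placeholder only if the ranks match and every fixed dimension is equal. The sketch below mimics that rule in plain NumPy, using the shape from the error message. The helper `fits_placeholder` is hypothetical and is not a TensorFlow function.

```python
import numpy as np

# Shape reported in the error for image_tensor:0; None stands for '?'.
IMAGE_TENSOR_SHAPE = (None, None, None, 3)


def fits_placeholder(array, placeholder_shape):
    """Return True if the array's shape is compatible with the placeholder."""
    if array.ndim != len(placeholder_shape):
        return False
    return all(want is None or want == got
               for got, want in zip(array.shape, placeholder_shape))


gray = np.zeros((480, 640), dtype=np.uint8)

no_channel = gray[np.newaxis, ...]                  # (1, 480, 640)
one_channel = gray.reshape(1, 480, 640, 1)          # (1, 480, 640, 1)
three_channel = np.repeat(one_channel, 3, axis=-1)  # (1, 480, 640, 3)

for name, arr in [("no_channel", no_channel),
                  ("one_channel", one_channel),
                  ("three_channel", three_channel)]:
    print(name, arr.shape, fits_placeholder(arr, IMAGE_TENSOR_SHAPE))
```

Under this rule the single-channel reshape on its own would still be rejected, since the last dimension is fixed at 3. Only the array with the channel repeated three times passes the check.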

@hongym7

hongym7 commented Jul 14, 2021

import io
import os

import PIL.Image
import tensorflow as tf

img_path = os.path.join(image_subdirectory, data['filename'])
with tf.gfile.GFile(img_path, 'rb') as fid:
    encoded_jpg = fid.read()

encoded_jpg_io = io.BytesIO(encoded_jpg)
create_image = PIL.Image.open(encoded_jpg_io)

The input image is a grayscale image.
create_image is [w, h, 3].
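
If the goal is a 3-channel array from a grayscale file (an assumption; the comment does not say so explicitly), one possible sketch uses Pillow's convert("RGB"). The in-memory JPEG below is a stand-in for the file read above, and the variable names are illustrative.

```python
import io

import numpy as np
from PIL import Image

# Build an in-memory grayscale JPEG as a stand-in for a file on disk.
pixels = np.random.randint(0, 256, size=(480, 640), dtype=np.uint8)
buffer = io.BytesIO()
Image.fromarray(pixels).save(buffer, format="JPEG")
encoded_jpg = buffer.getvalue()

# Decode it the same way as in the snippet above.
image = Image.open(io.BytesIO(encoded_jpg))
print(image.mode)  # 'L' means a single 8-bit channel

# Replicate the gray channel into R, G and B.
rgb_image = image.convert("RGB")
rgb_array = np.asarray(rgb_image)
print(rgb_image.size)   # (640, 480): Pillow reports (width, height)
print(rgb_array.shape)  # (480, 640, 3): NumPy reports (height, width, channels)
```

Note the ordering difference: Pillow's size is (width, height), while the NumPy array is (height, width, channels).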
