
Image order RGB or BGR? #30

Closed
John1231983 opened this issue Jan 31, 2018 · 6 comments


@John1231983

Hello, I am using the official resnet-101 pre-trained model (from your link). It is trained on ImageNet with images in RGB order, and its IMAGE_MEAN is

_R_MEAN = 123.68 / 255
_G_MEAN = 116.78 / 255
_B_MEAN = 103.94 / 255

The official resnet-101 pre-processing (vgg_preprocessing.py, L223) is

channels = tf.split(axis=2, num_or_size_splits=num_channels, value=image)
for i in range(num_channels):
  channels[i] -= means[i]
return tf.concat(axis=2, values=channels)

Your code, however, converts RGB to BGR and uses a different IMAGE_MEAN. I think we should use the same pre-processing as the pre-trained model, i.e., RGB order and the ImageNet image means. Am I right?

@zhengyang-wang
Owner

For the deeplab pre-trained models, I believe the order of image channels is BGR.
I provide the pre-trained resnet models from TF Slim, where the means should not be divided by 255:
https://github.com/tensorflow/models/blob/master/research/slim/preprocessing/vgg_preprocessing.py
And you are right about the order: if the resnet pre-trained models are used, the order should be changed back to RGB. A one-line change is enough.
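To make the two conventions concrete, here is a minimal NumPy sketch of the mean subtraction and the "one-line change" that flips channel order. The function name and the `to_bgr` flag are illustrative, not part of the repo; the means are the TF-Slim ImageNet values (not divided by 255), and images are assumed to be H×W×3 float arrays in RGB order.

```python
import numpy as np

# ImageNet means from TF-Slim's vgg_preprocessing (not divided by 255).
IMG_MEAN_RGB = np.array([123.68, 116.78, 103.94], dtype=np.float32)

def preprocess(image_rgb, to_bgr=False):
    """Subtract the per-channel mean; optionally flip RGB -> BGR.

    image_rgb: float32 array of shape (H, W, 3) in RGB order.
    """
    out = image_rgb - IMG_MEAN_RGB  # broadcasts over H and W
    if to_bgr:
        out = out[..., ::-1]        # the "one-line change": reverse channels
    return out

img = np.full((2, 2, 3), 128.0, dtype=np.float32)
rgb = preprocess(img)               # RGB order, for resnet pre-trained models
bgr = preprocess(img, to_bgr=True)  # BGR order, for deeplab pre-trained models
```

Reversing the last axis with `[..., ::-1]` is all that distinguishes the two pipelines, which is why switching between the resnet and deeplab conventions is a one-line edit.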

@zhengyang-wang
Owner

But actually I don't think it is crucial. In this task, the size of the training patches also differs from that used for resnet, and the set of images is different. Simply using image_mean=[127.5, 127.5, 127.5] may work just as well.

@John1231983
Author

Agreed. I have tested with different IMAGE_MEAN values and there is not much performance difference. Could I ask one more question about the pre-trained model? When you use the resnet pre-trained model, you copy its weights into the encoder part of the deeplab network and then train the decoder part. But I found that you also train the encoder part after copying the weights from the resnet model. Why not train only the decoder part? Thanks.

@zhengyang-wang
Owner

This is because the set of images changes. It is true that they are all natural images with similar features, so transfer learning is feasible here. However, images in PASCAL or CITYSCAPES do not appear in ImageNet. Thus, we fine-tune the encoder to let it fit the new set of images. Actually, we use the pre-trained models to make sure the training converges, as the number of images in PASCAL or CITYSCAPES is much smaller than in ImageNet.
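The distinction between fine-tuning everything (as done here) and freezing the encoder comes down to which parameters receive gradient updates. A toy sketch, with hypothetical parameter names and a plain-Python SGD step standing in for the real TensorFlow training loop:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameter layout: encoder weights copied from a
# pre-trained dict, decoder weights randomly initialized.
pretrained = {"encoder/w": np.ones((3, 3))}
model = {
    "encoder/w": pretrained["encoder/w"].copy(),  # transfer-learned init
    "decoder/w": rng.standard_normal((3, 3)),     # random init
}

def sgd_step(params, grads, lr=0.1, trainable=None):
    """Update only the parameters listed in `trainable` (all if None)."""
    trainable = trainable or list(params)
    for name in trainable:
        params[name] -= lr * grads[name]

grads = {k: np.ones_like(v) for k, v in model.items()}

# Fine-tuning (what this repo does): every parameter is updated,
# but the encoder starts from the pre-trained weights.
sgd_step(model, grads)

# Frozen-encoder alternative: only decoder weights would move.
# sgd_step(model, grads, trainable=["decoder/w"])
```

Either way, the pre-trained weights only serve as the starting point; fine-tuning lets the encoder drift toward the new dataset's statistics.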

@John1231983
Author

I see. So we use the pre-trained model to get a good weight initialization, then retrain the whole model starting from those weights. Am I right?

@zhengyang-wang
Owner

Yes.
