Region proposal network (RPN) layer #7

Closed

0x00b1 opened this issue May 30, 2017 · 26 comments

Comments

0x00b1 (Contributor) commented May 30, 2017

The region proposal network (RPN) should take two inputs, image features (i.e., features extracted by ResNet) and ground-truth bounding boxes, and produce object proposals and corresponding “objectness” scores. I’m envisioning something like:

x = keras.layers.Input((223, 223, 3))

a = keras_resnet.ResNet50(x)

b = keras.layers.Input((None, 4))

y = keras_rcnn.layers.RPN((14, 14))([a, b])
0x00b1 changed the title from "Region proposal network (RPN)" to "Region proposal network (RPN) layer" on May 30, 2017
0x00b1 (Contributor, Author) commented May 30, 2017

Here are a few links with information about implementing layers with multiple inputs:

keras-team/keras#148
keras-team/keras#2364
keras-team/keras#3037

0x00b1 (Contributor, Author) commented May 30, 2017

This issue has information about implementing a loss function for an intermediate layer.

keras-team/keras#5563
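For context, the usual Keras pattern for this (a generic sketch, not keras_rcnn code) is to expose the intermediate tensor as a named model output and attach a loss to it at compile time:

import keras

x = keras.layers.Input((14, 14, 512))
# intermediate layer whose output we want to supervise directly
rpn_cls = keras.layers.Conv2D(9, (1, 1), activation="sigmoid", name="rpn_cls")(x)
pooled = keras.layers.GlobalAveragePooling2D()(rpn_cls)
head = keras.layers.Dense(2, activation="softmax", name="head")(pooled)

model = keras.models.Model(x, [rpn_cls, head])
model.compile(
    optimizer="adam",
    loss={"rpn_cls": "binary_crossentropy", "head": "categorical_crossentropy"},
    loss_weights={"rpn_cls": 1.0, "head": 1.0},
)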

JihongJu (Contributor) commented Jun 2, 2017

@0x00b1 Why would the RPN need the ground truth as input? As far as I understand, the RPN takes the conv features as input and predicts labels and bounding-box transforms for K anchors in each cell. I guess what you mean here is something like an AnchorTargetLayer, which produces anchor classification labels and bbox regression targets given the ground-truth bounding boxes?
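For reference, a minimal sketch of the head described above (the generic Faster R-CNN formulation with K = 9 anchors per cell, not the keras_rcnn API; the feature-map shape and the sigmoid objectness output are assumptions):

import keras

anchors = 9  # K anchors per feature-map cell

# backbone conv features (shape assumed here only to make the sketch concrete)
features = keras.layers.Input((14, 14, 512))

# standard Faster R-CNN RPN head: a 3x3 conv followed by two sibling 1x1 convs
shared = keras.layers.Conv2D(512, (3, 3), padding="same", activation="relu")(features)
objectness = keras.layers.Conv2D(anchors * 1, (1, 1), activation="sigmoid")(shared)  # label per anchor
deltas = keras.layers.Conv2D(anchors * 4, (1, 1))(shared)  # 4 box-transform values per anchor

rpn_head = keras.models.Model(features, [objectness, deltas])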

0x00b1 (Contributor, Author) commented Jun 2, 2017

Yeah, exactly. I imagined a layer that’d encapsulate AnchorLayer, AnchorTargetLayer, and ProposalLayer into one layer to circumvent the awkward train-and-predict step in Ross’ implementation. Unfortunately, it’s still unclear how this should be implemented. That’s why I’ve been implementing the Anchor and Proposal layers in parallel. What do you think?

JihongJu (Contributor) commented Jun 2, 2017

I do think it's a good idea to make the train-to-predict switching easier, but I'm not sure that encapsulating the Proposal Layer and the Anchor Target Layer into one is the way to go. Feeding the ground truth to the model as an input, rather than as labels, could be problematic during testing; I can hardly imagine what to feed as b at test time.

How about keeping the anchor ground-truth generation out of the model definition? What we want then becomes a loss function that calculates the losses given the "objectness" scores and the ground-truth bounding boxes directly. We can put the Anchor Target Layer inside such a loss function.

JihongJu (Contributor) commented Jun 5, 2017

@0x00b1 I am thinking of structuring it as:

x = keras.layers.Input((223, 223, 3))
a = keras_resnet.ResNet50(x)
[rpn_cls, rpn_reg] = keras_rcnn.layers.RegionProposalNetwork()(a)
rpn_pred = keras.layers.concatenate([rpn_cls, rpn_reg])
proposals = keras_rcnn.layers.ObjectProposal()([rpn_cls, rpn_reg])
[rcnn_cls, rcnn_reg] = keras_rcnn.layers.ROI([7, 7])([x, proposals])
model = Model(inputs=x, outputs=[rpn_pred, rcnn_cls, rcnn_reg])
model.compile(loss=[rpn_pred_loss, rcnn_cls_loss, rcnn_reg_loss], optimizer="adam")

And we have

def rpn_pred_loss(lambda_, *args, **kwargs):
    def f(y_true, y_pred):
        # separate y_pred into rpn_cls_pred and rpn_reg_pred
        rpn_cls_pred, rpn_reg_pred = separate_pred(y_pred)
        # convert y_true from gt_boxes to gt_anchors
        rpn_cls_gt, rpn_reg_gt = encode(y_true, rpn_cls_pred)
        # classification loss
        rpn_cls_loss = keras_rcnn.rpn.classification(anchors=9)(rpn_cls_gt, rpn_cls_pred)
        # regression loss
        rpn_reg_loss = keras_rcnn.rpn.regression(anchors=9)(rpn_reg_gt, rpn_reg_pred)

        return rpn_cls_loss + lambda_ * rpn_reg_loss
    return f

Then we can also extend this to Mask R-CNN, as simply as adding a mask branch and another loss, rcnn_mask_loss. What do you think?
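A rough sketch of what that extension could look like, reusing the names above; roi_features stands in for per-proposal pooled feature maps that the ROI layer would need to expose, and classes for the number of object classes (both hypothetical here):

# hypothetical mask branch: a small per-proposal conv head producing a sigmoid mask per class
rcnn_mask = keras.layers.TimeDistributed(
    keras.layers.Conv2D(classes, (1, 1), activation="sigmoid")
)(roi_features)

model = Model(inputs=x, outputs=[rpn_pred, rcnn_cls, rcnn_reg, rcnn_mask])
model.compile(loss=[rpn_pred_loss, rcnn_cls_loss, rcnn_reg_loss, rcnn_mask_loss], optimizer="adam")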

0x00b1 (Contributor, Author) commented Jun 6, 2017

@JihongJu I love this! Especially this:

x = keras.layers.Input((223, 223, 3))
a = keras_resnet.ResNet50(x)
[rpn_cls, rpn_reg] = keras_rcnn.layers.RegionProposalNetwork()(a)
rpn_pred = keras.layers.concatenate([rpn_cls, rpn_reg])
proposals = keras_rcnn.layers.ObjectProposal()([rpn_cls, rpn_reg])
[rcnn_cls, rcnn_reg] = keras_rcnn.layers.ROI([7, 7])([x, proposals])
model = Model(inputs=x, outputs=[rpn_pred, rcnn_cls, rcnn_reg])
model.compile(loss=[rpn_pred_loss, rcnn_cls_loss, rcnn_reg_loss], optimizer="adam")

0x00b1 (Contributor, Author) commented Jun 8, 2017

@JihongJu I started structuring this into code:

classes = 2

x = keras.layers.Input((224, 224, 3))

y = keras_resnet.ResNet50(x)

rpn_classification = keras.layers.Conv2D(9 * 1, (1, 1), activation="sigmoid")(y.layers[-2].output)

rpn_regression = keras.layers.Conv2D(9 * 4, (1, 1))(y.layers[-2].output)

rpn_prediction = keras.layers.concatenate([rpn_classification, rpn_regression])

proposals = keras_rcnn.layers.object_detection.ObjectProposal(300)([rpn_classification, rpn_regression])

y = keras_rcnn.layers.ROI((7, 7), 32)([x, proposals])
y = keras.layers.AveragePooling2D((7, 7))(y)
y = keras.layers.Dense(4096)(y)

score = keras.layers.Dense(classes, activation="softmax")(y)

boxes = keras.layers.Dense(4 * (classes - 1))(y)

model = keras.models.Model(x, [rpn_prediction, score, boxes])

model.compile(optimizer="adam", loss="mse")

JihongJu (Contributor) commented Jun 8, 2017

@0x00b1 Cool. Maybe a typo here

model.compile(optimizer="adam", loss="mse") # Should be rpn/rcnn losses
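Something along these lines, assuming the loss functions sketched earlier in the thread (the names and the balancing weight are placeholders, not existing keras_rcnn symbols):

model.compile(
    optimizer="adam",
    loss=[rpn_pred_loss(lambda_=10.0), rcnn_cls_loss, rcnn_reg_loss],
)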

jhung0 (Contributor) commented Jun 8, 2017

I think that y_true and y_pred should be the same shape. Right now in the tests, classification has

y_pred = keras.backend.variable(0.5 * numpy.ones((1, 4, 4, n_anchors)))
y_true = keras.backend.variable(numpy.ones((1, 4, 4, 2 * n_anchors)))

and regression has

y_pred = keras.backend.variable(0.5 * numpy.ones((1, 4, 4, 4 * n_anchors)))
y_true = keras.backend.variable(numpy.ones((1, 4, 4, 8 * n_anchors)))

jhung0 (Contributor) commented Jun 8, 2017

https://github.com/mitmul/chainer-faster-rcnn/blob/v2/models/region_proposal_network.py has output space 2 * n_anchors for classification and 4 * n_anchors for regression. So:

rpn_classification = keras.layers.Conv2D(9 * 2, (1, 1), activation="softmax")(y.layers[-2].output)

jhung0 (Contributor) commented Jun 8, 2017

my edits:

classes = 2

x = keras.layers.Input((224, 224, 3))

y = keras_resnet.ResNet50(x, include_top=False)

rpn_classification = keras.layers.Conv2D(9 * 2, (1, 1), activation="softmax")(y)

rpn_regression = keras.layers.Conv2D(9 * 4, (1, 1))(y)

JihongJu (Contributor) commented Jun 8, 2017

@jhung0 To answer your question about the y_true shape: the first anchors values (the first half of the channels) indicate whether each anchor is taken into account (1) or not (0). I think this implementation originally came from keras-frcnn, which I think is quite ugly. You could refer to the Anchor Layer for how the anchor target is generated for us.
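A sketch of how a masked classification loss could consume that layout (the keras-frcnn-style layout described above; this is not the keras_rcnn implementation):

import keras.backend as K

def rpn_classification_loss(anchors=9):
    # assumed layout: y_true has 2 * anchors channels per cell; the first `anchors` channels
    # flag whether an anchor contributes to the loss, the last `anchors` channels hold the
    # objectness targets; y_pred has `anchors` sigmoid scores
    def f(y_true, y_pred):
        indicator = y_true[..., :anchors]  # 1 = anchor counts, 0 = ignored
        target = y_true[..., anchors:]     # objectness targets
        p = K.clip(y_pred, K.epsilon(), 1.0 - K.epsilon())
        bce = -(target * K.log(p) + (1.0 - target) * K.log(1.0 - p))
        return K.sum(indicator * bce) / K.maximum(K.sum(indicator), 1.0)
    return f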

JihongJu (Contributor) commented Jun 8, 2017

@jhung0 Yes, indeed, and we already have some work done by @0x00b1: Anchor

JihongJu (Contributor) commented Jun 9, 2017

@0x00b1 And we missed a pooling layer before the R-CNN FC layers, since the ROI layer outputs fixed-size feature maps

y = keras_rcnn.layers.ROI((7, 7), 32)([x, proposals])
y = keras.layers.AveragePooling2D(pool_size=(7, 7))(y)
y = keras.layers.Dense(4096)(y)

jhung0 (Contributor) commented Jun 9, 2017

@JihongJu I don't think we need that weird y_true shape...? The losses seem to depend only on the values for anchors that are taken into account, like in https://github.com/rbgirshick/py-faster-rcnn/blob/master/models/pascal_voc/VGG16/faster_rcnn_end2end/train.prototxt#L465

JihongJu (Contributor) commented Jun 9, 2017

@jhung0 For rpn_cls_score, I think what matters is whether we want to use the softmax loss or the logistic loss. Because the RPN has only one foreground class, I don't see a particular reason why we should use the softmax loss. What do you think, @0x00b1?
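For concreteness, the two head variants being compared (a sketch; the feature-map shape is assumed):

import keras

features = keras.layers.Input((14, 14, 512))  # backbone feature map (shape assumed)

# logistic: a single sigmoid "objectness" score per anchor
rpn_cls_sigmoid = keras.layers.Conv2D(9 * 1, (1, 1), activation="sigmoid")(features)

# softmax: two scores (background, object) per anchor, as in py-faster-rcnn; a reshape is
# needed so the softmax runs over the two classes of each anchor rather than over all 18 channels
rpn_cls_softmax = keras.layers.Conv2D(9 * 2, (1, 1))(features)
rpn_cls_softmax = keras.layers.Reshape((-1, 2))(rpn_cls_softmax)
rpn_cls_softmax = keras.layers.Activation("softmax")(rpn_cls_softmax)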

JihongJu (Contributor) commented Jun 9, 2017

@jhung0 I agree with you. y_true should have a shape as simple as (anchors,), but it will contain 0, 1, and -1. We probably need a loss that can ignore the -1s in y_true.
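A sketch of such a loss under that convention (1 = object, 0 = background, -1 = ignore; the function name is a placeholder):

import keras.backend as K

def rpn_objectness_loss(y_true, y_pred):
    # anchors labelled -1 contribute nothing to the loss
    valid = K.cast(K.not_equal(y_true, -1), K.floatx())
    target = K.cast(K.equal(y_true, 1), K.floatx())
    p = K.clip(y_pred, K.epsilon(), 1.0 - K.epsilon())
    bce = -(target * K.log(p) + (1.0 - target) * K.log(1.0 - p))
    return K.sum(valid * bce) / K.maximum(K.sum(valid), 1.0)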

0x00b1 (Contributor, Author) commented Jun 9, 2017

@0x00b1 And we missed a pooling layer before the R-CNN FC layers, since the ROI layer outputs fixed-size feature maps

Nice catch. Updated my earlier comment! 😎

0x00b1 (Contributor, Author) commented Jun 9, 2017

@jhung0 For rpn_cls_score, I think what matters is whether we want to use the softmax loss or the logistic loss. Because the RPN has only one foreground class, I don't see a particular reason why we should use the softmax loss. What do you think, @0x00b1?

Totally. We shouldn’t use softmax.

0x00b1 (Contributor, Author) commented Jun 9, 2017

@jhung0 To answer your question about the y_true shape: the first anchors values (the first half of the channels) indicate whether each anchor is taken into account (1) or not (0). I think this implementation originally came from keras-frcnn, which I think is quite ugly. You could refer to the Anchor Layer for how the anchor target is generated for us.

@jhung0 Yeah, I dislike the keras-frcnn implementation too. The Anchor layer should have more or less everything you need.

JihongJu (Contributor) commented Jun 9, 2017

@0x00b1 Another thing we missed here is that these four layers

y = keras.layers.AveragePooling2D((7, 7))(y)
y = keras.layers.Dense(4096)(y)
score = keras.layers.Dense(classes, activation="softmax")(y)
boxes = keras.layers.Dense(4 * (classes - 1))(y)

should be applied per proposal. We will need the TimeDistributed layer from keras for this purpose.
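Roughly what that wrapping would look like, assuming the ROI layer output has shape (batch, proposals, 7, 7, channels); a Flatten is added so the Dense layers see one vector per proposal (a sketch, not the final code):

y = keras.layers.TimeDistributed(keras.layers.AveragePooling2D((7, 7)))(y)
y = keras.layers.TimeDistributed(keras.layers.Flatten())(y)
y = keras.layers.TimeDistributed(keras.layers.Dense(4096))(y)

score = keras.layers.TimeDistributed(keras.layers.Dense(classes, activation="softmax"))(y)
boxes = keras.layers.TimeDistributed(keras.layers.Dense(4 * (classes - 1)))(y)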

JihongJu (Contributor) commented Jun 9, 2017

@0x00b1 I modified the code above for the ResNet and added it to #27.

emedinac commented Jul 3, 2017

Good evening, everyone.
I don't have the honor of being a contributor here and I'm not an expert in Keras programming, but I would like to suggest an idea about the RPN "layer".

I saw in the file "keras_rcnn/models.py" that the RPN is instantiated as a model and not as a layer, which was the original idea here in this amazing group. I know the RPN is based on a CNN, but I think it would be better if this block were instantiated as a layer, so it can be used as a module. So I propose the following (of course, as I said, I'm still not a good programmer like the people working here):
Thank you for reading this message.

class RPN(keras.engine.topology.Layer):
    def __init__(self, anchors=9 , **kwargs):
        self.anchors_cls = anchors * 1
        self.anchors_reg = anchors * 4
        super(RPN, self).__init__(**kwargs)

    def build(self, input_shape):
        self.channels = self.anchors_cls + self.anchors_reg
        super(RPN, self).build(input_shape)

    def call(self, inputs):
        # y = inputs.layers[-2].output
        y = inputs
        a = keras.layers.Conv2D(self.anchors_cls, (1, 1), activation="sigmoid")(y)
        b = keras.layers.Conv2D(self.anchors_reg, (1, 1))(y)

        y = keras.layers.concatenate([a, b]) 
        return y # [rpn_cls, rpn_reg]

    def compute_output_shape(self, input_shape):
        return None, input_shape[1], input_shape[2], self.channels  # shape=(?, 500, 375, 45) for VOC2012

For a non-concatenated output it could be this (I don't think this is elegant programming, but it worked in the cases I tested, such as concatenating the outputs again):

    def call(self, inputs):
        # y = inputs.layers[-2].output
        y = inputs

        a = keras.layers.Conv2D(self.anchors_cls, (1, 1), activation="sigmoid")(y)
        b = keras.layers.Conv2D(self.anchors_reg, (1, 1))(y)

        # y = keras.layers.concatenate([a, b]) 
        return [a,b] # [rpn_cls, rpn_reg]

    def compute_output_shape(self, input_shape):
        out1 = None, input_shape[1], input_shape[2], self.anchors_cls, 
        out2 = None, input_shape[1], input_shape[2], self.anchors_reg
        return [out1,out2]
    def compute_mask(self, inputs, mask=None):
        return 2 * [None]
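A hypothetical usage sketch of the proposed layer on top of the backbone used earlier in this thread (assuming keras_resnet.ResNet50(x, include_top=False) returns the feature tensor, as in the snippets above):

import keras
import keras_resnet

x = keras.layers.Input((224, 224, 3))
features = keras_resnet.ResNet50(x, include_top=False)

rpn_cls, rpn_reg = RPN(anchors=9)(features)  # second variant: separate classification and regression maps
# rpn_out = RPN(anchors=9)(features)         # first variant: one concatenated (cls, reg) map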
