Region proposal network (RPN) layer #7
Here are a few links with information about implementing layers with multiple inputs: keras-team/keras#148
This issue has information about implementing a loss function for an intermediate layer.
@0x00b1 Why would the RPN need the ground truth as input? As far as I understand, the RPN takes the conv features as input and predicts labels and bounding box transforms for K anchors in each cell. I guess what you mean here is something like an AnchorTargetLayer, which produces anchor classification labels and bbox regression targets given the ground truth bounding boxes?
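For reference, the "K anchors in each cell" idea can be sketched in plain NumPy. This is a hypothetical illustration, not the keras-rcnn API: the function name, stride, and default scales/ratios are assumptions (the defaults match the Faster R-CNN paper).

```python
import numpy as np

def generate_anchors(feature_shape, stride=16, scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return (H * W * K, 4) anchors as (x1, y1, x2, y2) in image coordinates,
    where K = len(scales) * len(ratios) anchors are centered on each cell."""
    h, w = feature_shape
    anchors = []
    for y in range(h):
        for x in range(w):
            # center of this feature-map cell, mapped back to image space
            cx, cy = (x + 0.5) * stride, (y + 0.5) * stride
            for scale in scales:
                for ratio in ratios:
                    aw = scale * np.sqrt(ratio)   # anchor width for this aspect ratio
                    ah = scale / np.sqrt(ratio)   # anchor height
                    anchors.append([cx - aw / 2, cy - ah / 2, cx + aw / 2, cy + ah / 2])
    return np.array(anchors)

# a 14x14 feature map with stride 16 yields 14 * 14 * 9 = 1764 anchors
print(generate_anchors((14, 14)).shape)  # (1764, 4)
```

The RPN heads then predict one objectness score and four box deltas per anchor, which is where the 9 * 1 and 9 * 4 channel counts in the conv layers below come from.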
Yeah, exactly. I imagined a layer that’d encapsulate AnchorLayer, AnchorTargetLayer, and ProposalLayer into one layer to circumvent the awkward train-and-predict step in Ross’ implementation. Unfortunately, it’s still unclear how this should be implemented. That’s why I’ve been implementing the Anchor and Proposal layers in parallel. What do you think?
I do think it's a good idea to make the train-to-predict switching easier. But I'm not sure whether encapsulating the ProposalLayer and the AnchorTargetLayer into one is the way to go. Feeding the ground truth as inputs, instead of labels, to the model could be problematic during testing; I can hardly imagine what to feed in its place. How about keeping the anchor ground truth generation out of the model definition? What we want then becomes a loss function that calculates the losses given the "objectness" scores and the ground truth bounding boxes directly. We can put the AnchorTargetLayer inside such a loss function.
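The AnchorTargetLayer logic being discussed, which assigns each anchor a 1 (object), 0 (background), or -1 (ignored) label from its overlap with the ground truth, could be sketched roughly like this in NumPy. The names and thresholds here are illustrative (the 0.7/0.3 thresholds follow the Faster R-CNN paper), not the keras-rcnn API:

```python
import numpy as np

def iou(anchors, gt):
    """IoU between each anchor (N, 4) and each GT box (M, 4); boxes are (x1, y1, x2, y2)."""
    x1 = np.maximum(anchors[:, None, 0], gt[None, :, 0])
    y1 = np.maximum(anchors[:, None, 1], gt[None, :, 1])
    x2 = np.minimum(anchors[:, None, 2], gt[None, :, 2])
    y2 = np.minimum(anchors[:, None, 3], gt[None, :, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (anchors[:, 2] - anchors[:, 0]) * (anchors[:, 3] - anchors[:, 1])
    area_g = (gt[:, 2] - gt[:, 0]) * (gt[:, 3] - gt[:, 1])
    return inter / (area_a[:, None] + area_g[None, :] - inter)

def anchor_labels(anchors, gt, pos_thresh=0.7, neg_thresh=0.3):
    """Label each anchor 1 (object), 0 (background), or -1 (ignored)."""
    overlaps = iou(anchors, gt).max(axis=1)   # best overlap of each anchor with any GT box
    labels = np.full(len(anchors), -1)        # everything in between stays "don't care"
    labels[overlaps >= pos_thresh] = 1
    labels[overlaps < neg_thresh] = 0
    return labels
```

Putting this inside the loss function, as suggested above, would keep the ground truth out of the model's forward pass entirely.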
@0x00b1 I am thinking of structuring it as:

```python
x = keras.layers.Input((223, 223, 3))
a = keras_resnet.ResNet50(x)
[rpn_cls, rpn_reg] = keras_rcnn.layers.RegionProposalNetwork()(a)
rpn_pred = keras.backend.concatenate([rpn_cls, rpn_reg])
proposals = keras_rcnn.layers.ObjectProposal()([rpn_cls, rpn_reg])
[rcnn_cls, rcnn_reg] = keras_rcnn.layers.ROI([7, 7])([x, proposals])
model = Model(inputs=x, outputs=[rpn_pred, rcnn_cls, rcnn_reg])
model.compile(loss=[rpn_pred_loss, rcnn_cls_loss, rcnn_reg_loss], optimizer="adam")
```

And we have

Then we can also extend this to Mask R-CNN as simply as adding a mask branch and another loss.
@JihongJu I love this! Especially this:

```python
x = keras.layers.Input((223, 223, 3))
a = keras_resnet.ResNet50(x)
[rpn_cls, rpn_reg] = keras_rcnn.layers.RegionProposalNetwork()(a)
rpn_pred = keras.backend.concatenate([rpn_cls, rpn_reg])
proposals = keras_rcnn.layers.ObjectProposal()([rpn_cls, rpn_reg])
[rcnn_cls, rcnn_reg] = keras_rcnn.layers.ROI([7, 7])([x, proposals])
model = Model(inputs=x, outputs=[rpn_pred, rcnn_cls, rcnn_reg])
model.compile(loss=[rpn_pred_loss, rcnn_cls_loss, rcnn_reg_loss], optimizer="adam")
```
@JihongJu I started structuring this into code:

```python
classes = 2

x = keras.layers.Input((224, 224, 3))

y = keras_resnet.ResNet50(x)

rpn_classification = keras.layers.Conv2D(9 * 1, (1, 1), activation="sigmoid")(y.layers[-2].output)
rpn_regression = keras.layers.Conv2D(9 * 4, (1, 1))(y.layers[-2].output)

rpn_prediction = keras.layers.concatenate([rpn_classification, rpn_regression])

proposals = keras_rcnn.layers.object_detection.ObjectProposal(300)([rpn_classification, rpn_regression])

y = keras_rcnn.layers.ROI((7, 7), 32)([x, proposals])
y = keras.layers.AveragePooling2D((7, 7))(y)
y = keras.layers.Dense(4096)(y)

score = keras.layers.Dense(classes, activation="softmax")(y)
boxes = keras.layers.Dense(4 * (classes - 1))(y)

model = keras.models.Model(x, [rpn_prediction, score, boxes])

model.compile(optimizer="adam", loss="mse")
```
I started working on the loss function:
https://github.com/broadinstitute/keras-rcnn/blob/master/keras_rcnn/losses/rpn.py
https://github.com/broadinstitute/keras-rcnn/blob/master/tests/losses/test_rpn.py
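For context, the box-regression part of the RPN loss in the Faster R-CNN paper is a smooth L1 ("robust") loss. Here is a minimal NumPy sketch using the sigma parameterization from the py-faster-rcnn code; this is an assumption about the intended behavior, not necessarily what the linked rpn.py implements:

```python
import numpy as np

def smooth_l1(y_true, y_pred, sigma=3.0):
    """Smooth L1 loss: quadratic for small residuals, linear for large ones,
    so outlier boxes don't dominate the gradient the way they would under MSE."""
    diff = np.abs(y_true - y_pred)
    s2 = sigma ** 2
    # |d| < 1/sigma^2 -> 0.5 * sigma^2 * d^2, otherwise |d| - 0.5/sigma^2
    return np.where(diff < 1.0 / s2, 0.5 * s2 * diff ** 2, diff - 0.5 / s2).sum()
```

In the paper this term is only accumulated over anchors labeled as positive, which ties back to the y_true shape discussion below.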
@0x00b1 Cool. Maybe a typo here:

```python
model.compile(optimizer="adam", loss="mse")  # should be the RPN/R-CNN losses
```
I think that y_true and y_pred should be the same shape. Right now in the tests, classification has
and regression has
https://github.com/mitmul/chainer-faster-rcnn/blob/v2/models/region_proposal_network.py
for classification and
for regression. So
my edits:
@jhung0 To answer your question about the
@0x00b1 And we missed a pooling layer before the R-CNN FC layers, since the ROI layer outputs fixed-size feature maps:

```python
y = keras_rcnn.layers.ROI((7, 7), 32)([x, proposals])
y = keras.layers.AveragePooling2D(pool_size=(7, 7))(y)
y = keras.layers.Dense(4096)(y)
```
@JihongJu I don't think we need that weird
@jhung0 I agree with you. y_true should have a shape as simple as (anchors,). But we will have 0, 1 and -1 in it. Probably we need a loss that can ignore the -1s in y_true.
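A loss that ignores the -1 ("don't care") anchors might look like this NumPy sketch. It is illustrative only; a Keras version would build the same boolean mask with backend ops, and the function name is an assumption:

```python
import numpy as np

def masked_binary_crossentropy(y_true, y_pred, ignore_label=-1, eps=1e-7):
    """Binary cross-entropy averaged only over anchors labeled 0 or 1;
    anchors labeled -1 ("don't care") contribute nothing to the loss."""
    mask = y_true != ignore_label
    y_t = y_true[mask].astype(float)
    y_p = np.clip(y_pred[mask], eps, 1 - eps)  # clip to avoid log(0)
    return -np.mean(y_t * np.log(y_p) + (1 - y_t) * np.log(1 - y_p))

y_true = np.array([1, 0, -1, 1])
y_pred = np.array([0.9, 0.1, 0.5, 0.8])
loss = masked_binary_crossentropy(y_true, y_pred)  # the -1 anchor is excluded
```

This keeps y_true at the simple (anchors,) shape while still letting the AnchorTargetLayer mark anchors it wants excluded from training.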
Nice catch. Updated my earlier comment! 😎
@jhung0 Yeah, I dislike the keras-frcnn implementation too. The Anchor layer should have more or less everything you need.
@0x00b1 Another thing we missed here is that these four layers:

```python
y = keras.layers.AveragePooling2D((7, 7))(y)
y = keras.layers.Dense(4096)(y)

score = keras.layers.Dense(classes, activation="softmax")(y)
boxes = keras.layers.Dense(4 * (classes - 1))(y)
```

should be applied per proposal. We will need the TimeDistributed layer from Keras for this purpose.
Good evening, everyone. I saw in the file keras_rcnn/models.py that the RPN was instantiated as a model and not as a layer, which was the original idea here in this amazing group. I know the RPN is based on a CNN, but I think it would be better if this block were instantiated as a layer, to modularize it. So I propose this (of course, as I said, I'm still not a good programmer, unlike the people working here):
For non-concatenated output it could be this (I think this is not elegant programming, but it works in the cases I tested, such as concatenating the outputs again):
The region proposal network (RPN) should take two inputs, image features (i.e. features extracted by ResNet) and ground truth bounding boxes, and produce object proposals and corresponding “objectness” scores. I’m envisioning something like: