
Region Proposal Network (RPN) classification and regression losses (WIP) #41

Merged
merged 17 commits into master from features/region-proposal-network-losses
Aug 17, 2017

Conversation

0x00b1
Contributor

@0x00b1 0x00b1 commented Jul 10, 2017

Implementation of the Region Proposal Network (RPN) classification and regression losses.

Noteworthy items:

  • We’ve refactored anchor generation into Keras backend functions so it runs on any Keras backend (e.g. TensorFlow or Theano).

  • The loss is computed on a Lambda layer. This is a workaround for Keras’ assumption that the y_true and y_pred values have identical shapes. We should probably refactor this into a custom layer to simplify model construction.

  • It’s buggy. While there are plentiful unit tests, they likely don’t provide sufficient coverage.

  • We need to write documentation.

@0x00b1 0x00b1 changed the title from "Region Proposal Network (RPN) classification and regression losses" to "Region Proposal Network (RPN) classification and regression losses (WIP)" on Jul 10, 2017
@JihongJu
Contributor

JihongJu commented Jul 10, 2017

@0x00b1

Keras’ assumption that the y_true and y_pred values have identical shapes.

That is interesting. I wasn't aware of it. Do you think we could add pre-processing to targets and generate y_true on the fly?
Keras model.compile L833

def example(x):
    y_true, y_pred = x

    return loss(y_true, y_pred)
Contributor

@JihongJu JihongJu Jul 10, 2017

I think loss here needs to be passed as an argument to the Lambda layer. Otherwise, it will cause problems while saving and loading the model. See #5396.

(Not tested)

rpn_proposal_loss = keras_rcnn.losses.rpn.proposal(9, (224, 224), 16)

def rpn_loss(x, loss):
    y_true, y_pred = x
    return loss(y_true, y_pred)

loss = keras.layers.Lambda(rpn_loss, arguments={'loss': rpn_proposal_loss})([y_true, y_pred])

Contributor

Cool! Incorporating that

@codecov-io

codecov-io commented Jul 10, 2017

Codecov Report

Merging #41 into master will increase coverage by 12.35%.
The diff coverage is 100%.

Impacted file tree graph

@@             Coverage Diff             @@
##           master      #41       +/-   ##
===========================================
+ Coverage   53.64%   65.99%   +12.35%     
===========================================
  Files          17       17               
  Lines         563      644       +81     
===========================================
+ Hits          302      425      +123     
+ Misses        261      219       -42
Impacted Files Coverage Δ
keras_rcnn/losses/rpn.py 100% <100%> (ø) ⬆️
keras_rcnn/backend/common.py 100% <100%> (+27.27%) ⬆️
...s_rcnn/layers/object_detection/_object_proposal.py 100% <100%> (ø) ⬆️
keras_rcnn/backend/tensorflow_backend.py 100% <100%> (+20.61%) ⬆️

Continue to review full report at Codecov.

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e341198...9a7a4c1. Read the comment docs.

@jhung0
Contributor

jhung0 commented Jul 11, 2017

The code to handle the different shapes was inspired by keras-team/keras#4781, which suggests using dummy targets.
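
For context, a minimal sketch of the dummy-target trick from that thread (assuming a model whose single output is already the loss tensor; model, images, and boxes stand in for objects built elsewhere in this conversation):

import numpy

# the model's output is the loss itself, so the compiled loss just passes it through
model.compile(optimizer="adam", loss=lambda y_true, y_pred: y_pred)

# dummy targets of the right batch size; their values are never used
model.fit([images, boxes], numpy.zeros((len(images), 1)), epochs=1)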

Contributor

@hgaiser hgaiser left a comment

I just noticed this repository yesterday and, although it is far from finished, it looks really well thought out. I'm hoping I can contribute actively to this project in the near future. I was working on my own port of py-faster-rcnn because I was unhappy with the existing ports for Keras / TensorFlow, but got stuck on the losses for the RPN network. I believe the issue is mainly with how Keras forces the loss to be computed in a certain way and I made an issue about it on the Keras GitHub page (keras-team/keras#7395). If you would like to contribute to that discussion, that'd be helpful. I'm curious how the RPN loss is currently handled in keras-rcnn; that part is still a bit unclear to me. I'm guessing it is some form of workaround, but which kind is it? :p

In my port I had made a custom layer, analogous to this example: https://github.com/fchollet/keras/blob/master/examples/variational_autoencoder.py#L56-L61. I got a training network with a loss that was nicely shrinking, but the resulting bboxes seem to favor scaling to the entire image. I can't figure out for the life of me why that happens.

Anyway, I'm sort of abusing this review to say hello and to see where I can contribute :)


def anchor(base_size=16, ratios=None, scales=None):
    """
    Generates a regular grid of multi-aspect and multi-scale anchor boxes.
    """
    if ratios is None:
        ratios = keras.backend.variable(numpy.array([0.5, 1, 2]))
        ratios = keras.backend.cast([0.5, 1, 2], 'float32')
Contributor

@hgaiser hgaiser Jul 21, 2017

Why not change the default value in the method definition to keras.backend.cast([0.5, 1, 2], 'float32')?

Better yet, https://www.tensorflow.org/api_docs/python/tf/contrib/keras/backend/floatx
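
A minimal sketch of that suggestion (the helper name is hypothetical):

import keras.backend

def default_ratios():
    # build the default aspect ratios with the backend's configured float type
    return keras.backend.cast([0.5, 1, 2], keras.backend.floatx())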


def anchor(base_size=16, ratios=None, scales=None):
    """
    Generates a regular grid of multi-aspect and multi-scale anchor boxes.
    """
    if ratios is None:
        ratios = keras.backend.variable(numpy.array([0.5, 1, 2]))
        ratios = keras.backend.cast([0.5, 1, 2], 'float32')

    if scales is None:
Contributor

Same here.

    return anchors


def _ratio_enum(anchor, ratios):
    """
    Enumerate a set of anchors for each aspect ratio wrt an anchor.
    """

    # import pdb
Contributor

Used for debugging? Can be removed?

Contributor

Yep, missed that one.

ws = proposals[:, 2] - proposals[:, 0] + 1
hs = proposals[:, 3] - proposals[:, 1] + 1

indicies = keras_rcnn.backend.where((ws >= minimum) & (hs >= minimum))
Contributor

Isn't it indices? :p

    def __init__(self, maximum_proposals=300, **kwargs):
        self.output_dim = (None, None, 4)
        self.output_dim = (None, maximum_proposals, 4)
Contributor

I may be mistaken, but I believe the name dim is generally reserved for a scalar value, representing the length of one dimension (https://keras.io/getting-started/sequential-model-guide/#specifying-the-input-shape)

Contributor

Yes, it seems so. What name would be good? output_shape?

Contributor

Yeah I think so.
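
A minimal sketch of the rename being discussed (the class and attribute names are assumptions; note that keras.layers.Layer already exposes an output_shape property, so a slightly different attribute name avoids shadowing it):

import keras

class ObjectProposal(keras.layers.Layer):
    def __init__(self, maximum_proposals=300, **kwargs):
        # store the intended output shape under a dedicated attribute
        self.proposal_shape = (None, maximum_proposals, 4)

        super(ObjectProposal, self).__init__(**kwargs)

    def compute_output_shape(self, input_shape):
        return self.proposal_shape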

@hgaiser hgaiser mentioned this pull request Jul 25, 2017
@0x00b1
Contributor Author

0x00b1 commented Jul 26, 2017

I just noticed this repository yesterday and, although it is far from finished, it looks really well thought out. I'm hoping I can contribute actively to this project in the near future. I was working on my own port of py-faster-rcnn because I was unhappy with the existing ports for Keras / TensorFlow, but got stuck on the losses for the RPN network. I believe the issue is mainly with how Keras forces the loss to be computed in a certain way and I made an issue about it on the Keras GitHub page (keras-team/keras#7395). If you would like to contribute to that discussion, that'd be helpful. I'm curious how the RPN loss is currently handled in keras-rcnn; that part is still a bit unclear to me. I'm guessing it is some form of workaround, but which kind is it? :p

Hi, @hgaiser! Let’s collaborate! It’s a tricky problem and we could use the help!

In this pull request, the approach:

  • For y_pred, the two feature maps produced by the convolutional layers that return the regression (for the reference bounding boxes) and the classification (for the reference bounding box’s objectness) are concatenated into one matrix (see the sketch after this list).
  • For y_true, we generate the anchors as described by the Faster R-CNN paper. It’s somewhat different from the py-faster-rcnn implementation by @rbgirshick in that the anchors are generated on the GPU (as part of the loss computation).
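
A minimal sketch of the y_pred side of this (layer sizes are illustrative and assume 9 anchors per location):

import keras

features = keras.layers.Input((14, 14, 512))

# objectness scores and box regressions from the RPN head, concatenated into one tensor
scores = keras.layers.Conv2D(1 * 9, (1, 1), activation="sigmoid")(features)
deltas = keras.layers.Conv2D(4 * 9, (1, 1))(features)

y_pred = keras.layers.concatenate([scores, deltas])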

I should add that it’s still wonky since we have an intermediate loss (i.e. a loss computed on an intermediate layer). However, in the past week or two I’ve been thinking about a possible RPN implementation that treats the RPN losses as regularization terms rather than traditional losses.

I’m curious, @hgaiser, is this a problem you’d like to help out with? If so, we should schedule some time to talk! This offer also includes anybody else who is lurking and wants to help!

@hgaiser
Contributor

hgaiser commented Jul 26, 2017

Hi,

I would very much like to collaborate! I had been working on a keras port by myself and it was driving me insane ;)

I've already made a few changes to this PR; I'd like to share them with you tomorrow in another PR. That is sort of my view on how I imagine it; I would like to hear what you had in mind and what you think of my approach. How shall we talk? I'm always online on Slack (kerasteam); maybe that could be a good channel.


image = keras.layers.Input((224, 224, 3))

y_true = keras.layers.Input((None, 4), name="y_true")
Contributor Author

This is a problem. You’d need to add a bounding box input to the predict method because the model will expect two inputs (an image and corresponding bounding boxes). A possible implementation is creating a custom layer that acts like keras.layers.Input during training but produces reference bounding boxes during test.
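
A minimal sketch of that idea, offered as an assumption rather than the PR's actual implementation (the class name is hypothetical, and keras_rcnn.backend.anchor is assumed to be the anchor generation refactored in this pull request):

import keras
import keras.backend

import keras_rcnn.backend

class ReferenceBoxes(keras.layers.Layer):
    def call(self, inputs, training=None):
        # ground-truth boxes are fed in during training
        ground_truth = inputs

        # at test time, fall back to generated reference boxes (anchors)
        anchors = keras.backend.expand_dims(keras_rcnn.backend.anchor(), 0)

        return keras.backend.in_train_phase(ground_truth, anchors, training=training)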

y_true = keras.layers.Input((None, 4), name="y_true")


features = keras.layers.Conv2D(64, **options)(image)
Contributor Author

I’d use a ResNet implementation from the keras-resnet package rather than VGG, since we’ll consider that the standard and it will match everything else.
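
A minimal sketch of that suggestion, mirroring the ResNet50 call that appears later in this pull request:

import keras
import keras_resnet.models

image = keras.layers.Input((224, 224, 3))

# shared convolutional features from a keras-resnet backbone instead of a VGG-style stack
features = keras_resnet.models.ResNet50(image, include_top=False).output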

loss_c, loss_r = loss(y_true, y_pred)
return loss_c + loss_r

# use the following for testing:
Contributor Author

You can remove these comments.


rpn_loss = keras_rcnn.losses.rpn.proposal(9, (224, 224), 16)

def example(x, loss):
Contributor Author

Use a more descriptive identifier.


y_true = numpy.expand_dims(y_true, 0)

model.fit([a, y_true], [None])
Contributor Author

You should also test model.predict and model.evaluate.

@mcquin

mcquin commented Aug 2, 2017

Remaining work: #52 (comment)


def anchor(base_size=16, ratios=None, scales=None):
    """
    Generates a regular grid of multi-aspect and multi-scale anchor boxes.
    """
    if ratios is None:
        ratios = keras.backend.variable(numpy.array([0.5, 1, 2]))
        ratios = keras.backend.cast([0.5, 1, 2], keras.backend.floatx())
Contributor

Isn't cast intended to cast Tensors from one type to another? An alternative is to use keras.backend.variable(numpy.array([0.5, 1, 2]), dtype=keras.backend.floatx()).


targets = keras.backend.transpose(targets)

return targets
return keras.backend.cast(targets, 'float32')
Contributor

keras.backend.floatx(), but is it even necessary? Isn't targets already float here?

    def __init__(self, anchors, **kwargs):
        self.anchors = anchors

        self.is_placeholder = True
Contributor

I noticed this in the example layer too, but upon further inspection I don't think it is necessary. Looking at Keras' source code it doesn't even appear to be necessary in the example.


        super(Classification, self).__init__(**kwargs)

    def _loss(self, y_true, y_pred):
Contributor

Hmm maybe a personal preference, but I think it's better to rename y_true and y_pred to rpn_cls_score and rpn_labels or something similar.


        super(Regression, self).__init__(**kwargs)

    def _loss(self, y_true, y_pred):
Contributor

Same here.


        self.add_loss(loss, inputs=inputs)

        return y_pred
Contributor

It's a bit of a semantic point, but we could return loss instead of y_pred here. It is ignored, but semantically it makes more sense (and it is potentially necessary in the future, if Keras decides to change the way losses are handled).
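
A minimal sketch of that suggestion (the layer and its loss are placeholders, not the RPN losses from this pull request):

import keras
import keras.backend

class ExampleLossLayer(keras.layers.Layer):
    def call(self, inputs):
        y_true, y_pred = inputs

        # placeholder loss; register it with the model via add_loss
        loss = keras.backend.mean(keras.backend.square(y_pred - y_true))

        self.add_loss(loss, inputs=inputs)

        # return the loss tensor itself rather than y_pred
        return loss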

import keras.engine

import keras_rcnn.backend

# FIXME: remove global
RPN_PRE_NMS_TOP_N = 12000
Contributor

Just a note, 12000 is only used during training, not testing.
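
A minimal sketch of how that could be captured, assuming the 12000/6000 pre-NMS limits that py-faster-rcnn uses for training and testing respectively:

# separate pre-NMS limits for training and testing
RPN_PRE_NMS_TOP_N_TRAIN = 12000
RPN_PRE_NMS_TOP_N_TEST = 6000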

@@ -1,21 +1,107 @@
import keras
Contributor

This entire file can be removed?

# ResNet50 as encoder
encoder = keras_resnet.models.ResNet50
image, _, _ = inputs
features = keras_resnet.models.ResNet50(image, blocks=blocks, include_top=False).output
Contributor

@hgaiser hgaiser Aug 3, 2017

I was thinking about this one... do we want to call the model as a function (i.e. keras_resnet.models.ResNet50(image, blocks=blocks, include_top=False)(image)) or do we want to incorporate every layer in the RCNN model? The difference is that in the summary you'd either see all the layers (the way it is now: res2-res4 and everything in between) or a single 'layer' called ResNet50. Let me know what you think. For example, I'm running an FCN network (with two branches, one processing RGB, the other processing depth) at the moment, and its summary looks like this:

____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to                     
====================================================================================================
input_25 (InputLayer)            (None, 1024, 1280, 3) 0                                            
____________________________________________________________________________________________________
input_26 (InputLayer)            (None, 1024, 1280, 1) 0                                            
____________________________________________________________________________________________________
resnet50-rgb (Model)             (None, 1024, 1280, 1) 33155975    input_25[0][0]                   
____________________________________________________________________________________________________
resnet50-depth (Model)           (None, 1024, 1280, 1) 33149703    input_26[0][0]                   
____________________________________________________________________________________________________
merged (Concatenate)             (None, 1024, 1280, 2) 0           resnet50-rgb[1][0]               
                                                                   resnet50-depth[1][0]             
____________________________________________________________________________________________________
segmentation (Conv2D)            (None, 1024, 1280, 1) 3           merged[0][0]                     
____________________________________________________________________________________________________
flatten_11 (Flatten)             (None, 1310720)       0           segmentation[0][0]               
====================================================================================================
Total params: 66,305,681
Trainable params: 66,199,441
Non-trainable params: 106,240


    def _loss(self, y_true, y_pred):
        # Binary classification loss
        x, y = y_pred[:, :, :, :], y_true[:, :, :, self.anchors:]
Contributor

I believe y_true is shaped (None, None) (as in (batch_id, label_id)). Slicing it like y_true[:, :, :, self.anchors:] won't work (why is slicing necessary on y_true anyway?)

Contributor Author

@0x00b1 0x00b1 Aug 4, 2017

I think the arguments are swapped.

        # Binary classification loss
        x, y = y_pred[:, :, :, :], y_true[:, :, :, self.anchors:]

        a = y_true[:, :, :, :self.anchors] * keras.backend.binary_crossentropy(x, y)
Contributor

I have two questions here: why use binary_crossentropy? And why multiply by y_true?
I'm thinking binary_crossentropy should be used when there is only one output, but we have two. One option is to use only one of the two outputs (the foreground output, probably) with binary_crossentropy (the option you seem to implement here); the other option is to use categorical_crossentropy and convert the labels to one-hot vectors. Which is preferable? I don't know :) Intuitively I'd guess categorical_crossentropy. I'd love to hear your thoughts about this.
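
A minimal sketch of the two options being compared, written out explicitly so it does not depend on the backend helpers' argument order (function and variable names are hypothetical):

import keras.backend

def binary_option(label, p_fg):
    # label is 1 for positive anchors and 0 for negative ones; p_fg is the
    # predicted foreground probability per anchor (sigmoid output)
    eps = keras.backend.epsilon()

    return -keras.backend.mean(
        label * keras.backend.log(p_fg + eps)
        + (1 - label) * keras.backend.log(1 - p_fg + eps))

def categorical_option(one_hot, p):
    # one_hot is a (background, foreground) one-hot vector per anchor; p is the
    # softmax output over the two classes
    eps = keras.backend.epsilon()

    return -keras.backend.mean(
        keras.backend.sum(one_hot * keras.backend.log(p + eps), axis=-1))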

        b = keras.backend.epsilon() + y_true[:, :, :, :self.anchors]
        b = keras.backend.sum(b)

        return 1.0 * (a / b)
Contributor

Why not return keras.backend.mean(keras.backend.binary_crossentropy(x, y)) directly? Why divide by anchor overlaps?


    def _loss(self, y_true, y_pred):
        # Robust L1 Loss
        x = y_true[:, :, :, 4 * self.anchors:] - y_pred
Contributor

Same as Classification, I believe it is sized (None, None, 4) (batch_id, box_id, box).

@0x00b1
Contributor Author

0x00b1 commented Aug 7, 2017

@hgaiser Good call. Fixed compute_output_shape for the loss layers.

@hgaiser
Contributor

hgaiser commented Aug 7, 2017 via email


    def compute_output_shape(self, input_shape):
        return [(None, 1), (None, 4)]

Contributor

I think a get_config() method would be useful for saving and loading a trained model.
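
A minimal sketch of such a get_config(), assuming anchors is the only constructor argument besides **kwargs (using the Classification layer shown earlier; adjust the class name for the other layers):

    def get_config(self):
        # record the constructor arguments so the layer can be re-instantiated on load
        config = {"anchors": self.anchors}

        base_config = super(Classification, self).get_config()

        return dict(list(base_config.items()) + list(config.items()))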

adjoint_b=False,
a_is_sparse=False,
b_is_sparse=False
a,
Contributor

Just my curiosity here, why 8 spaces?

@0x00b1 0x00b1 force-pushed the features/region-proposal-network-losses branch from 47860b3 to df93c4e Compare August 15, 2017 17:38
@0x00b1 0x00b1 merged commit 14add26 into master Aug 17, 2017
@0x00b1 0x00b1 deleted the features/region-proposal-network-losses branch August 17, 2017 19:44