Refactor data generator #71
Conversation
Hi, @mihaimorariu! I appreciate the contribution! I'll review either this evening or tomorrow. Meanwhile, would you mind adding yourself to AUTHORS.rst?
@mihaimorariu Would you mind rebasing?
Force-pushed from bd05718 to dfd9837
Hi, @0x00b1! Thank you! I am new to this, but I would be glad to contribute more. I have rebased and added myself to AUTHORS.rst, as requested.
current_index = (self.batch_index * batch_size) % n
def scale_shape(shape, min_size=224, max_size=224):
Nice! We really needed this! 👍
You should add a test for this function to tests.preprocessing.object_detection.
current_index = (self.batch_index * batch_size) % n
def scale_shape(shape, min_size=224, max_size=224):
It’s a little wonky and unexpected for shape to include the channels dimension. I would expect this to work like shape parameters elsewhere (e.g. Keras’ ImageDataGenerator), where it’s a (width, height) pair.
It also has the benefit of not needing separate “channels first” and “channels last” branches.
Sure, I can change that. But I think it would be useful to agree on whether we are going to use (width, height) tuples, from now on, or (height, width). I saw different usage in different parts of the code.
self.batch_index += 1
else:
    current_batch_size = n - current_index
return (int(shape[0] * scale), int(shape[1] * scale), shape[2]), scale
An implementation that might be clearer:

```python
rr, cc = shape
rr *= scale
cc *= scale
return (int(rr), int(cc)), scale
```
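For context, here is a minimal sketch of what the whole function might look like under an assumed contract. The min_size/max_size semantics below are a guess at the common Faster R-CNN convention (scale the shorter side up to min_size, capped so the longer side never exceeds max_size) and are not confirmed by this PR; the channels dimension is dropped as suggested above.

```python
# Hypothetical sketch of scale_shape; the min/max semantics are an
# assumption, not taken from this PR.
def scale_shape(shape, min_size=224, max_size=224):
    rr, cc = shape                      # (rows, cols), no channels
    scale = min_size / min(rr, cc)      # bring shorter side to min_size
    if max(rr, cc) * scale > max_size:  # cap so longer side fits max_size
        scale = max_size / max(rr, cc)
    return (int(rr * scale), int(cc * scale)), scale
```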
self.batch_index += 1
else:
    current_batch_size = n - current_index
return (int(shape[0] * scale), int(shape[1] * scale), shape[2]), scale
If you drop the channels dimension, you’ll need to drop the channels dimension here too.
yield index_array[current_index:current_index + current_batch_size], current_index, current_batch_size
return skimage.transform.rescale(image, scale=scale, mode="reflect"), scale
I don’t think this function is needed. I’d inline skimage.transform.rescale.
return self.next(*args, **kwargs)
class DictionaryIterator(keras.preprocessing.image.Iterator):
def __init__(self, data, classes, image_data_generator, image_shape, batch_size=1,
             shuffle=True, seed=numpy.uint32(time.time() * 1000)):
Likewise, the seed semantics should match, so it should be seed=None.
self.data = data
self.classes = classes
self.image_data_generator = image_data_generator
self.image_shape = image_shape
self.image_shape is unused.
self.classes = classes
self.image_data_generator = image_data_generator
self.image_shape = image_shape
self.target_shape, _ = scale_shape(image_shape)
This shouldn’t be inside the constructor, since target_shape depends on the image.
You’re missing the call to super().
return [image_batch, gt_boxes_batch, metadata], None
You’re missing the y value: [boxes, labels]
You’re also missing labels from x.
image = skimage.io.imread(pathname)
for batch_index, image_index in enumerate(selection):
    path = self.data[image_index]["filename"]
    image = skimage.io.imread(path, as_grey=(self.target_shape[2] == 1))
You don’t need to specify as_gray.
Thanks, @mihaimorariu! This was great. I requested a few changes. Let me know if you need help!
def flow(self, dictionary, shuffle=True, seed=None):
return DictionaryIterator(dictionary, self, shuffle=shuffle, seed=seed)
def flow(self, data, classes, image_shape):
return DictionaryIterator(data, classes, keras.preprocessing.image.ImageDataGenerator(), image_shape)
keras.preprocessing.image.ImageDataGenerator() should be self.
def flow(self, dictionary, shuffle=True, seed=None):
return DictionaryIterator(dictionary, self, shuffle=shuffle, seed=seed)
def flow(self, data, classes, image_shape):
I recommend that we keep using dictionary rather than data. Would flow_from_dictionary be clearer?
def flow(self, dictionary, shuffle=True, seed=None):
return DictionaryIterator(dictionary, self, shuffle=shuffle, seed=seed)
def flow(self, data, classes, image_shape):
We should drop the input_shape parameter. shape should be inferred for situations where images have variable dimensions.
How will we handle differently sized images, w.r.t. the batch size? Say we have image A of size (500, 500, 3) and image B of size (1000, 1000, 3). Do we make a batch of size (2, 1000, 1000, 3)? Do we set the "unoccupied" data to 0 for image A?
@hgaiser When shape inference works it’d be (None, None, 3). However, until it’s implemented, the model expects fixed dimensions, so you’d need to crop or resize.
@hgaiser @mihaimorariu Maybe I misinterpreted the image_size parameter? Should it be target_size?
image_shape is the shape of the original image. target_shape is the shape of its rescaled version. So, if I understand correctly, shall we assume that image_shape is the same for all images, for now?
@0x00b1 I get that, but generators work with numpy arrays, and numpy arrays don't work with shapes like (None, None, 3). How would a numpy array be created that contains any number of images of arbitrary shape?
If I remember correctly, image_size is the original image size and this generator would force all images to be of that size (some check for this still needs to be added, which would also give a clearer error message; the alternative is to reshape all images to this size). In an ideal situation, image_size is not forced to one constant size. However, given the problem mentioned above, the most viable solution is to either force the batch size to 1 or force all images to be the same size. A combination of the two is something I don't see a solution for. I'm very much open to suggestions on this part 👍
path = self.data[image_index]["filename"]
image = skimage.io.imread(path, as_grey=(self.target_shape[2] == 1))
image, scale = scale_image(image)
image = self.image_data_generator.random_transform(image)
I would drop the random_transform and standardize calls for the moment.
Unfortunately, we can’t easily inherit from ImageDataGenerator since we’ll need to write augmentation and pre-processing functions that can manage images and bounding boxes together. We’ll also need to support masks in the near future.
Just a note, supporting masks is no issue using the seed parameter in random_transform. The easiest solution for boxes would be to make a fake image where the bounding box is a square segmentation, augment it using random_transform, and re-compute the bounding box. Not efficient, but easy.
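The box-to-mask-and-back part of that trick can be sketched in plain numpy. box_to_mask and mask_to_box are hypothetical helper names (not in this PR), and a horizontal flip stands in for random_transform:

```python
import numpy

# Hypothetical helper: draw a (y1, x1, y2, x2) box as a filled
# rectangle in a boolean mask.
def box_to_mask(box, shape):
    y1, x1, y2, x2 = box
    mask = numpy.zeros(shape, dtype=bool)
    mask[y1:y2, x1:x2] = True
    return mask

# Hypothetical helper: recover the tightest (y1, x1, y2, x2) box
# around the True region of a mask (assumes a non-empty mask).
def mask_to_box(mask):
    rows = numpy.any(mask, axis=1)
    cols = numpy.any(mask, axis=0)
    y1, y2 = numpy.where(rows)[0][[0, -1]]
    x1, x2 = numpy.where(cols)[0][[0, -1]]
    return (y1, x1, y2 + 1, x2 + 1)
```

A geometric transform applied to the mask (e.g. numpy.fliplr as a stand-in) then yields the transformed box via mask_to_box.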
@hgaiser 👍
boxes = numpy.asarray([[d[k] for k in ['y1', 'x1', 'y2', 'x2']] for d in ds])
gt_data = [b["y1"], b["x1"], b["y2"], b["x2"], self.classes[b["class"]]]
I believe this should be ['y1', 'x1', 'y2', 'x2'] (i.e. without the class label).
We should be yielding (my current understanding):

```python
# image: (batches, width, height, channels)
# metadata: (batches, 3) where `3` corresponds to the values of `width`, `height`, and `scale`
# bounding_boxes: (boxes, 4) where `4` corresponds to `y1`, `x1`, `y2`, and `x2`
# labels: (boxes, classes)
yield (image, metadata, bounding_boxes, labels), (bounding_boxes, labels)
```
(@jhung0 Is my understanding correct?)
yes, boxes should just have the 4 coordinates. labels has the class info
Ah, so we're deviating from py-faster-rcnn here? There, gt_boxes is a matrix containing the boxes and labels as a (?, 5) matrix.
Yes, the labels here are in one-hot form.
It makes sense to split them; it's clearer. Why one-hot form, if I may ask?
@hgaiser It’s the standard for Keras. It does make some stuff easier (a notable exception is ImageNet’s 21,841 classes 😧).
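For illustration, the one-hot form discussed here (what keras.utils.to_categorical produces) can be sketched in plain numpy. one_hot is a hypothetical helper, not part of this PR; it assumes integer class indices, with 0 reserved for background as noted elsewhere in this thread:

```python
import numpy

# Hypothetical helper: one-hot encode integer class indices into a
# (boxes, classes) matrix with a single 1 per row.
def one_hot(labels, num_classes):
    encoded = numpy.zeros((len(labels), num_classes))
    encoded[numpy.arange(len(labels)), labels] = 1
    return encoded
```

The ImageNet caveat above is about exactly this matrix: with 21,841 classes each row is 21,841 floats wide, which is why sparse integer labels are sometimes preferred.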
Alright then :) Figured that with the sparse versions of the loss functions, for example sparse_categorical_crossentropy, it didn't matter much.
def __next__(self, *args, **kwargs):
return self.next(*args, **kwargs)
class DictionaryIterator(keras.preprocessing.image.Iterator):
Nice. 👍 Inheriting from keras.preprocessing.image.Iterator will be easier to maintain than continuing to rely on our own implementation.
Force-pushed from 32ea32a to 9cb32f6
pathname = self.dictionary[index]["filename"]
# Labels has num_classes + 1 elements, since 0 is reserved for background.
num_classes = len(self.classes.keys())
could just do len(self.classes)
return [image_batch, gt_boxes_batch, metadata], None
return [images, boxes, labels, self.scale], None
Why have a None?
Generators must return X, Y, but Y doesn't make sense in our case because the network generates Y. Therefore, Y can be ignored.
Also, shouldn't metadata have image width/height information too?
Do we need to include it, since that information can be obtained from images?
Force-pushed from 23591e8 to 09a7c15
@@ -37,9 +37,14 @@ def __init__(self, dictionary, classes, generator, batch_size=1, shuffle=False,
# Compute and store the target image shape.
cols, rows, channels = dictionary[0]["shape"]
self.image_shape = (rows, cols, channels)

self.target_shape, self.scale = scale_size(self.image_shape[0:2])
How would the min/max size for scale_size be set?
Force-pushed from 09a7c15 to c33bf09
This PR refactors the data generator code. The data generator now outputs ground-truth boxes with classes, along with metadata (the shape and scale of the rescaled image).