
Added opencv vector<Mat> to memory data layer with tests #1416

Closed
wants to merge 5 commits

Conversation

mtamburrano
Contributor

  • AddMatVector to load OpenCV cv::Mat images into MemoryDataLayer (see the usage sketch below)
  • MemoryDataLayer accepts blobs with a dynamic batch size, which is especially useful when you need to predict a number of images that is not known a priori
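
A minimal usage sketch, for context (this assumes a deploy net whose first layer is the MemoryDataLayer, and uses the old Net::ForwardPrefilled entry point; treat it as illustrative, not as the PR's exact API surface):

#include <opencv2/core/core.hpp>
#include <vector>

#include "caffe/caffe.hpp"

using caffe::Net;
using caffe::MemoryDataLayer;

void Predict(Net<float>* net, const std::vector<cv::Mat>& images) {
  // Assumes the first layer of the net is the MemoryDataLayer.
  boost::shared_ptr<MemoryDataLayer<float> > input =
      boost::dynamic_pointer_cast<MemoryDataLayer<float> >(net->layers()[0]);
  CHECK(input) << "First layer is not a MemoryDataLayer";

  // Dummy labels: unused at prediction time, but required by the API.
  std::vector<int> labels(images.size(), 0);
  input->AddMatVector(images, labels);

  float loss = 0;
  net->ForwardPrefilled(&loss);  // consumes one batch of the added mats
}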

@beniz

beniz commented Nov 9, 2014

Thank you, very useful and works fine.

@mtamburrano
Contributor Author

@pleasurelong dataset.hpp is in caffe/include/caffe/dataset.hpp. I don't know why you don't have it; are you sure you haven't accidentally deleted it?

@ghost

ghost commented Nov 19, 2014

@mtamburrano This is OK, I do not need this file now. Thanks for your reply ^ ^

@bhack
Contributor

bhack commented Nov 25, 2014

@sguada this will probably need to be changed if "the mergers" pass your #1328. What do you think? Could we unify these two?

@beniz

beniz commented Nov 27, 2014

@bhack my 2 cents here: this short PR is very simple and works well against the current dev tree. It seems to be of interest to several users. To me it'd make sense not to delay it, and to update it in other PRs as needed.

@bhack
Contributor

bhack commented Nov 28, 2014

@sguada Can you take a pass through here?

@sguada
Contributor

sguada commented Nov 28, 2014

There may have been a mistaken approach with AddDatumVector, which has now carried over to AddMatVector, so let's fix both.

The idea is that the MemoryData layer can hold more data than batch_size needs, and that each forward pass consumes part of the queued data.

So let's rename added_data_ and added_label_ to data_queue_ and labels_queue_ to make the usage clearer.

If you want to add a dynamic batch_size, we could add a method for that, which would require resizing the tops.

@mtamburrano
Contributor Author

So, what about when added_data_ (or data_queue_) has size < batch_size? Should the forward pass do nothing until the queue is large enough, or should that be considered an error?

@sguada
Contributor

sguada commented Nov 28, 2014

I think that should be an error; the size of data_queue_ should be a multiple of batch_size. So if one wants to add less data, one should first change batch_size.

@mtamburrano
Contributor Author

Ok then, just to recap:

  • rename added_data_ and added_label_ to data_queue_ and labels_queue_
  • the forward pass should take batch_size items at a time, until data_queue_ is empty
  • introduce a new method to automatically resize batch_size to the size of data_queue_ and reshape the tops accordingly. In that case, could we allow the initial batch_size to be larger than the data_queue_ size?

Doubt:

If we allow data_queue_ to be a multiple of batch_size, the user should call forward until the queue is empty. So should we also create a method to check whether data_queue_ is empty?

@sguada
Contributor

sguada commented Nov 28, 2014

The forward pass already takes batch_size elements and moves the current pos to the appropriate position. Currently, when it gets to the end, it loops back to the beginning.

Replies:

  • Yes, rename the vars.
  • No, the forward doesn't need to know about data_queue_; it just has to move the current pos within data_, which points to data_queue_ if that was the case.
  • No, the batch_size should not change depending on the size of data_queue_; it is defined as a param of the layer. What I said is that we could allow changing it after construction by exposing another method, i.e. ChangeBatchSize(int batch_size), which would cause a resize of the tops, whose size always has to equal batch_size.

Since the user passes the data, they should know how many forward passes they have to do, but yes, you could also add a bool var processed_data that becomes true when pos reaches the end n_, and expose it with a public method.
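
For reference, here is a rough sketch of the cursor mechanics described above (member names mirror the existing layer, but this is illustrative, not the PR's code):

template <typename Dtype>
void MemoryDataLayer<Dtype>::Forward_cpu(const vector<Blob<Dtype>*>& bottom,
    const vector<Blob<Dtype>*>& top) {
  CHECK(data_) << "MemoryDataLayer needs to be initialized by calling Reset";
  // Point the tops at the current window of batch_size_ elements.
  top[0]->set_cpu_data(data_ + pos_ * size_);  // size_ = channels*height*width
  top[1]->set_cpu_data(labels_ + pos_);
  // Advance the cursor; wrap back to the beginning after n_ elements.
  pos_ = (pos_ + batch_size_) % n_;
}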

@mtamburrano
Contributor Author

Yes, I meant what you wrote; I probably explained it badly.
My doubt is: has this mechanism ever worked? I mean, adding more datums than batch_size and consuming them with multiple forward passes?
I ask this because I see that the queue is implemented, but the check at https://github.com/BVLC/caffe/blob/dev/src/caffe/layers/memory_data_layer.cpp#L37 doesn't allow adding more datums than batch_size...
So I think that the check should just verify that datum.size() is a multiple of batch_size, like https://github.com/BVLC/caffe/blob/dev/src/caffe/layers/memory_data_layer.cpp#L57 does.

Additionally, shouldn't line 49, Reset(top_data, top_label, batch_size_);, be Reset(top_data, top_label, num);?
Also, I don't understand the usefulness of n_: it is only used in Reset to check whether batch_size is a multiple of n_, but n_ == batch_size because of the issue reported above, and even after fixing the issue as I proposed, the check in Reset would be redundant because it would already be performed in AddDatumVector.

@sguada
Contributor

sguada commented Dec 3, 2014

@mtamburrano MemoryDataLayer was originally developed by @longjon to be able to pass data directly in memory, by calling Reset with pointers to the data and labels, with sizes that could be multiples of batch_size. However, later, I think @kloudkl added the option to also pass Datum and vectors of Datum, adding the extra constraint that the size of the vector should be the size of the batch.

When data is added through Reset, this layer works nicely with multiples of batch_size. So let's remove that constraint from the Add*Vector methods for both Datum and cv::Mat, and allow the layer to add more than one batch at a time.
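
A sketch of the relaxed check being proposed (illustrative; identifiers follow the existing layer): accept any positive multiple of batch_size_ instead of exactly one batch.

template <typename Dtype>
void MemoryDataLayer<Dtype>::AddMatVector(const vector<cv::Mat>& mat_vector,
    const vector<int>& labels) {
  size_t num = mat_vector.size();
  CHECK_GT(num, 0) << "There is no mat to add";
  CHECK_EQ(num % batch_size_, 0) <<
      "The added data must be a multiple of the batch size";
  CHECK_EQ(num, labels.size()) << "The number of labels must match the mats";
  // ... transform the mats into added_data_ / added_label_ and point the
  // layer at the newly added memory ...
}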

@@ -175,6 +175,26 @@ void DataTransformer<Dtype>::Transform(const vector<Datum> & datum_vector,
}
}

template<typename Dtype>
Contributor

This code should be surrounded by #ifndef OSX
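
That is, the new OpenCV-dependent overload would be wrapped like the rest of the cv::Mat code paths (sketch; the exact signature is assumed from the PR description):

#ifndef OSX
template<typename Dtype>
void DataTransformer<Dtype>::Transform(const vector<cv::Mat> & mat_vector,
    Blob<Dtype>* transformed_blob) {
  // ... OpenCV-dependent transform ...
}
#endif  // OSX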

@mtamburrano
Contributor Author

Restructured MemoryDataLayer following @sguada's directions:

  • added_data and added_labels can now store a multiple of batch_size elements, and are correctly reshaped.
  • the forward pass consumes batch_size elements at a time.
  • new data can't be added until the older data has been completely consumed.
  • once all the data has been consumed, the batch_size can be changed with the ChangeBatchSize method (see the sketch below).
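
Put together, the intended flow looks roughly like this (net and input as in the earlier sketch; load_images is a hypothetical helper):

// Add two batches' worth of images, then consume them in two passes.
std::vector<cv::Mat> images = load_images(16);  // hypothetical helper
std::vector<int> labels(images.size(), 0);

input->ChangeBatchSize(8);            // allowed: previous data was consumed
input->AddMatVector(images, labels);  // 16 is a multiple of batch_size

float loss = 0;
for (int i = 0; i < 2; ++i) {
  net->ForwardPrefilled(&loss);       // each pass consumes 8 elements
}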

@bhack
Contributor

bhack commented Dec 29, 2014

Can this be merged now?

@StevenLobo2

What could be an alternative for OS X?

@@ -270,10 +270,13 @@ class MemoryDataLayer : public BaseDataLayer<Dtype> {
virtual inline int ExactNumTopBlobs() const { return 2; }

Contributor

Define a Reshape method to do the reshaping of the tops, and use it in DataLayerSetUp. Remove the reshaping from Forward, since the Reshape method is called by the net before Forward.

Contributor Author

I think this is not needed anymore because of #1313, right?

@longjon
Copy link
Contributor

longjon commented Jan 9, 2015

I haven't read this carefully, but since I've been asked about Reshape:

  • In my conception the Reshape method is for computing top shapes as a function of bottom shapes. So, a data layer has no bottoms, and needs no Reshape method. This is purely an organizational distinction.
  • Nevertheless, all layers are allowed to reshape their top blobs in their forward passes. So a data layer that produces top blobs of various shapes can reshape in Forward.
  • Reshaping is basically free; there's no reason ever to have a bool like needs_reshape_.
  • I can't tell whether ChangeBatchSize is really needed here, but if it is it should probably be called set_batch_size, since it's basically a setter.
  • I don't see any way that this should interact with #1313 (Reshape single input batches for inputs of varying dimension), since this derives from BaseDataLayer and #1313 only affects DataLayer.

@bhack
Contributor

bhack commented Jan 10, 2015

@longjon Thank you for the feedback. We weren't claiming an interaction with #1313 at the code level, only trying to maintain "semantic constancy" inside the caffe code, given divergent comments between core members in two different PRs. We are waiting for a final reply from @sguada so that we can allocate working time to the last actions needed to get this merged.

"Can't change batch_size before all data haven't been consumed"
<< " by the upper layers";
batch_size_ = new_size;
added_data_.Reshape(batch_size_, channels_, height_, width_);
Contributor

ChangeBatchSize should be just a setter; it doesn't need to reshape anything else. That will happen in the corresponding methods.

@sguada
Contributor

sguada commented Jan 11, 2015

To align with #1313, just follow the directions given by @longjon:

  • Reshape add_data and add_label to match the added vector
  • Reshape the top in the Forward to match the batch_size
  • Change the name to set_batch_size (a sketch of the resulting setter is below)
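
Following these directions, the setter reduces to something like this (sketch; the consumed-data flag name is an assumption):

template <typename Dtype>
void MemoryDataLayer<Dtype>::set_batch_size(int new_size) {
  CHECK(!has_new_data_) <<
      "Can't change batch_size until current data has been consumed";
  batch_size_ = new_size;
  // The tops are reshaped to batch_size_ in Forward, not here.
}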

@bhack
Contributor

bhack commented Jan 12, 2015

@Nerei do you have any hints on how we could remove the ifdefs for OSX introduced by the inclusion of opencv in caffe? This would also answer @StevenLobo2's question.

@mtamburrano
Contributor Author

Edits done, I hope this time everything is fine ;)

@bhack
Contributor

bhack commented Jan 15, 2015

@deepcnn I don't see where you are feeding the memory data layer. Please ask usage questions on the caffe-users mailing list. Thanks!

@bhack
Contributor

bhack commented Jan 16, 2015

@shelhamer @mlapin Can we remove the OSX opencv ifdefs from here now that #1236 is merged?

@shelhamer
Member

I think so, as long as you follow #1236 in removing OpenCV includes from headers.

@bhack
Contributor

bhack commented Jan 17, 2015

@deepcnn Please use your thread on the mailing list; we are not talking about your issue here. Currently, this PR cannot work on OS X because of the OSX ifdef wrapping.

@bhack
Contributor

bhack commented Jan 17, 2015

@shelhamer I think this is ready; it has been reviewed multiple times by @sguada and partially by @longjon. We don't have OS X build infrastructure in our team or on Travis. Can we merge this? Would some core developers with OS X like to test the removal of the ifdefs?

@bhack
Contributor

bhack commented Jan 27, 2015

@shelhamer I think that you have OS X. Can you brew this iced caffe, or can we merge this PR?

@shelhamer
Member

@bhack I'll take a look after the ICML deadline 02/06 -- in other news, CUDA 7 is at last compatible with libc++ so the OS X installation for 10.9 + 10.10 will be so much simpler. I'll check how this works with pycaffe at the same time so we can see about having an interactive solving example.

@bhack
Contributor

bhack commented Feb 1, 2015

@shelhamer We could allocate some working time for the week after 02/06 (we have been waiting for this since the CVPR deadline, and we also have other PRs in the review queue that are almost stalled). I really understand research deadlines, but it is becoming really hard for us to reserve working hours without minimal coordination on a BVLC review plan (monthly, bimonthly, or whatever you want).

@lou-k

lou-k commented Feb 5, 2015

It appears that the line:

void DataTransformer<Dtype>::Transform(const cv::Mat& cv_img

is no longer surrounded by an #ifndef OSX in the dev branch. Does that mean this change will work on OS X now?

@mtamburrano can you rebase this regardless? The PR currently breaks the build with:

src/caffe/data_transformer.cpp:195:2: error: #endif without #if
#endif
 ^
src/caffe/data_transformer.cpp:197:2: error: unterminated conditional directive
#ifndef OSX

shelhamer added a commit that referenced this pull request Feb 7, 2015
- keep current `DataTransformer` check so that datums can be transformed
  into a blob incrementally
- standardize check messages
- include opencv where needed, drop unneeded OS X guards

TODO these tests need to be de-duped
shelhamer added a commit that referenced this pull request Feb 7, 2015
Feed cv::Mats to MemoryDataLayer and set batch size on-the-fly.
@shelhamer
Member

I merged this with a little grooming of my own in 02d9170. I hope this is useful, as it has been requested, but I'm not entirely excited myself about the creeping dependence on OpenCV. While it is helpful in some environments, it is a heavy dependency in others, so I plan to have it split off according to #1738 eventually.

@shelhamer shelhamer closed this Feb 7, 2015
slayton58 pushed a commit to slayton58/caffe that referenced this pull request Mar 4, 2015
- keep current `DataTransformer` check so that datums can be transformed
  into a blob incrementally
- standardize check messages
- include opencv where needed, drop unneeded OS X guards

TODO these tests need to be de-duped
@acpn mentioned this pull request Jun 21, 2015