
Blobs are N-D arrays (for N not necessarily equals 4) #1486

Closed
jeffdonahue wants to merge 65 commits from the tensor-blob branch

Conversation

jeffdonahue
Contributor

This PR gives Blobs a vector<int> of dimensions, rather than the old num, channels, height, width. The first commit is all the changes needed to get Caffe to compile and everything to run as before with the new vector of tensor dimensions. The remaining commits generalize some existing classes to use the new tensor dimensions (but they are not necessary to make it run, as it's still fine to just use all 4-D blobs with extra singleton dimensions where needed).

Currently I think the only problem is that in the InnerProductLayer the weight blobs won't be compatible with existing saved nets (since the weights are now 2D tensors and biases are 1D tensors), so I need to add something to handle that case for backwards compatibility.
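For readers skimming the diff, the heart of the change is the shape API below. This is a minimal sketch of the idea rather than the PR's exact code (everything beyond Reshape/shape/count/num_axes is illustrative):

#include <vector>

// Sketch: a Blob holds a vector of dimensions instead of fixed
// num/channels/height/width members.
class Blob {
 public:
  // Reshape to an arbitrary number of axes; count() becomes the product
  // of all dimensions.
  void Reshape(const std::vector<int>& shape) {
    shape_ = shape;
    count_ = 1;
    for (size_t i = 0; i < shape_.size(); ++i) { count_ *= shape_[i]; }
    // ... (re)allocate count_ elements here as needed ...
  }
  int num_axes() const { return shape_.size(); }
  int count() const { return count_; }
  int shape(int axis) const { return shape_[axis]; }

 private:
  std::vector<int> shape_;
  int count_;
};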

@bhack
Contributor

bhack commented Nov 26, 2014

Has anyone analyzed how this will cross-impact all the other PRs (some of which are aging in the queue)?

@jeffdonahue
Contributor Author

It should have little to no technical impact; everything works fine with just my first commit (and no changes whatsoever to existing layers). It's possible that, as a result of this PR, we might ask for some minimal changes to other PRs so they don't assume 4D dimensions before we merge them into the official repo, but they will work fine without these changes.

@longjon
Contributor

longjon commented Nov 26, 2014

I took a really rushed pass over the first commit; it looks pretty good. I'm glad we can do this so quickly and fairly noninvasively!

@jeffdonahue
Contributor Author

Thanks for the feedback @longjon!

@sirotenko

Update: this is already done.
How about keeping width, height, num, and channels as shortcuts for the first 4 dimensions?
When I implemented similar functionality several years ago in the cudacnn lib, I found it convenient to use width and height instead of shape[0] and shape[1].

@shelhamer
Member

@sirotenko yeah, that is why @jeffdonahue kept those properties for shorthand: jeffdonahue@3715eab#diff-5c854864685133b02ed80f33ba8ad535R73
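Extending the Blob sketch above, the shorthand amounts to thin wrappers over the new shape vector. A sketch of the idea (the real code in the linked commit has more checks):

// Axes beyond num_axes() read as singletons, so e.g. a 2-D weight blob
// still reports height() == width() == 1.
int LegacyShape(int index) const {
  if (index >= num_axes()) return 1;
  return shape(index);
}
int num() const { return LegacyShape(0); }
int channels() const { return LegacyShape(1); }
int height() const { return LegacyShape(2); }
int width() const { return LegacyShape(3); }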

@shelhamer
Member

Hey Jeff, this looks sweet! I'm pulling this for review on my flight along with JL's latest PRs so we can warm up the merge machine and come out of the deadline quiet.

On Mon, Dec 1, 2014 at 11:30, souzou notifications@github.com wrote:

Hello,
I made all these changes; it compiles, but there were many errors, which I tried to correct:

  1. In solver.hpp, we should add this line to the protected section:
    int current_step_;
  2. In io.cpp, the definition of the function CVMatToDatum should come before its first use (a lighter alternative is sketched just after this list).
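A lighter alternative to moving the definition is a forward declaration near the top of io.cpp (a sketch; the signature is taken from the paste below, and in the actual tree such a declaration would belong in caffe/util/io.hpp):

#include <opencv2/core/core.hpp>

#include "caffe/proto/caffe.pb.h"

namespace caffe {

// Declaring CVMatToDatum up front lets earlier functions call it while
// its definition stays later in the file.
void CVMatToDatum(const cv::Mat& cv_img, Datum* datum);

}  // namespace caffe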

#include <fcntl.h>
#include <google/protobuf/io/coded_stream.h>
#include <google/protobuf/io/zero_copy_stream_impl.h>
#include <google/protobuf/text_format.h>
#include <hdf5.h>
#include <hdf5_hl.h>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/highgui/highgui_c.h>
#include <opencv2/imgproc/imgproc.hpp>
#include <stdint.h>
#include <fstream>  // NOLINT(readability/streams)
#include <string>
#include <vector>
#include "caffe/common.hpp"
#include "caffe/proto/caffe.pb.h"
#include "caffe/util/io.hpp"
namespace caffe {
using google::protobuf::io::FileInputStream;
using google::protobuf::io::FileOutputStream;
using google::protobuf::io::ZeroCopyInputStream;
using google::protobuf::io::CodedInputStream;
using google::protobuf::io::ZeroCopyOutputStream;
using google::protobuf::io::CodedOutputStream;
using google::protobuf::Message;
bool ReadProtoFromTextFile(const char* filename, Message* proto) {
  int fd = open(filename, O_RDONLY);
  CHECK_NE(fd, -1) << "File not found: " << filename;
  FileInputStream* input = new FileInputStream(fd);
  bool success = google::protobuf::TextFormat::Parse(input, proto);
  delete input;
  close(fd);
  return success;
}

void WriteProtoToTextFile(const Message& proto, const char* filename) {
  int fd = open(filename, O_WRONLY | O_CREAT | O_TRUNC, 0644);
  FileOutputStream* output = new FileOutputStream(fd);
  CHECK(google::protobuf::TextFormat::Print(proto, output));
  delete output;
  close(fd);
}

bool ReadProtoFromBinaryFile(const char* filename, Message* proto) {
  int fd = open(filename, O_RDONLY);
  CHECK_NE(fd, -1) << "File not found: " << filename;
  ZeroCopyInputStream* raw_input = new FileInputStream(fd);
  CodedInputStream* coded_input = new CodedInputStream(raw_input);
  // Allow messages up to 1 GB, warning at 512 MB.
  coded_input->SetTotalBytesLimit(1073741824, 536870912);
  bool success = proto->ParseFromCodedStream(coded_input);
  delete coded_input;
  delete raw_input;
  close(fd);
  return success;
}

void CVMatToDatum(const cv::Mat& cv_img, Datum* datum) {
  CHECK(cv_img.depth() == CV_8U) << "Image data type must be unsigned byte";
  datum->set_channels(cv_img.channels());
  datum->set_height(cv_img.rows);
  datum->set_width(cv_img.cols);
  datum->clear_data();
  datum->clear_float_data();
  datum->set_encoded(false);
  int datum_channels = datum->channels();
  int datum_height = datum->height();
  int datum_width = datum->width();
  int datum_size = datum_channels * datum_height * datum_width;
  std::string buffer(datum_size, ' ');
  for (int h = 0; h < datum_height; ++h) {
    const uchar* ptr = cv_img.ptr<uchar>(h);
    int img_index = 0;
    for (int w = 0; w < datum_width; ++w) {
      for (int c = 0; c < datum_channels; ++c) {
        // Repack interleaved HWC pixels into planar CHW order.
        int datum_index = (c * datum_height + h) * datum_width + w;
        buffer[datum_index] = static_cast<char>(ptr[img_index++]);
      }
    }
  }
  datum->set_data(buffer);
}

void WriteProtoToBinaryFile(const Message& proto, const char* filename) {
  fstream output(filename, ios::out | ios::trunc | ios::binary);
  CHECK(proto.SerializeToOstream(&output));
}

cv::Mat ReadImageToCVMat(const string& filename,
    const int height, const int width, const bool is_color) {
  cv::Mat cv_img;
  int cv_read_flag = (is_color ? CV_LOAD_IMAGE_COLOR :
                      CV_LOAD_IMAGE_GRAYSCALE);
  cv::Mat cv_img_origin = cv::imread(filename, cv_read_flag);
  if (!cv_img_origin.data) {
    LOG(ERROR) << "Could not open or find file " << filename;
    return cv_img_origin;
  }
  if (height > 0 && width > 0) {
    cv::resize(cv_img_origin, cv_img, cv::Size(width, height));
  } else {
    cv_img = cv_img_origin;
  }
  return cv_img;
}

bool ReadImageToDatum(const string& filename, const int label,
    const int height, const int width, const bool is_color, Datum* datum) {
  cv::Mat cv_img = ReadImageToCVMat(filename, height, width, is_color);
  if (cv_img.data) {
    CVMatToDatum(cv_img, datum);
    datum->set_label(label);
    return true;
  } else {
    return false;
  }
}

bool ReadFileToDatum(const string& filename, const int label,
    Datum* datum) {
  std::streampos size;
  fstream file(filename.c_str(), ios::in|ios::binary|ios::ate);
  if (file.is_open()) {
    size = file.tellg();
    std::string buffer(size, ' ');
    file.seekg(0, ios::beg);
    file.read(&buffer[0], size);
    file.close();
    datum->set_data(buffer);
    datum->set_label(label);
    datum->set_encoded(true);
    return true;
  } else {
    return false;
  }
}
cv::Mat DecodeDatumToCVMat(const Datum& datum,
    const int height, const int width, const bool is_color) {
  cv::Mat cv_img;
  CHECK(datum.encoded()) << "Datum not encoded";
  int cv_read_flag = (is_color ? CV_LOAD_IMAGE_COLOR :
                      CV_LOAD_IMAGE_GRAYSCALE);
  const string& data = datum.data();
  std::vector<char> vec_data(data.c_str(), data.c_str() + data.size());
  if (height > 0 && width > 0) {
    cv::Mat cv_img_origin = cv::imdecode(cv::Mat(vec_data), cv_read_flag);
    cv::resize(cv_img_origin, cv_img, cv::Size(width, height));
  } else {
    cv_img = cv::imdecode(vec_data, cv_read_flag);
  }
  if (!cv_img.data) {
    LOG(ERROR) << "Could not decode datum ";
  }
  return cv_img;
}

// If the Datum is encoded, decode it using DecodeDatumToCVMat and
// CVMatToDatum; if height and width are set, resize it.
// If the Datum is not encoded, do nothing.
bool DecodeDatum(const int height, const int width, const bool is_color,
    Datum* datum) {
  if (datum->encoded()) {
    cv::Mat cv_img = DecodeDatumToCVMat((*datum), height, width, is_color);
    CVMatToDatum(cv_img, datum);
    return true;
  } else {
    return false;
  }
}

// Verifies format of data stored in HDF5 file and reshapes blob accordingly.
template <typename Dtype>
void hdf5_load_nd_dataset_helper(
    hid_t file_id, const char* dataset_name_, int min_dim, int max_dim,
    Blob<Dtype>* blob) {
  // Verify that the dataset exists.
  CHECK(H5LTfind_dataset(file_id, dataset_name_))
      << "Failed to find HDF5 dataset " << dataset_name_;
  // Verify that the number of dimensions is in the accepted range.
  herr_t status;
  int ndims;
  status = H5LTget_dataset_ndims(file_id, dataset_name_, &ndims);
  CHECK_GE(status, 0) << "Failed to get dataset ndims for " << dataset_name_;
  CHECK_GE(ndims, min_dim);
  CHECK_LE(ndims, max_dim);
  // Verify that the data format is what we expect: float or double.
  std::vector<hsize_t> dims(ndims);
  H5T_class_t class_;
  status = H5LTget_dataset_info(
      file_id, dataset_name_, dims.data(), &class_, NULL);
  CHECK_GE(status, 0) << "Failed to get dataset info for " << dataset_name_;
  CHECK_EQ(class_, H5T_FLOAT) << "Expected float or double data";
  vector<int> blob_dims(dims.size());
  for (int i = 0; i < dims.size(); ++i) {
    blob_dims[i] = dims[i];
  }
  blob->Reshape(blob_dims);
}
template <>
void hdf5_load_nd_dataset<float>(hid_t file_id, const char* dataset_name_,
    int min_dim, int max_dim, Blob<float>* blob) {
  hdf5_load_nd_dataset_helper(file_id, dataset_name_, min_dim, max_dim, blob);
  herr_t status = H5LTread_dataset_float(
      file_id, dataset_name_, blob->mutable_cpu_data());
  CHECK_GE(status, 0) << "Failed to read float dataset " << dataset_name_;
}

template <>
void hdf5_load_nd_dataset<double>(hid_t file_id, const char* dataset_name_,
    int min_dim, int max_dim, Blob<double>* blob) {
  hdf5_load_nd_dataset_helper(file_id, dataset_name_, min_dim, max_dim, blob);
  herr_t status = H5LTread_dataset_double(
      file_id, dataset_name_, blob->mutable_cpu_data());
  CHECK_GE(status, 0) << "Failed to read double dataset " << dataset_name_;
}
template <>
void hdf5_save_nd_dataset<float>(
    const hid_t file_id, const string dataset_name, const Blob<float>& blob) {
  hsize_t dims[HDF5_NUM_DIMS];
  dims[0] = blob.num();
  dims[1] = blob.channels();
  dims[2] = blob.height();
  dims[3] = blob.width();
  herr_t status = H5LTmake_dataset_float(
      file_id, dataset_name.c_str(), HDF5_NUM_DIMS, dims, blob.cpu_data());
  CHECK_GE(status, 0) << "Failed to make float dataset " << dataset_name;
}

template <>
void hdf5_save_nd_dataset<double>(
    const hid_t file_id, const string dataset_name, const Blob<double>& blob) {
  hsize_t dims[HDF5_NUM_DIMS];
  dims[0] = blob.num();
  dims[1] = blob.channels();
  dims[2] = blob.height();
  dims[3] = blob.width();
  herr_t status = H5LTmake_dataset_double(
      file_id, dataset_name.c_str(), HDF5_NUM_DIMS, dims, blob.cpu_data());
  CHECK_GE(status, 0) << "Failed to make double dataset " << dataset_name;
}
} // namespace caffe


@souzou
Copy link

souzou commented Dec 3, 2014

Hello,
I downloaded the caffe-tensor-blob repository written by jeffdonahue (I tried to make these changes on the caffe-master repository myself, but got many compilation errors), so I compiled it with:

make all ---> success
make test ---> success
make runtest ---> failed

AdaGradSolverTest/2.TestAdaGradLeastSquaresUpdateWithWeightDecay
F1203 17:56:11.812132 3091 euclidean_loss_layer.cpp:14] Check failed: bottom[0]->shape() == bottom[1]->shape()
*** Check failure stack trace: ***
@ 0x2ba445ceddaa (unknown)
@ 0x2ba445cedce4 (unknown)
@ 0x2ba445ced6e6 (unknown)
@ 0x2ba445cf0687 (unknown)
@ 0x7b7284 caffe::EuclideanLossLayer<>::Reshape()
@ 0x7e6731 caffe::Net<>::Init()
@ 0x7e839e caffe::Net<>::Net()
@ 0x75c550 caffe::Solver<>::InitTrainNet()
@ 0x75d806 caffe::Solver<>::Init()
@ 0x75d966 caffe::Solver<>::Solver()
@ 0x571726 caffe::AdaGradSolverTest<>::InitSolver()
@ 0x571b6b caffe::GradientBasedSolverTest<>::InitSolverFromProtoString()
@ 0x566a49 caffe::GradientBasedSolverTest<>::RunLeastSquaresSolver()
@ 0x56bbd6 caffe::AdaGradSolverTest_TestAdaGradLeastSquaresUpdateWithWeightDecay_Test<>::TestBody()
@ 0x6fafd3 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x6f1cb7 testing::Test::Run()
@ 0x6f1d5e testing::TestInfo::Run()
@ 0x6f1e65 testing::TestCase::Run()
@ 0x6f51a8 testing::internal::UnitTestImpl::RunAllTests()
@ 0x6f5437 testing::UnitTest::Run()
@ 0x42186a main
@ 0x2ba448b0eec5 (unknown)
@ 0x42a4de (unknown)
@ (nil) (unknown)
Aborted
make: *** [runtest] Error 134

Can anyone explain this?

Thanks,
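For context on the failure: the check at euclidean_loss_layer.cpp:14 now compares the two bottom blobs' full shape vectors, so inputs whose element counts match but whose dimension vectors differ (e.g. 1 x 1 x 1 x D vs. D) no longer pass. A minimal sketch of that comparison, assuming shape() returns the blob's dimension vector:

#include <vector>

#include <glog/logging.h>

// Two N-D blobs match only if their shape vectors compare equal
// element-wise; equal counts alone are no longer sufficient.
void CheckShapesMatch(const std::vector<int>& a, const std::vector<int>& b) {
  CHECK(a == b) << "bottom[0] and bottom[1] must have the same shape";
}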

@souzou
Copy link

souzou commented Dec 3, 2014

Is there any documentation on how to use the tensor blob?
How do I create the data layer?
Is there any specification of the text file that contains the image paths and labels?

jeffdonahue force-pushed the tensor-blob branch 3 times, most recently from 73e15bc to 12e8c11 on December 3, 2014 at 23:47
@souzou
Copy link

souzou commented Dec 15, 2014

Caffe can read 3 formats (as far as I know):

  • leveldb
  • lmdb
  • hdf5

The easiest way is to store the images somewhere as JPGs (or maybe another image format) and then create two text files (trainingset.txt, testset.txt) in the following format:

PATH_TO_RGB_IMAGE PATH_TO_Gray_IMAGE LABEL
PATH_TO_RGB_IMAGE PATH_TO_Gray_IMAGE LABEL
PATH_TO_RGB_IMAGE PATH_TO_Gray_IMAGE LABEL
...

where LABEL is a number. For me, only a relative path from the solver file to the image file works in PATH_TO_IMAGE.

I tried to use the convert_imageset tool (in the tools directory) to create a leveldb or an lmdb, but these db formats don't support the N-D blobs that you have to declare in your net layout file. My question is: what changes should I make to create an HDF5 file, since only this format supports N-D blobs, and is there any documentation about HDF5?

Thanks,
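For reference, an N-D HDF5 input can be written with the same H5LT calls quoted earlier in this thread. A minimal sketch (the file name, dimensions, and dataset names here are assumptions for illustration; the dataset names must match the HDF5 data layer's top blob names, conventionally data and label):

#include <hdf5.h>
#include <hdf5_hl.h>

#include <vector>

// Write a 5-D float dataset plus 1-D labels to train.h5.
int main() {
  const hsize_t data_dims[5] = {10, 2, 3, 4, 5};  // e.g. N x T x C x H x W
  const hsize_t label_dims[1] = {10};
  std::vector<float> data(10 * 2 * 3 * 4 * 5, 0.0f);
  std::vector<float> label(10, 0.0f);
  hid_t file_id = H5Fcreate("train.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
  H5LTmake_dataset_float(file_id, "data", 5, data_dims, data.data());
  H5LTmake_dataset_float(file_id, "label", 1, label_dims, label.data());
  H5Fclose(file_id);
  return 0;
}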

num/channels/height/width indexing is valid.
from saved NetParameter

Want to keep the param Blob shape the layer has set, and not necessarily adopt the one from the saved net (e.g. we want to keep the new 1D bias shape, rather than take the (1 x 1 x 1 x D) shape from a legacy net).
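The rule implied by this commit note: a saved legacy shape is accepted when its element count matches the layer's new-style shape, but the blob keeps the shape the layer set. A sketch of that count-based compatibility check (illustrative only; the real logic lives in the Blob proto-loading path):

#include <vector>

// A legacy saved shape (e.g. 1 x 1 x 1 x D) is compatible with a new-style
// param shape (e.g. D) when the element counts agree.
bool LegacyShapeCompatible(const std::vector<int>& layer_shape,
                           const std::vector<int>& saved_shape) {
  long layer_count = 1, saved_count = 1;
  for (size_t i = 0; i < layer_shape.size(); ++i) layer_count *= layer_shape[i];
  for (size_t i = 0; i < saved_shape.size(); ++i) saved_count *= saved_shape[i];
  return layer_count == saved_count;
}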
@dzhwinter

@jeffdonahue @longjon how can convolution be done on an N-D array with more than 4 axes?
