Use 1-dim vector as input rather than images #446

Closed
GeoMetrix opened this issue May 24, 2014 · 8 comments

@GeoMetrix

Hi all. Is it possible to read a vector directly as input in Caffe? For example, feature extraction on raw data may produce a long one-dimensional feature vector. Most of the Caffe examples I have seen work on images, so their input data is an image/matrix.

I want to try 1-dim input data. Can anyone give some suggestions or examples? Thank you very much.

@shelhamer
Member

All blobs are 4D, but any number of those dimensions can be singletons. That is, an N x K x 1 x 1 blob is a collection of N K-length vectors. The Caffe library and wrappers let you form blobs in this way; the spatial dimensions are up to you! The probability output of our reference ImageNet model is in fact such a blob: for N inputs, the output is an N x 1000 x 1 x 1 blob of the 1000 class probabilities per input.

The HDF5 data layer was designed with vector processing as a main use case. Used this way, Caffe can serve as a fast GPU SGD solver for training logistic regression or SVMs on vectors, in addition to deep models.
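
For example, here is a rough sketch of preparing N K-length vectors for the HDF5 data layer (the sizes and file names are placeholders; the layer's source should be a text file that lists the .h5 paths, one per line):

import h5py
import numpy as np

N, K = 100, 15                                         # illustrative: 100 samples, 15 features
X = np.random.randn(N, K).astype(np.float32)           # an N x K blob, i.e. N x K x 1 x 1
y = np.random.randint(0, 10, size=(N, 1)).astype(np.float32)

with h5py.File('train.h5', 'w') as f:
    f.create_dataset('data', data=X)                   # the net's "data" top
    f.create_dataset('label', data=y)                  # the net's "label" top

# The layer's `source` parameter points to a list file, not to the .h5 itself.
with open('train_h5_list.txt', 'w') as f:
    f.write('train.h5\n')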

@GeoMetrix
Author

Thanks for your suggestion, @shelhamer. I tried the HDF5 data layer but failed because of errors in the network definition. I think it would be nice to write down the experiment steps so other people can also see how to use vector data with Caffe.

  1. My raw data is in LIBSVM format, so the first thing I need to do is convert it to HDF5. A sample line of the raw data is shown after the snippet below.

import h5py

# `data` and `label` are NumPy arrays: an N x K feature matrix and N x 1 labels.
with h5py.File('sample_data.h5', 'w') as f:
    f['data'] = data
    f['label'] = label

1912 1:-13.2268419266 2:-10.7498941422 3:-39.1328201294 4:-1.40995645523 5:7.65763092041 6:10.6338567734 7:1.62673580647 8:-26.3683280945 9:22.6957206726 10:-0.855510830879 11:-13.2951841354 12:-9.82209396362 13:-6.48870944977 14:18.402715683 15:0.833472371101
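
For reference, a rough sketch of how the `data` and `label` arrays above could be produced from this LIBSVM-style file, assuming scikit-learn is available (the input filename is a placeholder):

import numpy as np
from sklearn.datasets import load_svmlight_file

# Parse the LIBSVM-format file: X is a sparse N x K matrix, y an N-vector of labels.
X, y = load_svmlight_file('raw_features.libsvm')
data = X.toarray().astype(np.float32)        # dense N x K, single precision for Caffe
label = y.astype(np.float32).reshape(-1, 1)  # N x 1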

  2. I used the LeNet example as my test case and followed the HDF5 data layer definition. I thought this was correct for loading the dataset, but it produces some errors.

layers {
  layer {
    name: "hdf5data"
    type: "hdf5_data"
    source: "sample_data.h5"
    batchsize: 5
  }
  top: "hdf5data"
  top: "label"
}
layers {
  layer {
    name: "relu1"
    type: "relu"
    num_output: 20
    kernelsize: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
  bottom: "hdf5data"
  top: "relu1"
}
layers {
  layer {
    name: "pool1"
    type: "pool"
    kernelsize: 2
    stride: 2
    pool: MAX
  }
  bottom: "relu1"
  top: "pool1"
}
layers {
  layer {
    name: "relu2"
    type: "relu"
    num_output: 50
    kernelsize: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
  bottom: "pool1"
  top: "relu2"
}
layers {
  layer {
    name: "pool2"
    type: "pool"
    kernelsize: 2
    stride: 2
    pool: MAX
  }
  bottom: "relu2"
  top: "pool2"
}
layers {
  layer {
    name: "ip1"
    type: "innerproduct"
    num_output: 500
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
  bottom: "pool2"
  top: "ip1"
}
layers {
  layer {
    name: "relu1"
    type: "relu"
  }
  bottom: "ip1"
  top: "ip1"
}
layers {
  layer {
    name: "ip2"
    type: "innerproduct"
    num_output: 10
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
  bottom: "ip1"
  top: "ip2"
}
layers {
  layer {
    name: "loss"
    type: "softmax_loss"
  }
  bottom: "ip2"
  bottom: "label"
}

It seems that this setup cannot read the HDF5 data; it fails to open the file. There also seems to be an error in the network definition. How can I solve this problem? Thanks.

I0524 19:36:21.213978 23688 train_net.cpp:26] Starting Optimization
I0524 19:36:21.214136 23688 solver.cpp:26] Creating training net.
I0524 19:36:21.214189 23688 net.cpp:74] Creating Layer hdf5data
I0524 19:36:21.214196 23688 net.cpp:110] hdf5data -> hdf5data
I0524 19:36:21.214206 23688 net.cpp:110] hdf5data -> label
I0524 19:36:21.214217 23688 hdf5_data_layer.cpp:63] Loading filename from sample_data.h5
I0524 19:36:21.214267 23688 hdf5_data_layer.cpp:75] Number of files: 9
I0524 19:36:21.214274 23688 hdf5_data_layer.cpp:34] Loading HDF5 file‰HDF
HDF5-DIAG: Error detected in HDF5 (1.8.4-patch1) thread 139670937782592:
#000: ../../../src/H5F.c line 1514 in H5Fopen(): unable to open file
major: File accessability
minor: Unable to open file
#1: ../../../src/H5F.c line 1218 in H5F_open(): unable to open file
major: File accessability
minor: Unable to open file
#2: ../../../src/H5FD.c line 1079 in H5FD_open(): open failed
major: Virtual File Layer
minor: Unable to initialize object
#3: ../../../src/H5FDsec2.c line 365 in H5FD_sec2_open(): unable to open file
major: File accessability
minor: Unable to open file
#4: ../../../src/H5FDsec2.c line 365 in H5FD_sec2_open(): No such file or directory
major: Internal error (too specific to document in detail)
minor: System error message
E0524 19:36:21.214862 23688 hdf5_data_layer.cpp:37] Failed opening HDF5 file‰HDF
I0524 19:36:21.214874 23688 hdf5_data_layer.cpp:86] output data size: 5,0,0,0
I0524 19:36:21.214885 23688 net.cpp:125] Top shape: 5 0 0 0 (0)
I0524 19:36:21.214892 23688 net.cpp:125] Top shape: 5 0 0 0 (0)
I0524 19:36:21.214898 23688 net.cpp:156] hdf5data does not need backward computation.
I0524 19:36:21.214907 23688 net.cpp:74] Creating Layer relu1
I0524 19:36:21.214915 23688 net.cpp:84] relu1 <- hdf5data
I0524 19:36:21.214923 23688 net.cpp:110] relu1 -> relu1
I0524 19:36:21.214933 23688 net.cpp:125] Top shape: 5 0 0 0 (0)
I0524 19:36:21.214939 23688 net.cpp:156] relu1 does not need backward computation.
I0524 19:36:21.214947 23688 net.cpp:74] Creating Layer pool1
I0524 19:36:21.214953 23688 net.cpp:84] pool1 <- relu1
I0524 19:36:21.214959 23688 net.cpp:110] pool1 -> pool1
I0524 19:36:21.214967 23688 net.cpp:125] Top shape: 5 0 0 0 (0)
I0524 19:36:21.214974 23688 net.cpp:156] pool1 does not need backward computation.
I0524 19:36:21.214980 23688 net.cpp:74] Creating Layer relu2
I0524 19:36:21.214987 23688 net.cpp:84] relu2 <- pool1
I0524 19:36:21.214994 23688 net.cpp:110] relu2 -> relu2
I0524 19:36:21.215000 23688 net.cpp:125] Top shape: 5 0 0 0 (0)
I0524 19:36:21.215006 23688 net.cpp:156] relu2 does not need backward computation.
I0524 19:36:21.215013 23688 net.cpp:74] Creating Layer pool2
I0524 19:36:21.215018 23688 net.cpp:84] pool2 <- relu2
I0524 19:36:21.215024 23688 net.cpp:110] pool2 -> pool2
I0524 19:36:21.215031 23688 net.cpp:125] Top shape: 5 0 0 0 (0)
I0524 19:36:21.215037 23688 net.cpp:156] pool2 does not need backward computation.
I0524 19:36:21.215044 23688 net.cpp:74] Creating Layer ip1
I0524 19:36:21.215050 23688 net.cpp:84] ip1 <- pool2
I0524 19:36:21.215056 23688 net.cpp:110] ip1 -> ip1
F0524 19:36:21.215067 23688 filler.hpp:114] Check failed: blob->count()
*** Check failure stack trace: ***
@ 0x7f07a65e3b7d google::LogMessage::Fail()
@ 0x7f07a65e5c7f google::LogMessage::SendToLog()
@ 0x7f07a65e376c google::LogMessage::Flush()
@ 0x7f07a65e651d google::LogMessageFatal::~LogMessageFatal()
@ 0x462e44 caffe::XavierFiller<>::Fill()
@ 0x4710d7 caffe::InnerProductLayer<>::SetUp()
@ 0x42fc55 caffe::Net<>::Init()
@ 0x430fe8 caffe::Net<>::Net()
@ 0x421a8c caffe::Solver<>::Solver()
@ 0x40e7ef main
@ 0x7f07a431f76d (unknown)
@ 0x40fe6d (unknown)
Aborted (core dumped)

@GeoMetrix
Author

I also see that @sergeyk wrote an HDF5 example. Can you share some code or examples? Thank you very much.

@sguada
Contributor

sguada commented May 24, 2014

Make sure the source is pointing to the right path of the file. To be sure, you can use the absolute path to your HDF5 file.


Sergio

@shelhamer
Member

The network error is just a downstream problem from the HDF5 data not loading: note the blob shape of 5 x 0 x 0 x 0. Such a blob with zero dimensions holds no data, so the filler (a weight initializer) fails when it tries to do a zero-length fill.

Try an absolute path as Sergio suggested. @sergeyk, is there any trick to HDF5 data?
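
As a quick sanity check (a sketch; the path is a placeholder), you can confirm that the file itself opens and that the datasets have the expected shapes before pointing Caffe at it:

import h5py

# If this fails or prints empty shapes, the problem is in the data file, not the net.
with h5py.File('/absolute/path/to/sample_data.h5', 'r') as f:
    print(list(f.keys()))                      # expect ['data', 'label']
    print(f['data'].shape, f['label'].shape)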


@GeoMetrix
Author

Thanks, @sergeyk and @shelhamer. I replaced the HDF5 source with an absolute path, but the output is the same. The log still shows hdf5_data_layer.cpp:75] Number of files: 9, so it seems it can load the data from the HDF5 file.

@shelhamer, I generated the HDF5 data following the script src/caffe/test/test_data/generate_sample_data.py. I think that script works for the test case, and the data it generates should also work for a new dataset.

Are there any working HDF5 datasets and examples available?

@GeoMetrix
Author

My purpose is to use HDF5 data to represent 1-dim vectors rather than images or matrices. Then, as @shelhamer said, we can use Caffe as a fast SGD solver for logistic regression or SVMs.
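
Something like the following is what I have in mind: a rough sketch in the same old-style prototxt syntax as above, where the paths, batch size, and num_output are placeholders and the source points to a text file listing the .h5 files:

layers {
  layer {
    name: "data"
    type: "hdf5_data"
    source: "/absolute/path/to/train_h5_list.txt"
    batchsize: 5
  }
  top: "data"
  top: "label"
}
layers {
  layer {
    name: "fc1"
    type: "innerproduct"
    num_output: 10
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" }
  }
  bottom: "data"
  top: "fc1"
}
layers {
  layer {
    name: "loss"
    type: "softmax_loss"
  }
  bottom: "fc1"
  bottom: "label"
}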

@johnny5550822

I'm running into the same problem...
