make runtest error with HDF5 1.8.14 #2123

Closed
pblach opened this issue Mar 14, 2015 · 7 comments

Comments

@pblach

pblach commented Mar 14, 2015

A lot of the tests pass until this:
[----------] 1 test from HDF5DataLayerTest/0, where TypeParam = caffe::FloatCPU
[ RUN ] HDF5DataLayerTest/0.TestRead
HDF5-DIAG: Error detected in HDF5 (1.8.14) thread 0:
#000: H5Dio.c line 173 in H5Dread(): can't read data
major: Dataset
minor: Read failed

The complete output is below (and HDF5 config at the bottom):

root@overo:/home/gumstix/caffe/native_caffe/caffe# make runtest
.build_release/tools/caffe
caffe: command line brew
usage: caffe

commands:
train train or finetune a model
test score a model
device_query show GPU diagnostic information
time benchmark model execution time

Flags from tools/caffe.cpp:
-gpu (Run in GPU mode on given device ID.) type: int32 default: -1
-iterations (The number of iterations to run.) type: int32 default: 50
-model (The model definition protocol buffer text file..) type: string
default: ""
-snapshot (Optional; the snapshot solver state to resume training.)
type: string default: ""
-solver (The solver definition protocol buffer text file.) type: string
default: ""
-weights (Optional; the pretrained weights to initialize finetuning. Cannot
be set simultaneously with snapshot.) type: string default: ""
.build_release/test/test_all.testbin 0 --gtest_shuffle --gtest_filter="-*GPU*"
Note: Google Test filter = -*GPU*
Note: Randomizing tests' orders with a seed of 98246 .
[==========] Running 593 tests from 111 test cases.
[----------] Global test environment set-up.
[----------] 3 tests from MaxPoolingDropoutTest/1, where TypeParam = caffe::DoubleCPU
[ RUN ] MaxPoolingDropoutTest/1.TestForward
[ OK ] MaxPoolingDropoutTest/1.TestForward (3 ms)
[ RUN ] MaxPoolingDropoutTest/1.TestBackward
[ OK ] MaxPoolingDropoutTest/1.TestBackward (1 ms)
[ RUN ] MaxPoolingDropoutTest/1.TestSetup
[ OK ] MaxPoolingDropoutTest/1.TestSetup (0 ms)
[----------] 3 tests from MaxPoolingDropoutTest/1 (16 ms total)

[----------] 6 tests from NesterovSolverTest/1, where TypeParam = caffe::DoubleCPU
[ RUN ] NesterovSolverTest/1.TestNesterovLeastSquaresUpdateWithMomentum
[ OK ] NesterovSolverTest/1.TestNesterovLeastSquaresUpdateWithMomentum (355 ms)
[ RUN ] NesterovSolverTest/1.TestNesterovLeastSquaresUpdateWithEverything
[ OK ] NesterovSolverTest/1.TestNesterovLeastSquaresUpdateWithEverything (778 ms)
[ RUN ] NesterovSolverTest/1.TestNesterovLeastSquaresUpdateLROneTenth
[ OK ] NesterovSolverTest/1.TestNesterovLeastSquaresUpdateLROneTenth (148 ms)
[ RUN ] NesterovSolverTest/1.TestLeastSquaresUpdateWithMomentumMultiIter
[ OK ] NesterovSolverTest/1.TestLeastSquaresUpdateWithMomentumMultiIter (778 ms)
[ RUN ] NesterovSolverTest/1.TestNesterovLeastSquaresUpdateWithWeightDecay
[ OK ] NesterovSolverTest/1.TestNesterovLeastSquaresUpdateWithWeightDecay (149 ms)
[ RUN ] NesterovSolverTest/1.TestNesterovLeastSquaresUpdate
[ OK ] NesterovSolverTest/1.TestNesterovLeastSquaresUpdate (149 ms)
[----------] 6 tests from NesterovSolverTest/1 (2415 ms total)

[----------] 6 tests from MVNLayerTest/1, where TypeParam = caffe::DoubleCPU
[ RUN ] MVNLayerTest/1.TestGradientMeanOnly
[ OK ] MVNLayerTest/1.TestGradientMeanOnly (6477 ms)
[ RUN ] MVNLayerTest/1.TestGradient
[ OK ] MVNLayerTest/1.TestGradient (6476 ms)
[ RUN ] MVNLayerTest/1.TestForwardAcrossChannels
[ OK ] MVNLayerTest/1.TestForwardAcrossChannels (1 ms)
[ RUN ] MVNLayerTest/1.TestGradientAcrossChannels
[ OK ] MVNLayerTest/1.TestGradientAcrossChannels (6445 ms)
[ RUN ] MVNLayerTest/1.TestForwardMeanOnly
[ OK ] MVNLayerTest/1.TestForwardMeanOnly (1 ms)
[ RUN ] MVNLayerTest/1.TestForward
[ OK ] MVNLayerTest/1.TestForward (1 ms)
[----------] 6 tests from MVNLayerTest/1 (19446 ms total)

[----------] 6 tests from SGDSolverTest/0, where TypeParam = caffe::FloatCPU
[ RUN ] SGDSolverTest/0.TestLeastSquaresUpdate
[ OK ] SGDSolverTest/0.TestLeastSquaresUpdate (161 ms)
[ RUN ] SGDSolverTest/0.TestLeastSquaresUpdateWithWeightDecay
[ OK ] SGDSolverTest/0.TestLeastSquaresUpdateWithWeightDecay (143 ms)
[ RUN ] SGDSolverTest/0.TestLeastSquaresUpdateWithMomentum
[ OK ] SGDSolverTest/0.TestLeastSquaresUpdateWithMomentum (274 ms)
[ RUN ] SGDSolverTest/0.TestLeastSquaresUpdateWithMomentumMultiIter
[ OK ] SGDSolverTest/0.TestLeastSquaresUpdateWithMomentumMultiIter (704 ms)
[ RUN ] SGDSolverTest/0.TestLeastSquaresUpdateWithEverything
[ OK ] SGDSolverTest/0.TestLeastSquaresUpdateWithEverything (707 ms)
[ RUN ] SGDSolverTest/0.TestLeastSquaresUpdateLROneTenth
[ OK ] SGDSolverTest/0.TestLeastSquaresUpdateLROneTenth (141 ms)
[----------] 6 tests from SGDSolverTest/0 (2186 ms total)

[----------] 1 test from StochasticPoolingLayerTest/1, where TypeParam = double
[ RUN ] StochasticPoolingLayerTest/1.TestSetup
[ OK ] StochasticPoolingLayerTest/1.TestSetup (1 ms)
[----------] 1 test from StochasticPoolingLayerTest/1 (4 ms total)

[----------] 1 test from InfogainLossLayerTest/0, where TypeParam = caffe::FloatCPU
[ RUN ] InfogainLossLayerTest/0.TestGradient
[ OK ] InfogainLossLayerTest/0.TestGradient (30 ms)
[----------] 1 test from InfogainLossLayerTest/0 (40 ms total)

[----------] 6 tests from SliceLayerTest/1, where TypeParam = caffe::DoubleCPU
[ RUN ] SliceLayerTest/1.TestSliceAcrossChannels
[ OK ] SliceLayerTest/1.TestSliceAcrossChannels (4 ms)
[ RUN ] SliceLayerTest/1.TestGradientAcrossChannels
[ OK ] SliceLayerTest/1.TestGradientAcrossChannels (464 ms)
[ RUN ] SliceLayerTest/1.TestSetupChannels
[ OK ] SliceLayerTest/1.TestSetupChannels (2 ms)
[ RUN ] SliceLayerTest/1.TestSliceAcrossNum
[ OK ] SliceLayerTest/1.TestSliceAcrossNum (2 ms)
[ RUN ] SliceLayerTest/1.TestSetupNum
[ OK ] SliceLayerTest/1.TestSetupNum (2 ms)
[ RUN ] SliceLayerTest/1.TestGradientAcrossNum
[ OK ] SliceLayerTest/1.TestGradientAcrossNum (452 ms)
[----------] 6 tests from SliceLayerTest/1 (966 ms total)

[----------] 6 tests from AccuracyLayerTest/0, where TypeParam = float
[ RUN ] AccuracyLayerTest/0.TestForwardIgnoreLabel
[ OK ] AccuracyLayerTest/0.TestForwardIgnoreLabel (5 ms)
[ RUN ] AccuracyLayerTest/0.TestForwardCPUTopK
[ OK ] AccuracyLayerTest/0.TestForwardCPUTopK (22 ms)
[ RUN ] AccuracyLayerTest/0.TestForwardCPU
[ OK ] AccuracyLayerTest/0.TestForwardCPU (4 ms)
[ RUN ] AccuracyLayerTest/0.TestSetupTopK
[ OK ] AccuracyLayerTest/0.TestSetupTopK (2 ms)
[ RUN ] AccuracyLayerTest/0.TestSetup
[ OK ] AccuracyLayerTest/0.TestSetup (2 ms)
[ RUN ] AccuracyLayerTest/0.TestForwardWithSpatialAxes
[ OK ] AccuracyLayerTest/0.TestForwardWithSpatialAxes (4 ms)
[----------] 6 tests from AccuracyLayerTest/0 (76 ms total)

[----------] 1 test from UniformFillerTest/0, where TypeParam = float
[ RUN ] UniformFillerTest/0.TestFill
[ OK ] UniformFillerTest/0.TestFill (0 ms)
[----------] 1 test from UniformFillerTest/0 (9 ms total)

[----------] 1 test from StochasticPoolingLayerTest/0, where TypeParam = float
[ RUN ] StochasticPoolingLayerTest/0.TestSetup
[ OK ] StochasticPoolingLayerTest/0.TestSetup (1 ms)
[----------] 1 test from StochasticPoolingLayerTest/0 (1 ms total)

[----------] 1 test from HDF5DataLayerTest/0, where TypeParam = caffe::FloatCPU
[ RUN ] HDF5DataLayerTest/0.TestRead
HDF5-DIAG: Error detected in HDF5 (1.8.14) thread 0:
#000: H5Dio.c line 173 in H5Dread(): can't read data
major: Dataset
minor: Read failed
#001: H5Dio.c line 550 in H5D__read(): can't read data
major: Dataset
minor: Read failed
#002: H5Dchunk.c line 1872 in H5D__chunk_read(): unable to read raw data chunk
major: Low-level I/O
minor: Read failed
#003: H5Dchunk.c line 2902 in H5D__chunk_lock(): data pipeline read failed
major: Data filters
minor: Filter operation failed
#004: H5Z.c line 1357 in H5Z_pipeline(): required filter 'deflate' is not registered
major: Data filters
minor: Read failed
#005: H5PL.c line 298 in H5PL_load(): search in paths failed
major: Plugin for dynamically loaded library
minor: Can't get value
#006: H5PL.c line 402 in H5PL__find(): can't open directory
major: Plugin for dynamically loaded library
minor: Can't open directory or file
F0313 20:19:22.347337 5736 io.cpp:268] Check failed: status >= 0 (-1 vs. 0) Failed to read float dataset data
*** Check failure stack trace: ***
@ 0x401948f6 google::LogMessage::Fail()
@ 0x4019612a google::LogMessage::SendToLog()
@ 0x401945da google::LogMessage::Flush()
@ 0x4019677c google::LogMessageFatal::~LogMessageFatal()
@ 0x40a32956 caffe::hdf5_load_nd_dataset<>()
@ 0x40a3fc80 caffe::HDF5DataLayer<>::LoadHDF5FileData()
@ 0x40a3f9a0 caffe::HDF5DataLayer<>::Forward_cpu()
@ 0xb05a0 caffe::HDF5DataLayerTest_TestRead_Test<>::TestBody()
@ 0x18421c testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x18421c testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x18421c testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x18421c testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x18421c testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x18421c testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x18421c testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x18421c testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x18421c testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x18421c testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x18421c testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x18421c testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x18421c testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x18421c testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x18421c testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x18421c testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x18421c testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x18421c testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x18421c testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x18421c testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x18421c testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x18421c testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x18421c testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x18421c testing::internal::HandleExceptionsInMethodIfSupported<>()
Aborted (core dumped)
make: *** [runtest] Error 134
root@overo:/home/gumstix/caffe/native_caffe/caffe#

Also, I don't have a GPU, so this should be a CPU-only build.
Below is the config from HDF5 (when I run ./configure before I run make):

Post process src/libhdf5.settings
config.status: executing depfiles commands
config.status: executing libtool commands
SUMMARY OF THE HDF5 CONFIGURATION
=================================

General Information:

               HDF5 Version: 1.8.14
              Configured on: Sat Mar 14 00:03:56 UTC 2015
              Configured by: gumstix@overo
             Configure mode: production
                Host system: armv7l-unknown-linux-gnueabi
          Uname information: Linux overo 3.2.1-linaro-omap #3 PREEMPT Thu Jul 26 17:05:26 PDT 2012 armv7l armv7l armv7l GNU/Linux
                   Byte sex: little-endian
                  Libraries: static, shared
         Installation point: /home/gumstix/caffe/hdf5/hdf5-1.8.14/hdf5

Compiling Options:

           Compilation Mode: production
                 C Compiler: /usr/bin/gcc
                     CFLAGS: 
                  H5_CFLAGS:  
                  AM_CFLAGS: 
                   CPPFLAGS: 
                H5_CPPFLAGS: -D_POSIX_C_SOURCE=199506L   -DNDEBUG -UH5_DEBUG_API
                AM_CPPFLAGS: -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -D_BSD_SOURCE 
           Shared C Library: yes
           Static C Library: yes

Statically Linked Executables: no
LDFLAGS:
H5_LDFLAGS:
AM_LDFLAGS: -L/home/gumstix/caffe/hdf5/hdf5-1.8.14/hdf5/lib
Extra libraries: -lz -lrt -ldl -lm
Archiver: ar
Ranlib: ranlib
Debugged Packages:
API Tracing: no

Languages:

                    Fortran: no

                        C++: no

Features:

              Parallel HDF5: no
         High Level library: yes
               Threadsafety: no
        Default API Mapping: v18

With Deprecated Public Symbols: yes
I/O filters (external): deflate(zlib)
I/O filters (internal): shuffle,fletcher32,nbit,scaleoffset
MPE: no
Direct VFD: no
dmalloc: no
Clear file buffers before write: yes
Using memory checker: no
Function Stack Tracing: no
Strict File Format Checks: no
Optimization Instrumentation: no
Large File Support (LFS): yes
gumstix@overo:~/caffe/hdf5/hdf5-1.8.14$

Anything I can try or look up?

@pblach
Author

pblach commented Mar 15, 2015

I rebuilt HDF5, and now the run reports: 593 tests from 111 test cases ran (370843 ms total)
[ PASSED ] 593 tests.

So apparently I didn't get HDF5 right the first time and now all seems to work.
For the record here is what I did with HDF5:
cd hdf5-1.8.14
./configure --prefix=/usr/local/hdf5 --enable-cxx
make
make check
sudo make install
sudo make check-install

I then copied a bunch of the .so libraries to /usr/lib,
and that's it.
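
Copying the .so files into /usr/lib presumably changes which HDF5 library the test binary picks up at load time. A minimal sketch for checking that, assuming `ldd` is available on the board; the helper name `hdf5_entries` is made up for illustration, and the binary path in the usage comment is the one from the log above:

```shell
# Keep only the HDF5 entries from `ldd` output read on stdin.
# (hdf5_entries is a hypothetical helper name, not part of Caffe or HDF5.)
hdf5_entries() {
    grep -i 'libhdf5'
}

# Usage, run from the Caffe source directory:
#   ldd .build_release/test/test_all.testbin | hdf5_entries
```

If the paths printed point at a different HDF5 installation than the one that was just built, the test binary is not using the new build.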

@pblach pblach closed this as completed Mar 15, 2015
@gjy3035

gjy3035 commented Mar 24, 2015

@pblach How did you solve it?
I encountered the same error. Thanks for your help!

@gjy3035

gjy3035 commented Mar 24, 2015

@pblach I tried to solve it by following your steps, but the problem still appears.
Can you describe your solution in more detail?

@apouschi

apouschi commented Apr 5, 2015

@gjy3035 I just dealt with this issue. It turns out that the HDF5 'deflate' filter mentioned in entry #4 of the error stack ("required filter 'deflate' is not registered") requires the zlib package to work properly. I got this error because my HDF5 build couldn't find zlib in its default install location. To fix this, I left my zlib unchanged, uninstalled my HDF5 build, and then configured a new HDF5 build by running the following in the hdf5 directory:

./configure --enable-cxx --enable-fortran --with-zlib=/path/to/zlib/include,/path/to/zlib/lib --prefix=/usr/local/hdf5

(Just as a general note: the --enable flags may be unnecessary. I had them left over from a previous attempt to solve this, and haven't tried configuring hdf5 without them. Also, the --with-zlib part is described in section 4.3.7 of the hdf5 installation instructions, in case more detail is needed.)
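
If it is unclear what to substitute for /path/to/zlib/include and /path/to/zlib/lib, the following is a minimal sketch (not part of the original fix) that searches a directory tree for zlib.h and a libz library and prints a matching --with-zlib value. The function name `zlib_configure_flag` and the search root are assumptions made for illustration.

```shell
# Print a --with-zlib=INCLUDE_DIR,LIB_DIR flag for the first zlib header and
# library found under the given root; return non-zero if either is missing.
# (zlib_configure_flag is a hypothetical helper, not an HDF5 tool.)
zlib_configure_flag() {
    root=$1
    inc=$(find "$root" -name 'zlib.h' -print 2>/dev/null | head -n 1)
    lib=$(find "$root" \( -name 'libz.so*' -o -name 'libz.a' \) -print 2>/dev/null | head -n 1)
    [ -n "$inc" ] && [ -n "$lib" ] || return 1
    printf '%s\n' "--with-zlib=$(dirname "$inc"),$(dirname "$lib")"
}

# Usage (the search root /usr is an assumption; adjust to your system):
#   ./configure $(zlib_configure_flag /usr) --prefix=/usr/local/hdf5
```

When several zlib copies exist under the root, this picks whichever one `find` lists first, so it is worth reading the printed flag before passing it to configure.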

From there I finished installing hdf5 with its make commands, then did the same for caffe.

For clarity, all the commands I used are replicated below:

cd hdf5-1.8.14
./configure --enable-cxx --enable-fortran --with-zlib=/path/to/zlib/include,/path/to/zlib/lib --prefix=/usr/local/hdf5
make
make check
sudo make install
sudo make check-install
cd ../caffe-master
make
make test
make runtest

I'm guessing that pblach's hdf5 was able to find zlib after he installed it, while yours, like mine, was not, but that's just a guess. For anyone else who might find this thread, I'd advise you to try pblach's solution first, and if you still get the same error, try mine.
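
One way to see whether a given HDF5 build found zlib is to look at its configuration summary. A minimal sketch, assuming the summary is saved as libhdf5.settings (the file named in the configure output earlier in this thread) and that it contains an "I/O filters (external):" line as shown there; the function name `has_deflate_filter` and the path in the usage comment are assumptions:

```shell
# Succeed (exit status 0) when the given libhdf5.settings file lists
# deflate among the external I/O filters.
# (has_deflate_filter is a hypothetical helper, not an HDF5 tool.)
has_deflate_filter() {
    grep -Eq 'I/O filters \(external\):.*deflate' "$1"
}

# Usage (the path is an assumption based on --prefix=/usr/local/hdf5):
#   has_deflate_filter /usr/local/hdf5/lib/libhdf5.settings && echo "deflate enabled"
```

Note that the summary in the original report already lists deflate(zlib) even though the test failed, so passing this check does not by itself show that the Caffe test binary is loading that particular build.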

Hope this helps

@vgprasadh

@apouschi - Thanks for the zlib info. I was lost until I saw your post. Thanks once again.

@pblach
Author

pblach commented Nov 3, 2015

What a coincidence, I'm building hdf5 as we speak (with the goal of building caffe). I will let you know how it goes.

On Tue, Nov 3, 2015 at 5:28 AM, Guha Prasad Venkataraman <
notifications@github.com> wrote:

@apouschi https://github.com/apouschi - Thanks for the zlib info. I was
lost until I saw your post. Thanks once again.



@tianoak

tianoak commented Mar 29, 2017

I have to use --enable-fortran90 so that the build ignores Fortran 2003.
