This repository has been archived by the owner on Feb 24, 2018. It is now read-only.

"HDF5ERROR in aiori-HDF5.c (line 553): cannot create data set" error when benchmarking HDF5 in Blue Gene Q #9

Closed
brunomaga opened this issue Jul 15, 2013 · 8 comments

Comments

@brunomaga

Hello

I'm getting the following error when benchmarking the HDF5 method on a Blue Gene Q: "HDF5ERROR in aiori-HDF5.c (line 553): cannot create data set."

I'm using zlib/1.2.7 and hdf5/1.8.10, and compiling for the compute nodes (not front nodes).

The output is in this attachment below:
ior-hdf5

The file creation parameters are in this attachment below:
backend-create-params

I am using the latest git master branch version (v. 3.0.1).
The MPIIO and POSIX tests work fine.

Thank you for your support.

PS: As per contract with IBM, I was told by my line manager that we have to use version 2.10.3 of IOR... is this version fully compatible with Blue Gene Q? Any known issues?

@morrone
Member

morrone commented Jul 15, 2013

I'm using zlib/1.2.7 and hdf5/1.8.10, and compiling for the compute nodes (not front nodes).

You are one up on me; IOR with HDF5 won't even compile for me on BG/Q. :) Did you have to make any modifications to get it to build?

I'm using zlib/1.2.7 and hdf5/1.8.10, and compiling for the compute nodes (not front nodes).

Your screenshot says 1.8.8. Are you sure about that?

PS: As per contract with IBM, I was told by my line manager that we have to use version 2.10.3 of IOR... is this version fully compatible with Blue Gene Q? Any known issues?

Sorry, I haven't used that version of IOR in at least two years. Off the top of my head, I know I changed the way IOR gets the host name to identify tasks per node. I think that made IOR's output a little more sane. I cannot remember if anything was more clearly broken than that.

@brunomaga
Author

You are right, in my test I used 1.8.8 but I've just tested 1.8.10 and it returns the same error.

About the compilation process, see if this helps:

I didn't have to modify the source code, just pass the parameters into the ./configure and make steps.
To be clear, I use modules to load the zlib and hdf5 dependencies.
For the compute nodes compilation, I loaded our compiler (wrapper/xl).
Here are the variables set by the modules:

HDF5
setenv HDF5_ROOT {bgsys}/local/hdf5/1.8.10
setenv HDF5_INCDIR {bgsys}/local/hdf5/1.8.10/include
setenv HDF5_LIBDIR {bgsys}/local/hdf5/1.8.10/lib
setenv HDF5_LINK_OPTS -L{bgsys}/local/hdf5/1.8.10/lib -lhdf5

ZLIB
module-whatis loads the zlib environment for BG/Q CNs
setenv ZLIB_ROOT {bgsys}/local/zlib/1.2.7
setenv ZLIB_INCDIR {bgsys}/local/zlib/1.2.7/include
setenv ZLIB_LIBDIR {bgsys}/local/zlib/1.2.7/lib
setenv ZLIB_LINK_OPTS -L{bgsys}/local/zlib/1.2.7/lib -lz

Compiler
module-whatis loads the BG/Q mpi xl wrappers environment
prepend-path PATH {bgsys}/drivers/ppcfloor/comm/xl/bin
prepend-path MANPATH {bgsys}/drivers/ppcfloor/comm/xl/man
setenv CC {bgsys}/drivers/ppcfloor/comm/xl/bin/mpixlc
setenv CXX {bgsys}/drivers/ppcfloor/comm/xl/bin/mpixlcxx
setenv FC {bgsys}/drivers/ppcfloor/comm/xl/bin/mpixlf90

In order to configure it for hdf5 building, I run:
./configure --with-hdf5=yes CFLAGS=-I$HDF5_INCDIR LDFLAGS="-L$HDF5_LIBDIR -L$ZLIB_LIBDIR" LIBS="-lm"

Then I just run make normally, and the resulting link command looks like this:
mpixlc -I/bgsys/local/hdf5/1.8.10/include -L/bgsys/local/hdf5/1.8.10/lib -L/bgsys/local/zlib/1.2.7/lib -o ior ior.o utilities.o parse_options.o aiori-POSIX.o aiori-MPIIO.o aiori-HDF5.o -lhdf5 -lz -lm

I noticed now that I get two warnings in the compilation process:
"aiori-HDF5.c", line 551.30: 1506-098 (E) Missing argument(s).
"aiori-HDF5.c", line 555.34: 1506-098 (E) Missing argument(s).

This is the function where the execution dies, maybe it's related to this issue?

Then when running, I get the following output:

HDF5-DIAG: Error detected in HDF5 (1.8.10) MPI-process 0:
#000: H5D.c line 152 in H5Dcreate2(): not link creation property list
major: Invalid arguments to routine
minor: Inappropriate type
.
IOR-3.0.1: MPI Coordinated Test of Parallel I/O

Began: Tue Jul 16 09:47:10 2013
Command line used: ./IOR/src/ior -a HDF5
Machine: CNK bgqio10-ib

Test 0 started: Tue Jul 16 09:47:10 2013
Summary:
api = HDF5-1.8.10 (Parallel)
test filename = testFile
access = single-shared-file
ordering in a file = sequential offsets
ordering inter file= no tasks offsets
clients = 1 (1 per node)
repetitions = 1
xfersize = 262144 bytes
blocksize = 1 MiB
aggregate filesize = 1 MiB

access bw(MiB/s) block(KiB) xfer(KiB) open(s) wr/rd(s) close(s) total(s) iter


** error **
ERROR in aiori-HDF5.c (line 553): cannot create data set.
** exiting **

Hope this helps

@morrone
Member

morrone commented Jul 16, 2013

I noticed now that I get two warnings in the compilation process:
"aiori-HDF5.c", line 551.30: 1506-098 (E) Missing argument(s).
"aiori-HDF5.c", line 555.34: 1506-098 (E) Missing argument(s).

This is the function where the execution dies, maybe it's related to this issue?

Yes, I would very much suspect that to be the case. The "(E)" denotes: "Error conditions exist that the compiler can correct, but the program might not produce the expected results."

I am a bit surprised that the compiler doesn't halt, actually. I would dig into that error.
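
One plausible reading of how the compile-time diagnostic and the runtime "not link creation property list" message connect: the HDF5 1.8 create call takes a link creation property list before the dataset creation property list, so a call written in the older five-argument order would hand a dataset creation list to the link creation slot. The sketch below is a toy model of that check, not HDF5 code. Every name in it (toy_plist_class, toy_create_check, toy_old_style_call, toy_new_style_call) is hypothetical, and what the XL compiler actually substitutes for the missing arguments is an assumption (modelled here as a default list).

```c
/* Toy model only: hypothetical names, not HDF5 symbols or internals. */

/* Each property list carries a class. TOY_PLIST_DEFAULT plays the role
 * of "use the defaults" and is accepted in any slot. */
enum toy_plist_class {
    TOY_PLIST_DEFAULT,
    TOY_PLIST_LINK_CREATE,
    TOY_PLIST_DATASET_CREATE
};

/* Stand-in for a seven-argument create call, reduced to the two
 * property-list slots that matter here. Returns 0 on success and -1
 * when a slot holds a list of the wrong class. */
int toy_create_check(enum toy_plist_class lcpl, enum toy_plist_class dcpl)
{
    if (lcpl != TOY_PLIST_DEFAULT && lcpl != TOY_PLIST_LINK_CREATE)
        return -1;              /* "not link creation property list" */
    if (dcpl != TOY_PLIST_DEFAULT && dcpl != TOY_PLIST_DATASET_CREATE)
        return -1;
    return 0;
}

/* Old-style argument order: the dataset creation list comes fifth, which
 * the newer signature reads as the link creation slot. The remaining
 * slot is assumed to end up as a default list. */
int toy_old_style_call(void)
{
    return toy_create_check(TOY_PLIST_DATASET_CREATE, TOY_PLIST_DEFAULT);
}

/* New-style argument order: each list sits in its own slot. */
int toy_new_style_call(void)
{
    return toy_create_check(TOY_PLIST_DEFAULT, TOY_PLIST_DATASET_CREATE);
}
```

In this model toy_old_style_call() is rejected and toy_new_style_call() succeeds, which matches the shape of the failure reported above but does not prove that this is what happens inside the library.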

@brunomaga
Author

Ok thank you. I will start looking into that. Let me know if you need help with the BGQ compilation.

@roblatham00

I too am surprised you could build this far.

"aiori-HDF5.c", line 555.34: 1506-098 (E) Missing argument(s).

means IOR is trying to use old-style HDF5 arguments. There is a backwards compatibility option, though:

add -DH5_USE_16_API to your cflags.

Maybe IOR should just do that automatically. I shudder to think anyone is actually running HDF5-1.6.x, but even if they were that define would not affect them.
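
For readers unfamiliar with how such a flag can change which function a call resolves to, here is a minimal sketch of the general technique: the unversioned name is a macro that maps to either an old or a new versioned function. This is a toy model that compiles without HDF5. The names toy_create1, toy_create2, TOY_CREATE and TOY_USE_OLD_API are hypothetical stand-ins and are not taken from the HDF5 headers.

```c
/* Toy model only: hypothetical names, not HDF5 symbols. */

/* Old-style entry point: five arguments. */
int toy_create1(int loc, const char *name, int type, int space, int dcpl)
{
    (void)loc; (void)name; (void)type; (void)space; (void)dcpl;
    return 5;                   /* report how many arguments were taken */
}

/* New-style entry point: two extra property-list arguments. */
int toy_create2(int loc, const char *name, int type, int space,
                int lcpl, int dcpl, int dapl)
{
    (void)loc; (void)name; (void)type; (void)space;
    (void)lcpl; (void)dcpl; (void)dapl;
    return 7;
}

/* The compatibility switch. In a real build this would come from the
 * compiler command line or from a #define placed before the header. */
#define TOY_USE_OLD_API

/* The unversioned name is a macro: with the switch set it maps to the
 * old signature, otherwise to the new one. */
#ifdef TOY_USE_OLD_API
#define TOY_CREATE toy_create1
#else
#define TOY_CREATE toy_create2
#endif

/* Caller written against the old five-argument interface. */
int toy_old_style_caller(void)
{
    return TOY_CREATE(1, "dataset", 0, 0, 0);
}
```

With the switch defined, the five-argument call resolves to toy_create1 and compiles cleanly. Removing the #define would route the same call to the seven-argument function, which is the kind of mismatch the "Missing argument(s)" message points at.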

@roblatham00

I guess some day I'll set up my github life so I can "send a pull request". Until then, I blat patches into the issue tracker:

From bf979035e87250970e022527d44bf745ff20cf87 Mon Sep 17 00:00:00 2001
From: Rob Latham <robl@mcs.anl.gov>
Date: Thu, 18 Jul 2013 21:31:54 +0000
Subject: [PATCH] backwards compatiblity macro

Since aiori-HDF5 uses old 1.6-style function calls (create, open, etc),
turn on the backwards compatiblity here (the one place HDF5 calls are
made)
---
 src/aiori-HDF5.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/src/aiori-HDF5.c b/src/aiori-HDF5.c
index 2f222f0..7649be0 100644
--- a/src/aiori-HDF5.c
+++ b/src/aiori-HDF5.c
@@ -19,6 +19,9 @@
 #include <stdio.h>              /* only for fprintf() */
 #include <stdlib.h>
 #include <sys/stat.h>
+/* HDF5 routines here still use the old 1.6 style.  Nothing wrong with that but
+ * save users the trouble of  passing this flag through configure */
+#define H5_USE_16_API
 #include <hdf5.h>
 #include <mpi.h>

-- 
1.7.1
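
A possible variation on the patch, offered only as a sketch and not as part of Rob's change: if someone also passes -DH5_USE_16_API through CFLAGS as suggested earlier, an unconditional #define in the source may draw a macro-redefinition diagnostic from some compilers. Guarding the define avoids that. The helper compat_switch_enabled is hypothetical and exists only so the effect can be checked without HDF5 installed.

```c
/* Define the switch only if the build has not already supplied it,
 * for example via -DH5_USE_16_API in CFLAGS. */
#ifndef H5_USE_16_API
#define H5_USE_16_API
#endif

/* In aiori-HDF5.c the guard would sit just before #include <hdf5.h>.
 * That include is omitted here so the sketch builds on its own. */

/* Hypothetical helper: returns 1 when the switch is visible at this
 * point in the translation unit, 0 otherwise. */
int compat_switch_enabled(void)
{
#ifdef H5_USE_16_API
    return 1;
#else
    return 0;
#endif
}
```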

@morrone morrone reopened this Jul 18, 2013
@morrone
Member

morrone commented Jul 18, 2013

Thanks, Rob! I added that change to master.

@brunomaga
Author

It worked!! Thank you for your help, Rob.

osteffen pushed a commit to ThinkParQ/ior-1 that referenced this issue Oct 20, 2017
Changed the semantics of -R to compare the data with the expected dat…