A basic loader of HDF5 files into SciDB
C++
Latest commit 7aadaa8 Dec 12, 2013 @wangd Update HISTORY and README
Permalink
Failed to load latest commit information.
cmake/Modules
doc
src
tests
.gitignore
CMakeLists.txt
HISTORY
LICENSE
README
README.building

README

SciDB-HDF5
----------
The SciDB-HDF5 project was designed to allow scientists to load their
array data from HDF5 (now FITS as well) files into SciDB.  While full
functionality is not promised, we have used the code to load large 1D,
2D, and 3D arrays and manipulate them using SciDB.


Quick notes
-----------

Prerequisites:
A checkout of SciDB  -- scidb.org/download
CMake -- www.cmake.org
HDF5 libraries and headers (e.g., yum install hdf5 hdf5-devel)
CFITSIO and CCfits (for FITS functionality)

RHEL5 RPMS for HDF5 are available at: 
http://www.slac.stanford.edu/~danielw/hdf/
You need the -static and -devel packages at minimum, and probably the plain 
hdf5 rpm too.

How to build:
------------
Standard cmake steps should work.

Suppose you've checked out the source into /home/daniel/scidbhdf

1. Make a build directory, say /home/daniel/scidbhdf-bld
2. cd into your build directory ("cd /home/daniel/scidbhdf-bld")
3. "cmake <your-source-directory>" ("cmake /home/daniel/scidbhdf")
4. Patch locations for dependencies as needed by "ccmake <build-directory>"
  ("ccmake /home/daniel/scidbhdf-bld",
    or just "ccmake ." if $PWD=="/home/daniel/scidb-hdf-bld")
5. Within your build directory, "make"

The "loader" binary will be created within the "src" subdirectory.  It is 
hardcoded to use a particular HDF5 file and to read a particular tree within.

The "libloadhdf.so" is a shared library designed to be a scidb plugin.
You can install it by copying it from the src/ directory to your SciDB
installation's plugins directory (ROOT/lib/scidb/plugins)

The "libloadfits.so" is a shared library designed to be a scidb
plugin.  It can be installed similarly to the other plugin.


Status
------
SciDB-HDF5 is beta-level code. It functions under some conditions
but may need development to handle more general usage.  It does not
contain known memory leaks or scalability problems, but has not
undergone extensive testing.  

FITS *tables* are not currently supported, since they do not have natural
counterparts in SciDB. FITS arrays seem to work fine.


AFL function signatures:
------------------------
libloadhdf.so:
void loadhdf(<full path of HDF5 file>, 
             <path to array within file>)

This function produces a temporary array as its output.  To store this
array permanently, wrap the result inside of SciDB's "store()" operator
(see examples below).

libloadfits.so:
void loadfits(<target array name>, <full path of FITS file>, 
             <HDU number>)

This function overwrites the target specified array, which is deleted
if necessary and constructed with a shape and form according to the
source array's original shape. Please note that loadfits() writes
directly to an array, and thus does not really work with distributed
arrays. It has not yet been updated to return a temporary
array. Patches welcome.


Operating notes
---------------
The code automatically maps a 1D array of 2D images into a single 3D
array, as this was the best behavior for the original use case.


Trying out the loader:
----------------------
## Load the plugin
iquery -aq "load_library('loadhdf');"

## Load an array
iquery -naq "store(loadhdf('/u24/ldrd/sxrcom10-r0232.h5',
'/Configure:0000/Run:0000/CalibCycle:0000/Camera::FrameV1/SxrBeamline.0:Opal1000.1/image'),
image);"

## Some SciDB snippets to inspect the loaded array
iquery -aq "show(image);"

## Count it
iquery -aq "count(image);"

## pixel standard deviations for the first 3 images in the series
iquery -aq "stdev(subarray(image,0,0,0,1023,1023,2),int1,unnamed2);"

## Load a compound:
iquery -aq "store(loadhdf('/u24/ldrd/sxrcom10-r0232.h5',
'/Configure:0000/Run:0000/CalibCycle:0000/Camera::FrameV1/SxrBeamline.0:Opal1000.1/time'),
imagetime);

## See its shape:
iquery -aq "show(imagetime);"
[("imagetime<seconds:uint32 NOT NULL,nanoseconds:uint32 NOT NULL> [unnamed0=0:9223372036854775807,42247,0]")]

### FITS loader examples:

## Load the plugin and the fits file
iquery -aq "load_library('loadfits');"
iquery -aq "loadfits('s20fits', '/u24/danielw/testdata/S20.fits',1);"

## See what we loaded
iquery -aq "show(s20fits);"
[("s20fits<unknown:float NOT NULL> [dim0=0:4072,4072,0,dim1=0:4000,4000,0]")]

## Compute an average
iquery -aq "avg(s20fits);"
[(5.21866)]


Notes:
------
1. You can reconfigure the build type via:
   For a debug build
     cmake -DCMAKE_BUILD_TYPE=Debug <build dir>
   For a release build
     cmake -DCMAKE_BUILD_TYPE=Release <build dir>

   e.g. if your build directory is /home/daniel/scidbhdf-bld and you want to 
   configure for debugging, 
   "cmake -DCMAKE_BUILD_TYPE=Debug /home/daniel/scidbhdf-bld"

2. You might find the "make VERBOSE=1" command useful when troubleshooting
   build problems.


All files are distributed under the Apache License 2.0 (described
below) unless otherwise noted.

Copyright Notice:
-----------------
 Copyright 2012 Jacek Becla, Daniel Liwei Wang

   Licensed under the Apache License, Version 2.0 (the "License");
   you may not use this file except in compliance with the License.
   You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.