The plugin can read single-dimensional arrays from HDF5 files.
The following types are supported:
- Int8
- UInt8
- Int16
- UInt16
- Int32
- Int64
- Float32
- Float64
- Fixed-length strings
If you are using the sbt-spark-package plugin, the easiest way to use the package is to require it from the Spark Packages website:
spDependencies += "LLNL/spark-hdf5:0.0.4"
Otherwise, download the latest release jar and include it on your classpath.
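If you prefer plain sbt, the following is a minimal sketch of an equivalent configuration. It assumes the artifact is published to the Spark Packages Maven repository under the `LLNL` organization; both the resolver URL and the coordinates follow the Spark Packages convention and are assumptions, not taken from this README.

```scala
// build.sbt sketch: resolve spark-hdf5 from the Spark Packages repository.
// The resolver URL and the "LLNL" % "spark-hdf5" coordinates are assumptions
// based on the Spark Packages naming convention.
resolvers += "Spark Packages Repo" at "https://repos.spark-packages.org/"

libraryDependencies += "LLNL" % "spark-hdf5" % "0.0.4"
```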
import gov.llnl.spark.hdf._
val df = sqlContext.read.hdf5("path/to/file.h5", "/dataset")
df.show
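The result is an ordinary DataFrame, so the usual Spark SQL operations apply after loading. A minimal sketch follows; the column name `value` is an assumption for illustration, so inspect the schema to find the columns the plugin actually produces:

```scala
import gov.llnl.spark.hdf._

// Load a one-dimensional dataset, then use standard DataFrame operations.
// The "value" column name is illustrative; run printSchema to see the
// actual columns.
val df = sqlContext.read.hdf5("path/to/file.h5", "/dataset")
df.printSchema()                    // inspect the inferred schema
df.filter(df("value") > 0).count()  // standard Spark SQL operations apply
```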
You can start a Spark REPL with the console target:
sbt console
This will fetch all of the dependencies, set up a local Spark instance, and start a Spark REPL with the plugin loaded.
The following options can be set:
Key | Default | Description |
---|---|---|
extension | h5 | The file extension of the data files |
chunk size | 10000 | The maximum number of elements to be read in a single scan |
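This README does not show how these options are passed. One plausible sketch, assuming the plugin accepts them through Spark's standard `DataFrameReader.option` mechanism (an assumption, not documented behavior):

```scala
// Sketch: overriding both defaults from the table above. Passing options
// via DataFrameReader.option is an assumption about the plugin's API.
val df = sqlContext.read
  .option("extension", "he5")    // look for .he5 files instead of .h5
  .option("chunk size", "5000")  // read at most 5000 elements per scan
  .hdf5("path/to/file.he5", "/dataset")
```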
The plugin includes a test suite, which can be run through SBT:
sbt test
Planned improvements include:
- Use the hdf-obj package rather than the sis-jhdf5 wrapper
- Support for multi-dimensional arrays
- Support for compound datasets
- Additional testing
- Partition discovery (data inference based on location)
This code was developed at Lawrence Livermore National Laboratory (LLNL) and is available under the Apache 2.0 license (LLNL-CODE-699384).