A Dict-like wrapper for HDF5/JLD files

DictFiles

DictFiles provides an easy-to-use abstraction over the excellent JLD and HDF5 packages by Tim Holy. A DictFile is a standard JLD file that behaves much like a nested Dict:

using DictFiles
dictopen("/tmp/test") do a
    a["key1"] = [1 2 3]
    a["key1"]            # == [1 2 3]
    a[]                  # == Dict("key1"=>[1 2 3])

    a["key2",1] = "One"
    a["key2","two"] = 2
    a["key2"]            # == Dict(1 => "One", "two" => 2)
    a["key2", 1]         # == "One"

    a[:mykey] = Dict("item" => 2.2)
    a[:mykey,"item"]    # == 2.2
end

It also supports memory-mapping individual entries and compacting the file to reclaim space lost through deletions and updates.

Installation

Simply add the package using Julia's package manager once:

Pkg.add("DictFiles")

You also need the master branch of HDF5.jl:

Pkg.checkout("HDF5")

Then include it where you need it:

using DictFiles

Documentation

DictFiles behave like nested Dicts. The primary way to access a DictFile df is df[keys...] = value for writing and df[keys...] for reading, where keys is a tuple of primitive values: strings, chars, numbers, tuples, or small arrays.
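
To make the nested-Dict metaphor concrete, here is a minimal sketch of the same multi-key indexing on plain in-memory Dicts. The helpers getpath and setpath! are hypothetical names used for illustration only. They are not part of DictFiles and say nothing about how the package stores data on disk.

```julia
# Hypothetical helpers (NOT part of DictFiles): follow a sequence of keys
# through nested Dicts, one level per key.
function getpath(d::Dict, path...)
    node = d
    for k in path
        node = node[k]            # throws a KeyError if a level is missing
    end
    return node
end

function setpath!(d::Dict, value, path...)
    isempty(path) && throw(ArgumentError("at least one key is required"))
    node = d
    for k in path[1:end-1]
        # create intermediate levels on demand
        node = get!(() -> Dict{Any,Any}(), node, k)
    end
    node[path[end]] = value
    return d
end

d = Dict{Any,Any}()
setpath!(d, "One", "key2", 1)     # analogous to a["key2", 1] = "One"
setpath!(d, 2, "key2", "two")     # analogous to a["key2", "two"] = 2
getpath(d, "key2", 1)             # analogous to a["key2", 1]
getpath(d)                        # analogous to a[], the whole contents
```

With no keys, getpath returns the whole Dict, which mirrors what a[] does for a DictFile.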

DictFile, dictopen, close

a = DictFile("/tmp/test")
# do something with a
close(a)

A better way to do this, which also closes the file if an error occurs, is:

dictopen("/tmp/test") do a
    # do something with a
end

Like open, both methods take a mode parameter. The default is r+, with the added behavior that the file is created if it does not exist yet.

dictread, dictwrite

To read the entire contents of a file:

r = dictread(filename)

To overwrite the entire contents of a DictFile with a Dict:

dictwrite(somedict, filename)

Setting and getting, browsing, deleting

dictopen("/tmp/test") do a
    a["mykey"] = 1
    a["mykey"]                #  returns 1
 
    # following the metaphor of nested Dict's:
    a[] = Dict("mykey" => 1, "another key" => Dict("a"=>"A", :b =>"B", 1=>'c'))
    a[]                       # gets the entire contents as one Dict()

    a["another key", :b]      #  "B"
    a["another key"]          #  Dict("a"=>"A", :b =>"B", 1=>'c')

    keys(a)                   #  ["another key","mykey"]
    keys(a,"another key")     #  ["a",1,:b]
    values(a)                 #  [Dict(:b=>"B",1=>'c',"a"=>"A"),1]
    values(a,"another key")   #  ["A",'c',"B"]
    haskey(a,"mykey") ? println("has key!") : nothing

    # note that the default parameter for get comes second! 
    get(a, "default", "mykey")   #  1 
    delete!(a, "mykey")
    get(a, "default", "mykey")   #  "default"
end

If your file holds a deeply nested data structure and you only want to work on part of it:

dictopen("/tmp/test") do a 
    a[] = Dict("some"=>1, "nested data" => Dict("a" => 1, "b" => 2))
    b = DictFile(a, "nested data")   #  e.g., you can pass b to other functions
    keys(b)                          #  ["a","b"]
    b["c"] = 3 
    a[]                              #  Dict("some"=>1, 
                                     #  "nested data" => Dict("a" => 1, "b" => 2, "c" => 3))
end
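
The effect of DictFile(a, "nested data") matches what happens with plain nested Dicts when you hold a reference to an inner Dict: writes through the reference are visible from the outer one. The sketch below uses only in-memory Dicts and is meant as an analogy, not as a description of the package internals.

```julia
# Plain-Dict analogy for working on a sub-tree (no DictFiles involved).
a = Dict{Any,Any}("some" => 1,
                  "nested data" => Dict{Any,Any}("a" => 1, "b" => 2))

b = a["nested data"]      # a reference to the inner Dict, not a copy
b["c"] = 3                # write through the reference

a["nested data"]["c"]     # the new entry is visible from the outer Dict
```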

Compacting

When fields get overwritten or explicitly deleted, HDF5 appends the new data to the file and unlinks the old data. The space occupied by the old data is not recovered. To reclaim it, you can compact the file from time to time. This copies all data to a temporary file and replaces the original on success.

    DictFiles.compact("/tmp/test")
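
The copy-to-a-temporary-file-then-replace pattern described above can be sketched in isolation. The following is not the implementation of DictFiles.compact. It is a hypothetical helper, rewrite_via_tempfile, that applies the same idea to a plain text file, and it assumes Julia 1.x (for read(path, String) and the force keyword of mv).

```julia
# Hypothetical helper (NOT DictFiles.compact): rewrite a file by writing the
# new contents to a temporary file and replacing the original only on success.
function rewrite_via_tempfile(transform, path::AbstractString)
    tmp = path * ".tmp"
    try
        open(tmp, "w") do io
            write(io, transform(read(path, String)))
        end
        mv(tmp, path; force=true)   # the original is replaced only at this point
    finally
        isfile(tmp) && rm(tmp)      # remove the leftover if anything failed
    end
    return path
end

# Usage: drop every "X" from a scratch file.
p = tempname()
write(p, "aXbXc")
rewrite_via_tempfile(s -> replace(s, "X" => ""), p)
```

If transform or the write throws, the original file is left untouched and the temporary file is removed.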

Contributing

I'd be very grateful for bug reports and feature suggestions. Please file an issue!
