DictFiles

DictFiles provides an easy-to-use abstraction over the excellent JLD and HDF5 packages by Tim Holy. A DictFile is a standard JLD file that behaves much like nested Dicts:

using DictFiles
dictopen("/tmp/test") do a
    a["key1"] = [1 2 3]
    a["key1"]            # == [1 2 3]
    a[]                  # == Dict("key1"=>[1 2 3])

    a["key2",1] = "One"
    a["key2","two"] = 2
    a["key2"]            # == Dict(1 => "One", "two" => 2)
    a["key2", 1]         # == "One"

    a[:mykey] = Dict("item" => 2.2)
    a[:mykey,"item"]    # == 2.2
end

It also supports memory-mapping individual entries and compacting the file to reclaim space lost through deletions and updates.

Installation

Simply add the package using Julia's package manager once:

Pkg.add("DictFiles")

You also need the master branch of HDF5.jl:

Pkg.checkout("HDF5")

Then include it where you need it:

using DictFiles

Documentation

DictFiles behave like nested Dicts. The primary way to access a DictFile df is df[keys...] = value for writing and df[keys...] for reading, where keys is a tuple of primitive values, i.e. strings, chars, numbers, tuples, or small arrays.
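To make the nested-Dict metaphor concrete, here is a minimal sketch on plain in-memory Dicts. The helpers setnested! and getnested are hypothetical names invented for this illustration. They are not part of DictFiles and say nothing about how the package stores data on disk.

```julia
# Hypothetical helpers (not part of DictFiles) that mimic df[keys...] on plain Dicts.

# Walk down the key path, creating intermediate Dicts as needed,
# and store `value` under the last key.
function setnested!(d::Dict, value, ks...)
    isempty(ks) && throw(ArgumentError("need at least one key"))
    node = d
    for k in ks[1:end-1]
        node = get!(() -> Dict{Any,Any}(), node, k)
    end
    node[ks[end]] = value
    return d
end

# Walk down the key path and return whatever is stored there
# (a plain value or a whole sub-Dict).
function getnested(d::Dict, ks...)
    node = d
    for k in ks
        node = node[k]
    end
    return node
end

d = Dict{Any,Any}()
setnested!(d, "One", "key2", 1)     # like a["key2", 1] = "One"
setnested!(d, 2, "key2", "two")     # like a["key2", "two"] = 2

getnested(d, "key2", 1)             # "One"
getnested(d, "key2")                # the sub-Dict with 1 => "One" and "two" => 2
getnested(d)                        # the whole Dict, like a[]
```

Reading with a shorter key path returns the whole subtree, which is the behavior the examples in this README rely on.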

DictFile, dictopen, close

a = DictFile("/tmp/test")
# do something with a
close(a)

A better way to do this, since it also closes the file if an error occurs, is:

dictopen("/tmp/test") do a
    # do something with a
end

Like open, both methods take a mode parameter. The default is r+, which here also creates the file if it does not exist yet.
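The do-block form above follows the same pattern as Base's open. As a rough sketch of why it is safer, the following uses only Base file I/O on a temporary file. It assumes that dictopen cleans up in a comparable way, which the text above suggests but this example does not verify.

```julia
path = tempname()

# do-block form: the stream is closed when the block exits, even if it throws
open(path, "w") do io
    write(io, "hello")
end

# roughly equivalent manual form: close in a finally clause
io = open(path, "r")
content = try
    read(io, String)
finally
    close(io)
end

content        # "hello"
isopen(io)     # false
```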

dictread, dictwrite

To read the entire contents of a file:

r = dictread(filename)

To overwrite the entire contents of a dictfile with a Dict:

dictwrite(somedict, filename)

Setting and getting, browsing, deleting

dictopen("/tmp/test") do a
    a["mykey"] = 1
    a["mykey"]                #  returns 1
 
    # following the metaphor of nested Dict's:
    a[] = Dict("mykey" => 1, "another key" => Dict("a"=>"A", :b =>"B", 1=>'c'))
    a[]                       # gets the entire contents as one Dict()

    a["another key", :b]      #  "B"
    a["another key"]          #  Dict("a"=>"A", :b =>"B", 1=>'c')

    keys(a)                   #  ["another key", "mykey"]
    keys(a,"another key")     #  ["a", 1, :b]
    values(a)                 #  [Dict(:b=>"B",1=>'c',"a"=>"A"), 1]
    values(a,"another key")   #  ["A", 'c', "B"]
    haskey(a,"mykey") ? println("has key!") : nothing

    # note that the default parameter for get comes second! 
    get(a, "default", "mykey")   #  1 
    delete!(a, "mykey")
    get(a, "default", "mykey")   #  "default"
end
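The unusual argument order of get is presumably a consequence of the variable-length key path: in Julia a varargs parameter must come last, so the default has to precede the keys. The sketch below shows this on plain Dicts. The helper getdefault is a hypothetical name for illustration, not the DictFiles implementation.

```julia
# Hypothetical helper (not part of DictFiles): lookup along a key path
# with a fallback value. The varargs `ks...` must be the last parameter,
# so `default` comes before the keys.
function getdefault(d::Dict, default, ks...)
    node = d
    for k in ks
        (node isa Dict && haskey(node, k)) || return default
        node = node[k]
    end
    return node
end

d = Dict("another key" => Dict("a" => "A"))

getdefault(d, "default", "another key", "a")     # "A"
getdefault(d, "default", "another key", "zzz")   # "default"

# For comparison, Base's get on a plain Dict takes the default last:
get(d, "mykey", "default")                       # "default"
```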

If your file holds a deeply nested data structure and you only want to work on part of it:

dictopen("/tmp/test") do a 
    a[] = Dict("some"=>1, "nested data" => Dict("a" => 1, "b" => 2))
    b = DictFile(a, "nested data")   #  e.g., you can pass b to other functions
    keys(b)                          #  ["a", "b"]
    b["c"] = 3 
    a[]                              #  Dict("some"=>1, 
                                     #  "nested data" => Dict("a" => 1, "b" => 2, "c" => 3))
end

Compacting

When fields are overwritten or explicitly deleted, HDF5 appends the new data to the file and unlinks the old data; the space occupied by the original data is not recovered. To reclaim it, compact the file from time to time. Compacting copies all data to a temporary file and replaces the original on success.

    DictFiles.compact("/tmp/test")
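The "copy to a temporary file, then replace the original on success" idea can be sketched with Base file functions alone. This is a generic illustration of the pattern on a text file, not the code behind DictFiles.compact. The function rewrite_via_tempfile is a hypothetical name.

```julia
# Generic "rewrite into a temporary file, then swap" pattern.
# Hypothetical helper; not how DictFiles.compact is necessarily implemented.
function rewrite_via_tempfile(transform, path::AbstractString)
    # create the temporary file next to the original
    tmp, io = mktemp(dirname(abspath(path)))
    try
        write(io, transform(read(path, String)))
        close(io)
        mv(tmp, path; force=true)   # replace the original only after a successful write
    catch
        close(io)
        rm(tmp; force=true)         # on failure, leave the original untouched
        rethrow()
    end
    return path
end

path = tempname()
write(path, "old data")
rewrite_via_tempfile(uppercase, path)
read(path, String)                  # "OLD DATA"
```

If the rewrite step throws, the original file is left as it was, which is the property the description of compacting above relies on.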

Contributing

I'd be very grateful for bug reports and feature suggestions. Please file an issue!
