
DhairyaLGandhi/IndexedTables.jl

 
 


IndexedTables.jl

IndexedTables provides tabular data structures where some of the columns form a sorted index. It provides the backend to JuliaDB, but can be used on its own for efficient in-memory data processing and analytics.

Data Structures

  • IndexedTables provides two table types, Table and NDSparse, which differ in how data is accessed.
  • There is no performance difference between the two types for operations such as selecting, filtering, and map/reduce.

First let's create some data to work with.

city = vcat(fill("New York", 3), fill("Boston", 3))

dates = repmat(Date(2016,7,6):Date(2016,7,8), 2)

values = [91, 89, 91, 95, 83, 76]

Table

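  • Data is accessed as a Vector of NamedTuples: indexing by row number, or iterating, yields one row at a time.
  • Rows are sorted by the primary key column(s), set with the pkey keyword argument.
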
julia> t1 = table(@NT(city = city, dates = dates, values = values); pkey = [:city, :dates])
Table with 6 rows, 3 columns:
city        dates       values
──────────────────────────────
"Boston"    2016-07-06  95
"Boston"    2016-07-07  83
"Boston"    2016-07-08  76
"New York"  2016-07-06  91
"New York"  2016-07-07  89
"New York"  2016-07-08  91

julia> t1[1]
(city = "Boston", dates = 2016-07-06, values = 95)

julia> first(t1)
(city = "Boston", dates = 2016-07-06, values = 95)
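The row access shown above, together with the selecting, filtering, and reduce operations mentioned earlier, can be sketched as follows. This is a minimal sketch, not the package's reference usage: it assumes Julia 1.x and a recent IndexedTables release, where plain named tuples and `repeat` stand in for the `@NT` macro and `repmat` used in this README. The variable `vals` is a name chosen here to avoid shadowing `Base.values`.

```julia
using Dates            # assumption: Julia 1.x, where Date lives in the Dates stdlib
using IndexedTables

# Same data as above, rebuilt so the block is self-contained.
city  = vcat(fill("New York", 3), fill("Boston", 3))
dates = repeat(collect(Date(2016, 7, 6):Day(1):Date(2016, 7, 8)), 2)
vals  = [91, 89, 91, 95, 83, 76]   # hypothetical name; the column is still called `values`

# Assumption: a plain named tuple replaces @NT(...) on Julia 1.x.
t1 = table((city = city, dates = dates, values = vals); pkey = [:city, :dates])

row = t1[1]                                # first row in pkey order, a NamedTuple
println(row.city, " ", row.values)

col   = select(t1, :values)                # one column, in sorted row order
hot   = filter(r -> r.values > 85, t1)     # keep rows whose value exceeds 85
total = reduce(+, t1; select = :values)    # sum over the values column

println(length(hot), " rows above 85; total = ", total)
```

Because the table is sorted by `pkey`, `col` follows the sorted row order (Boston first), not the order in which the data was supplied.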

NDSparse

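  • Data is accessed as an N-dimensional sparse array with arbitrary indexes: entries are looked up by index values rather than by row number.
  • Entries are sorted by the index columns, which are passed as the first argument to ndsparse.
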
julia> t2 = ndsparse(@NT(city=city, dates=dates), @NT(value=values))
2-d NDSparse with 6 values (1 field named tuples):
city        dates      │ value
───────────────────────┼──────
"Boston"    2016-07-06 │ 95
"Boston"    2016-07-07 │ 83
"Boston"    2016-07-08 │ 76
"New York"  2016-07-06 │ 91
"New York"  2016-07-07 │ 89
"New York"  2016-07-08 │ 91

julia> t2["Boston", Date(2016, 7, 6)]
(value = 95)

julia> first(t2)
(value = 95)
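Array-style access to an NDSparse can be sketched as below: a full set of index values returns one entry, while a colon in one dimension selects a slice. This is a hedged sketch that assumes Julia 1.x and a recent IndexedTables release (plain named tuples instead of `@NT`, `repeat` instead of `repmat`), and it assumes colon slicing behaves as it does for ordinary arrays. `vals`, `single`, and `boston` are names introduced here for illustration.

```julia
using Dates            # assumption: Julia 1.x, where Date lives in the Dates stdlib
using IndexedTables

city  = vcat(fill("New York", 3), fill("Boston", 3))
dates = repeat(collect(Date(2016, 7, 6):Day(1):Date(2016, 7, 8)), 2)
vals  = [91, 89, 91, 95, 83, 76]   # hypothetical name, avoids shadowing Base.values

# First argument: index columns. Second argument: value columns.
t2 = ndsparse((city = city, dates = dates), (value = vals,))

# Scalar lookup: one index value per dimension returns the stored entry.
single = t2["Boston", Date(2016, 7, 6)]
println(single.value)

# Slice: fix the city and keep every date (assumed to return an NDSparse).
boston = t2["Boston", :]
println(length(boston))
```

The lookup uses the sorted index, so the order in which the data was supplied does not matter.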

As with other multi-dimensional arrays, dimensions can be permuted to change the sort order:

julia> permutedims(t2, [2,1])
2-d NDSparse with 6 values (1 field named tuples):
dates       city       │ value
───────────────────────┼──────
2016-07-06  "Boston"   │ 95
2016-07-06  "New York" │ 91
2016-07-07  "Boston"   │ 83
2016-07-07  "New York" │ 89
2016-07-08  "Boston"   │ 76
2016-07-08  "New York" │ 91

Get started

For more information, check out the JuliaDB API Reference.
