JuliaDB docs | Build | Coverage |
---|---|---|
IndexedTables provides tabular data structures where some of the columns form a sorted index. It provides the backend to JuliaDB, but can be used on its own for efficient in-memory data processing and analytics.
- The two table types in IndexedTables differ in how data is accessed.
- There is no performance difference between table types for operations such as selecting, filtering, and map/reduce.
First let's create some data to work with.
city = vcat(fill("New York", 3), fill("Boston", 3))
dates = repmat(Date(2016,7,6):Date(2016,7,8), 2)
values = [91, 89, 91, 95, 83, 76]
- Data is accessed as a Vector of NamedTuples.
- Sorted by primary key(s),
pkey
.
julia> t1 = table(@NT(city = city, dates = dates, values = values); pkey = [:city, :dates])
Table with 6 rows, 3 columns:
city dates values
──────────────────────────────
"Boston" 2016-07-06 95
"Boston" 2016-07-07 83
"Boston" 2016-07-08 76
"New York" 2016-07-06 91
"New York" 2016-07-07 89
"New York" 2016-07-08 91
julia> t1[1]
(city = "Boston", dates = 2016-07-06, values = 95)
julia> first(t1)
(city = "Boston", dates = 2016-07-06, values = 95)
- Data is accessed as an N-dimensional sparse array with arbitrary indexes.
- Sorted by index variables (first argument).
julia> t2 = ndsparse(@NT(city=city, dates=dates), @NT(value=values))
2-d NDSparse with 6 values (1 field named tuples):
city dates │ value
───────────────────────┼──────
"Boston" 2016-07-06 │ 95
"Boston" 2016-07-07 │ 83
"Boston" 2016-07-08 │ 76
"New York" 2016-07-06 │ 91
"New York" 2016-07-07 │ 89
"New York" 2016-07-08 │ 91
julia> t2["Boston", Date(2016, 7, 6)]
(value = 95)
julia> first(t2)
(value = 95)
As with other multi-dimensional arrays, dimensions can be permuted to change the sort order:
julia> permutedims(t2, [2,1])
2-d NDSparse with 6 values (1 field named tuples):
dates city │ value
───────────────────────┼──────
2016-07-06 "Boston" │ 95
2016-07-06 "New York" │ 91
2016-07-07 "Boston" │ 83
2016-07-07 "New York" │ 89
2016-07-08 "Boston" │ 76
2016-07-08 "New York" │ 91
For more information, check out the JuliaDB API Reference.