Create #1

Merged
merged 19 commits into from Apr 16, 2019
Changes from 9 commits
8 changes: 6 additions & 2 deletions Project.toml
@@ -1,10 +1,14 @@
name = "NamedDims"
uuid = "356022a1-0364-5f58-8944-0da4b18d706f"
authors = ["Lyndon White <lyndon.white@invenialabs.co.uk>"]
authors = ["Invenia Technical Computing Corporation"]
version = "0.1.0"

[deps]
Statistics = "10745b16-79ce-11e8-11f9-7d13ad32a3b2"

[extras]
Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40"
SparseArrays = "2f01184e-e22b-5df5-ae63-d93ebab69eaf"

[targets]
test = ["Test"]
test = ["Test", "SparseArrays"]
2 changes: 2 additions & 0 deletions README.md
@@ -5,3 +5,5 @@
[![Build Status](https://travis-ci.com/invenia/NamedDims.jl.svg?branch=master)](https://travis-ci.com/invenia/NamedDims.jl)
[![Build Status](https://ci.appveyor.com/api/projects/status/github/invenia/NamedDims.jl?svg=true)](https://ci.appveyor.com/project/invenia/NamedDims-jl)
[![Codecov](https://codecov.io/gh/invenia/NamedDims.jl/branch/master/graph/badge.svg)](https://codecov.io/gh/invenia/NamedDims.jl)

`NamedDimsArray` is a zero-cost abstraction to add names to the dimensions of an array.
8 changes: 7 additions & 1 deletion src/NamedDims.jl
@@ -1,5 +1,11 @@
module NamedDims
using Base: @propagate_inbounds
using Statistics

greet() = print("Hello World!")
export NamedDimsArray, name2dim, dim_names
Member

Seems weird to only partially follow the _ convention.

Contributor

I prefer `a2b` naming only for Dicts, but I don't have a good alternative.

Although if `name2dim` is part of the public API, I'd like to try finding a better name :)

Is `dim[s]` or `dimension[s]` better or worse?

Member Author (@oxinabox, Apr 12, 2019)

so name_to_dim?
I am ok with that.

Member Author (@oxinabox, Apr 12, 2019)

get_dim?
dim?

Contributor

I quite like dims(names) and dims(names, name) ...but maybe it's scarily short?

plus NamedDims.dims goes nicely with NamedDims.names

edit: think dims better than dim :)

Member Author

We have `axes`; `dims` is only slightly shorter.

Member Author

Can't do `dims`;
it is a kwarg on the functions that want to call this.

e.g.
sum(xs::NamedDimsArray; dims) = sum(parent(xs); dims=dims(xs, dims))
does not work; and
sum(xs::NamedDimsArray; dims) = sum(parent(xs); dims=NamedDims.dims(xs, dims))
is ugly.
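
The shadowing problem raised in this comment can be reproduced in isolation. The sketch below is only an illustration: the module `Demo` and the functions `dims`, `shadowed`, and `qualified` are hypothetical names, not code from this PR.

```julia
module Demo

# Hypothetical lookup function that happens to be called `dims`.
dims(names::Tuple, name::Symbol) = findfirst(isequal(name), names)

# Inside this method `dims` refers to the keyword argument, so `dims(...)`
# tries to call whatever value was passed in (here a Symbol) and fails.
shadowed(names::Tuple; dims) = dims(names, dims)

# Qualifying the call with the module name reaches the function again,
# at the cost of the noisier spelling the comment objects to.
qualified(names::Tuple; dims) = Demo.dims(names, dims)

end # module

Demo.qualified((:x, :y); dims=:y)    # returns 2
# Demo.shadowed((:x, :y); dims=:y)   # throws a MethodError: a Symbol is not callable
```

Keeping a name such as `name2dim` that is never used as a keyword avoids the need for qualification altogether.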


include("name_core.jl")
include("wrapper_array.jl")
include("functions.jl")

end # module
40 changes: 40 additions & 0 deletions src/functions.jl
@@ -0,0 +1,40 @@

# 1 Arg
for (mod, funs) in (
(:Base, (
:sum, :prod, :count, :maximum, :minimum, :extrema, :cumsum, :cumprod,
:sort, :sort!)
),
(:Statistics, (:mean, :std, :var, :median, :cov, :cor)),
)
for fun in funs
@eval function $mod.$fun(a::NamedDimsArray; dims=:, kwargs...)
new_dims = name2dim(a, dims)
return $mod.$fun(parent(a); dims=new_dims, kwargs...)
end
end
end

# 1 arg before
for (mod, funs) in (
(:Base, (:mapslices,)),
)
for fun in funs
@eval function $mod.$fun(f, a::NamedDimsArray; dims=:, kwargs...)
new_dims = name2dim(a, dims)
return $mod.$fun(f, parent(a); dims=new_dims, kwargs...)
end
end
end

# 2 arg before
for (mod, funs) in (
(:Base, (:mapreduce,)),
)
for fun in funs
@eval function $mod.$fun(f1, f2, a::NamedDimsArray; dims=:, kwargs...)
new_dims = name2dim(a, dims)
return $mod.$fun(f1, f2, parent(a); dims=new_dims, kwargs...)
end
end
end
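
The three loops above share one pattern: generate a method per function with `@eval`, translate `dims` from names to numbers, and forward to the parent array. A minimal self-contained sketch of that pattern follows. `Labeled` and `dimnum` are hypothetical stand-ins for `NamedDimsArray` and `name2dim`, and only a few functions are covered.

```julia
# Toy wrapper that stores the dimension names in the type parameter `L`.
struct Labeled{L,A<:AbstractMatrix}
    data::A
end
Labeled(data::AbstractMatrix, names::NTuple{2,Symbol}) =
    Labeled{names,typeof(data)}(data)

# Names are translated to positions; integers and `:` pass through unchanged.
dimnum(::Labeled{L}, name::Symbol) where {L} = findfirst(isequal(name), L)
dimnum(::Labeled, dim::Union{Integer,Colon}) = dim

# One forwarding method is generated per listed function.
for fun in (:sum, :prod, :maximum)
    @eval function Base.$fun(a::Labeled; dims=:, kwargs...)
        return Base.$fun(a.data; dims=dimnum(a, dims), kwargs...)
    end
end

a = Labeled([10 20; 30 40], (:x, :y))
sum(a)            # 100
sum(a; dims=:x)   # same as sum(a; dims=1), i.e. [40 60]
```

Because the integer and `Colon` fallbacks return their input untouched, existing callers that pass `dims=1` keep working.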
97 changes: 97 additions & 0 deletions src/name_core.jl
@@ -0,0 +1,97 @@

"""
name2dim(dimnames, [name])

For `dimnames` being a tuple of names (symbols) for dimensions.
If called with just the tuple,
returns a named tuple, with each name mapped to a dimension.
e.g. `name2dim((:a, :b)) == (a=1, b=2)`.

If the second `name` argument is given, then the dimension corresponding to that `name`
is returned.
e.g. `name2dim((:a, :b), :b) == 2`
If that `name` is not found then `0` is returned.
"""
function name2dim(dimnames::Tuple)
# Note: This code is runnable at compile time if input is a constant
# If modified, make sure to recheck that it still can run at compile time
# e.g. via `@code_llvm (()->name2dim((:a, :b)))()` which should be very short
ndims = length(dimnames)
return NamedTuple{dimnames, NTuple{ndims, Int}}(1:ndims)
end

function name2dim(dimnames::Tuple, name::Symbol)
# Note: This code is runnable at compile time if inputs are constants
# If modified, make sure to recheck that it still can run at compile time
# e.g. via `@code_llvm (()->name2dim((:a, :b), :a))()` which should just say `return 1`
this_namemap = NamedTuple{(name,), Tuple{Int}}((0,)) # 0 is default we will overwrite
full_namemap = name2dim(dimnames)
dim = first(merge(this_namemap, full_namemap))
return dim
end
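
The lookup above relies on how `merge` treats named tuples: fields of the first argument keep their position, and values from the second argument overwrite them. A small sketch of that trick on its own, where `lookup` is a hypothetical helper name used only for illustration:

```julia
# Full name => position map, built the same way as the one-argument method.
dimnames = (:a, :b, :c)
full_map = NamedTuple{dimnames}((1, 2, 3))

function lookup(full_map::NamedTuple, name::Symbol)
    # Start with the requested name mapped to 0 ("not found").
    default = NamedTuple{(name,)}((0,))
    # `merge` keeps `name` as the first field and overwrites its value
    # if `full_map` also has that field.
    return first(merge(default, full_map))
end

lookup(full_map, :b)   # 2
lookup(full_map, :z)   # 0, since :z is not a dimension name
```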

function name2dim(dimnames::Tuple, names)
# This handles things like `(:x, :y)` or `[:x, :y]`
# or via the fallbacks `(1,2)`, or `1:5`
return map(name->name2dim(dimnames, name), names)
end

function name2dim(dimnames::Tuple, dim::Union{Integer, Colon})
# This is the fallback that allows `NamedDimsArray`s to have dimensions
# referred to by number. This is required to allow functions on `AbstractArray`s
# that use functions like `sum(xs; dims=2)` to continue to work without changes.
# `:` is the default for most methods that take `dims`
return dim
end


"""
default_inds(dimnames::Tuple)
This is the default value for all indexing expressions using the given dimnames.
Which is to say: take a full slice on everything.
"""
function default_inds(dimnames::Tuple)
# Note: This code is runnable at compile time if input is a constant
# If modified, make sure to recheck that it still can run at compile time
ndims = length(dimnames)
values = ntuple(_->Colon(), ndims)
return NamedTuple{dimnames, NTuple{ndims, Colon}}(values)
end


"""
order_named_inds(dimnames::Tuple; named_inds...)

Returns the values of the `named_inds`, sorted as per the order they appear in `dimnames`,
with any missing dimnames having their value set to `:`.
An error is thrown if any dimnames are given in `named_inds` that do not occur in `dimnames`.
"""
function order_named_inds(dimnames::Tuple; named_inds...)
# Note: This code is runnable at compile time if input is a constant
# If modified, make sure to recheck that it still can run at compile time
keys(named_inds) ⊆ dimnames || throw(
DimensionMismatch("Expected $(dimnames), got $(keys(named_inds))")
)

slice_everything = default_inds(dimnames)
full_named_inds = merge(slice_everything, named_inds)
inds = Tuple(full_named_inds)
end
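
As a rough standalone illustration of the ordering step, the sketch below combines the default-slice tuple and the keyword merge in one function. `ordered_inds` is a hypothetical name, and the body is a simplified restatement of the two functions above rather than a replacement for them.

```julia
function ordered_inds(dimnames::Tuple; named_inds...)
    # Reject any keyword that is not a known dimension name.
    issubset(keys(named_inds), dimnames) || throw(
        DimensionMismatch("Expected $(dimnames), got $(keys(named_inds))")
    )
    # Every dimension defaults to a full slice.
    defaults = NamedTuple{dimnames}(ntuple(_ -> Colon(), length(dimnames)))
    # Supplied values overwrite the defaults; field order stays that of `dimnames`.
    return Tuple(merge(defaults, named_inds))
end

ordered_inds((:x, :y); y=20, x=30)   # (30, 20)
ordered_inds((:x, :y); y=2)          # (:, 2)
```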

"""
determine_remaining_dim(dimnames::Tuple, inds)
Given a tuple of dimension names
and a tuple of index expressions, e.g. `(1, :, 1:3, [true, false])`,
determine which dimensions are not dropped.
Dimensions indexed with scalars are dropped.
"""
@generated function determine_remaining_dim(dimnames::Tuple, inds)
# TODO: This still allocates once, and it shouldn't have to
# See: #@btime (()->determine_remaining_dim((:a, :b, :c), (:,390,:)))()
ind_types = inds.parameters
kept_dims = findall(keep_dim_ind_type, ind_types)
keep_names = [:(getfield(dimnames, $ii)) for ii in kept_dims]
return Expr(:tuple, keep_names...)
end
keep_dim_ind_type(::Type{<:Integer}) = false
keep_dim_ind_type(::Any) = true
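
For readers less familiar with `@generated`, here is a hedged standalone sketch of the same idea: the generator sees only the types of the indices and emits a tuple expression that picks out the surviving names. `keeps_dim` and `remaining_names` are hypothetical names, and the helper is defined before the generated function so it is visible when the generator runs.

```julia
# Integer (scalar) indices drop their dimension; everything else keeps it.
keeps_dim(::Type{<:Integer}) = false
keeps_dim(::Any) = true

@generated function remaining_names(dimnames::Tuple, inds::Tuple)
    # Here `inds` is the tuple *type*, so its parameters are the index types.
    kept = Any[]
    for (i, T) in enumerate(inds.parameters)
        keeps_dim(T) && push!(kept, :(getfield(dimnames, $i)))
    end
    # The returned expression runs later with `dimnames` bound to the value.
    return Expr(:tuple, kept...)
end

remaining_names((:a, :b, :c), (10, :, 30))   # (:b,)
```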
115 changes: 115 additions & 0 deletions src/wrapper_array.jl
@@ -0,0 +1,115 @@
# `L` is for labels, it should be a `Tuple` of `Symbol`s
struct NamedDimsArray{L, T, N, A<:AbstractArray{T,N}} <: AbstractArray{T,N}
data::A
end

function NamedDimsArray{L}(orig::AbstractArray{T,N}) where {L, T, N}
if !(L isa NTuple{N, Symbol})
throw(ArgumentError(
"A $N dimensional array needs a $N-tuple of dimension names. Got: $L"
))
end
return NamedDimsArray{L, T, N, typeof(orig)}(orig)
end
function NamedDimsArray(orig::AbstractArray{T,N}, names::NTuple{N, Symbol}) where {T, N}
return NamedDimsArray{names}(orig)
end

parent_type(::Type{<:NamedDimsArray{L,T,N,A}}) where {L,T,N,A} = A
Base.parent(x::NamedDimsArray) = x.data


"""
dim_names(A)

Returns a tuple containing the names of all the dimensions of the array `A`.
"""
dim_names(::Type{<:NamedDimsArray{L}}) where L = L
dim_names(x::T) where T<:NamedDimsArray = dim_names(T)
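
Because the names live in the type parameter `L` rather than in a field, they can be read back from the type alone. The following self-contained sketch shows that mechanism with a toy wrapper; `Named` and `names_of` are hypothetical names standing in for `NamedDimsArray` and `dim_names`, and only the minimum of the array interface is implemented.

```julia
struct Named{L,T,N,A<:AbstractArray{T,N}} <: AbstractArray{T,N}
    data::A
end

function Named{L}(data::AbstractArray{T,N}) where {L,T,N}
    # The names must be a tuple with one Symbol per dimension.
    L isa NTuple{N,Symbol} || throw(ArgumentError("need $N dimension names, got $L"))
    return Named{L,T,N,typeof(data)}(data)
end

# Minimal array interface so the wrapper behaves like its parent.
Base.size(a::Named) = size(a.data)
Base.getindex(a::Named, i::Int...) = a.data[i...]

# The names are recovered by dispatching on the type parameter.
names_of(::Type{<:Named{L}}) where {L} = L
names_of(a::Named) = names_of(typeof(a))

a = Named{(:x, :y)}(zeros(2, 3))
names_of(a)   # (:x, :y)
```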


name2dim(a::NamedDimsArray{L}, name) where L = name2dim(L, name)



#############################
# AbstractArray Interface
# https://docs.julialang.org/en/v1/manual/interfaces/index.html#man-interface-array-1

## Minimal
Base.size(a::NamedDimsArray) = size(parent(a))
Base.size(a::NamedDimsArray, dim) = size(parent(a), name2dim(a, dim))


## optional
Base.IndexStyle(::Type{A}) where A<:NamedDimsArray = Base.IndexStyle(parent_type(A))

Base.length(a::NamedDimsArray) = length(parent(a))

Base.axes(a::NamedDimsArray) = axes(parent(a))
Base.axes(a::NamedDimsArray, dim) = axes(parent(a), name2dim(a, dim))


function Base.similar(a::NamedDimsArray{L}, args...) where L
return NamedDimsArray{L}(similar(parent(a), args...))
end


###############################
# kwargs indexing

"""
order_named_inds(A; named_inds...)

Returns the indices that have the names and values given by `named_inds`
sorted into the order expected for the dimension of the array `A`.
If any dimensions of `A` are not present in the named_inds,
then they are given the value `:`, for slicing

For example:
```
A = NamedDimsArray(rand(4,4), (:x, :y))
order_named_inds(A; y=10, x=13) == (13,10)
order_named_inds(A; x=2, y=1:3) == (2, 1:3)
order_named_inds(A; y=5) == (:, 5)
```

This provides the core indexed lookup for `getindex` and `setindex!` on the array `A`.
"""
order_named_inds(A::AbstractArray; named_inds...) = order_named_inds(dim_names(A); named_inds...)

###################
# getindex / view / dotview
# Note that `dotview` is undocumented but needed for making `a[x=2] .= 3` work

for f in (:getindex, :view, :dotview)
@eval begin
@propagate_inbounds function Base.$f(A::NamedDimsArray; named_inds...)
inds = order_named_inds(A; named_inds...)
return Base.$f(A, inds...)
end

@propagate_inbounds function Base.$f(a::NamedDimsArray, inds::Vararg{<:Integer})
# Easy scalar case, will just return the element
return Base.$f(parent(a), inds...)
end

@propagate_inbounds function Base.$f(a::NamedDimsArray, inds...)
# Some nonscalar case, will return an array, so need to give that names.
data = Base.$f(parent(a), inds...)
L = determine_remaining_dim(dim_names(a), inds)
return NamedDimsArray{L}(data)
end
end
end
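
The keyword methods above work because Julia lowers `A[x=2]` to a keyword call of `getindex`. A tiny sketch of that mechanism, independent of this package, follows. The `Grid` type and its `row`/`col` keywords are hypothetical and have fixed names, unlike the generic `named_inds...` used in the PR.

```julia
struct Grid
    data::Matrix{Int}
end

# Keyword-only getindex: unspecified dimensions default to a full slice.
Base.getindex(g::Grid; row=:, col=:) = g.data[row, col]

g = Grid([10 20; 30 40])
g[row=2]          # [30, 40]
g[row=1, col=2]   # 20
```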

############################################
# setindex!
@propagate_inbounds function Base.setindex!(A::NamedDimsArray, value; named_inds...)
inds = order_named_inds(A; named_inds...)
return setindex!(A, value, inds...)
end

@propagate_inbounds function Base.setindex!(a::NamedDimsArray, value, inds...)
return setindex!(parent(a), value, inds...)
end
32 changes: 32 additions & 0 deletions test/functions.jl
@@ -0,0 +1,32 @@
using NamedDims
using Test
using Statistics

@testset "sum" begin
nda = NamedDimsArray([10 20; 30 40], (:x, :y))

@test sum(nda) == 100
@test sum(nda; dims=:x) == [40 60]
@test sum(nda; dims=1) == [40 60]
end

@testset "mean" begin
nda = NamedDimsArray([10 20; 30 40], (:x, :y))

@test mean(nda) == 25
@test mean(nda; dims=:x) == [20 30]
@test mean(nda; dims=1) == [20 30]
end

@testset "mapslices" begin
nda = NamedDimsArray([10 20; 30 40], (:x, :y))

@test mapslices(join, nda; dims=:x) == ["1030" "2040"] == mapslices(join, nda; dims=1)
@test mapslices(join, nda; dims=:y) == reshape(["1020", "3040"], Val(2)) == mapslices(join, nda; dims=2)
end

@testset "mapreduce" begin
nda = NamedDimsArray([10 20; 31 40], (:x, :y))
@test mapreduce(isodd, |, nda; dims=:x) == [true false] == mapreduce(isodd, |, nda; dims=1)
@test mapreduce(isodd, |, nda; dims=:y) == [false true]' == mapreduce(isodd, |, nda; dims=2)
end
44 changes: 44 additions & 0 deletions test/name_core.jl
@@ -0,0 +1,44 @@
using NamedDims
using NamedDims: order_named_inds, determine_remaining_dim
using Test


@testset "name2dim" begin
@testset "get map only" begin
@test name2dim((:x, :y)) == (x=1, y=2)

manynames = Tuple(Symbol.('A':'z'))
namemap = name2dim(manynames)
@test keys(namemap) == manynames
@test values(namemap) == Tuple(1:length(manynames))
end
@testset "small case" begin
@test name2dim((:x, :y), :x)==1
@test name2dim((:x, :y), :y)==2
@test name2dim((:x, :y), :z)==0 # not found
end
@testset "large case that" begin
@test name2dim((:x, :y, :a, :b, :c, :d), :x)==1
@test name2dim((:x, :y, :a, :b, :c, :d), :a)==3
@test name2dim((:x, :y, :a, :b, :c, :d), :d)==6
@test name2dim((:x, :y, :a, :b, :c, :d), :z)==0 # not found
end
end


@testset "order_named_inds" begin
@test order_named_inds((:x,)) == (:,)
@test order_named_inds((:x,); x=2) == (2,)

@test order_named_inds((:x, :y,)) == (:,:)
@test order_named_inds((:x, :y); x=2) == (2, :)
@test order_named_inds((:x, :y); y=2, ) == (:, 2)
@test order_named_inds((:x, :y); y=20, x=30) == (30, 20)
@test order_named_inds((:x, :y); x=30, y=20) == (30, 20)
end

@testset "determine_remaining_dim" begin
@test determine_remaining_dim((:a, :b, :c), (10,20,30)) == tuple()
@test determine_remaining_dim((:a, :b, :c), (10,:,30)) == (:b,)
@test determine_remaining_dim((:a, :b, :c), (1:1, [true], [20])) == (:a, :b, :c)
end
4 changes: 4 additions & 0 deletions test/runtests.jl
@@ -3,4 +3,8 @@ using Test

@testset "NamedDims.jl" begin
# Write your own tests here.

include("name_core.jl")
include("wrapper_array.jl")
include("functions.jl")
end