Skip to content

Conversation

@xtalax
Copy link

@xtalax xtalax commented Aug 1, 2019

From the code:
selectslice(A::AbstractArray, dim::Integer, inds...)

Return a view in to the vector shaped slice of A along the dimension dim with all the other indicies set to inds in order.

Equivalent to view(A, inds[1], inds[2], ..., :, inds[dim], inds[dim+1], ...)

#Examples

julia> A = [1 2 3 4; 5 6 7 8]
2×4 Array{Int64,2}:
 1  2  3  4
 5  6  7  8

julia> selectslice(A, 2, 2)
4-element view(::Array{Int64,2}, 2, :) with eltype Int64:
 5
 6
 7
 8

#Current Benchmark

julia> @benchmark selectslice(a, 2, 3)
BenchmarkTools.Trial: 
  memory estimate:  256 bytes
  allocs estimate:  6
  --------------
  minimum time:     416.206 ns (0.00% GC)
  median time:      448.430 ns (0.00% GC)
  mean time:        523.305 ns (9.30% GC)
  maximum time:     27.160 μs (97.32% GC)
  --------------
  samples:          10000
  evals/sample:     199

Credit to Felix Kastner for helping to improve the performance
Looking for advice on how to improve performance, particularly avoiding [inds...] and other uses of splat.

@StefanKarpinski StefanKarpinski requested a review from mbauman August 1, 2019 13:47
@StefanKarpinski StefanKarpinski added arrays [a, r, r, a, y, s] feature Indicates new feature / enhancement requests forgetmenot labels Aug 1, 2019
@fkastner
Copy link
Contributor

fkastner commented Aug 1, 2019

I am still unsure whether this is an improvement.
In particular it doesn't seem very intuitive to have selectslice(A,2,2) instead of A[2,:].
Or for higher dimensional arrays selectslice(A,4,1,2,3,4) instead of A[1,2,3,:,4].

Maybe it would help if you could describe your/a usecase where only the function form can be used?

@mbauman
Copy link
Member

mbauman commented Aug 1, 2019

Interesting! Thanks for the contribution, and welcome!

Like @fkastner I'd love to see some concrete use-cases here and how they compare to plain old indexing. I'm not worried about the performance — if we decide we want this we can make it fast by using the same tricks selectdim uses.

@xtalax
Copy link
Author

xtalax commented Aug 1, 2019

I am trying to write an N dimensional form of one of my utility functions, shown here in the 3d case, which multiplies every vector like strip of an Array with a matrix like object, along a certain dimension, and stores the result in a new array.

@inline function slicemul(A::AbstractMatrix, u::AbstractArray{T, 3}, dim::Integer) where {T, B<:AtomicBC{T}}
    s = size(u)
    out = similar(u)
    if dim == 1
        for j in 1:s[3]
            for i in 1:s[2]
                out[:,i,j] = A*u[:,i,j]
            end
        end
    elseif dim == 2
        for j in 1:s[3]
            for i in 1:s[1]
                out[i,:,j] = A*u[i,:,j]
            end
        end
    elseif dim == 3
        for j in 1:s[2]
            for i in 1:s[1]
                out[i,j,:] = A*u[i,j,:]
            end
        end
    else
        throw("Dim greater than 3 not supported!")
    end
    return out
end

I also have a version where the multiplying matrix is different for every individual strip.
Using Base.Cartesian, and a dummy array type, an N dimensional slicemul can be written:

import Base.Cartesian.@nloops
import Base.Cartesian.@ntuple

struct SizedArray{N} #Create a dummy array type to setup the ranges for @nloops
    size::NTuple{N, Int64}
end
size(A::SizedArray) = A.size

@generated function _slicemul(A::AbstractMatrix, u::AbstractArray{T,N}, dim::Int, iterable::SizedArray{M}) where {T,N,M}
    quote
        out = similar(u)
        @nloops $M i iterable begin
            j = @ntuple $M i
            selectslice(out, dim, j...) .= A*selectslice(u, dim, j...)
        end
        return out
    end
end

function slicemul(A::AbstractMatrix, u::AbstractArray{T,N}, dim::Int) where {T,N}
    s̄ = size(u)[setdiff(1:N, dim)]
    iterable = SizedArray{N-1}(s̄)
    return _slicemul(A, u, dim, iterable)
end

I could be wrong, but I don't think that there is any other way of doing this at the moment, as there is a need to create a view in to two different arrays, so eachslice can't be used

Co-Authored-By: Felix Kastner <kastner.felix@gmail.com>
@mbauman
Copy link
Member

mbauman commented Aug 1, 2019

An, nice, thanks for that example. I think the way I'd write that is with CartesianIndices:

pre = axes(u)[1:dim-1]
post = axes(u)[dim+1:end]
for J in CartesianIndices(post)
    for I in CartesianIndices(pre)
        out[I, :, J] = A*u[I,:,J]
    end
end

Now, the annoying thing is that the computation of pre and post is type-unstable — it's essentially the same performance problem you're running into here. The way around it is by moving the loop inside a @noinline'd function that takes the pre-computed CartesianIndices as its input.

If you've not seen this before, you might find this blog post interesting — it uses the above approach and gives many more details. https://julialang.org/blog/2016/02/iteration

That's not to say that this isn't a handy utility, but I think the hard part is constructing the iteration space. You still need to do that for selectslice.

@xtalax
Copy link
Author

xtalax commented Aug 2, 2019

Thanks! I have seen CartesianIndices but had not really understood how to use them until now.
That works amazingly - just wondering why the type instability issue is mitigated by putting the loop in a @noinline

@fkastner
Copy link
Contributor

fkastner commented Aug 2, 2019

For a single dimension I think this is what @mbauman meant:

selectslice(A::AbstractArray, dim, inds...) = _selectslice(A,dim,inds[1:dim-1], inds[dim:end])
@noinline function _selectslice(A::AbstractArray{T,N}, dim, pre, post) where {T,N}
    dim >= 1 || throw(ArgumentError("dimension must be ≥ 1, got $dim"))
    dim <= N || throw(ArgumentError("dimension must be ≤ ndims(A), got $dim"))
    length(pre)+length(post) == (N-1) || throw(ArgumentError("there must be ndims(A)-1 inds, got $(length(pre)+length(post))"))
    view(A, pre..., :, post...)
end

For me this is twice as fast as the current code in this PR.

I'm currently trying to come up with a multi-dimensional version of this, that accepts a tuple of dimensions along which the slice should be taken, but a) it's slower and b) it doesn't yet work for some edge cases.
Is this a desired feature? It would/could work like this:

julia> d = [(i,j,k,l) for i in 1:3, j = 1:4, k = 1:5, l=1:6]
3×4×5×6 Array{NTuple{4,Int64},4}:
[...]

julia> selectslice(d,2,1,dims=(2,4))
4×6 view(::Array{NTuple{4,Int64},5}, 2, :, 1, :, 1) with eltype NTuple{4,Int64}:
 (2, 1, 1, 1)  (2, 1, 1, 2)  (2, 1, 1, 3)  (2, 1, 1, 4)  (2, 1, 1, 5)  (2, 1, 1, 6)
 (2, 2, 1, 1)  (2, 2, 1, 2)  (2, 2, 1, 3)  (2, 2, 1, 4)  (2, 2, 1, 5)  (2, 2, 1, 6)
 (2, 3, 1, 1)  (2, 3, 1, 2)  (2, 3, 1, 3)  (2, 3, 1, 4)  (2, 3, 1, 5)  (2, 3, 1, 6)
 (2, 4, 1, 1)  (2, 4, 1, 2)  (2, 4, 1, 3)  (2, 4, 1, 4)  (2, 4, 1, 5)  (2, 4, 1, 6)

This could also be the basis for a function that "hides" some dimension of the array, i.e. for the above example it would return an AbstractArray of size 3x5 where each element is a 4x6 view.

@fkastner
Copy link
Contributor

fkastner commented Aug 2, 2019

I think my implementation works now. It uses a generated function to compute the indices.
Performance tips are welcome (as always).

@generated function slice_inds(dim::T, inds::NTuple{M,T}) where {M,T<:Integer}
    return :((inds[1:dim-1]..., Colon(), inds[dim:end]...))#::NTuple{$(M+1),Union{Colon,$T}})
end
@generated function slice_inds(dim::NTuple{N,T}, inds::NTuple{M,T}) where {N,M,T<:Integer,S<:Integer}
    N == 0 && return :(inds)#::NTuple{$(N+M),Union{Colon,$T}})

    args = Vector{Expr}(undef,2*N+1)
    args[1] = :(inds[1:dim[1]-1]...)
    for i=1:N-1
        args[2*i] = :(Colon())
        args[2*i+1] = :(inds[dim[$i]-$(i-1):dim[$(i+1)]-$(i+1)]...)
    end
    args[2*N] = :(Colon())
    args[2*N+1] = :(inds[dim[$(N)]-$(N-1):end]...)
    ex = :(tuple($(args...)))#::NTuple{$(N+M),Union{Colon,$T}})
end

selectslice_nd(A::AbstractArray, inds...; dims) = _selectslice_nd(A, dims, slice_inds(dims,inds))
@noinline function _selectslice_nd(A, dims, inds)
    all(dims .>= 1) || throw(ArgumentError("dimensions must be ≥ 1, got $dims"))
    all(dims .<= ndims(A)) || throw(ArgumentError("dimensions must be ≤ ndims(A), got $dims"))
    length(inds) == ndims(A) || throw(ArgumentError("there must be ndims(A) indices, got $(length(inds))"))
    view(A, inds...)
end

Example:

julia> d = [(i,j,k,l) for i in 1:3, j = 1:4, k = 1:5, l=1:6]
3×4×5×6 Array{NTuple{4,Int64},4}:
[...]

julia> selectslice(d,3,2,1,4) == selectslice_nd(d,2,1,4, dims=3) == d[2,1,:,4]
true

julia> @btime selectslice($d,3,2,1,4); @btime selectslice_nd($d,2,1,4, dims=3);
  751.151 ns (8 allocations: 400 bytes)
  976.385 ns (7 allocations: 368 bytes)

julia> selectslice_nd(d,1,4, dims=(1,3)) == d[:,1,:,4]
true

julia> @btime selectslice_nd($d,1,4, dims=(1,3));
  921.706 ns (7 allocations: 352 bytes)

@mbauman
Copy link
Member

mbauman commented Aug 2, 2019

just wondering why the type instability issue is mitigated by putting the loop in a @noinline

A type instability means that Julia isn't able to predict what type something is going to be — and that means that it has to use pessimistic storage for it since it doesn't know how big it'll be and it needs to use dynamic dispatch to look up what method to call. Where it gets really bad is that Julia also knows nothing (ahead of time) about what its downstream functions will return — thus needing lots of pessimistic storage (boxes) and lots of subsequent dynamic dispatch... possibly on every single iteration of your loops!

Once you're inside a new function, Julia has now done the work to figure out what the types were and is able to generate super-fast efficient code. So we pay a small cost for the lookup once, and then go into that @noinline'd function for the remainder of the computation where everything is known and optimized. Notably, splatting is free if you're splatting a tuple and Julia knows how many elements there are in the tuple.

@JeffBezanson
Copy link
Member

This should accept a dims tuple to match e.g. eachslice (that function also currently only allows 1 dimension, but that's ok).

Copy link
Member

@vtjnash vtjnash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JeffBezanson is this what you meant?

Comment on lines +152 to +153
@noinline function selectslice(A::AbstractArray{T,N}, dim, inds...) where {T,N}
dim >= 1 || throw(ArgumentError("dimension must be ≥ 1, got $dim"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
@noinline function selectslice(A::AbstractArray{T,N}, dim, inds...) where {T,N}
dim >= 1 || throw(ArgumentError("dimension must be ≥ 1, got $dim"))
@noinline function selectslice(A::AbstractArray{T,N}, inds...; dims) where {T, N}
length(dims) == 1 || throw(ArgumentError("only single dimensions are supported"))
dim = first(dims)
dim >= 1 || throw(ArgumentError("dimension must be ≥ 1, got $dim"))

end

"""
selectslice(A::AbstractArray, dim::Integer, inds...)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
selectslice(A::AbstractArray, dim::Integer, inds...)
selectslice(A::AbstractArray, inds...; dims)

"""
selectslice(A::AbstractArray, dim::Integer, inds...)
Return a view in to the vector shaped slice of `A` along the dimension `dim` with all the other indices set to `inds` in order.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Return a view in to the vector shaped slice of `A` along the dimension `dim` with all the other indices set to `inds` in order.
Return a view into the vector shaped slice of `A` along the dimensions `dims`
with all the other indices set to `inds` in order. (Only a single dimension in
`dims` is currently supported.)

@vtjnash vtjnash added the needs tests Unit tests are required for this change label Oct 28, 2020
@fkastner
Copy link
Contributor

I think with the introduction of multidimensional eachslice this PR is now obsolete.

Using the examples from above we now have:

julia> A = [1 2 3 4; 5 6 7 8]
2×4 Matrix{Int64}:
 1  2  3  4
 5  6  7  8

julia> selectslice(A, 2, 2) == eachslice(A, dims=1)[2]
true

as well as

julia> d = [(i,j,k,l) for i in 1:3, j = 1:4, k = 1:5, l=1:6]
3×4×5×6 Array{NTuple{4,Int64},4}:
[...]

julia> selectslice_nd(d, 1, 4, dims=(1,3)) == eachslice(d, dims=(2,4))[1,4]
true

Additionally eachslice is about 6x faster on my computer (60x for the multidimensional version).
The only difference is the specification of dims. While selectslice leaves the dimensions dims intact, eachslice slices along those dimensions.
This difference could be handled by an invdims keyword argument as also recently proposed in #47377.

@mbauman
Copy link
Member

mbauman commented Dec 19, 2023

Well, thanks for proposing this, @xtalax! I'm sorry your PR here didn't land, but I'm glad we now have the functionality you were after (albeit backwards). It's perhaps also worth noting that the tuple slicing syntaxes I had written above are also now type stable when dim is a constant.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

arrays [a, r, r, a, y, s] feature Indicates new feature / enhancement requests needs tests Unit tests are required for this change

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants