Add code that gives a nice Array interface to a vector of arrays #2

gabrielgellner · 2017-04-11T19:12:18Z

Let me know what you need. My gitfu is weak.

coveralls · 2017-04-11T19:53:14Z

Coverage increased (+9.0%) to 77.273% when pulling e02800f on gabrielgellner:vector_of_array into 0c3431d on ChrisRackauckas:master.

codecov-io · 2017-04-11T19:54:34Z

Codecov Report

Merging #2 into master will increase coverage by 14.9%.
The diff coverage is 92%.

@@            Coverage Diff             @@
##           master       #2      +/-   ##
==========================================
+ Coverage   62.36%   77.27%   +14.9%     
==========================================
  Files           2        2              
  Lines          93       66      -27     
==========================================
- Hits           58       51       -7     
+ Misses         35       15      -20

Impacted Files                Coverage Δ
src/RecursiveArrayTools.jl    68.29% <ø> (ø)
src/vector_of_array.jl        92% <92%> (ø)
src/array_partition.jl
src/utils.jl

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 9547070...3807a3c.

ChrisRackauckas · 2017-04-13T15:13:52Z

Would it be difficult to check what the cost of the ragged flag is? I am hoping it's essentially zero, in which case that's awesome! If so, then let's make an is_ragged(x) function, so that the DiffEq solution can overload it to look at sol.u.

What happens if you just put this on a Vector{Float64}? Curious. I don't think we'll need this, but I wonder what it does.

gabrielgellner · 2017-04-13T15:15:51Z

Yeah I wanted to benchmark this. Do you have any examples of how I might best do this? Should I just make a couple of different sized arrays and check out the overhead of doing indexing (the part that scares me the most)?

ChrisRackauckas · 2017-04-13T15:18:44Z

src/vector_of_array.jl

+Base.sizehint!{T, N}(S::AbstractVectorOfArray{T, N}, i) = sizehint!(S.data, i)
+
+function Base.push!{T, N}(S::AbstractVectorOfArray{T, N}, new_item::AbstractArray)
+  if S.dims[1:(end - 1)] == size(new_item)


if S.ragged && S.dims[1] == size(new_item)

Wait, actually, is this for the non-ragged case? I don't quite understand what's going on here.

We update the dims attribute if the new item matches the current shape of the other elements (we drop the last entry of dims, as this is the length of the containing vector). Otherwise we make it ragged. But you have spotted a bug: I need to check whether it has already become ragged, in which case I don't need to do any more checking. Will fix this up.
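The push! logic described in this thread, including the "already ragged" check, could look roughly like the sketch below. It is only an illustration: the type name RaggedTrackingVoA is hypothetical, dims is stored as a Vector{Int} (as suggested later in the review), and it uses current Julia syntax (mutable struct) rather than the type keyword from the diff.

```julia
# Hypothetical sketch in modern Julia syntax; not the PR's actual code.
mutable struct RaggedTrackingVoA{A<:AbstractVector}
    data::A
    dims::Vector{Int}   # shape of one element, followed by the number of elements
    ragged::Bool
end

# Start from a non-empty vector of equally shaped arrays.
RaggedTrackingVoA(data::AbstractVector) =
    RaggedTrackingVoA(data, [size(data[1])..., length(data)], false)

function Base.push!(S::RaggedTrackingVoA, new_item::AbstractArray)
    if !S.ragged                                   # once ragged, skip all shape checks
        if Tuple(S.dims[1:end-1]) == size(new_item)
            S.dims[end] += 1                       # same shape: only the container grows
        else
            S.ragged = true                        # shape mismatch: stop tracking dims
        end
    end
    push!(S.data, new_item)
    return S
end

S = RaggedTrackingVoA([[1, 2], [3, 4]])
push!(S, [5, 6])       # matches the existing shape, dims becomes [2, 3]
push!(S, [7, 8, 9])    # different shape, S.ragged becomes true
push!(S, [10, 11])     # no further bookkeeping once ragged
```

Note that the comparison converts the dims slice to a Tuple first, since a Vector{Int} never compares equal to the tuple returned by size.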

ChrisRackauckas · 2017-04-13T15:19:45Z

src/vector_of_array.jl

+# Based on code from M. Bauman Stackexchange answer + Gitter discussion
+type VectorOfArray{T, N, A} <: AbstractVectorOfArray{T, N}
+  data::A # A <: AbstractVector{<: AbstractArray{T, N - 1}}
+  dims::NTuple{N, Int}


Why is this a tuple if it's changing with each push?

Would it be better as a Vector{Int}?

Yeah, I will make that change. I think I originally did this just to make it more Array-like, but it clearly doesn't make sense since it is not static.

The only issue is: can I make the field read-only? It might cause strange issues if the user mutates the dims field.

Don't worry about that. If a user does that, they deserve problems.

ChrisRackauckas · 2017-04-13T16:31:01Z

It would be good to know the numbers for the overhead of indexing, but since there really isn't much we can do about it (by design), it's more for understanding when someone should convert to a matrix/tensor.

There are two things here. Speed of building and speed of indexing. Speed of building is first. I plan to swap out the current array that's used in each solver with a VectorOfArray (currently it's a Vector{Array{...}}). The solvers get values and then iteratively push!. So push! is crucial to stay at the same speed.

But then next is indexing. The solvers and event handlers only actually use the linear index, and that should stay the same speed because that's just passing through to indexing the vector.

The more general indexing is only used on the user side. It's a convenience, and so is the function that converts to a matrix (maybe we should have it just be full instead of this new function?). It would be good to know exactly what the difference is when doing operations like this, but since it's not in the main routine we have more leeway here.
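A rough way to compare the two hot paths mentioned here (push! while building, and linear indexing afterwards) against a plain Vector{Vector{Float64}} is sketched below. The wrapper ThinVoA and the helper functions are hypothetical stand-ins, and @elapsed from Base gives only a coarse single-run timing; a dedicated benchmarking package would give more reliable numbers.

```julia
# Hypothetical thin wrapper: push! and linear indexing pass straight through.
struct ThinVoA{A<:AbstractVector}
    u::A
end
Base.push!(VA::ThinVoA, x) = (push!(VA.u, x); VA)
Base.getindex(VA::ThinVoA, i::Int) = VA.u[i]
Base.length(VA::ThinVoA) = length(VA.u)

# Mimics a solver loop that pushes one state vector per step.
function fill_steps!(container, n)
    for i in 1:n
        push!(container, fill(Float64(i), 3))
    end
    return container
end

# Mimics code that only uses the linear index.
function sum_first_components(container)
    total = 0.0
    for i in 1:length(container)
        total += container[i][1]
    end
    return total
end

plain   = fill_steps!(Vector{Vector{Float64}}(), 1_000)
wrapped = fill_steps!(ThinVoA(Vector{Vector{Float64}}()), 1_000)

# Warm up first so that compilation time is not included in the timings.
sum_first_components(plain)
sum_first_components(wrapped)

t_plain   = @elapsed sum_first_components(plain)
t_wrapped = @elapsed sum_first_components(wrapped)
println("plain: ", t_plain, " s, wrapped: ", t_wrapped, " s")
```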

gabrielgellner · 2017-04-13T18:07:03Z

The timings you want make a lot of sense. I will get this rigged up, as it will be really good to know.

ChrisRackauckas · 2017-04-18T05:02:05Z

I took a good look at this. I'm not sure that the extra fields are really worthwhile. The ragged field is only used once:

S.ragged && throw(BoundsError("A ragged VectorOfArray does not support Cartesian indexing"))

But that's unnecessary: in many cases a Cartesian index still makes sense. For example, with this you can't do a[1,:] to get the first component through time, which makes sense for any ragged array (with standard indices). I think it would be better to just let the data throw the out-of-bounds error if the user hits something that it can't handle. Because of this, I think the ragged checks and the saving of S.ragged can all go, since that's the only place it's actually used.

But that also means you can just take dims as size(S) == size(S.data), since that was just to support the ragged stuff. So I think you only need S.data.

And once you do this, it's a zero-overhead abstraction too. It would still be interesting to test the cost of indexing, but that would be a structural cost and not something you could improve at all.

Even after you trim all of this back, I don't think you hit any of the functionality in the tests, which shows that it wasn't essential anyways.

I rebased you to the current RecursiveArrayTools.jl since a bit has changed here.
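To illustrate the point about dropping the ragged check, here is a minimal sketch of a wrapper that stores only the data and lets the inner arrays raise the bounds error. The name DataOnlyVoA and the two getindex methods are assumptions made for this example, not the code in the PR.

```julia
# Hypothetical sketch: no dims or ragged fields, only the underlying data.
struct DataOnlyVoA{A<:AbstractVector}
    data::A
end

# Component index first, container ("time") index last.
Base.getindex(VA::DataOnlyVoA, i::Int, j::Int) = VA.data[j][i]
# First-index slice: the i-th component of every element.
Base.getindex(VA::DataOnlyVoA, i::Int, ::Colon) = [v[i] for v in VA.data]

ragged = DataOnlyVoA([[1, 2], [3, 4, 5], [6]])

first_component = ragged[1, :]   # valid even though the elements differ in length
single_entry    = ragged[3, 2]   # third component of the second element

# ragged[2, 3] is not valid: the third element has only one component,
# so the inner vector itself throws a BoundsError.
```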

gabrielgellner · 2017-04-18T15:14:17Z

Okay this Friday I will look over this. I really hope that removing ragged and dims makes everything awesome, which makes sense! I never liked having the ragged check inside what would be a very commonly used indexing operation (at the user side).

gabrielgellner · 2017-04-21T21:24:28Z

When you did your test did you remove the subtyping from AbstractArray? I am getting a bunch of issues from the dims not really being what we would want, since size(a.data) gives the container vector dimension, but is not really what we would want for something like [[1, 2], [3, 4]] which should be like a matrix.

ChrisRackauckas · 2017-04-21T23:48:54Z

When you did your test did you remove the subtyping from AbstractArray? I am getting a bunch of issues from the dims not really being what we would want, since size(a.data) gives the container vector dimension, but is not really what we would want for something like [[1, 2], [3, 4]] which should be like a matrix.

Why not just define dims? When it's subtyped as AbstractArray{T,N}, is that actually inferable? I would think the computation on the N would make it type unstable to create these when it's subtyped as AbstractArray, and since you're overloading the indexing functions anyways, it may not make sense to subtype that.
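The mismatch being discussed can be seen directly: size on the container reports only the number of elements, while the matrix-like view needs the element shape prepended. The helper matrix_like_size below is a hypothetical name used only for this sketch, and it assumes all elements share the shape of the first one.

```julia
nested = [[1, 2], [3, 4]]

# Size of the container vector alone.
container_size = size(nested)

# Hypothetical helper: shape of one element followed by the number of elements.
matrix_like_size(v) = (size(v[1])..., length(v))

as_matrix_size = matrix_like_size(nested)

# The same rule extends to a vector of matrices, giving a 3-dimensional shape.
as_tensor_size = matrix_like_size([zeros(2, 3) for _ in 1:4])
```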

ChrisRackauckas · 2017-04-22T07:15:23Z

@gabrielgellner will you be available tomorrow? I would like to work through this and finish this up over the weekend.

gabrielgellner · 2017-04-22T13:09:07Z

Saturday is my wife's birthday. But I plan to power through this on Sunday if that works.

ChrisRackauckas · 2017-04-22T14:09:02Z

Sounds good.

Remove ragged field, and go back to simple AbstractArray interface.

ChrisRackauckas · 2017-04-26T00:02:30Z

src/vector_of_array.jl

-    @boundscheck checkbounds(S, I...) # is this needed?
-    S.data[I[end]][Base.front(I)...]
+@inline function Base.getindex{T, N}(VA::VectorOfArray{T, N}, I::Vararg{Int, N})
+    @boundscheck checkbounds(VA, I...) # is this needed?


I don't think this is needed

ChrisRackauckas · 2017-04-26T00:05:26Z

src/vector_of_array.jl

-    #@assert all(size(vec[1]) == size(v) for v in vec)
-    VectorOfArray(vec, (size(vec[1])..., length(vec)))
-end
+VectorOfArray{T, N}(vec::AbstractVector{T}, dims::NTuple{N}) = VectorOfArray{eltype(T), N, typeof(vec)}(vec)


Is this pure? Can this be inferred?

ChrisRackauckas · 2017-04-26T00:10:45Z

type VectorOfArray{T, N, A} <: AbstractArray{T, N}
    data::A
end
VectorOfArray{T, N}(vec::AbstractVector{T}, dims::NTuple{N}) = VectorOfArray{eltype(T), N, typeof(vec)}(vec)
VectorOfArray(vec::AbstractVector) = VectorOfArray(vec, (size(vec[1])..., length(vec)))

A = rand(4)
@code_warntype VectorOfArray(A)


Variables:
  #self#::Type{VectorOfArray}
  vec::Array{Float64,1}

Body:
  begin 
      (Base.arrayref)(vec::Array{Float64,1},1)::Float64
      return $(Expr(:new, VectorOfArray{Float64,1,Array{Float64,1}}, :(vec)))
  end::VectorOfArray{Float64,1,Array{Float64,1}}

Great! That infers well. I think you cracked the code for how to make it type-inferably <:AbstractArray!
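For readers on current Julia versions, the same idea can be sketched with struct and where syntax and checked with Test.@inferred. This is an adaptation written for illustration, not the code merged in the PR: the name SketchVoA is hypothetical, and the size and getindex methods are minimal assumptions needed to make the type usable as an AbstractArray.

```julia
using Test  # standard library; provides @inferred

# Hypothetical modern-syntax analogue of the type shown above.
struct SketchVoA{T, N, A} <: AbstractArray{T, N}
    u::A   # expected: A <: AbstractVector{<:AbstractArray{T, N - 1}}
end

# N comes from the length of the dims tuple, so it is known from the types alone.
SketchVoA(vec::AbstractVector{T}, dims::NTuple{N, Int}) where {T, N} =
    SketchVoA{eltype(T), N, typeof(vec)}(vec)
SketchVoA(vec::AbstractVector) = SketchVoA(vec, (size(vec[1])..., length(vec)))

# Minimal AbstractArray interface: the last index selects the element of u.
Base.size(VA::SketchVoA) = (size(VA.u[1])..., length(VA.u))
Base.getindex(VA::SketchVoA{T, N}, I::Vararg{Int, N}) where {T, N} =
    VA.u[I[end]][Base.front(I)...]

# @inferred throws if the return type of the constructor call cannot be inferred.
va = @inferred SketchVoA([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
```

The dims tuple is used only to carry N at the type level, which mirrors the constructor in the snippet above.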

ChrisRackauckas · 2017-04-26T00:25:38Z

This is looking really good. I think we are close to done.

TODO: should we redefine length to be over the VA.data? Currently it is the number of total elements

What do you think? From a quick search over DiffEq, it seems that re-defining length wouldn't be too disruptive (it was defined before on sol as the length of the underlying vector). What do you think is more useful? I think it might help if you're iterating over the linear index, because then the indices are 1:length(sol).

Other thing: this should all be written on some AbstractVectorOfArray instead of on the concrete type. We should probably standardize this all to use .u. Then the DiffEqVectorOfArray <: AbstractVectorOfArray with .u and .t would immediately work. The next question would be: should DESolution <: AbstractVectorOfArray as well? Or should it just pass indices?

gabrielgellner · 2017-04-26T00:36:01Z

My gut feeling is that it should be over the container vector since, as you suggest, I think of length as being associated with the linear index, whereas size is over the Cartesian view. In many ways I see what we are doing as having two parts:

1. We are making a nice growable array, like you did originally, that has a convenient getindex overload.
2. We are making iteration, linear indexing, etc. operate over the collection of subarrays.

I really think these two things together are super convenient and nice.

Also happy to make this all an Abstract type, though I don't have a great feeling over whether the diffeq solution should be a subtype. I guess it is likely a VectorOfArray with extra fields so it does make conceptual sense to me.
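The split described here (length and linear indexing over the container, size and multi-index access over the Cartesian view) can be sketched as follows. LenSketchVoA is a hypothetical name, and the type deliberately does not subtype AbstractArray so that the two views stay easy to tell apart in this example.

```julia
# Hypothetical sketch of the two views over the same data.
struct LenSketchVoA{A<:AbstractVector}
    u::A
end

# Container view: length, linear indexing, and iteration go over the subarrays.
Base.length(VA::LenSketchVoA) = length(VA.u)
Base.getindex(VA::LenSketchVoA, i::Int) = VA.u[i]
Base.iterate(VA::LenSketchVoA, state...) = iterate(VA.u, state...)

# Cartesian view: size and two-index access treat the data like a matrix.
Base.size(VA::LenSketchVoA) = (size(VA.u[1])..., length(VA.u))
Base.getindex(VA::LenSketchVoA, i::Int, j::Int) = VA.u[j][i]

sol = LenSketchVoA([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

# Iterating over the linear index visits one subarray per step.
steps = [sol[i] for i in 1:length(sol)]
```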

gabrielgellner · 2017-04-26T00:39:45Z

One other thing I wanted feedback on: when the array is not ragged we get sweet printing, just as if it were a dense array. Once it is ragged, the printing becomes meaningless. As a result, I was planning on writing our own print method that just gives the standard array-of-arrays view. Make sense?

ChrisRackauckas · 2017-04-26T00:40:55Z

That makes sense to me.
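One possible shape for such a print method is sketched below: a Base.show overload on a concrete wrapper type that prints each subarray on its own line. The type name ShowSketchVoA and the exact output layout are assumptions made for illustration; the thread does not fix a format.

```julia
# Hypothetical sketch of an array-of-arrays style display.
struct ShowSketchVoA{A<:AbstractVector}
    u::A
end

function Base.show(io::IO, VA::ShowSketchVoA)
    print(io, "ShowSketchVoA with ", length(VA.u), " elements:")
    for v in VA.u
        print(io, "\n ", v)   # one subarray per line, ragged or not
    end
end

# sprint captures what show writes, which is convenient for inspection.
printed = sprint(show, ShowSketchVoA([[1, 2], [3, 4, 5]]))
println(printed)
```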

gabrielgellner · 2017-04-26T01:42:15Z

Okay I added the length behavior, and moved to use Abstract base type.

ChrisRackauckas · 2017-04-26T02:18:19Z

So just the show methods now?

gabrielgellner · 2017-04-26T02:18:50Z

Yup. Working on that now, and then should be ready to merge.

ChrisRackauckas · 2017-04-26T14:51:22Z

I think the show methods can come in another PR. I like short PRs better anyways. I changed data to u and will merge when the tests are complete, then test what happens if I subtype DESolution as an AbstractVectorOfArray. I think that will work just fine, and thus will expand that whole solution interface. Because of this, I'm not sure show should be overridden on the abstract type, so I'd just put it on the concrete type.

gabrielgellner · 2017-04-26T14:53:34Z

Nice. I will keep looking into how the display machinery works, and get this ready for a future PR. Once this is merged what do you need from me next? Should I start working on adding an interp output PR?

ChrisRackauckas · 2017-04-26T14:54:22Z

Should I start working on adding an interp output PR?

I think that's the way to go next.

Add code that gives an nice Array interface to a vector of arrays

e02800f

ChrisRackauckas reviewed Apr 13, 2017

ChrisRackauckas added 2 commits April 17, 2017 21:04

Merge branch 'master' into vector_of_array

724071c

complete merge

1c865a2

ChrisRackauckas added 2 commits April 18, 2017 16:25

update README

b72c4b4

fix for ode units

6250057

gabrielgellner added 3 commits April 24, 2017 17:01

Remove ragged field, and go back to simple AbstractArray interface.

8240745

Fixup some of the linear indexing

7cd946e

Clean up the code, and remove dims

6f24562

ChrisRackauckas reviewed Apr 26, 2017

Update vector_of_array.jl

23172f3

Move to abstract type. Clean up tests.

2ce01ec

ChrisRackauckas approved these changes Apr 26, 2017

change to u

3807a3c

ChrisRackauckas merged commit 9a95543 into SciML:master Apr 26, 2017
