Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: redesign of cat #10155

Closed
wants to merge 4 commits into from
Closed

WIP: redesign of cat #10155

wants to merge 4 commits into from

Conversation

Jutho
Copy link
Contributor

@Jutho Jutho commented Feb 10, 2015

This PR contains a complete redesign of (h)(v)cat, partially to make them more efficient, partially to make the code more consistent. The test/perf/cat/perf.jl results give this on master:

julia,hvcat_small,8.182284,11.243940,8.543121,0.395216
julia,hvcat_large,4.250845,10.075339,6.012962,1.387563
julia,hvcat_setind_small,5.366098,9.216589,6.119101,0.465260
julia,hvcat_setind_large,4.285626,10.801599,6.201728,1.452870
julia,hcat_small,2.805236,6.241830,3.377037,0.302577
julia,hcat_large,3.156908,9.115308,5.590900,1.349966
julia,hcat_setind_small,5.374137,8.158075,6.002154,0.355833
julia,hcat_setind_large,4.315162,9.641637,5.930253,1.347021
julia,vcat_small,19.671022,24.336427,20.224472,0.722260
julia,vcat_large,4.262613,9.657968,6.065484,1.367671
julia,vcat_setind_small,5.449062,8.600296,6.120130,0.378905
julia,vcat_setind_large,4.252420,9.788219,5.907602,1.374637
julia,catnd_small,175.186780,186.137778,180.113657,3.208002
julia,catnd_large,5.970334,11.536173,7.673361,1.372442
julia,catnd_setind_small,31.433578,42.702367,34.161286,2.558648
julia,catnd_setind_large,4.317689,9.697651,5.966775,1.346298

and this with the new design

julia,hvcat_small,7.085453,9.966918,7.459613,0.469310
julia,hvcat_large,4.184319,8.455822,5.607192,1.110591
julia,hvcat_setind_small,5.106077,8.137985,5.769874,0.355682
julia,hvcat_setind_large,4.035018,8.866260,5.601901,1.143155
julia,hcat_small,3.159738,5.359850,3.714419,0.324712
julia,hcat_large,2.952022,7.607253,4.776252,1.060606
julia,hcat_setind_small,5.419611,8.417947,6.103069,0.362133
julia,hcat_setind_large,3.857816,8.724146,5.248658,0.997564
julia,vcat_small,6.115002,9.706509,6.894299,0.436828
julia,vcat_large,4.241297,7.618622,5.524247,0.970087
julia,vcat_setind_small,5.126445,8.360320,5.902463,0.485275
julia,vcat_setind_large,4.241414,7.967729,5.581952,0.991785
julia,catnd_small,57.151200,61.801558,58.443829,1.115747
julia,catnd_large,5.676855,10.968851,7.086844,1.122760
julia,catnd_setind_small,31.872411,37.572134,33.079484,1.090067
julia,catnd_setind_large,3.883781,8.927666,5.092026,0.957001

The current design is more modular and doesn't require treading special cases separately; it's all handled via multiple dispatch. I need to implement some more performance test (e.g. for catting just numbers), but from the above it's clear that (for those tests) vcat and cat are much faster. hvcat is slightly faster. hcat seems to be a tad slower for the small case but not for the large case. The current hcat code contains some fairly specific instructions for this.

I have also reverted the generalized cat behavior for diagonal catting, but reinstated it under the name dcat, once suggested by @jiahao . This automatically provides a general blkdiag implementation.

I still need to add documentation.

catndims(x::AbstractArray) = ndims(x)
catndims(x) = 1

cat_similar(x::AbstractArray, T, sz) = similar(full(x), T, sz)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I follow this, so take these with a big grain of salt:

  • Do we really want to require full to be implemented for all AbstractArrays (or is this used in special circumstances?)
  • It would be lovely if this didn't require actual intantiation of full(x), since in the end it's presumably about types and sizes. OTOH, since we don't have full_type(x), full_size(x) maybe there isn't a good alternative.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, this behaviour was taken over from the old code. I think the full should be omitted. There was a discussion recently that similar itself should take care of returning an array of a suitable type, i.e. one that has setindex!

@timholy
Copy link
Member

timholy commented Feb 11, 2015

I've never dived into the cat code myself, but I agree this looks good. The performance improvements are quite impressive.

@Jutho
Copy link
Contributor Author

Jutho commented Feb 11, 2015

I am able to reproduce the error in parallel locally, but have not yet been able to figure out what is causing it and how to fix it.

@Jutho
Copy link
Contributor Author

Jutho commented Feb 12, 2015

Some further update on trying to understand why this doesn't pass the test.
On the REPL, Base.runtests(["parallel"]) works. More generally, all tests succeed as long as the test "copy" is not included. The following fails

Base.runtests(["copy","parallel"])

with the following error

    From worker 2:       * copy                 in   0.84 seconds
exception on 3: ERROR: LoadError: ProcessExitedException()
 in worker_from_id at ./multi.jl:352
 in mapreducedim_between! at darray.jl:322
 in mapreducedim! at darray.jl:382
 in mapreducedim at reducedim.jl:211
 in anonymous at no file:63
 in runtests at /usr/local/julia/test/testdefs.jl:78
 in anonymous at multi.jl:866
 in run_work_thunk at multi.jl:617
 in anonymous at task.jl:866
while loading parallel.jl, in expression starting on line 62
ERROR: LoadError: LoadError: ProcessExitedException()
 in anonymous at task.jl:1408
while loading parallel.jl, in expression starting on line 62
while loading /usr/local/julia/usr/share/julia/test/runtests.jl, in expression starting on line 49

While I am currently not able to find the best way to reproduce the following, I was able to run the code around line 62 of test/parallel.jl explicitly as

dims = (20,20,20)
d = drandn(dims);
da = convert(Array, d);
dms=1
mapreducedim(t -> t*t, +, d, dms)

which then led to an error in one of the workers of the form
convert(::Type{Int...},::Int)
which certainly sounds very much like #8631.

Unfortunately, I have no clue how the current cat redesign triggers this behavior in this delicate way.

@Jutho Jutho force-pushed the jh/catredesign branch 3 times, most recently from 79f48c6 to dbbccfa Compare February 19, 2015 11:22
@Jutho
Copy link
Contributor Author

Jutho commented Feb 21, 2015

Finally got this to works thanks to convertalypse being solved. However, I am going to close this in favor of a new PR that explains the latest status.

@Jutho Jutho closed this Feb 21, 2015
@Jutho Jutho deleted the jh/catredesign branch August 23, 2018 08:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants