optimized hcat of onehot vectors and matrices #1595

racinmat · 2021-05-11T09:35:56Z

PR Checklist

Tests are added
Entry in NEWS.md
Documentation, if applicable
API changes require approval from a committer (different from the author, if applicable)

Fixes #1594, adds tests for it, and also optimizes reduce(hcat, xs).
No new features are added, only performance optimization.
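
As a rough illustration of the kind of optimization described here, the sketch below concatenates one-hot matrices by joining their index vectors instead of materializing a dense Bool matrix. It does not use Flux: `ToyOneHotMatrix`, its fields `indices` and `nlabels`, and this `hcat` method are hypothetical stand-ins, not the code in this PR.

```julia
# Hypothetical stand-in for a one-hot matrix; NOT Flux's OneHotArray.
struct ToyOneHotMatrix <: AbstractMatrix{Bool}
    indices::Vector{Int}  # hot row for each column
    nlabels::Int          # number of rows (labels)
end

Base.size(x::ToyOneHotMatrix) = (x.nlabels, length(x.indices))
Base.getindex(x::ToyOneHotMatrix, i::Int, j::Int) = x.indices[j] == i

# Join the index vectors directly rather than falling back to the generic
# AbstractArray hcat, which would build a dense Bool matrix.
function Base.hcat(x::ToyOneHotMatrix, xs::ToyOneHotMatrix...)
    all(y -> y.nlabels == x.nlabels, xs) ||
        throw(DimensionMismatch("all inputs must have the same number of labels"))
    return ToyOneHotMatrix(vcat(x.indices, (y.indices for y in xs)...), x.nlabels)
end

a = ToyOneHotMatrix([1, 3], 3)
b = ToyOneHotMatrix([2], 3)
c = hcat(a, b)  # stays a ToyOneHotMatrix; only the index vectors are copied
```

The result keeps the compact representation, so later operations that dispatch on the one-hot type can still take their fast paths.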

darsnack

Thanks for this! I left some suggestions that make sure we hit reshaped arrays as well.

src/onehot.jl

hitting reshaped arrays Co-authored-by: Kyle Daruwalla <daruwalla.k.public@icloud.com>

johnnychen94 · 2021-05-11T17:22:07Z

I believe the root issue is that we don't have similar defined for OneHotArray; once it's defined, it's very likely that many of these overly-verbose optimizations will be unnecessary.

Adding tests for typeof

darsnack

Let's go back to the previously suggested changes (I have commented them again with some updates). If the constructor is causing issues, then why not just change the constructor to:

OneHotArray(indices::I, L::Integer) where {T, N, I<: AbstractArray{T, N}} = OneHotArray{T, L, N, I}(indices)

Then much more than hcat can see the performance benefit.

src/onehot.jl

darsnack · 2021-05-12T13:03:38Z

> I believe the root issue is that we don't have similar defined for OneHotArray; once it's defined, it's very likely that many of these overly-verbose optimizations will be unnecessary.

Good point! Though this means we would need to pair similar with setindex! which would require checks to stop someone from breaking the one-hot behavior. I think directly operating on the index arrays will still see a performance benefit.

We might want to define similar anyways to guarantee more type stability in fallback cases (e.g. the reduce(hcat, ...) one).
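
One way to picture the checks mentioned above is the sketch below. It is only an assumption about how such a `setindex!` could behave, written against a hypothetical `ToyOneHotMatrix` type rather than Flux's OneHotArray: writing `true` moves the hot entry of a column, and writing `false` onto the hot entry is rejected.

```julia
# Hypothetical stand-in for a one-hot matrix; NOT Flux's OneHotArray.
struct ToyOneHotMatrix <: AbstractMatrix{Bool}
    indices::Vector{Int}  # hot row for each column
    nlabels::Int          # number of rows (labels)
end

Base.size(x::ToyOneHotMatrix) = (x.nlabels, length(x.indices))
Base.getindex(x::ToyOneHotMatrix, i::Int, j::Int) = x.indices[j] == i

# Assumed policy: every column must keep exactly one hot entry.
function Base.setindex!(x::ToyOneHotMatrix, v::Bool, i::Int, j::Int)
    @boundscheck checkbounds(x, i, j)
    if v
        # Moving the hot entry to row i keeps column j one-hot.
        x.indices[j] = i
    elseif x.indices[j] == i
        # Clearing the hot entry would leave column j with no hot entry.
        throw(ArgumentError("cannot clear the hot entry of column $j"))
    end
    # Writing false onto an entry that is already false is a no-op.
    return x
end

x = ToyOneHotMatrix([1, 3], 3)
x[2, 1] = true  # column 1 is now hot at row 2 instead of row 1
```

The extra branch on every write is the cost referred to above: a generic fallback that fills the result element by element pays for it repeatedly, whereas operating on the index arrays does not.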

DhairyaLGandhi · 2021-05-12T13:42:13Z

Ideally we would fix it generally rather than for special cases. That's the Flux way.

darsnack · 2021-05-13T03:40:41Z

Here is an implementation of similar:

function Base.similar(x::OneHotArray, ::Type{Bool}, dims::Dims)
  indices = similar(_indices(x), Base.tail(dims))

  return OneHotArray(indices, first(dims))
end

This would need to be paired with Base.setindex! to address the reduce(hcat, ...) type instability (too late for me to figure out the index combos tonight 😅). Though I am guessing that since setindex! will need to perform checks to ensure the one-hot property is maintained, this will not be as fast as an optimized custom reduce(::typeof(hcat), ...). I think we should define similar no matter what though.
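
For comparison, a specialized `reduce(::typeof(hcat), ...)` of the kind mentioned above might look roughly like the following. This is a minimal sketch on a hypothetical `ToyOneHotMatrix` type, not the reduce path that was proposed in this PR; it only shows that the reduction can be done on the index vectors without any `setindex!` calls.

```julia
# Hypothetical stand-in for a one-hot matrix; NOT Flux's OneHotArray.
struct ToyOneHotMatrix <: AbstractMatrix{Bool}
    indices::Vector{Int}  # hot row for each column
    nlabels::Int          # number of rows (labels)
end

Base.size(x::ToyOneHotMatrix) = (x.nlabels, length(x.indices))
Base.getindex(x::ToyOneHotMatrix, i::Int, j::Int) = x.indices[j] == i

# Reduce over the index vectors so the result keeps the one-hot type
# instead of falling back to a dense matrix.
function Base.reduce(::typeof(hcat), xs::Vector{ToyOneHotMatrix})
    isempty(xs) && throw(ArgumentError("cannot reduce an empty collection"))
    n = first(xs).nlabels
    all(x -> x.nlabels == n, xs) ||
        throw(DimensionMismatch("all inputs must have the same number of labels"))
    return ToyOneHotMatrix(reduce(vcat, [x.indices for x in xs]), n)
end

xs = [ToyOneHotMatrix([1], 3), ToyOneHotMatrix([2, 3], 3)]
y = reduce(hcat, xs)  # a ToyOneHotMatrix with three columns
```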

making hcat more generic for onehotlike Co-authored-by: Kyle Daruwalla <daruwalla.k.public@icloud.com>

racinmat · 2021-05-13T08:20:54Z

OK, the performance is now good even for reshaped arrays. Should I include Base.similar in this PR, or leave it out since it needs to be paired with Base.setindex!?
Is it ready to merge?
I agree that a generic fix would be the best solution, but I think this is good as an intermediate step until the proper implementation.

darsnack · 2021-05-13T15:02:42Z

I think the best move is to remove the reduce(hcat, ...) paths from this PR. Leave #1596 open, and we address it later with similar and setindex!.

darsnack

Just some readability changes in addition to the request to remove the reduce(hcat, ...) paths for now.

src/onehot.jl

more readable code Co-authored-by: Kyle Daruwalla <daruwalla.k.public@icloud.com>

test/onehot.jl

racinmat · 2021-05-13T15:58:23Z

Done, reduce and tests for it are removed.

test/onehot.jl

darsnack

Looks great, thanks!

DhairyaLGandhi · 2021-05-13T16:28:05Z

bors r+

bors · 2021-05-13T16:42:05Z

Build succeeded:

buildkite/flux-dot-jl

onehot hcat optimizations

20ca411

racinmat force-pushed the master branch from 1ccc2c4 to 20ca411 on May 11, 2021 09:56

darsnack requested changes May 11, 2021

Apply suggestions from code review

2a2834d

hitting reshaped arrays Co-authored-by: Kyle Daruwalla <daruwalla.k.public@icloud.com>

reverting to original code, which is faster.

f725d3d

Adding tests for typeof

darsnack requested changes May 12, 2021

racinmat and others added 2 commits May 13, 2021 10:08

Apply suggestions from code review

1992006

making hcat more generic for onehotlike Co-authored-by: Kyle Daruwalla <daruwalla.k.public@icloud.com>

much faster constructors

22f4398

darsnack requested changes May 13, 2021

racinmat and others added 3 commits May 13, 2021 17:52

Apply suggestions from code review

3490b3a

more readable code Co-authored-by: Kyle Daruwalla <daruwalla.k.public@icloud.com>

removing reductions

a6db4fe

removing tests for reduction

1d90eb7

darsnack requested changes May 13, 2021

darsnack reviewed May 13, 2021

added type test

9d2282c

darsnack approved these changes May 13, 2021

bors bot merged commit 4bd20c7 into FluxML:master May 13, 2021

pevnak mentioned this pull request Jun 28, 2021

added support for reduce(hcat, OneHotMatrix) #1640

Closed

darsnack mentioned this pull request Mar 5, 2022

Add similar and setindex! FluxML/OneHotArrays.jl#6

Open
