Faster iteration #38

christopherzimmerman · 2020-09-14T01:09:00Z

Before merging, need to:

[ ] Remove old mapping methods
[ ] Port over matrix/axis iters to new yielding macro methods
[ ] Determine if unsafe_iter is needed, or if a workaround is possible.

jkthorne · 2020-09-14T04:52:28Z

Just curious: since this PR is about performance, did you run any benchmarks with this?

christopherzimmerman · 2020-09-14T12:23:07Z

@wontruefree I am still optimizing a bit, mostly removing overhead and speeding up lookups, but here's a small benchmark on 2 million elements (1 million for the strided tensors).

require "./num"
require "benchmark"

n = 2000000

a = Tensor.random(0.0...1.0, [n])
b = Tensor.random(0.0...1.0, [n])
c = Tensor.random(0.0...1.0, [n])

a_strided = a[{..., 2}]
b_strided = b[{..., 2}]
c_strided = c[{..., 2}]

Benchmark.ips do |bench|
  bench.report("old map") { a.map { |i| i / 2 } }
  bench.report("new map") { a.map_new { |i| i / 2 } }

  bench.report("old map2") { a.map(b) { |i, j| i + j / 2 } }
  bench.report("new map2") { a.map_new(b) { |i, j| i + j / 2 } }

  bench.report("old map3") { a.map(b, c) { |i, j, k| i + j * 2 - k } }
  bench.report("new_map3") { a.map_new(b, c) { |i, j, k| i + j * 2 - k } }

  bench.report("old map strided") { a_strided.map { |i| i / 2 } }
  bench.report("new map strided") { a_strided.map_new { |i| i / 2 } }

  bench.report("old map2 strided") { a_strided.map(b_strided) { |i, j| i + j / 2 } }
  bench.report("new map2 strided") { a_strided.map_new(b_strided) { |i, j| i + j / 2 } }

  bench.report("old map3 strided") { a_strided.map(b_strided, c_strided) { |i, j, k| i + j * 2 - k } }
  bench.report("new_map3 strided") { a_strided.map_new(b_strided, c_strided) { |i, j, k| i + j * 2 - k } }
end

         old map 187.72  (  5.33ms) (± 2.30%)  15.3MB/op   1.99× slower
         new map 295.96  (  3.38ms) (± 2.39%)  15.3MB/op   1.26× slower

        old map2 179.66  (  5.57ms) (± 3.07%)  15.3MB/op   2.08× slower
        new map2 260.18  (  3.84ms) (± 1.92%)  15.3MB/op   1.44× slower

        old map3 175.49  (  5.70ms) (± 2.13%)  15.3MB/op   2.13× slower
        new_map3 253.36  (  3.95ms) (± 2.07%)  15.3MB/op   1.47× slower

 old map strided 255.69  (  3.91ms) (± 2.49%)  7.63MB/op   1.46× slower
 new map strided 373.54  (  2.68ms) (± 1.47%)  7.63MB/op        fastest

old map2 strided 183.26  (  5.46ms) (± 2.56%)  7.63MB/op   2.04× slower
new map2 strided 296.31  (  3.37ms) (± 1.62%)  7.63MB/op   1.26× slower

old map3 strided 159.19  (  6.28ms) (± 1.45%)  7.63MB/op   2.35× slower
new_map3 strided 220.19  (  4.54ms) (± 2.02%)  7.63MB/op   1.70× slower
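
The "slower" column above is measured against the single fastest entry (new map strided), so it does not directly show how much each new method gains over its old counterpart. A minimal sketch for reading that off, using only the iterations-per-second figures from the output above; the `RESULTS` table and the `speedup` helper are illustrative names, not part of the library:

```crystal
# Iterations per second copied from the benchmark output above, as {old, new}.
RESULTS = {
  "map"          => {187.72, 295.96},
  "map2"         => {179.66, 260.18},
  "map3"         => {175.49, 253.36},
  "map strided"  => {255.69, 373.54},
  "map2 strided" => {183.26, 296.31},
  "map3 strided" => {159.19, 220.19},
}

# Hypothetical helper: ratio of new to old throughput.
# A value above 1.0 means the new method completed more iterations per second.
def speedup(old_ips : Float64, new_ips : Float64) : Float64
  new_ips / old_ips
end

RESULTS.each do |name, pair|
  old_ips, new_ips = pair
  puts "#{name.ljust(12)} #{speedup(old_ips, new_ips).round(2)}x"
end
```

By this measure every new method lands between roughly 1.4x and 1.6x the throughput of the old one in this run.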

jkthorne · 2020-09-14T21:16:39Z

That seems like a pretty big improvement!

christopherzimmerman added 3 commits September 12, 2020 19:49

testing yielding over iterators

c2f1828

tri iterator

bb65978

iter complete

155f44b

remove data for now

cec0fb9

license

3369400

christopherzimmerman mentioned this pull request Sep 17, 2020

0.4.2 #40

Closed

4 tasks

christopherzimmerman closed this Sep 29, 2020
