# Performance of SharedArray Passing Styles

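This notebook compares the performance of different styles of passing SharedArrays to a function. No style shows a significant advantage: the best (minimum) times are within about 0.5% of each other in the single-object and 1e7 cases. In the 2e7 case, passing six separate arguments (ref 7) was about 5% slower than the other styles, but that run also had the widest spread (maximum 357.765 ms). Refs 1-3 run on worker 2 via `@spawnat`; refs 4-10 call the functions directly in the calling process.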

### Single object

Ref|Best Time (ms)|Objects|Object Size (elements)|Style
---|-------------:|------:|---------------------:|:----------------
1|25.548|1|5e7|Passing SharedArray argument
2|25.510|1|5e7|Passing struct with SharedArray
3|25.420|1|5e7|Access SharedArray in global scope


### Multiple Objects - Size 1e7

Ref|Best Time (ms)|Objects|Object Size (elements)|Style
---|-------------:|------:|---------------------:|:----------------
4|29.814|6|1e7|Passing multiple SharedArray arguments
5|29.682|6|1e7|Passing vector of SharedArrays
6|29.774|6|1e7|Passing struct with SharedArrays

### Multiple Objects - Size 2e7

Ref|Best Time (ms)|Objects|Object Size (elements)|Style
---|-------------:|------:|---------------------:|:----------------
7|62.928|6|2e7|Passing multiple SharedArray arguments
8|60.144|6|2e7|Passing vector of SharedArrays
9|60.036|6|2e7|Passing struct with SharedArrays
10|59.972|6|2e7|Passing dict of SharedArrays

In [1]:
using BenchmarkTools

In [2]:
addprocs()

4-element Array{Int64,1}:
 2
 3
 4
 5

In [3]:
@everywhere struct Container
    sa::SharedArray{Float64}
end

In [4]:
N = 50_000_000
SA = SharedArray{Float64}(N);
SB = Container(SharedArray{Float64}(N))
rand!(SA)
rand!(SB.sa)
;

In [5]:
# Style 1. Pass the SharedArray directly as an argument
@everywhere foo(A) = sum(A)

# Style 2. Pass a struct that wraps the SharedArray as an argument
@everywhere bar(B) = sum(B.sa)

In [6]:
@btime fetch(@spawnat 2 foo($SA))

  25.548 ms (161 allocations: 6.56 KiB)


2.499502276869192e7

In [7]:
@btime fetch(@spawnat 2 bar($SB))

  25.510 ms (165 allocations: 6.63 KiB)


2.499940094927577e7

In [8]:
fetch(@spawnat 2 whos())

	From worker 2:	                          Base               Module
	From worker 2:	                     Container    152 bytes  DataType
	From worker 2:	                          Core               Module
	From worker 2:	                          Main               Module
	From worker 2:	                           bar      0 bytes  #bar
	From worker 2:	                           foo      0 bytes  #foo


In [9]:
fetch(@spawnat 2 identity(SA))

50000000-element SharedArray{Float64,1}:
 0.596237 
 0.657967 
 0.570515 
 0.266937 
 0.881716 
 0.923393 
 0.719334 
 0.323095 
 0.569116 
 0.483323 
 0.861799 
 0.452962 
 0.299303 
 ⋮        
 0.700613 
 0.0576408
 0.310462 
 0.392749 
 0.236105 
 0.250487 
 0.021282 
 0.440471 
 0.673856 
 0.302125 
 0.145842 
 0.185251 

In [10]:
fetch(@spawnat 2 whos())

	From worker 2:	                          Base               Module
	From worker 2:	                     Container    152 bytes  DataType
	From worker 2:	                          Core               Module
	From worker 2:	                          Main               Module
	From worker 2:	                            SA 390625 KB     50000000-element SharedArray{Float…
	From worker 2:	                           bar      0 bytes  #bar
	From worker 2:	                           foo      0 bytes  #foo


In [11]:
# Style 3. Use global SharedArray variable
@everywhere woz() = begin global SA; sum(SA); end


In [12]:
@btime fetch(@spawnat 2 woz())

  25.420 ms (129 allocations: 4.69 KiB)


2.499502276869192e7
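The cells above appear to target Julia 0.6, where `SharedArray`, `addprocs`, and `whos` were available without imports. As a minimal sketch, assuming Julia 1.x with the `Distributed`, `SharedArrays`, and `Random` standard libraries, the same three styles could look as follows. The names `Wrapped` and `SG` are hypothetical, the array is deliberately smaller than in the notebook, and the global is bound explicitly with `@everywhere` rather than through `identity(SA)`.

```julia
using Distributed
addprocs(1)                          # a single local worker suffices here
@everywhere using SharedArrays
using Random

# Hypothetical wrapper type, mirroring Container above
@everywhere struct Wrapped
    sa::SharedArray{Float64,1}
end

@everywhere foo(A) = sum(A)          # Style 1: array passed as an argument
@everywhere bar(B) = sum(B.sa)       # Style 2: array passed inside a struct

N = 1_000_000                        # assumption: smaller than the notebook's 5e7
SA = SharedArray{Float64}(N); rand!(SA)
SB = Wrapped(SA)

# Style 3: bind the array to a global on every process, then read it there
@everywhere SG = $SA
@everywhere woz() = sum(SG)

w = first(workers())
r1 = fetch(@spawnat w foo(SA))
r2 = fetch(@spawnat w bar(SB))
r3 = fetch(@spawnat w woz())
println((r1, r2, r3))
```

All three calls sum the same shared memory, so the three results should agree. Each `fetch(@spawnat ...)` expression can be wrapped in `@btime` from BenchmarkTools to time it.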

In [13]:
# reduce size so we can make 6 of them
N = 10_000_000
N * 8 / 1024 / 1024

76.2939453125
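The arithmetic above converts an element count to MiB at 8 bytes per `Float64`. A short sketch of the same calculation, using `sizeof` instead of the literal 8, also reproduces the 390625 KB that `whos` reported for the 5e7-element `SA`. The variable names here are illustrative only.

```julia
N = 10_000_000
bytes = N * sizeof(Float64)          # 8 bytes per Float64 element
mib = bytes / 1024^2                 # size of one 1e7-element array in MiB

# Same calculation for the 5e7-element array, in KiB as whos prints it
kib_50m = 50_000_000 * sizeof(Float64) / 1024

println(mib)                         # 76.2939453125
println(kib_50m)                     # 390625.0
```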

In [14]:
@everywhere struct Six
    S1::SharedArray{Float64}
    S2::SharedArray{Float64}
    S3::SharedArray{Float64}
    S4::SharedArray{Float64}
    S5::SharedArray{Float64}
    S6::SharedArray{Float64}
end

In [15]:
# Style 4. pass multiple SharedArray's as argument
@everywhere daw(a,b,c,d,e,f) = sum(a)+sum(b)+sum(c)+sum(d)+sum(e)+sum(f)

# Style 5. pass vector of SharedArray's
@everywhere daz(x) = sum(sum.(x))

# Style 6. pass struct that contains multiple SharedArray's
@everywhere gaw(d) = sum(d.S1) + sum(d.S2) + sum(d.S3) + sum(d.S4) + sum(d.S5) + sum(d.S6)

# Style 7. pass dict of SharedArray's (ref 10 in the summary table; benchmarked only at size 2e7)
@everywhere gam(d) = sum(d[:S1]) + sum(d[:S2]) + sum(d[:S3]) + 
    sum(d[:S4]) + sum(d[:S5]) + sum(d[:S6])

In [16]:
# reduce size so we can make 6 of them
N = 10_000_000
S1 = SharedArray{Float64}(N); rand!(S1)
S2 = SharedArray{Float64}(N); rand!(S2)
S3 = SharedArray{Float64}(N); rand!(S3)
S4 = SharedArray{Float64}(N); rand!(S4)
S5 = SharedArray{Float64}(N); rand!(S5)
S6 = SharedArray{Float64}(N); rand!(S6)
SC = [S1, S2, S3, S4, S5, S6]
SD = Six(S1,S2,S3,S4,S5,S6)

Six([0.466808, 0.96324, 0.118648, 0.899985, 0.483694, 0.00410268, 0.819949, 0.233874, 0.373965, 0.887881  …  0.34824, 0.281655, 0.764008, 0.335948, 0.563051, 0.199731, 0.0799466, 0.232247, 0.120913, 0.750186], [0.388344, 0.581768, 0.158713, 0.776752, 0.479047, 0.740093, 0.244597, 0.610926, 0.724766, 0.356857  …  0.677002, 0.102352, 0.424149, 0.85675, 0.112642, 0.862147, 0.775164, 0.919656, 0.159986, 0.481635], [0.405088, 0.550554, 0.881716, 0.422194, 0.700501, 0.980598, 0.164958, 0.412561, 0.485083, 0.613773  …  0.228876, 0.014489, 0.954534, 0.95607, 0.542853, 0.845687, 0.305832, 0.844853, 0.698261, 0.00607508], [0.272337, 0.0902686, 0.0726056, 0.0778901, 0.859491, 0.278753, 0.689527, 0.934602, 0.743676, 0.931294  …  0.195874, 0.682653, 0.692641, 0.863001, 0.852691, 0.986932, 0.117587, 0.181037, 0.18842, 0.657827], [0.536275, 0.218916, 0.216259, 0.609467, 0.934244, 0.705463, 0.20845, 0.886263, 0.244158, 0.225146  …  0.772428, 0.0371161, 0.131306, 0.0404259, 0.0775097, 0.668206, 0.59030

In [17]:
@benchmark daw($S1,$S2,$S3,$S4,$S5,$S6)

BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     29.814 ms (0.00% GC)
  median time:      33.063 ms (0.00% GC)
  mean time:        34.054 ms (0.00% GC)
  maximum time:     101.854 ms (0.00% GC)
  --------------
  samples:          147
  evals/sample:     1

In [18]:
@benchmark daz($SC)

BenchmarkTools.Trial: 
  memory estimate:  128 bytes
  allocs estimate:  1
  --------------
  minimum time:     29.682 ms (0.00% GC)
  median time:      30.460 ms (0.00% GC)
  mean time:        31.397 ms (0.00% GC)
  maximum time:     38.306 ms (0.00% GC)
  --------------
  samples:          160
  evals/sample:     1

In [19]:
@benchmark gaw($SD)

BenchmarkTools.Trial: 
  memory estimate:  224 bytes
  allocs estimate:  13
  --------------
  minimum time:     29.774 ms (0.00% GC)
  median time:      30.206 ms (0.00% GC)
  mean time:        31.313 ms (0.00% GC)
  maximum time:     43.403 ms (0.00% GC)
  --------------
  samples:          160
  evals/sample:     1
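The 13 allocations reported for `gaw` contrast with the zero reported for `daw`. One possible contributor, offered here as an assumption rather than a measured result, is that the fields of `Six` are declared as `SharedArray{Float64}`, which leaves the dimension parameter open and is therefore not a concrete type. The sketch below, assuming Julia 1.x with the `SharedArrays` standard library and using the hypothetical names `OpenField` and `ConcreteField`, shows the difference in field types.

```julia
using SharedArrays

# Field type with the dimension parameter left open, as in Six above
struct OpenField
    S1::SharedArray{Float64}
end

# Field type with both element type and dimension fixed
struct ConcreteField
    S1::SharedArray{Float64,1}
end

open_is_concrete  = isconcretetype(fieldtype(OpenField, :S1))      # false
fixed_is_concrete = isconcretetype(fieldtype(ConcreteField, :S1))  # true
println((open_is_concrete, fixed_is_concrete))
```

With a concrete field type the compiler can infer the type of `d.S1` statically. Whether that removes the allocations in this benchmark would need to be measured.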

In [20]:
# double the size to 2e7 elements per array
N = 20_000_000
S1 = SharedArray{Float64}(N); rand!(S1)
S2 = SharedArray{Float64}(N); rand!(S2)
S3 = SharedArray{Float64}(N); rand!(S3)
S4 = SharedArray{Float64}(N); rand!(S4)
S5 = SharedArray{Float64}(N); rand!(S5)
S6 = SharedArray{Float64}(N); rand!(S6)
SC = [S1, S2, S3, S4, S5, S6]
SD = Six(S1,S2,S3,S4,S5,S6)
SE = Dict(:S1=>S1,:S2=>S2,:S3=>S3,:S4=>S4,:S5=>S5,:S6=>S6);

In [21]:
gc(true)  # get ready

In [22]:
@benchmark daw($S1,$S2,$S3,$S4,$S5,$S6)

BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     62.928 ms (0.00% GC)
  median time:      66.861 ms (0.00% GC)
  mean time:        75.385 ms (0.00% GC)
  maximum time:     357.765 ms (0.00% GC)
  --------------
  samples:          68
  evals/sample:     1

In [23]:
@benchmark daz($SC)

BenchmarkTools.Trial: 
  memory estimate:  128 bytes
  allocs estimate:  1
  --------------
  minimum time:     60.144 ms (0.00% GC)
  median time:      62.250 ms (0.00% GC)
  mean time:        63.212 ms (0.00% GC)
  maximum time:     73.712 ms (0.00% GC)
  --------------
  samples:          80
  evals/sample:     1

In [24]:
@benchmark gaw($SD)

BenchmarkTools.Trial: 
  memory estimate:  224 bytes
  allocs estimate:  13
  --------------
  minimum time:     60.036 ms (0.00% GC)
  median time:      61.815 ms (0.00% GC)
  mean time:        62.826 ms (0.00% GC)
  maximum time:     71.630 ms (0.00% GC)
  --------------
  samples:          80
  evals/sample:     1

In [25]:
@benchmark gam($SE)

BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     59.972 ms (0.00% GC)
  median time:      61.763 ms (0.00% GC)
  mean time:        62.124 ms (0.00% GC)
  maximum time:     69.707 ms (0.00% GC)
  --------------
  samples:          81
  evals/sample:     1
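To put the minimum times from the runs above side by side, the following sketch computes the relative spread between the fastest and slowest style within each group. The numbers are copied from the benchmark outputs in this notebook; the helper `spread` and the group labels are hypothetical.

```julia
# Minimum times in ms, taken from the benchmark outputs above
best = [
    "single, 5e7" => [25.548, 25.510, 25.420],
    "six, 1e7"    => [29.814, 29.682, 29.774],
    "six, 2e7"    => [62.928, 60.144, 60.036, 59.972],
]

# Percentage by which the slowest style exceeds the fastest one
spread(ts) = 100 * (maximum(ts) - minimum(ts)) / minimum(ts)

for (label, ts) in best
    println(rpad(label, 14), round(spread(ts), digits=2), " %")
end
```

The spreads come out at roughly 0.5%, 0.4%, and 4.9%, so only the 2e7 group shows a gap larger than 1%, and that gap comes entirely from ref 7.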