# Distributed Arrays

So far we have worked with the basic building blocks of Julia's parallel computing infrastructure. These follow a one-sided communication model, and using them well requires understanding that model.

But what if, as a user, you don't really want to think about your parallel model? What if you can abstract out this logic and just use a simple array interface to perform distributed computation?

As before, let us ask the cluster manager for 8 Julia processes.

In [None]:
using JuliaRunClient
ctx = Context()
nb = self()

In [None]:
initParallel()
@result setJobScale(ctx, nb, 8)
waitForWorkers(8)

In [1]:
# If you were running this on your local notebook, you would do this to add processes
# addprocs(8)
# Use package for distributed arrays
using DistributedArrays

In [12]:
# Create vector of process IDs
C = procs()

9-element Array{Int64,1}:
 1
 2
 3
 4
 5
 6
 7
 8
 9

Julia uses `map` to apply a function to every element in an `Array`.

In [13]:
# Apply a map to the vector
map(t -> t*t, C)

9-element Array{Int64,1}:
  1
  4
  9
 16
 25
 36
 49
 64
 81

To convert this `Array` into a `DArray` (a distributed array), use `distribute`.

In [14]:
# Make the vector distributed
D = distribute(C)

9-element DistributedArrays.DArray{Int64,1,Array{Int64,1}}:
 1
 2
 3
 4
 5
 6
 7
 8
 9

The `distribute` command cuts the `Array` into chunks and stores them on the different processes.

In [16]:
# Show how the vector is distributed across the workers
D.indexes

8-element Array{Tuple{UnitRange{Int64}},1}:
 (1:1,)
 (2:2,)
 (3:3,)
 (4:5,)
 (6:6,)
 (7:7,)
 (8:8,)
 (9:9,)
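
The uneven `4:5` chunk above comes from splitting 9 elements across 8 workers. As a minimal sketch of how such contiguous index ranges could be computed, one can place evenly spaced cut points and round them. The helper `chunk_ranges` is hypothetical, it is written for a recent Julia 1.x, and it is not claimed to be the rule `DistributedArrays` applies internally.

```julia
# Hypothetical helper (not part of DistributedArrays): split the indices 1:n
# into k contiguous ranges. Assumes n >= k.
function chunk_ranges(n::Int, k::Int)
    # k + 1 evenly spaced cut points from 1 to n + 1, rounded to integers
    cuts = round.(Int, range(1, stop = n + 1, length = k + 1))
    # chunk i runs from its cut point up to just before the next one
    return [cuts[i]:(cuts[i + 1] - 1) for i in 1:k]
end

ranges = chunk_ranges(9, 8)
println(ranges)
```

For `chunk_ranges(9, 8)` this yields the same eight ranges as the `D.indexes` output above, with the fourth range holding two elements.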

Now when you run a `map` on a `DArray`, it runs in parallel!

In [19]:
# apply map to distributed vector (looks identical to non-distributed case)
map(t -> t*t, D)

9-element DistributedArrays.DArray{Int64,1,Array{Int64,1}}:
  1
  4
  9
 16
 25
 36
 49
 64
 81
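
To make "runs in parallel" more concrete, here is a rough conceptual sketch of a chunk-wise `map` built on the basic `remotecall`/`fetch` constructs. It assumes a recent Julia 1.x, where these live in the `Distributed` standard library. The helper `chunked_map` is hypothetical and is not how `DistributedArrays` is implemented.

```julia
using Distributed  # standard library in Julia 1.x (assumption, see above)

# Hypothetical helper: run `map(f, chunk)` for each chunk on some process,
# then concatenate the per-chunk results in their original order.
function chunked_map(f, chunks)
    pids = workers()  # equals [1] if no worker processes were added
    futures = [remotecall(map, pids[mod1(i, length(pids))], f, chunk)
               for (i, chunk) in enumerate(chunks)]
    return reduce(vcat, [fetch(fut) for fut in futures])
end

# The same split of 1:9 that D.indexes reported above
chunks = [[1], [2], [3], [4, 5], [6], [7], [8], [9]]
squares = chunked_map(t -> t * t, chunks)
println(squares)
```

This sketch ships every chunk to a process on each call. A `DArray` avoids that cost because its chunks are already stored on the processes that work on them.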

A nice feature is that `DArray`s aren't restricted to numeric types.

In [20]:
map(t -> Dates.monthname((t - 1) % 12 + 1), D)

9-element DistributedArrays.DArray{UTF8String,1,Array{UTF8String,1}}:
 "January"  
 "February" 
 "March"    
 "April"    
 "May"      
 "June"     
 "July"     
 "August"   
 "September"

See if you can parse and understand this next example.

In [27]:
monthString = map(t -> Dates.monthname((t - 1) % 12 + 1) |> s -> s*" is my favorite month.\n", D) |>
    t -> reduce(*, Array(t))
println(monthString)

January is my favorite month.
February is my favorite month.
March is my favorite month.
April is my favorite month.
May is my favorite month.
June is my favorite month.
July is my favorite month.
August is my favorite month.
September is my favorite month.
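
One way to read the pipeline above is to unroll it into named steps. The sketch below does this serially on a plain `Array` standing in for `D`. It assumes a recent Julia 1.x, where `Dates` is loaded with `using Dates`, and the intermediate names `ids`, `names`, and `sentences` are introduced here purely for illustration.

```julia
using Dates  # standard library in Julia 1.x (assumption, see above)

ids = collect(1:9)  # stand-in for the distributed vector D

# Step 1: turn each ID into a month name (IDs wrap around after 12)
names = map(t -> Dates.monthname((t - 1) % 12 + 1), ids)

# Step 2: append the sentence ending; `*` concatenates strings in Julia
sentences = map(s -> s * " is my favorite month.\n", names)

# Step 3: glue all sentences into one string by reducing with `*`
monthString = reduce(*, sentences)
print(monthString)
```

In the original cell, steps 1 and 2 are fused into a single function applied by `map` over `D`, and the result is converted to a local `Array` before the final `reduce`.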



We can also declare a distributed array of matrices via a distributed comprehension.

In [30]:
D55 = @DArray [randn(5,5) for i = 1:32]

32-element DistributedArrays.DArray{Array{Float64,2},1,Array{Array{Float64,2},1}}:
 5x5 Array{Float64,2}:
 -1.06675   -0.862483   0.068994  -0.954383   0.612892 
  1.23371    0.470195  -0.569822   1.97442   -0.357816 
 -0.678979   0.283267  -0.719494  -0.321645   0.30929  
  0.334916  -1.22503    0.745611  -2.33004    0.0618629
  0.63521   -1.19402   -1.89202   -1.41228   -0.399258                
 5x5 Array{Float64,2}:
 -0.263565   0.792395   1.4206     -1.04345   -1.3819  
 -0.774798  -2.12411    0.233554    0.480621  -1.07603 
  0.106657  -0.635091   1.4687      1.32517   -0.53843 
 -0.109394  -0.351786  -1.76389    -1.26364    0.923438
 -1.03854    0.719064  -0.0939638  -0.795515  -1.6328                 
 5x5 Array{Float64,2}:
  0.552984   0.771221  -1.13863   -0.508559  -0.0113824
 -0.933323   0.108645  -0.156966   0.245402  -1.24307  
 -1.71768   -0.63347   -1.1948    -0.513568   0.102846 
  1.00582    0.5017     0.254344  -0.459188  -1.24642  
  0.578157  -0.955268  -0.486167  

We can then `map` a function over them in parallel.

In [31]:
# Compute singular values of the distributed vector of matrices
Dsvd = map(svdvals, D55)

32-element DistributedArrays.DArray{Array{Float64,1},1,Array{Array{Float64,1},1}}:
 [3.999836455760208,2.5044582011232213,1.7051737717267952,0.6770905297265457,0.2298652678344431]    
 [3.5269810373142625,2.9504161034893985,2.3501373295253316,0.904445051394586,0.036825704100338044]  
 [2.6238799978968435,2.1866508742328703,1.707233356369122,1.2765590827198168,0.5839922950173981]    
 [4.105123344980141,1.9854356889225975,1.6245519564067503,1.4201918512438185,0.03971714564736037]   
 [3.8691762964769665,2.8147930897207667,2.352397425888243,0.9912135906748117,0.2145582563848968]    
 [4.2015580368111705,2.2515617032836204,1.5737011557308977,0.44782935379067657,0.28057510281268183] 
 [2.7386924486715265,2.328167427456983,1.674341982152823,1.0797402982138045,0.4057750025903677]     
 [2.8585407690602858,2.4226851529761357,1.9197908080706862,0.7444017726046709,0.3663540362476281]   
 [3.6701118732695526,2.6341723275159223,1.7506859088988276,1.362974720086374,1.1825383948594033]    
 [3.4742