# Parallel computing

### Start and check multiple processes

In [1]:
using BenchmarkTools, Distributed, LinearAlgebra

In [2]:
addprocs(2) # or start julia from a console with the option -p n_processes
# this starts multiple processes, not threads (processes don't share memory as threads do, but threads
# are limited to a single machine)

2-element Array{Int64,1}:
 2
 3

In [3]:
for pid in workers()
    println(pid)
end
    

2
3


In [4]:
@everywhere println(myid())

1
      From worker 3:	3
      From worker 2:	2
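
The same `@everywhere` macro can also make a function definition available on all processes, so that a worker can then be asked to run it. The following is a minimal sketch, assuming two local workers on a single machine; the helper name `half` is hypothetical and not part of the original notebook.

```julia
using Distributed

# Start two local workers unless some are already running
if nprocs() == 1
    addprocs(2)
end

# Define a (hypothetical) helper on the master and on every worker
@everywhere half(x) = x / 2

# Pick one worker, run the helper there and fetch the result back
w = first(workers())
result = remotecall_fetch(half, w, 10)

# Ask the same worker for its own process id
remote_id = remotecall_fetch(myid, w)

println("nprocs = ", nprocs(), ", nworkers = ", nworkers())
println("half(10) computed on worker ", remote_id, ": ", result)
```

Without the `@everywhere` prefix the helper would exist only on the master process, and the remote call would fail because the worker does not know the function.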


### Parallel for - @distributed for macro
- output_var = @distributed (aggregator) for_cycle (the macro was named @parallel before Julia 1.0)

In [5]:
function f(n)
    s = 0.0
    for i = 1:n
        s += i/2
    end
    return s
end

f (generic function with 1 method)

In [7]:
@time f(100000000)

  0.145061 seconds (5 allocations: 176 bytes)


2.500000025e15

In [8]:
function pf(n)
    s = @distributed (+) for i = 1:n # the per-iteration values are summed and the total is assigned to s
        i/2                          # the value of the last expression in the loop body is passed to the aggregator
    end
    return s
end

pf (generic function with 1 method)

In [10]:
@time pf(100000000)

  0.087214 seconds (242 allocations: 9.531 KiB)


2.500000025e15
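
The aggregator does not have to be `+`. As a hedged sketch, assuming two local workers, the loops below use `max` and `vcat` as reducers; the loop bodies are arbitrary illustrations and not part of the original notebook.

```julia
using Distributed

# Start two local workers unless some are already running
if nprocs() == 1
    addprocs(2)
end

# Keep the largest of the per-iteration values
largest = @distributed (max) for i = 1:1000
    (i % 7) * (i % 11)   # value of the last expression is reduced with max
end

# Collect the per-iteration values into a single vector
squares = @distributed (vcat) for i = 1:5
    i^2
end

println("largest = ", largest)
println("squares = ", squares)
```

The reducer is applied first within each worker's chunk of the range and then across the chunk results, so it should be an associative two-argument function.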

### Parallel map - pmap(f,collection)
 - applies a function f to each element of the collection

In [11]:
x = [rand(100,100) for i in 1:10];

In [12]:
@benchmark map(svd,x)

BenchmarkTools.Trial: 
  memory estimate:  4.71 MiB
  allocs estimate:  122
  --------------
  minimum time:     40.687 ms (0.00% GC)
  median time:      55.836 ms (0.00% GC)
  mean time:        64.298 ms (2.55% GC)
  maximum time:     172.518 ms (57.89% GC)
  --------------
  samples:          78
  evals/sample:     1

In [13]:
@benchmark pmap(svd,x)

BenchmarkTools.Trial: 
  memory estimate:  1.59 MiB
  allocs estimate:  1584
  --------------
  minimum time:     12.578 ms (0.00% GC)
  median time:      13.245 ms (0.00% GC)
  mean time:        13.693 ms (1.63% GC)
  maximum time:     67.672 ms (78.15% GC)
  --------------
  samples:          365
  evals/sample:     1
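
When the mapped function is user-defined, it has to be known to every worker before `pmap` is called. The following is a minimal sketch, assuming two local workers; the helper name `largest_singular_value` and the matrix sizes are hypothetical choices made for illustration.

```julia
using Distributed

# Start two local workers unless some are already running
if nprocs() == 1
    addprocs(2)
end

# Load the package and define the (hypothetical) helper on every process
@everywhere using LinearAlgebra
@everywhere largest_singular_value(A) = maximum(svdvals(A))

matrices = [rand(50, 50) for _ in 1:8]

# Same computation, once on the master only and once spread over the workers
serial   = map(largest_singular_value, matrices)
parallel = pmap(largest_singular_value, matrices)

println(parallel ≈ serial)
```

`pmap` returns its results in the same order as the input collection. It tends to pay off when each call is relatively expensive, as with `svd` above, whereas `@distributed` is usually the better fit for many cheap iterations.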