## Exercise - Optimization 

In [1]:
using Distributed

#### Ex.1: 

Find out how many cores and threads you can run on your computer.

##### Ex.1 - solution:

There are several ways to find this out. One is to run the appropriate command for your OS from the command line (see also `ParallelComputing.ipynb`, section 1.2).

From the command line, type:

**linux**:
`lscpu`

**macOS**:
`sysctl -a | grep cpu | grep hw`

**windows**:
`wmic cpu get NumberOfCores,NumberOfLogicalProcessors`

Alternatively, look at the system report for your computer (on macOS: `Apple` -> `About This Mac` -> `System Report...`).

From Julia, you can use the `Hwloc` package:

In [2]:
using Hwloc

Check how many physical cores you have:

In [3]:
num_physical_cores() 

4

Check how many virtual (logical) cores, i.e. hardware threads, you have:

In [4]:
num_virtual_cores()

8

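You may see that `num_virtual_cores()` is twice `num_physical_cores()`.

This is because many CPUs support simultaneous multithreading (called Hyper-Threading on Intel CPUs), in which each physical core exposes two hardware threads to the operating system, so two processes or threads can run on it concurrently. On CPUs without this feature, the two numbers are equal.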
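If you prefer not to install `Hwloc`, Julia itself can report the logical thread count. The sketch below uses only `Sys.CPU_THREADS` and `Threads.nthreads()` from Base; it does not report physical cores, so it complements `Hwloc` rather than replacing it.

```julia
# Number of logical CPU threads visible to Julia (a constant in Base).
println("logical CPU threads: ", Sys.CPU_THREADS)

# Number of threads this Julia session was started with
# (set via `julia --threads N` or the JULIA_NUM_THREADS environment variable).
println("Julia threads in this session: ", Threads.nthreads())
```

Note that `Threads.nthreads()` describes the current session, not the hardware, so it is often 1 unless Julia was started with more threads.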

#### Ex.2: Procs() 
Add at least two processes to your instance.

##### Ex.2 - solution:

In [5]:
procs() # run this before adding procs

1-element Vector{Int64}:
 1

In [6]:
addprocs(2)

2-element Vector{Int64}:
 2
 3

In [7]:
procs() # run this after adding procs

3-element Vector{Int64}:
 1
 2
 3

In [8]:
workers() # only `workers` will be `working`

2-element Vector{Int64}:
 2
 3

**Q:** Why are there three processes while only two are working? 

Check out this `StackOverflow` thread: https://stackoverflow.com/questions/75247172/number-of-workers-and-processes-in-julia (and the Julia doc with link also provided in this thread).
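In short, process 1 is the master process from which you issue commands. Once other processes have been added, `workers()` lists only those. The following minimal sketch, which assumes it runs on the master process and that worker processes can be started on your machine, makes the distinction visible:

```julia
using Distributed

# Add two workers if this session has none yet (mirrors the step above).
nprocs() == 1 && addprocs(2)

println("this process id: ", myid())      # the process running this code
println("all processes:   ", procs())     # master plus workers
println("workers only:    ", workers())   # excludes the master once workers exist
println("nprocs = ", nprocs(), ", nworkers = ", nworkers())
```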

#### Ex.3:
Generate four **Uniform(0,1)** random numbers on each core and return them as a matrix; call it A.

##### Ex.3 - solution:

First, make sure **all workers** have loaded the packages they need to do the work:

In [9]:
@everywhere using Distributions, Random 
# here just to let them "know" the packages

We can then use `pmap` to have the workers generate the columns of matrix A.

In [10]:
A_columns = pmap(i -> rand(Uniform(0, 1), 4), 1:nworkers())

2-element Vector{Vector{Float64}}:
 [0.7234671238939189, 0.49557716427533594, 0.6747703674198254, 0.8499955415266967]
 [0.3178551907215277, 0.23787811300037798, 0.8500445492222142, 0.9326045372406845]

(Like the ordinary `map`, `pmap` returns a collection of the outputs obtained by applying the function to each element of the input collection.)
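To check where each column was actually generated, one option is to return the process id together with the random draws. This is an illustrative sketch, not part of the original solution: it uses `rand(4)`, which samples from Uniform[0, 1) by default, instead of `Distributions`, and it assumes worker processes can be started locally.

```julia
using Distributed

# Add two workers if this session has none yet (mirrors Ex.2).
nprocs() == 1 && addprocs(2)

# Each task returns the id of the process that ran it plus four draws.
results = pmap(i -> (myid(), rand(4)), 1:nworkers())

for (pid, col) in results
    println("column generated on process ", pid, ": ", col)
end
```

`pmap` hands out tasks dynamically, so it does not guarantee that every worker handles exactly one task. If you need one column from each specific worker, something like `[remotecall_fetch(rand, w, 4) for w in workers()]` may be a better fit.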

Then we can use `reduce`, which repeatedly applies the binary operation given as its first argument to the elements of the collection given as its second argument, combining them into a single result. With `hcat` as the operation, the column vectors are combined into the matrix.

In [18]:
A = reduce(hcat, A_columns) # hcat to concatenate horizontally the vectors

4×2 Matrix{Float64}:
 0.723467  0.317855
 0.495577  0.237878
 0.67477   0.850045
 0.849996  0.932605
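To see what `reduce` does in isolation, here is a small sketch on fixed data (the vectors in `cols` are made up for illustration). It also shows that splatting into `hcat` gives the same matrix.

```julia
# Three hypothetical column vectors of length 2.
cols = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

# reduce with + sums the elements pairwise into one number.
total = reduce(+, [1, 2, 3, 4])

# reduce with hcat glues the vectors side by side into a 2×3 matrix.
M = reduce(hcat, cols)

# Splatting the vectors into hcat builds the same matrix.
M_splat = hcat(cols...)

println(total)
println(M)
println(M == M_splat)
```

For long collections, `reduce(hcat, cols)` is usually preferable to `hcat(cols...)`, since splatting many arguments can be slow to compile.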

#### Ex.4: 
Calculate $A(A^TA)^{-1}A^T$ (if A does not have full column rank, so that $A^TA$ is singular, transpose A first).

##### Ex.4 - solution:

In [36]:
P = A*(A'A)^(-1)*A'

4×4 Matrix{Float64}:
  0.692072    0.447356   -0.0886151  0.0716135
  0.447356    0.289826   -0.0392786  0.0644744
 -0.0886151  -0.0392786   0.506012   0.490478
  0.0716135   0.0644744   0.490478   0.512089
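The result is the orthogonal projection matrix onto the column space of A, so it should be symmetric and idempotent, with trace equal to the rank of A. The sketch below checks these properties using the rounded entries of A printed above. The comparison with `A * pinv(A)` is an added cross-check, not part of the original solution.

```julia
using LinearAlgebra

# Rounded values of A as printed in Ex.3.
A = [0.723467 0.317855;
     0.495577 0.237878;
     0.67477  0.850045;
     0.849996 0.932605]

# A'A is invertible only if A has full column rank.
@assert rank(A) == size(A, 2)

P = A * inv(A' * A) * A'

println("symmetric:  ", P ≈ P')
println("idempotent: ", P * P ≈ P)
println("trace ≈ rank(A): ", tr(P) ≈ rank(A))

# Cross-check: for full column rank A, pinv(A) equals inv(A'A) * A'.
P_alt = A * pinv(A)
println("matches A * pinv(A): ", P ≈ P_alt)
```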