<a href="https://colab.research.google.com/github/macorony/Workshop-of-Parallel-Programming-in-Julia/blob/main/Multi_threading_with_ThreadsX.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# <img src="https://github.com/JuliaLang/julia-logo-graphics/raw/master/images/julia-logo-color.png" height="100" /> _Colab Notebook Template_

## Instructions
1. Work on a copy of this notebook: _File_ > _Save a copy in Drive_ (you will need a Google account). Alternatively, you can download the notebook using _File_ > _Download .ipynb_, then upload it to [Colab](https://colab.research.google.com/).
2. If you need a GPU: _Runtime_ > _Change runtime type_ > _Hardware accelerator_ = _GPU_.
3. Execute the following cell (click on it and press Ctrl+Enter) to install Julia, IJulia and other packages (if needed, update `JULIA_VERSION` and the other parameters). This takes a couple of minutes.
4. Reload this page (press Ctrl+R, or ⌘+R, or the F5 key) and continue to the next section.

_Notes_:
* If your Colab Runtime gets reset (e.g., due to inactivity), repeat steps 2, 3 and 4.
* After installation, if you want to change the Julia version or activate/deactivate the GPU, you will need to reset the Runtime: _Runtime_ > _Factory reset runtime_ and repeat steps 3 and 4.

In [1]:
%%shell
set -e

#---------------------------------------------------#
JULIA_VERSION="1.7.1" # any version ≥ 0.7.0
JULIA_PACKAGES="IJulia BenchmarkTools Plots"
JULIA_PACKAGES_IF_GPU="CUDA" # or CuArrays for older Julia versions
JULIA_NUM_THREADS=2
#---------------------------------------------------#

if [ -n "$COLAB_GPU" ] && [ -z "$(which julia)" ]; then
  # Install Julia
  JULIA_VER=`cut -d '.' -f -2 <<< "$JULIA_VERSION"`
  echo "Installing Julia $JULIA_VERSION on the current Colab Runtime..."
  BASE_URL="https://julialang-s3.julialang.org/bin/linux/x64"
  URL="$BASE_URL/$JULIA_VER/julia-$JULIA_VERSION-linux-x86_64.tar.gz"
  wget -nv $URL -O /tmp/julia.tar.gz # -nv means "not verbose"
  tar -x -f /tmp/julia.tar.gz -C /usr/local --strip-components 1
  rm /tmp/julia.tar.gz

  # Install Packages
  if [ "$COLAB_GPU" = "1" ]; then
      JULIA_PACKAGES="$JULIA_PACKAGES $JULIA_PACKAGES_IF_GPU"
  fi
  for PKG in `echo $JULIA_PACKAGES`; do
    echo "Installing Julia package $PKG..."
    julia -e 'using Pkg; pkg"add '$PKG'; precompile;"' &> /dev/null
  done

  # Install kernel and rename it to "julia"
  echo "Installing IJulia kernel..."
  julia -e 'using IJulia; IJulia.installkernel("julia", env=Dict(
      "JULIA_NUM_THREADS"=>"'"$JULIA_NUM_THREADS"'"))'
  KERNEL_DIR=`julia -e "using IJulia; print(IJulia.kerneldir())"`
  KERNEL_NAME=`ls -d "$KERNEL_DIR"/julia*`
  mv -f $KERNEL_NAME "$KERNEL_DIR"/julia  

  echo ''
  echo "Successfully installed `julia -v`!"
  echo "Please reload this page (press Ctrl+R, ⌘+R, or the F5 key) then"
  echo "jump to the 'Checking the Installation' section."
fi



# Checking the Installation
The `versioninfo()` function should print your Julia version and some other info about the system:

In [2]:
versioninfo()

Julia Version 1.7.1
Commit ac5cc99908 (2021-12-22 19:35 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Xeon(R) CPU @ 2.20GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-12.0.1 (ORCJIT, broadwell)
Environment:
  JULIA_NUM_THREADS = 2


In [3]:
using BenchmarkTools

M = rand(2^11, 2^11)

@btime $M * $M;

  471.490 ms (2 allocations: 32.00 MiB)


In [4]:
if ENV["COLAB_GPU"] == "1"
    using CUDA

    run(`nvidia-smi`)

    # Create a new random matrix directly on the GPU:
    M_on_gpu = CUDA.CURAND.rand(2^11, 2^11)
    @btime $M_on_gpu * $M_on_gpu; nothing
else
    println("No GPU found.")
end

No GPU found.


# Need Help?

* Learning: https://julialang.org/learning/
* Documentation: https://docs.julialang.org/
* Questions & Discussions:
  * https://discourse.julialang.org/
  * http://julialang.slack.com/
  * https://stackoverflow.com/questions/tagged/julia

If you ever ask for help or file an issue about Julia, you should generally provide the output of `versioninfo()`.

Add new code cells by clicking the `+ Code` button (or _Insert_ > _Code cell_).

Have fun!

<img src="https://raw.githubusercontent.com/JuliaLang/julia-logo-graphics/master/images/julia-logo-mask.png" height="100" />

# Parallelizing with ThreadsX.mapreduce
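ThreadsX is a Julia package that provides multi-threaded versions of some Base functions, such as `sum`, `mapreduce` and `sort`. The cells below use it to sum `1/i` over the integers `i` up to 10^9 whose decimal representation does not contain the digit 9, and compare the run time with the serial Base versions. The kernel installed above runs with `JULIA_NUM_THREADS = 2`, so a speedup of at most about 2x can be expected.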
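
ThreadsX can only use the threads that Julia was started with, so it may be worth confirming the thread count before benchmarking. A minimal check using only `Base.Threads`; the environment lookup assumes the variable was set as in the install cell and prints `unset` otherwise:

```julia
using Base.Threads: nthreads

# Number of threads available to this Julia session.
println("threads available: ", nthreads())

# The environment variable the kernel was configured with, if any.
println("JULIA_NUM_THREADS = ", get(ENV, "JULIA_NUM_THREADS", "unset"))
```

If this reports a single thread, the ThreadsX calls still run, but no speedup over Base should be expected.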

In [6]:
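# Return true if the decimal digits of `digits` occur as a contiguous
# run inside the decimal representation of `num`.
function digitsin(digits::Int, num)
    # Smallest power of 10 that is greater than `digits`
    base = 10
    while digits ÷ base > 0
        base *= 10
    end
    # Compare the trailing digits of `num` with `digits`, then drop the last digit
    while num > 0
        if num % base == digits
            return true
        end
        num ÷= 10
    end
    return false
end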

digitsin (generic function with 1 method)
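
One way to gain confidence in `digitsin` is to compare it with a plain substring test on the decimal strings. The sketch below repeats the function so that it runs on its own; `digitsin_str` is a hypothetical reference implementation that is not part of the notebook, and the range checked is an arbitrary choice.

```julia
# Copy of the notebook's function, repeated so this cell runs on its own.
function digitsin(digits::Int, num)
    base = 10
    while digits ÷ base > 0
        base *= 10
    end
    while num > 0
        if num % base == digits
            return true
        end
        num ÷= 10
    end
    return false
end

# Hypothetical reference implementation (not part of the notebook):
# a substring test on the decimal strings.
digitsin_str(digits::Int, num) = occursin(string(digits), string(num))

# Compare both versions for a few digit patterns over a small range.
agree = all(digitsin(d, n) == digitsin_str(d, n) for d in (9, 12, 123), n in 1:100_000)
println("arithmetic and string versions agree: ", agree)
```

The arithmetic version avoids allocating a string per number, which matters when it is called 10^9 times.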

In [None]:
import Pkg; Pkg.add("ThreadsX")

In [11]:
using BenchmarkTools
using ThreadsX

In [12]:
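# Sum 1/i over all i in 1:n whose decimal representation does not contain `digits`.
function slow(n::Int64, digits::Int)
  # The do block is the mapping function; + combines the per-element results.
  total = ThreadsX.mapreduce(+, 1:n) do i
    if !digitsin(digits, i)
      1.0 / i
    else
      0.0
    end
  end
  return total
end
total = @btime slow(Int64(1e9), 9)
println("total = ", total)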

  24.646 s (146 allocations: 8.75 KiB)
total = 14.241913010383238
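
The `do` block in `slow` is passed as the first argument, so the call has the form `mapreduce(f, +, 1:n)`. The sketch below shows the two spellings side by side on a small range. It uses Base's serial `mapreduce` as a stand-in so that it runs without ThreadsX; `term` is a hypothetical helper, and the digit test is simplified to a string check.

```julia
# Hypothetical helper: 1/i unless the decimal form of i contains the digit 9.
term(i) = occursin('9', string(i)) ? 0.0 : 1.0 / i

# do-block form: the block becomes the first argument of mapreduce.
a = mapreduce(+, 1:10_000) do i
    term(i)
end

# Equivalent explicit form.
b = mapreduce(term, +, 1:10_000)

println(a == b)
```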


In [13]:
using BenchmarkTools
@btime sum(!digitsin(9, i) ? 1.0/i : 0 for i in 1:1_000_000_000)

  30.016 s (0 allocations: 0 bytes)


14.2419130103833

In [14]:
using BenchmarkTools, ThreadsX
@btime ThreadsX.sum(!digitsin(9, i) ? 1.0/i : 0 for i in 1:1_000_000_000)

  23.634 s (130 allocations: 8.27 KiB)


14.241913010383238
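
The serial and threaded results differ in their last digits (14.2419130103833 versus 14.241913010383238). A likely explanation, stated here as an assumption that the notebook does not verify, is that floating-point addition is not associative and a parallel reduction adds the terms in a different order. A minimal illustration using only Base:

```julia
# The same three numbers added with two different groupings.
left  = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)

println(left == right)   # the two groupings need not give the same Float64
println(left ≈ right)    # but they agree to within rounding error

# Summing the same terms forwards and backwards, to show the effect on a series.
forward  = foldl(+, (1.0 / i for i in 1:1_000_000))
backward = foldl(+, (1.0 / i for i in 1_000_000:-1:1))
println(abs(forward - backward))
```

Differences of this size are rounding noise, so results of parallel reductions are better compared with `≈` than with `==`.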

In [15]:
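# Term of the series: 1/i if i does not contain the digit 9, otherwise 0.
function numericTerm(i)
  !digitsin(9, i) ? 1.0/i : 0.0   # return a Float64 in both branches
end
@btime ThreadsX.sum(numericTerm, 1:Int64(1e9))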

  23.930 s (130 allocations: 8.27 KiB)


14.241913010383238
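
To give a rough idea of what a threaded sum has to do, the sketch below splits the range into chunks, sums each chunk in its own task, and adds the partial sums. It is a hand-written illustration using only `Base.Threads`, not ThreadsX's actual implementation; `chunked_sum` and its `nchunks` keyword are hypothetical names.

```julia
using Base.Threads: @spawn, nthreads

# Hypothetical helper: sum f(i) for i in 1:n using one task per chunk.
function chunked_sum(f, n::Int; nchunks::Int = nthreads())
    chunk = cld(n, nchunks)              # chunk length, rounded up
    tasks = map(1:nchunks) do c
        lo = (c - 1) * chunk + 1
        hi = min(c * chunk, n)
        # init = 0.0 keeps the sum defined if a chunk happens to be empty.
        @spawn sum(f, lo:hi; init = 0.0)
    end
    return sum(fetch, tasks)             # wait for each task and add the partial sums
end

println(chunked_sum(i -> 1.0 / i, 1_000_000))
```

Each task touches only its own chunk, so no locking is needed; the only shared step is the final addition of the partial sums.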

In [17]:
n = Int64(1e8)
r = rand(Float32, n)   # 10^8 uniformly distributed Float32 values in [0, 1)
last(r, 10)            # display the last 10 elements

10-element Vector{Float32}:
 0.8472833
 0.18850714
 0.45455682
 0.9774078
 0.17127883
 0.80731434
 0.35670525
 0.77119374
 0.110624015
 0.17942017

In [18]:
@btime sort($r)   # interpolate the global `r`, as recommended for BenchmarkTools

  13.187 s (2 allocations: 381.47 MiB)


100000000-element Vector{Float32}:
 0.0
 0.0
 0.0
 5.9604645f-8
 5.9604645f-8
 5.9604645f-8
 5.9604645f-8
 5.9604645f-8
 5.9604645f-8
 5.9604645f-8
 5.9604645f-8
 5.9604645f-8
 1.1920929f-7
 ⋮
 0.9999999
 0.9999999
 0.9999999
 0.99999994
 0.99999994
 0.99999994
 0.99999994
 0.99999994
 0.99999994
 0.99999994
 0.99999994
 0.99999994

In [19]:
@btime ThreadsX.sort($r)   # interpolate the global `r`, as recommended for BenchmarkTools

  9.572 s (2686866 allocations: 1.07 GiB)


100000000-element Vector{Float32}:
 0.0
 0.0
 0.0
 5.9604645f-8
 5.9604645f-8
 5.9604645f-8
 5.9604645f-8
 5.9604645f-8
 5.9604645f-8
 5.9604645f-8
 5.9604645f-8
 5.9604645f-8
 1.1920929f-7
 ⋮
 0.9999999
 0.9999999
 0.9999999
 0.99999994
 0.99999994
 0.99999994
 0.99999994
 0.99999994
 0.99999994
 0.99999994
 0.99999994
 0.99999994
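
The timings recorded in this notebook can be turned into speedup and parallel efficiency figures. The sketch below copies the `@btime` results from the cells above and assumes that 2 threads were in use, as set by `JULIA_NUM_THREADS`; `speedup` and `efficiency` are hypothetical helper names.

```julia
# Speedup is serial time over threaded time; efficiency is speedup per thread.
speedup(serial, threaded) = serial / threaded
efficiency(serial, threaded, nthreads) = speedup(serial, threaded) / nthreads

# Wall-clock times in seconds, copied from the @btime outputs in this notebook.
timings = [
    ("sum",  30.016, 23.634),   # Base sum vs ThreadsX.sum
    ("sort", 13.187,  9.572),   # Base sort vs ThreadsX.sort
]
nthreads_used = 2               # assumption: JULIA_NUM_THREADS from the install cell

for (name, serial, threaded) in timings
    s = round(speedup(serial, threaded), digits = 2)
    e = round(100 * efficiency(serial, threaded, nthreads_used), digits = 1)
    println(name, ": speedup ≈ ", s, ", efficiency ≈ ", e, "%")
end
```

With two threads the ideal speedup is 2, so both benchmarks fall well short of linear scaling on this runtime.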