<a href="https://colab.research.google.com/github/Kontrabass2018/DeepLearningGPU/blob/main/Deep_Learning_GPU.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# <img src="https://github.com/JuliaLang/julia-logo-graphics/raw/master/images/julia-logo-color.png" height="100" /> _Colab Notebook Template_

## Instructions
1. Work on a copy of this notebook: _File_ > _Save a copy in Drive_ (you will need a Google account). Alternatively, you can download the notebook using _File_ > _Download .ipynb_, then upload it to [Colab](https://colab.research.google.com/).
2. If you need a GPU: _Runtime_ > _Change runtime type_ > _Hardware accelerator_ = _GPU_.
3. Execute the following cell (click on it and press Ctrl+Enter) to install Julia, IJulia and other packages (if needed, update `JULIA_VERSION` and the other parameters). This takes a couple of minutes.
4. Reload this page (press Ctrl+R, or ⌘+R, or the F5 key) and continue to the next section.

_Notes_:
* If your Colab Runtime gets reset (e.g., due to inactivity), repeat steps 2, 3 and 4.
* After installation, if you want to change the Julia version or activate/deactivate the GPU, you will need to reset the Runtime: _Runtime_ > _Factory reset runtime_ and repeat steps 3 and 4.

In [None]:
%%shell
set -e

#---------------------------------------------------#
JULIA_VERSION="1.8.2" # any version ≥ 0.7.0
JULIA_PACKAGES="IJulia BenchmarkTools"
JULIA_PACKAGES_IF_GPU="CUDA" # or CuArrays for older Julia versions
JULIA_NUM_THREADS=2
#---------------------------------------------------#

if [ -z `which julia` ]; then
  # Install Julia
  JULIA_VER=`cut -d '.' -f -2 <<< "$JULIA_VERSION"`
  echo "Installing Julia $JULIA_VERSION on the current Colab Runtime..."
  BASE_URL="https://julialang-s3.julialang.org/bin/linux/x64"
  URL="$BASE_URL/$JULIA_VER/julia-$JULIA_VERSION-linux-x86_64.tar.gz"
  wget -nv $URL -O /tmp/julia.tar.gz # -nv means "not verbose"
  tar -x -f /tmp/julia.tar.gz -C /usr/local --strip-components 1
  rm /tmp/julia.tar.gz

  # Install Packages
  nvidia-smi -L &> /dev/null && export GPU=1 || export GPU=0
  if [ $GPU -eq 1 ]; then
    JULIA_PACKAGES="$JULIA_PACKAGES $JULIA_PACKAGES_IF_GPU"
  fi
  for PKG in `echo $JULIA_PACKAGES`; do
    echo "Installing Julia package $PKG..."
    julia -e 'using Pkg; pkg"add '$PKG'; precompile;"' &> /dev/null
  done

  # Install kernel and rename it to "julia"
  echo "Installing IJulia kernel..."
  julia -e 'using IJulia; IJulia.installkernel("julia", env=Dict(
      "JULIA_NUM_THREADS"=>"'"$JULIA_NUM_THREADS"'"))'
  KERNEL_DIR=`julia -e "using IJulia; print(IJulia.kerneldir())"`
  KERNEL_NAME=`ls -d "$KERNEL_DIR"/julia*`
  mv -f $KERNEL_NAME "$KERNEL_DIR"/julia

  echo ''
  echo "Successfully installed `julia -v`!"
  echo "Please reload this page (press Ctrl+R, ⌘+R, or the F5 key) then"
  echo "jump to the 'Checking the Installation' section."
fi

Installing Julia 1.8.2 on the current Colab Runtime...
2024-12-12 04:34:48 URL:https://storage.googleapis.com/julialang2/bin/linux/x64/1.8/julia-1.8.2-linux-x86_64.tar.gz [135859273/135859273] -> "/tmp/julia.tar.gz" [1]
Installing Julia package IJulia...
Installing Julia package BenchmarkTools...
Installing Julia package CUDA...
Installing IJulia kernel...
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39mInstalling julia kernelspec in /root/.local/share/jupyter/kernels/julia-1.8

Successfully installed julia version 1.8.2!
Please reload this page (press Ctrl+R, ⌘+R, or the F5 key) then
jump to the 'Checking the Installation' section.




# First Installation

## Checking the Installation
The `versioninfo()` function should print your Julia version and some other info about the system:

In [1]:
versioninfo()

Julia Version 1.8.2
Commit 36034abf260 (2022-09-29 15:21 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 2 × Intel(R) Xeon(R) CPU @ 2.00GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-13.0.1 (ORCJIT, skylake-avx512)
  Threads: 2 on 2 virtual cores
Environment:
  LD_LIBRARY_PATH = /usr/lib64-nvidia
  JULIA_NUM_THREADS = 2


## Installation and importation of useful libraries
In this section we will use the Flux library; see the [official Julia Flux documentation](https://fluxml.ai/Flux.jl).

We will also use the CUDA package; see the [official Julia CUDA documentation](https://cuda.juliagpu.org/stable/).

In [2]:
# only first time!
using Pkg
#Pkg.add("CUDA")
Pkg.add("Flux")
Pkg.add("HDF5")


[32m[1m    Updating[22m[39m registry at `~/.julia/registries/General.toml`
[32m[1m   Resolving[22m[39m package versions...
[32m[1m   Installed[22m[39m RealDot ──────────────── v0.1.0
[32m[1m   Installed[22m[39m DiffRules ────────────── v1.15.1
[32m[1m   Installed[22m[39m GPUArraysCore ────────── v0.1.5
[32m[1m   Installed[22m[39m IRTools ──────────────── v0.4.14
[32m[1m   Installed[22m[39m IrrationalConstants ──── v0.2.2
[32m[1m   Installed[22m[39m Transducers ──────────── v0.4.84
[32m[1m   Installed[22m[39m ContextVariablesX ────── v0.1.3
[32m[1m   Installed[22m[39m ShowCases ────────────── v0.1.0
[32m[1m   Installed[22m[39m ArgCheck ─────────────── v2.4.0
[32m[1m   Installed[22m[39m Accessors ────────────── v0.1.39
[32m[1m   Installed[22m[39m FLoopsBase ───────────── v0.1.1
[32m[1m   Installed[22m[39m DiffResults ──────────── v1.1.0
[32m[1m   Installed[22m[39m Adapt ────────────────── v3.7.2
[32m[1m   Installed[22m[39m S

In [2]:
using CUDA
using Flux
using HDF5
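
Before training, it can help to confirm that CUDA.jl actually sees the GPU selected in the Colab runtime settings. The cell below is a minimal sketch, assuming the CUDA package was installed by the setup cell; it falls back to a message when no GPU is usable rather than raising an error.

```julia
using CUDA

# Report whether CUDA.jl can use a GPU in this runtime.
if CUDA.functional()
    println("GPU available: ", CUDA.name(CUDA.device()))
    x = CUDA.rand(Float32, 1024)   # array allocated directly on the GPU
    println("sum computed on the GPU = ", sum(x))
else
    println("No usable GPU; computations will run on the CPU.")
end
```

If the message reports no usable GPU, check the runtime type and reset the Runtime as described in the notes at the top of the notebook.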

## Importing Data

In [3]:
using Random
function load_tcga_data(infilename; shfl = true)
    # Read the expression matrix and its annotations from the HDF5 file
    infile = h5open(infilename)
    TCGA_data = infile["data"][:,:]
    labs = string.(infile["labels"][:])
    samples = string.(infile["samples"][:])
    genes = string.(infile["genes"][:])
    biotypes = string.(infile["biotypes"][:])
    close(infile)
    # Apply one shared permutation so rows, labels and sample IDs stay aligned
    ids = collect(1:length(labs))
    shfl && (ids = shuffle(ids))
    return TCGA_data[ids,:], labs[ids], samples[ids], genes, biotypes
end

function fetch_data(filename; shfl = true)
    # Download the file only if it is not already in the working directory
    if !isfile(filename)
        tcga_data_url = "https://bioinfo.iric.ca/~sauves/VARIA/$filename"

        # Julia runs backtick commands without a shell, so the interpolated
        # URL reaches wget as a single argument and needs no escaping
        run(`wget $tcga_data_url`)
    end
    load_tcga_data(filename; shfl = shfl)
end

fetch_data (generic function with 1 method)

In [4]:
infilename = "TCGA_19962_TPM_lab.h5"
TCGA_data, labels, samples, genes, biotypes = fetch_data(infilename);
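
The loader shuffles the samples by applying a single index permutation to the data rows, the labels and the sample IDs. The sketch below illustrates that idea on a small toy dataset; the names `toy_data` and `toy_labels` and their values are hypothetical and are not part of the TCGA file. It also counts samples per label, which is a quick sanity check that could be applied to `labels` as well.

```julia
using Random

# Hypothetical toy data: 4 samples (rows) × 3 genes (columns).
toy_data = Float32[1 2 3; 4 5 6; 7 8 9; 10 11 12]
toy_labels = ["BRCA", "LUAD", "BRCA", "KIRC"]

# Draw one permutation and apply it to every per-sample array.
rng = MersenneTwister(42)
ids = shuffle(rng, collect(1:length(toy_labels)))
shuffled_data = toy_data[ids, :]
shuffled_labels = toy_labels[ids]

# Each shuffled row still carries its original label.
for (new_row, old_row) in enumerate(ids)
    @assert shuffled_data[new_row, :] == toy_data[old_row, :]
    @assert shuffled_labels[new_row] == toy_labels[old_row]
end

# Count samples per label with a plain Dict.
counts = Dict{String,Int}()
for lab in shuffled_labels
    counts[lab] = get(counts, lab, 0) + 1
end
println(counts)
```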

# Installing the DNN architecture

In [7]:
Pkg.add(url="https://github.com/Kontrabass2018/FactorizedEmbeddings")

[32m[1m     Cloning[22m[39m git-repo `https://github.com/Kontrabass2018/FactorizedEmbeddings`
[32m[1m    Updating[22m[39m git-repo `https://github.com/Kontrabass2018/FactorizedEmbeddings`
[32m[1m   Resolving[22m[39m package versions...
[32m[1m   Installed[22m[39m ProgressMeter ─ v1.10.2
[32m[1m    Updating[22m[39m `~/.julia/environments/v1.8/Project.toml`
 [90m [88092329] [39m[92m+ FactorizedEmbeddings v0.1.0 `https://github.com/Kontrabass2018/FactorizedEmbeddings#main`[39m
[32m[1m    Updating[22m[39m `~/.julia/environments/v1.8/Manifest.toml`
 [90m [88092329] [39m[92m+ FactorizedEmbeddings v0.1.0 `https://github.com/Kontrabass2018/FactorizedEmbeddings#main`[39m
 [90m [92933f4c] [39m[92m+ ProgressMeter v1.10.2[39m
[32m[1mPrecompiling[22m[39m project...
[32m  ✓ [39m[90mProgressMeter[39m
[32m  ✓ [39mFactorizedEmbeddings
  2 dependencies successfully precompiled in 13 seconds. 126 already precompiled.


In [5]:
using FactorizedEmbeddings

# Training a Factorized Embeddings model on the input data using the GPU

In [6]:
redux_data = fit_transform(TCGA_data, verbose =1, nsteps=20_000)

[33m[1m│ [22m[39m - To prevent this behaviour, do `ProgressMeter.ijulia_behavior(:append)`. 
[33m[1m└ [22m[39m[90m@ ProgressMeter ~/.julia/packages/ProgressMeter/kVZZH/src/ProgressMeter.jl:594[39m
[32mProgress: 100%|███████████████████████████████████████████████| Time: 0:05:03 (30.30 ms/it)[39m
[34m  step:     10000[39m
[34m  loss:     0.07134086548185349[39m
[34m  pearson:  0.9381878[39m


2×10344 Matrix{Float32}:
 -0.728248  -0.362204   0.171727  -0.84398   …  0.0995332  0.0255205  -0.0844646  -0.143073
  0.271764   0.755781  -0.145214   0.295076     0.214705   0.531831    1.16004    -0.0437568
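
The training log reports a `loss` and a `pearson` value. As a rough illustration, and assuming (the package internals are not shown in this notebook) that `pearson` is the Pearson correlation between observed and predicted expression values and that the loss is a mean squared error, both quantities can be computed as in the sketch below. The vectors `observed` and `predicted` are hypothetical values chosen for illustration only.

```julia
using Statistics

# Hypothetical observed and predicted values, for illustration only.
observed  = Float32[0.1, 0.4, 0.35, 0.8, 0.9]
predicted = Float32[0.15, 0.38, 0.30, 0.75, 0.95]

# Pearson correlation: 1 means a perfect linear relationship.
r = cor(observed, predicted)

# Mean squared error between the two vectors (assumed form of the loss).
mse = mean((observed .- predicted) .^ 2)

println("pearson = ", r, ", mse = ", mse)
```

A correlation close to 1 together with a small error suggests that the model reconstructs the input well.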

## Visualizing the reduced data

In [None]:
using Pkg
Pkg.add("CairoMakie")

[32m[1m    Updating[22m[39m registry at `~/.julia/registries/General.toml`
[32m[1m   Resolving[22m[39m package versions...
[32m[1m   Installed[22m[39m JpegTurbo_jll ───────────── v3.0.4+0
[32m[1m   Installed[22m[39m x265_jll ────────────────── v3.6.0+0
[32m[1m   Installed[22m[39m libfdk_aac_jll ──────────── v2.0.3+0
[32m[1m   Installed[22m[39m ImageIO ─────────────────── v0.6.9
[32m[1m   Installed[22m[39m AxisArrays ──────────────── v0.4.7
[32m[1m   Installed[22m[39m TiffImages ──────────────── v0.11.2
[32m[1m   Installed[22m[39m Libmount_jll ────────────── v2.40.2+0
[32m[1m   Installed[22m[39m LERC_jll ────────────────── v4.0.0+0
[32m[1m   Installed[22m[39m JpegTurbo ───────────────── v0.1.5
[32m[1m   Installed[22m[39m StatsFuns ───────────────── v1.3.2
[32m[1m   Installed[22m[39m HypergeometricFunctions ─── v0.3.25
[32m[1m   Installed[22m[39m AdaptivePredicates ──────── v1.2.0
[32m[1m   Installed[22m[39m PNGFiles ───────────

In [None]:
using CairoMakie

In [None]:
Pkg.add("CSV")
Pkg.add("DataFrames")  # provides the DataFrame type passed to CSV.read below
using CSV
using DataFrames

In [None]:
## get colors and definition file
TCGA_colors_file = "TCGA_colors_def.txt"
colors_url = "https://bioinfo.iric.ca/~sauves/VARIA/$TCGA_colors_file"
run(`wget $colors_url`)  # no shell is involved, so the URL needs no escaping
## build scatter plot figure
fig = Figure(size = (1024,800));
ax = Axis(fig[1,1], title = "Sample patient embedding", xlabel = "Patient-FE-1", ylabel = "Patient-FE-2", aspect = 1);
colors_labels_df = CSV.read(TCGA_colors_file, DataFrame)
## plot the embedding computed above (`redux_data`, one column per sample), one color per label
for group_lab in unique(labels)
    group = labels .== group_lab
    col = colors_labels_df[colors_labels_df[:,"labs"] .== group_lab,"hexcolor"][1]
    name = colors_labels_df[colors_labels_df[:,"labs"] .== group_lab,"name"][1]
    scatter!(ax, redux_data[1,group], redux_data[2,group], strokewidth = 0.1, color = String(col), label = name, marker = :circle)
end
fig
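
The scatter calls above pass a `label` for each group, but a legend only appears once one is added to the figure. The following is a minimal sketch on a small synthetic embedding, assuming CairoMakie is installed; the arrays `embed` and `groups` and the output file name `embedding.png` are hypothetical. It shows how a legend can be placed next to the axis and how the figure can be written to disk.

```julia
using CairoMakie

# Hypothetical 2-D embedding: one column per sample, as in `redux_data`.
embed = Float32[0.1 0.2 0.9 1.0 -0.5 -0.6;
                0.3 0.1 0.8 0.7 -0.4 -0.2]
groups = ["A", "A", "B", "B", "C", "C"]

fig = Figure(size = (600, 400))
ax = Axis(fig[1, 1], xlabel = "dim 1", ylabel = "dim 2")

# One scatter call per group so each gets its own legend entry.
for g in unique(groups)
    mask = groups .== g
    scatter!(ax, embed[1, mask], embed[2, mask], label = g)
end

# Collect the labelled plots of `ax` into a legend in the next grid column.
Legend(fig[1, 2], ax, "Group")

# Write the figure to a PNG file in the working directory.
save("embedding.png", fig)
```

With many groups, as in the TCGA data, a separate legend column tends to stay more readable than a legend drawn inside the axis.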

# Need Help?

* Learning: https://julialang.org/learning/
* Documentation: https://docs.julialang.org/
* Questions & Discussions:
  * https://discourse.julialang.org/
  * http://julialang.slack.com/
  * https://stackoverflow.com/questions/tagged/julia

If you ever ask for help or file an issue about Julia, you should generally provide the output of `versioninfo()`.

Add new code cells by clicking the `+ Code` button (or _Insert_ > _Code cell_).

Have fun!

<img src="https://raw.githubusercontent.com/JuliaLang/julia-logo-graphics/master/images/julia-logo-mask.png" height="100" />