# Demo: A simple regression task with JGep
# We start by installing the Julia kernel, which may take a few moments 😴

In [13]:
%%shell
set +e

#---------------------------------------------------#
JULIA_VERSION="1.10.5" # any version ≥ 0.7.0
JULIA_PACKAGES="IJulia BenchmarkTools CSV DataFrames Dates DynamicExpressions FileIO ForwardDiff GZip JSON LineSearches LinearAlgebra Logging Optim OrderedCollections ProgressMeter Random Serialization StaticArrays Statistics Zygote"
JULIA_NUM_THREADS=2
#---------------------------------------------------#

if [ -z `which julia` ]; then
  # Install Julia
  JULIA_VER=`cut -d '.' -f -2 <<< "$JULIA_VERSION"`
  echo "Installing Julia $JULIA_VERSION on the current Colab Runtime..."
  BASE_URL="https://julialang-s3.julialang.org/bin/linux/x64"
  URL="$BASE_URL/$JULIA_VER/julia-$JULIA_VERSION-linux-x86_64.tar.gz"
  if ! wget -nv $URL -O /tmp/julia.tar.gz; then
    echo "Failed to download Julia. Check the URL and your internet connection."
    exit 1
  fi

  if ! tar -x -f /tmp/julia.tar.gz -C /usr/local --strip-components 1; then
    echo "Failed to extract Julia archive. Check if you have sufficient permissions."
    exit 1
  fi

  rm /tmp/julia.tar.gz

  # Install packages
  echo "Installing packages..."
  if ! julia -e "using Pkg; Pkg.add([$(echo $JULIA_PACKAGES | sed "s/ /\", \"/g" | sed "s/^/\"/; s/$/\"/")]); Pkg.precompile()"; then
    echo "Failed to install some packages. Please check the output for details."
  fi

  # Install kernel and rename it to "julia"
  echo "Installing IJulia kernel..."
  if ! julia -e 'using Pkg; Pkg.add("IJulia"); using IJulia; IJulia.installkernel("julia", env=Dict("JULIA_NUM_THREADS"=>"'"$JULIA_NUM_THREADS"'"))'; then
    echo "Failed to install IJulia kernel. Check your internet connection and try again."
    exit 1
  fi

  KERNEL_DIR=`julia -e "using IJulia; print(IJulia.kerneldir())"`
  KERNEL_NAME=`ls -d "$KERNEL_DIR"/julia*`
  if ! mv -f $KERNEL_NAME "$KERNEL_DIR"/julia; then
    echo "Failed to rename kernel. Check if you have sufficient permissions."
    exit 1
  fi

  echo ''
  echo "Successfully installed Julia $JULIA_VERSION with the specified packages!"
  echo "Please reload this page (press Ctrl+R, ⌘+R, or the F5 key) then"
  echo "select 'Julia' from the kernel dropdown menu to start using Julia."
else
  echo "Julia is already installed. Version: `julia -v`"
  echo "Updating packages..."
  if ! julia -e "using Pkg; Pkg.add([$(echo $JULIA_PACKAGES | sed "s/ /\", \"/g" | sed "s/^/\"/; s/$/\"/")]); Pkg.update(); Pkg.precompile()"; then
    echo "Failed to update some packages. Please check the output for details."
  fi
fi



## After that, open the dropdown in the top-right corner (the small triangle pointing downwards) and change the runtime type to the Julia kernel

## In the next cell we just make sure that Julia is installed

In [1]:
versioninfo()

Julia Version 1.10.5
Commit 6f3fdf7b362 (2024-08-27 14:19 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 2 × Intel(R) Xeon(R) CPU @ 2.20GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, broadwell)
Threads: 2 default, 0 interactive, 1 GC (on 2 virtual cores)
Environment:
  LD_LIBRARY_PATH = /usr/local/nvidia/lib:/usr/local/nvidia/lib64
  JULIA_NUM_THREADS = 2


In [2]:
# Install the package; this takes another minute :(
using Pkg
Pkg.add(url="https://github.com/maxreiss123/GEP_SBP_.git")

     Cloning git-repo `https://github.com/maxreiss123/GEP_SBP_.git`
    Updating git-repo `https://github.com/maxreiss123/GEP_SBP_.git`
    Updating registry at `~/.julia/registries/General.toml`
   Resolving package versions...
    Updating `~/.julia/environments/v1.10/Project.toml`
  [2f0a5bb0] + JGep v0.1.0 `https://github.com/maxreiss123/GEP_SBP_.git#master`
    Updating `~/.julia/environments/v1.10/Manifest.toml`
  [2f0a5bb0] + JGep v0.1.0 `https://github.com/maxreiss123/GEP_SBP_.git#master`
Precompiling project...
  ✓ JGep
  1 dependency successfully precompiled in 7 seconds. 130 already precompiled.


In [3]:
#Then we import everything we need; add further libraries if you would like to plot the result
using JGep
using DynamicExpressions
using OrderedCollections
using Random

#Fix the random seed so that the results are reproducible
Random.seed!(1)

TaskLocalRNG()

In [4]:
#Create the utilized symbols: to keep the backend fast, every symbol is tokenized as an Int8 and assigned an arity, i.e. the number of inputs the symbol takes
#The symbol numbers can be chosen arbitrarily, but they must match their corresponding representation later on

#Here we use:
#1:=+   which takes 2 arguments
#2:=*   which takes 2 arguments
#3:=-   which takes 2 arguments
#4:=/   which takes 2 arguments
#5:=exp which takes 1 argument
#
#6:=x1  terminal, takes 0 arguments
#7:=x2  terminal, takes 0 arguments
#8:=2   terminal, takes 0 arguments
#9:=0   terminal, takes 0 arguments

utilized_syms = OrderedDict{Int8,Int8}(1 => 2, 2 => 2, 3 => 2, 4 => 2, 5 => 1, 6 => 0, 7 => 0, 8 => 0, 9 => 0)

OrderedDict{Int8, Int8} with 9 entries:
  1 => 2
  2 => 2
  3 => 2
  4 => 2
  5 => 1
  6 => 0
  7 => 0
  8 => 0
  9 => 0
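
The arity table alone already separates function symbols from terminals. The following is a minimal sketch in plain Julia (a `Dict` stands in for the `OrderedDict`, and the variable names are illustrative, not part of JGep) that derives both groups and the maximum arity from the table above:

```julia
# Arity table copied from the cell above; a plain Dict avoids extra packages.
arities = Dict{Int8,Int8}(1 => 2, 2 => 2, 3 => 2, 4 => 2, 5 => 1,
                          6 => 0, 7 => 0, 8 => 0, 9 => 0)

# Symbols with arity > 0 are functions; arity 0 marks a terminal.
function_syms = sort([s for (s, a) in arities if a > 0])
terminal_syms = sort([s for (s, a) in arities if a == 0])

# The largest arity among all symbols.
max_arity = maximum(values(arities))

println("function symbols: ", function_syms)
println("terminal symbols: ", terminal_syms)
println("maximum arity:    ", max_arity)
```
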

In [5]:
#Here we create a vector of symbols serving as the connection between the genes (+,*)
connection_syms = Int8[1, 2]

2-element Vector{Int8}:
 1
 2

In [6]:
#Here we create the mapping between our tokenization and the symbols used by DynamicExpressions.jl
#The mapping must correspond to the symbols defined above


operators =  OperatorEnum(; binary_operators=[+, -, *, /], unary_operators=[exp])

callbacks = Dict{Int8,Function}(
        3 => (-),
        4 => (/),
        2 => (*),
        1 => (+),
        5 => (exp)
)
nodes = OrderedDict{Int8,Any}(
    6 => Node{Float64}(feature=1),
    7 => Node{Float64}(feature=2),
    8 => 2,
    9 => 0
)


OrderedDict{Int8, Any} with 4 entries:
  6 => x1
  7 => x2
  8 => 2
  9 => 0
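
Because the arity table, the callbacks, and the terminal nodes are defined in three separate places, a mismatch between them is easy to introduce. The sketch below shows one way to cross-check them before a run. `check_symbol_tables` is a hypothetical helper written for this demo, not a JGep function, and the terminal keys are listed by hand so that the snippet runs without DynamicExpressions.jl:

```julia
# Tables copied from the cells above (plain Dicts, no extra packages needed).
arities = Dict{Int8,Int8}(1 => 2, 2 => 2, 3 => 2, 4 => 2, 5 => 1,
                          6 => 0, 7 => 0, 8 => 0, 9 => 0)
callbacks = Dict{Int8,Function}(1 => (+), 2 => (*), 3 => (-), 4 => (/), 5 => exp)
terminal_ids = Int8[6, 7, 8, 9]  # the keys of `nodes`

# Hypothetical helper (not part of JGep): returns a list of inconsistencies.
function check_symbol_tables(arities, callbacks, terminal_ids)
    problems = String[]
    for (sym, arity) in arities
        if arity > 0 && !haskey(callbacks, sym)
            push!(problems, "function symbol $sym has no callback")
        elseif arity == 0 && !(sym in terminal_ids)
            push!(problems, "terminal symbol $sym has no node")
        end
    end
    for sym in keys(callbacks)
        # A callback only makes sense for a symbol declared with arity > 0.
        if get(arities, sym, 0) <= 0
            push!(problems, "callback $sym is not declared as a function symbol")
        end
    end
    return problems
end

problems = check_symbol_tables(arities, callbacks, terminal_ids)
println(isempty(problems) ? "symbol tables are consistent" : join(problems, "\n"))
```
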

In [7]:
#Here we define some hyperparameters for our method


gep_params = Dict{String, AbstractFloat}(
    "one_point_cross_over_prob" => 0.6,
    "two_point_cross_over_prob" => 0.5,
    "mutation_prob" => 1.0,
    "mutation_rate" => 0.05,
    "dominant_fusion_prob" => 0.1,
    "dominant_fusion_rate" => 0.2,
    "rezessiv_fusion_prob" => 0.1,
    "rezessiv_fusion_rate" => 0.2,
    "fusion_prob" => 0.0,
    "fusion_rate" => 0.0,
    "inversion_prob" => 0.1
)

Dict{String, AbstractFloat} with 11 entries:
  "mutation_rate"             => 0.05
  "dominant_fusion_prob"      => 0.1
  "inversion_prob"            => 0.1
  "dominant_fusion_rate"      => 0.2
  "one_point_cross_over_prob" => 0.6
  "mutation_prob"             => 1.0
  "rezessiv_fusion_rate"      => 0.2
  "fusion_rate"               => 0.0
  "rezessiv_fusion_prob"      => 0.1
  "fusion_prob"               => 0.0
  "two_point_cross_over_prob" => 0.5
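
The key names suggest that every entry is a probability or a rate. Assuming each value is meant to lie in the interval [0, 1] (an assumption read off the names, not something confirmed by the JGep documentation), a quick range check could look like this:

```julia
# Values copied from the cell above.
params = Dict{String,Float64}(
    "one_point_cross_over_prob" => 0.6,
    "two_point_cross_over_prob" => 0.5,
    "mutation_prob" => 1.0,
    "mutation_rate" => 0.05,
    "dominant_fusion_prob" => 0.1,
    "dominant_fusion_rate" => 0.2,
    "rezessiv_fusion_prob" => 0.1,
    "rezessiv_fusion_rate" => 0.2,
    "fusion_prob" => 0.0,
    "fusion_rate" => 0.0,
    "inversion_prob" => 0.1,
)

# Collect every key whose value falls outside [0, 1] (assumed valid range).
out_of_range = [k for (k, v) in params if !(0.0 <= v <= 1.0)]

if isempty(out_of_range)
    println("all ", length(params), " parameters lie in [0, 1]")
else
    println("out-of-range parameters: ", join(out_of_range, ", "))
end
```
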

## We now generate the data from the following function:

$$
y = x_1^2 + x_1x_2-2x_2^2
$$

In [9]:
#Generate some data
x_data = randn(Float64, 2, 1000);
y_data = @. x_data[1,:] * x_data[1,:] + x_data[1,:] * x_data[2,:] - 2 * x_data[2,:] * x_data[2,:];

In [10]:
#Setting number of individuals
individuals = 1000

#Setting number of epochs
epochs = 1000

#Setting gene count
gene_count = 3

#Setting head len
head_len = 5;


5
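
In gene expression programming as it is usually described, each gene consists of a head of length $h$ followed by a tail of length $t = h\,(n_{\max} - 1) + 1$, where $n_{\max}$ is the largest arity among the function symbols. The sketch below applies that textbook rule to the settings above. Whether JGep lays out its chromosomes in exactly this way is an assumption, and `tail_len` is an illustrative helper rather than a package function:

```julia
head_len = 5     # as set above
gene_count = 3   # as set above
max_arity = 2    # largest arity in the symbol table (+, *, -, /)

# Textbook GEP rule: the tail must be long enough to close any head.
tail_len(h, n_max) = h * (n_max - 1) + 1

gene_len = head_len + tail_len(head_len, max_arity)
chromosome_len = gene_count * gene_len  # excludes any linking symbols

println("tail length:       ", tail_len(head_len, max_arity))
println("gene length:       ", gene_len)
println("chromosome length: ", chromosome_len)
```
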

In [11]:
#Run the algorithm with the mean squared error as the loss function,
#using conjugate gradient to optimize the coefficients.
#Setting the hall of fame (hof) to 1 returns a list with one element: the best individual

best = runGep(individuals, epochs, head_len, gene_count,
              utilized_syms, operators, callbacks, nodes, x_data, y_data, connection_syms, gep_params;
              loss_fun_str="mse", opt_method_const=:cg, hof=1);

Progress: 100%|█████████████████████████████████████████████████████████████| Time: 0:00:05


In [12]:
#Showing the fitness and the function
@show string(best[1].fitness)
@show string(best[1].compiled_function)

string((best[1]).fitness) = "1.5016484907627395e-31"
string((best[1]).compiled_function) = "(((x1 + ((0.0 - x2) - x2)) + 0.0) * x2) + (x1 * (0.0 + x1))"


"(((x1 + ((0.0 - x2) - x2)) + 0.0) * x2) + (x1 * (0.0 + x1))"
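
The returned expression looks different from the expression used to build `y_data`, but it simplifies to the same polynomial:

$$
(x_1 - 2x_2)\,x_2 + x_1 \cdot x_1 = x_1^2 + x_1x_2 - 2x_2^2
$$

As a numerical cross-check, the sketch below evaluates both forms on fresh random inputs and computes their mean squared error. The names `target`, `found`, and `mse` are illustrative and written for this demo; the loss used inside JGep may be implemented differently.

```julia
using Random

Random.seed!(1)

# Expression used in the data-generation cell.
target(x1, x2) = x1 * x1 + x1 * x2 - 2 * x2 * x2

# Expression returned by the run, copied verbatim from the output above.
found(x1, x2) = (((x1 + ((0.0 - x2) - x2)) + 0.0) * x2) + (x1 * (0.0 + x1))

# Plain mean squared error between two vectors.
mse(y_pred, y_true) = sum(abs2, y_pred .- y_true) / length(y_true)

x = randn(2, 1000)
y_true = target.(x[1, :], x[2, :])
y_pred = found.(x[1, :], x[2, :])

mse_value = mse(y_pred, y_true)
println("MSE between found and target expression: ", mse_value)
```

An error at the level of floating-point rounding is consistent with the reported fitness of about `1.5e-31`.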

[32m[1mStatus[22m[39m `~/.julia/environments/v1.10/Project.toml`
  [90m[7073ff75] [39mIJulia v1.25.0
  [90m[2f0a5bb0] [39mJGep v0.1.0 `https://github.com/maxreiss123/GEP_SBP_.git#master`

