The entire process is illustrated in the code below:
Part 1: Generate some sample data ("train_data")
Part 2: Define the "fitness function" : the objective of my problem is to generate 7 random numbers :
"random_1" (between 80 and 120)
"random_2" (between "random_1" and 120)
"random_3" (between 85 and 120)
"random_4" (between random_2 and 120)
"split_1" (between 0 and 1)
"split_2" (between 0 and 1)
"split_3" (between 0 and 1 )
and use these numbers to perform a series of data manipulation procedures on the train data. At the end of these data manipulation procedures, a "total" mean variable is calculated.
Part 3: The purpose of the "genetic algorithm" is to find the set of these 7 numbers that produce the largest value of the "total".
Below, I illustrate this entire process :
Part 1
#load libraries
library(dplyr)
library(GA)
# create some data for this example
a1 = rnorm(1000,100,10)
b1 = rnorm(1000,100,5)
c1 = sample.int(1000, 1000, replace = TRUE)
train_data = data.frame(a1,b1,c1)
Part 2
#define fitness function
fitness <- function(random_1, random_2, random_3, random_4, split_1, split_2, split_3) {
  # bin data according to random criteria
  train_data <- train_data %>%
    mutate(cat = ifelse(a1 <= random_1 & b1 <= random_3, "a",
                 ifelse(a1 <= random_2 & b1 <= random_4, "b", "c")))
  train_data$cat <- as.factor(train_data$cat)

  # new splits
  a_table <- train_data %>%
    filter(cat == "a") %>%
    select(a1, b1, c1, cat)
  b_table <- train_data %>%
    filter(cat == "b") %>%
    select(a1, b1, c1, cat)
  c_table <- train_data %>%
    filter(cat == "c") %>%
    select(a1, b1, c1, cat)

  # calculate quantile ("quant") for each bin
  table_a <- data.frame(a_table %>% group_by(cat) %>%
    mutate(quant = quantile(c1, probs = split_1)))
  table_b <- data.frame(b_table %>% group_by(cat) %>%
    mutate(quant = quantile(c1, probs = split_2)))
  table_c <- data.frame(c_table %>% group_by(cat) %>%
    mutate(quant = quantile(c1, probs = split_3)))

  # create a new variable ("diff") that flags whether the quantile is bigger than the value of "c1"
  table_a$diff <- ifelse(table_a$quant > table_a$c1, 1, 0)
  table_b$diff <- ifelse(table_b$quant > table_b$c1, 1, 0)
  table_c$diff <- ifelse(table_c$quant > table_c$c1, 1, 0)

  # group all tables
  final_table <- rbind(table_a, table_b, table_c)

  # calculate the total mean: this is what needs to be optimized
  # (returned directly, without shadowing the base function `mean`)
  mean(final_table$diff)
}
Part 3
#run the genetic algorithm (20 times to keep it short):
GA <- ga(type = "real-valued",
fitness = function(x) fitness(x[1], x[2], x[3], x[4], x[5], x[6], x[7]),
lower = c(80, 80, 80, 80, 0,0,0), upper = c(120, 120, 120, 120, 1,1,1),
popSize = 50, maxiter = 20, run = 20)
The above code (Part 1, Part 2, Part 3) all works fine.
Problem: Now, I am trying to produce some of the visual plots from the tutorial:
First Plot - This Works:
plot(GA)
But I can't seem to produce the other plots from the tutorial:
Second Plot: Does Not Work
lbound <- 80
ubound <- 120
curve(fitness, from = lbound, to = ubound, n = 1000)
points(GA@solution, GA@fitnessValue, col = 2, pch = 19)
Error: Problem with `mutate()` column `cat`.
i `cat = ifelse(...)`.
x argument "random_3" is missing, with no default
Run `rlang::last_error()` to see where the error occurred.
Error in xy.coords(x, y) : 'x' and 'y' lengths differ
Third Plot : Does Not Work
random_1 <- random_2 <- seq(80, 120, by = 0.1)
f <- outer(x1, x2, fitness)
persp3D(x1, x2, fitness, theta = 50, phi = 20, col.palette = bl2gr.colors)
Error: Problem with `mutate()` column `cat`.
i `cat = ifelse(...)`.
x argument "random_3" is missing, with no default
Run `rlang::last_error()` to see where the error occurred.
In addition: Warning message:
Error: Problem with `mutate()` column `cat`.
i `cat = ifelse(...)`.
x argument "random_3" is missing, with no default
Run `rlang::last_error()` to see where the error occurred.
Error in z[-1, -1] : object of type 'closure' is not subsettable
Fourth Plot: Does Not Work
filled.contour(random_1, random_2, fitness, color.palette = bl2gr.colors)
Error in min(x, na.rm = na.rm) : invalid 'type' (list) of argument
Can someone please show me how to fix these errors?
Thanks
The issue is that you are trying to adapt R code written for one-dimensional and two-dimensional problems to your case, which is a 7-dimensional problem. That simply cannot work, and R raises errors when you try.
For instance, when you use
curve(fitness, from = lbound, to = ubound, n = 1000)
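you are trying to plot the one-dimensional function fitness from lbound to ubound on a regular grid of n points. But fitness has 7 input arguments. curve() calls it with a single argument, so random_1 receives the whole grid and the other six arguments are missing, which is exactly what the 'argument "random_3" is missing' error reports. How can you plot a 7-dimensional function that way?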
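One possible workaround, not taken from the tutorial, is to plot a slice of the function: vary one or two arguments on a grid and hold the others fixed, for example at the best point found by the GA. The sketch below uses base R only. `fitness7`, `best`, `slice_1d` and `slice_2d` are hypothetical names introduced for illustration; `fitness7` is a simple stand-in for the real 7-argument fitness function, and `best` stands in for the GA solution.

```r
# Hypothetical stand-in for the 7-argument fitness function in the question;
# it only needs to accept seven scalars and return one number.
fitness7 <- function(r1, r2, r3, r4, s1, s2, s3) {
  -((r1 - 95)^2 + (r2 - 105)^2) / 100 + s1 * s2 * s3 - abs(r3 - r4) / 50
}

# Hypothetical "best" point (assumption); with a fitted GA object this
# would be taken from the solution slot instead.
best <- c(95, 105, 100, 110, 0.5, 0.5, 0.5)

# 1-D slice: vary argument i over the vector x, hold the other six at `best`.
slice_1d <- function(x, i = 1) {
  sapply(x, function(v) {
    p <- best
    p[i] <- v
    do.call(fitness7, as.list(p))
  })
}

curve(slice_1d(x, i = 1), from = 80, to = 120, n = 200,
      xlab = "random_1", ylab = "fitness (other arguments fixed)")
points(best[1], do.call(fitness7, as.list(best)), col = 2, pch = 19)

# 2-D slice: vary arguments i and j element-wise, hold the other five at `best`.
# outer() passes two long vectors, so the function must be vectorised (mapply).
slice_2d <- function(a, b, i = 1, j = 2) {
  mapply(function(u, v) {
    p <- best
    p[c(i, j)] <- c(u, v)
    do.call(fitness7, as.list(p))
  }, a, b)
}

x1 <- seq(80, 120, by = 1)
x2 <- seq(80, 120, by = 1)
z  <- outer(x1, x2, slice_2d)  # numeric matrix, one row per x1 value

persp(x1, x2, z, theta = 50, phi = 20,
      xlab = "random_1", ylab = "random_2", zlab = "fitness")
filled.contour(x1, x2, z, xlab = "random_1", ylab = "random_2")
```

With the objects from the question, `fitness7` would be replaced by `fitness` and `best` by `GA@solution[1, ]` (the solution slot is a matrix, so one row has to be picked). Because the real fitness function runs a dplyr pipeline on every call, a coarse grid such as `by = 1` is probably more practical than `by = 0.1`. Each plot shows only one slice through the 7-dimensional surface, so it depends on the values chosen for the fixed arguments.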
I am working with R. I am following this tutorial (https://cran.r-project.org/web/packages/GA/vignettes/GA.html) and am learning how to optimize functions using the "genetic algorithm".