R if in for loop error - how can I save selected model? #539

Closed
TomJang opened this issue Feb 23, 2022 · 5 comments
Comments

TomJang commented Feb 23, 2022

I am new to R and am having difficulty using "if" inside a "for" loop. Sorry if this is a duplicate.

As you can see in the chunk of code below, I am trying to create 100 lm models and save a model only when its R (the correlation between predicted and observed values) is more than 0.7.

However, the code saved all 100 lm models.

I suspect the statement (!is.na(lm.cv.r[i]) > 0.70) is wrong, but I cannot figure out why.

Let's use the USArrests data as an example:

```r
data("USArrests")
head(USArrests)
df.norm <- USArrests

set.seed(100)
lm.cv.mse <- NULL
lm.cv.r <- NULL
k <- 100

for(i in 1:k){

  index.cv <- sample(1:nrow(df.norm), round(0.8*nrow(df.norm)))
  df.cv.train <- df.norm[index.cv, ]
  df.cv.test <- df.norm[-index.cv, ]

  lm.cv <- glm(Rape~., data = df.cv.train)

  lm.cv.predicted <- predict(lm.cv, df.cv.test)

  lm.cv.mse[i] <- sum((df.cv.test$Rape - lm.cv.predicted)^2)/nrow(df.cv.test)
  lm.cv.r[i] <- as.numeric(round(cor(lm.cv.predicted, df.cv.test$Rape, method = "pearson"), digits = 3))

  # the condition I suspect is wrong
  if (!is.na(lm.cv.r[i]) > 0.70){
    saveRDS(lm.cv, file = paste("lm.cv", lm.cv.r[i], ".rds", sep = ''))
  }

}
```

Contributor

jlooper commented Feb 23, 2022

hi @R-icntay could you lend your expertise here please?

Contributor

R-icntay commented Feb 26, 2022

Hello @TomJang, @jlooper

Firstly, thank you for providing a reproducible example. You were almost there, so good job!
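The only thing missing was a second condition that actually compares the value with the threshold, i.e. lm.cv.r[i] > 0.70. On its own, !is.na(lm.cv.r[i]) > 0.70 never looks at the size of the correlation: it is TRUE for every non-missing value, which is why every model was saved. I have modified your example so that it works as expected. It also puts the iteration number i in the file name, so that two models with the same correlation do not overwrite each other.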

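```r
data("USArrests")
head(USArrests)
df.norm <- USArrests

set.seed(100)
lm.cv.mse <- NULL
lm.cv.r <- NULL
k <- 100

for(i in 1:k){

  # Random 80/20 train/test split
  index.cv <- sample(1:nrow(df.norm), round(0.8*nrow(df.norm)))
  df.cv.train <- df.norm[index.cv, ]
  df.cv.test <- df.norm[-index.cv, ]

  # Gaussian glm, i.e. an ordinary linear model of Rape on the other columns
  lm.cv <- glm(Rape~., data = df.cv.train)

  lm.cv.predicted <- predict(lm.cv, df.cv.test)

  # Column names are case-sensitive: the response is Rape, not rape
  lm.cv.mse[i] <- sum((df.cv.test$Rape - lm.cv.predicted)^2)/nrow(df.cv.test)
  lm.cv.r[i] <- as.numeric(round(cor(lm.cv.predicted, df.cv.test$Rape, method = "pearson"), digits = 3))

  # Save only when the correlation is not missing AND exceeds the threshold
  if (!is.na(lm.cv.r[i]) && lm.cv.r[i] > 0.70){
    # i in the file name keeps models with equal correlations from overwriting each other
    saveRDS(lm.cv, file = paste0("lm.cv_", i, "_", lm.cv.r[i], ".rds"))
  }
}
```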
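
To see why the original condition let every model through, here is a minimal sketch that evaluates both conditions on a few made-up correlation values (the numbers in `r.values` are hypothetical, chosen only for illustration). In R, `!` binds more loosely than `>`, so `!is.na(x) > 0.70` is parsed as `!(is.na(x) > 0.70)`, which only tells you whether `x` is missing.

```r
# Hypothetical correlation values, for illustration only
r.values <- c(0.35, 0.82, NA)

# Original condition: parsed as !(is.na(x) > 0.70), so it is TRUE for
# every non-missing value regardless of how large the correlation is
original <- sapply(r.values, function(x) !is.na(x) > 0.70)

# Corrected condition: check for NA first, then compare the value itself;
# && short-circuits, so the comparison is skipped when x is NA
corrected <- sapply(r.values, function(x) !is.na(x) && x > 0.70)

print(data.frame(r = r.values, original = original, corrected = corrected))
```

Only the corrected column is TRUE for 0.82 alone; the original column is TRUE for both non-missing values, including 0.35.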

We invite you to check out our R lessons, which show you how to build Machine Learning models using the Tidymodels framework: https://github.com/microsoft/ML-For-Beginners.

Do enjoy the ride and feel free to reach out in case of any difficulty.
Happy leaRning!

Author

TomJang commented Feb 28, 2022 via email

Contributor

jlooper commented Mar 1, 2022

all set? should I close this? thanks everyone!

Contributor

R-icntay commented Mar 1, 2022

Yes yes Jen.

All good here!

@jlooper jlooper closed this as completed Mar 5, 2022