perform_regression

What does the function do?

Splits the data into a training set and a testing set (e.g. 80%/20%).
Runs leave-one-out cross-validation (LOOCV) on the training set to pick the best hyperparameter value. For the stepwise methods, the hyperparameter is the maximum number of predictors allowed in the final model (nvmax: data.frame(nvmax = 1:46)). For ridge and lasso regression, it is lambda (lambda = 10^seq(-3, 3, length = 100)).
Predicts on the testing set using the best model chosen on the training set (the one with the lowest RMSE value).
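
The three steps above can be sketched in a few lines. This is only an illustration of the procedure, not the function's actual code: it is written in plain Python rather than R, uses made-up toy data, and fits a one-predictor ridge model without an intercept so that the closed-form solution stays short. The helper names (ridge_slope, loocv_rmse, rmse) are hypothetical.

```python
import math
import random

def ridge_slope(xs, ys, lam):
    """Closed-form ridge estimate for a one-predictor model without intercept."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def loocv_rmse(xs, ys, lam):
    """Leave each row out once, fit on the rest, predict the held-out row."""
    preds = []
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        preds.append(ridge_slope(rest_x, rest_y, lam) * xs[i])
    return rmse(ys, preds)

# Toy data, made up for illustration: y is roughly 2 * x plus noise.
rng = random.Random(1)
xs = [rng.uniform(-3, 3) for _ in range(50)]
ys = [2 * x + rng.gauss(0, 0.5) for x in xs]

# Step 1: seeded 80/20 split (a different seed gives a different split).
idx = list(range(len(xs)))
random.Random(42).shuffle(idx)
n_train = int(0.80 * len(idx))
train, test = idx[:n_train], idx[n_train:]
train_x, train_y = [xs[i] for i in train], [ys[i] for i in train]
test_x, test_y = [xs[i] for i in test], [ys[i] for i in test]

# Step 2: LOOCV on the training set over the same grid as
# 10^seq(-3, 3, length = 100); keep the lambda with the lowest RMSE.
lambdas = [10 ** (-3 + 6 * k / 99) for k in range(100)]
best_lambda = min(lambdas, key=lambda lam: loocv_rmse(train_x, train_y, lam))

# Step 3: refit on the full training set and predict on the testing set.
slope = ridge_slope(train_x, train_y, best_lambda)
test_rmse = rmse(test_y, [slope * x for x in test_x])
print(best_lambda, slope, test_rmse)
```

The testing set is never used while choosing lambda, which is what keeps the final RMSE an honest estimate of out-of-sample error.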

INPUT

my_df - the data
split - the proportion of the data used for the training set (e.g. .80)
type_regres - the regression method, one of:
- "leapForward" - forward stepwise
- "leapBackward" - backward stepwise
- "ridge" - ridge regression (shrinks the coefficients toward zero but never exactly to zero)
- "lasso" - lasso regression (sets the coefficients of some predictors to exactly zero)
seed - the random seed: different seed values result in different data splits

OUTPUT

a data table with the RMSE, R, and MSE values
coefficient values of the best model that made the prediction on the testing set
the best model formula
the plot of the important predictors
the regression plot of prediction on the testing set
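
As a rough illustration of the values in the data table, the sketch below computes them for a handful of made-up observed and predicted values. It is plain Python, not the function's own code. MSE and RMSE follow their standard definitions. What "R" stands for is not stated in this document, so the sketch assumes it is the Pearson correlation between observed and predicted values.

```python
import math

def mse(actual, predicted):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def pearson_r(actual, predicted):
    """Pearson correlation (ASSUMPTION: this is what 'R' means in the table)."""
    mean_a = sum(actual) / len(actual)
    mean_p = sum(predicted) / len(predicted)
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)

# Made-up testing-set values, for illustration only.
observed = [3.0, 5.0, 7.0, 9.0]
predicted = [2.0, 5.0, 8.0, 9.0]

metrics = {
    "RMSE": math.sqrt(mse(observed, predicted)),
    "R": pearson_r(observed, predicted),
    "MSE": mse(observed, predicted),  # 0.5 for these values
}
print(metrics)
```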

perform_regression_permute

What does the function do?

INPUT

mydtt - the data
Niteration - the number of permutations to run
data_split - the proportion of the data used for the training set (e.g. .80)
type_regres - the regression method, one of:
- "leapForward" - forward stepwise
- "leapBackward" - backward stepwise
- "ridge" - ridge regression (shrinks the coefficients toward zero but never exactly to zero)
- "lasso" - lasso regression (sets the coefficients of some predictors to exactly zero)

OUTPUT

a data table with the RMSE, R, and MSE values
coefficient values of the best model that made the prediction on the testing set (one best model per permutation)
the array of seeds used to split data
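
The inputs and outputs above suggest that each permutation is a fresh run on a different random split, with the seed of every split recorded. The sketch below illustrates that reading. It is an assumption, not the function's actual code: it is plain Python, the data are made up, a least-squares slope through the origin stands in for the tuned model, and the names split_fit_score, permute, and master_seed are hypothetical.

```python
import math
import random

def split_fit_score(xs, ys, data_split, seed):
    """One run: seeded split, fit on the training rows, score on the testing rows."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    n_train = int(data_split * len(idx))
    train, test = idx[:n_train], idx[n_train:]
    # Least-squares slope through the origin, a stand-in for the tuned model.
    slope = sum(xs[i] * ys[i] for i in train) / sum(xs[i] ** 2 for i in train)
    mse = sum((ys[i] - slope * xs[i]) ** 2 for i in test) / len(test)
    return {"seed": seed, "coef": slope, "MSE": mse, "RMSE": math.sqrt(mse)}

def permute(xs, ys, n_iteration, data_split, master_seed=0):
    """Repeat the run n_iteration times, each with its own distinct seed."""
    seeds = random.Random(master_seed).sample(range(1, 100000), n_iteration)
    results = [split_fit_score(xs, ys, data_split, s) for s in seeds]
    return results, seeds

# Made-up data: y = 2 * x with a small alternating offset.
xs = [float(i) for i in range(1, 21)]
ys = [2 * x + (1 if i % 2 else -1) for i, x in enumerate(xs)]

results, seeds = permute(xs, ys, n_iteration=5, data_split=0.80)
for row in results:
    print(row)
```

Returning the seeds alongside the results makes any single run reproducible: passing one of them back in as the seed recreates exactly that split.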