Template for finding the best potential predictive models for a given dataset, using a shotgun approach.
classification_data
predictive_analysis_classification_files/figure-markdown_github
predictive_analysis_regression_files/figure-markdown_github
regression_data
.gitignore
README.md
predictive_analysis_classification.Rmd
predictive_analysis_classification.md
predictive_analysis_regression.Rmd
predictive_analysis_regression.md


r-predictive-analysis-template

The main objective of the associated *.Rmd files is to find the best candidate predictive models for a given dataset, using a shotgun approach: trying many different models with reasonable defaults.

The intent is not to skip the thinking process. The intent is to get a lot of information in a relatively short amount of time.

The information will help determine which potential models are worth spending time on and further optimizing/improving.

predictive_analysis_regression.Rmd

The two main outputs for the regression analysis are:

  • shotgun approach: plot showing cross-validated RMSE and R-Squared for a variety of models (with reasonable defaults) trained on a training set.
    • For example:

spot_check

  • final models: plot showing RMSE, MAE, and correlation for the top x (e.g. 5) models that have been retrained on the entire training set (as opposed to cross-validated), and then tested on the test set (data points that the models have not seen).
    • For example:

final_models
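
The two regression steps above can be sketched in base R. This is only an illustration under stated assumptions, not the code in the template: the synthetic `dataset`, the three `lm()` formulas in `candidates`, and the helper `rmse()` are all hypothetical, and the actual .Rmd may use a modelling package and many more model families.

```r
# Hypothetical sketch of the shotgun approach for regression (base R only).
set.seed(42)

# Synthetic data standing in for a real dataset (assumption).
n <- 200
dataset <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dataset$y <- 3 + 2 * dataset$x1 - dataset$x2^2 + rnorm(n, sd = 0.5)

# Hold out 20% of the rows as a test set the models never see during training.
test_idx <- sample(n, size = 0.2 * n)
train <- dataset[-test_idx, ]
test  <- dataset[test_idx, ]

# Candidate models; a real run would span many different model families.
candidates <- list(
  linear    = y ~ x1 + x2,
  quadratic = y ~ x1 + x2 + I(x2^2),
  x1_only   = y ~ x1
)

rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

# Step 1: 5-fold cross-validated RMSE and R-squared on the training set.
k <- 5
folds <- sample(rep(seq_len(k), length.out = nrow(train)))

cv_results <- do.call(rbind, lapply(names(candidates), function(name) {
  per_fold <- sapply(seq_len(k), function(i) {
    fit  <- lm(candidates[[name]], data = train[folds != i, ])
    held <- train[folds == i, ]
    pred <- predict(fit, newdata = held)
    c(rmse = rmse(held$y, pred), r2 = cor(held$y, pred)^2)
  })
  data.frame(model   = name,
             cv_rmse = mean(per_fold["rmse", ]),
             cv_r2   = mean(per_fold["r2", ]),
             stringsAsFactors = FALSE)
}))
cv_results <- cv_results[order(cv_results$cv_rmse), ]

# Step 2: retrain the top x models on the entire training set,
# then score them on the held-out test set.
top_x <- 2
final_results <- do.call(rbind, lapply(head(cv_results$model, top_x), function(name) {
  fit  <- lm(candidates[[name]], data = train)
  pred <- predict(fit, newdata = test)
  data.frame(model       = name,
             rmse        = rmse(test$y, pred),
             mae         = mean(abs(test$y - pred)),
             correlation = cor(test$y, pred),
             stringsAsFactors = FALSE)
}))

print(cv_results)
print(final_results)
```

Sorting `cv_results` by cross-validated RMSE is what selects the top x candidates; only those are refit and scored on the test set, so the test data plays no part in choosing between models.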

predictive_analysis_classification.Rmd

The two main outputs for the classification analysis are:

  • shotgun approach: plot showing cross-validated ROC/AUC, Sensitivity, and Specificity for a variety of models (with reasonable defaults) trained on a training set.

    • For example: spot_check
  • final models: plot showing ROC/AUC, Sensitivity, and Specificity for the top x (e.g. 5) models that have been retrained on the entire training set (as opposed to cross-validated), and then tested on the test set (data points that the models have not seen).

    • For example:

final_models
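
As a rough illustration of the three classification metrics named above, the sketch below fits a single logistic regression in base R and scores it on a held-out test set. The synthetic `dataset`, the helpers `auc()` and `classification_metrics()`, and the 0.5 decision threshold are assumptions made for the example; the template itself may compute these metrics differently.

```r
# Hypothetical sketch: AUC, sensitivity, and specificity on a test set (base R only).
set.seed(42)

# Synthetic binary-outcome data standing in for a real dataset (assumption).
n <- 300
dataset <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dataset$target <- rbinom(n, size = 1, prob = plogis(1.5 * dataset$x1 - dataset$x2))

# Hold out 20% of the rows as a test set.
test_idx <- sample(n, size = 0.2 * n)
train <- dataset[-test_idx, ]
test  <- dataset[test_idx, ]

# AUC via the rank-sum (Mann-Whitney) formulation: the probability that a
# randomly chosen positive case receives a higher score than a negative one.
auc <- function(actual, score) {
  r     <- rank(score)
  n_pos <- sum(actual == 1)
  n_neg <- sum(actual == 0)
  (sum(r[actual == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

classification_metrics <- function(actual, score, threshold = 0.5) {
  predicted <- as.integer(score >= threshold)
  c(auc         = auc(actual, score),
    sensitivity = sum(predicted == 1 & actual == 1) / sum(actual == 1),  # true positive rate
    specificity = sum(predicted == 0 & actual == 0) / sum(actual == 0))  # true negative rate
}

# Train on the entire training set, then evaluate on unseen test rows.
fit          <- glm(target ~ x1 + x2, data = train, family = binomial)
test_scores  <- predict(fit, newdata = test, type = "response")
test_metrics <- classification_metrics(test$target, test_scores)

print(round(test_metrics, 3))
```

AUC does not depend on the threshold, whereas sensitivity and specificity do, so changing `threshold` trades one of the latter two against the other without changing the AUC.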