Project carried out in April 2020 for "Programming for Big Data" (Higher Diploma in Data Analytics at National College of Ireland).
The data sources can be downloaded from:
- White Wine: https://www.openml.org/d/40498
- Red Wine: https://www.openml.org/d/40691
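Once both files are downloaded, they can be read into R. The following is only a minimal sketch, assuming each dataset was exported from OpenML as a CSV file; the helper name `load_wine`, the `wine_type` column, and the file names in the usage comments are hypothetical and not part of the project code.

```r
# Hypothetical helper: read one exported CSV file and tag every row
# with the wine type it came from.
load_wine <- function(path, wine_type) {
  wine <- read.csv(path, stringsAsFactors = FALSE)
  wine$wine_type <- wine_type
  wine
}

# Usage (file names are assumptions; adjust to where the files were saved):
# white <- load_wine("wine-quality-white.csv", "white")
# red   <- load_wine("wine-quality-red.csv", "red")
# wines <- rbind(white, red)  # requires identical column names in both files
```

Only base R is used here, so the sketch runs before any of the libraries are installed.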
To run the code, the following libraries are needed:
library(tidyverse)
library(tidyr)
library(tibble)
library(ggplot2)
library(readr)
library(dplyr)
library(openintro)
library(tools)
library(GGally)
library(forcats)
library(ggpubr)
library(mvShapiroTest)
library(gridExtra)
library(moments)
library(cowplot)
library(nortest)
library(rstatix)
library(FSA)
library(ggcorrplot)
library(rpart)
library(rpart.plot)
library(caret)
library(e1071)
library(randomForest)
library(gdata)
library(DiagrammeR)
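If some of these libraries are not yet installed, the `library()` calls will fail. The sketch below shows one possible way to find and install the missing ones. The helper names `missing_packages` and `install_missing` are hypothetical, and the sketch assumes every listed package is available from the repository configured in your R session.

```r
# Packages loaded by the library() calls above
required_pkgs <- c(
  "tidyverse", "tidyr", "tibble", "ggplot2", "readr", "dplyr",
  "openintro", "tools", "GGally", "forcats", "ggpubr", "mvShapiroTest",
  "gridExtra", "moments", "cowplot", "nortest", "rstatix", "FSA",
  "ggcorrplot", "rpart", "rpart.plot", "caret", "e1071", "randomForest",
  "gdata", "DiagrammeR"
)

# Hypothetical helper: return the packages in `pkgs` that are not installed.
missing_packages <- function(pkgs, installed = rownames(installed.packages())) {
  setdiff(pkgs, installed)
}

# Hypothetical helper: install only what is missing.
# Assumes the packages can be found in the configured repository.
install_missing <- function(pkgs) {
  to_install <- missing_packages(pkgs)
  if (length(to_install) > 0) {
    install.packages(to_install)
  }
  invisible(to_install)
}

# Usage:
# install_missing(required_pkgs)
```

Note that `tidyverse` already attaches `tidyr`, `tibble`, `ggplot2`, `readr`, `dplyr` and `forcats`, so loading them again separately is harmless but redundant.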