This task involves a statistical analysis in R of a diamond dataset, to find correlations between its variables and to assess whether any of them can usefully be predicted with a suitable machine learning model.
The CSV dataset contains 50,000 observations that describe diamonds using 10 variables, such as cut, carat, depth, color, clarity and price. The project starts with an exploratory data analysis to spot relationships in the data.
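As a rough illustration of that exploratory step, the sketch below computes a correlation matrix over the numeric variables and fits a simple linear model as a baseline check of predictive value. It is not the code from the project: the helper `numeric_correlations` is a hypothetical name, the `toy` data frame holds made-up values standing in for the real CSV, and the linear model is only one possible stand-in for the model the project uses.

```r
# Hypothetical helper: correlation matrix of the numeric columns of a data frame.
numeric_correlations <- function(df) {
  numeric_cols <- vapply(df, is.numeric, logical(1))
  cor(df[, numeric_cols, drop = FALSE], use = "complete.obs")
}

# Toy stand-in for the real dataset (values are made up for illustration only).
# With the actual file, the data frame would come from read.csv() instead.
toy <- data.frame(
  carat = c(0.23, 0.31, 0.70, 1.01, 1.50),
  depth = c(61.5, 62.3, 60.9, 62.8, 61.2),
  price = c(326, 335, 2757, 4500, 9800),
  cut   = factor(c("Ideal", "Good", "Premium", "Good", "Ideal"))
)

# Pairwise correlations between carat, depth and price (cut is skipped as a factor).
corr <- numeric_correlations(toy)
print(round(corr, 2))

# Baseline check: how much of the price variance does carat alone explain?
fit <- lm(price ~ carat, data = toy)
print(summary(fit)$r.squared)
```

On the real data, a strong correlation between a variable and `price` would suggest it is worth including as a predictor, while categorical variables such as cut, color and clarity would need to be examined separately, for example with grouped summaries or box plots.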
The commented R code, with explanations, can be found in the .RMD file. More information about the task and its steps can be found here