lindseyberlin/Mod5Project

Goals

Our goal is to create classification models using a collection of wine reviews on Kaggle, originally from Wine Enthusiast.

Lindsey will create a classification model that attempts to discern whether a wine is white or red.

Harrison will test whether the length of a review can predict a wine's score or price. He will first group scores and prices into categories, then fit a classification model to those categories.
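To make the setup above concrete, here is a minimal sketch of turning a review into a length feature and a price into a category. The bin edges, labels, function names, and the two sample reviews are all assumptions made up for illustration; the project's actual cut points and code are not stated here.

```python
from bisect import bisect_right

# Hypothetical bin edges in dollars; the project's real cut points are not stated.
PRICE_EDGES = [20, 50, 100]
PRICE_LABELS = ["under $20", "$20-49", "$50-99", "$100+"]


def review_length(description: str) -> int:
    """Return the length of a review in whitespace-separated words."""
    return len(description.split())


def price_category(price: float) -> str:
    """Map a price to a category label; a price equal to an edge falls in the higher bin."""
    return PRICE_LABELS[bisect_right(PRICE_EDGES, price)]


# Made-up sample reviews for illustration, not taken from the dataset.
reviews = [
    {"description": "Bright citrus and green apple.", "price": 15.0},
    {"description": "Dense black fruit, firm tannins, and a long oaky finish.", "price": 120.0},
]

# One (feature, target) pair per review, ready to hand to a classifier.
rows = [(review_length(r["description"]), price_category(r["price"])) for r in reviews]
```

Score categories could be built the same way with a second list of edges. Word count is only one possible definition of review length; character count would work equally well in this sketch.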

Results

Lindsey iterated through many different types of classification models and arrived at an XGBoost model that predicted whether a wine was white or red with 80% accuracy on the test data.

XGBoost model results

Harrison found that review length predicted point value not much better than random chance. Review length was a much better predictor of a wine's price, but only for wines under $100. For wines above $100, review length was not a good predictor of price.

Description length versus price boxplots
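One way to make "not much better than random chance" concrete is to compare a model's accuracy with a naive baseline. The sketch below uses a baseline that always predicts the most common training category, which is an assumption and may not be the comparison used in the project. The labels and predictions are toy values, not project results.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def majority_baseline_accuracy(y_train, y_test):
    """Accuracy of always predicting the most common training label."""
    most_common_label = Counter(y_train).most_common(1)[0][0]
    return sum(t == most_common_label for t in y_test) / len(y_test)


# Toy labels for illustration only; these are not results from the project.
y_train = ["low", "low", "mid", "low", "high"]
y_test = ["low", "mid", "low", "high"]
y_pred = ["low", "low", "low", "high"]  # hypothetical model output

model_acc = accuracy(y_test, y_pred)
base_acc = majority_baseline_accuracy(y_train, y_test)
lift = model_acc - base_acc  # how far the model beats the naive baseline
```

Running the same comparison separately on wines under $100 and wines above $100 would show whether the model's advantage holds in both price ranges.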

Harrison also worked through a supplemental dataset on wine composition to see the effects of the random forest classifier in a different context.
