Wine Deep Learning
After watching Somm (a documentary on master sommeliers), I wondered whether I could build a predictive model that identifies wines through blind tasting the way a master sommelier does. My overall goal is a model that can identify the variety, winery, and location of a wine from the kind of description a sommelier could give after tasting it. Another fun future project would be to recommend wines based on food dishes. If you have any ideas or insights, please share them.
WineEnthusiast review data
The first step in creating my sommelier model was gathering some data. I started by scraping ~150k wine reviews from WineEnthusiast.
The data consists of 13 fields:
- Points: the score WineEnthusiast gave the wine on a scale of 1-100 (though they say they only post reviews for wines that score >=80)
- Title: the title of the wine review, which often contains the vintage if you're interested in extracting that feature
- Variety: the type of grapes used to make the wine (e.g. Pinot Noir)
- Description: a few sentences from a sommelier describing the wine's taste, smell, look, feel, etc.
- Country: the country that the wine is from
- Province: the province or state that the wine is from
- Region 1: the wine growing area in a province or state (e.g. Napa)
- Region 2: a more specific region within a wine growing area (e.g. Rutherford inside the Napa Valley); this value is sometimes blank
- Winery: the winery that made the wine
- Designation: the vineyard within the winery where the grapes that made the wine are from
- Price: the cost for a bottle of the wine
- Taster Name: name of the person who tasted and reviewed the wine
- Taster Twitter Handle: Twitter handle for the person who tasted and reviewed the wine
UPDATED 11/24/2017: Title, Taster Name, and Taster Twitter Handle were collected, and the issue with duplicate entries was resolved.
I did not include the dataset that I scraped in this repository because of size, but feel free to run the scraper on your own or use the dataset that I provided on Kaggle.
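Since the Title field often contains the vintage, here is a minimal sketch of one way to pull it out. It is not part of the scraper: the function name `extract_vintage`, the sample titles, and the 1900-2099 year range are all assumptions made for illustration.

```python
import re
from typing import Optional

# Matches a standalone four-digit year from 1900 to 2099.
# The year range is an assumption, not something the dataset guarantees.
_YEAR = re.compile(r"\b(19\d{2}|20\d{2})\b")


def extract_vintage(title: str) -> Optional[int]:
    """Return the first plausible four-digit year in a review title, or None."""
    match = _YEAR.search(title)
    return int(match.group(1)) if match else None


# Hypothetical titles, made up for this example.
print(extract_vintage("Example Winery 2013 Reserve Pinot Noir (Napa Valley)"))
print(extract_vintage("Example Winery NV Brut Sparkling (California)"))
```

Taking the first match is a simplification. A title whose winery name itself contains a year would produce the wrong vintage, so it is worth spot-checking the results before using them as a feature.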
Places you may have seen this
Connect with me
If you'd like to collaborate on a project, learn more about me, or just say hi, feel free to contact me using any of the social channels listed below.