Imbibing in NLP: an analysis of expert wine reviews

by Michelle L. Gill, Ph.D.


Under Construction, 2016/09/26
Note: this repo is being cleaned up and is currently missing a few notebooks. It will be completed within the week. This note will be deleted when the code clean-up is complete.


Summary

This is my fourth project for the Summer 2016 Metis Data Science Bootcamp, which incorporated unsupervised machine learning and natural language processing. Expert wine reviews were scraped and analyzed with K-means clustering, latent semantic analysis (LSA), and latent Dirichlet allocation (LDA). Sentiment analysis was also performed to see whether review sentiment was higher in vintage years.
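For readers unfamiliar with these techniques, the following is a minimal sketch of how K-means, LSA, and LDA can be applied to review text. It assumes scikit-learn is installed and is not necessarily how the notebooks in this repo do it: the review strings are made-up placeholders rather than scraped data, and the variable names and the choice of two clusters and two topics are purely illustrative.

```python
from sklearn.cluster import KMeans
from sklearn.decomposition import LatentDirichletAllocation, TruncatedSVD
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

# Made-up placeholder reviews (NOT taken from the scraped data).
reviews = [
    "Ripe cherry and plum flavors with soft tannins and a smooth finish.",
    "Crisp citrus and green apple notes with bright acidity.",
    "Dark berry fruit, firm tannins, and a hint of toasted oak.",
    "Fresh lemon and pear aromas lead to a clean, mineral finish.",
    "Jammy blackberry with vanilla oak and a warm, spicy finish.",
    "Light and zesty, with lime, melon, and a touch of honey.",
]

# LSA: TF-IDF weighting followed by a truncated SVD.
tfidf = TfidfVectorizer(stop_words="english")
X_tfidf = tfidf.fit_transform(reviews)
lsa = TruncatedSVD(n_components=2, random_state=0)
X_lsa = lsa.fit_transform(X_tfidf)

# K-means on the reduced LSA representation (2 clusters is an assumption).
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
cluster_labels = kmeans.fit_predict(X_lsa)

# LDA works on raw term counts rather than TF-IDF weights.
counts = CountVectorizer(stop_words="english")
X_counts = counts.fit_transform(reviews)
lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(X_counts)  # one topic distribution per review

# Recover the vocabulary in column order and list the top terms per topic.
terms = sorted(counts.vocabulary_, key=counts.vocabulary_.get)
for topic_idx, weights in enumerate(lda.components_):
    top_terms = [terms[i] for i in weights.argsort()[::-1][:5]]
    print(f"Topic {topic_idx}: {', '.join(top_terms)}")
```

With a real corpus, the number of clusters and topics would need to be tuned, and the resulting topics inspected by hand before drawing any conclusions.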

Blog Post

A blog post on themodernscientist will be available the week of 2016/09/26. This text will be updated when the website is posted.

Repo Contents

  • environment.yml: list of the conda Python libraries used during the analysis
  • figures: images used in the presentation
  • notebooks: Jupyter notebooks used for the analysis
  • presentation: a PDF version of the final presentation
  • visualization: D3 visualization of the LDA clusters; a movie of the visualization is also available

Data Sources

  1. Expert wine reviews from Wine Enthusiast
  2. User reviews and tasting notes from various other websites