A movie review sentiment classifier based on the IMDB ratings dataset from Stanford.
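Stochastic gradient descent (SGD) updates the model one small batch of examples at a time instead of requiring the whole dataset at once. That makes it a good option for data science web apps where new data is constantly coming in (IoT is another use case) and for processing large amounts of data on a single machine.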
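The batch-at-a-time idea can be sketched with scikit-learn as follows. This is a minimal illustration, not necessarily how sgd_movie_classifier.py is written: the toy reviews, the `stream_batches` helper, and the parameter choices (`n_features`, `batch_size`, the default hinge loss) are all assumptions made for the example.

```python
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import SGDClassifier

# Stateless vectorizer: tokens are hashed into a fixed-size feature space,
# so no vocabulary has to be fitted or held in memory.
vectorizer = HashingVectorizer(n_features=2**18, alternate_sign=False)

# Linear classifier trained with SGD; partial_fit updates it batch by batch.
clf = SGDClassifier(random_state=1)


def stream_batches(docs, labels, batch_size):
    """Yield (texts, labels) mini-batches, standing in for data arriving over time.

    Hypothetical helper for this sketch only.
    """
    for start in range(0, len(docs), batch_size):
        end = start + batch_size
        yield docs[start:end], labels[start:end]


# Hypothetical toy reviews (1 = positive, 0 = negative), not the IMDB data.
docs = [
    "a great and moving film",
    "terrible plot and awful acting",
    "great cast, wonderful story",
    "awful, boring and terrible",
]
labels = [1, 0, 1, 0]

for texts, y in stream_batches(docs, labels, batch_size=2):
    X = vectorizer.transform(texts)
    # classes must be given on the first call because a single batch
    # may not contain every label.
    clf.partial_fit(X, y, classes=[0, 1])

# The model can be queried at any point and updated again later with new batches.
predictions = clf.predict(vectorizer.transform(["a great film", "a terrible film"]))
```

Because the vectorizer is stateless and `partial_fit` never needs the full dataset, the same loop works whether the batches come from a large file read in chunks or from reviews submitted through a web app.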
Assuming you have sklearn installed:
- Install PyPrind (conda/pip install pyprind)
- After cloning or unzipping the project, unzip DATA\ratings_shuffled.zip into DATA\
From the command line:
- Navigate to the project folder that contains sgd_movie_classifier.py
- Run the command: python sgd_movie_classifier.py
Otherwise, open the script in your favorite IDE and run it from there.
Once the model is built, you can use it in your own app. See the link above for this model in action.
Andrew L. Maas, Raymond E. Daly, Peter T. Pham, Dan Huang, Andrew Y. Ng, and Christopher Potts. (2011). Learning Word Vectors for Sentiment Analysis. The 49th Annual Meeting of the Association for Computational Linguistics (ACL 2011).