kbastani/sentiment-analysis-movie-reviews

Movie Review Sentiment Analysis Benchmark

An IPython notebook that tests Graphify's feature extraction and selection algorithm by using its features in a logistic regression classifier. The classifier is benchmarked on Stanford's Large Movie Review Dataset and the Cornell Movie Review Dataset.

Content

Classification Accuracy

Feature learning

  • Features are extracted and learned using Java and Neo4j, and evaluated by building a logistic regression classifier on a weighted tf-idf feature vector.
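The evaluation step described above can be sketched roughly as follows. This is a minimal illustration, not Graphify's method: the Java and Neo4j feature extraction is not reproduced, plain whitespace tokens stand in for the learned features, and the toy reviews, function names, and hyperparameters are all assumptions made for the example. It uses only the standard library so that it runs without extra installs; in practice scikit-learn's TfidfVectorizer and LogisticRegression would be the usual choice.

```python
import math
from collections import Counter

# Toy labeled reviews, invented for illustration (1 = positive, 0 = negative).
REVIEWS = [
    ("a wonderful film with great acting", 1),
    ("great story and wonderful cast", 1),
    ("a terrible film with boring acting", 0),
    ("boring story and terrible cast", 0),
]


def tfidf_vector(tokens, vocab, idf):
    """Map a token list to a tf-idf vector over a fixed vocabulary."""
    counts = Counter(tokens)
    length = max(len(tokens), 1)  # avoid dividing by zero on empty input
    return [counts[term] / length * idf[term] for term in vocab]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_sentiment_model(reviews, lr=1.0, epochs=200):
    """Fit logistic regression on tf-idf features with batch gradient descent."""
    docs = [text.split() for text, _ in reviews]
    labels = [label for _, label in reviews]
    vocab = sorted({term for doc in docs for term in doc})
    doc_freq = Counter(term for doc in docs for term in set(doc))
    # Smoothed inverse document frequency.
    idf = {t: math.log((1 + len(docs)) / (1 + doc_freq[t])) + 1 for t in vocab}
    X = [tfidf_vector(doc, vocab, idf) for doc in docs]

    weights, bias = [0.0] * len(vocab), 0.0
    for _ in range(epochs):
        # Prediction error for every document under the current parameters.
        errors = [
            sigmoid(bias + sum(w * x for w, x in zip(weights, row))) - label
            for row, label in zip(X, labels)
        ]
        for j in range(len(weights)):
            grad = sum(e * row[j] for e, row in zip(errors, X)) / len(X)
            weights[j] -= lr * grad
        bias -= lr * sum(errors) / len(X)
    return vocab, idf, weights, bias


def positive_probability(text, model):
    """Estimated probability that a review is positive."""
    vocab, idf, weights, bias = model
    row = tfidf_vector(text.split(), vocab, idf)
    return sigmoid(bias + sum(w * x for w, x in zip(weights, row)))


if __name__ == "__main__":
    model = train_sentiment_model(REVIEWS)
    for review in ("wonderful great film", "terrible boring story"):
        print(review, round(positive_probability(review, model), 3))
```

Words that occur only in positive reviews receive positive weights and words that occur only in negative reviews receive negative ones, so a review's score depends on which of those words it contains.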

Viewing the notebooks online

The content of the notebooks can be viewed online through nbviewer.ipython.org.

Installing Python

To use the notebooks interactively, you need to install Python, IPython (for the notebooks), and the required libraries: scikit-learn, matplotlib, and numpy.

Windows

You can install everything at once with a complete scientific Python distribution. Two good options are the Enthought Python Distribution (EPD, free for academic use) and Python(x,y) (free for everyone).

Mac

For OS X, you can also use the Enthought Python distribution or the scipy-superpack.

Linux

Use your package manager. For example, on Ubuntu or Debian, run apt-get install python ipython python-matplotlib python-numpy python-sklearn.

Version requirements

You need IPython 0.11 or later. You can update it with easy_install.
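One way to check the installed version is a short script like the one below. It is a sketch only: the helper names parse_version and meets_minimum are made up for this example, and the parser reads just the leading numeric parts of a version string, so pre-release suffixes are ignored.

```python
def parse_version(text):
    """Turn a string such as '0.13.2' or '1.0.dev' into a tuple of integers."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:  # stop at the first non-numeric component
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum="0.11"):
    """Return True if the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


if __name__ == "__main__":
    try:
        import IPython
    except ImportError:
        print("IPython is not installed")
    else:
        status = "OK" if meets_minimum(IPython.__version__) else "too old"
        print("IPython", IPython.__version__, status)
```

Comparing tuples of integers rather than raw strings avoids the usual pitfall where "0.9" sorts after "0.11".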

Installing Scikit-learn

More tips on installing scikit-learn can be found on the scikit-learn website.

More Resources

This repository was modeled on tutorial_ml_gkbionics.
