"Sentiment Analysis for Amazon Product Reviews" by Apala Guha #156

Closed
brunogrande opened this Issue Jan 17, 2017 · 0 comments

Description

Workshop leader: @apalaguha

Websites such as Facebook, Twitter, Amazon, and IMDB carry a wealth of information in the form of tweets, comments, and reviews. Sentiment analysis aims to extract the sentiment expressed in such natural-language text. In this workshop, we will build a machine learning pipeline for sentiment analysis of Amazon product reviews. We will touch on text preprocessing, feature vector formation using hashing as well as neural networks, and regression model training.
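To give a feel for the "feature vector formation using hashing" step, here is a minimal pure-Python sketch of the hashing trick. It is an illustration only, not the workshop's script: the function names `tokenize` and `hashed_features`, the vector size of 16, and the use of MD5 as the hash function are all assumptions made for this example.

```python
import hashlib
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def hashed_features(tokens, num_features=16):
    """Map tokens to a fixed-length count vector using the hashing trick.

    Each token is hashed to a bucket index, and that bucket's count is
    incremented. MD5 is an assumption chosen here only because it gives
    the same result on every run; it is not the workshop's hash function.
    """
    vector = [0] * num_features
    for token in tokens:
        digest = hashlib.md5(token.encode("utf-8")).hexdigest()
        index = int(digest, 16) % num_features
        vector[index] += 1
    return vector


# Hypothetical review text, made up for this example.
review = "Great toy, my dog loves it. Great value!"
tokens = tokenize(review)
features = hashed_features(tokens)
print(tokens)
print(features)
```

The vector length stays fixed no matter how many distinct words appear, at the cost of occasional collisions where two different words land in the same bucket.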

Time and Place

Where: Room 7010, Library Research Commons, SFU Burnaby Campus

When: January 31st, 2017 @ 3:00 PM

Registration

TBA

Required Preparation

Assumed Knowledge

This workshop assumes you have basic Python knowledge.

Software Dependencies

You need to install Docker on your computer. You can find the Docker installer for your platform here.

Setup

# Commands on local computer
docker pull ubuntu
docker build -t sciprog:test https://gist.githubusercontent.com/brunogrande/451c49dfed3582760b0e1e86b3f9aa6e/raw/15b4c6ed8bcbad670c66538f10d053f945e80a70/Dockerfile
docker run -it sciprog:test

# You will now be the root user on the Docker instance
adduser foobar  # Set any password, such as "123"
su foobar

# You will now be the foobar user on the Docker instance
cd
wget https://gist.githubusercontent.com/brunogrande/451c49dfed3582760b0e1e86b3f9aa6e/raw/15b4c6ed8bcbad670c66538f10d053f945e80a70/analyze_sentiment.py
wget http://snap.stanford.edu/data/amazon/productGraph/categoryFiles/reviews_Pet_Supplies_5.json.gz
export PATH="/home/spark/spark-2.0.0-bin-hadoop2.6/bin:$PATH"

# Replace <num_cores> with the number of cores on your local computer
# If you are not sure, set it to 1
spark-submit --master "local[<num_cores>]" --deploy-mode client analyze_sentiment.py reviews_Pet_Supplies_5.json.gz
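Before submitting the job, it can help to peek at the downloaded data. The sketch below assumes the file holds one JSON object per line and that each record has `overall` (star rating) and `reviewText` fields; both assumptions should be checked against the actual file. The helper name `read_reviews` is hypothetical and is not part of `analyze_sentiment.py`.

```python
import gzip
import itertools
import json
import os


def read_reviews(path, limit=5):
    """Yield up to `limit` (rating, text) pairs from a gzipped JSON-lines file.

    Assumes one JSON object per line with `overall` and `reviewText` keys.
    """
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in itertools.islice(handle, limit):
            record = json.loads(line)
            yield record.get("overall"), record.get("reviewText", "")


# Only runs if the file from the setup steps is in the current directory.
path = "reviews_Pet_Supplies_5.json.gz"
if os.path.exists(path):
    for rating, text in read_reviews(path):
        print(rating, text[:80])
```

Reading only the first few lines keeps this quick even for a large file.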

Links

Solution Script: Link

@brunogrande brunogrande added this to the Spring 2017 milestone Jan 17, 2017

@dfornika dfornika added this to Completed Workshops in Workshop Planning Feb 6, 2017

@brunogrande brunogrande removed the workshop label May 26, 2017
