## Part 1: Existing Machine Learning Services

<a href="https://colab.research.google.com/github/peckjon/hosting-ml-as-microservice/blob/master/part1/score_reviews_via_service.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### Setting up the environment

In [12]:
from google.colab import drive
drive.mount('/content/drive', force_remount=True)

Mounted at /content/drive


### Obtain labelled reviews

In order to test any of the sentiment analysis APIs, we need a labelled dataset of reviews and their sentiment polarity. We'll use NLTK to download the movie_reviews corpus.

In [13]:
from nltk import download
import pandas as pd

download('movie_reviews')

[nltk_data] Downloading package movie_reviews to /root/nltk_data...
[nltk_data]   Package movie_reviews is already up-to-date!


True

### Load the data

The files in movie_reviews have already been divided into two sets: positive ('pos') and negative ('neg'), so we can load the raw text of the reviews into two lists, one for each polarity.

In [14]:
from nltk.corpus import movie_reviews

# load the raw text of each review, one list per label
reviews_pos = []
for fileid in movie_reviews.fileids('pos'):
    review = movie_reviews.raw(fileid)
    if len(review) <= 5000:  # since Amazon Comprehend has a limit of 5000 characters
        reviews_pos.append(review)

reviews_neg = []
for fileid in movie_reviews.fileids('neg'):
    review = movie_reviews.raw(fileid)
    if len(review) <= 5000:  # since Amazon Comprehend has a limit of 5000 characters
        reviews_neg.append(review)

print(f'There are {len(reviews_pos)} positive reviews')
print(f'There are {len(reviews_neg)} negative reviews')

There are 745 positive reviews
There are 854 negative reviews
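The filter above counts characters. Assuming the service's limit applies to the UTF-8 encoded size of the text rather than to the character count (worth confirming against the Amazon Comprehend documentation), a byte-based check is slightly stricter for non-ASCII text. The sketch below is illustrative only: `MAX_BYTES` and `fits_size_limit` are hypothetical names, not part of the notebook or of any AWS SDK.

```python
MAX_BYTES = 5000  # assumed limit, mirroring the 5000 used in the cell above


def fits_size_limit(text, max_bytes=MAX_BYTES):
    """Return True if the UTF-8 encoding of text is at most max_bytes long."""
    return len(text.encode('utf-8')) <= max_bytes


ascii_review = 'a' * 5000      # 5000 characters, 5000 bytes
accented_review = 'é' * 3000   # 3000 characters, 6000 bytes in UTF-8

print(fits_size_limit(ascii_review))
print(fits_size_limit(accented_review))
```

For mostly ASCII text such as this corpus, the two checks should agree on nearly every review.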


### Connect to the scoring API

Fill in this function with code that connects to one of these APIs, and uses it to score a single review:

* [Amazon Comprehend: Detect Sentiment](https://docs.aws.amazon.com/comprehend/latest/dg/API_DetectSentiment.html)
* [Google Natural Language: Analyzing Sentiment](https://cloud.google.com/natural-language/docs/analyzing-sentiment)
* [Azure Cognitive Services: Sentiment Analysis](https://docs.microsoft.com/en-us/azure/cognitive-services/text-analytics/how-tos/text-analytics-how-to-sentiment-analysis)
* [Algorithmia: Sentiment Analysis](https://algorithmia.com/algorithms/nlp/SentimentAnalysis)

Your function must return either 'pos' or 'neg', so you'll need to decide how to map the results of the API call to one of these values. For example, Amazon Comprehend can return "NEUTRAL" or "MIXED" for the Sentiment. If this happens, you may wish to inspect the numeric values under the SentimentScore to see whether it leans toward positive or negative.
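One possible mapping is sketched below. It ignores the top-level label and compares only the positive and negative scores, so "NEUTRAL" and "MIXED" results still resolve to one side. The helper name `label_from_scores` and the example response are hypothetical. The response only mimics the shape of a Comprehend result and was not returned by the service. Treating ties as 'pos' is an arbitrary choice.

```python
def label_from_scores(sentiment_score):
    """Map a Comprehend-style SentimentScore dict to 'pos' or 'neg'."""
    positive = sentiment_score.get('Positive', 0.0)
    negative = sentiment_score.get('Negative', 0.0)
    # Ties (including missing scores) fall back to 'pos'; this is an assumption.
    return 'neg' if negative > positive else 'pos'


# Hypothetical response, shaped like the output of DetectSentiment
example_response = {
    'Sentiment': 'MIXED',
    'SentimentScore': {
        'Positive': 0.30,
        'Negative': 0.45,
        'Neutral': 0.05,
        'Mixed': 0.20,
    },
}

example_label = label_from_scores(example_response['SentimentScore'])
print(example_label)
```

Keeping the mapping in a small pure function like this makes it testable without any API calls.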


In [15]:
# load AWS API secrets from an external file
credentials = pd.read_csv("/content/drive/MyDrive/Hosting ML-Service/credentials.csv")

aws_access_key_id = credentials['aws_access_key_id'][0]
aws_secret_access_key = credentials['aws_secret_access_key'][0]

In [18]:
# import the boto3 SDK, which lets us call AWS services from Python
import boto3

# create the Comprehend client once and reuse it for every review
comprehend = boto3.client(service_name='comprehend', region_name='us-east-1',
                          aws_access_key_id=aws_access_key_id,
                          aws_secret_access_key=aws_secret_access_key)

def score_review(review):
    # request the sentiment scores for this review
    response = comprehend.detect_sentiment(Text=review, LanguageCode='en')
    scores = response['SentimentScore']

    # return 'neg' only when the negative score is strictly higher;
    # ties fall back to 'pos'
    if scores['Negative'] > scores['Positive']:
        return 'neg'
    return 'pos'

### Score each review

Now, we can use the function you defined to score each of the reviews.

#### *Note on Testing*

While most of the services listed have free tiers, these may be limited to a few thousand requests per week or month, depending on the service. On some platforms, you may be billed after reaching that limit. For this reason, we recommend testing first on smaller sets of reviews, `subset_pos` and `subset_neg`. Once you're happy with your code, swap those subsets for the full review sets, `reviews_pos` and `reviews_neg`.

In [19]:
# create 2 smaller subsets for testing
subset_pos = reviews_pos[:10]
subset_neg = reviews_neg[:10]

results_pos = []
for review in subset_pos:
    result = score_review(review)
    results_pos.append(result)

results_neg = []
for review in subset_neg:
    result = score_review(review)
    results_neg.append(result)
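As a sketch of the advice in the note on testing, the two loops can be wrapped in a helper that caps the number of API calls. The names `score_reviews` and `fake_scorer` are hypothetical. `fake_scorer` is a stand-in so the example runs without credentials or billing, and in the notebook you would pass `score_review` instead.

```python
def score_reviews(reviews, scorer, limit=None):
    """Score at most `limit` reviews with `scorer`; score all when limit is None."""
    selected = reviews if limit is None else reviews[:limit]
    return [scorer(review) for review in selected]


def fake_scorer(review):
    """Stand-in for a real API call; a trivial keyword rule, for illustration only."""
    return 'pos' if 'good' in review else 'neg'


demo_reviews = ['a good film', 'a dull film', 'good fun', 'bad']
demo_results = score_reviews(demo_reviews, fake_scorer, limit=3)
print(demo_results)
```

Raising `limit`, or passing `None`, then switches from the test subset to the full set without changing the rest of the code.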

### Calculate accuracy

We count how many of the known positive reviews our function scored as 'pos', and divide by the number of positive reviews to get the accuracy. We repeat this for the negative reviews, then combine both counts to get the overall accuracy.

In [20]:
# fraction of known positives scored 'pos', and of known negatives scored 'neg'
correct_pos = results_pos.count('pos')
accuracy_pos = correct_pos / len(results_pos)
correct_neg = results_neg.count('neg')
accuracy_neg = correct_neg / len(results_neg)

# overall accuracy pools both sets
correct_all = correct_pos + correct_neg
accuracy_all = correct_all / (len(results_pos) + len(results_neg))

print('Positive reviews: {}% correct'.format(accuracy_pos*100))
print('Negative reviews: {}% correct'.format(accuracy_neg*100))
print('Overall accuracy: {}% correct'.format(accuracy_all*100))

Positive reviews: 40.0% correct
Negative reviews: 70.0% correct
Overall accuracy: 55.00000000000001% correct
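The trailing digits in the overall figure are a floating-point artifact of multiplying 0.55 by 100, not extra precision. The sketch below reproduces the same counts (4 of 10 positives and 7 of 10 negatives correct) with made-up result lists and formats them with the `%` format specifier. The `accuracy` helper and the `demo_*` names are hypothetical and not part of the notebook.

```python
def accuracy(results, expected_label):
    """Fraction of results equal to expected_label."""
    return results.count(expected_label) / len(results)


# Made-up results that reproduce the counts reported above
demo_pos = ['pos'] * 4 + ['neg'] * 6
demo_neg = ['neg'] * 7 + ['pos'] * 3

demo_correct = demo_pos.count('pos') + demo_neg.count('neg')
demo_overall = demo_correct / (len(demo_pos) + len(demo_neg))

# The '.1%' specifier multiplies by 100 and rounds to one decimal place
print(f'Positive reviews: {accuracy(demo_pos, "pos"):.1%} correct')
print(f'Negative reviews: {accuracy(demo_neg, "neg"):.1%} correct')
print(f'Overall accuracy: {demo_overall:.1%} correct')
```

With only ten reviews per class, each review shifts the per-class accuracy by ten percentage points, so these figures are a rough check of the code and not a reliable estimate of the service's accuracy.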
