## Part 1: Existing Machine Learning Services

<a href="https://colab.research.google.com/github/peckjon/hosting-ml-as-microservice/blob/master/part1/score_reviews_via_service.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### Obtain labelled reviews

In order to test any of the sentiment analysis APIs, we need a labelled dataset of reviews and their sentiment polarity. We'll use NLTK to download the movie_reviews corpus.

In [111]:
from nltk import download

download('movie_reviews')

[nltk_data] Downloading package movie_reviews to /home/js/nltk_data...
[nltk_data]   Package movie_reviews is already up-to-date!


True

### Load the data

The files in movie_reviews have already been divided into two sets: positive ('pos') and negative ('neg'), so we can load the raw text of the reviews into two lists, one for each polarity.

In [112]:
from nltk.corpus import movie_reviews

# load the raw text of each review into one list per label

reviews_pos = []
for fileid in movie_reviews.fileids('pos'):
    review = movie_reviews.raw(fileid)
    reviews_pos.append(review)

reviews_neg = []
for fileid in movie_reviews.fileids('neg'):
    review = movie_reviews.raw(fileid)
    reviews_neg.append(review)

In [113]:
n_pos = len(reviews_pos)
n_neg = len(reviews_neg)
print(f"Number of positive reviews to score = {n_pos}")
print(f"Number of negative reviews to score = {n_neg}")

Number of positive reviews to score = 1000
Number of negative reviews to score = 1000


### Connect to the scoring API

Fill in this function with code that connects to one of these APIs, and uses it to score a single review:

* [Amazon Comprehend: Detect Sentiment](https://docs.aws.amazon.com/comprehend/latest/dg/API_DetectSentiment.html)
* [Google Natural Language: Analyzing Sentiment](https://cloud.google.com/natural-language/docs/analyzing-sentiment)
* [Azure Cognitive Services: Sentiment Analysis](https://docs.microsoft.com/en-us/azure/cognitive-services/text-analytics/how-tos/text-analytics-how-to-sentiment-analysis)
* [Algorithmia: Sentiment Analysis](https://algorithmia.com/algorithms/nlp/SentimentAnalysis)

Your function must return either 'pos' or 'neg', so you'll need to make some decisions about how to map the results of the API call to one of these values. For example, Amazon Comprehend can return "NEUTRAL" or "MIXED" for the Sentiment. If this happens, you may wish to inspect the numeric values under the SentimentScore to see whether the review leans toward positive or negative.
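As a minimal sketch of that mapping, assume the response is a dict shaped like Amazon Comprehend's DetectSentiment output, with a `Sentiment` label and a `SentimentScore` dict holding `Positive`, `Negative`, `Neutral`, and `Mixed` values. The helper name `to_polarity` and the rule that ties go to 'neg' are illustrative choices, not part of any API:

```python
def to_polarity(response):
    """Map a Comprehend-style sentiment response to 'pos' or 'neg'."""
    label = response["Sentiment"]
    if label == "POSITIVE":
        return "pos"
    if label == "NEGATIVE":
        return "neg"
    # NEUTRAL or MIXED: fall back to whichever polar score is larger
    # (assumption: ties are treated as 'neg').
    scores = response["SentimentScore"]
    return "pos" if scores["Positive"] > scores["Negative"] else "neg"


# Hand-written sample for illustration, not real API output.
sample = {
    "Sentiment": "NEUTRAL",
    "SentimentScore": {"Positive": 0.30, "Negative": 0.10, "Neutral": 0.55, "Mixed": 0.05},
}
print(to_polarity(sample))
```

The same idea carries over to the other services: pick the two polar scores the API exposes and compare them whenever the headline label is not clearly positive or negative.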


In [114]:
import os
from dotenv import find_dotenv, load_dotenv

# find .env by traversing up directories until found
dotenv_path = find_dotenv()

# load up the entries as environment variables
load_dotenv(dotenv_path)

# key and endpoint for azure service
key = os.environ.get("AZURE_TEXT_KEY")
endpoint = os.environ.get("AZURE_TEXT_ENDPOINT")
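
If either variable is missing, `os.environ.get` quietly returns `None`, and the problem only surfaces later when the client is built. A small sketch of a fail-fast check follows; `require_env` is a hypothetical helper name, not part of `python-dotenv`:

```python
import os


def require_env(name):
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set; check your .env file")
    return value
```

With this in place, `key = require_env("AZURE_TEXT_KEY")` would stop the notebook at the point where the configuration is actually wrong.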


In [115]:
from azure.ai.textanalytics import TextAnalyticsClient
from azure.core.credentials import AzureKeyCredential

def authenticate_client():
    ta_credential = AzureKeyCredential(key)
    text_analytics_client = TextAnalyticsClient(
            endpoint=endpoint, credential=ta_credential)
    return text_analytics_client

client = authenticate_client()

In [116]:
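def score_review(review):
    # call the service and return 'pos' or 'neg' (None if the review reads as neutral)
    response = client.analyze_sentiment(documents=[review])[0]
    scores = response.confidence_scores
    # confidence per label, in the order positive, neutral, negative
    confidence = {
        'positive': scores.positive,
        'neutral': scores.neutral,
        'negative': scores.negative,
    }
    # label with the highest confidence; ties go to the first label listed
    major_sentiment = max(confidence, key=confidence.get)

    if major_sentiment == 'negative':
        return 'neg'
    elif major_sentiment == 'positive':
        return 'pos'
    else:
        return None   # neutral: not scored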
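
Calling the service once per review costs one network round trip per document. `analyze_sentiment` takes a list of documents, so several reviews can be sent in one call. The sketch below assumes a per-request cap of 10 documents (an assumption to check against the current service limits) and takes the scoring call as a parameter, `analyze`, so it can be tried without credentials. `chunked` and `score_in_batches` are hypothetical helper names:

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def score_in_batches(reviews, analyze, batch_size=10):
    """Call `analyze` on each batch and return the per-review results in order.

    batch_size=10 is an assumed service limit, not a verified one.
    """
    results = []
    for batch in chunked(reviews, batch_size):
        results.extend(analyze(batch))
    return results


# Stand-in for the real service call: labels a review by a keyword.
def fake_analyze(batch):
    return ["pos" if "good" in text else "neg" for text in batch]


print(score_in_batches(["good film", "dull plot", "good cast"], fake_analyze, batch_size=2))
```

With the real client, `analyze` could wrap `client.analyze_sentiment(documents=batch)` and map each returned document to a label the same way `score_review` does.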


### Score each review

Now we can use the function you defined to score each of the reviews. Each review is truncated to `text_limit` (5,120) characters before it is sent to the service.

In [117]:
text_limit = 5120
results_pos = []

for i, review in enumerate(reviews_pos):
    print(f"Pos: {i/n_pos}")
    result = score_review(review[:text_limit])
    results_pos.append(result)

results_neg = []
for i, review in enumerate(reviews_neg):
    print(f"Neg: {i/n_neg}")
    result = score_review(review[:text_limit])
    results_neg.append(result)

Pos: 0.0
Pos: 0.001
Pos: 0.002
...
Pos: 0.843
...
Neg: 0.506
Neg: 0.507
...
Neg: 0.596
...
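
Printing a line for every review produces thousands of lines of output. One way to keep the log short is to report progress only every so often. In this sketch, `score_all` and its `report_every` parameter are hypothetical, and `score_fn` stands in for `score_review`:

```python
def score_all(reviews, score_fn, label, text_limit=5120, report_every=100):
    """Score each review, printing progress once per `report_every` items."""
    results = []
    total = len(reviews)
    for i, review in enumerate(reviews, start=1):
        # truncate before scoring, as the loop above does
        results.append(score_fn(review[:text_limit]))
        if i % report_every == 0 or i == total:
            print(f"{label}: {i}/{total} scored")
    return results


# Toy scorer so the sketch runs without calling the service.
demo = score_all(["fine"] * 5, lambda text: "pos", "Pos", report_every=2)
```

Swapping in the real function would look like `results_pos = score_all(reviews_pos, score_review, "Pos")`.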

### Calculate accuracy

For each of our known positive reviews, we can count the number that our function scored as 'pos' and use this to calculate the % accuracy. We repeat this for the negative reviews, and also compute the overall accuracy.

In [118]:
correct_pos = results_pos.count('pos')
accuracy_pos = correct_pos / len(results_pos)
correct_neg = results_neg.count('neg')
accuracy_neg = correct_neg / len(results_neg)
correct_all = correct_pos + correct_neg
accuracy_all = correct_all / (len(results_pos) + len(results_neg))

# one decimal place avoids floating-point noise such as 23.400000000000002
print('Positive reviews: {:.1f}% correct'.format(accuracy_pos*100))
print('Negative reviews: {:.1f}% correct'.format(accuracy_neg*100))
print('Overall accuracy: {:.1f}% correct'.format(accuracy_all*100))

Positive reviews: 23.4% correct
Negative reviews: 98.4% correct
Overall accuracy: 60.9% correct
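
Because `score_review` returns `None` when the neutral label has the highest confidence, unscored reviews are counted as wrong in the figures above. A sketch that reports them separately follows; `summarize` is a hypothetical helper, and the sample list is made up for illustration:

```python
def summarize(results, expected):
    """Return (accuracy, unscored_fraction) for a list of predicted labels."""
    total = len(results)
    correct = results.count(expected)
    unscored = results.count(None)
    return correct / total, unscored / total


# Made-up predictions for four known-positive reviews.
accuracy, unscored = summarize(["pos", None, "neg", "pos"], "pos")
print(f"{accuracy:.1%} correct, {unscored:.1%} unscored")
```

Running `summarize(results_pos, 'pos')` on the real results would show how much of the gap between the positive and negative accuracy comes from reviews that were never assigned a polarity.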
