This notebook illustrates one way to address the "no training data" scenario in NLP. If a cloud-based service provider already offers the kind of NLP system you are looking for, that can be a good starting point.

As discussed in the talk, we still need some sort of evaluation dataset. I am using the [Sentiment Labelled Sentences](http://archive.ics.uci.edu/ml/datasets/Sentiment+Labelled+Sentences) dataset from the UCI repository. This dataset consists of 500 positive and 500 negative sentences for each of three sources: amazon.com, yelp.com and imdb.com. I split them into two groups: train (imdb+yelp) and test (amazon). Since this is a tutorial and not a real-world scenario, I am using the **test** dataset to evaluate Azure's sentiment analysis API. I won't use the training data in this example, as our goal is to use an off-the-shelf service and check how well it performs.


To set up Azure for using this service, [visit this page and follow the instructions](https://docs.microsoft.com/en-us/azure/cognitive-services/text-analytics/quickstarts/client-libraries-rest-api?tabs=version-3-1&pivots=programming-language-python). I am using their free tier.


In [1]:
key = "XXXXXXXXXX"
endpoint = "XXXXXXX"
#enter your own key/endpoint here. 

In [2]:
from azure.ai.textanalytics import TextAnalyticsClient
from azure.core.credentials import AzureKeyCredential

In [3]:
def authenticate_client():
    ta_credential = AzureKeyCredential(key)
    text_analytics_client = TextAnalyticsClient(
            endpoint=endpoint, credential=ta_credential)
    return text_analytics_client

client = authenticate_client()


In [5]:
#Read in the test data and send it to the Azure analytics API for sentiment analysis.

testfilepath = "../files/test_labelled.txt" #tab-separated file: sentence, then label
sentences = []
sentiments = [] #0 is negative, 1 is positive
preds = [] #0 negative, 1 positive, 2 neutral or mixed
preds_dict = {'positive':1, 'negative':0, 'neutral':2, 'mixed':2}
count = 0
with open(testfilepath) as testfile: #the file is closed once the block ends
    for line in testfile:
        sentence, sentiment = line.strip().split("\t")
        #one API call per sentence; [0] picks the result for the single document sent
        pred = preds_dict[client.analyze_sentiment(documents = [sentence])[0].sentiment]
        preds.append(pred)
        sentences.append(sentence)
        sentiments.append(int(sentiment))

        #print progress every 50 sentences to see how far it has got.
        count +=1
        if count%50 ==0:
            print("Completed processing ", count, " sentences")

#Note: You can send batch requests to Azure instead of sending sentences one at a time
#as I do here, although there is a limit on the number of items per batch.

Completed processing  50  sentences
Completed processing  100  sentences
Completed processing  150  sentences
Completed processing  200  sentences
Completed processing  250  sentences
Completed processing  300  sentences
Completed processing  350  sentences
Completed processing  400  sentences
Completed processing  450  sentences
Completed processing  500  sentences
Completed processing  550  sentences
Completed processing  600  sentences
Completed processing  650  sentences
Completed processing  700  sentences
Completed processing  750  sentences
Completed processing  800  sentences
Completed processing  850  sentences
Completed processing  900  sentences
Completed processing  950  sentences
Completed processing  1000  sentences
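
The note at the end of the cell above mentions batch requests. A minimal sketch of that idea follows. It is not what produced the results in this notebook. The helper names `batched` and `predict_in_batches` are hypothetical, and the default `batch_size=10` is an assumption: check the service's current per-request document limit before relying on it. The sketch only relies on `analyze_sentiment` accepting a list of documents and on each returned item exposing `is_error` and `sentiment`.

```python
def batched(items, batch_size):
    """Yield consecutive slices of `items` with at most `batch_size` elements each."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def predict_in_batches(client, sentences, label_map, batch_size=10):
    """Send `sentences` to `client.analyze_sentiment` in batches.

    Returns one entry per sentence, in input order: the mapped label,
    or None when the service reported an error for that document.
    batch_size=10 is an assumed limit, not a value taken from this notebook.
    """
    preds = []
    for batch in batched(sentences, batch_size):
        for result in client.analyze_sentiment(documents=batch):
            if result.is_error:
                preds.append(None)  # keep positions aligned with the input
            else:
                preds.append(label_map[result.sentiment])
    return preds
```

With the `client` and `preds_dict` defined earlier, a call would look like `preds = predict_in_batches(client, sentences, preds_dict)`, which needs roughly one request per ten sentences instead of one per sentence.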


In [6]:
#Write all predictions to a file as a backup
print(len(sentences), len(sentiments), len(preds))

with open("actual-preds.csv", "w") as fw: #the file is closed once the block ends
    fw.write("actual,predicted\n")
    #iterate over the pairs directly instead of hard-coding the number of rows
    for actual, predicted in zip(sentiments, preds):
        fw.write(str(actual)+","+str(predicted)+"\n")

1000 1000 1000


In [7]:
#Look at how good Azure's API was, comparing its predictions with the actual sentiment categories in this dataset. 
from sklearn.metrics import classification_report
print(classification_report(sentiments, preds))

              precision    recall  f1-score   support

           0       0.96      0.81      0.88       500
           1       0.93      0.88      0.90       500
           2       0.00      0.00      0.00         0

    accuracy                           0.84      1000
   macro avg       0.63      0.56      0.59      1000
weighted avg       0.95      0.84      0.89      1000





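We get 84% accuracy without building any sentiment classifier ourselves! That is a great starting point (keep in mind, though, that most of the time we don't find such ready-made solutions to our real-world problems!). Note that class 2 (neutral or mixed) has zero support because no test sentence carries that label, and its all-zero row is what pulls the macro average down.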

In [8]:
#Let us look at the confusion matrix for Azure's predictions
from sklearn.metrics import confusion_matrix
print(confusion_matrix(sentiments,preds,labels=[0,1,2]))

[[405  34  61]
 [ 15 439  46]
 [  0   0   0]]
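
The matrix above can be read off by hand. As a small worked example, the snippet below copies the two non-empty rows from that output and derives the overall accuracy, the share of sentences that received a neutral/mixed prediction, and the accuracy on the sentences where the API did commit to positive or negative. The variable names are illustrative only.

```python
# Rows are the true labels (0 = negative, 1 = positive);
# columns are the predictions (0, 1, 2 = neutral or mixed).
# Values are copied from the confusion matrix printed above.
matrix = [
    [405, 34, 61],
    [15, 439, 46],
]

total = sum(sum(row) for row in matrix)      # all test sentences
correct = matrix[0][0] + matrix[1][1]        # diagonal entries
undecided = matrix[0][2] + matrix[1][2]      # predicted neutral or mixed
decided = total - undecided                  # predicted positive or negative

overall_accuracy = correct / total
undecided_share = undecided / total
decided_accuracy = correct / decided

print(f"overall accuracy:     {overall_accuracy:.3f}")   # 0.844
print(f"neutral/mixed share:  {undecided_share:.3f}")    # 0.107
print(f"accuracy when decided: {decided_accuracy:.3f}")  # 0.945
```

So most of the gap between 84% and a much higher figure comes from the neutral/mixed predictions rather than from outright positive/negative confusions.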


In [9]:
#Just how many of our sentences were predicted as "neutral" or "mixed"?
import collections
print(collections.Counter(preds))

Counter({1: 473, 0: 420, 2: 107})


So, over 10% of our sentences (107 of 1000) are identified as neither positive nor negative by Azure's API. If this were the case in a real-world scenario, it would perhaps not be a good long-term option, although it is a good starting point when we don't have any data. Perhaps one can use this as a short-term solution, and move to weak supervision, semi-supervised learning or active learning soon after?
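
If a strictly binary label is needed in the short term, one possible stopgap is to fall back on the per-document confidence scores whenever the API answers "neutral" or "mixed". The sketch below is an assumption-laden illustration, not what this notebook did, and none of the numbers above were produced this way. `force_binary` is a hypothetical helper, and breaking ties toward positive is an arbitrary choice. It relies only on the result object exposing `sentiment` and `confidence_scores` with `positive` and `negative` fields.

```python
def force_binary(result):
    """Map a document-level sentiment result to 1 (positive) or 0 (negative).

    Clear-cut labels are kept as they are. For 'neutral' or 'mixed',
    pick whichever of the positive/negative confidence scores is larger.
    Ties go to positive, which is an arbitrary choice for this sketch.
    """
    if result.sentiment == "positive":
        return 1
    if result.sentiment == "negative":
        return 0
    scores = result.confidence_scores
    return 1 if scores.positive >= scores.negative else 0
```

Whether this helps or hurts would have to be checked against the evaluation set, for example by re-running the classification report on the forced labels.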