# Text Analytics with Microsoft Cognitive Services

[Cognitive Services](https://www.microsoft.com/cognitive-services) consists of a series of APIs for advanced text, vision, and speech integration. 

To use these APIs, including Text Analytics, sign up in the [Azure Portal](https://portal.azure.com). You can also sign up for Preview APIs such as Linguistic Analysis directly from the Cognitive Services website.

![](https://github.com/deldersveld/ms-cognitive/raw/master/images/cogntive-services.PNG)

## What does the Text Analytics API offer?

This demo focuses on connecting to the [Text Analytics API](https://www.microsoft.com/cognitive-services/en-us/text-analytics/documentation) to obtain sentiment scores and extract key phrases. In addition to those endpoints, the API also offers language and topic detection. The R language is used for the demo, but any language or application that can send a POST request can take advantage of the API.

![](https://github.com/deldersveld/ms-cognitive/raw/master/images/text-analytics-endpoints.PNG)


## Background

For data, we use a small sample of Amazon reviews ([original source](https://www.kaggle.com/snap/amazon-fine-food-reviews)). We will connect to and download a CSV file from Azure Blob storage, then process the data using R. We connect to the Sentiment endpoint to obtain a score from 0 (negative) to 1 (positive). We then connect to the Key Phrases endpoint to get a list of words or phrases that are helpful for categorizing each review. 

Note that by using the API free tier, you can process 5,000 transactions per month. In the case of this demo, we send 100 records to the API and obtain sentiment scores and phrases for each record. Since we connect to two endpoints with 100 records, we use 200 transactions.
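
The transaction arithmetic above can be sketched in a few lines of R. The variable names are illustrative and not part of the original demo; the 5,000 figure is simply the free-tier quota quoted above.

```r
# Illustrative names only (not used elsewhere in the demo).
records <- 100            # reviews sent in this demo
endpoints <- 2            # Sentiment and Key Phrases
free.tier.limit <- 5000   # monthly free-tier transactions quoted above

# Each record sent to each endpoint counts as one transaction.
transactions.used <- records * endpoints
transactions.left <- free.tier.limit - transactions.used

# How many complete runs of this demo fit in one month's quota.
runs.per.month <- free.tier.limit %/% transactions.used

cat("Used:", transactions.used, "Remaining:", transactions.left, "\n")
```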


## Sign Up in the Azure Portal

If you do not already have a key for the Text Analytics API, you can sign up in the Azure Portal. If you have an existing API key, you can skip this section.

Go to http://portal.azure.com, log in, go to New (+), then Data + Analytics, and select Cognitive Services APIs.

![](https://github.com/deldersveld/ms-cognitive/raw/master/images/portal-sign-up-1.PNG)

Enter an Account Name, select your Azure subscription, and choose "Text Analytics API" as your API Type. Select "Free" or another option as your Pricing Tier, then complete the rest of the form. When ready, click Create at the bottom of the panel.

![](https://github.com/deldersveld/ms-cognitive/raw/master/images/portal-sign-up-2.PNG)

Open the Cognitive Services account and click Settings, then Keys. Copy the KEY 1 value for later use.

![](https://github.com/deldersveld/ms-cognitive/raw/master/images/portal-sign-up-3.PNG)

## Set up the R environment

Once you have an API key, you can run the following R code to define the base URL for the Text Analytics API and create a few helper functions for connecting to various API endpoints. This code is hosted in a Jupyter notebook, but it should run without issue in any R environment.

In [12]:
library(httr)
library(jsonlite)
library(dplyr)

base.url <- "https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/"

Request <- function(call.url, call.source){
  headers <- add_headers("Content-Type" = "application/json", "Ocp-Apim-Subscription-Key" = key)
  raw.result <- POST(call.url, headers, body = call.source)
  stop_for_status(raw.result)
  text.result <- fromJSON(content(raw.result, "text", encoding = "UTF-8"))
  final.result <- as.data.frame(text.result[1])
  return(final.result)
}

Sentiment <- function(source){
  sentiment.url <- paste0(base.url, "sentiment")
  sentiment.body <- toJSON(list(documents = source))
  sentiment.result <- Request(sentiment.url, sentiment.body)
  colnames(sentiment.result) <- c("sentiment", "id")
  return(sentiment.result)
}

KeyPhrases <- function(source){
  phrases.url <- paste0(base.url, "keyPhrases")
  phrases.body <- toJSON(list(documents = source))
  phrases.result <- Request(phrases.url, phrases.body)
  colnames(phrases.result) <- c("key.phrases", "id")
  return(phrases.result)
}

Languages <- function(source){
  languages.url <- paste0(base.url, "languages")
  languages.body <- toJSON(list(documents = source))
  languages.result <- Request(languages.url, languages.body)
  colnames(languages.result) <- c("id", "detected.languages")
  return(languages.result)
}
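
To make the request body concrete, here is a minimal sketch of what `toJSON(list(documents = source))` produces for a small data frame. The two example rows are made up for illustration; only the column names (`language`, `id`, `text`) follow the demo.

```r
library(jsonlite)

# Two made-up documents with the same columns the demo prepares later.
example.source <- data.frame(
  language = c("en", "en"),
  id = c("1", "2"),
  text = c("Great taffy at a great price.", "The flavor is very medicinal."),
  stringsAsFactors = FALSE
)

# Same call the helper functions use to build the request body.
# jsonlite serializes a data frame row by row, so each row becomes one
# object inside the "documents" array.
example.body <- toJSON(list(documents = example.source))
print(prettify(example.body))
```

Printing the body before sending it is a cheap way to confirm that `id` is a string and that every document carries a `language` and `text` field.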

## API key

Enter your API key within quotation marks and run the following code.

In [13]:
# Enter the key manually for now, as readline() or other key storage does not work for Jupyter input.
# You can regenerate your key in the Azure Portal so that anyone who has seen it publicly can no longer use it.

key <- "[Enter API key]"
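
If you would rather not paste the key into a notebook that may be shared, one possible alternative is to read it from an environment variable. This is only a sketch: `ReadKey` and the variable name `TEXT_ANALYTICS_KEY` are hypothetical, and whether it works in a hosted Jupyter environment depends on whether you can set environment variables for the R kernel.

```r
# ReadKey and TEXT_ANALYTICS_KEY are hypothetical names chosen for this sketch.
ReadKey <- function(var.name = "TEXT_ANALYTICS_KEY") {
  value <- Sys.getenv(var.name, unset = "")
  if (!nzchar(value)) {
    # Fail early with a clear message instead of sending an empty key.
    stop("Environment variable ", var.name, " is not set")
  }
  value
}

# Usage, after setting the variable outside the notebook:
# key <- ReadKey()
```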

## Download sample file

Run the following code block to download the sample of 100 Amazon reviews from Azure Blob storage.

In [14]:
GET("https://drecognitive.blob.core.windows.net/samples/amazon-fine-food-samples.csv", 
    write_disk("amazon-fine-food-samples.csv", overwrite=TRUE))

Response [https://drecognitive.blob.core.windows.net/samples/amazon-fine-food-samples.csv]
  Date: 2016-07-28 18:00
  Status: 200
  Content-Type: text/csv
  Size: 45.2 kB
<ON DISK>  amazon-fine-food-samples.csv
NULL

## Prepare the data

Run the following code to read the sample CSV file into an R data frame, then select and rename the relevant columns.
The output displays a preview of the data that will be sent to the Text Analytics API.

In [15]:
raw <- read.csv("amazon-fine-food-samples.csv", stringsAsFactors = FALSE)
language <- "en"
text.source <- select(raw, Id, Text)
text.source <- data.frame(language, text.source, stringsAsFactors = FALSE)
colnames(text.source) <- c("language", "id", "text")
text.source$id <- as.character(text.source$id)
head(text.source)

|   | language | id | text |
|---|---|---|---|
| 1 | en | 1 | I have bought several of the Vitality canned dog food products and have found them all to be of good quality. The product looks more like a stew than a processed meat and it smells better. My Labrador is finicky and she appreciates this product better than most. |
| 2 | en | 2 | Product arrived labeled as Jumbo Salted Peanuts...the peanuts were actually small sized unsalted. Not sure if this was an error or if the vendor intended to represent the product as "Jumbo". |
| 3 | en | 3 | This is a confection that has been around a few centuries. It is a light, pillowy citrus gelatin with nuts - in this case Filberts. And it is cut into tiny squares and then liberally coated with powdered sugar. And it is a tiny mouthful of heaven. Not too chewy, and very flavorful. I highly recommend this yummy treat. If you are familiar with the story of C.S. Lewis' "The Lion, The Witch, and The Wardrobe" - this is the treat that seduces Edmund into selling out his Brother and Sisters to the Witch. |
| 4 | en | 4 | If you are looking for the secret ingredient in Robitussin I believe I have found it. I got this in addition to the Root Beer Extract I ordered (which was good) and made some cherry soda. The flavor is very medicinal. |
| 5 | en | 5 | Great taffy at a great price. There was a wide assortment of yummy taffy. Delivery was very quick. If your a taffy lover, this is a deal. |
| 6 | en | 6 | I got a wild hair for taffy and ordered this five pound bag. The taffy was all very enjoyable with many flavors: watermelon, root beer, melon, peppermint, grape, etc. My only complaint is there was a bit too much red/black licorice-flavored pieces (just not my particular favorites). Between me, my kids, and my husband, this lasted only two weeks! I would recommend this brand of taffy -- it was a delightful treat. |
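
The demo sends all 100 rows in a single request. For larger files you may need to split the data frame into batches before calling the API. The sketch below shows one way to do that in base R; `SplitIntoBatches` is a hypothetical helper, and the default batch size of 1,000 is an assumption that should be checked against the current service limits.

```r
# Hypothetical helper: split a data frame into chunks of at most batch.size rows.
# The default of 1000 is an assumption, not a documented value from this demo.
SplitIntoBatches <- function(source, batch.size = 1000) {
  batch.index <- ceiling(seq_len(nrow(source)) / batch.size)
  split(source, batch.index)
}

# Made-up data with the same columns as text.source.
example.source <- data.frame(
  language = "en",
  id = as.character(1:5),
  text = paste("review", 1:5),
  stringsAsFactors = FALSE
)

batches <- SplitIntoBatches(example.source, batch.size = 2)
sapply(batches, nrow)
```

Each element of `batches` could then be passed to a helper such as `Sentiment` and the results combined with `do.call(rbind, ...)`. Going by the transaction accounting described earlier, batching changes the number of requests but not the number of transactions.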


## Get sentiment

Run the following code to pass the review data to the previously defined *Sentiment* function and store the response to a variable called *text.sentiment*. The output displays a preview of the response with sentiment scores ranging from 0 (negative) to 1 (positive).

In [16]:
text.sentiment <- Sentiment(text.source)
head(text.sentiment)

|   | sentiment | id |
|---|---|---|
| 1 | 0.9536904 | 1 |
| 2 | 0.2745261 | 2 |
| 3 | 0.996214 | 3 |
| 4 | 0.9855343 | 4 |
| 5 | 0.974648 | 5 |
| 6 | 0.4996719 | 6 |
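
A raw score between 0 and 1 is often easier to work with once it is bucketed into labels. The following sketch applies `cut()` to the six scores shown above. The thresholds (0.4 and 0.6) are arbitrary assumptions for illustration and are not defined by the API.

```r
# The six scores from the preview above, re-entered by hand.
example.sentiment <- data.frame(
  sentiment = c(0.9536904, 0.2745261, 0.996214, 0.9855343, 0.974648, 0.4996719),
  id = as.character(1:6),
  stringsAsFactors = FALSE
)

# Thresholds are an assumption: [0, 0.4] negative, (0.4, 0.6] neutral, (0.6, 1] positive.
example.sentiment$label <- cut(
  example.sentiment$sentiment,
  breaks = c(0, 0.4, 0.6, 1),
  labels = c("negative", "neutral", "positive"),
  include.lowest = TRUE
)

example.sentiment
```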


## Get key phrases

Run the following code to pass the review data to the *KeyPhrases* function and store the response to a variable called *text.key.phrases*. The output displays a preview of the response with a selection of key phrases from the review text.

In [17]:
text.key.phrases <- KeyPhrases(text.source)
head(text.key.phrases)

|   | key.phrases | id |
|---|---|---|
| 1 | stew, Vitality canned dog food products, good quality, processed meat, Labrador | 1 |
| 2 | Jumbo Salted Peanuts, Product, vendor, error | 2 |
| 3 | Witch, nuts, Lewis, Brother, tiny squares, tiny mouthful of heaven, pillowy citrus gelatin, light, treat, familiar, story, powdered sugar, Edmund, Wardrobe, Lion, Sisters, case Filberts, confection, centuries | 3 |
| 4 | addition, Root Beer, secret ingredient, Robitussin, cherry soda, flavor | 4 |
| 5 | Great taffy, taffy lover, wide assortment of yummy taffy, great price, deal, Delivery | 5 |
| 6 | brand of taffy, husband, root beer, black licorice-flavored pieces, watermelon, flavors, grape, peppermint, pound bag, wild hair, kids, weeks, delightful treat, particular favorites, complaint | 6 |
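
For counting or filtering, it can help to reshape the key phrases into one row per phrase. The sketch below assumes the phrases are stored as comma-separated strings, as they appear in the preview. In a live session the column may instead be a list of character vectors, in which case the `strsplit` step is unnecessary.

```r
# Three rows copied from the preview above; the comma-separated string
# layout is an assumption about how the column is stored.
example.phrases <- data.frame(
  key.phrases = c(
    "Jumbo Salted Peanuts, Product, vendor, error",
    "addition, Root Beer, secret ingredient, Robitussin, cherry soda, flavor",
    "Great taffy, taffy lover, wide assortment of yummy taffy, great price, deal, Delivery"
  ),
  id = c("2", "4", "5"),
  stringsAsFactors = FALSE
)

# Split each string on ", " and repeat the id once per phrase.
phrase.list <- strsplit(example.phrases$key.phrases, ", ", fixed = TRUE)
long.phrases <- data.frame(
  id = rep(example.phrases$id, lengths(phrase.list)),
  phrase = unlist(phrase.list),
  stringsAsFactors = FALSE
)

head(long.phrases)
```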


## Combine outputs

Run the following code to join the sentiment and key phrase responses back to the original data frame. 
The output displays a preview of the final data frame.

In [18]:
combined <- list(text.source, text.sentiment, text.key.phrases)
text.results <- Reduce(inner_join, combined)
head(text.results)

Joining by: "id"
Joining by: "id"


|   | language | id | text | sentiment | key.phrases |
|---|---|---|---|---|---|
| 1 | en | 1 | I have bought several of the Vitality canned dog food products and have found them all to be of good quality. The product looks more like a stew than a processed meat and it smells better. My Labrador is finicky and she appreciates this product better than most. | 0.9536904 | stew, Vitality canned dog food products, good quality, processed meat, Labrador |
| 2 | en | 2 | Product arrived labeled as Jumbo Salted Peanuts...the peanuts were actually small sized unsalted. Not sure if this was an error or if the vendor intended to represent the product as "Jumbo". | 0.2745261 | Jumbo Salted Peanuts, Product, vendor, error |
| 3 | en | 3 | This is a confection that has been around a few centuries. It is a light, pillowy citrus gelatin with nuts - in this case Filberts. And it is cut into tiny squares and then liberally coated with powdered sugar. And it is a tiny mouthful of heaven. Not too chewy, and very flavorful. I highly recommend this yummy treat. If you are familiar with the story of C.S. Lewis' "The Lion, The Witch, and The Wardrobe" - this is the treat that seduces Edmund into selling out his Brother and Sisters to the Witch. | 0.996214 | Witch, nuts, Lewis, Brother, tiny squares, tiny mouthful of heaven, pillowy citrus gelatin, light, treat, familiar, story, powdered sugar, Edmund, Wardrobe, Lion, Sisters, case Filberts, confection, centuries |
| 4 | en | 4 | If you are looking for the secret ingredient in Robitussin I believe I have found it. I got this in addition to the Root Beer Extract I ordered (which was good) and made some cherry soda. The flavor is very medicinal. | 0.9855343 | addition, Root Beer, secret ingredient, Robitussin, cherry soda, flavor |
| 5 | en | 5 | Great taffy at a great price. There was a wide assortment of yummy taffy. Delivery was very quick. If your a taffy lover, this is a deal. | 0.974648 | Great taffy, taffy lover, wide assortment of yummy taffy, great price, deal, Delivery |
| 6 | en | 6 | I got a wild hair for taffy and ordered this five pound bag. The taffy was all very enjoyable with many flavors: watermelon, root beer, melon, peppermint, grape, etc. My only complaint is there was a bit too much red/black licorice-flavored pieces (just not my particular favorites). Between me, my kids, and my husband, this lasted only two weeks! I would recommend this brand of taffy -- it was a delightful treat. | 0.4996719 | brand of taffy, husband, root beer, black licorice-flavored pieces, watermelon, flavors, grape, peppermint, pound bag, wild hair, kids, weeks, delightful treat, particular favorites, complaint |
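
The `Joining by: "id"` messages appear because `inner_join` is left to guess the join column. A possible variation, sketched here on made-up data, names the key explicitly so the join is silent and does not depend on which columns happen to share a name. `JoinById` is a hypothetical wrapper, not part of the original demo.

```r
library(dplyr)

# Small made-up stand-ins for text.source, text.sentiment, and text.key.phrases.
example.source <- data.frame(
  id = c("1", "2"),
  text = c("first review", "second review"),
  stringsAsFactors = FALSE
)
example.sentiment <- data.frame(
  sentiment = c(0.95, 0.27),
  id = c("1", "2"),
  stringsAsFactors = FALSE
)
example.phrases <- data.frame(
  key.phrases = c("good quality", "error"),
  id = c("1", "2"),
  stringsAsFactors = FALSE
)

# Hypothetical wrapper that fixes the join column to "id".
JoinById <- function(x, y) inner_join(x, y, by = "id")

example.results <- Reduce(JoinById, list(example.source, example.sentiment, example.phrases))
example.results
```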


## Next steps

The final output can then be stored, analyzed, or used elsewhere as needed. For example, you can rank reviews by sentiment or categorize them by parsing key phrases. You can even visualize this data using R or another application such as Microsoft Power BI.

In [19]:
text.results

|   | language | id | text | sentiment | key.phrases |
|---|---|---|---|---|---|
| 1 | en | 1 | I have bought several of the Vitality canned dog food products and have found them all to be of good quality. The product looks more like a stew than a processed meat and it smells better. My Labrador is finicky and she appreciates this product better than most. | 0.9536904 | stew, Vitality canned dog food products, good quality, processed meat, Labrador |
| 2 | en | 2 | Product arrived labeled as Jumbo Salted Peanuts...the peanuts were actually small sized unsalted. Not sure if this was an error or if the vendor intended to represent the product as "Jumbo". | 0.2745261 | Jumbo Salted Peanuts, Product, vendor, error |
| 3 | en | 3 | This is a confection that has been around a few centuries. It is a light, pillowy citrus gelatin with nuts - in this case Filberts. And it is cut into tiny squares and then liberally coated with powdered sugar. And it is a tiny mouthful of heaven. Not too chewy, and very flavorful. I highly recommend this yummy treat. If you are familiar with the story of C.S. Lewis' "The Lion, The Witch, and The Wardrobe" - this is the treat that seduces Edmund into selling out his Brother and Sisters to the Witch. | 0.996214 | Witch, nuts, Lewis, Brother, tiny squares, tiny mouthful of heaven, pillowy citrus gelatin, light, treat, familiar, story, powdered sugar, Edmund, Wardrobe, Lion, Sisters, case Filberts, confection, centuries |
| 4 | en | 4 | If you are looking for the secret ingredient in Robitussin I believe I have found it. I got this in addition to the Root Beer Extract I ordered (which was good) and made some cherry soda. The flavor is very medicinal. | 0.9855343 | addition, Root Beer, secret ingredient, Robitussin, cherry soda, flavor |
| 5 | en | 5 | Great taffy at a great price. There was a wide assortment of yummy taffy. Delivery was very quick. If your a taffy lover, this is a deal. | 0.974648 | Great taffy, taffy lover, wide assortment of yummy taffy, great price, deal, Delivery |
| 6 | en | 6 | I got a wild hair for taffy and ordered this five pound bag. The taffy was all very enjoyable with many flavors: watermelon, root beer, melon, peppermint, grape, etc. My only complaint is there was a bit too much red/black licorice-flavored pieces (just not my particular favorites). Between me, my kids, and my husband, this lasted only two weeks! I would recommend this brand of taffy -- it was a delightful treat. | 0.4996719 | brand of taffy, husband, root beer, black licorice-flavored pieces, watermelon, flavors, grape, peppermint, pound bag, wild hair, kids, weeks, delightful treat, particular favorites, complaint |
| 7 | en | 7 | This saltwater taffy had great flavors and was very soft and chewy. Each candy was individually wrapped well. None of the candies were stuck together, which did happen in the expensive version, Fralinger's. Would highly recommend this candy! I served it at a beach-themed party and everyone loved it! | 0.7225071 | great flavors, beach-themed party, Fralinger, expensive version, saltwater taffy, candy, candies |
| 8 | en | 8 | This taffy is so good. It is very soft and chewy. The flavors are amazing. I would definitely recommend you buying it. Very satisfying!! | 0.9966703 | taffy, flavors |
| 9 | en | 9 | Right now I'm mostly just sprouting this so my cats can eat the grass. They love it. I rotate it around with Wheatgrass and Rye too | 0.8462628 | cats, Wheatgrass, Rye |
| 10 | en | 10 | This is a very healthy dog food. Good for their digestion. Also good for small puppies. My dog eats her required amount at every feeding. | 0.8214344 | healthy dog food, small puppies, feeding, digestion |
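
As a small illustration of the first suggestion above, ranking reviews by sentiment, the sketch below sorts a few rows with base R's `order()`. The data frame is a hand-copied subset of the results (ids 1, 2, 6, and 8) so that the block runs on its own; with the real data you would sort `text.results` the same way.

```r
# Hand-copied subset of the results shown above.
example.results <- data.frame(
  id = c("1", "2", "6", "8"),
  sentiment = c(0.9536904, 0.2745261, 0.4996719, 0.9966703),
  stringsAsFactors = FALSE
)

# Sort ascending so the most negative reviews come first.
ranked <- example.results[order(example.results$sentiment), ]
ranked

# For the most positive first, use order(-example.results$sentiment).
```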
