# Custom Vision with R

## Not Hotdog

This example uses the Microsoft Custom Vision API to create a function that identifies
whether an image on the web is a hot dog ... or not a hotdog. (Inspired by an [episode of Silicon Valley](https://www.youtube.com/watch?v=ACmydtFDTGs).)

For an overview of the application, take a look at [this blog post](http://blog.revolutionanalytics.com/2018/04/not-hotdog.html). If you'd prefer to use R or RStudio on your laptop to run this script, you can [download this GitHub repository](https://github.com/revodavid/nothotdog).

Here are some useful references:

* [Overview of Custom Vision API](https://docs.microsoft.com/en-us/azure/cognitive-services/custom-vision-service/home)
* [Custom Vision Training API reference](https://southcentralus.dev.cognitive.microsoft.com/docs/services/fde264b7c0e94a529a0ad6d26550a761/operations/59568ae208fa5e09ecb9983a)
* [Custom Vision Prediction API reference](https://southcentralus.dev.cognitive.microsoft.com/docs/services/57982f59b5964e36841e22dfbfe78fc1/operations/5a3044f608fa5e06b890f164)

The Custom Vision API also has a Web interface. As you work through this example, you can see the 
effects of your API calls as you go by browsing to
https://www.customvision.ai/projects and logging in with your Microsoft Account.

**Note**: The Custom Vision API is in preview, which means that these APIs may change in the future. Custom Vision is only
available in the South Central US region while it's in preview, so we won't worry about regions in this example.

## Create authorization keys for Custom Vision

1. Visit https://portal.azure.com (and sign in if needed)
2. Click "+ Create a Resource" (top-left corner)
3. In the "Search the Marketplace" box, search for "Custom Vision Service"
4. Select "Custom Vision Service (preview)" and click "Create"
    * Name: qcon-customvision
    * Subscription: _there should be just one option_
    * Location: South Central US
    * Prediction Pricing Tier: F0 (free, 2 transactions per second)
    * Training Pricing Tier: F0 (2 projects)
    * Resource Group: Use existing "odsc" group
5. Click "Create"


In [None]:
library(httr)
library(jsonlite)

Download the `keys.txt` file and edit it to provide your own API keys. Follow the directions in `README.md` to create a Custom Vision resource in the Azure Portal, and retrieve Key 1 from the `qconcustomvision` and `qconcustomvision_Prediction` resources.
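
The exact layout of `keys.txt` isn't shown here, but the `read.table()` call and the printed output below suggest a header line containing only `key`, followed by one `name value` row per entry. A minimal sketch under that assumption (the key values are placeholders, and the temporary file stands in for the real `keys.txt`):

```r
## Assumed layout of keys.txt: the header has one field fewer than the
## data rows, so read.table() uses the first column as row names.
example_keys <- tempfile(fileext = ".txt")
writeLines(c("key",
             "region southcentralus",
             "custom YOUR-TRAINING-KEY",
             "cvpred YOUR-PREDICTION-KEY"),
           example_keys)

example <- read.table(example_keys, header = TRUE, stringsAsFactors = FALSE)

## Look up entries by row name, as the notebook does
example["custom", 1]
example["cvpred", 1]

## A row that is missing from the file yields NA rather than an error
is.na(example["vision", 1])
```

This is why a missing `cvpred` row shows up later as an `NA` prediction key instead of failing at this step.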

In [2]:
## Retrieve API keys from keys.txt file, set API endpoint 
keys <- read.table("keys.txt", header=TRUE, stringsAsFactors = FALSE)

region <- keys["region",1]
## We won't actually use the region here, except as a check that keys.txt has been edited
if (region=="ERROR-EDIT-KEYS.txt-FILE") 
 stop("Edit the file keys.txt to provide valid keys. See README.md for details.")

## retrieve custom vision key
cvision_api_key <- keys["custom",1]
cvision_pred_key <- keys["cvpred",1]
cvision_api_endpoint <- "https://southcentralus.api.cognitive.microsoft.com/customvision/v1.1/Training"

print(keys)


                                    key
region                           eastus
vision bd27c70c52b341b0bbdf32eea070f28e
custom 56ca7a2584d84152af136db676de702e


In [3]:
cvision_pred_key

The first step is to create a project using the Custom Vision API. When creating a project, you can optionally use specialized domains like Landmarks or Retail to better identify images. We'll use the "Food" domain: we look up its ID code using the API, and then use that ID when creating our project.

In [None]:
## Get the list of available training domains
domainsURL <- paste0(cvision_api_endpoint, "/domains")

APIresponse = GET(url = domainsURL,
                   content_type_json(),
                   add_headers(.headers= c('Training-key' = cvision_api_key)),
                   body="",
                   encode="json")

domains <- content(APIresponse)
print(domains)
## Assumes Food is the second domain in the list; check the printed output above
domains.Food <- domains[[2]]$Id

## Create a project
createURL <- paste0(cvision_api_endpoint, "/projects?",
                    "name=qconhotdog&",
                    'description=NotHotdog&',
                    'domainId=',domains.Food)

APIresponse = POST(url = createURL,
                   content_type_json(),
                   add_headers(.headers= c('Training-key' = cvision_api_key)),
                   body="",
                   encode="json")

cvision_id <- content(APIresponse)$Id
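
Picking the Food domain by position works only as long as the API returns the domains in the same order. A more defensive sketch looks the ID up by name. It assumes each entry in the parsed response carries a `Name` field next to `Id`; the `example_domains` list is made-up stand-in data, and `findDomainId` is a hypothetical helper, not part of the API.

```r
## Hypothetical helper: return the Id of the domain with the given name.
## Assumes each domain is a list with `Id` and `Name` fields.
findDomainId <- function(domains, name) {
  matches <- Filter(function(d) identical(d$Name, name), domains)
  if (length(matches) == 0) stop("No domain named '", name, "' found")
  matches[[1]]$Id
}

## Made-up stand-in for content(APIresponse); real Ids are GUIDs
example_domains <- list(
  list(Id = "id-general", Name = "General"),
  list(Id = "id-food", Name = "Food"),
  list(Id = "id-landmarks", Name = "Landmarks")
)

findDomainId(example_domains, "Food")
```

If the field name assumption holds, `findDomainId(domains, "Food")` could stand in for the positional lookup.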

In [None]:
## Next, create tags we will use to label the images
## We will use "hotdog" for hot dog images and "nothotdog" for similar looking foods
## We will save the tag ids returned by the API for use later

## function to create one tag, and return its id
createTag <- function(id, tagname) {
 eURL <- paste0(cvision_api_endpoint, "/projects/", id, "/tags?",
                "name=",tagname)
 
 APIresponse = POST(url = eURL,
                    content_type_json(),
                    add_headers(.headers= c('Training-key' = cvision_api_key)),
                    body="",
                    encode="json")

 content(APIresponse)$Id 
}

hotdog_tag <- createTag(cvision_id, "hotdog")
nothotdog_tag <- createTag(cvision_id, "nothotdog")
tags <- c(hotdog = hotdog_tag, nothotdog=nothotdog_tag)
tags
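
`createTag` pastes the tag name straight into the query string, which is fine for `hotdog` and `nothotdog` but would produce an invalid URL for a name containing spaces or `&`. A small sketch of building the same URL with the name percent-encoded, using base R's `URLencode`; `tagURL` is a hypothetical helper, and the endpoint and project ID below are placeholders.

```r
## Hypothetical helper: build the tag-creation URL with an encoded name
tagURL <- function(endpoint, id, tagname) {
  paste0(endpoint, "/projects/", id, "/tags?",
         "name=", utils::URLencode(tagname, reserved = TRUE))
}

## Placeholder endpoint and project ID, for illustration only
tagURL("https://example.invalid/Training", "PROJECT-ID", "chili dog & fries")
```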

This is a good opportunity to visit https://customvision.ai and log in to see your project. It's empty for now, but with our two tags defined. We'll upload pictures of hotdogs under the "hotdog" tag, and pictures of tacos, hamburgers, etc. under the "nothotdog" tag.

Note that our app will actually be classifying _three_ types of images: hotdogs, tacos/hamburgers, and everything else. When building a classifier, it's useful to include a category of images likely to be mistaken for your target image -- that's why we chose tacos and hamburgers. You can use the same technique to create a classifier that detects multiple image types; for example, you could add a "pizza" tag and also upload images of pizza.

We created the files `hotdogs-good.txt` and `nothotdogs-good.txt`
using ImageNet data and some visual inspection. See the file 
`nothotdog-find-data.R` if you want to see how it was done.

In [None]:
## Read in a file of URLs of images of hotdogs, and also a file
## of URLs of images that are somewhat similar to, but not, hotdogs
hotdogs <- scan("hotdogs-good.txt",what=character())
nothotdogs <- scan("nothotdogs-good.txt", what=character())

## A function to upload images to Custom Vision
uploadURLs <- function(id, tagname, urls) {
 ## id: Project ID
 ## tagname: one tag (applied to all URLs), as a tag ID
 ## urls: vector of image URLs

 eURL <- paste0(cvision_api_endpoint, "/projects/", id, "/images/url")
 success <- logical(0)
  
 ## The API accepts 64 URLs at a time, max, so:
 while(length(urls) > 0) {

  N <- min(length(urls), 64) 
  urls.body <- toJSON(list(TagIds=tagname, Urls=urls[1:N]))

  APIresponse = POST(url = eURL,
                    content_type_json(),
                    add_headers(.headers= c('Training-key' = cvision_api_key)),
                    body=urls.body,
                    encode="json")
 
  success <- c(success,content(APIresponse)$IsBatchSuccessful)
  urls <- urls[-(1:N)]
 }
 all(success)
}

## Upload the images to Custom Vision. Should return TRUE in both cases, indicating success.
## If you do get some FALSE results, it's probably because some URLs were unavailable.
uploadURLs(cvision_id, tags["hotdog"], hotdogs)
uploadURLs(cvision_id, tags["nothotdog"], nothotdogs)
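
The `while` loop in `uploadURLs` peels off up to 64 URLs per request. The same batching can be sketched with `split()`, which makes the batch sizes easy to inspect before anything is sent. `batchURLs` is a hypothetical helper and the URLs are generated placeholders.

```r
## Hypothetical helper: split a vector into batches of at most `size`
batchURLs <- function(urls, size = 64) {
  if (length(urls) == 0) return(list())
  unname(split(urls, ceiling(seq_along(urls) / size)))
}

## 150 placeholder URLs give batches of 64, 64 and 22
fake_urls <- paste0("https://example.invalid/img", 1:150, ".jpg")
batches <- batchURLs(fake_urls)
lengths(batches)
```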

Take another look at your project at https://customvision.ai . You'll now see the project populated with images, classified as `hotdog` and `nothotdog`.

We can also check on the status of the project via the API:

In [7]:
## Get status of projects
projURL <- paste0(cvision_api_endpoint, "/projects/")

APIresponse = GET(url = projURL,
                   content_type_json(),
                   add_headers(.headers= c('Training-key' = cvision_api_key)),
                   body="",
                   encode="json")

projStatus <- content(APIresponse)

## These two will be the same unless you have more than one project
print(projStatus[[1]]$Id)
print(cvision_id)


In [None]:
## Train project
trainURL <- paste0(cvision_api_endpoint, "/projects/",
                   cvision_id,
                   "/train")

APIresponse = POST(url = trainURL,
                   content_type_json(),
                   add_headers(.headers= c('Training-key' = cvision_api_key)),
                   body="",
                   encode="json")

trainOut <- content(APIresponse)

## Stop with the API's message if training could not be started
if(!is.null(trainOut$Code)) stop(trainOut$Message)

train.id <- trainOut$Id
print(train.id)

In [None]:
## Function to check status of a trained model (iteration)

iterStatus <- function(id) {
 iterURL <- paste0(cvision_api_endpoint, "/projects/",
                    cvision_id,
                    "/iterations/",
                    id)
 
 APIresponse = GET(url = iterURL,
                    content_type_json(),
                    add_headers(.headers= c('Training-key' = cvision_api_key)),
                    body="",
                    encode="json")
 
 content(APIresponse)$Status
}

In [None]:
## Keep checking this until the status is: Completed
iterStatus(train.id)
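
Rather than re-running the cell by hand, the check can be wrapped in a polling loop. This is a sketch, not part of the original script: `waitForStatus` is a hypothetical helper that takes any zero-argument function returning a status string, and the `"Training"` status used in the demonstration is an assumed intermediate value.

```r
## Hypothetical helper: call `check()` until it returns `target`,
## sleeping `interval` seconds between attempts, up to `max_tries` times.
waitForStatus <- function(check, target = "Completed",
                          interval = 5, max_tries = 60) {
  for (i in seq_len(max_tries)) {
    status <- check()
    if (identical(status, target)) return(invisible(status))
    Sys.sleep(interval)
  }
  stop("Status still '", status, "' after ", max_tries, " checks")
}

## Demonstration with a stand-in check that completes on the third call
calls <- 0
fake_check <- function() {
  calls <<- calls + 1
  if (calls < 3) "Training" else "Completed"
}
waitForStatus(fake_check, interval = 0)
calls
```

In the notebook this could be called as `waitForStatus(function() iterStatus(train.id))`.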

Now, visit https://customvision.ai, select your project and click the "Performance" tab. Experiment with moving the Probability Threshold slider, and note that you can decrease it (improving Recall) while generally maintaining Precision. This will be useful later.
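
To see why lowering the threshold raises Recall, and when Precision survives the change, it helps to compute both by hand. The scores and labels below are invented for illustration and do not come from the trained model; `precisionRecall` is a hypothetical helper.

```r
## Hypothetical helper: precision and recall of "score > threshold"
## Returns NaN for precision if nothing is predicted positive.
precisionRecall <- function(scores, truth, threshold) {
  predicted <- scores > threshold
  tp <- sum(predicted & truth)
  c(precision = tp / sum(predicted), recall = tp / sum(truth))
}

## Invented example: six images, four of which really are hotdogs
scores    <- c(0.95, 0.80, 0.60, 0.40, 0.30, 0.10)
is_hotdog <- c(TRUE, TRUE, TRUE, TRUE, FALSE, FALSE)

precisionRecall(scores, is_hotdog, 0.50)  # precision 1.0, recall 0.75
precisionRecall(scores, is_hotdog, 0.35)  # precision 1.0, recall 1.00
precisionRecall(scores, is_hotdog, 0.20)  # precision 0.8, recall 1.00
```

In this toy data the threshold can drop from 0.50 to 0.35 without hurting Precision, but dropping it further starts to admit false positives.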

We now need to provide our Prediction API key. If you didn't already retrieve it from the Azure Portal and include it
in `keys.txt` you can follow these instructions instead:

1. Visit https://customvision.ai
2. Click "Sign In"
3. Wait for projects to load, and then click your "qconhotdog" project
4. Click on Performance. Here you can check the precision and recall of your trained model.
5. Click on Prediction URL, and look at the "If you have an image URL" section
6. Check that the first part of the URL in the gray box matches `cvision_api_endpoint_pred`, below
7. Copy the key listed by "Set Prediction-Key Header to:" to `cvision_pred_key` below

You can also retrieve this key from the Azure Portal, by inspecting keys of the `qconcustomvision_Prediction` resource (this was created automatically when you created the `qconcustomvision` resource).

In [6]:
cvision_api_endpoint_pred <- "https://southcentralus.api.cognitive.microsoft.com/customvision/v1.1/Prediction"

## If you did not edit keys.txt to include your own cvpred key, then
## replace the key below with your prediction key, per the instructions above
## (this one won't work)
if(is.na(cvision_pred_key))
 cvision_pred_key<-"f39a6e96c89c4b08ae660b2e0d4145c5"
print(cvision_pred_key)

[1] "f39a6e96c89c4b08ae660b2e0d4145c5"


Now, we'll write a function to classify an image.

The prediction API will return a classification probability for both of our tags, `hotdog` and `nothotdog`. We'll use the following rules, with a default threshold of 50%:

* If the `hotdog` probability is above the threshold, classify as "hotdog"
* Otherwise, if the `nothotdog` probability is above the threshold, classify as "non-hotdog food"
* Otherwise, classify as "not hotdog".
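
The three rules can be written as a small pure function, which makes them easy to test without calling the API. A sketch, assuming the probabilities arrive as a named numeric vector; `classifyHotdog` is a hypothetical name and the probabilities are invented.

```r
## Hypothetical helper: apply the three rules to a named vector of
## probabilities with entries "hotdog" and "nothotdog"
classifyHotdog <- function(preds, threshold = 0.5) {
  if (preds[["hotdog"]] > threshold) {
    "hotdog"
  } else if (preds[["nothotdog"]] > threshold) {
    "non-hotdog food"
  } else {
    "not hotdog"
  }
}

classifyHotdog(c(hotdog = 0.92, nothotdog = 0.03))   # "hotdog"
classifyHotdog(c(hotdog = 0.10, nothotdog = 0.85))   # "non-hotdog food"
classifyHotdog(c(hotdog = 0.05, nothotdog = 0.20))   # "not hotdog"
```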

In [None]:
## Function to generate predictions from a single URL, with classifier cutoff threshold (0-1)
hotdog_predict <- function(imageURL, threshold = 0.5) {
 predURL <- paste0(cvision_api_endpoint_pred, "/", cvision_id,"/url?",
                   "iterationId=",train.id,
                   "&application=R"
                   )

 body.pred <- toJSON(list(Url=imageURL[1]), auto_unbox = TRUE)

 APIresponse = POST(url = predURL,
                    content_type_json(),
                    add_headers(.headers= c('Prediction-key' = cvision_pred_key)),
                    body=body.pred,
                    encode="json")
 
 out <- content(APIresponse)
 
 if(!is.null(out$Code)) msg <- paste0("Can't analyze: ", out$Message) else
 {  
  predmat <- matrix(unlist(out$Predictions), nrow=3)
  preds <- as.numeric(predmat[3,])
  names(preds) <- predmat[2,]
  
  ## uncomment this to see the class predictions
  ## print(preds)
  
  if(preds["hotdog"]>threshold) msg <- "Hotdog" else
   if(preds["nothotdog"]>threshold) msg <- "Not Hotdog (but it looks delicious!)" else
    msg <- "Not Hotdog"
  }

  ## print the URL -- it will become clickable in the notebook
  cat(imageURL[1],"\n")
  msg
}
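
`hotdog_predict` reshapes the predictions with `matrix(..., nrow=3)`, which relies on every prediction having exactly three fields in a fixed order. A sketch of a more explicit extraction, assuming the tag name and probability are stored in fields called `Tag` and `Probability` (these field names are assumptions, `extractPreds` is a hypothetical helper, and the stand-in response below is invented):

```r
## Hypothetical helper: named numeric vector of probabilities.
## Assumes each prediction is a list with `Tag` and `Probability` fields.
extractPreds <- function(predictions) {
  probs <- vapply(predictions,
                  function(p) as.numeric(p$Probability), numeric(1))
  names(probs) <- vapply(predictions, function(p) p$Tag, character(1))
  probs
}

## Invented stand-in for content(APIresponse)
fake_out <- list(Predictions = list(
  list(TagId = "tag-1", Tag = "hotdog", Probability = 0.91),
  list(TagId = "tag-2", Tag = "nothotdog", Probability = 0.04)
))

preds <- extractPreds(fake_out$Predictions)
preds["hotdog"] > 0.5
```

Looking fields up by name keeps working if the API adds a field or changes their order, whereas the matrix approach would silently misalign.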

In [None]:
## Since these images were in our training data, most of these should be correct.
## The exact number is determined by the threshold (here, 50%) and the Recall statistic
hotdog_predict(hotdogs[1])
hotdog_predict(nothotdogs[1])

In [None]:
## Here are some images to try, from a Google Image Search for "hotdog"
example.hotdogs <- c(
 "http://www.wienerschnitzel.com/wp-content/uploads/2014/10/hotdog_mustard-main.jpg",
 "https://qz.com/wp-content/uploads/2017/07/hotdogs2__2__720.jpg?quality=80&strip=all",
 "http://www.americangarden.us/wp-content/uploads/2016/10/Recipe_Hot-dog-sandwich.jpg",
 "http://www.hot-dog.org/sites/default/files/pictures/hot-dogs-on-the-grill-sm.jpg",
 "https://www.dairyqueen.com/Global/Food/Hot-Dogs_8-to-1_470x500.jpg?width=&height=810"
)

## and a few Not Hotdog images to try
example.nothotdogs <- c(
 "https://upload.wikimedia.org/wikipedia/commons/thumb/4/46/Burrito_with_rice.jpg/1200px-Burrito_with_rice.jpg", #burrito
 "https://www.biggerbolderbaking.com/wp-content/uploads/2015/12/IMG_8761.jpg", # croissant
 "https://bigoven-res.cloudinary.com/image/upload/t_recipe-480/sausage-rolls.jpg", #sausage roll
 "https://www.recipetineats.com/wp-content/uploads/2017/09/Spring-Rolls-6.jpg", #spring rolls
 "https://images-gmi-pmc.edge-generalmills.com/b8488ce5-b076-420d-b0d0-e83039cae278.jpg" # jelly roll
)

In [None]:
## try out the other examples as well
hotdog_predict(example.hotdogs[2])
hotdog_predict(example.nothotdogs[2])

In [None]:
## Here's an example where the classification is wrong, at the 50% threshold
hotdog_predict(example.nothotdogs[4])

In [None]:
## We can be more conservative, at the expense of misclassifying some actual hotdogs
hotdog_predict(example.nothotdogs[4], threshold = 0.70)
hotdog_predict(example.hotdogs[3], threshold = 0.7)