<a href="https://colab.research.google.com/github/jessecanada/MAPS/blob/master/MAPS_4_Phenotype_Classification_Azure.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## **MAPS Module 4 - Phenotype Classification**
This notebook will guide you through classifying phenotypes with Azure Custom Vision.


## Set up Azure environment

In [1]:
!pip -q install azure-cognitiveservices-vision-customvision


In [2]:
# data and file processing libraries
import numpy as np
import pandas as pd
import cv2
import matplotlib.pyplot as plt
import os
%matplotlib inline

# Azure related libraries
from azure.cognitiveservices.vision.customvision.training import CustomVisionTrainingClient
from azure.cognitiveservices.vision.customvision.prediction import CustomVisionPredictionClient
from msrest.authentication import ApiKeyCredentials
from azure.cognitiveservices.vision.customvision.training.models import ImageFileCreateBatch, ImageFileCreateEntry, Region

Set up your Azure trainer and predictor. Follow [this guide](https://docs.microsoft.com/en-us/azure/cognitive-services/custom-vision-service/quickstarts/object-detection?tabs=visual-studio&pivots=programming-language-python) to locate your endpoint, training key, and prediction key.

In [3]:
ENDPOINT = "your-endpoint" # ex: https://westus2.api.cognitive.microsoft.com/
training_key = "your-training-key"
prediction_key = "your-prediction-key"

In [4]:
credentials = ApiKeyCredentials(in_headers={"Training-key": training_key})
trainer = CustomVisionTrainingClient(ENDPOINT, credentials)
prediction_credentials = ApiKeyCredentials(in_headers={"Prediction-key": prediction_key})
predictor = CustomVisionPredictionClient(ENDPOINT, prediction_credentials)
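Hard-coding keys in a shared notebook makes them easy to leak. As a minimal sketch, assuming you have exported the three values as environment variables, you could read them instead of pasting them into a cell. The variable names and the `load_azure_settings` helper below are hypothetical; Azure does not define them.

```python
import os

# Hypothetical environment variable names; choose whatever suits your setup.
REQUIRED_VARS = (
    "AZURE_CV_ENDPOINT",
    "AZURE_CV_TRAINING_KEY",
    "AZURE_CV_PREDICTION_KEY",
)

def load_azure_settings(env=os.environ):
    """Return (endpoint, training_key, prediction_key), or raise if any is missing."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise KeyError(f"missing environment variables: {', '.join(missing)}")
    return tuple(env[name] for name in REQUIRED_VARS)
```

With the variables set, `ENDPOINT, training_key, prediction_key = load_azure_settings()` would replace the three assignments above.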

In [5]:
# list your projects
for project_name in trainer.get_projects():
  print(project_name)

{'additional_properties': {}, 'id': '1eae5342-91d5-4f2c-9848-9652c1e13b36', 'name': 'PTEN_classification', 'description': 'classify PTEN variant localization', 'settings': <azure.cognitiveservices.vision.customvision.training.models._models_py3.ProjectSettings object at 0x7fe04c8e0a20>, 'created': datetime.datetime(2020, 1, 9, 22, 59, 28, 490000, tzinfo=<isodate.tzinfo.Utc object at 0x7fe04c989d68>), 'last_modified': datetime.datetime(2020, 1, 9, 22, 59, 28, 490000, tzinfo=<isodate.tzinfo.Utc object at 0x7fe04c989d68>), 'thumbnail_uri': None, 'dr_mode_enabled': False, 'status': 'Succeeded'}
{'additional_properties': {}, 'id': '852eead8-f80d-4645-9c3d-5ba1fa221df2', 'name': 'PTEN_obj_detect', 'description': 'detect cells expressing PTEN', 'settings': <azure.cognitiveservices.vision.customvision.training.models._models_py3.ProjectSettings object at 0x7fe04c8e0dd8>, 'created': datetime.datetime(2019, 10, 4, 15, 53, 58, 703000, tzinfo=<isodate.tzinfo.Utc object at 0x7fe04c989d68>), 'last_m

In [6]:
# copy the 'id' value of your classification project and paste it below
project = trainer.get_project(project_id="1eae5342-91d5-4f2c-9848-9652c1e13b36")
# if the project loaded successfully, its id is echoed back
project.id

'1eae5342-91d5-4f2c-9848-9652c1e13b36'

In [12]:
# list the iterations of your classification model (all iterations, published or not)
for it in trainer.get_iterations(project.id):
  print(it.name)

Iteration 7
Iteration 4
Iteration 2


In [13]:
# specify the published name of the iteration you want to use (no spaces)
publish_iteration_name = "Iteration2"
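The prediction call needs the name an iteration was published under, which can differ from its display name (here, "Iteration 2" versus "Iteration2"). The sketch below assumes each iteration object exposes `name` and `publish_name` attributes, as the SDK's `Iteration` model does; the helper name `published_iterations` is hypothetical.

```python
def published_iterations(iterations):
    """Map display name -> publish name, skipping unpublished iterations."""
    return {
        it.name: it.publish_name
        for it in iterations
        if getattr(it, "publish_name", None)  # None or missing means unpublished
    }
```

Calling `published_iterations(trainer.get_iterations(project.id))` would then show which names are valid choices for `publish_iteration_name`.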

## Get the ROI files ready for classification

In [7]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [8]:
# unzip zip file containing individual ROI images
!unzip -q -d /content/ path-to-your-ROI-zip

In [9]:
# confirm how many cells are to be analyzed
!ls path-to-ROIs-folder | wc -l

547
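Note that `ls | wc -l` counts every entry in the folder, while the classification loop further down only processes files ending in `.jpg`. A minimal sketch of a count that matches the loop's filter (the helper name `count_jpgs` is hypothetical):

```python
from pathlib import Path

def count_jpgs(folder):
    """Count files directly inside `folder` whose names end in '.jpg'."""
    return sum(
        1
        for p in Path(folder).iterdir()
        if p.is_file() and p.name.endswith(".jpg")
    )
```

If this number is lower than the `ls` count, the folder contains files the loop will skip.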


## Azure classification predictions

In [None]:
wrk_dir = "/content/single_cells_C124R/" # replace with your ROI folder path
temp_list = []

for entry in os.scandir(wrk_dir):
  if entry.name.endswith('.jpg'):
    image_ID = entry.name[:-4]
    print(f'image_ID: {image_ID}')

    # open an image and get back the prediction results
    with open(wrk_dir+entry.name, mode="rb") as image: # rb: 'read binary' (for images)
      results = predictor.classify_image(project.id, publish_iteration_name, image)
    
      # get prediction results
      tags = [prediction.tag_name for prediction in results.predictions]
      probabilities = [prediction.probability*100 for prediction in results.predictions]
      # make a dictionary of tag:prob pairs
      predictions_dict = dict(zip(tags, probabilities))
      # sort the tags in alphabetical order, append the corresponding prob of the sorted tags
      predictions_list = [predictions_dict[i] for i in sorted(predictions_dict)]
      # add image_ID to the beginning of the list
      predictions_list.insert(0, image_ID)
      # append the sorted list to a list as a compound list
      temp_list.append(predictions_list)
    
      for i in sorted(predictions_dict) : 
        print(f'{i}: {predictions_dict[i]:.2f}') 
      print()
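The row-building steps inside the loop can be isolated into a small function, which makes them easy to check without calling Azure. This is a sketch under the assumption that each prediction exposes `tag_name` and `probability`, as used above; `prediction_row` is a hypothetical helper and the sample values are made up for illustration.

```python
from types import SimpleNamespace

def prediction_row(image_id, predictions):
    """Return [image_id, p1, p2, ...] with tags in alphabetical order.

    Probabilities are converted from fractions to percentages.
    """
    by_tag = {p.tag_name: p.probability * 100 for p in predictions}
    return [image_id] + [by_tag[tag] for tag in sorted(by_tag)]

# Stand-in objects mimicking the two fields read from each prediction.
fake_predictions = [
    SimpleNamespace(tag_name="nuclear", probability=0.25),
    SimpleNamespace(tag_name="diffused", probability=0.5),
    SimpleNamespace(tag_name="non_nuclear", probability=0.25),
]
row = prediction_row("cell_0", fake_predictions)
print(row)  # ['cell_0', 50.0, 25.0, 25.0]
```

Sorting by tag name keeps the column order identical for every image, regardless of the order in which the service returns the predictions.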

Convert the prediction results into a DataFrame.

In [16]:
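# column order matches each row: image_ID first, then the tags alphabetically
# (predictions_dict is left over from the last image processed in the loop above)
col_names = ['image_ID'] + sorted(predictions_dict)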
df_predict = pd.DataFrame(temp_list, columns = col_names)
df_predict.head(10)

| | image_ID | diffused | non_nuclear | nuclear |
|---|---|---|---|---|
| 0 | merged_191120110001_C01f318_0 | 2.824828e-11 | 1.619647e-13 | 100.0 |
| 1 | merged_191120110001_C01f339_7 | 5.389369e-07 | 100.0 | 7.847335e-26 |
| 2 | merged_191120110001_C01f16_4 | 0.2339669 | 99.76603 | 7.171362e-07 |
| 3 | merged_191120110001_C01f377_3 | 1.155122e-15 | 5.636735e-06 | 100.0 |
| 4 | merged_191120110001_C01f34_7 | 1.634453e-17 | 1.196953e-20 | 100.0 |
| 5 | merged_191120110001_C01f190_0 | 99.40806 | 1.519623e-07 | 0.5919323 |
| 6 | merged_191120110001_C01f19_1 | 1.086208e-17 | 3.688486e-24 | 100.0 |
| 7 | merged_191120110001_C01f257_4 | 9.557129e-18 | 2.803921e-25 | 100.0 |
| 8 | merged_191120110001_C01f73_1 | 0.3693393 | 38.37759 | 61.25307 |
| 9 | merged_191120110001_C01f32_3 | 3.303315 | 96.69129 | 0.005401373 |
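Each row holds one percentage per phenotype, so the most likely phenotype for a cell is simply the column with the largest value. As a minimal sketch, assuming the layout shown above (an `image_ID` column followed by one column per tag), the helper below appends that label and its score. The function name and the two new column names are hypothetical.

```python
import pandas as pd

def add_top_phenotype(df, id_col="image_ID"):
    """Return a copy of df with the highest-scoring label and its score appended."""
    label_cols = [c for c in df.columns if c != id_col]
    out = df.copy()
    out["top_phenotype"] = df[label_cols].idxmax(axis=1)    # column name of the row maximum
    out["top_probability"] = df[label_cols].max(axis=1)     # the maximum itself, in percent
    return out
```

For example, `add_top_phenotype(df_predict)["top_phenotype"].value_counts()` would tally how many cells fall into each class. Ambiguous cells, such as row 8 above, are easier to spot by filtering on `top_probability`.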


Save the DataFrame to a CSV file.

In [None]:
df_predict.to_csv('your-csv-name.csv', index=False)