# Custom Vision Deep Dive
## Contents
- Deploy Resources
- Prepare Data
  - Convert VIA's COCO annotation json files to PASCAL VOC xml format
  - Resize images to less than 6MB
- Create a Custom Vision service project
  - Apply resizing factor and normalize tag coordinates
  - Upload images and tag to project
  - Train the project
  - Evaluate model
  
### Useful Resources
- Python SDK Docs: https://docs.microsoft.com/en-us/python/api/azure-cognitiveservices-vision-customvision/?view=azure-python
- Python Custom Vision Client Lib: https://pypi.org/project/azure-cognitiveservices-vision-customvision/
- Azure SDK for Python: https://github.com/Azure/azure-sdk-for-python/

## Deploy Resources
In your Azure Portal, start by deploying:
- Custom Vision https://portal.azure.com/#create/Microsoft.CognitiveServicesCustomVision
  - Deploy both Training and Prediction resources in the same location (e.g. East US) at the same pricing tier (S0)
- Machine Learning https://portal.azure.com/#create/Microsoft.MachineLearningServices
  - Deploy in the same location with "Enterprise" as the Workspace Edition

## Prepare data 
### Convert VIA's COCO annotation json files to PASCAL VOC xml format

In [1]:
import argparse
import glob
import json
import os
import re
import urllib.request
from pathlib import Path

import cytoolz
from lxml import etree, objectify

USE_SUBFOLDERS = False
USE_ORIG_CODE = False
DOWNLOAD_IMAGES = True

In [2]:
# Helper function: Takes a COCO json and returns first level xml file structure

def instance2xml_base(anno):
    E = objectify.ElementMaker(annotate=False)
    anno_tree = E.annotation(
        
        E.folder('VOC2014_instance/{}'.format(anno['category_id'])),
        E.filename(anno['file_name']),
        E.source(
            E.database('MS COCO 2014'),
            E.annotation('MS COCO 2014'),
            E.image('Flickr'),
            E.url(anno['coco_url'])
        ),
        E.size(
            E.width(anno['width']),
            E.height(anno['height']),
            E.depth(3)
        ),
        E.segmented(0),
    )
    return anno_tree
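
To make the shape of the generated file easier to picture, here is a minimal sketch of the same top-level structure built with the standard library's `xml.etree.ElementTree` instead of `lxml.objectify`. It is an illustration only: the function name `voc_skeleton`, the file name and the dimensions are hypothetical, and the `folder` and `source` elements are left out for brevity.

```python
import xml.etree.ElementTree as ET


def voc_skeleton(file_name, width, height, depth=3):
    """Build the top-level <annotation> element of a PASCAL VOC file."""
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = file_name
    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(width)
    ET.SubElement(size, "height").text = str(height)
    ET.SubElement(size, "depth").text = str(depth)
    ET.SubElement(root, "segmented").text = "0"
    return root


# Hypothetical file name and dimensions
xml_text = ET.tostring(voc_skeleton("example.jpg", 640, 480), encoding="unicode")
print(xml_text)
```

One `<object>` element per bounding box is then appended to this root, which is what the next helper produces.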

In [3]:
# Helper function: Takes bounding box and returns xml element with structured coordinate info

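def instance2xml_bbox(anno, bbox_type='xyxy'):
    """Convert a COCO bbox ([xmin, ymin, width, height]) into a VOC <object> element.

    bbox_type selects the output coordinates:
    'xyxy' converts to corners (xmin, ymin, xmax, ymax), as PASCAL VOC expects;
    'xywh' writes the COCO values through unchanged.
    """
    assert bbox_type in ['xyxy', 'xywh']
    if bbox_type == 'xyxy':
        # COCO stores [xmin, ymin, width, height]; add the size to get the far corner
        xmin, ymin, w, h = anno['bbox']
        xmax = xmin + w
        ymax = ymin + h
    else:
        # Values pass through as-is, so xmax/ymax actually hold width/height here
        xmin, ymin, xmax, ymax = anno['bbox']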
    E = objectify.ElementMaker(annotate=False)
    anno_tree = E.object(
        E.name(anno['category_id']),
        E.bndbox(
            E.xmin(xmin),
            E.ymin(ymin),
            E.xmax(xmax),
            E.ymax(ymax)
        ),
        E.difficult(anno['iscrowd'])
    )
    return anno_tree
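
As a worked example of the `xyxy` branch, the sketch below converts one COCO box to VOC corners. The function name `coco_bbox_to_voc` is hypothetical; the box values are taken from the first annotation printed in the conversion output of this notebook.

```python
def coco_bbox_to_voc(bbox):
    """Convert a COCO box [xmin, ymin, width, height] to VOC corners."""
    xmin, ymin, width, height = bbox
    return xmin, ymin, xmin + width, ymin + height


# bbox of annotation id 111 from the conversion output
print(coco_bbox_to_voc([2882, 518, 351, 2895]))  # (2882, 518, 3233, 3413)
```

The resulting corner (3233, 3413) matches the largest x and y values in that annotation's `segmentation` list, which is a quick sanity check on the conversion.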

In [21]:
# Takes COCO json and returns PASCAL VOC

def parse_instance(content, outdir):
    categories = {d['id']: d['name'] for d in content['categories']}
    # merge images and annotations: id in images vs image_id in annotations
    
    # Converts "image_id" in json from str to int
    if not USE_ORIG_CODE: 
        for i in range(len(content['annotations'])):
            content['annotations'][i]['image_id'] = int(content['annotations'][i]['image_id'])

    # PATRICK - need to download images and update the filename field
    if DOWNLOAD_IMAGES:
        im_dir = os.path.join(outdir, "images")
        if not os.path.exists(im_dir):
            os.makedirs(im_dir)
    anno_dir = os.path.join(outdir, "annotations")
    if not os.path.exists(anno_dir):
        os.makedirs(anno_dir)

    # PATRICK - need to update 'filename' field so that looks like a local file and not a url
    if DOWNLOAD_IMAGES:
        if not USE_ORIG_CODE: 
            for obj in content['images']:
                im_local_filename = os.path.splitext(os.path.basename(obj['file_name']))[0] + ".jpg"
                obj['file_name'] = im_local_filename

                # download image
                dst_path = os.path.join(im_dir, im_local_filename)
                urllib.request.urlretrieve(obj['coco_url'], dst_path)


    merged_info_list = list(map(cytoolz.merge, cytoolz.join('id', content['images'], 'image_id', content['annotations'])))

    # convert category id to name
    for instance in merged_info_list:
        try:
            instance['category_id'] = categories[instance['category_id']]
        except KeyError:
            instance['category_id'] = -1
            print(instance)
    # group by filename to pool all bbox in same file


    img_filenames = {}
    for name, groups in cytoolz.groupby('file_name', merged_info_list).items():
        anno_tree = instance2xml_base(groups[0])
        # if one file has multiple different objects, save it in each category sub-directory
        filenames = []
        for group in groups:
            filename = os.path.splitext(name)[0] + ".xml"

            # PATRICK - save all annotations in single folder, rather than separate folders for each object 
            if USE_SUBFOLDERS:
                filenames.append(os.path.join(outdir, re.sub(" ", "_", group['category_id']), filename)) 
            else:
                filenames.append(os.path.join(outdir, "annotations", filename))

            anno_tree.append(instance2xml_bbox(group, bbox_type='xyxy'))

        for filename in filenames:
            print(filename)
            etree.ElementTree(anno_tree).write(filename, pretty_print=True)
        print("Formatting instance xml file {} done!".format(name))
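
The `cytoolz.join` / `cytoolz.merge` line in `parse_instance` is dense, so here is a plain-Python sketch of what it is doing, under the assumption that every annotation should be paired with the image whose `id` equals its `image_id`. The function name `join_and_merge` and the sample records are hypothetical. It also shows why `image_id` is cast to `int` first: a string `'10'` would never match an integer `10`.

```python
def join_and_merge(images, annotations):
    """Pair each annotation with its image and merge the two dicts.

    Keys present in both (such as 'id') take the annotation's value,
    because the annotation dict is merged last.
    """
    images_by_id = {img['id']: img for img in images}
    merged = []
    for anno in annotations:
        img = images_by_id.get(anno['image_id'])
        if img is not None:  # inner join: skip annotations without an image
            merged.append({**img, **anno})
    return merged


# Hypothetical sample data
images = [{'id': 10, 'file_name': 'a.jpg', 'width': 640, 'height': 480}]
annotations = [
    {'id': 111, 'image_id': 10, 'category_id': 1, 'bbox': [5, 5, 20, 30]},
    {'id': 112, 'image_id': 99, 'category_id': 2, 'bbox': [0, 0, 1, 1]},
]
rows = join_and_merge(images, annotations)
print(rows)
```

Each merged record carries both the image fields (`file_name`, `width`, `height`) and the annotation fields (`bbox`, `category_id`), which is the shape `instance2xml_base` and `instance2xml_bbox` expect.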

In [22]:
parser = argparse.ArgumentParser()
parser.add_argument("--anno_file", help="annotation file for object instance/keypoint")
parser.add_argument("--type", type=str, help="object instance or keypoint", choices=['instance', 'keypoint'])
parser.add_argument("--output_dir", help="output directory for voc annotation xml file")

args = parser.parse_args(['--anno_file', './test.json', '--type', 'instance', '--output_dir', './output'])

def coco2voc(args):
    os.makedirs(args.output_dir, exist_ok=True)
    # Use a context manager so the annotation file is closed after reading
    with open(args.anno_file, 'r') as f:
        content = json.load(f)
    if args.type == 'instance':
        parse_instance(content, args.output_dir)

In [24]:
filepath = "VIA - AeroLabs 2019_12.12 Project 02 SCOTT 2020_01.29 EDDIE 2020_01.23 export_coco.json"
fileStem = Path(filepath).stem # removes file extension
output_dir = './'
args = parser.parse_args(['--anno_file', filepath, '--type', 'instance', '--output_dir', output_dir])

coco2voc(args)

{'id': 111, 'width': -1, 'height': -1, 'file_name': '_PRCuKUcqrNJZ1cnty50AYA==..jpg', 'license': 1, 'flickr_url': 'https://customersiteimages.blob.core.windows.net/tenant-7/_PRCuKUcqrNJZ1cnty50AYA==..JPG', 'coco_url': 'https://customersiteimages.blob.core.windows.net/tenant-7/_PRCuKUcqrNJZ1cnty50AYA==..JPG', 'date_captured': '', 'image_id': 10, 'segmentation': [3233, 522, 2920, 518, 2882, 3413, 3074, 3400], 'area': 1016145, 'bbox': [2882, 518, 351, 2895], 'iscrowd': 0, 'category_id': -1}
{'id': 181, 'width': -1, 'height': -1, 'file_name': '_w05T7MCiPFNBTqdZ7J4Ebw==..jpg', 'license': 1, 'flickr_url': 'https://customersiteimages.blob.core.windows.net/tenant-7/_w05T7MCiPFNBTqdZ7J4Ebw==..JPG', 'coco_url': 'https://customersiteimages.blob.core.windows.net/tenant-7/_w05T7MCiPFNBTqdZ7J4Ebw==..JPG', 'date_captured': '', 'image_id': 19, 'segmentation': [2828, 819, 3333, 835, 2907, 3609, 2636, 3580], 'area': 1944630, 'bbox': [2636, 819, 697, 2790], 'iscrowd': 0, 'category_id': -1}
{'id': 257, 'w

### Resize images to less than 6MB

In [48]:
from PIL import Image
import glob
import json
import ntpath
import os
import shutil
from pathlib import Path

In [50]:
resizeFactors = {}

source = './images/*'
dest = './imagesResized/'
os.makedirs(dest, exist_ok=True)  # make sure the output folder exists
newSize = 6000000 # bytes equal to 6MB

for filepath in glob.iglob(source):
    fileName = ntpath.basename(filepath)
    fileSize = os.stat(filepath).st_size

    if(fileSize > newSize):
        #resize if larger than newSize
        image = Image.open(filepath)
        resizeFactor = 1 - (fileSize - newSize)/fileSize
        resizeFactors[fileName] = resizeFactor
        newX = round(image.size[0] * resizeFactor)
        newY = round(image.size[1] * resizeFactor)

        # Image.ANTIALIAS was removed in Pillow 10; Image.LANCZOS is the same filter
        image = image.resize((newX, newY), Image.LANCZOS)
        image.save(dest + fileName, optimize=True, quality=95)
        
        saveSize = os.stat(dest + fileName).st_size
        print(fileName + ' resized. New size:', str(saveSize/1000000), 'Factor:', str(resizeFactor))
    else:
        #copy without resizing
        shutil.copy(filepath, dest)
        print(fileName + ' copied without resizing. Size:', str(fileSize/1000000))

# write resize factors to json file
# (json.dump avoids rebinding the name `json`, which would shadow the module)
with open("./resizeFactors.json", "w") as f:
    json.dump(resizeFactors, f)

43473-39705_r8oNR+IVSJ+IsZLyGdj1bw==..jpg copied without resizing. Size: 4.83416
43489-39586_sDQABWrQ8qV_rm3I+dXXYA==..jpg resized. New size: 2.767931 Factor: 0.4909496701309166
43592-39684_kPzk8zqAB_C_KTmnUx7z3Q==..jpg resized. New size: 2.518606 Factor: 0.4483103072442388
43725-39733_yGt4MZnXZqTu5S0YJNdCoQ==..jpg resized. New size: 2.783641 Factor: 0.49918271310297213
43795-39759_hhAvT6bZSUqiX1Uvnm_htw==..jpg copied without resizing. Size: 4.344486
43968-39826_i+19otkVp_NCWi94YYimFw==..jpg resized. New size: 2.529202 Factor: 0.4603239852272827
44183-39843_xkWlCFjb85re0JdZbTgAOQ==..jpg resized. New size: 4.593747 Factor: 0.9811896139116988
45577-39954_EUPG_1RCGsN_30l5Qimp0g==..jpg copied without resizing. Size: 4.027278
45598-40066_8Y3lqDsgdAaSoUXvDjxwIA==..jpg resized. New size: 3.023037 Factor: 0.5268668274722765
47956-45186_g515Tcx6AHWlXCskf8gCNA==..jpg resized. New size: 2.80573 Factor: 0.4779530989403541
48026-45295_1VCnpc2LJhqhznRoPhzktg==..jpg resized. New size: 2.850919 Factor
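
The output above shows that a factor of about 0.49 produces files of roughly 2.8 MB, well below the 6 MB target. The expression `1 - (fileSize - newSize)/fileSize` simplifies to `newSize/fileSize`, and applying that ratio to both width and height shrinks the pixel count by the factor squared. The sketch below compares that factor with a square-root variant. The square-root version rests on the assumption that file size scales roughly with pixel count, which is only an approximation for JPEG files; the function names and the 12 MB example are hypothetical.

```python
import math

TARGET_BYTES = 6_000_000  # same 6 MB limit as newSize above


def linear_factor(file_size, target=TARGET_BYTES):
    """Factor used in the cell above: the ratio of target to current file size."""
    return min(1.0, target / file_size)


def area_factor(file_size, target=TARGET_BYTES):
    """Alternative, assuming file size scales with pixel count."""
    return min(1.0, math.sqrt(target / file_size))


size = 12_000_000  # hypothetical 12 MB image
print(linear_factor(size))  # 0.5
print(area_factor(size))    # about 0.707
```

Whichever factor is used, it must be the one written to `resizeFactors.json`, because the tag coordinates are scaled by the same value later on.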

## Create a Custom Vision service project


In [28]:
# Get workspace environment configs
import os
import azureml.core
from azureml.core.compute import ComputeTarget, DataFactoryCompute
from azureml.exceptions import ComputeTargetException
from azureml.core import Workspace, Experiment
from azureml.pipeline.core import Pipeline
from azureml.core.datastore import Datastore
from azureml.data.data_reference import DataReference
from azureml.pipeline.steps import DataTransferStep

# Check core SDK version number
print("SDK version:", azureml.core.VERSION)

ws = Workspace.from_config()
print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep = '\n')

SDK version: 1.0.85
exelonclearsight
exelonclearsight
northcentralus
c9cf37d1-b965-431c-8eab-91b60c77b93a


In [29]:
from azure.cognitiveservices.vision.customvision.training import CustomVisionTrainingClient
from azure.cognitiveservices.vision.customvision.training.models import ImageFileCreateEntry, Region

ENDPOINT = "https://mscvpoc-cogservices.cognitiveservices.azure.com/"

# Replace with a valid key
training_key = "5d9d4101ef7d417b94441bd8d2d07b08"
prediction_key = "221b7c70884e4b268dd43ce40842fab0"
prediction_resource_id = "/subscriptions/834ecbdb-c8e6-49c7-81da-2ceb47cb97a6/resourceGroups/MS-CV-POC-RG/providers/Microsoft.CognitiveServices/accounts/mscvpoccogservices-Prediction"
# Alternative resource: "/subscriptions/c9cf37d1-b965-431c-8eab-91b60c77b93a/resourceGroups/exelonClearsight/providers/Microsoft.CognitiveServices/accounts/exelonClearsight-Prediction"

trainer = CustomVisionTrainingClient(training_key, endpoint=ENDPOINT)

# Find the object detection domain
obj_detection_domain = next(domain for domain in trainer.get_domains() if domain.type == "ObjectDetection" and domain.name == "General")
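
The cell above keeps the keys directly in the notebook. One common alternative, sketched here as a suggestion rather than as part of the original workflow, is to read them from environment variables so they do not end up in version control. The helper `read_setting` and the variable names are hypothetical.

```python
import os


def read_setting(name, default=None):
    """Return an environment variable, failing loudly if it is missing."""
    value = os.environ.get(name, default)
    if value is None:
        raise KeyError(f"Set the {name} environment variable before running this cell")
    return value


# Hypothetical variable names; choose whatever fits your environment.
# training_key = read_setting("CUSTOM_VISION_TRAINING_KEY")
# prediction_key = read_setting("CUSTOM_VISION_PREDICTION_KEY")
```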


In [30]:
# Create a new project
print ("Creating project...")
project = trainer.create_project("Custom Vision Deep Dive", domain_id=obj_detection_domain.id)

if(project is not None):
    print('Project created successfully')
else:
    print('Project creation failed')

Creating project...
Project created successfully


### Apply resizing factor and normalize tag coordinates

In [31]:
import xml.etree.ElementTree as ET
from pathlib import Path
from PIL import Image
import ntpath

# Import resize factors from json file
import json
with open('resizeFactors.json') as f:
    resizeDict = json.load(f)

# Create dictionary of tag names and tag object
tagDict = {}

# Helper function to return dictionary of all tag names and corresponding normalized coordinates
def getTagsNorm(xml_file: str, imagePath):
    # Size of the (possibly resized) image that will be uploaded
    with Image.open(imagePath) as img:
        width, height = img.size

    # Images that were copied without resizing have no entry, so default to 1
    resizeFactor = resizeDict.get(ntpath.basename(imagePath), 1)

    tree = ET.parse(xml_file)
    root = tree.getroot()

    filename = root.find('filename').text

    # One (initially empty) list of boxes per tag already known to the project
    boxesByTag = {name: [] for name in tagDict}

    for obj in root.iter('object'):
        tag = obj.find('name').text

        for box in obj.findall("bndbox"):
            # Scale the original pixel coordinates to the resized image
            xmin = int(box.find("xmin").text) * resizeFactor
            ymin = int(box.find("ymin").text) * resizeFactor
            xmax = int(box.find("xmax").text) * resizeFactor
            ymax = int(box.find("ymax").text) * resizeFactor

            # Normalized coordinates are given in the order: left, top, width, height.
            normBox = [xmin / width, ymin / height,
                       (xmax - xmin) / width, (ymax - ymin) / height]

            # Create the tag in the project the first time it is seen
            if tag not in tagDict:
                print("Create new tag:", tag)
                tagDict[tag] = trainer.create_tag(project.id, tag)
            boxesByTag.setdefault(tag, []).append(normBox)

    return filename, boxesByTag
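
The arithmetic inside the helper can be checked in isolation. The sketch below is a standalone version of the scale-then-normalize step, with a hypothetical function name `normalize_box` and made-up image dimensions; it makes no calls to the Custom Vision service.

```python
def normalize_box(xmin, ymin, xmax, ymax, width, height, resize_factor=1.0):
    """Return (left, top, width, height) as fractions of the image size.

    The pixel coordinates refer to the original image; width and height
    are the dimensions of the resized image.
    """
    xmin, ymin, xmax, ymax = (v * resize_factor for v in (xmin, ymin, xmax, ymax))
    return (xmin / width, ymin / height,
            (xmax - xmin) / width, (ymax - ymin) / height)


# Hypothetical 4000x3000 image shrunk by 0.5 to 2000x1500
print(normalize_box(1000, 600, 3000, 2400, width=2000, height=1500, resize_factor=0.5))
# (0.25, 0.2, 0.5, 0.6)
```

All four values should fall between 0 and 1, and `left + width` and `top + height` should not exceed 1; values outside that range usually mean the resize factor and the image dimensions do not belong to the same file.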

### Upload images and tag to project

In [32]:
# Upload and tag images
from pathlib import Path
import glob
import ntpath

xmlPath = "./annotations/"
imagePath = "./imagesResized/"

for filepath in glob.iglob(imagePath +'*'):
    fileName = Path(filepath).name
    fileStem = Path(filepath).stem # removes file extension
    xmlsFilePath = xmlPath + fileStem + '.xml'
    
    name, tags = getTagsNorm(xmlsFilePath, filepath)
    regions = []
    
    for t in tags:
        tagObj = tagDict[t]
        
        for box in tags[t]:
            x,y,w,h = box
            regions.append( Region(tag_id=tagObj.id, left=x,top=y,width=w,height=h) )
            
    with open(filepath, mode="rb") as image_contents:
        entry = ImageFileCreateEntry(name=fileName, contents=image_contents.read(), regions=regions)

    upload_result = trainer.create_images_from_files(project.id, images=[entry])
    if not upload_result.is_batch_successful:
        print("Image batch upload failed.")
        for image in upload_result.images:
            print("Image status: ", image.status)
        exit(-1)
    else:
        print("Upload of " + fileName + " successful!")

Create new tag: Crossarm
Create new tag: Pole
Create new tag: Insulator
Upload of 43473-39705_r8oNR+IVSJ+IsZLyGdj1bw==..jpg successful!
Create new tag: -1
Create new tag: Pole Top
Upload of 43489-39586_sDQABWrQ8qV_rm3I+dXXYA==..jpg successful!
Upload of 43592-39684_kPzk8zqAB_C_KTmnUx7z3Q==..jpg successful!
Upload of 43725-39733_yGt4MZnXZqTu5S0YJNdCoQ==..jpg successful!
Upload of 43795-39759_hhAvT6bZSUqiX1Uvnm_htw==..jpg successful!
Upload of 43968-39826_i+19otkVp_NCWi94YYimFw==..jpg successful!
Upload of 44183-39843_xkWlCFjb85re0JdZbTgAOQ==..jpg successful!
Upload of 45577-39954_EUPG_1RCGsN_30l5Qimp0g==..jpg successful!
Upload of 45598-40066_8Y3lqDsgdAaSoUXvDjxwIA==..jpg successful!
Upload of 47956-45186_g515Tcx6AHWlXCskf8gCNA==..jpg successful!
Upload of 48026-45295_1VCnpc2LJhqhznRoPhzktg==..jpg successful!
Upload of 48027-45005_XJ+yidGceiNZ1B0PfaRjCg==..jpg successful!
Upload of 48033-45002_pi56n0Q4+uyKqrwY+yLuqg==..jpg successful!
Create new tag: Switch
Upload of 48058-44984_SC4V7cY
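
The loop above makes one service call per image. For larger datasets it may be worth grouping entries into batches. The sketch below shows only the grouping step; the helper name `chunked` is hypothetical, and the default of 64 reflects the per-batch image limit listed in the Custom Vision documentation at the time of writing, which should be treated as an assumption and checked against the current limits.

```python
def chunked(items, size=64):
    """Yield consecutive slices of items with at most size elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Hypothetical stand-in for a list of ImageFileCreateEntry objects
entries = [f"image_{i}.jpg" for i in range(150)]
print([len(batch) for batch in chunked(entries)])  # [64, 64, 22]
```

Each batch could then be passed as the `images` argument in place of the single-element list used in the cell above, with the same `is_batch_successful` check applied per batch.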

### Train the project

Select the tags you want to train on by passing `selected_tags` to `train_project`:

`train_project(project_id, training_type=None, reserved_budget_in_hours=0, force_train=False, notification_email_address=None, selected_tags=None, custom_headers=None, raw=False, **operation_config)`

In [34]:
import time

SELECTED_TAGS = ['Crossarm', 'Insulator', 'Pole', 'Transformer']

print ("Training...")
iteration = trainer.train_project(project.id, selected_tags=SELECTED_TAGS)
# Poll every 30 seconds until the iteration leaves the "Training" state
while (iteration.status == "Training"):
    time.sleep(30)
    iteration = trainer.get_iteration(project.id, iteration.id)
    print ("Training status: " + iteration.status)

Training...
Training status: Training
Training status: Training
Training status: Training
Training status: Training
Training status: Training
Training status: Training
Training status: Training
Training status: Training
Training status: Training
Training status: Training
Training status: Training
Training status: Training
Training status: Training
Training status: Training
Training status: Completed
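
The polling loop above runs until the status changes, however long that takes. A minimal sketch of the same pattern with an upper bound on the waiting time is shown below. The function `wait_while` and its parameters are hypothetical, and the status source is simulated so the example runs without a Custom Vision project.

```python
import time


def wait_while(get_status, busy_status="Training", interval=30, timeout=3600,
               sleep=time.sleep):
    """Poll get_status() until it stops returning busy_status or timeout elapses."""
    waited = 0
    status = get_status()
    while status == busy_status:
        if waited >= timeout:
            raise TimeoutError(f"Still {busy_status} after {waited} seconds")
        sleep(interval)
        waited += interval
        status = get_status()
    return status


# Simulated status sequence standing in for trainer.get_iteration(...).status
fake = iter(["Training", "Training", "Completed"])
print(wait_while(lambda: next(fake), sleep=lambda s: None))  # Completed
```

In the notebook, `get_status` could be a lambda that calls `trainer.get_iteration(project.id, iteration.id).status`, keeping the 30 second interval used above.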
