# Use Amazon Rekognition

Amazon Rekognition is a fully managed AI service that uses pre-trained models for a variety of computer vision tasks.

Amazon Rekognition makes it easy to add image and video analysis to your applications using proven, highly scalable, deep learning technology that requires no machine learning expertise to use. With Amazon Rekognition, you can identify objects, people, text, scenes, and activities in images and videos, as well as detect any inappropriate content. Amazon Rekognition also provides highly accurate facial analysis and facial search capabilities that you can use to detect, analyze, and compare faces for a wide variety of user verification, people counting, and public safety use cases.

With Amazon Rekognition Custom Labels, you can identify the objects and scenes in images that are specific to your business needs. For example, you can build a model to classify specific machine parts on your assembly line or to detect unhealthy plants. Amazon Rekognition Custom Labels takes care of the heavy lifting of model development for you, so no machine learning experience is required. You simply need to supply images of objects or scenes you want to identify, and the service handles the rest.

## Setting up your account
Once you have signed up for AWS, you need to create a new user account and configure the AWS command line interface (CLI) with that user's credentials:
- Create user: https://docs.aws.amazon.com/IAM/latest/UserGuide/id_users_create.html#id_users_create_console
- Configure CLI: https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-quickstart.html
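Once the CLI is configured, one way to confirm that boto3 picks up the same credentials is to ask AWS STS which identity is making the call. The helper below is a minimal sketch; `describe_caller` is a name chosen here for illustration, and it takes the client as an argument so that it can be tried without network access.

```python
def describe_caller(sts_client):
    """Return a one-line summary of the identity behind the active credentials."""
    identity = sts_client.get_caller_identity()
    return f"account={identity['Account']} arn={identity['Arn']}"


# Typical use in the notebook (needs valid AWS credentials):
#   import boto3
#   print(describe_caller(boto3.client("sts")))
```

If the call fails with a credentials error, the CLI configuration step above has most likely not been completed for the current user.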

In [2]:
! pip3 install boto3

Collecting boto3
  Downloading boto3-1.14.47-py2.py3-none-any.whl (129 kB)
Collecting jmespath<1.0.0,>=0.7.1
  Using cached jmespath-0.10.0-py2.py3-none-any.whl (24 kB)
Collecting s3transfer<0.4.0,>=0.3.0
  Using cached s3transfer-0.3.3-py2.py3-none-any.whl (69 kB)
Collecting botocore<1.18.0,>=1.17.47
  Downloading botocore-1.17.47-py2.py3-none-any.whl (6.5 MB)
Collecting urllib3<1.26,>=1.20; python_version != "3.4"
  Downloading urllib3-1.25.10-py2.py3-none-any.whl (127 kB)
Collecting docutils<0.16,>=0.10
  Using cached docutils-0.15.2-py3-none-any.whl (547 kB)
Installing collected packages: jmespath, urllib3, docutils, botocore, s3transfer, boto3
Successfully installed boto3-1.14.47 botocore-1.17.47 docutils-0.15.2 jmespath-0.10.0 s3transfer-0.3.3 urllib3-1.25.1

In [8]:
import datetime
import json
import os
import pprint

import numpy as np
import pandas as pd

# shorthand for joining local paths, used throughout the notebook
jp = os.path.join

In [3]:
import boto3

### Create a bucket
The following code will create a bucket named `kingmolnar-msa8650`. **You need to use a unique name.**


In [None]:
! aws s3 mb s3://kingmolnar-msa8650

In [14]:
s3 = boto3.client('s3')
s3_bucket = 'kingmolnar-msa8650'
s3_prefix = 'lpr-assignment/tmp'

In [63]:
# define path to local data directory
DATAPATH = "data_redatcted"

## Load list of IR images

In [12]:
ir_images = list(map(
    lambda f: f.strip(),
    os.popen(f"ls {jp(DATAPATH, 'ir_patch')}/*.jpg").readlines()
))
print(f"Number of IR images: {len(ir_images):,}")

Number of IR images: 1,821
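The listing above shells out to `ls`, which only works where that command is available. A portable alternative, sketched here with the standard library, is to glob the directory with `pathlib`. The function name `list_images` is chosen for illustration and is not part of the original notebook.

```python
from pathlib import Path


def list_images(directory, pattern="*.jpg"):
    """Return the files in `directory` matching `pattern` as sorted path strings."""
    return sorted(str(p) for p in Path(directory).glob(pattern))


# Typical use in this notebook:
#   ir_images = list_images(jp(DATAPATH, 'ir_patch'))
```

Sorting makes the order deterministic, so the progress output of a later loop is comparable between runs.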


In [58]:
# Set path to data files, inside DATAPATH should be the ir_patch folder
DATAPATH = "data_redatcted"

# Amazon S3 client
s3 = boto3.client('s3')
s3_bucket = 'kingmolnar-msa8650'
s3_prefix = 'lpr-assignment/tmp'

# Amazon Rekognition client
reko = boto3.client('rekognition')

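# make sure the output directory exists before writing results
os.makedirs(jp(DATAPATH, 'amazon_rekognition'), exist_ok=True)

T_0_loop = datetime.datetime.now()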

cnt = 0
for i, fn in enumerate(ir_images):
    if i%100 == 0:
        print(f"{i:,}\t{fn}")
    
    # create filename for results
    res_fn = jp(
                DATAPATH,
                'amazon_rekognition',
                os.path.basename(fn).replace('ir_patch', 'amazon_rekognition').replace('.jpg', '.json')
    )
    
    if not os.path.exists(res_fn):
        cnt += 1
        T_0 = datetime.datetime.now()
        
        # upload image file to S3; S3 keys always use '/', so build the key
        # with an f-string rather than os.path.join
        s3_key = f"{s3_prefix}/tmp_ir_path.jpg"
        s3.upload_file(fn, s3_bucket, s3_key)

        # call Amazon Rekognition
        response = reko.detect_text(Image={'S3Object': {'Bucket': s3_bucket, 'Name': s3_key}})
        # boto3 normally raises an exception on API errors; this is an extra guard
        if response['ResponseMetadata']['HTTPStatusCode'] != 200:
            print("Rekognition failed:\n")
            pprint.pprint(response)
            break

        response['ProcessingTime'] = str(datetime.datetime.now() - T_0)
        with open(res_fn, 'w') as io:
            json.dump(response, io)

print(f"\n\nDone. Number of images processed: {cnt:,}  Total time: {datetime.datetime.now() - T_0_loop}")

0	data_redatcted/ir_patch/14134_19700101194928245_BHA6172_1_ir_patch.jpg
100	data_redatcted/ir_patch/14264_19700101121843079_CAY9621_1_ir_patch.jpg
200	data_redatcted/ir_patch/14387_19700101135914931_PYW8543_1_ir_patch.jpg
300	data_redatcted/ir_patch/14511_19700101191607559_BMD3363_1_ir_patch.jpg
400	data_redatcted/ir_patch/14633_19700101113340173_PEV8894_1_ir_patch.jpg
500	data_redatcted/ir_patch/14749_19700101130318095_PNT2817_1_ir_patch.jpg
600	data_redatcted/ir_patch/14985_19700101182653117_PNU6081_1_ir_patch.jpg
700	data_redatcted/ir_patch/15109_19700101221730832_PVE3685_1_ir_patch.jpg
800	data_redatcted/ir_patch/15285_19700101115249476_AAA7863_1_ir_patch.jpg
900	data_redatcted/ir_patch/15421_19700101132658396_PAA3532_1_ir_patch.jpg
1,000	data_redatcted/ir_patch/15550_19700101184427670_AEX1024_1_ir_patch.jpg
1,100	data_redatcted/ir_patch/15672_19700101221131205_PJV9626_1_ir_patch.jpg
1,200	data_redatcted/ir_patch/15884_19700101120507517_L08_1_ir_patch.jpg
1,300	data_redatcted/ir_p

Processing just over 1,800 images took less than 30 minutes. Five images had already been tested before the final loop ran. Because their result files existed, the loop skipped them, which is why it reports 1,816 processed images rather than 1,821.
```
Done. Number of images processed: 1,816  Total time: 0:28:11.670673
```
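The loop above uploads every image to S3 before calling Rekognition. `detect_text` also accepts the raw image bytes in the `Image` parameter, which avoids the upload for small files such as these patches. The following is a minimal sketch of that variant, not the approach used for the results in this notebook; `detect_text_from_file` is a hypothetical helper name.

```python
def detect_text_from_file(reko_client, image_path):
    """Send the image bytes straight to Rekognition's DetectText operation."""
    with open(image_path, "rb") as fh:
        image_bytes = fh.read()
    return reko_client.detect_text(Image={"Bytes": image_bytes})


# Typical use (needs AWS credentials):
#   import boto3
#   response = detect_text_from_file(boto3.client("rekognition"), ir_images[0])
```

Passing the client in as an argument keeps the helper easy to test with a stand-in object and lets the notebook reuse a single client for all images.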

# Results
We create a JSON file for each processed image. 

In [55]:
! ls -l data_redatcted/amazon_rekognition/ | head -5

total 40
-rw-r--r--  1 pmolnar  342652723  2719 Aug 23 18:23 15110_19700101222002895_BQG0279_1_amazon_rekognition.json
-rw-r--r--  1 pmolnar  342652723  1315 Aug 23 18:23 15111_19700101222046389_CCC4807_1_amazon_rekognition.json
-rw-r--r--  1 pmolnar  342652723  1281 Aug 23 18:23 15112_19700101222047313_CCC48_1_amazon_rekognition.json
-rw-r--r--  1 pmolnar  342652723  1271 Aug 23 18:23 15113_19700101222047577_CCC480_1_amazon_rekognition.json
-rw-r--r--  1 pmolnar  342652723  2667 Aug 23 18:23 15114_19700101222135823_BPE7451_1_amazon_rekognition.json


In [57]:
pprint.pprint(json.load(open(jp(DATAPATH, 'amazon_rekognition', 
                                '15114_19700101222135823_BPE7451_1_amazon_rekognition.json'))))

{'ProcessingTime': '0:00:00.935562',
 'ResponseMetadata': {'HTTPHeaders': {'connection': 'keep-alive',
                                      'content-length': '2112',
                                      'content-type': 'application/x-amz-json-1.1',
                                      'date': 'Sun, 23 Aug 2020 22:23:16 GMT',
                                      'x-amzn-requestid': '7a5430e7-ee55-465a-92b7-4ff02cf0dba0'},
                      'HTTPStatusCode': 200,
                      'RequestId': '7a5430e7-ee55-465a-92b7-4ff02cf0dba0',
                      'RetryAttempts': 0},
 'TextDetections': [{'Confidence': 99.2896957397461,
                     'DetectedText': 'BPE 7451',
                     'Geometry': {'BoundingBox': {'Height': 0.3292461037635803,
                                                  'Left': 0.311930775642395,
                                                  'Top': 0.32807275652885437,
                                                  'Width': 0.3606607615
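Each saved response holds a list of `TextDetections`, and the evaluation further down relies on the entries whose `Type` is `LINE`. As a minimal sketch, the helper below picks the highest-confidence line from one response. The sample dictionary is a trimmed, partly hypothetical stand-in for a real response (only the first entry mirrors the output above), and `best_line` is a name chosen for illustration.

```python
def best_line(response):
    """Return (text, confidence) of the highest-confidence LINE detection, or None."""
    lines = [d for d in response.get("TextDetections", []) if d.get("Type") == "LINE"]
    if not lines:
        return None
    top = max(lines, key=lambda d: d["Confidence"])
    return top["DetectedText"], top["Confidence"]


# Trimmed stand-in for a saved response; the WORD entries are made up.
sample = {
    "TextDetections": [
        {"DetectedText": "BPE 7451", "Type": "LINE", "Confidence": 99.2896957397461},
        {"DetectedText": "BPE", "Type": "WORD", "Confidence": 99.5},
        {"DetectedText": "7451", "Type": "WORD", "Confidence": 99.1},
    ]
}
print(best_line(sample))
```

Returning `None` when no line was detected makes it possible to count images without any usable reading instead of failing on them.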

# Performance Evaluation

## Build Meta Database
Build a table with the metadata of each plate read from the XML files.

In [79]:
import xml.etree.ElementTree as ET

### File List

In [87]:
def list_files(subdir, ext):
    """Return the sorted file names in DATAPATH/subdir that end with ext."""
    return sorted(f for f in os.listdir(jp(DATAPATH, subdir)) if f.endswith(ext))

xml_list = list_files('xml', '.xml')
context_list = list_files('context', '.jpg')
ir_list = list_files('ir_patch', '.jpg')
rek_list = list_files('amazon_rekognition', '.json')
xml_list[:3], context_list[:3], ir_list[:3], rek_list[:3]

(['14134_19700101194928245_BHA6172_1.xml',
  '14135_19700101195631172_BMU2999_1.xml',
  '14136_19700101195849178_PFF9889_1.xml'],
 ['14134_19700101194928245_BHA6172_1_context.jpg',
  '14135_19700101195631172_BMU2999_1_context.jpg',
  '14136_19700101195849178_PFF9889_1_context.jpg'],
 ['14134_19700101194928245_BHA6172_1_ir_patch.jpg',
  '14135_19700101195631172_BMU2999_1_ir_patch.jpg',
  '14136_19700101195849178_PFF9889_1_ir_patch.jpg'],
 ['14134_19700101194928245_BHA6172_1_amazon_rekognition.json',
  '14135_19700101195631172_BMU2999_1_amazon_rekognition.json',
  '14136_19700101195849178_PFF9889_1_amazon_rekognition.json'])

### Original Meta Data

In [90]:
# parse every XML file into a one-row DataFrame, then concatenate once
rows = []
for xml in xml_list:
    tree = ET.parse(jp(DATAPATH, 'xml', xml))
    # one column per XML tag (the root tag 'plate_read' only holds whitespace)
    el_dict = {x.tag: [x.text] for x in tree.iter()}
    r_df = pd.DataFrame(el_dict)
    r_df['xml_filename'] = xml
    r_df.index = [xml.split('_')[0]]  # leading number of the file name
    rows.append(r_df)
meta_df = pd.concat(rows)

# strip the local data directory from the stored image paths
for c in ['IRImagePatch', 'ContextImage']:
    print(f'fix {c}')
    meta_df[c] = meta_df[c].map(lambda p: p.replace('data_redatcted/', ''))
print(meta_df.shape)
meta_df.head(5)

fix IRImagePatch
fix ContextImage
(1839, 31)


| | plate_read | InstanceID | CameraID | TimeStamp | TimeStampError | LaneID | VehicleDirection | PlateNotRead | VRN | VRNConfidence | ... | PlateWidth | PlateHeight | ANPRImageWidth | ANPRImageHeight | IRImage | IRImagePatch | ContextImage | ContextImagePatch | ContextVideo | xml_filename |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 14134 | `\n\t` | 14134 | 2 | 1970-01-01T15:49:28.245-0400 | 0 | 1 | A | 0 | BHA6172 | 85 | ... | 280 | 80 | 1280 | 1024 | | ir_patch/14134_19700101194928245_BHA6172_1_ir_... | context/14134_19700101194928245_BHA6172_1_cont... | | | 14134_19700101194928245_BHA6172_1.xml |
| 14135 | `\n\t` | 14135 | 2 | 1970-01-01T15:56:31.172-0400 | 0 | 1 | A | 0 | BMU2999 | 71 | ... | 280 | 80 | 1280 | 1024 | | ir_patch/14135_19700101195631172_BMU2999_1_ir_... | context/14135_19700101195631172_BMU2999_1_cont... | | | 14135_19700101195631172_BMU2999_1.xml |
| 14136 | `\n\t` | 14136 | 2 | 1970-01-01T15:58:49.178-0400 | 0 | 1 | A | 0 | PFF9889 | 90 | ... | 280 | 80 | 1280 | 1024 | | ir_patch/14136_19700101195849178_PFF9889_1_ir_... | context/14136_19700101195849178_PFF9889_1_cont... | | | 14136_19700101195849178_PFF9889_1.xml |
| 14137 | `\n\t` | 14137 | 2 | 1970-01-01T16:02:04.933-0400 | 0 | 1 | A | 0 | PTA2105 | 87 | ... | 280 | 80 | 1280 | 1024 | | ir_patch/14137_19700101200204933_PTA2105_1_ir_... | context/14137_19700101200204933_PTA2105_1_cont... | | | 14137_19700101200204933_PTA2105_1.xml |
| 14140 | `\n\t` | 14140 | 2 | 1970-01-01T16:04:42.871-0400 | 0 | 1 | A | 0 | CCD7351 | 92 | ... | 280 | 80 | 1280 | 1024 | | ir_patch/14140_19700101200442871_CCD7351_1_ir_... | context/14140_19700101200442871_CCD7351_1_cont... | | | 14140_19700101200442871_CCD7351_1.xml |
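The `plate_read` column in the table above contains only `\n\t` because it is the root element of each XML file, and its own text is just the whitespace before the first child. The sketch below shows the same flattening on a small record and skips such whitespace-only nodes. The sample XML is hypothetical and heavily trimmed: the tag names are taken from the column list, but the flat nesting is an assumption, and `xml_to_record` is a name chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed record; the real files carry about 30 tags.
SAMPLE_XML = """<plate_read>
    <InstanceID>14134</InstanceID>
    <VRN>BHA6172</VRN>
    <VRNConfidence>85</VRNConfidence>
</plate_read>"""


def xml_to_record(xml_text):
    """Flatten an XML document into {tag: text}, skipping whitespace-only nodes."""
    root = ET.fromstring(xml_text)
    return {
        el.tag: el.text
        for el in root.iter()
        if el.text is not None and el.text.strip()
    }


record = xml_to_record(SAMPLE_XML)
print(record)
```

Note that all values come back as strings, so numeric fields such as `VRNConfidence` need an explicit conversion before they can be compared or averaged.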


### Amazon Rekognition Results
For each image, keep the `LINE` detection with the highest confidence and remove the spaces from its text to obtain a VRN that can be compared with the metadata.

In [118]:
rekognition_df = pd.DataFrame()
for ak in rek_list:
    res = json.load(open(jp(DATAPATH, 'amazon_rekognition', ak)))
    tmpdf = pd.DataFrame(res['TextDetections'])
    tmpdf2 = tmpdf[tmpdf['Type']=='LINE'].sort_values('Confidence', ascending=False).head(1).copy()
    tmpdf2['InstanceID'] = [ak.split('_')[0]]
    tmpdf2.index = [ak.split('_')[0]]
    rekognition_df = pd.concat([rekognition_df, tmpdf2])
rekognition_df['Rekognition_VRN'] = rekognition_df['DetectedText'].map(lambda s: s.replace(' ', ''))
print(rekognition_df.shape)
display(rekognition_df.head(5))

(1821, 8)


| | DetectedText | Type | Id | Confidence | Geometry | ParentId | InstanceID | Rekognition_VRN |
|---|---|---|---|---|---|---|---|---|
| 14134 | BHA 6172 | LINE | 0 | 99.231064 | {'BoundingBox': {'Width': 0.3537510931491852, ... | | 14134 | BHA6172 |
| 14135 | BMU 2999 | LINE | 0 | 99.947304 | {'BoundingBox': {'Width': 0.29746976494789124,... | | 14135 | BMU2999 |
| 14136 | PFF9880 | LINE | 1 | 99.517075 | {'BoundingBox': {'Width': 0.29602688550949097,... | | 14136 | PFF9880 |
| 14137 | PTA2105 | LINE | 0 | 96.10862 | {'BoundingBox': {'Width': 0.33217155933380127,... | | 14137 | PTA2105 |
| 14140 | CCD7351 | LINE | 0 | 99.057617 | {'BoundingBox': {'Width': 0.3717344403266907, ... | | 14140 | CCD7351 |


In [122]:
df = pd.merge(meta_df, rekognition_df, on='InstanceID', suffixes=['', '_2'])
print(df.shape)

(1839, 38)
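The merged table has 1,839 rows although `rekognition_df` has only 1,821, which suggests that some `InstanceID` values occur more than once, since an inner merge can only exceed the size of one input when keys are duplicated. A minimal sketch of how to check this with pandas is shown below on two tiny, hypothetical frames; the data is made up and only the column names follow the notebook.

```python
import pandas as pd

# Hypothetical miniature stand-ins for meta_df and rekognition_df.
meta = pd.DataFrame({"InstanceID": ["1", "2", "3"],
                     "VRN": ["AAA111", "BBB222", "CCC333"]})
reko = pd.DataFrame({"InstanceID": ["1", "2", "2"],
                     "Rekognition_VRN": ["AAA111", "BBB222", "B8B222"]})

# Keys that occur more than once multiply rows in the merge.
dup_mask = reko["InstanceID"].duplicated(keep=False)
dup_keys = sorted(set(reko.loc[dup_mask, "InstanceID"]))

# An outer merge with indicator=True labels every row as
# 'both', 'left_only' or 'right_only' in the '_merge' column.
merged = pd.merge(meta, reko, on="InstanceID", how="outer", indicator=True)
counts = merged["_merge"].value_counts().to_dict()
print(dup_keys, counts)
```

Running the same two checks on the real frames (for both sides of the merge) would show whether rows are duplicated or dropped before the match rate is computed.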


In [128]:
df.columns

Index(['plate_read', 'InstanceID', 'CameraID', 'TimeStamp', 'TimeStampError',
       'LaneID', 'VehicleDirection', 'PlateNotRead', 'VRN', 'VRNConfidence',
       'Tag', 'TagConfidence', 'Classification', 'Country', 'Velocity',
       'VelocityError', 'XYType', 'XCoord', 'YCoord', 'PlateXCoord',
       'PlateYCoord', 'PlateWidth', 'PlateHeight', 'ANPRImageWidth',
       'ANPRImageHeight', 'IRImage', 'IRImagePatch', 'ContextImage',
       'ContextImagePatch', 'ContextVideo', 'xml_filename', 'DetectedText',
       'Type', 'Id', 'Confidence', 'Geometry', 'ParentId', 'Rekognition_VRN'],
      dtype='object')

In [129]:
df[['VRN', 'Rekognition_VRN', 'xml_filename']].head()

| | VRN | Rekognition_VRN | xml_filename |
|---|---|---|---|
| 0 | BHA6172 | BHA6172 | 14134_19700101194928245_BHA6172_1.xml |
| 1 | BMU2999 | BMU2999 | 14135_19700101195631172_BMU2999_1.xml |
| 2 | PFF9889 | PFF9880 | 14136_19700101195849178_PFF9889_1.xml |
| 3 | PTA2105 | PTA2105 | 14137_19700101200204933_PTA2105_1.xml |
| 4 | CCD7351 | CCD7351 | 14140_19700101200442871_CCD7351_1.xml |


In [134]:
print(f"Number of records that match: {np.sum(df.VRN==df.Rekognition_VRN):,}")
print(f"Number of records that do not match: {np.sum(df.VRN!=df.Rekognition_VRN):,}")

Number of records that match: 1,119
Number of records that do not match: 720
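The two counts above translate directly into an exact-match rate. The short calculation below uses only the numbers printed by the previous cell and treats the `VRN` from the XML metadata as the reference, which is an assumption since that value is itself a machine reading with its own `VRNConfidence`.

```python
# Counts copied from the output above.
matches, mismatches = 1_119, 720

total = matches + mismatches
accuracy = matches / total
print(f"Exact-match accuracy: {accuracy:.1%} ({matches:,} of {total:,})")
```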


In [130]:
df[df.VRN!=df.Rekognition_VRN][['VRN', 'Rekognition_VRN', 'xml_filename']].head()

| | VRN | Rekognition_VRN | xml_filename |
|---|---|---|---|
| 2 | PFF9889 | PFF9880 | 14136_19700101195849178_PFF9889_1.xml |
| 5 | CDI4388 | LII4388 | 14141_19700101200554349_CDI4388_1.xml |
| 7 | 728JFW | 728JFM | 14143_19700101201336085_728JFW_1.xml |
| 9 | AGV6998 | AGY6998 | 14145_19700101203157953_AGV6998_1.xml |
| 11 | 1WF412I | WF4121 | 14147_19700101203445593_1WF412I_1.xml |


![](data_redatcted/ir_patch/14136_19700101195849178_PFF9889_1_ir_patch.jpg)
![](data_redatcted/ir_patch/14141_19700101200554349_CDI4388_1_ir_patch.jpg)
![](data_redatcted/ir_patch/14143_19700101201336085_728JFW_1_ir_patch.jpg)
![](data_redatcted/ir_patch/14145_19700101203157953_AGV6998_1_ir_patch.jpg)
![](data_redatcted/ir_patch/14147_19700101203445593_1WF412I_1_ir_patch.jpg)
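Exact matching counts a reading with a single wrong character the same as a completely wrong one. One way to see how close the mismatches are is the Levenshtein (edit) distance between the two strings. This is a sketch of an additional metric that the original evaluation does not use; the pairs are the five mismatches listed above.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions or substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


# (VRN, Rekognition_VRN) pairs taken from the mismatch table above.
pairs = [
    ("PFF9889", "PFF9880"),
    ("CDI4388", "LII4388"),
    ("728JFW", "728JFM"),
    ("AGV6998", "AGY6998"),
    ("1WF412I", "WF4121"),
]
distances = [levenshtein(vrn, rek) for vrn, rek in pairs]
for (vrn, rek), dist in zip(pairs, distances):
    print(f"{vrn}\t{rek}\t{dist}")
```

Applied to the full table, the distribution of distances would separate near misses, such as a single confused character, from readings that picked up unrelated text.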