# Managing users in Amazon Rekognition Face collections

Amazon Rekognition can store information about detected faces in server-side containers known as collections. 
You can store individual faces and associate multiple individual faces with a single user. 

Individual faces are stored as face vectors, a mathematical representation of the face (not the actual image of the face).
Multiple face vectors can then be aggregated to create and store user vectors. 

User vectors are more robust representations, as they contain multiple face vectors with varying degrees of lighting, sharpness, poses, appearance differences, etc. 

Face matching with user vectors can improve accuracy by up to 45% compared to individual face vectors. You can use faces detected in images, stored videos, and streaming videos to search against stored face vectors and/or user vectors for face matching purposes.

### Environment Setup

First, let's import the libraries the notebook needs and create an Amazon Rekognition client with Boto3.

In [None]:
import io
import os
import boto3
import json
from IPython.display import Image as IImage
import pandas as pd

%store -r bucket_name
mySession = boto3.session.Session()
aws_region = mySession.region_name
print("AWS Region: {}".format(aws_region))
print("AWS Bucket: {}".format(bucket_name))


In [None]:
s3_client  = boto3.client('s3')
rek_client = boto3.client('rekognition')

### Create a new collection

Before we begin to create users in Rekognition, we must have an existing face collection. 

In [None]:
collection_id='My_Face_Collection1' # Remember you must use a unique name if you are creating a new collection

In [None]:
def create_collection(collection_id):
    print('Creating collection:' + collection_id)
    response=rek_client.create_collection(CollectionId=collection_id)
    print('Collection ARN: ' + response['CollectionArn'])
    print('Status code: ' + str(response['StatusCode']))
    print('Done...')
    
create_collection(collection_id)
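
Re-running the cell above against a collection that already exists raises an error. As a minimal sketch, assuming the service reports this case with the error code `ResourceAlreadyExistsException`, the hypothetical helper `ensure_collection` below (not part of the original notebook) treats it as a non-fatal condition. It takes the client as an argument, so it can be called with `rek_client`.

```python
def ensure_collection(client, collection_id):
    """Create the collection if needed; return True only if it was newly created.

    `client` is assumed to behave like a boto3 Rekognition client.
    """
    try:
        client.create_collection(CollectionId=collection_id)
        return True
    except Exception as err:
        # botocore's ClientError exposes the service error code under err.response
        code = getattr(err, "response", {}).get("Error", {}).get("Code")
        if code == "ResourceAlreadyExistsException":
            return False
        raise  # anything else is a genuine failure
```

In the notebook this would be called as `ensure_collection(rek_client, collection_id)`.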

### Confirm your collection creation
Let's list the collections in our account to verify that the collection was created.

In [None]:
def list_collections():

    max_results=10

    print('Displaying collections...')
    response=rek_client.list_collections(MaxResults=max_results)
    collection_count=0
    done=False

    while not done:
        collections=response['CollectionIds']

        for collection in collections:
            print ("- "+ collection)
            collection_count+=1
        if 'NextToken' in response:
            nextToken=response['NextToken']
            response=rek_client.list_collections(NextToken=nextToken,MaxResults=max_results)

        else:
            done=True
            
    return collection_count

collection_count=list_collections()

print("Collections: " + str(collection_count))

### Create a new user
Once the collection is created, we can proceed to create a user. We are going to use the **create_user** method, which creates a new user in a collection under a user ID that you provide. The ID must be unique within the collection.

In [None]:
user_id = "Daniel"
user2_id = "John"
def create_user(user_id):
    response = rek_client.create_user(
        CollectionId=collection_id, 
        UserId=user_id,
    )
    print(response)

create_user(user_id)
create_user(user2_id)

### Confirm your user creation
With the **list_users** method we can see the created users in our collection. 

The **UserStatus** field reflects the status of an operation that updates a User representation with a list of given faces. It can be:
- ACTIVE - All associations or disassociations of FaceID(s) for a User are complete.
- CREATED - A User has been created, but has no FaceID(s) associated with it.
- UPDATING - A User is being updated and there are current associations or disassociations of FaceID(s) taking place.

In [None]:
# ListUsers - Lists the users in a collection.
def list_users():
    response = rek_client.list_users(
        CollectionId=collection_id
    )
    print(response["Users"])

list_users()
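
The status values described above can be used to filter the users in a collection. The sketch below is illustrative only: `list_all_users` and `users_with_status` are hypothetical helper names, and the code assumes the response carries a `NextToken` key when more pages are available, as the collection listing does.

```python
def list_all_users(client, collection_id, page_size=50):
    """Collect every user in the collection, following NextToken pagination."""
    users = []
    kwargs = {"CollectionId": collection_id, "MaxResults": page_size}
    while True:
        response = client.list_users(**kwargs)
        users.extend(response.get("Users", []))
        token = response.get("NextToken")
        if not token:
            return users
        kwargs["NextToken"] = token


def users_with_status(users, status):
    """Return the UserIds whose UserStatus equals `status` (e.g. 'CREATED')."""
    return [user["UserId"] for user in users if user.get("UserStatus") == status]
```

For example, `users_with_status(list_all_users(rek_client, collection_id), "CREATED")` would list the users that have no faces associated yet.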


### Add faces to a collection
Now that our user is created, let's populate the face collection with photos that will later be associated with the user. 

In [None]:
# -- read the image map into a pandas dataframe --
obj = s3_client.get_object(Bucket=bucket_name, Key='IDVImageMapping.xlsx')

image_map = pd.read_excel(io.BytesIO(obj['Body'].read()), engine='openpyxl')
image_map.head()

In [None]:
## Index several faces 
dict_of_faces = image_map[["reference_name","reference_image"]].to_dict('records')

for rec in dict_of_faces:
    try:
        response = rek_client.index_faces(
            CollectionId= collection_id,
            Image={
                'S3Object': {
                    'Bucket': bucket_name,
                    'Name': rec["reference_image"],
                }
            },
            ExternalImageId=rec['reference_name'],
            DetectionAttributes=[
                'DEFAULT',
            ],
            MaxFaces=1, # maximum faces detected 
            QualityFilter='AUTO' # apply the quality filter. 
            )
        face_id = response['FaceRecords'][0]['Face']['FaceId']
        print("ImageName: {}, FaceID: {}".format(rec["reference_image"], face_id))
    except Exception as err:
        # face_id may be unset or stale when the call fails, so report the error instead
        print("Failed: ImageName: {}, Error: {}".format(rec["reference_image"], err))
    

print("indexing complete")
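
With `QualityFilter='AUTO'`, a call can succeed yet index no face, in which case `response['FaceRecords'][0]` raises an `IndexError`. The following sketch shows one way to inspect a response before indexing into it. `summarize_index_response` is a hypothetical helper, and it assumes rejected faces are reported under `UnindexedFaces` with a list of `Reasons`.

```python
def summarize_index_response(response):
    """Split an index_faces response into indexed FaceIds and rejection reasons."""
    indexed = [record["Face"]["FaceId"] for record in response.get("FaceRecords", [])]
    reasons = [
        reason
        for face in response.get("UnindexedFaces", [])
        for reason in face.get("Reasons", [])
    ]
    return indexed, reasons
```

Checking that the first list is non-empty before reading a FaceId avoids the failure described above.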
    

### List faces in the collection
Verify that the faces have been correctly indexed into the collection.

In [None]:
def list_collection_faces(collection_id):
    response = rek_client.list_faces(
        CollectionId=collection_id
    )
    faces = []
    for face in response["Faces"]:
        faces.append({"Name":face["ExternalImageId"],"FaceId":str(face["FaceId"])})
        print("Image: {}, FaceId: {}".format(face["ExternalImageId"],face["FaceId"]))
    return faces

faces = list_collection_faces(collection_id)
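
The function above reads a single page of results and assumes every face has an `ExternalImageId`. As a hedged alternative, the hypothetical helper `list_all_faces` below follows `NextToken` pagination and tolerates missing keys. It also records the `UserId` field, which is assumed to be present only once a face has been associated with a user.

```python
def list_all_faces(client, collection_id, page_size=100):
    """Return every face in the collection as a list of small dicts."""
    faces = []
    kwargs = {"CollectionId": collection_id, "MaxResults": page_size}
    while True:
        response = client.list_faces(**kwargs)
        for face in response.get("Faces", []):
            faces.append({
                "Name": face.get("ExternalImageId"),  # None if no external id was set
                "FaceId": face["FaceId"],
                "UserId": face.get("UserId"),         # None until associated with a user
            })
        token = response.get("NextToken")
        if not token:
            return faces
        kwargs["NextToken"] = token
```

The returned list keeps the `Name` and `FaceId` keys, so it can stand in for the `faces` list used later in the notebook.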

### Search face by image
Let's compare searching against a collection with a single photo of a user vs the results when you associate multiple faces to a user vector.

In [None]:
def search_face_by_image(data, collection):
    searchresults = rek_client.search_faces_by_image(CollectionId=collection,
                                                    Image={'Bytes':data},
                                                    FaceMatchThreshold=50)
    return searchresults

In [None]:
# Open for [r]eading as [b]inary; the context manager closes the file afterwards
with open("media/test/test1.jpg", "rb") as file:
    data = file.read()

In [None]:
results = search_face_by_image(data,collection_id)["FaceMatches"][0]
print("The similarity searching against a single low quality image is: {}".format(results["Similarity"]))
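
Indexing `["FaceMatches"][0]` fails when no face in the collection passes the threshold. A small sketch of a more defensive lookup follows. `best_face_match` is a hypothetical name, and it operates only on the response dictionary returned by `search_face_by_image`.

```python
def best_face_match(search_response):
    """Return (ExternalImageId, Similarity) of the strongest match, or None."""
    matches = search_response.get("FaceMatches", [])
    if not matches:
        return None
    top = max(matches, key=lambda match: match["Similarity"])
    return top["Face"].get("ExternalImageId"), top["Similarity"]
```

A `None` result then signals "no match" explicitly instead of raising an `IndexError`.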

### Associate faces in the collections to a user
It's time to associate the faces in our collection to our user. For this task we will use the **associate_faces** method.

This method takes an array of FaceIds. Each FaceId that is present in the list is associated with the provided User. The maximum number of total FaceIds per User is 100.

The **UserMatchThreshold** parameter specifies the minimum User match confidence required for the face to be associated with a User that has at least one FaceId already associated. This ensures that the FaceIds are associated with the right User. The value ranges from 0 to 100, and the default value is 75.

#### Associate a single face to a user
Let's associate a single face from our faces array. 

In [None]:
def get_faceid_by_name(name, data):
    for item in data:
        if 'Name' in item and item['Name'] == name:
            return item.get('FaceId', None)
    return None

faceIds = []
name_to_find = 'Dani'

face_id = get_faceid_by_name(name_to_find, faces)

if face_id is not None:
    faceIds.append(face_id)
    print(f"The FaceId for '{name_to_find}' is: {face_id}")
    print(f"The FaceIds array to attach to the user is: {faceIds}")
else:
    print(f"No FaceId found for '{name_to_find}'.")

In [None]:
def associate_one_face(faceid, collection_id, user_id):
    response = rek_client.associate_faces(
        CollectionId=collection_id,
        UserId=user_id,
        FaceIds=faceid
    )
    print(response)

associate_one_face(faceIds, collection_id, user_id)

#### Associate multiple faces in the collections to a user
In the previous cells you learned how to associate a single faceId to a user. You can associate multiple faceIds by passing in an array of faceIds.

First, let's index more faces of our user into the collection. 

In [None]:
images_to_index = ["dani_0.jpeg", "dani_1.jpeg", "dani_2.jpeg","dani_3.jpeg"]
external_image_id = "Dani"
faceIds = []
for image_name in images_to_index:
    
    response = rek_client.index_faces(
        CollectionId= collection_id,
        Image={
            'S3Object': {
                'Bucket': bucket_name,
                'Name': image_name,
            }
        },
        ExternalImageId=external_image_id,
        DetectionAttributes=[
            'DEFAULT',
        ],
        MaxFaces=1, # maximum faces detected 
        QualityFilter='AUTO' # apply the quality filter. 
        )
    face_id = response['FaceRecords'][0]['Face']['FaceId']
    faceIds.append(face_id)
    print("ImageName: {}, FaceID: {}".format(image_name, face_id))
    

print("indexing complete")
print(f"The FaceIds array to attach to the user is: {faceIds}")

Let's associate the faces from our faceIds array.

In [None]:
def associate_multiple_faces(faces, collection_id, user_id):
    response = rek_client.associate_faces(
        CollectionId=collection_id,
        UserId=user_id,
        FaceIds=faces,
        UserMatchThreshold=75
    )
    print(response)

associate_multiple_faces(faceIds, collection_id, user_id)
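
Because faces below the `UserMatchThreshold` are rejected rather than associated, it helps to check which FaceIds were accepted. The sketch below condenses the printed response. `summarize_association` is a hypothetical helper, and it assumes the response uses the keys `AssociatedFaces`, `UnsuccessfulFaceAssociations`, and `UserStatus`.

```python
def summarize_association(response):
    """Condense an associate_faces response into accepted and rejected FaceIds."""
    associated = [face["FaceId"] for face in response.get("AssociatedFaces", [])]
    failed = {
        face["FaceId"]: face.get("Reasons", [])
        for face in response.get("UnsuccessfulFaceAssociations", [])
    }
    return {
        "associated": associated,
        "failed": failed,
        "user_status": response.get("UserStatus"),
    }
```

To use it, `associate_multiple_faces` would need to return the response instead of only printing it.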

### Search Users by UserId or FaceId
Searches for Users within a collection based on a FaceId or UserId. This API can be used to find the closest User (the one with the highest similarity) to associate a face with. 

The request must be provided with either FaceId or UserId. The operation returns an array of User that matches the FaceId or UserId, ordered by similarity score with the highest similarity first.

In [None]:
def search_users(collection_id,face_id):
    response = rek_client.search_users(
        CollectionId=collection_id,
        FaceId=face_id
        #UserId=user_id
    )
    print(response)
    
search_users(collection_id,faceIds[0])
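
The printed response is verbose. As an illustrative sketch, the hypothetical helper `ranked_user_matches` below reduces it to `(UserId, Similarity)` pairs. It assumes each entry of `UserMatches` holds a `Similarity` value and a `User` dictionary with a `UserId`.

```python
def ranked_user_matches(response, min_similarity=0.0):
    """Return (UserId, Similarity) pairs at or above min_similarity, best first."""
    pairs = [
        (match["User"]["UserId"], match["Similarity"])
        for match in response.get("UserMatches", [])
        if match["Similarity"] >= min_similarity
    ]
    # The service is described as returning matches in order; sorting keeps the helper safe either way
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)
```

To use it, `search_users` would need to return the response instead of only printing it.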

### Search Users by Image
Searches for Users using a supplied image. It first detects the largest face in the image, and then searches a specified collection for matching Users.

The operation returns an array of Users that match the face in the supplied image, ordered by similarity score with the highest similarity first. It also returns a bounding box for the face found in the input image.

In [None]:
image="test/test1.jpg"

def search_users_by_image(collection_id,image):
    # Open for [r]eading as [b]inary; the context manager closes the file afterwards
    with open("media/{}".format(image), "rb") as file:
        data = file.read()
    response = rek_client.search_users_by_image(
        CollectionId=collection_id,
        Image={'Bytes':data}
    )
    return response
    
results_user = search_users_by_image(collection_id,image)
print(results_user["UserMatches"])
print("The similarity searching against a single image was: {}".format(results["Similarity"]))
print("The similarity searching for users with multiple faces associated is: {}".format(results_user["UserMatches"][0]["Similarity"]))
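
The bounding box mentioned above is expressed as ratios of the image dimensions, not pixels. The following sketch converts such a box to pixel coordinates. `bounding_box_to_pixels` is a hypothetical helper, and the assumption that the box sits at `results_user["SearchedFace"]["FaceDetail"]["BoundingBox"]` should be checked against the actual response.

```python
def bounding_box_to_pixels(box, image_width, image_height):
    """Convert a ratio-based BoundingBox dict to (left, top, right, bottom) pixels."""
    left = int(round(box["Left"] * image_width))
    top = int(round(box["Top"] * image_height))
    width = int(round(box["Width"] * image_width))
    height = int(round(box["Height"] * image_height))
    return left, top, left + width, top + height
```

The resulting tuple follows the common `(left, top, right, bottom)` convention used by image libraries for cropping or drawing rectangles.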

### Disassociate faces from a user
Remove the association between a Face supplied in an array of FaceIds and the User. If the User is not present already, then a ResourceNotFound exception is thrown.

In [None]:
def disassociate_faces(face_ids,collection_id,user_id):
    response = rek_client.disassociate_faces(
        CollectionId=collection_id,
        UserId=user_id,
        FaceIds=face_ids
    )
    print(response)
disassociate_faces(faceIds,collection_id,user_id)

### Delete a user
Let's delete the user we created in our collection.

In [None]:
def delete_user(collection_id,user_id):
    response = rek_client.delete_user(
        CollectionId=collection_id,
        UserId=user_id
    )
    print(response)

In [None]:
delete_user(collection_id,user_id)
delete_user(collection_id,user2_id)

In [None]:
list_users()
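
If the collection holds more users than the two created here, deleting them one by one becomes tedious. As a sketch only, the hypothetical helper `delete_all_users` below lists every user and deletes each in turn. It is destructive, so treat it as an assumption-laden convenience rather than part of the original walkthrough.

```python
def delete_all_users(client, collection_id):
    """Delete every user in the collection and return the deleted UserIds."""
    # Gather the ids first so we are not paging through a listing while modifying it
    user_ids = []
    kwargs = {"CollectionId": collection_id}
    while True:
        response = client.list_users(**kwargs)
        user_ids.extend(user["UserId"] for user in response.get("Users", []))
        token = response.get("NextToken")
        if not token:
            break
        kwargs["NextToken"] = token

    for user_id in user_ids:
        client.delete_user(CollectionId=collection_id, UserId=user_id)
    return user_ids
```

In the notebook this would be called as `delete_all_users(rek_client, collection_id)`.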

### Delete collection
Let's delete the collection we created in our account.

In [None]:
from botocore.exceptions import ClientError  # needed by the except clause below

def delete_collection(collection_id):

    print('Attempting to delete collection ' + collection_id)
    status_code=0
    try:
        response=rek_client.delete_collection(CollectionId=collection_id)
        status_code=response['StatusCode']
        
    except ClientError as e:
        if e.response['Error']['Code'] == 'ResourceNotFoundException':
            print ('The collection ' + collection_id + ' was not found ')
        else:
            print ('Error other than Not Found occurred: ' + e.response['Error']['Message'])
        status_code=e.response['ResponseMetadata']['HTTPStatusCode']
    print('Status code: ' + str(status_code))


delete_collection(collection_id)

In [None]:
collection_count=list_collections()
print("Collections: " + str(collection_count))