# Detect and blur faces in video using Amazon Rekognition Video

***
This notebook provides a walkthrough of the [face detection API](https://docs.aws.amazon.com/rekognition/latest/APIReference/API_StartFaceDetection.html) in Amazon Rekognition Video, which we use to identify faces in a stored video and then blur them.
***

## Initialization

Installing required libraries:

- **opencv-python**: pre-built CPU-only OpenCV packages for Python. OpenCV is a library of programming functions mainly aimed at real-time computer vision.
- **ffmpeg-python**: Python bindings for FFmpeg. FFmpeg is a complete, cross-platform solution to record, convert, and stream audio and video.
- **moviepy**: a Python library for video editing: cutting, concatenations, title insertions, video compositing (a.k.a. non-linear editing), video processing, and creation of custom effects.

In [None]:
!pip install opencv-python boto3 botocore ffmpeg-python moviepy --upgrade -q

In [None]:
import boto3
import botocore

In [None]:
client = boto3.client("sts")
account = client.get_caller_identity()["Account"]
print(f"AWS account:{account}")

session = boto3.session.Session()
region = session.region_name
print(f"AWS region:{region}")

In [None]:
# S3 bucket that will be the location from where we will read images/videos

bucket = 'dml102-20221003-sagemaker-bucket-{region}-{account}'.format(region = region, account = account)

In [None]:
# copy video from local EBS storage to S3 bucket

!aws s3 cp videos/people-walking.mp4 s3://$bucket/rekognition-video-demo/people-walking.mp4

## Detect faces in video
***

Explore the code in `blur_faces/`.

In [None]:
from blur_faces.rekognition import get_timestamps_and_faces
from blur_faces.rekognition import boto3_client, start_face_detection
from blur_faces.rekognition import wait_for_completion
from blur_faces.video_processor import apply_faces_to_video, integrate_audio

Initializing local paths to video and local output directory.

In [None]:
video = 'rekognition-video-demo/people-walking.mp4'
local_path_to_video = 'videos/people-walking.mp4'
local_output = 'videos/output.mp4'

1. Calling the [start face detection API](https://docs.aws.amazon.com/rekognition/latest/APIReference/API_StartFaceDetection.html)  
This API starts asynchronous detection of faces in a stored video.

Amazon Rekognition Video can detect faces in a video stored in an Amazon S3 bucket. When calling the API you need to specify the bucket name and the filename of the video. StartFaceDetection returns a job identifier (JobId) that you use to get the results of the operation.
```
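import boto3

def start_face_detection(bucket, video, size, reko_client=None):
    # check_format_and_size is a helper from the blur_faces code (not shown here)
    assert check_format_and_size(video, size)
    if reko_client is None:
        reko_client = boto3.client('rekognition')
    # Start the asynchronous job; the results are fetched later using the returned JobId
    response = reko_client.start_face_detection(Video={'S3Object': {'Bucket': bucket, 'Name': video}})
    return response['JobId']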
```
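The `check_format_and_size` helper is not shown in this notebook. A minimal sketch of what such a check might look like follows, assuming that `size` is expressed in gigabytes and that the file extension is a good enough proxy for the container format. Both are assumptions; the real helper in `blur_faces/` may differ. The extension set reflects the MPEG-4 and MOV formats named later in this notebook, and the 10 GB limit is the documented maximum for stored video at the time of writing. The check cannot confirm that the video uses the H.264 codec.

```python
import os

# Assumed mapping of the supported MPEG-4 and MOV containers to file extensions.
SUPPORTED_EXTENSIONS = {".mp4", ".mov"}
# Assumed unit: gigabytes. Stored videos are limited to 10 GB at the time of writing.
MAX_SIZE_GB = 10


def check_format_and_size(video, size_gb):
    """Return True if the file name has a supported extension and the size is within limits."""
    extension = os.path.splitext(video)[1].lower()
    return extension in SUPPORTED_EXTENSIONS and 0 < size_gb <= MAX_SIZE_GB
```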
2. Checking the progress of the job by calling the [get_face_detection(JobId=job_id) API](https://docs.aws.amazon.com/rekognition/latest/APIReference/API_GetFaceDetection.html) every 30 seconds  
This API gets the face detection results for an Amazon Rekognition Video analysis started by StartFaceDetection.

Face detection with Amazon Rekognition Video is an asynchronous operation. You start face detection by calling StartFaceDetection, which returns a job identifier (JobId).
In this step, we call the get_face_detection API with the `JobId` and check the `JobStatus` field of the response. This field can have one of the following values: `IN_PROGRESS | SUCCEEDED | FAILED`.

When the face detection operation finishes successfully, the JobStatus changes from `IN_PROGRESS` to `SUCCEEDED`. If the job fails, it changes to `FAILED` instead.

```
import time

import boto3

def wait_for_completion(job_id, wait_time_in_s=30, reko_client=None):
    if reko_client is None:
        reko_client = boto3.client('rekognition')
    response = reko_client.get_face_detection(JobId=job_id)
    # Poll until the job leaves the IN_PROGRESS state
    while response['JobStatus'] == 'IN_PROGRESS':
        print('.', end='')
        time.sleep(wait_time_in_s)
        response = reko_client.get_face_detection(JobId=job_id)
    # The job can also end as FAILED, so report the actual final status
    print(response['JobStatus'])
    return response
```
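The loop above waits indefinitely if a job never leaves `IN_PROGRESS`. As a hedged variation, the sketch below adds an upper bound on the total wait. The function name `poll_until_done`, the `timeout_in_s` parameter, and the injectable `sleep` argument are illustrative assumptions and are not part of `blur_faces/`. With a real client, `get_status` could be `lambda: reko.get_face_detection(JobId=job_id)['JobStatus']`.

```python
import time


def poll_until_done(get_status, wait_time_in_s=30, timeout_in_s=1800, sleep=time.sleep):
    """Call get_status() until it stops returning IN_PROGRESS or the timeout is reached.

    get_status is any zero-argument callable that returns the current JobStatus string.
    """
    waited = 0
    while True:
        status = get_status()
        if status != 'IN_PROGRESS':
            return status  # SUCCEEDED or FAILED
        if waited >= timeout_in_s:
            raise TimeoutError(f"Job still in progress after {waited} seconds")
        sleep(wait_time_in_s)
        waited += wait_time_in_s
```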
3. Get timestamps and faces  
When the JobStatus is `SUCCEEDED`, GetFaceDetection returns an array of detected faces (Faces) sorted by the time the faces were detected.

Use the MaxResults parameter to limit the number of faces returned per call. If there are more results than specified in MaxResults, the value of NextToken in the operation response contains a pagination token for getting the next set of results. To get the next page of results, call GetFaceDetection and populate the NextToken request parameter with the token value returned from the previous call to GetFaceDetection.

In the `get_timestamps_and_faces` function, we iterate over the results and keep the bounding boxes of all the faces and the timestamps they appear at.

```
def get_timestamps_and_faces(response, job_id, reko_client=None):
    # `response` is kept in the signature for compatibility; every page is fetched again below
    if reko_client is None:
        reko_client = boto3.client('rekognition')
    final_timestamps = {}
    next_token = ""
    while True:
        print('.', end='')
        # Query Reko Video, passing NextToken only when a further page exists
        kwargs = {'JobId': job_id}
        if next_token:
            kwargs['NextToken'] = next_token
        response = reko_client.get_face_detection(**kwargs)
        # Group the bounding boxes of every face by the timestamp they appear at
        for face in response['Faces']:
            f = face["Face"]["BoundingBox"]
            t = str(face["Timestamp"])
            final_timestamps.setdefault(t, []).append(f)
        # Stop when there is no further portion of the response
        next_token = response.get('NextToken', "")
        if not next_token:
            break
    # Return the final dictionary
    print('Complete')
    return final_timestamps
```
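The function above relies on the default page size. To illustrate the `MaxResults` and `NextToken` parameters described earlier, here is a minimal sketch of a generator that requests smaller pages. The name `iter_faces` and its `max_results` default are assumptions made for this example and are not part of `blur_faces/`.

```python
def iter_faces(reko_client, job_id, max_results=100):
    """Yield every face record of a finished job, fetching one page at a time."""
    kwargs = {"JobId": job_id, "MaxResults": max_results}
    while True:
        page = reko_client.get_face_detection(**kwargs)
        yield from page.get("Faces", [])
        token = page.get("NextToken")
        if not token:
            break  # no more pages
        kwargs["NextToken"] = token  # ask for the next page on the following call
```

Because it is a generator, callers can process faces as they arrive, for example `for face in iter_faces(reko, job_id): ...`, without holding every page in memory.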


4. Finally, we apply blurring to the faces detected in the video using OpenCV  
Explore the code in `blur_faces/video_processor.py`
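
The bounding boxes returned by Amazon Rekognition are expressed as ratios of the frame width and height, so they have to be converted to pixel coordinates before a region can be blurred. The sketch below shows one way to do this; it is not the code in `blur_faces/video_processor.py`. The function names and the kernel size are assumptions made for this example. Values are clamped because a box can extend past the frame edges when a face is only partly visible.

```python
def to_pixel_box(bounding_box, frame_width, frame_height):
    """Convert a ratio-based BoundingBox into clamped (left, top, right, bottom) pixels."""
    left = int(bounding_box["Left"] * frame_width)
    top = int(bounding_box["Top"] * frame_height)
    right = int((bounding_box["Left"] + bounding_box["Width"]) * frame_width)
    bottom = int((bounding_box["Top"] + bounding_box["Height"]) * frame_height)
    # Keep the box inside the frame
    left, right = max(0, left), min(frame_width, right)
    top, bottom = max(0, top), min(frame_height, bottom)
    return left, top, right, bottom


def blur_face(frame, bounding_box, kernel=(51, 51)):
    """Blur one face region of a frame (a NumPy image array) in place and return it."""
    import cv2  # imported here so the conversion helper works without OpenCV installed

    height, width = frame.shape[:2]
    left, top, right, bottom = to_pixel_box(bounding_box, width, height)
    if right > left and bottom > top:
        region = frame[top:bottom, left:right]
        # Kernel dimensions must be odd; a larger kernel gives a stronger blur
        frame[top:bottom, left:right] = cv2.GaussianBlur(region, kernel, 0)
    return frame
```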


In [None]:
reko = boto3_client()
job_id = start_face_detection(bucket, video, 1, reko)
response = wait_for_completion(job_id, reko_client=reko)
timestamps=get_timestamps_and_faces(response, job_id, reko)
apply_faces_to_video(timestamps, local_path_to_video, local_output, response["VideoMetadata"])
# integrate_audio('videos/video-test.mp4', 'videos/output.mp4')

Navigate to the `videos/` directory and locate the newly created `output.mp4` file. Download the file by right-clicking it and choosing Download.  
Use a multimedia player on your local machine, such as VLC, to play the video and notice that the faces are now blurred.

---

## Serverless implementation

This task can also be completed with a fully serverless architecture, so we don’t need to provision, scale, or maintain any infrastructure. It can also be automated to run every time a video is uploaded to an S3 bucket.

This solution uses AWS Step Functions, a low-code visual workflow service for orchestrating AWS services, automating business processes, and building serverless applications, to orchestrate the calls and manage the flow of data between AWS Lambda functions. When an object is created in an Amazon Simple Storage Service (Amazon S3) bucket, for example by a video file upload, an ObjectCreated event is emitted and triggers a first Lambda function. This Lambda function makes an asynchronous call to the Amazon Rekognition Video face detection API and starts the execution of the AWS Step Functions workflow.

Inside the workflow, a Lambda function and a Wait state poll until the Amazon Rekognition Video asynchronous analysis started earlier finishes. Afterward, another Lambda function retrieves the results of the completed job from Amazon Rekognition and passes them to a further Lambda function that uses OpenCV to blur the detected faces. To make OpenCV easy to use from Lambda, we built a Docker image, hosted it on Amazon Elastic Container Registry (Amazon ECR), and deployed it on AWS Lambda using Container Image Support.

![Serverless architecture](img/blur-faces-arch.png)

1. The event flow starts when the video is ingested into Amazon S3. Amazon Rekognition Video supports MPEG-4 and MOV file formats, encoded using the H.264 codec.
2. After the video file has been stored in Amazon S3, it automatically kicks off an event that triggers a Lambda function.
3. The Lambda function uses the video’s attributes (name and location on Amazon S3) to start the face detection job on Amazon Rekognition through an API call.
4. The same Lambda function then starts the Step Functions state machine, forwarding the video’s attributes and the Amazon Rekognition job ID.
5. The Step Functions workflow starts with a Lambda function that waits for the Amazon Rekognition job to finish. Once it’s done, another Lambda function gets the results from Amazon Rekognition.
6. Finally, a Lambda function with Container Image Support fetches its Docker image, which includes OpenCV, from Amazon ECR, blurs the faces detected by Amazon Rekognition, and temporarily stores the output video locally.
7. Then, the blurred video is put into the output S3 bucket and removed from local storage.
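
Steps 2 to 4 can be sketched as a single Lambda handler. This is an illustrative outline under stated assumptions, not the deployed function: the `STATE_MACHINE_ARN` environment variable, the payload keys, and the optional client arguments (added so the handler can be exercised without AWS access) are all hypothetical. Object keys in S3 event notifications are URL-encoded, which is why the key is decoded first.

```python
import json
import os
from urllib.parse import unquote_plus


def extract_s3_object(event):
    """Return (bucket, key) from the first record of an S3 ObjectCreated event."""
    record = event["Records"][0]["s3"]
    return record["bucket"]["name"], unquote_plus(record["object"]["key"])


def lambda_handler(event, context, reko_client=None, sfn_client=None):
    if reko_client is None or sfn_client is None:
        import boto3  # imported lazily; the Lambda Python runtime provides it

        reko_client = reko_client or boto3.client("rekognition")
        sfn_client = sfn_client or boto3.client("stepfunctions")

    bucket, key = extract_s3_object(event)
    # Step 3: start the asynchronous face detection job
    job = reko_client.start_face_detection(
        Video={"S3Object": {"Bucket": bucket, "Name": key}}
    )
    # Step 4: start the workflow, forwarding the video attributes and the job ID
    payload = {"bucket": bucket, "key": key, "job_id": job["JobId"]}
    sfn_client.start_execution(
        stateMachineArn=os.environ["STATE_MACHINE_ARN"],  # hypothetical variable name
        input=json.dumps(payload),
    )
    return payload
```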

Now, let's take a closer look at the AWS Step Functions workflow:  
![step_functions_workflow](img/step_functions.png)
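
For readers who cannot view the diagram, the polling pattern described earlier (a Lambda function plus a Wait state, followed by the result and blurring functions) could be expressed in Amazon States Language roughly as follows. This is a sketch and not the definition used by the solution: every state name, the ARN placeholders, the `$.status` field, and the 30-second wait are assumptions. The definition is built as a Python dictionary so that it can be checked and serialized to JSON.

```python
import json

# Hypothetical placeholder; replace REGION and ACCOUNT_ID with real values.
LAMBDA_ARN = "arn:aws:lambda:REGION:ACCOUNT_ID:function:{}"

state_machine = {
    "Comment": "Sketch: wait for Rekognition, fetch the results, blur the faces",
    "StartAt": "CheckJobStatus",
    "States": {
        "CheckJobStatus": {
            "Type": "Task",
            "Resource": LAMBDA_ARN.format("check-job-status"),
            "Next": "IsJobDone",
        },
        "IsJobDone": {
            "Type": "Choice",
            "Choices": [
                {"Variable": "$.status", "StringEquals": "SUCCEEDED", "Next": "GetResults"},
                {"Variable": "$.status", "StringEquals": "FAILED", "Next": "JobFailed"},
            ],
            "Default": "WaitBeforeRetry",
        },
        "WaitBeforeRetry": {"Type": "Wait", "Seconds": 30, "Next": "CheckJobStatus"},
        "GetResults": {
            "Type": "Task",
            "Resource": LAMBDA_ARN.format("get-results"),
            "Next": "BlurFaces",
        },
        "BlurFaces": {
            "Type": "Task",
            "Resource": LAMBDA_ARN.format("blur-faces"),
            "End": True,
        },
        "JobFailed": {
            "Type": "Fail",
            "Error": "RekognitionJobFailed",
            "Cause": "The face detection job ended with status FAILED",
        },
    },
}


def transition_targets(definition):
    """Collect every state name that the definition can transition to."""
    targets = {definition["StartAt"]}
    for state in definition["States"].values():
        targets.update(state[k] for k in ("Next", "Default") if k in state)
        targets.update(choice["Next"] for choice in state.get("Choices", []))
    return targets


definition_json = json.dumps(state_machine, indent=2)
```

The `transition_targets` helper is only a sanity check that every transition points at a state that exists. The JSON string in `definition_json` is what would be supplied when creating a state machine.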