In [516]:
import boto3
import numpy as np
import pandas as pd

## Christian Bale dataframe

Let's begin by creating the dataframe for the actor Christian Bale, using the photos stored in the S3 bucket that holds all of his images.

In [517]:
client = boto3.client('rekognition')
def extract_similarity_christian(photo):
    try:
        comparison = client.compare_faces(
            SourceImage= {'S3Object':{'Bucket':'christianfacematch', 'Name':'Christian-Bale-Base.jpeg'}},
            TargetImage = {'S3Object':{'Bucket':'christianfacematch','Name':photo}})
        similarity = comparison['FaceMatches'][0]['Similarity']
    except Exception:
        similarity = 0
    return similarity

The function above uses Rekognition's `compare_faces` to compare a given photo with the base photo and returns the similarity score of the first face match. If Rekognition returns no match, or the call fails for any reason, the similarity score is recorded as 0.
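A score of 0 in this notebook can therefore mean two different things: the call raised an exception, or the response contained no entries in `FaceMatches`. To our understanding of the API, `compare_faces` only returns matches at or above its `SimilarityThreshold` (80 by default), so a 0 here may stand for "below the threshold" rather than "no resemblance". The sketch below isolates the response handling so it can be tried without calling AWS; `top_similarity` is a hypothetical helper, and the dictionaries are hand-written stand-ins for real responses.

```python
def top_similarity(response):
    """Return the highest similarity in a compare_faces-style response, or 0.0."""
    matches = response.get("FaceMatches", [])
    if not matches:
        # No face in the target image matched the source face.
        return 0.0
    return max(match["Similarity"] for match in matches)


# Hand-written dictionaries standing in for Rekognition responses.
matched = {"FaceMatches": [{"Similarity": 97.4}, {"Similarity": 81.0}]}
unmatched = {"FaceMatches": [], "UnmatchedFaces": [{"Confidence": 99.9}]}

print(top_similarity(matched))    # 97.4
print(top_similarity(unmatched))  # 0.0
```

Handling the empty list explicitly, instead of relying on an `IndexError` inside a broad `except`, keeps genuine failures such as a missing object or a permissions error visible.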

In [518]:
def extract_confidence_christian(photo):
    try:
        response = client.recognize_celebrities(
            Image={'S3Object': {'Bucket': 'christianfacematch', 'Name': photo}})
        confidence = response['CelebrityFaces'][0]['MatchConfidence']
    except Exception:
        confidence = 0
    return confidence

The function above returns the match confidence reported by Rekognition's `recognize_celebrities`. If a celebrity face is detected, it returns that score; otherwise the confidence score is 0.

In [519]:
def extract_name_christian(photo):
    try:
        response = client.recognize_celebrities(
            Image={'S3Object': {'Bucket': 'christianfacematch', 'Name': photo}})
        name = response['CelebrityFaces'][0]['Name']
    except Exception:
        name = "Unrecognizable"
    return name

The function above returns the name of the celebrity identified by `recognize_celebrities`. If the face is not recognized, it returns the string "Unrecognizable".
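The name and the confidence come from the same `recognize_celebrities` response, so both can be read from a single call per photo. The sketch below shows only the parsing step on hand-written dictionaries; `parse_celebrity` is a hypothetical helper, not part of the notebook.

```python
def parse_celebrity(response):
    """Return (name, confidence) for the first recognized celebrity in a response."""
    faces = response.get("CelebrityFaces", [])
    if not faces:
        # Same fallback values the notebook uses for unrecognized faces.
        return "Unrecognizable", 0.0
    first = faces[0]
    return first["Name"], first["MatchConfidence"]


# Hand-written dictionaries standing in for Rekognition responses.
recognized = {"CelebrityFaces": [{"Name": "Christian Bale", "MatchConfidence": 99.5}]}
not_recognized = {"CelebrityFaces": [], "UnrecognizedFaces": [{}]}

print(parse_celebrity(recognized))      # ('Christian Bale', 99.5)
print(parse_celebrity(not_recognized))  # ('Unrecognizable', 0.0)
```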

Then, we make a list of all of Christian Bale's photos:

In [520]:
s3_resource = boto3.resource('s3')
my_bucket = s3_resource.Bucket('christianfacematch')
summaries = my_bucket.objects.all()
image_christian = [image.key for image in summaries]
image_christian

['Christian-Bale-Base.jpeg',
 'Christian-Bale-Vice1.jpg',
 'Christian-Bale-Vice2.jpeg',
 'Christian-Bale-Vice3.jpeg',
 'Christian-Bale-machinist3.jpeg',
 'christian-bale-american1.jpeg',
 'christian-bale-american2.jpg',
 'christian-bale-american3.jpeg',
 'christian-bale-machinist1.png',
 'christian-bale-machinist2.jpeg']

We create the dataframe for Christian Bale:

In [521]:
df_christian = pd.DataFrame({'Name':image_christian})

After that, we add three columns to Christian Bale's dataframe: the similarity score, the name of the recognized celebrity, and the celebrity confidence score.

In [None]:
df_christian['Similarity'] = [extract_similarity_christian(photo) for photo in df_christian['Name']]

In [None]:
df_christian['Celebrity_Recognized'] = [extract_name_christian(photo) for photo in df_christian['Name']]

In [None]:
df_christian['Celebrity_Confidence'] = [extract_confidence_christian(photo) for photo in df_christian['Name']]

The column recording whether the actor is wearing makeup is entered manually:

In [None]:
Makeup_Christian = ['No', 'No', 'No', 'No', 'No', 'No','No','No', 'No', 'No']

We also enter the actor's weight change for each role manually, as an absolute value in pounds:

In [None]:
Weight_Christian = [0, 40, 40, 40, 65, 43, 43, 43, 65, 65]

We add these lists as new columns:

In [None]:
df_christian['Makeup'] = Makeup_Christian

In [None]:
df_christian['Weight_Change_in_lbs'] = Weight_Christian

In [None]:
df_christian
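The manual weight list has to line up position by position with the order in which S3 returns the keys, which is easy to get wrong. One alternative, sketched below, is to derive the value from a film keyword in each file name. The mapping `WEIGHT_BY_FILM` and the helper `weight_change` are assumptions for illustration; the numbers are copied from the manual list above.

```python
# Weight change per film in lbs, copied from the manual list in this section.
WEIGHT_BY_FILM = {"base": 0, "vice": 40, "machinist": 65, "american": 43}


def weight_change(key):
    """Look up the weight change for an S3 key by the film keyword it contains."""
    lowered = key.lower()
    for film, pounds in WEIGHT_BY_FILM.items():
        if film in lowered:
            return pounds
    raise ValueError(f"no film keyword found in {key!r}")


keys = [
    "Christian-Bale-Base.jpeg",
    "Christian-Bale-Vice1.jpg",
    "Christian-Bale-Vice2.jpeg",
    "Christian-Bale-Vice3.jpeg",
    "Christian-Bale-machinist3.jpeg",
    "christian-bale-american1.jpeg",
    "christian-bale-american2.jpg",
    "christian-bale-american3.jpeg",
    "christian-bale-machinist1.png",
    "christian-bale-machinist2.jpeg",
]
weights = [weight_change(key) for key in keys]
print(weights)  # [0, 40, 40, 40, 65, 43, 43, 43, 65, 65]
```

The result matches `Weight_Christian`, and it stays correct if photos are added or the listing order changes.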

## Jared Leto dataframe

We create the dataframe for Jared Leto using the same methods, with the photos from the S3 bucket that holds all of his images.
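Because the steps are identical for every actor, they could also be written once with the bucket and base photo as parameters. The following is a minimal sketch under that assumption: `similarity`, `celebrity`, and `build_actor_frame` are hypothetical names, and `_StubRekognition` is an offline stand-in so the example runs without AWS credentials.

```python
import pandas as pd


def similarity(client, bucket, base_key, photo):
    """Similarity between the base photo and `photo`, or 0 if no match is returned."""
    try:
        comparison = client.compare_faces(
            SourceImage={"S3Object": {"Bucket": bucket, "Name": base_key}},
            TargetImage={"S3Object": {"Bucket": bucket, "Name": photo}},
        )
        return comparison["FaceMatches"][0]["Similarity"]
    except Exception:
        return 0


def celebrity(client, bucket, photo):
    """(name, confidence) of the first recognized celebrity, with the notebook's fallbacks."""
    try:
        response = client.recognize_celebrities(
            Image={"S3Object": {"Bucket": bucket, "Name": photo}}
        )
        face = response["CelebrityFaces"][0]
        return face["Name"], face["MatchConfidence"]
    except Exception:
        return "Unrecognizable", 0


def build_actor_frame(client, bucket, base_key, photos):
    """Build one row per photo with the same columns used in this notebook."""
    rows = []
    for photo in photos:
        name, confidence = celebrity(client, bucket, photo)
        rows.append({
            "Name": photo,
            "Similarity": similarity(client, bucket, base_key, photo),
            "Celebrity_Recognized": name,
            "Celebrity_Confidence": confidence,
        })
    return pd.DataFrame(rows)


class _StubRekognition:
    """Offline stand-in for the Rekognition client, used only for this demo."""

    def compare_faces(self, SourceImage, TargetImage):
        return {"FaceMatches": [{"Similarity": 99.5}]}

    def recognize_celebrities(self, Image):
        return {"CelebrityFaces": []}


demo = build_actor_frame(
    _StubRekognition(), "example-bucket", "base.jpeg", ["base.jpeg", "role1.jpeg"]
)
print(demo)
```

With a real client, the call would look like `build_actor_frame(boto3.client('rekognition'), 'jaredfacematch', 'Jared-Leto-Base.jpeg', image_jared)`.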

In [16]:
s3_resource = boto3.resource('s3')
my_bucket = s3_resource.Bucket('jaredfacematch')
summaries = my_bucket.objects.all()
image_jared = [image.key for image in summaries]
image_jared

['Jared-Leto-Base.jpeg',
 'Jared-Leto-Chapter1.jpeg',
 'Jared-Leto-Chapter2.jpeg',
 'Jared-Leto-Chapter3.jpeg',
 'Jared-Leto-Dallas1.jpeg',
 'Jared-Leto-Dallas2.jpeg',
 'Jared-Leto-Dallas3.jpeg',
 'Jared-Leto-joker1.jpeg',
 'Jared-Leto-joker2.png',
 'Jared-Leto-joker3.jpeg']

In [17]:
def extract_similarity_jared(photo):
    try:
        comparison = client.compare_faces(
            SourceImage= {'S3Object':{'Bucket':'jaredfacematch', 'Name':'Jared-Leto-Base.jpeg'}},
            TargetImage = {'S3Object':{'Bucket':'jaredfacematch','Name':photo}})
        similarity = comparison['FaceMatches'][0]['Similarity']
    except Exception:
        similarity = 0
    return similarity

In [18]:
def extract_confidence_jared(photo):
    try:
        response = client.recognize_celebrities(
            Image={'S3Object': {'Bucket': 'jaredfacematch', 'Name': photo}})
        confidence = response['CelebrityFaces'][0]['MatchConfidence']
    except Exception:
        confidence = 0
    return confidence

In [19]:
def extract_name_jared(photo):
    try:
        response = client.recognize_celebrities(
            Image={'S3Object': {'Bucket': 'jaredfacematch', 'Name': photo}})
        name = response['CelebrityFaces'][0]['Name']
    except Exception:
        name = "Unrecognizable"
    return name

In [20]:
df_jared = pd.DataFrame({'Name':image_jared})

In [21]:
df_jared['Similarity'] = [extract_similarity_jared(photo) for photo in df_jared['Name']]

In [22]:
df_jared['Celebrity_Recognized'] = [extract_name_jared(photo) for photo in df_jared['Name']]

In [23]:
df_jared['Celebrity_Confidence'] = [extract_confidence_jared(photo) for photo in df_jared['Name']]

In [24]:
Makeup_Jared = ['No', 'No', 'No', 'No', 'Yes', 'Yes','Yes','Yes', 'Yes', 'Yes']

In [25]:
Weight_Jared = [0, 70, 70, 70, 30, 30, 30, 0, 0, 0]

In [26]:
df_jared['Makeup'] = Makeup_Jared

In [27]:
df_jared['Weight_Change_in_lbs'] = Weight_Jared

In [28]:
df_jared

,Name,Similarity,Celebrity_Recognized,Celebrity_Confidence,Makeup,Weight_Change_in_lbs
0,Jared-Leto-Base.jpeg,100.0,Jared Leto,99.244003,No,0
1,Jared-Leto-Chapter1.jpeg,0.0,Jared Leto,92.98777,No,70
2,Jared-Leto-Chapter2.jpeg,0.0,Jared Leto,96.524673,No,70
3,Jared-Leto-Chapter3.jpeg,90.493851,Jared Leto,91.769417,No,70
4,Jared-Leto-Dallas1.jpeg,0.0,Unrecognizable,0.0,Yes,30
5,Jared-Leto-Dallas2.jpeg,0.0,Unrecognizable,0.0,Yes,30
6,Jared-Leto-Dallas3.jpeg,0.0,Unrecognizable,0.0,Yes,30
7,Jared-Leto-joker1.jpeg,0.0,Unrecognizable,0.0,Yes,0
8,Jared-Leto-joker2.png,0.0,Unrecognizable,0.0,Yes,0
9,Jared-Leto-joker3.jpeg,0.0,Unrecognizable,0.0,Yes,0


## Charlize Theron dataframe

We create the dataframe for Charlize Theron using the same methods, with the photos from the S3 bucket that holds all of her images.

In [29]:
s3_resource = boto3.resource('s3')
my_bucket = s3_resource.Bucket('charlizefacematch')
summaries = my_bucket.objects.all()
image_charlize = [image.key for image in summaries]
image_charlize

['Charlize-Theron-Base.jpeg',
 'Charlize-Theron-Madmax1.jpeg',
 'Charlize-Theron-Madmax2.jpeg',
 'Charlize-Theron-Madmax3.jpeg',
 'Charlize-Theron-Monster1.jpeg',
 'Charlize-Theron-Monster2.jpeg',
 'Charlize-Theron-Monster3.jpeg',
 'Charlize-Theron-Tully1.jpg',
 'Charlize-Theron-Tully2.jpeg',
 'Charlize-Theron-Tully3.jpeg']

In [30]:
def extract_similarity_charlize(photo):
    try:
        comparison = client.compare_faces(
            SourceImage= {'S3Object':{'Bucket':'charlizefacematch', 'Name':'Charlize-Theron-Base.jpeg'}},
            TargetImage = {'S3Object':{'Bucket':'charlizefacematch','Name':photo}})
        similarity = comparison['FaceMatches'][0]['Similarity']
    except Exception:
        similarity = 0
    return similarity

In [31]:
def extract_confidence_charlize(photo):
    try:
        response = client.recognize_celebrities(
            Image={'S3Object': {'Bucket': 'charlizefacematch', 'Name': photo}})
        confidence = response['CelebrityFaces'][0]['MatchConfidence']
    except Exception:
        confidence = 0
    return confidence

In [32]:
def extract_name_charlize(photo):
    try:
        response = client.recognize_celebrities(
            Image={'S3Object': {'Bucket': 'charlizefacematch', 'Name': photo}})
        name = response['CelebrityFaces'][0]['Name']
    except Exception:
        name = "Unrecognizable"
    return name

In [33]:
df_charlize = pd.DataFrame({'Name':image_charlize})

In [34]:
df_charlize['Similarity'] = [extract_similarity_charlize(photo) for photo in df_charlize['Name']]

In [35]:
df_charlize['Celebrity_Recognized'] = [extract_name_charlize(photo) for photo in df_charlize['Name']]

In [36]:
df_charlize['Celebrity_Confidence'] = [extract_confidence_charlize(photo) for photo in df_charlize['Name']]

In [37]:
Makeup_Charlize = ['No', 'Yes', 'Yes', 'Yes', 'Yes', 'Yes','Yes','No', 'No', 'No']

In [38]:
Weight_Charlize = [0, 20, 20, 20, 30, 30, 30, 50, 50, 50]

In [39]:
df_charlize['Makeup'] = Makeup_Charlize

In [40]:
df_charlize['Weight_Change_in_lbs'] = Weight_Charlize

In [41]:
df_charlize

,Name,Similarity,Celebrity_Recognized,Celebrity_Confidence,Makeup,Weight_Change_in_lbs
0,Charlize-Theron-Base.jpeg,100.0,Charlize Theron,96.863136,No,0
1,Charlize-Theron-Madmax1.jpeg,0.0,Unrecognizable,0.0,Yes,20
2,Charlize-Theron-Madmax2.jpeg,97.272453,Unrecognizable,0.0,Yes,20
3,Charlize-Theron-Madmax3.jpeg,94.869904,Unrecognizable,0.0,Yes,20
4,Charlize-Theron-Monster1.jpeg,0.0,Unrecognizable,0.0,Yes,30
5,Charlize-Theron-Monster2.jpeg,0.0,Unrecognizable,0.0,Yes,30
6,Charlize-Theron-Monster3.jpeg,0.0,Unrecognizable,0.0,Yes,30
7,Charlize-Theron-Tully1.jpg,99.983475,Charlize Theron,82.433632,No,50
8,Charlize-Theron-Tully2.jpeg,99.932693,Unrecognizable,0.0,No,50
9,Charlize-Theron-Tully3.jpeg,98.615128,Charlize Theron,97.655167,No,50


## Merging the 3 dataframes

Once we have a dataframe for each of the 3 actors, we combine them into one large dataframe. `DataFrame.append` was removed in pandas 2.0, so we use `pd.concat`:

In [42]:
df = pd.concat([df_christian, df_jared, df_charlize], ignore_index=True)
display(df)

,Name,Similarity,Celebrity_Recognized,Celebrity_Confidence,Makeup,Weight_Change_in_lbs
0,Christian-Bale-Base.jpeg,100.0,Christian Bale,99.544708,No,0
1,Christian-Bale-Vice1.jpg,0.0,Unrecognizable,0.0,No,40
2,Christian-Bale-Vice2.jpeg,0.0,Unrecognizable,0.0,No,40
3,Christian-Bale-Vice3.jpeg,0.0,Unrecognizable,0.0,No,40
4,Christian-Bale-machinist3.jpeg,99.815094,Christian Bale,94.874214,No,65
5,christian-bale-american1.jpeg,99.809196,Felix Kramer,81.725716,No,43
6,christian-bale-american2.jpg,99.980438,Christian Bale,88.991524,No,43
7,christian-bale-american3.jpeg,99.969536,Felix Kramer,76.393921,No,43
8,christian-bale-machinist1.png,97.410477,Christian Bale,99.6577,No,65
9,christian-bale-machinist2.jpeg,99.520897,Christian Bale,99.595238,No,65
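After the frames are stacked, the only trace of which actor a row belongs to is the file name. A small variation, sketched below on toy frames, adds an explicit column while combining with `pd.concat`. The `Actor` column and the toy values are assumptions for illustration and are not part of the notebook's data.

```python
import pandas as pd

# Toy frames standing in for the per-actor dataframes.
df_a = pd.DataFrame({"Name": ["a-base.jpeg", "a-film1.jpeg"], "Similarity": [100.0, 0.0]})
df_b = pd.DataFrame({"Name": ["b-base.jpeg"], "Similarity": [100.0]})

frames = {"Actor A": df_a, "Actor B": df_b}

# assign() returns a copy with the extra column, so the originals are untouched.
combined = pd.concat(
    [frame.assign(Actor=actor) for actor, frame in frames.items()],
    ignore_index=True,
)
print(combined)
```

An explicit column like this would also make it possible to group results by actor later on.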


We want to find the average similarity score and celebrity recognition confidence score for photos with and without makeup. To do that, we group the data by the 'Makeup' column and take the mean of the two score columns. Selecting the numeric columns before calling `mean()` avoids errors on the text columns in recent pandas versions:

In [43]:
grouped_df = df.groupby("Makeup")
mean_df = grouped_df[["Similarity", "Celebrity_Confidence"]].mean()
mean_df

Makeup,Similarity,Celebrity_Confidence
No,71.418377,72.125601
Yes,16.011863,0.0
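A group mean is easier to interpret next to the number of photos behind it. The sketch below applies the same grouping to the Jared Leto rows shown earlier in this notebook and reports both the mean and the count; using `agg` with two statistics is an illustrative choice, not the notebook's method.

```python
import pandas as pd

# Jared Leto rows, copied from the table earlier in this notebook.
df_jared_scores = pd.DataFrame({
    "Makeup": ["No"] * 4 + ["Yes"] * 6,
    "Similarity": [100.0, 0.0, 0.0, 90.493851] + [0.0] * 6,
    "Celebrity_Confidence": [99.244003, 92.98777, 96.524673, 91.769417] + [0.0] * 6,
})

# Mean and number of photos per makeup group, for the two score columns only.
summary = (
    df_jared_scores
    .groupby("Makeup")[["Similarity", "Celebrity_Confidence"]]
    .agg(["mean", "count"])
)
print(summary)
```

The result has a two-level column index, so a single value is read with a tuple, for example `summary.loc["No", ("Similarity", "mean")]`.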


## Saving the dataframe to S3

In [44]:
df.to_csv('actors_analysis.csv')

In [49]:
!aws s3 ls

2021-11-29 21:50:53 actorsdataframe-storage
2021-11-15 18:22:56 charlizefacematch
2021-11-15 18:24:32 christianfacematch
2021-11-15 18:26:03 jaredfacematch
2021-11-15 04:32:43 projectvuadachi


In [46]:
!aws s3 mb s3://actorsdataframe-storage

make_bucket: actorsdataframe-storage


In [47]:
!aws s3 cp actors_analysis.csv s3://actorsdataframe-storage/actors_analysis.csv

upload: ./actors_analysis.csv to s3://actorsdataframe-storage/actors_analysis.csv


In [48]:
!aws s3 ls actorsdataframe-storage

2021-11-29 21:51:24       2196 actors_analysis.csv
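The same upload can be done from Python without writing a local file first. The sketch below serializes the dataframe in memory and sends it with the S3 client's `put_object`; `upload_dataframe` is a hypothetical helper, and `_RecordingClient` is an offline stand-in so the example runs without AWS credentials.

```python
import io

import pandas as pd


def upload_dataframe(df, s3_client, bucket, key):
    """Serialize `df` as CSV in memory, upload it, and return the size in bytes."""
    buffer = io.StringIO()
    df.to_csv(buffer)
    body = buffer.getvalue().encode("utf-8")
    s3_client.put_object(Bucket=bucket, Key=key, Body=body)
    return len(body)


class _RecordingClient:
    """Offline stand-in that records put_object calls instead of contacting S3."""

    def __init__(self):
        self.calls = []

    def put_object(self, **kwargs):
        self.calls.append(kwargs)
        return {}


sample = pd.DataFrame({"Name": ["a.jpeg"], "Similarity": [100.0]})
stub = _RecordingClient()
size = upload_dataframe(sample, stub, "example-bucket", "actors_analysis.csv")
print(size, stub.calls[0]["Key"])
```

With real credentials, the client would be `boto3.client('s3')` and the bucket `actorsdataframe-storage`.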
