##  Actor Rating Difference Analysis ##
You are given a dataset of actors and the films they have been involved in, including each film's release date and rating. For each actor, calculate the difference between the rating of their most recent film and their average rating across all previous films (the average rating excludes the most recent one).


Return a list of actors along with their average lifetime rating (the mean over all films except the most recent), the rating of their most recent film, and the difference between the two (latest rating minus average). If an actor has only one film, return 0 for the difference and that film's rating for both the average and latest rating fields.
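
To make the rule concrete before touching pandas, here is a minimal plain-Python sketch of the calculation for a single actor. The helper name `rating_shift` and its input format are assumptions made for illustration only; the ratings are Jane Smith's three films from the dataset.

```python
def rating_shift(films):
    """films: list of (release_date, rating) pairs in any order.

    Returns (average_of_previous_films, latest_rating, difference).
    Hypothetical helper, not part of the original exercise.
    """
    ordered = sorted(films)  # ISO date strings sort chronologically
    latest = ordered[-1][1]
    previous = [rating for _, rating in ordered[:-1]]
    if not previous:
        # Only one film: average and latest are the same, difference is 0
        return latest, latest, 0.0
    average = sum(previous) / len(previous)
    return round(average, 2), latest, round(latest - average, 2)


jane_smith = [('2003-02-17', 5.5), ('2016-07-08', 8.3), ('2017-03-05', 5.6)]
print(rating_shift(jane_smith))  # (6.9, 5.6, -1.3)
```

The latest film (5.6) is left out of the average, so the average is (5.5 + 8.3) / 2 = 6.9 and the difference is 5.6 - 6.9 = -1.3.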

DataFrame: actor_rating_shift

In [1]:
import pandas as pd
import numpy as np
from datetime import datetime

# Use the code below to create the actor_rating_shift DataFrame
data = [
    ['Matt Damon', 'Equal Depths', '2018-09-21', 8.0],
    ['Matt Damon', 'Equal Heights', '2015-06-15', 8.0],
    ['Emma Stone', 'Quantum Fate', '2003-08-31', 8.1],
    ['Leonardo Dicaprio', 'Rebel Rising', '2001-06-26', 9.2],
    ['Alex Taylor', 'Shadow Realm', '2002-02-06', 8.2],
    ['Emma Stone', 'Eternal Bond', '2008-07-05', 6.5],
    ['Scarlett Johansson', 'Infinite Skies', '2010-04-21', 8.7],
    ['Scarlett Johansson', 'Silent Storm', '2005-12-06', 7.4],
    ['Jane Smith', 'Burning Hearts', '2003-02-17', 5.5],
    ['Leonardo Dicaprio', 'Eternal Bond', '2020-06-24', 6.4],
    ['Natalie Portman', 'Silent Storm', '2003-03-16', 8.4],
    ['Alex Taylor', 'Bright Lights', '2000-09-17', 7.7],
    ['Emma Stone', 'A Long Journey', '2016-05-09', 6.2],
    ['Will Smith', 'Rebel Rising', '2006-03-27', 6.6],
    ['Chris Evans', 'Lost Island', '2008-07-09', 5.4],
    ['Natalie Portman', 'Bright Lights', '2005-07-13', 6.9],
    ['John Doe', 'Bright Lights', '2000-07-23', 8.3],
    ['Angelina Jolie', 'Infinite Skies', '2019-03-08', 6.0],
    ['Matt Damon', 'Final Plateau', '2020-03-01', 8.0],
    ['Brad Pitt', 'Shadow Realm', '2014-10-31', 6.0],
    ['Scarlett Johansson', 'Bright Lights', '2016-05-13', 7.5],
    ['Leonardo Dicaprio', 'Cold War', '2004-09-24', 8.9],
    ['Emma Stone', 'Bright Lights', '2005-06-28', 9.2],
    ['Scarlett Johansson', 'Lost Island', '2018-02-06', 9.4],
    ['Jane Smith', 'Infinite Skies', '2016-07-08', 8.3],
    ['Will Smith', 'Quantum Fate', '2010-03-03', 6.7],
    ['Scarlett Johansson', 'Quantum Fate', '2000-12-21', 6.7],
    ['Will Smith', 'Dark Truth', '2018-09-16', 6.2],
    ['Emma Stone', 'Lost Island', '2010-05-14', 6.0],
    ['Chris Evans', 'Lost Island', '2008-07-09', 5.4],
    ['Will Smith', 'Infinite Skies', '2014-04-04', 5.9],
    ['Morgan Lee', 'Infinite Skies', '2009-06-24', 5.7],
    ['Alex Taylor', 'A Long Journey', '2004-11-26', 8.2],
    ['Alex Taylor', 'Eternal Bond', '2009-06-18', 6.0],
    ['Alex Taylor', 'Cold War', '2000-09-01', 7.5],
    ['Natalie Portman', 'Eternal Bond', '2005-12-09', 7.4],
    ['Natalie Portman', 'Lost Island', '2003-02-17', 8.4],
    ['Morgan Lee', 'Lost Island', '2015-08-29', 6.5],
    ['Will Smith', 'Dark Truth', '2011-05-21', 9.1],
    ['Leonardo Dicaprio', 'Shadow Realm', '2018-11-16', 7.9],
    ['Emma Stone', 'Bright Lights', '2003-06-06', 6.7],
    ['Alex Taylor', 'Burning Hearts', '2021-05-21', 8.5],
    ['Will Smith', 'A Long Journey', '2013-06-07', 6.5],
    ['John Doe', 'Infinite Skies', '2020-11-02', 6.6],
    ['Chris Evans', 'Bright Lights', '2001-04-19', 6.1],
    ['Emma Stone', 'Infinite Skies', '2001-12-02', 8.3],
    ['Jane Smith', 'Burning Hearts', '2017-03-05', 5.6],
    ['Leonardo Dicaprio', 'Cold War', '2021-03-27', 7.5],
    ['Chris Evans', 'Burning Hearts', '2019-07-26', 7.7],
    ['Morgan Lee', 'Burning Hearts', '2016-12-09', 8.3]
]

# Create DataFrame
actor_rating_shift = pd.DataFrame(data, columns=['actor_name', 'film_title', 'release_date', 'film_rating'])

# Convert release_date to datetime
actor_rating_shift['release_date'] = pd.to_datetime(actor_rating_shift['release_date'])

actor_rating_shift.head()


Unnamed: 0,actor_name,film_title,release_date,film_rating
0,Matt Damon,Equal Depths,2018-09-21,8.0
1,Matt Damon,Equal Heights,2015-06-15,8.0
2,Emma Stone,Quantum Fate,2003-08-31,8.1
3,Leonardo Dicaprio,Rebel Rising,2001-06-26,9.2
4,Alex Taylor,Shadow Realm,2002-02-06,8.2
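
One detail worth checking before aggregating: the dataset lists the `Chris Evans` / `Lost Island` row twice. The problem statement does not say whether exact duplicates should count as separate films, so the sketch below only shows how they could be detected and, if desired, dropped. The names `sample`, `duplicate_rows`, and `deduplicated` are illustrative and use just the Chris Evans rows.

```python
import pandas as pd

# Chris Evans' rows copied from the dataset, including the repeated one
sample = pd.DataFrame(
    [
        ['Chris Evans', 'Lost Island', '2008-07-09', 5.4],
        ['Chris Evans', 'Lost Island', '2008-07-09', 5.4],
        ['Chris Evans', 'Bright Lights', '2001-04-19', 6.1],
        ['Chris Evans', 'Burning Hearts', '2019-07-26', 7.7],
    ],
    columns=['actor_name', 'film_title', 'release_date', 'film_rating'],
)

# keep=False flags every copy of a repeated row, not just the later ones
duplicate_rows = sample[sample.duplicated(keep=False)]

# Optional: keep one copy of each row (an assumption, not a stated requirement)
deduplicated = sample.drop_duplicates()

# Average of the earlier films once the repeated row is removed
previous_mean = deduplicated.sort_values('release_date')['film_rating'].iloc[:-1].mean()
print(len(duplicate_rows), len(deduplicated), previous_mean)
```

With the duplicate kept, the earlier-films average for Chris Evans is 5.63; with it dropped, it would be 5.75. Which one is intended depends on how the exercise defines a film.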


In [3]:
# My solution

# Rank each actor's films from most recent (rank 1) to oldest
actor_rating_shift['rank'] = actor_rating_shift.groupby('actor_name')['release_date'].rank(ascending=False, method='first')
actor_rating_shift = actor_rating_shift.sort_values(['actor_name', 'rank'])

def calculating_average_rating(group):
    """Mean rating of all films except the most recent one.

    For an actor with a single film, return that film's rating.
    """
    if len(group) > 1:
        return group.loc[group['rank'] > 1, 'film_rating'].mean()
    return group['film_rating'].mean()

# Selecting the needed columns explicitly keeps the grouping column out of
# the frames passed to apply (this avoids the warning in recent pandas versions)
average_lifetime_rating = (
    actor_rating_shift.groupby('actor_name')[['film_rating', 'rank']]
    .apply(calculating_average_rating)
)
average_lifetime_rating_df = average_lifetime_rating.rename('average_life_time_rating').reset_index()

average_lifetime_rating_data = average_lifetime_rating_df.merge(actor_rating_shift, on='actor_name')

# .copy() gives an independent DataFrame, so the assignments below do not
# trigger SettingWithCopyWarning
list_of_actors = average_lifetime_rating_data[average_lifetime_rating_data['rank'] == 1].copy()
list_of_actors = list_of_actors.drop(columns=['film_title', 'release_date', 'rank'])
list_of_actors = list_of_actors.rename(columns={'film_rating': 'recent_film_rating'})

# Difference = most recent rating minus average of the earlier films
list_of_actors['difference_with_average_rating'] = (
    list_of_actors['recent_film_rating'] - list_of_actors['average_life_time_rating']
)

list_of_actors['average_life_time_rating'] = list_of_actors['average_life_time_rating'].round(2)
list_of_actors['difference_with_average_rating'] = list_of_actors['difference_with_average_rating'].round(2)

list_of_actors


Unnamed: 0,actor_name,average_life_time_rating,recent_film_rating,difference_with_average_rating
0,Alex Taylor,7.52,8.5,0.98
6,Angelina Jolie,6.0,6.0,0.0
7,Brad Pitt,6.0,6.0,0.0
8,Chris Evans,5.63,7.7,2.07
12,Emma Stone,7.47,6.2,-1.27
19,Jane Smith,6.9,5.6,-1.3
22,John Doe,8.3,6.6,-1.7
24,Leonardo Dicaprio,8.1,7.5,-0.6
29,Matt Damon,8.0,8.0,0.0
32,Morgan Lee,6.1,8.3,2.2
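
The same table can probably be built without `groupby().apply()`. The sketch below is one possible alternative, not the solution above: the function name `rating_shift_table` is made up for illustration, and it assumes the four columns created in `In [1]`. If an actor has two films sharing the latest release date, it keeps whichever comes last after a stable sort, which may differ from the `rank(method='first')` tie-break.

```python
import pandas as pd


def rating_shift_table(films):
    """Hypothetical helper: one row per actor with average, latest, difference."""
    ordered = films.sort_values(['actor_name', 'release_date'], kind='stable')

    # The last row per actor after sorting is the most recent film
    is_latest = ~ordered.duplicated('actor_name', keep='last')
    latest = ordered[is_latest].set_index('actor_name')['film_rating']

    # Mean over the remaining (earlier) films; actors with one film get NaN here
    previous_mean = ordered[~is_latest].groupby('actor_name')['film_rating'].mean()

    # Single-film actors fall back to their only rating, so the difference is 0
    average = previous_mean.reindex(latest.index).fillna(latest)

    result = pd.DataFrame({
        'average_life_time_rating': average.round(2),
        'recent_film_rating': latest,
        'difference_with_average_rating': (latest - average).round(2),
    })
    return result.rename_axis('actor_name').reset_index()


# Small sample taken from the dataset: one multi-film and one single-film actor
films = pd.DataFrame(
    [
        ['Jane Smith', 'Burning Hearts', '2003-02-17', 5.5],
        ['Jane Smith', 'Infinite Skies', '2016-07-08', 8.3],
        ['Jane Smith', 'Burning Hearts', '2017-03-05', 5.6],
        ['Brad Pitt', 'Shadow Realm', '2014-10-31', 6.0],
    ],
    columns=['actor_name', 'film_title', 'release_date', 'film_rating'],
)
films['release_date'] = pd.to_datetime(films['release_date'])

result = rating_shift_table(films)
print(result)
```

Splitting the frame into "latest" and "earlier" rows keeps every step a plain column operation, which tends to be easier to debug than a custom function passed to `apply`.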
