# Movie Ratings Visualization

Before developing any prediction model, it is important to understand the data and visualize it.

In this notebook, we analyze the data from different points of view. Five interactive charts cover:

- Overall evolution of the number of movies produced per year
- Genre comparisons on the number of movies produced per year
- Genre comparisons on the average rating per movie per year
- Movie rating evolution per actor/actress, per year
- Versatility indicator for actors/actresses

In [1]:
import pandas as pd
import plotly.graph_objects as go
from plotly.subplots import make_subplots
%matplotlib inline

In [2]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [3]:
movies_info = pd.read_csv('/content/drive/MyDrive/My project/Dataset CSV/movies_info.csv', index_col = 'tconst')
movies_actors = pd.read_csv('/content/drive/MyDrive/My project/Dataset CSV/movies_actors.csv')
actor_names = pd.read_csv('/content/drive/MyDrive/My project/Dataset CSV/crew_names.csv', index_col = 'nconst', usecols = ['nconst', 'primaryName']).to_dict('dict')

# Overall Number of Movies Produced by Year
This first scatter plot shows how the overall number of movies produced changed from year to year.

In [4]:
# Analyze the evolution of the number of movies produced through the years
# rename() before to_frame() works whether value_counts() names its result 'startYear' or 'count'
movies_years = movies_info['startYear'].value_counts().sort_index().rename('n_movies').to_frame()

In [5]:
fig = go.Figure()

fig.add_trace(go.Scatter(x=movies_years.index, y=movies_years['n_movies'], name = 'Total',
                         mode='lines+markers', line=dict(color='Black')))

fig.update_layout(
    xaxis=dict(
        showline=True,
        showgrid=True,
        showticklabels=True,
        ticks='outside',
        linewidth=2,
        linecolor='rgb(204,204,204)',
        title=dict(
            text='Year',
            font=dict(
                family='Arial',
                color='black')),
        tickfont=dict(
            family='Arial',
            size=11,
            color='rgb(82,82,82)')),
    title=dict(
        text='Total number of movies produced by year',
        font=dict(
            family='Arial',
            color='black'),
        x=0.5),
    yaxis=dict(
        title=dict(
            text='Number of movies',
            font=dict(
                family='Arial',
                color='black')),
        ticks='outside',
        linewidth=2,
        linecolor='rgb(204,204,204)'),
    )


As expected, there was a significant decrease in the number of movies produced from 2019 to 2020, since the entertainment industry was one of the most affected by the COVID-19 pandemic.
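
To put a number on such a drop, the year-over-year change can be computed with `pct_change`. The sketch below uses made-up counts (the `toy_counts` values are illustrative assumptions, not figures from the data set) in the same shape as `movies_years`.

```python
import pandas as pd

# Hypothetical yearly counts, shaped like movies_years (index = year)
toy_counts = pd.DataFrame(
    {"n_movies": [100, 120, 90]},
    index=pd.Index([2018, 2019, 2020], name="startYear"),
)

# Relative change with respect to the previous year, in percent
toy_counts["yoy_change_pct"] = toy_counts["n_movies"].pct_change() * 100

print(toy_counts)
```

The first year has no predecessor, so its change is `NaN`.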

# Number of movies produced by year and by genre
After analyzing how the production of movies changed over time, we now focus our attention on comparing this evolution by genre.

In [6]:
# Create a dictionary with information for each type of genre
# Create a list with all different types of genres
genres = movies_info.columns[-27:]

# Create a dictionary with the keys being the genres and values the data frames 
# corresponding to the movies of the genre
genres_data = {}

for value in genres:
    genres_data[value] = movies_info[movies_info[value]==1]

In [7]:
genres = movies_info.columns[-27:]
genres_years = pd.DataFrame(columns=genres)
genres_years['Year'] = movies_years.index
genres_years.set_index('Year', inplace=True)

# Now fill the data frame with the number of movies per year, per genre
for value in genres_years.columns: # For each type of movie
    for year in genres_years.index: # For each year
        aux = genres_data[value]['startYear']
        genres_years.loc[year, value] = aux.loc[aux==year].count()
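
The nested loop above visits every genre/year pair. Assuming the genre columns are 0/1 indicators (as the `== 1` filter suggests), the same table can be sketched with a single `groupby`. The `toy_info` frame, `toy_genres` list and their values are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for movies_info with two genre indicator columns
toy_info = pd.DataFrame(
    {
        "startYear": [2019, 2019, 2020, 2020, 2020],
        "Action": [1, 0, 1, 1, 0],
        "Drama": [0, 1, 1, 0, 1],
    }
)
toy_genres = ["Action", "Drama"]

# Summing 0/1 indicators per year counts the movies of each genre
toy_genres_years = toy_info.groupby("startYear")[toy_genres].sum()
print(toy_genres_years)
```

A movie tagged with several genres is counted once in each of them, which matches what the loop does.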

In [10]:
fig = go.Figure()

fig.add_trace(go.Scatter(x=genres_years.index, y=genres_years['Animation'], mode='lines+markers', name='Animation',
                        line=dict(color='rgba(100, 100, 100, .9)')))

fig.add_trace(go.Scatter(x=genres_years.index, y=genres_years['Animation'], mode='lines+markers', name='Animation',
                        line=dict(color='rgba(40, 150, 150, .9)')))

fig.update_layout(
    xaxis=dict(
        showline=True,
        showgrid=True,
        showticklabels=True,
        ticks='outside',
        linewidth=2,
        linecolor='rgb(104,104,104)',
        title=dict(
            text='Year',
            font=dict(
                family='Arial',
                color='red')),
        tickfont=dict(
            family='Arial',
            size=11,
            color='rgb(82,82,82)')),
    title=dict(
        text='Number of movies produced by year (Genre comparison)',
        font=dict(
            family='Arial',
            color='black'),
        x=0.5),
    yaxis=dict(
        title=dict(
            text='Number of movies',
            font=dict(
                family='Arial',
                color='black')),
        ticks='outside',
        linewidth=2,
        linecolor='rgb(204,204,204)'),
    )
    

buttons_1 = []
# add buttons for each type of movie

for level in genres_years.columns:
  buttons_1.append(dict(method='restyle',
                        label=str(level),
                        visible=True,
                        args=[{'y':[genres_years[level]],
                            'x':[genres_years.index],
                              'type':'scatter',
                               'name':level,
                              }, [0]]))

buttons_2 = []
# add buttons for each type of movie (second dropdown menu)

for level in genres_years.columns:
    buttons_2.append(dict(method='restyle',
                        label=str(level),
                        visible=True,
                        args=[{'y':[genres_years[level]],
                            'x':[genres_years.index],
                              'type':'scatter',
                               'name':level,
                              }, [1]]))

# Adjust dropdown placement
button_layer_1_height = 1.16
updatemenus = list([
    dict(buttons=buttons_1,
        direction = 'down',
        pad = {'r':10, 't':17},
        showactive = True,
        x = 0,
        xanchor = 'left',
        y = button_layer_1_height,
        yanchor='top'),
    dict(buttons=buttons_2,
        direction = 'down',
        pad = {'r':10, 't':17},
        showactive = True,
        x = 0.43,
        xanchor = 'left',
        y = button_layer_1_height,
        yanchor='top')])

fig.update_layout(font = {'color': "black", 'family': "Arial"})

fig.update_layout(updatemenus=updatemenus)

On the scatter plot above, the two dropdown menus let you choose which two genres to compare by number of movies produced per year.
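
Each dropdown works by sending a `restyle` command to exactly one trace: the last element of `args` is the list of trace indices to update. A minimal sketch of that pattern follows, written without Plotly so it runs on its own; `make_restyle_buttons` and the toy data are hypothetical names used only for illustration.

```python
def make_restyle_buttons(series_by_label, x_values, trace_index):
    """Build one dropdown button per label; each button replaces the
    x/y data (and legend name) of the trace at position trace_index."""
    buttons = []
    for label, y_values in series_by_label.items():
        buttons.append(
            dict(
                method="restyle",
                label=str(label),
                args=[
                    {"x": [x_values], "y": [y_values], "name": label},
                    [trace_index],  # only this trace is modified
                ],
            )
        )
    return buttons


# Hypothetical per-genre counts for three years
years = [2019, 2020, 2021]
counts = {"Action": [30, 20, 25], "Drama": [50, 35, 40]}

first_menu = make_restyle_buttons(counts, years, trace_index=0)
second_menu = make_restyle_buttons(counts, years, trace_index=1)
print(first_menu[1]["label"], first_menu[1]["args"][1])
```

The two lists play the role of `buttons_1` and `buttons_2` in the cell above: each is passed as the `buttons` entry of one `updatemenus` item.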

# Average ratings by genre and year

In [11]:
# Compare the average ratings of the movie genres

# Create a data frame that will hold the average rating per genre,
# starting with one row per genre
genres_avg = pd.DataFrame(genres , columns = ['Genre'])
genres_avg.set_index('Genre', inplace=True)

# Add the average of each genre by calculating the average on the dictionary that has as keys
# each genre
for value in genres_data.keys():
    aux = genres_data[value]['averageRating']
    genres_avg.loc[value, 'Overall'] = aux.sum()/len(aux)
    
# Now add a column for each year

for year in movies_info['startYear'].unique():
    genres_avg[year] = 0

for value in genres_data.keys():
    aux = genres_data[value].groupby('startYear')['averageRating']
    aux2 = aux.sum()/aux.count()
    for year in aux2.index:
        genres_avg.loc[value, year] = aux2[year]

genres_avg = genres_avg.sort_values(by='Overall', ascending=False)

In [12]:
# Plot a bar chart of the average rating per genre, with a dropdown to select the year

fig = go.Figure()
fig.add_trace(go.Bar(x=genres_avg.index, y=genres_avg.Overall, marker_color='rgb(233,150,122)'))


fig.update_layout(xaxis=dict(
    showline=False,
    showgrid=False,
    showticklabels=True,
    ticks='outside',
    tickangle = 45,
    tickfont=dict(
        family='Arial',
        size=14,
        color='black',
        ),
    ),
    yaxis=dict(
        title=dict(
            text='Average rating',
            font=dict(
                family='Arial',
                color='black',
                size=16)),
        color='black'),
    title = dict(
        text='Average rating per movie genre, per year',
        x=0.5,
        font=dict(
            family='Arial',
            color='black')))

buttons = []
# add a button for each year (plus 'Overall')

for level in genres_avg.columns:
    # Create an aux data frame with only the column for the corresponding year, so the bars can be
    # sorted by the average rating of the year selected on the dropdown menu
    aux = pd.DataFrame(genres_avg[level].sort_values(ascending=False))
    buttons.append(dict(method='restyle',
                        label=str(level),
                        visible=True,
                        args=[{'y':[aux[level]],
                            'x':[aux.index],
                              'type':'bar',
                              }, [0]]))

# Adjust dropdown placement
button_layer_1_height = 1.2
updatemenus = list([
    dict(buttons=buttons,
        direction = 'down',
        pad = {'r':10, 't':17},
        showactive = True,
        x = 0,
        xanchor = 'left',
        y = button_layer_1_height,
        yanchor='top')])

fig.update_layout(font = {'color': "black", 'family': "Arial"})

fig.update_layout(updatemenus=updatemenus)

On the bar chart above, choosing a year displays the average rating per genre for the movies produced in that year. The first option on the menu, 'Overall', gives the average rating per genre across all years.
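
The per-genre, per-year averages behind this chart can also be sketched without explicit loops. The example below assumes 0/1 genre indicator columns and uses a hypothetical `toy_info` frame; it masks the ratings by genre membership and lets `mean` skip the missing values.

```python
import pandas as pd

# Hypothetical stand-in for movies_info
toy_info = pd.DataFrame(
    {
        "startYear": [2019, 2019, 2020, 2020],
        "averageRating": [6.0, 8.0, 5.0, 7.0],
        "Action": [1, 1, 1, 0],
        "Drama": [0, 1, 1, 1],
    }
)
toy_genres = ["Action", "Drama"]

# Keep the rating only where the movie belongs to the genre, else NaN
ratings_by_genre = toy_info[toy_genres].where(toy_info[toy_genres] == 1)
ratings_by_genre = ratings_by_genre.mul(toy_info["averageRating"], axis=0)

# Mean per year skips NaN, so each cell averages only that genre's movies
toy_avg = ratings_by_genre.groupby(toy_info["startYear"]).mean().T
toy_avg.insert(0, "Overall", ratings_by_genre.mean())
print(toy_avg)
```

The result has one row per genre, an `Overall` column and one column per year, the same layout as `genres_avg`.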

# Movie rating evolution per actor/actress

In [13]:
actor_info = movies_actors[['nconst','tconst','startYear']].groupby(['nconst','startYear']).count()
actor_info = actor_info.rename(columns={'tconst':'n_movies'})
actor_info['avg_rate'] = movies_actors[['nconst','averageRating','startYear']].groupby(['nconst','startYear']).mean()
actor_info = actor_info.reset_index().pivot(index='nconst',columns='startYear')

actor_movies = actor_info['n_movies'].copy()
actor_avg = actor_info['avg_rate'].copy()

actor_movies['n_movies'] = actor_movies.sum(axis = 1) #DataFrame with number of movies by actor and year
actor_avg['avg'] = round(actor_avg.mean(axis = 1), 2) #DataFrame with avg rating by actor and year

actor_movies = actor_movies.fillna(0)
actor_avg = actor_avg.fillna(0)

In [14]:
# Keep only those who appeared in movies in both 2020 and 2021
actor_avg = actor_avg[(actor_avg[2020] > 0) & (actor_avg[2021] > 0)]

actor_avg = actor_avg.T
actor_avg = actor_avg.rename(columns = actor_names['primaryName'])
actor_avg = actor_avg.drop('Abhirami', axis = 1)
actor_avg = actor_avg.reindex(sorted(actor_avg.columns), axis=1)
average = pd.DataFrame(actor_avg.loc['avg',:])
actor_avg.drop('avg', axis=0, inplace=True)

In [15]:
actor_avg.head()

startYear,Aadukalam Naren,Aaron Groben,Abhishek Bachchan,Abhishek Banerjee,Abhishek Raveendran,Abi Casson Thompson,Abir Chatterjee,Adam Berardi,Adam Devine,Adan Canto,Aditi Rao Hydari,Aditya Srivastav,Adrian Bouchet,Adrian Grenier,Aggie Hsieh,Agnieszka Grochowska,Aidan Bristow,Aiden Gale,Aina Aiba,Aishwarya Lekshmi,Ajay,Ajay Bafna,Aju Varghese,Akira Ishida,Al Madrigal,Al Weaver,Alan Ritchson,Albin Grenholm,Albrecht Schuch,Aleksandr Demidov,Aleksey Serebryakov,Alencier Ley Lopez,Alessandro Carlini,Alex Essoe,Alex Teix,Alexander Petrov,Alexandra Robertshaw,Alexandra Shipp,Alexandre David Lejuez,Alexxis Lemire,...,Wolfgang Cerny,Woo-jin Jo,Yanhui Wang,Yanshu Wu,Yasamin Jasem,Yase Liu,Yashpal Sharma,Ye Gao,Yeom Hye-ran,Yi Zhang,Yo-Han Byun,Yogi Babu,Yola'nda Bell,Yolie Canales,Yoo In-Na,Yoo Yeon-Seok,Yoshino Kimura,Yu-Ning Tsao,Yuanyuan Zhu,Yumiko Kobayashi,Yuming Du,Yurina Hirate,Yvone Freese,Yôko Hikasa,Yôsuke Eguchi,Yôsuke Kubozuka,Yû Aoi,Yûichi Nakamura,Yûsuke Iseya,Zach Avery,Zachary Quinto,Zackary Arthur,Zazie Beetz,Zhaleh Sameti,Zhang Dong,Zishan Rong,Zizan Razak,Álex García,Álvaro Cervantes,Éléonore Loiselle
2010,0.0,0.0,5.6,0.0,0.0,0.0,6.95,0.0,0.0,0.0,0.0,3.3,0.0,0.0,0.0,6.066667,0.0,0.0,0.0,0.0,4.825,0.0,6.6,6.85,0.0,5.7,0.0,0.0,0.0,7.6,5.4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,7.1,0.0,4.4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,7.2,6.35,0.0,7.6,0.0,0.0,0.0,0.0,0.0,7.0,4.85,6.5,5.5,6.7,0.0,0.0,0.0,0.0,0.0,0.0,0.0,3.7,0.0,6.9,0.0
2011,0.0,6.5,5.7,0.0,0.0,0.0,7.45,0.0,0.0,0.0,7.5,0.0,0.0,5.9,0.0,7.3,3.0,0.0,0.0,0.0,6.95,0.0,5.8,5.5,0.0,0.0,0.0,0.0,0.0,0.0,6.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,5.5,5.666667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,5.3,0.0,0.0,0.0,0.0,0.0,6.0,0.0,0.0,7.7,6.2,6.3,6.666667,0.0,6.45,0.0,7.1,0.0,0.0,0.0,0.0,0.0,5.5,5.9,0.0,0.0
2012,7.45,0.0,4.8,0.0,0.0,0.0,6.875,0.0,0.0,0.0,5.7,0.0,0.0,0.0,0.0,5.2,4.7,0.0,0.0,0.0,7.3,0.0,0.0,6.8,0.0,0.0,0.0,0.0,5.8,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,5.975,0.0,0.0,5.3,0.0,0.0,0.0,0.0,6.1,6.4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,5.3,6.3,7.5,0.0,6.7,0.0,0.0,0.0,0.0,0.0,0.0,0.0,3.94,0.0,4.75,0.0
2013,0.0,7.4,5.4,0.0,0.0,0.0,7.0,0.0,0.0,0.0,4.9,0.0,8.1,5.5,0.0,6.5,5.6,0.0,0.0,0.0,5.033333,0.0,5.4,6.7,0.0,4.6,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,6.2,3.8,0.0,0.0,0.0,0.0,5.8,0.0,0.0,0.0,0.0,0.0,0.0,6.65,6.4,6.8,6.4,0.0,0.0,0.0,7.7,0.0,0.0,0.0,0.0,0.0,5.775,0.0,0.0,0.0
2014,7.1,6.1,5.0,0.0,0.0,0.0,6.966667,0.0,6.0,0.0,0.0,0.0,2.7,0.0,0.0,4.6,4.725,0.0,0.0,0.0,5.566667,0.0,5.96,0.0,0.0,0.0,0.0,0.0,0.0,0.0,7.6,7.4,0.0,6.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,6.275,0.0,0.0,0.0,6.5,0.0,0.0,0.0,0.0,6.75,5.8,0.0,0.0,0.0,0.0,0.0,0.0,0.0,7.6,6.3,7.45,6.8,6.9,0.0,4.6,0.0,0.0,0.0,0.0,0.0,4.9,6.4,0.0,0.0


In [16]:
fig = make_subplots(rows=1, cols=2, column_widths=[0.15, 0.85])

trace1 = go.Bar(x=average.index[0:1], y=average.avg, name='Average Rating', marker_color='rgb(70,130,180)')


trace2 = go.Scatter(x=actor_avg.index, y=actor_avg['Aadukalam Naren'], mode='lines+markers',
                        line=dict(color='rgba(100, 100, 100, .9)'), name='Rating Evolution')


fig.add_trace(trace1, row=1, col=1)

fig.add_trace(trace2, row=1, col=2)


fig.update_layout(
    xaxis=dict(
        showline=True,
        showgrid=True,
        showticklabels=True,
        ticks='outside',
        linewidth=2,
        linecolor='rgb(104,104,104)',
        tickfont=dict(
            family='Arial',
            size=11,
            color='rgb(82,82,82)')),
    title=dict(
        text='Movie rating evolution per actor/actress  -  Average Rating',
        font=dict(
            family='Arial',
            color='black'),
        x=0.5),
    yaxis=dict(
        range=[0,10],
        title=dict(
            text='Rating',
            font=dict(
                family='Arial',
                color='black'))),
    )

fig.update_yaxes(range=[0, 10], row=1, col=2,title=dict(
    text='Average rating',
    font=dict(
        family='Arial',
        color='black')) )

fig.update_xaxes(row=1, col=2, tickfont=dict(
            family='Arial',
            size=14,
            color='rgb(82,82,82)'),
            title = dict(
                text='Year',
                font=dict(
                    family='Arial',
                    size=14,
                    color='black')),
            showline=True,
            showgrid=True,
            showticklabels=True,
            ticks='outside',
            linewidth=2,
            linecolor='rgb(104,104,104)')

buttons_1 = []
# add buttons for each actor

for i in range(len(actor_avg.columns)):
    buttons_1.append(dict(method='update',
                        label=str(actor_avg.columns[i]),
                        visible=True,
                        args=[{'y':[actor_avg.iloc[:,i], average.avg[i:i+1]],
                            'x':[actor_avg.index, average.index[i:i+1]],
                              'type':['scatter','bar'],
                               'name': ['Rating Evolution', 'Average Rating']},
                              {'yaxis': {'range': [0,10]}},[1,0]]))
    
# Adjust dropdown placement
button_layer_1_height = 1.18
updatemenus = list([
    dict(buttons=buttons_1,
        direction = 'down',
        pad = {'r':10, 't':17},
        showactive = True,
        x = 0,
        xanchor = 'left',
        y = button_layer_1_height,
        yanchor='top')])

fig.update_layout(font = {'color': "black", 'family': "Arial"})

fig.update_layout(updatemenus=updatemenus)

As shown above, the movie rating evolution per actor/actress is visualized with two charts. By selecting a specific actor/actress, you will get:

- A bar chart with the overall average rating for the movies he/she was a part of;
- A scatter plot with the evolution of the average rating of movies he/she was a part of from 2010 to 2021.
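
The table behind these two charts is a per-actor, per-year average. A minimal sketch of that aggregation on hypothetical data (the `toy_actors` frame and its values are made up for illustration):

```python
import pandas as pd

# Hypothetical stand-in for movies_actors: one row per actor and movie
toy_actors = pd.DataFrame(
    {
        "nconst": ["nm1", "nm1", "nm1", "nm2"],
        "startYear": [2020, 2020, 2021, 2021],
        "averageRating": [6.0, 8.0, 5.0, 7.0],
    }
)

# One row per actor, one column per year; years without movies stay NaN
toy_avg = (
    toy_actors.groupby(["nconst", "startYear"])["averageRating"]
    .mean()
    .unstack("startYear")
)

# Overall figure shown in the bar chart: mean of the yearly averages
toy_overall = toy_avg.mean(axis=1).round(2)
print(toy_avg)
print(toy_overall)
```

In the notebook the missing years are then filled with 0, so the line drops to 0 in any year where the actor/actress has no rated movie.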

# Actor versatility

To measure actors' versatility, we developed an indicator based on the number of different genres an actor/actress appeared in.

To create the versatility indicator, two steps were taken:

1. Assign a value to each actor/actress, equal to the number of different movie genres he/she worked on.

For example: if an actor only worked on action movies between 2010 and 2021, the value is 1; if an actor worked on thriller, drama and action movies between 2010 and 2021, the value is 3.

2. Rescale the indicator to a range from 0 to 10 with min-max scaling (0 corresponds to the lowest value observed and 10 to the highest).
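
The two steps can be sketched on a small hypothetical table of movie counts per actor and genre (the `toy_counts` frame and the actor labels are illustrative assumptions):

```python
import pandas as pd

# Hypothetical number of movies per actor (rows) and genre (columns)
toy_counts = pd.DataFrame(
    {"Action": [3, 1, 0], "Drama": [0, 2, 4], "Thriller": [0, 1, 0]},
    index=["Actor A", "Actor B", "Actor C"],
)

# Step 1: number of genres with at least one movie
n_genres = toy_counts.gt(0).sum(axis=1)

# Step 2: min-max scaling to the 0-10 range
span = n_genres.max() - n_genres.min()
versatility = 10 * (n_genres - n_genres.min()) / span
print(versatility)
```

Note that the scaling is undefined when every actor has the same number of genres (`span` would be 0), a case this sketch does not handle.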


In [17]:
actor_ver = pd.merge(movies_actors[['tconst','nconst']], movies_info[genres], on = 'tconst')
actor_ver.head()

,tconst,nconst,Action,Adult,Adventure,Animation,Biography,Comedy,Crime,Documentary,Drama,Family,Fantasy,Game-Show,History,Horror,Music,Musical,Mystery,News,Reality-TV,Romance,Sci-Fi,Short,Sport,Talk-Show,Thriller,War,Western
0,tt0011216,nm0290157,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,tt0011216,nm0300388,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,tt0011216,nm0869559,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,tt0011216,nm0595321,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,tt0016906,nm0530110,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0


In [18]:
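# Drop the movie id before summing, so only the numeric genre columns are aggregated
actor_ver = actor_ver.drop(columns='tconst').groupby('nconst').sum()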
actor_ver = actor_ver.rename(index = actor_names['primaryName'])
actor_ver['n_genres'] = actor_ver.astype(bool).sum(axis = 1)
actor_ver['n_genres_norm'] = (actor_ver['n_genres'] - actor_ver['n_genres'].min(axis=0)) /\
                             (actor_ver['n_genres'].max(axis=0) - actor_ver['n_genres'].min(axis=0))

actor_ver['n_genres_norm'] = 10*actor_ver['n_genres_norm']
actor_ver.head()

nconst,Action,Adult,Adventure,Animation,Biography,Comedy,Crime,Documentary,Drama,Family,Fantasy,Game-Show,History,Horror,Music,Musical,Mystery,News,Reality-TV,Romance,Sci-Fi,Short,Sport,Talk-Show,Thriller,War,Western,n_genres,n_genres_norm
Lauren Bacall,0,0,0,0,0,1,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0.526316
Sophia Loren,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0.0
Gong Li,2,0,2,0,0,1,1,0,5,0,1,0,1,0,0,0,1,0,0,2,1,0,1,0,0,0,0,11,5.263158
Elena Koreneva,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0.0
John Cleese,0,0,2,4,0,8,0,1,1,1,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,8,3.684211


In [19]:
# Create a list of the actor/actress names that will be displayed on the visualization
codes = actor_ver.head(30).index.sort_values()

# Get the average value of the normalized indicator, so the threshold line
# is on the same 0-10 scale as the gauge:
mvalue = actor_ver['n_genres_norm'].mean()

In [20]:
#Create an indicator for the versatility index
fig = go.Figure(go.Indicator(
    mode = "gauge+number",
    value = actor_ver['n_genres_norm'][codes[0]],  # start on the first actor/actress listed in the dropdown
    title = dict(
        text = 'Versatility indicator' ,
        font = dict(
            family = 'Arial',
            color='black')),
    domain = {'x': [0, 1], 'y': [0.2, 0.9]},
    gauge = dict(
        bar = dict(
            color='rgb(205,92,92)'),
        axis = dict(
            range = [None, 10]),
        threshold = dict(
            line = dict(
                color = 'rgb(70,130,180)',
                width = 4),
            thickness = 0.75,
            value = mvalue))
))

fig.update_layout(font = {'color': "black", 'family': "Arial"})

buttons = []
# add a button for each actor/actress

for level in codes:
    buttons.append(dict(method='restyle',
                        label=str(level),
                        visible=True,
                        args=[{'value':[actor_ver['n_genres_norm'][level]],
                              }, [0]]))

# Adjust dropdown placement
button_layer_1_height = 1.2
updatemenus = list([
    dict(buttons=buttons,
        direction = 'down',
        pad = {'r':10, 't':17},
        showactive = True,
        x = 0,
        xanchor = 'left',
        y = button_layer_1_height,
        yanchor='top')])

fig.update_layout(updatemenus=updatemenus)


fig.show()


On the visual above, choosing an actor/actress displays his/her versatility indicator. The blue line represents the average value of the versatility indicator across all actors/actresses.