## A Star Wars Survey

While waiting for [Star Wars: The Force Awakens](https://en.wikipedia.org/wiki/Star_Wars:_The_Force_Awakens 'Star Wars: The Force Awakens') to come out, the team at [FiveThirtyEight](https://fivethirtyeight.com/ 'FiveThirtyEight') became interested in answering some questions about Star Wars fans. In particular, they wondered: **does the rest of America realize that “The Empire Strikes Back” is clearly the best of the bunch?**

The team needed to collect data addressing this question. To do this, they surveyed Star Wars fans using the online tool SurveyMonkey. They received 835 total responses, which you can download [from their GitHub repository](https://github.com/fivethirtyeight/data/tree/master/star-wars-survey 'from their GitHub repository').





In [1]:
import pandas as pd
import numpy as np
star_wars = pd.read_csv("star_wars.csv", encoding="ISO-8859-1")

In [2]:
star_wars.head(5)

Unnamed: 0,RespondentID,Have you seen any of the 6 films in the Star Wars franchise?,Do you consider yourself to be a fan of the Star Wars film franchise?,Which of the following Star Wars films have you seen? Please select all that apply.,Unnamed: 4,Unnamed: 5,Unnamed: 6,Unnamed: 7,Unnamed: 8,Please rank the Star Wars films in order of preference with 1 being your favorite film in the franchise and 6 being your least favorite film.,...,Unnamed: 28,Which character shot first?,Are you familiar with the Expanded Universe?,Do you consider yourself to be a fan of the Expanded Universe?ÂÃ¦,Do you consider yourself to be a fan of the Star Trek franchise?,Gender,Age,Household Income,Education,Location (Census Region)
0,,Response,Response,Star Wars: Episode I The Phantom Menace,Star Wars: Episode II Attack of the Clones,Star Wars: Episode III Revenge of the Sith,Star Wars: Episode IV A New Hope,Star Wars: Episode V The Empire Strikes Back,Star Wars: Episode VI Return of the Jedi,Star Wars: Episode I The Phantom Menace,...,Yoda,Response,Response,Response,Response,Response,Response,Response,Response,Response
1,3292880000.0,Yes,Yes,Star Wars: Episode I The Phantom Menace,Star Wars: Episode II Attack of the Clones,Star Wars: Episode III Revenge of the Sith,Star Wars: Episode IV A New Hope,Star Wars: Episode V The Empire Strikes Back,Star Wars: Episode VI Return of the Jedi,3,...,Very favorably,I don't understand this question,Yes,No,No,Male,18-29,,High school degree,South Atlantic
2,3292880000.0,No,,,,,,,,,...,,,,,Yes,Male,18-29,"$0 - $24,999",Bachelor degree,West South Central
3,3292765000.0,Yes,No,Star Wars: Episode I The Phantom Menace,Star Wars: Episode II Attack of the Clones,Star Wars: Episode III Revenge of the Sith,,,,1,...,Unfamiliar (N/A),I don't understand this question,No,,No,Male,18-29,"$0 - $24,999",High school degree,West North Central
4,3292763000.0,Yes,Yes,Star Wars: Episode I The Phantom Menace,Star Wars: Episode II Attack of the Clones,Star Wars: Episode III Revenge of the Sith,Star Wars: Episode IV A New Hope,Star Wars: Episode V The Empire Strikes Back,Star Wars: Episode VI Return of the Jedi,5,...,Very favorably,I don't understand this question,No,,Yes,Male,18-29,"$100,000 - $149,999",Some college or Associate degree,West North Central


In [3]:
star_wars.columns

Index(['RespondentID',
       'Have you seen any of the 6 films in the Star Wars franchise?',
       'Do you consider yourself to be a fan of the Star Wars film franchise?',
       'Which of the following Star Wars films have you seen? Please select all that apply.',
       'Unnamed: 4', 'Unnamed: 5', 'Unnamed: 6', 'Unnamed: 7', 'Unnamed: 8',
       'Please rank the Star Wars films in order of preference with 1 being your favorite film in the franchise and 6 being your least favorite film.',
       'Unnamed: 10', 'Unnamed: 11', 'Unnamed: 12', 'Unnamed: 13',
       'Unnamed: 14',
       'Please state whether you view the following characters favorably, unfavorably, or are unfamiliar with him/her.',
       'Unnamed: 16', 'Unnamed: 17', 'Unnamed: 18', 'Unnamed: 19',
       'Unnamed: 20', 'Unnamed: 21', 'Unnamed: 22', 'Unnamed: 23',
       'Unnamed: 24', 'Unnamed: 25', 'Unnamed: 26', 'Unnamed: 27',
       'Unnamed: 28', 'Which character shot first?',
       'Are you familiar with the Expan

Since the columns *Have you seen any of the 6 films in the Star Wars franchise?* and *Do you consider yourself to be a fan of the Star Wars film franchise?* hold answers to yes/no questions, we will convert them from their current string type to a Boolean type, which makes the data easier to handle.

In [4]:
# mapping for each new value
yes_no = {
    'Yes': True,
    'No': False,
}

# convert yes/no to true/false boolean type
for col in [
    "Have you seen any of the 6 films in the Star Wars franchise?",
    "Do you consider yourself to be a fan of the Star Wars film franchise?"
    ]:
    star_wars[col] = star_wars[col].map(yes_no)
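
One detail of `Series.map` is worth keeping in mind here: any value that is missing from the mapping dictionary becomes `NaN`. The sketch below illustrates this on a small made-up Series (the names `toy_answers` and `toy_mapping` are hypothetical and not part of the survey data).

```python
import numpy as np
import pandas as pd

# Made-up answers: two valid responses, a missing value, and a stray label
toy_answers = pd.Series(["Yes", "No", np.nan, "Response"])

toy_mapping = {"Yes": True, "No": False}

# Values absent from the dictionary (NaN and "Response") are mapped to NaN
toy_converted = toy_answers.map(toy_mapping)
print(toy_converted)
```

This is why the first row of the survey, which holds the label `Response` instead of an answer, appears as missing in the two converted columns.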

In [5]:
# print head to make sure conversion was successful
star_wars.head()

Unnamed: 0,RespondentID,Have you seen any of the 6 films in the Star Wars franchise?,Do you consider yourself to be a fan of the Star Wars film franchise?,Which of the following Star Wars films have you seen? Please select all that apply.,Unnamed: 4,Unnamed: 5,Unnamed: 6,Unnamed: 7,Unnamed: 8,Please rank the Star Wars films in order of preference with 1 being your favorite film in the franchise and 6 being your least favorite film.,...,Unnamed: 28,Which character shot first?,Are you familiar with the Expanded Universe?,Do you consider yourself to be a fan of the Expanded Universe?ÂÃ¦,Do you consider yourself to be a fan of the Star Trek franchise?,Gender,Age,Household Income,Education,Location (Census Region)
0,,,,Star Wars: Episode I The Phantom Menace,Star Wars: Episode II Attack of the Clones,Star Wars: Episode III Revenge of the Sith,Star Wars: Episode IV A New Hope,Star Wars: Episode V The Empire Strikes Back,Star Wars: Episode VI Return of the Jedi,Star Wars: Episode I The Phantom Menace,...,Yoda,Response,Response,Response,Response,Response,Response,Response,Response,Response
1,3292880000.0,True,True,Star Wars: Episode I The Phantom Menace,Star Wars: Episode II Attack of the Clones,Star Wars: Episode III Revenge of the Sith,Star Wars: Episode IV A New Hope,Star Wars: Episode V The Empire Strikes Back,Star Wars: Episode VI Return of the Jedi,3,...,Very favorably,I don't understand this question,Yes,No,No,Male,18-29,,High school degree,South Atlantic
2,3292880000.0,False,,,,,,,,,...,,,,,Yes,Male,18-29,"$0 - $24,999",Bachelor degree,West South Central
3,3292765000.0,True,False,Star Wars: Episode I The Phantom Menace,Star Wars: Episode II Attack of the Clones,Star Wars: Episode III Revenge of the Sith,,,,1,...,Unfamiliar (N/A),I don't understand this question,No,,No,Male,18-29,"$0 - $24,999",High school degree,West North Central
4,3292763000.0,True,True,Star Wars: Episode I The Phantom Menace,Star Wars: Episode II Attack of the Clones,Star Wars: Episode III Revenge of the Sith,Star Wars: Episode IV A New Hope,Star Wars: Episode V The Empire Strikes Back,Star Wars: Episode VI Return of the Jedi,5,...,Very favorably,I don't understand this question,No,,Yes,Male,18-29,"$100,000 - $149,999",Some college or Associate degree,West North Central


We also want to convert the six columns at positions 3 through 8 to Boolean type. In these columns the respondents indicate which Star Wars movies they have seen: if a cell holds the name of the movie, the respondent saw that movie. We can therefore rename the columns to *seen_i* and convert the values to True/False Booleans.

In [6]:
# convert answers from columns 3-9 to boolean
movie_mapping = {
    "Star Wars: Episode I  The Phantom Menace": True,
    "Star Wars: Episode II  Attack of the Clones": True,
    "Star Wars: Episode III  Revenge of the Sith": True,
    "Star Wars: Episode IV  A New Hope": True,
    "Star Wars: Episode V The Empire Strikes Back": True,
    "Star Wars: Episode VI Return of the Jedi": True,
    np.nan: False
}

for col in star_wars.columns[3:9]:
    star_wars[col] = star_wars[col].map(movie_mapping)
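
Because each cell in these columns holds either a movie title or nothing at all, an alternative is to test for missing values directly, which avoids depending on the exact spelling and spacing of every title. This is only a sketch on made-up data (the `toy_seen` frame is hypothetical), not the approach used in this notebook.

```python
import numpy as np
import pandas as pd

# Made-up frame: a title means "seen", NaN means "not seen"
toy_seen = pd.DataFrame({
    "seen_1": ["Star Wars: Episode I  The Phantom Menace", np.nan],
    "seen_2": [np.nan, np.nan],
})

# notna() is True wherever a title is present and False wherever it is missing
toy_seen_bool = toy_seen.notna()
print(toy_seen_bool)
```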

In [7]:
# rename columns
new_names = {"Which of the following Star Wars films have you seen? Please select all that apply.": 
                         "seen_1",
            'Unnamed: 4':'seen_2',
            'Unnamed: 5':'seen_3',
            'Unnamed: 6':'seen_4',
            'Unnamed: 7':'seen_5',
            'Unnamed: 8':'seen_6',
            }
star_wars = star_wars.rename(columns=new_names)

The next six columns ask the respondent to rank the Star Wars movies in order of preference: 1 means the film was the respondent's favorite, and 6 means it was the least favorite. Each of these columns can contain the value 1, 2, 3, 4, 5, 6, or NaN.

We'll convert each column to a numeric type and rename the columns.

In [8]:
star_wars = star_wars.rename(columns={
        "Please rank the Star Wars films in order of preference with 1 being your favorite film in the franchise and 6 being your least favorite film.": "ranking_1",
        "Unnamed: 10": "ranking_2",
        "Unnamed: 11": "ranking_3",
        "Unnamed: 12": "ranking_4",
        "Unnamed: 13": "ranking_5",
        "Unnamed: 14": "ranking_6"
        })

# first rows of the dataframe to make sure the changes are correct
star_wars.head()

Unnamed: 0,RespondentID,Have you seen any of the 6 films in the Star Wars franchise?,Do you consider yourself to be a fan of the Star Wars film franchise?,seen_1,seen_2,seen_3,seen_4,seen_5,seen_6,ranking_1,...,Unnamed: 28,Which character shot first?,Are you familiar with the Expanded Universe?,Do you consider yourself to be a fan of the Expanded Universe?ÂÃ¦,Do you consider yourself to be a fan of the Star Trek franchise?,Gender,Age,Household Income,Education,Location (Census Region)
0,,,,True,True,True,True,True,True,Star Wars: Episode I The Phantom Menace,...,Yoda,Response,Response,Response,Response,Response,Response,Response,Response,Response
1,3292880000.0,True,True,True,True,True,True,True,True,3,...,Very favorably,I don't understand this question,Yes,No,No,Male,18-29,,High school degree,South Atlantic
2,3292880000.0,False,,False,False,False,False,False,False,,...,,,,,Yes,Male,18-29,"$0 - $24,999",Bachelor degree,West South Central
3,3292765000.0,True,False,True,True,True,False,False,False,1,...,Unfamiliar (N/A),I don't understand this question,No,,No,Male,18-29,"$0 - $24,999",High school degree,West North Central
4,3292763000.0,True,True,True,True,True,True,True,True,5,...,Very favorably,I don't understand this question,No,,Yes,Male,18-29,"$100,000 - $149,999",Some college or Associate degree,West North Central


In [9]:
star_wars = star_wars.drop(0)
star_wars[star_wars.columns[9:15]] = star_wars[star_wars.columns[9:15]].astype(float)

print('')
star_wars[star_wars.columns[9:15]].mean()




ranking_1    3.732934
ranking_2    4.087321
ranking_3    4.341317
ranking_4    3.272727
ranking_5    2.513158
ranking_6    3.047847
dtype: float64
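
As an alternative to dropping the label row before calling `astype(float)`, `pd.to_numeric` with `errors="coerce"` turns any non-numeric entry into `NaN`. The sketch below shows the idea on a made-up column (`toy_rank` is hypothetical); it is one possible approach, not the one used above.

```python
import pandas as pd

# Made-up ranking column: a label row, two numeric strings, and a missing value
toy_rank = pd.DataFrame({
    "ranking_1": ["Star Wars: Episode I  The Phantom Menace", "3", None, "1"]
})

# Non-numeric entries become NaN instead of raising an error
toy_rank["ranking_1"] = pd.to_numeric(toy_rank["ranking_1"], errors="coerce")

# mean() skips NaN by default, so only the values 3 and 1 are averaged
toy_mean = toy_rank["ranking_1"].mean()
print(toy_mean)
```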

In [10]:
print('The mean ranking for each movie is:')
star_wars[star_wars.columns[9:15]].mean()


The mean ranking for each movie is:


ranking_1    3.732934
ranking_2    4.087321
ranking_3    4.341317
ranking_4    3.272727
ranking_5    2.513158
ranking_6    3.047847
dtype: float64

In [11]:
import sys
!{sys.executable} -m pip install plotly==5.1.0 

Collecting plotly==5.1.0
  Downloading plotly-5.1.0-py2.py3-none-any.whl (20.6 MB)
     |████████████████████████████████| 20.6 MB 508 kB/s eta 0:00:01
Collecting tenacity>=6.2.0
  Downloading tenacity-8.0.0-py3-none-any.whl (22 kB)
Installing collected packages: tenacity, plotly
  Attempting uninstall: plotly
    Found existing installation: plotly 4.12.0
    Uninstalling plotly-4.12.0:
      Successfully uninstalled plotly-4.12.0
Successfully installed plotly-5.1.0 tenacity-8.0.0
You should consider upgrading via the '/dataquest/system/env/python3/bin/python3 -m pip install --upgrade pip' command.


Now that we have computed the mean rankings, we will plot them in a bar chart to better understand the data. We will use the plotly package, as it creates interactive charts that arguably also look more professional than matplotlib charts.

In [12]:
import plotly.graph_objects as go
import plotly.express as px
import plotly.io as pio


In [18]:
# Bar chart using plotly
fig = px.bar(star_wars[star_wars.columns[9:15]].mean(), 
            title="Mean Ranking of Star Wars Movies",
            template="plotly_white", 
            labels={
                     # the index holds the movies (x axis); the values are the mean rankings (y axis)
                     "index": "Star Wars Movies",
                     "value": "Mean Ranking",
                 },)

fig.update_layout(
    title={
        'text':"<b>Mean Ranking of Star Wars movies, trilogies I and II</b><br>"+
        "Movies have been ranked from most favourite (1st place), to least favourite (6th place)",
        'yanchor':'top',
        'xref':'paper',
        'x':0.5
    },
    
    xaxis = dict(
        tickmode = 'array',
        tickvals = [0, 1, 2, 3, 4, 5],
        ticktext = ['The Phantom Menace', 'Attack of the Clones', 
                    'Revenge of the Sith', 'A New Hope', 
                    'The Empire Strikes Back', 'Return of the Jedi']
    )
)
fig.layout.update(showlegend=False)

fig.show()


## Movie Rankings

It is important to remember that the best ranking a movie can have is 1; therefore, the lower a movie's mean ranking, the better.

The "original" movies are all better regarded. The clear winner is *The Empire Strikes Back*, followed by *Return of the Jedi* and *A New Hope*.
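
Since a lower mean ranking is better, sorting the means in ascending order gives the movies from most to least preferred. The sketch below reuses the mean rankings printed earlier, typed in by hand so that it runs on its own; the variable names are illustrative.

```python
import pandas as pd

# Mean rankings copied from the output above
mean_rankings = pd.Series({
    "ranking_1": 3.732934,
    "ranking_2": 4.087321,
    "ranking_3": 4.341317,
    "ranking_4": 3.272727,
    "ranking_5": 2.513158,
    "ranking_6": 3.047847,
})

# Ascending sort: the first entry is the best-regarded movie
ordered_rankings = mean_rankings.sort_values()
best_column = ordered_rankings.index[0]
print(ordered_rankings)
```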

We will now take a look at the number of respondents that have seen each movie.


In [14]:
star_wars[star_wars.columns[3:9]].sum()

seen_1    673
seen_2    571
seen_3    550
seen_4    607
seen_5    758
seen_6    738
dtype: int64

In [15]:
# Bar chart using plotly
fig = px.bar(star_wars[star_wars.columns[3:9]].sum(), 
            title="Total of respondents that have seen each movie",
            template="plotly_white", 
            labels={
                     # the index holds the movies (x axis); the values are respondent counts (y axis)
                     "index": "Star Wars Movies",
                     "value": "Number of Respondents",
                 },)

fig.update_layout(
    title={
        'text':"<b>Number of respondents that have seen each movie</b><br>",
        'yanchor':'top',
        'xref':'paper',
        'x':0.5
    },
    xaxis = dict(
        tickmode = 'array',
        tickvals = [0, 1, 2, 3, 4, 5],
        ticktext = ['The Phantom Menace', 'Attack of the Clones', 
                    'Revenge of the Sith', 'A New Hope', 
                    'The Empire Strikes Back', 'Return of the Jedi']
    )
)
fig.layout.update(showlegend=False)

fig.show()

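The two highest-ranked movies, *The Empire Strikes Back* and *Return of the Jedi*, are also the two seen by the most respondents, which reinforces the idea that the "original" movies are more widely loved. *A New Hope* is the exception: it ranks third but was seen by fewer respondents than *The Phantom Menace*.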
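
To check how closely viewership tracks ranking, the two outputs above can be placed side by side. The sketch below re-enters the printed numbers by hand and converts each series to a position from 1 (most seen, or best ranked) to 6; the frame and column names are illustrative.

```python
import pandas as pd

episodes = ["I", "II", "III", "IV", "V", "VI"]

# Numbers copied from the outputs above
views = pd.Series([673, 571, 550, 607, 758, 738], index=episodes)
mean_rank = pd.Series(
    [3.732934, 4.087321, 4.341317, 3.272727, 2.513158, 3.047847],
    index=episodes,
)

comparison = pd.DataFrame({
    # position 1 = seen by the most respondents
    "views_position": views.rank(ascending=False).astype(int),
    # position 1 = lowest (best) mean ranking
    "ranking_position": mean_rank.rank().astype(int),
})
print(comparison)
```

The two orderings agree for every movie except Episodes I and IV, which swap third and fourth place.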

We now want to study how differently males and females answered the survey. First though we want to see how many respondents of each gender there are in the survey. 

In [44]:
# label missing gender values as 'Unknown'
star_wars["Gender"] = star_wars["Gender"].fillna('Unknown')
# frequency distribution table of genders
gender_counts = star_wars["Gender"].value_counts(dropna=False)
print('The total number of respondents to the survey is:\n' + str(gender_counts.values.sum()) + ',')
print('and they are distributed by gender as:')
gender_counts

The total number of respondents to the survey is:
1186,
and they are distributed by gender as:


Female     549
Male       497
Unknown    140
Name: Gender, dtype: int64

The pie chart below helps visualize the gender split.

In [37]:
fig = go.Figure(
    data=[
        go.Pie(
        values=gender_counts.values,
        labels=gender_counts.index,
        hoverinfo='label+percent+value',
        ) 
    ],
    layout=go.Layout(
        title={
        'text':"<b>Number of Survey Respondents by Gender</b><br>",
        'yanchor':'top',
        'xref':'paper',
        'x':0.5
    },
    )

)
fig.show()

There is a slight disparity between genders (46.3% female to 41.9% male), and for a full 11.8% of respondents we have no information about gender. Possible explanations are that these respondents were unwilling to share this information or did not identify as either male or female.
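
The percentages quoted above follow directly from the counts in the frequency table. A minimal sketch, with the counts typed in by hand so that it runs on its own (the variable names are illustrative):

```python
import pandas as pd

# Counts copied from the frequency table above
counts_by_gender = pd.Series({"Female": 549, "Male": 497, "Unknown": 140})

# Share of each group as a percentage of all respondents, to one decimal place
share_by_gender = (counts_by_gender / counts_by_gender.sum() * 100).round(1)
print(share_by_gender)
```

On the full DataFrame, `value_counts(normalize=True)` should give the same proportions without the manual division.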

Let's now look at the percentage of males and females that watched each movie. 

In [25]:
# dataframe for each gender
males = star_wars[star_wars["Gender"] == "Male"]
females = star_wars[star_wars["Gender"] == "Female"]

fig = go.Figure(
    data=[
        go.Bar(
            name='females',
            x=females[females.columns[3:9]].sum().index,
            y=females[females.columns[3:9]].sum().values,
            offsetgroup=0,
            
        ),
        go.Bar(
            name='males',
            x=males[males.columns[3:9]].sum().index,
            y=males[males.columns[3:9]].sum().values,
            offsetgroup=1,
#             base=females[females.columns[3:9]].sum().values
        ),
        
    ],
    layout=go.Layout(
        title={
        # this chart shows raw counts; dictionary entries need ':' between key and value
        'text':'<b>Number of respondents that have seen each movie by gender</b>',
        'yanchor':'top',
        'xref':'paper',
        'x':0.5},
        xaxis = dict(
        tickmode = 'array',
        tickvals = [0, 1, 2, 3, 4, 5],
        ticktext = ['The Phantom Menace', 'Attack of the Clones', 
                    'Revenge of the Sith', 'A New Hope', 
                    'The Empire Strikes Back', 'Return of the Jedi']
    ),
        
        
        template="plotly_white",
        xaxis_title='Star Wars Movies',
        yaxis_title='Number of Respondents'
    )
)
fig.show()

In [34]:
fig = go.Figure(
    data=[
        go.Bar(
            name='females',
            x=females[females.columns[3:9]].sum().index,
            y=(females[females.columns[3:9]].sum().values/gender_counts.Female).round(2),
            offsetgroup=0,
            
        ),
        go.Bar(
            name='males',
            x=males[males.columns[3:9]].sum().index,
            y=(males[males.columns[3:9]].sum().values/gender_counts.Male).round(2),
            offsetgroup=1,
#             base=females[females.columns[3:9]].sum().values
        ),
        
    ],
    layout=go.Layout(
        title={
        'text':'<b>Share of respondents that have seen each movie by gender</b>',
        'yanchor':'top',
        'xref':'paper',
        'x':0.5},
        xaxis = dict(
        tickmode = 'array',
        tickvals = [0, 1, 2, 3, 4, 5],
        ticktext = ['The Phantom Menace', 'Attack of the Clones', 
                    'Revenge of the Sith', 'A New Hope', 
                    'The Empire Strikes Back', 'Return of the Jedi']
    ),
        
        
        template="plotly_white",
        xaxis_title='Star Wars Movies',
        yaxis_title='Share of Respondents'
    )
)
fig.show()

Unsurprisingly, the most popular movies (the original trilogy) were watched by a higher percentage of the respondents. Episode 5 is the most watched, at 79% of males and 64% of females; Episode 3 (the last of the six to be released), on the other hand, was watched by only 64% of males and 40% of females.

Episode 1 was by far the most watched episode of the second trilogy. That's understandable: it arrived after a 16-year hiatus and was heavily hyped. However, interest in the two subsequent movies dropped, as they are generally seen as not as good as the original trilogy. Interest died down particularly among females.
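
The share of each gender that has seen a movie can also be computed in one step, because the mean of a Boolean column is the fraction of `True` values. The sketch below uses a made-up frame (`toy_gender` is hypothetical) to show the idea with `groupby`; it is an alternative to building one DataFrame per gender, not the method used above.

```python
import pandas as pd

# Made-up respondents with Boolean "seen" columns
toy_gender = pd.DataFrame({
    "Gender": ["Male", "Male", "Female", "Female"],
    "seen_1": [True, False, True, True],
    "seen_3": [True, True, False, False],
})

# Mean of a Boolean column = fraction of respondents who saw the movie
share_seen = toy_gender.groupby("Gender")[["seen_1", "seen_3"]].mean()
print(share_seen)
```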

## Fans vs Non-Fans

It would be interesting to see how people who consider themselves Star Wars fans rank the movies compared with people who don't, and how large the difference is between the shares of fans and non-fans that have watched each movie.

In [69]:
# individual dataframes for fans and non-fans
fans = star_wars[star_wars['Do you consider yourself to be a fan of the Star Wars film franchise?'] == True]
no_fans = star_wars[star_wars['Do you consider yourself to be a fan of the Star Wars film franchise?'] == False]

# freq distribution table for fans and non-fans
fan_counts = star_wars["Do you consider yourself to be a fan of the Star Wars film franchise?"].value_counts(dropna=False)
print('The total number of fans is: '+str(len(fans))+
      '\nwhile the number of non-fans is: '+str(len(no_fans)))

The total number of fans is: 552
while the number of non-fans is: 284


In [50]:
fig = go.Figure(
    data=[
        go.Bar(
            name='fans',
            x=fans[fans.columns[9:15]].mean().index,
            y=(fans[fans.columns[9:15]].mean().values).round(2),
            offsetgroup=0,
            marker_color='#2ca02c',
            
        ),
        go.Bar(
            name='no_fans',
            x=no_fans[no_fans.columns[9:15]].mean().index,
            y=(no_fans[no_fans.columns[9:15]].mean().values).round(2),
            offsetgroup=1,
            marker_color='#d62728',
#             base=females[females.columns[3:9]].sum().values
        ),
        
    ],
    layout=go.Layout(
        title={
        'text':'<b>Movie Rankings of Fans VS No Fans',
        'yanchor':'top',
        'xref':'paper',
        'x':0.5},
        xaxis = dict(
        tickmode = 'array',
        tickvals = [0, 1, 2, 3, 4, 5],
        ticktext = ['The Phantom Menace', 'Attack of the Clones', 
                    'Revenge of the Sith', 'A New Hope', 
                    'The Empire Strikes Back', 'Return of the Jedi']
    ),
        
        
        template="plotly_white",
        xaxis_title='Star Wars Movies',
        yaxis_title='Mean Ranking'
    )
)
fig.show()

Interestingly, the older trilogy is much more highly regarded among people who consider themselves fans of Star Wars. The newer trilogy is known to have been poorly received within the Star Wars community, and this data reflects that. The highest and lowest rated movies are, however, the same for fans and non-fans: Episode 4 (highest) and Episode 3 (lowest).

The starkest contrasts are in Episodes 1 and 4: Episode 4, the first Star Wars movie to come out, is much better regarded among fans, with a 1-point differential; Episode 1, on the other hand, is better regarded among non-fans, with a 1.2-point differential.
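
The point differentials can be computed instead of being read off the chart by subtracting the fans' mean rankings from the non-fans' mean rankings. The sketch below does this on a made-up frame (`toy_fans_df` and its values are hypothetical); a positive gap means fans rank the movie better, since lower rankings are better.

```python
import pandas as pd

# Made-up respondents: fan status plus rankings for two movies
toy_fans_df = pd.DataFrame({
    "is_fan": [True, True, False, False, None],
    "ranking_1": [5.0, 4.0, 3.0, 2.0, None],
    "ranking_4": [2.0, 3.0, 4.0, 3.0, None],
})
ranking_cols = ["ranking_1", "ranking_4"]

# Split the respondents the same way as in the notebook
toy_fans = toy_fans_df[toy_fans_df["is_fan"] == True]
toy_non_fans = toy_fans_df[toy_fans_df["is_fan"] == False]

# Non-fan mean minus fan mean, per movie
ranking_gap = toy_non_fans[ranking_cols].mean() - toy_fans[ranking_cols].mean()
print(ranking_gap)
```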

True     552
NaN      350
False    284
Name: Do you consider yourself to be a fan of the Star Wars film franchise?, dtype: int64

In [73]:
fig = go.Figure(
    data=[
        go.Bar(
            name='fans',
            x=fans[fans.columns[3:9]].sum().index,
            y=(fans[fans.columns[3:9]].sum().values/len(fans)).round(2),
            offsetgroup=0,
            marker_color='#2ca02c',
            
        ),
        go.Bar(
            name='no_fans',
            x=no_fans[no_fans.columns[3:9]].sum().index,
            y=(no_fans[no_fans.columns[3:9]].sum().values/len(no_fans)).round(2),
            offsetgroup=1,
            marker_color='#d62728',
#             base=females[females.columns[3:9]].sum().values
        ),
        
    ],
    layout=go.Layout(
        title={
        'text':'<b>Fans vs Non-Fans that have seen each movie ',
        'yanchor':'top',
        'xref':'paper',
        'x':0.5},
        xaxis = dict(
        tickmode = 'array',
        tickvals = [0, 1, 2, 3, 4, 5],
        ticktext = ['The Phantom Menace', 'Attack of the Clones', 
                    'Revenge of the Sith', 'A New Hope', 
                    'The Empire Strikes Back', 'Return of the Jedi']
    ),
        
        
        template="plotly_white",
        xaxis_title='Star Wars Movies',
        yaxis_title='Share of Respondents'
    )
)
fig.show()

## Results by Age

Lastly, it will be interesting to see how the respondents rank the movies according to their age bracket.

In [77]:
age_counts = star_wars['Age'].value_counts(dropna=True)
age_counts

45-60    291
> 60     269
30-44    268
18-29    218
Name: Age, dtype: int64

In [85]:
zoomers=star_wars[star_wars['Age']== '18-29']
millenians=star_wars[star_wars['Age']== '30-44']
boomers=star_wars[star_wars['Age']== '45-60']
oldies=star_wars[star_wars['Age']== '> 60']
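
Instead of one DataFrame per age bracket, the mean rankings for all brackets can be obtained in a single table with `groupby`. This is a sketch on made-up data (`toy_age` and its values are hypothetical), shown as an alternative to the four filtered frames above.

```python
import pandas as pd

# Made-up respondents in two of the survey's age brackets
toy_age = pd.DataFrame({
    "Age": ["18-29", "18-29", "> 60", "> 60"],
    "ranking_1": [2.0, 4.0, 5.0, 6.0],
    "ranking_5": [1.0, 3.0, 2.0, 2.0],
})

# One row per age bracket, one column per movie
ranking_by_age = toy_age.groupby("Age")[["ranking_1", "ranking_5"]].mean().round(2)
print(ranking_by_age)
```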

In [87]:
fig = go.Figure(
    data=[
        go.Bar(
            name='18-29',
            x=zoomers[zoomers.columns[9:15]].mean().index,
            y=(zoomers[zoomers.columns[9:15]].mean().values).round(2),
            offsetgroup=0,
            marker_color='#2ca02c',
            
        ),
        go.Bar(
            name='30-44',
            x=millenians[millenians.columns[9:15]].mean().index,
            y=(millenians[millenians.columns[9:15]].mean().values).round(2),
            offsetgroup=1,
            marker_color='#d62728',
        ),
         go.Bar(
            name='45-60',
            x=boomers[boomers.columns[9:15]].mean().index,
            y=(boomers[boomers.columns[9:15]].mean().values).round(2),
            offsetgroup=2,
            marker_color='#e377c2',
        ),
         go.Bar(
            name='> 60',
            x=oldies[oldies.columns[9:15]].mean().index,
            y=(oldies[oldies.columns[9:15]].mean().values).round(2),
            offsetgroup=3,
            marker_color='#17becf',
        ),
        
    ],
    layout=go.Layout(
        title={
        'text':'<b>Movie Rankings by Age',
        'yanchor':'top',
        'xref':'paper',
        'x':0.5},
        xaxis = dict(
        tickmode = 'array',
        tickvals = [0, 1, 2, 3, 4, 5],
        ticktext = ['The Phantom Menace', 'Attack of the Clones', 
                    'Revenge of the Sith', 'A New Hope', 
                    'The Empire Strikes Back', 'Return of the Jedi']
    ),
        
        
        template="plotly_white",
        xaxis_title='Star Wars Movies',
        yaxis_title='Mean Ranking'
    )
)
fig.show()

In [41]:
pio.templates


Templates configuration
-----------------------
    Default template: 'plotly'
    Available templates:
        ['ggplot2', 'seaborn', 'simple_white', 'plotly',
         'plotly_white', 'plotly_dark', 'presentation', 'xgridoff',
         'ygridoff', 'gridon', 'none']