# Bacon Number

[Jian Tao](https://coehpc.engr.tamu.edu/people/jian-tao/), Texas A&M University and [Enrique Z. Losoya](https://orcid.org/0000-0001-7763-3349), Texas A&M University. 

January 3, 2023.

The [Bacon number](https://en.wikipedia.org/wiki/Six_Degrees_of_Kevin_Bacon#Bacon_numbers) of an actor or actress is the number of degrees of separation (see [Six degrees of separation](https://en.wikipedia.org/wiki/Six_degrees_of_separation)) between them and actor Kevin Bacon, as defined by the game known as [Six Degrees of Kevin Bacon](https://en.wikipedia.org/wiki/Six_Degrees_of_Kevin_Bacon). The higher the Bacon number, the farther the actor is from Kevin Bacon.

For example, Kevin Bacon's Bacon number is 0. If an actor works in a movie with Kevin Bacon, the actor's Bacon number is 1. If an actor works with an actor who worked with Kevin Bacon in a movie, the first actor's Bacon number is 2, and so forth.
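The definition above amounts to a breadth-first search outward from Kevin Bacon. The following is a minimal sketch on a made-up co-star network; the names `Actor A` through `Actor D`, the `costars` dictionary, and the `bacon_numbers` helper are hypothetical and exist only for illustration.

```python
from collections import deque

# Hypothetical co-star relationships (not taken from the real data set).
costars = {
    "Kevin Bacon": ["Actor A", "Actor B"],
    "Actor A": ["Kevin Bacon", "Actor C"],
    "Actor B": ["Kevin Bacon"],
    "Actor C": ["Actor A", "Actor D"],
    "Actor D": ["Actor C"],
}

def bacon_numbers(graph, source="Kevin Bacon"):
    """Breadth-first search: distance of every reachable actor from source."""
    distance = {source: 0}
    queue = deque([source])
    while queue:
        actor = queue.popleft()
        for neighbor in graph[actor]:
            if neighbor not in distance:
                # First visit is always along a shortest path in BFS.
                distance[neighbor] = distance[actor] + 1
                queue.append(neighbor)
    return distance

toy_numbers = bacon_numbers(costars)
print(toy_numbers)
```

In this toy network, `Actor A` and `Actor B` get Bacon number 1, `Actor C` gets 2, and `Actor D` gets 3.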

Use the file Movie_Data.txt in the repository to:

1. Construct a graph with pandas and NetworkX;
2. Implement a function to find the Bacon number of an arbitrary actor/actress;
3. Find the Bacon number of Bruce Lee and Elizabeth Taylor, or of your favorite actor/actress, using your function from step 2.

The movie data was downloaded and uncompressed from https://oracleofbacon.org/data.txt.bz2. It was collected with a Ruby script by Patrick Reynolds, available at https://github.com/piki/wikipedia-film-database.

In [1]:
import pandas as pd
import numpy as np
import networkx as nx
import matplotlib.pyplot as plt
from itertools import combinations

Read in the movie data and explore the content.

In [2]:
df = pd.read_json('https://raw.githubusercontent.com/e-zl/dswebinar/master/networkx/case2.2/Movie_Data.txt', lines = True)

In [3]:
df.head(5)

| | title | cast | directors | producers | companies | year |
|---|---|---|---|---|---|---|
| 0 | Actrius | [Núria Espert, Rosa Maria Sardà, Anna Lizaran,... | [Ventura Pons] | [Ventura Pons] | [Canal+ España, Els Films de la Rambla S.A., G... | 1997.0 |
| 1 | Army of Darkness | [Bruce Campbell, Embeth Davidtz, Marcus Gilber... | [Sam Raimi] | [Robert Tapert] | [Dino De Laurentiis Communications, Renaissanc... | 1992.0 |
| 2 | The Birth of a Nation | [Lillian Gish, Mae Marsh, Henry B. Walthall, M... | [D. W. Griffith] | [D. W. Griffith, Harry Aitken] | [David W. Griffith Corp., Epoch Producing Co.] | 1915.0 |
| 3 | Blade Runner | [Harrison Ford, Rutger Hauer, Sean Young, Edwa... | [Ridley Scott] | [Michael Deeley] | [The Ladd Company, Shaw Brothers, Blade Runner... | 1982.0 |
| 4 | Blazing Saddles | [Cleavon Little, Gene Wilder, Harvey Korman, S... | [Mel Brooks] | [Michael Hertzberg] | [Warner Bros.] | 1974.0 |


In [4]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 164318 entries, 0 to 164317
Data columns (total 6 columns):
 #   Column     Non-Null Count   Dtype  
---  ------     --------------   -----  
 0   title      164318 non-null  object 
 1   cast       164318 non-null  object 
 2   directors  122964 non-null  object 
 3   producers  96024 non-null   object 
 4   companies  94051 non-null   object 
 5   year       121881 non-null  float64
dtypes: float64(1), object(5)
memory usage: 7.5+ MB


List all the movies in which Bruce Lee appeared.

In [None]:
# Print the title of every movie whose cast list includes Bruce Lee.
for title, cast in zip(df['title'], df['cast']):
    if "Bruce Lee" in cast:
        print(title)
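The same lookup can also be expressed as a pandas boolean mask, which returns the titles as a list instead of printing them. This is only a sketch on a hypothetical three-row table; `toy_df` and the `Movie X`/`Actor A` names are invented stand-ins for the real `df`.

```python
import pandas as pd

# Hypothetical miniature stand-in for the movie table.
toy_df = pd.DataFrame({
    "title": ["Movie X", "Movie Y", "Movie Z"],
    "cast": [["Actor A", "Actor B"], ["Actor B", "Actor C"], ["Actor A"]],
})

# True for every row whose cast list contains the actor.
mask = toy_df["cast"].apply(lambda cast: "Actor A" in cast)

# Select the matching titles.
titles = toy_df.loc[mask, "title"].tolist()
print(titles)
```

Applied to the real table, the mask would be built from `df["cast"]` and the name `"Bruce Lee"`.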

To get the Bacon Number, we first build a graph that links actors/actresses through the movies they share. Each actor/actress name is a node, and two nodes are joined by an edge if the two performers appear in the same movie.

In [6]:
G = nx.Graph()
for cast in df['cast']:
    # Connect every pair of actors who appear in this movie.
    G.add_edges_from(combinations(cast, 2))
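To see what the loop above builds, here is a small sketch with two hypothetical cast lists (`toy_casts` and the actor names are made up). `combinations(cast, 2)` yields every unordered pair in a cast, so a three-person cast contributes three edges and a two-person cast contributes one.

```python
from itertools import combinations
import networkx as nx

# Two hypothetical movies sharing one actor ("Actor C").
toy_casts = [["Actor A", "Actor B", "Actor C"], ["Actor C", "Actor D"]]

toy_G = nx.Graph()
for cast in toy_casts:
    # Each pair of cast members becomes one undirected edge.
    toy_G.add_edges_from(combinations(cast, 2))

# Normalize edge orientation before printing, since the graph is undirected.
toy_edges = sorted(tuple(sorted(edge)) for edge in toy_G.edges())
print(toy_edges)
print(toy_G.number_of_nodes(), toy_G.number_of_edges())
```

Note that a cast with a single member produces no pairs, so an actor who appears only in such entries is never added to the graph as a node.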

Define a function to find the Bacon Number of an actor/actress.

In [7]:
def Bacon_Number(Actor_Name):
    """Print and return the Bacon number of Actor_Name, plus every shortest path from Kevin Bacon."""
    # The shortest-path length is the number of co-starring links to Kevin Bacon.
    bcn_num = nx.shortest_path_length(G, 'Kevin Bacon', Actor_Name)
    print("Bacon Number of %s is %d" % (Actor_Name, bcn_num))
    # Print every path that achieves this minimum length.
    for sp in nx.all_shortest_paths(G, 'Kevin Bacon', Actor_Name):
        print(sp)
    return bcn_num
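`nx.shortest_path_length` raises an exception when a name is not in the graph (for example, a misspelling) or when no path connects the two actors. A possible defensive variant is sketched below; `safe_bacon_number` and the toy graph are hypothetical and not part of the original notebook.

```python
import networkx as nx

def safe_bacon_number(graph, actor_name, source="Kevin Bacon"):
    """Return the distance from source to actor_name, or None if it is undefined."""
    if source not in graph or actor_name not in graph:
        return None  # name absent from the graph (e.g. misspelled)
    try:
        return nx.shortest_path_length(graph, source, actor_name)
    except nx.NetworkXNoPath:
        return None  # both names exist but lie in different connected components

# Hypothetical graph with two disconnected components.
toy = nx.Graph()
toy.add_edges_from([
    ("Kevin Bacon", "Actor A"),
    ("Actor A", "Actor B"),
    ("Actor X", "Actor Y"),
])

reachable = safe_bacon_number(toy, "Actor B")
disconnected = safe_bacon_number(toy, "Actor X")
missing = safe_bacon_number(toy, "Nobody")
print(reachable, disconnected, missing)
```

Returning `None` rather than raising is one design choice among several; raising a descriptive error may be preferable when a missing name should stop the analysis.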

Let's find the Bacon Number of your favorite actor/actress!

In [None]:
Bacon_Number('Bruce Lee')
# Determine the Bacon Number of Bruce Lee

In [None]:
Bacon_Number('Elizabeth Taylor')
# Determine the Bacon Number of Elizabeth Taylor