# Hollywood-Focused Network Analysis

This notebook analyzes a filtered subgraph focusing specifically on Hollywood productions and associated actors.

In [1]:
import json
import networkx as nx

with open("hollywood_graph.json") as f:
    graph_dict = json.load(f)

graph = nx.node_link_graph(graph_dict)

## 1. Connectivity Analysis

Examine whether the Hollywood network forms a connected whole or consists of separate communities.

In [2]:
print("Is the graph connected?", nx.is_connected(graph))
print("Connected components:", nx.number_connected_components(graph))

Is the graph connected? False
Connected components: 943
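
With 943 components, the yes/no answer says little about how the nodes are distributed among them. The following is a minimal sketch of one way to inspect component sizes. It runs on a hypothetical toy graph (`toy` and its node names are illustrative, not taken from the dataset); on the real data the same calls would presumably be applied to `graph`.

```python
import networkx as nx

# Hypothetical toy stand-in for the Hollywood graph (not real data):
# three components with 4, 2, and 1 nodes.
toy = nx.Graph()
toy.add_edges_from([
    ("movie_A", "actor_1"),
    ("movie_A", "actor_2"),
    ("movie_B", "actor_2"),
    ("movie_C", "actor_3"),
])
toy.add_node("actor_4")  # isolated node forms its own component

# Size of every connected component, largest first.
component_sizes = sorted(
    (len(component) for component in nx.connected_components(toy)),
    reverse=True,
)

# Share of all nodes that sit in the largest component.
largest_share = component_sizes[0] / toy.number_of_nodes()

print("Component sizes:", component_sizes)
print(f"Share of nodes in the largest component: {largest_share:.1%}")
```

If one component holds most of the nodes, path-based measures can reasonably be restricted to it; if sizes are spread evenly, the network is better described as many separate communities.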


In [10]:
print("GRAPH METRICS - HOLLYWOOD\n")
print("Number of nodes:", graph.number_of_nodes())
print("Number of edges:", graph.number_of_edges())

# Bipartite convention used in this graph: 0 = movie nodes, 1 = actor nodes.
movie_nodes = {n for n, d in graph.nodes(data=True) if d.get("bipartite") == 0}
actor_nodes = {n for n, d in graph.nodes(data=True) if d.get("bipartite") == 1}

print("Number of actor nodes:", len(actor_nodes))
print("Number of movie nodes:", len(movie_nodes))
print("Graph density:", nx.bipartite.density(graph, actor_nodes))

# Every edge joins exactly one movie and one actor, so the mean degree of
# each side is the edge count divided by the number of nodes on that side.
mean_degree_movie = graph.size() / len(movie_nodes)
mean_degree_actor = graph.size() / len(actor_nodes)

print(f"Average number of actors per movie (mean degree of movie nodes): {mean_degree_movie}")
print(f"Average number of movies per actor (mean degree of actor nodes): {mean_degree_actor}")

# Compute the self-loop list once and reuse it.
self_loops = list(nx.selfloop_edges(graph))
print("Number of self-loops:", len(self_loops))
if self_loops:
    print(self_loops)

GRAPH METRICS - HOLLYWOOD

Number of nodes: 132612
Number of edges: 322774
Number of actor nodes: 111458
Number of movie nodes: 21154
Graph density: 0.00013689727344201054
Average number of actors per movie (mean degree of movie nodes): 15.258296303299613
Average number of movies per actor (mean degree of actor nodes): 2.8959249223922914
Number of self-loops: 0


## 2. Hollywood Network Metrics

Calculate key structural properties of the Hollywood subgraph:
- Network size (nodes and edges)
- Density of connections
- Average cast size per movie
- Average number of films per actor

Compare these metrics with the full network to understand Hollywood's specific characteristics.
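
As a sanity check, the density and mean-degree figures printed by the metrics cell can be reproduced from the node and edge counts alone. The sketch below assumes the usual bipartite definitions (density as edges over all possible actor-movie pairs, mean degree as edges over the size of one side); the variable names are illustrative.

```python
# Counts copied from the metrics output of this notebook.
n_edges = 322774
n_actors = 111458
n_movies = 21154

# Bipartite density: observed edges over all possible actor-movie pairs.
density = n_edges / (n_actors * n_movies)

# Each edge touches exactly one movie and one actor, so the mean degree of
# one side is the edge count divided by the number of nodes on that side.
mean_cast_size = n_edges / n_movies
mean_films_per_actor = n_edges / n_actors

print(f"Density: {density:.6e}")
print(f"Mean cast size per movie: {mean_cast_size:.4f}")
print(f"Mean films per actor: {mean_films_per_actor:.4f}")
```

The density is very low because each actor appears in only a handful of the roughly 21,000 movies, which is typical of large affiliation networks.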

In [13]:
print("Nodes by attributes")

# Only "female" is matched explicitly; `men` is the remainder, so actor nodes
# with a missing or different gender value are counted as men.
women = sum(1 for n, d in graph.nodes(data=True) if d.get("bipartite") == 1 and d.get("gender") == "female")
men = len(actor_nodes) - women
print(f"""Among the actor nodes of the Hollywood graph, there are: 
      {women} women ({women / len(actor_nodes) * 100}%),
      {men} men ({men / len(actor_nodes) * 100}%)""")

# Same pattern for periods: `early_hollywood` is the remainder after the two
# explicitly matched periods, so it also absorbs any missing period values.
new_hollywood = sum(1 for n, d in graph.nodes(data=True) if d.get("period") == "new_hollywood" and d.get("bipartite") == 1)
old_hollywood = sum(1 for n, d in graph.nodes(data=True) if d.get("period") == "old_hollywood" and d.get("bipartite") == 1)
early_hollywood = len(actor_nodes) - new_hollywood - old_hollywood
print(f"""With respect to the main activity period, there are
      {early_hollywood} actors ({early_hollywood / len(actor_nodes) * 100}%) from the early Hollywood period (until 1927)
      {old_hollywood} actors ({old_hollywood / len(actor_nodes) * 100}%) from Old Hollywood (from 1927 to 1967)
      {new_hollywood} actors ({new_hollywood / len(actor_nodes) * 100}%) from New Hollywood (from 1967 to the present day)""")

oscar_nominated = sum(1 for n, d in graph.nodes(data=True) if d.get("oscar_nomination") == True and d.get("bipartite") == 1)
print(f"Within the sample, {oscar_nominated} individuals ({oscar_nominated / len(actor_nodes) * 100}%) have received at least one Oscar nomination.")

Nodes by attributes
Among the actor nodes of the Hollywood graph, there are: 
      84782 women (76.06632094600657%),
      26676 men (23.93367905399343%)
With respect to the main activity period, there are
      1188 actors (1.0658723465341204%) from the early Hollywood period (until 1927)
      13879 actors (12.452224156184393%) from Old Hollywood (from 1927 to 1967)
      96391 actors (86.4819034972815%) from New Hollywood (from 1967 to the present day)
Within the sample, 946 individuals (0.8487502018697626%) have received at least one Oscar nomination.
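
The cell above obtains the men and early-Hollywood counts as remainders, so any actor node with a missing or unexpected attribute value is folded into those groups. A minimal sketch of an alternative tally that keeps such values visible follows. `tally_attribute`, the `"missing"` label, the `"male"` value, and the toy data are hypothetical and only shaped like the output of `graph.nodes(data=True)`.

```python
from collections import Counter


def tally_attribute(nodes_with_data, attribute, missing_label="missing"):
    """Count actor nodes by the value of `attribute`.

    Nodes lacking the attribute are counted under `missing_label`
    instead of being absorbed into another category.
    """
    return Counter(
        data.get(attribute, missing_label)
        for _, data in nodes_with_data
        if data.get("bipartite") == 1  # actor nodes only
    )


# Hypothetical toy data in the same (node, attribute dict) shape
# that graph.nodes(data=True) yields; not taken from the dataset.
toy_nodes = [
    ("actor_1", {"bipartite": 1, "gender": "female"}),
    ("actor_2", {"bipartite": 1, "gender": "male"}),
    ("actor_3", {"bipartite": 1}),  # gender not recorded
    ("movie_A", {"bipartite": 0}),  # movie node, ignored
]

gender_counts = tally_attribute(toy_nodes, "gender")
print(gender_counts)
```

On the real graph, a call such as `tally_attribute(graph.nodes(data=True), "gender")` would show whether the remainder groups contain unlabeled nodes.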


## 3. Degree Centrality - Top Actors

Degree centrality measures the number of direct connections a node has in the network. In this bipartite graph, actors connect only to movies, so an actor's degree is the number of movies they appear in and a movie's degree is its cast size. `nx.bipartite.degree_centrality` divides each degree by the size of the opposite node set: an actor's score is the fraction of all movies in which they appear, and a movie's score is the fraction of all actors in its cast.

**What high degree centrality indicates:**
- For an actor: a prolific career with appearances in many productions
- For a movie: an unusually large credited cast

The following lists show the ten movies and the ten actors with the **highest degree centrality**. These scores count appearances, not distinct co-stars; counting the number of distinct collaborators would require the actor-actor projection of the graph.
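
As a worked check on the scores reported below, this sketch assumes that bipartite degree centrality is a node's degree divided by the size of the opposite node set, which is the normalization `nx.bipartite.degree_centrality` applies. The helper name is hypothetical; the counts are copied from this notebook's output.

```python
# Node-set sizes copied from the metrics output of this notebook.
n_movies = 21154
n_actors = 111458


def bipartite_degree_centrality(degree, opposite_set_size):
    """Degree normalized by the number of nodes on the other side."""
    return degree / opposite_set_size


# An actor's degree is a number of movies, so it is divided by n_movies.
bess_flowers = bipartite_degree_centrality(228, n_movies)

# A movie's degree is a cast size, so it is divided by n_actors.
around_the_world = bipartite_degree_centrality(312, n_actors)

print("Bess Flowers:", round(bess_flowers, 6))
print("Around the World in Eighty Days:", round(around_the_world, 6))
```

Because the two node sets are normalized by different denominators, actor scores and movie scores are not directly comparable with each other.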

In [7]:
movie_nodes = {n for n, d in graph.nodes(data=True) if d["bipartite"] == 0}
actor_nodes = set(graph) - movie_nodes
print("DEGREE CENTRALITY\n")

# bipartite.degrees(B, nodes) returns a pair: the degrees of the nodes NOT in
# `nodes` first, then the degrees of `nodes`. Index 0 is therefore the
# complementary set.
actor_deg = dict(nx.bipartite.degrees(graph, movie_nodes)[0])
movie_deg = dict(nx.bipartite.degrees(graph, actor_nodes)[0])

# Each degree is normalized by the size of the opposite node set.
centr = nx.bipartite.degree_centrality(graph, nodes=movie_nodes)

# Node IDs start with a type prefix ("actor" or "movie"); split on it and
# rank each group by centrality, highest first.
centr_movies = sorted(((n, c) for n, c in centr.items() if n[:5] == "movie"), key=lambda x: x[1], reverse=True)
centr_actors = sorted(((n, c) for n, c in centr.items() if n[:5] == "actor"), key=lambda x: x[1], reverse=True)

for movie, score in centr_movies[:10]:
    print(movie[6:], "-", "centrality :", round(score, 6), ", cast :", movie_deg[movie], "actors\n")
for actor, score in centr_actors[:10]:
    print(actor[6:], "-", "centrality :", round(score, 6), ", appearances :", actor_deg[actor], "movies\n")

DEGREE CENTRALITY

 Around the World in Eighty Days - centrality : 0.002799 , cast : 312 actors

 Rock of Ages - centrality : 0.00201 , cast : 224 actors

 Mr. Smith Goes to Washington - centrality : 0.001911 , cast : 213 actors

 Les Misérables (2012) - centrality : 0.001866 , cast : 208 actors

 Jason Bourne - centrality : 0.001866 , cast : 208 actors

 Union Pacific - centrality : 0.001651 , cast : 184 actors

 You Don't Mess with the Zohan - centrality : 0.001642 , cast : 183 actors

 Real Steel - centrality : 0.001543 , cast : 172 actors

 Mr. Deeds Goes to Town - centrality : 0.001534 , cast : 171 actors

 Star Trek - centrality : 0.001507 , cast : 168 actors

 Bess Flowers - centrality : 0.010778 , appearances : 228 movies

 John Wayne - centrality : 0.005767 , appearances : 122 movies

 Samuel L. Jackson - centrality : 0.005436 , appearances : 115 movies

 John Carradine - centrality : 0.004869 , appearances : 103 movies

 Irving Bacon - centrality : 0.004396 , appearances : 9