# General Network Analysis

This notebook analyzes the full bipartite actor-movie network, covering its structural properties and the attributes of its nodes.

In [1]:
import json
import networkx as nx

with open("full_graph.json") as f:
    graph_dict = json.load(f)

graph = nx.node_link_graph(graph_dict)

## 1. Basic Network Metrics

Calculate fundamental graph properties:
- Number of nodes (actors and movies)
- Number of edges (actor-movie connections)
- Graph density
- Average degrees for both node types
- Self-loop detection

In [9]:
print("GRAPH METRICS\n")
movie_nodes = {n for n, d in graph.nodes(data=True) if d["bipartite"] == 0}
actor_nodes = {n for n, d in graph.nodes(data=True) if d["bipartite"] == 1}
print("Number of actor nodes:", len(actor_nodes))
print("Number of movie nodes:", len(movie_nodes))

print("Number of nodes:", graph.number_of_nodes())
print("Number of edges:", graph.number_of_edges())
print("Graph density:", nx.bipartite.density(graph, list(actor_nodes)))
# Every edge joins one actor to one movie, so the mean degree of each node set
# is the edge count divided by the size of that set.
mean_degree_movie = graph.number_of_edges() / len(movie_nodes)
mean_degree_actor = graph.number_of_edges() / len(actor_nodes)

print(f"Average number of actors per movie (mean degree of movie nodes): {mean_degree_movie}")
print(f"Average number of movies per actor (mean degree of actor nodes): {mean_degree_actor}")

# Compute the self-loop list once and reuse it.
self_loops = list(nx.selfloop_edges(graph))
print("Is there any self loop?", len(self_loops))
if self_loops:
    print(self_loops)

GRAPH METRICS

Number of actor nodes: 202747
Number of movie nodes: 45456
Number of nodes: 248203
Number of edges: 560987
Graph density: 6.087053854207219e-05
Average number of actors per movie (mean degree of movie nodes): 12.34131907778951
Average number of movies per actor (mean degree of actor nodes): 2.7669311999684334
Is there any self loop? 0
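
As a sanity check on the figures above, the bipartite density and the two mean degrees can be reproduced by hand on a small graph. The sketch below uses a made-up toy graph (the node names `m1`, `a1`, and so on are hypothetical) and assumes the same `bipartite` attribute convention as the full graph: 0 for movies, 1 for actors.

```python
import networkx as nx
from networkx.algorithms import bipartite

# Toy bipartite graph: 2 movies (bipartite=0) and 3 actors (bipartite=1).
toy = nx.Graph()
toy.add_nodes_from(["m1", "m2"], bipartite=0)
toy.add_nodes_from(["a1", "a2", "a3"], bipartite=1)
toy.add_edges_from([("m1", "a1"), ("m1", "a2"), ("m2", "a2"), ("m2", "a3")])

movies = {n for n, d in toy.nodes(data=True) if d["bipartite"] == 0}
actors = set(toy) - movies

# Bipartite density: edges divided by the number of possible actor-movie pairs.
manual_density = toy.number_of_edges() / (len(actors) * len(movies))
library_density = bipartite.density(toy, actors)

# Each edge has exactly one endpoint in each set, so the mean degree of a set
# is the edge count divided by the size of that set.
mean_degree_movies = toy.number_of_edges() / len(movies)
mean_degree_actors = toy.number_of_edges() / len(actors)

print(manual_density, library_density)
print(mean_degree_movies, mean_degree_actors)
```

Applied to the full graph, the same formula gives 560987 / (202747 × 45456), which agrees with the reported density of about 6.09e-05.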


## 2. Demographic and Temporal Analysis

Examine the distribution of actors across different attributes:
- **Gender distribution**: Breakdown of male and female actors
- **Temporal periods**: Classification into Early Hollywood (pre-1927), Old Hollywood (1927-1967), and New Hollywood (1967-present)
- **Award recognition**: Percentage of Oscar-nominated actors

In [None]:
print("Nodes by attributes")

# Actors are the nodes with bipartite == 1. Every actor not labelled "female"
# is counted as a man, including any actor whose gender attribute is missing.
women = sum(1 for n, d in graph.nodes(data=True) if d.get("bipartite") == 1 and d.get("gender") == "female")
men = len(actor_nodes) - women
print(f"""In the projection of the Hollywood graph, there are: 
      {women} women ({women / len(actor_nodes) * 100}%),
      {men} men ({men / len(actor_nodes) * 100}%)""")

# Early Hollywood is the remainder after the two labelled periods, so actors
# with a missing or unexpected period value are counted there as well.
new_hollywood = sum(1 for n, d in graph.nodes(data=True) if d.get("period") == "new_hollywood" and d.get("bipartite") == 1)
old_hollywood = sum(1 for n, d in graph.nodes(data=True) if d.get("period") == "old_hollywood" and d.get("bipartite") == 1)
early_hollywood = len(actor_nodes) - new_hollywood - old_hollywood
print(f"""With respect to the main activity period, there are
      {early_hollywood} actors ({early_hollywood / len(actor_nodes) * 100}%) from the early Hollywood period (until 1927)
      {old_hollywood} actors ({old_hollywood / len(actor_nodes) * 100}%) from Old Hollywood (from 1927 to 1967)
      {new_hollywood} actors ({new_hollywood / len(actor_nodes) * 100}%) from New Hollywood (from 1967 to the present day)""")

# Multiply the share by 100 so the printed value matches the "%" sign.
oscar_nominated = sum(1 for n, d in graph.nodes(data=True) if d.get("oscar_nomination") == True and d.get("bipartite") == 1)
print(f"Within the sample, {oscar_nominated} individuals ({oscar_nominated / len(actor_nodes) * 100}%) have received at least an Oscar nomination.")


Nodes by attributes
In the projection of the Hollywood graph, there are: 
      165199 women (81.48%),
      37548 men (18.52%)
With respect to the main activity period, there are
      1938 actors (0.9558711103000291%) from the early Hollywood period (until 1927)
      23553 actors (11.61694131109215%) from Old Hollywood (from 1927 to 1967)
      177256 actors (87.42718757860783%) from New Hollywood (from 1967 to the present day)
Within the sample, 0 individuals (0.0%) have received at least an Oscar nomination.
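
The cell above obtains the counts for men and for early Hollywood by subtraction, so any actor with a missing or unexpected attribute value ends up in those buckets. One way to check this is to tally every value explicitly, as in the sketch below. It uses a made-up toy graph; the `tally` helper is hypothetical, and only the attribute names `gender`, `period`, and `bipartite` come from the notebook.

```python
from collections import Counter

import networkx as nx

# Toy graph: one movie and three actors, one of them without attributes.
toy = nx.Graph()
toy.add_node("m1", bipartite=0)
toy.add_node("a1", bipartite=1, gender="female", period="new_hollywood")
toy.add_node("a2", bipartite=1, gender="male", period="old_hollywood")
toy.add_node("a3", bipartite=1)  # no gender or period recorded


def tally(g, attribute):
    """Count attribute values over actor nodes; missing values map to None."""
    return Counter(
        d.get(attribute)
        for _, d in g.nodes(data=True)
        if d.get("bipartite") == 1
    )


gender_counts = tally(toy, "gender")
period_counts = tally(toy, "period")
print(gender_counts)
print(period_counts)
```

A non-zero count under `None` would mean that some actors carry no label for that attribute, in which case the subtraction-based figures overstate the remainder category.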


## 3. Degree Centrality - Top Movies and Actors

Degree centrality measures the number of direct connections a node has in the network. In the bipartite graph every edge links an actor to a movie, so an actor's degree is the number of movies they appear in and a movie's degree is the size of its cast. `nx.bipartite.degree_centrality` divides each degree by the number of nodes in the opposite set, which is the largest degree the node could have.

**What high degree centrality indicates:**
- For an actor: a prolific career spanning many productions
- For a movie: a large credited cast
- In both cases: high visibility in the network

The following lists show the movies and the actors with the **highest degree centrality**. Counting the distinct actors someone has worked with would instead require the degree in the actor-actor projection of this graph.
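
To make the two notions of "connections" concrete, the sketch below contrasts bipartite degree centrality (movies per actor, normalized by the number of movies) with the degree in the actor-actor projection (distinct co-actors). The toy graph and its node names are made up for illustration; `bipartite.degree_centrality` and `bipartite.projected_graph` are the NetworkX functions used.

```python
import networkx as nx
from networkx.algorithms import bipartite

# Toy bipartite graph: a2 appears in both movies, a1 and a3 in one each.
toy = nx.Graph()
toy.add_nodes_from(["m1", "m2"], bipartite=0)
toy.add_nodes_from(["a1", "a2", "a3"], bipartite=1)
toy.add_edges_from([("m1", "a1"), ("m1", "a2"), ("m2", "a2"), ("m2", "a3")])

movies = {"m1", "m2"}
actors = {"a1", "a2", "a3"}

# Bipartite degree centrality: degree divided by the size of the opposite set.
centrality = bipartite.degree_centrality(toy, movies)

# Actor-actor projection: two actors are linked if they share at least one movie.
collab = bipartite.projected_graph(toy, actors)
co_actors = dict(collab.degree())

print(centrality["a2"])  # share of all movies in which a2 appears
print(co_actors["a2"])   # number of distinct actors a2 has shared a movie with
```

On the full graph the projection can be far denser than the bipartite graph, because a movie with a cast of k actors contributes up to k(k-1)/2 actor-actor edges.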

In [4]:

print("DEGREE CENTRALITY\n")
# In the bipartite graph, an actor's degree is their number of movies and a
# movie's degree is its cast size.
actor_deg = dict(graph.degree(actor_nodes))
movie_deg = dict(graph.degree(movie_nodes))
# Bipartite degree centrality: degree divided by the size of the opposite node set.
centr = list(nx.bipartite.degree_centrality(graph, nodes=movie_nodes).items())
# Node IDs start with "actor" or "movie"; split the results by that prefix.
centr_actors = [tup for tup in centr if tup[0].startswith("actor")]
centr_movies = [tup for tup in centr if tup[0].startswith("movie")]
centr_movies = sorted(centr_movies, key=lambda x: x[1], reverse=True)
centr_actors = sorted(centr_actors, key=lambda x: x[1], reverse=True)
# Print the ten most central movies, then the ten most central actors.
for movie in centr_movies[:10]:
    print(movie[0][6:], "-", "centrality :", round(movie[1], 6), ", cast :", movie_deg[movie[0]], "actors\n")
for actor in centr_actors[:10]:
    print(actor[0][6:], "-", "centrality :", round(actor[1], 6), ", appearances :", actor_deg[actor[0]], "movies\n")

DEGREE CENTRALITY

 Around the World in Eighty Days - centrality : 0.001539 , cast : 312 actors

 Rock of Ages - centrality : 0.001105 , cast : 224 actors

 Mr. Smith Goes to Washington - centrality : 0.001051 , cast : 213 actors

 Jason Bourne - centrality : 0.001026 , cast : 208 actors

 Les Misérables (2012) - centrality : 0.001026 , cast : 208 actors

 Union Pacific - centrality : 0.000908 , cast : 184 actors

 You Don't Mess with the Zohan - centrality : 0.000903 , cast : 183 actors

 Real Steel - centrality : 0.000848 , cast : 172 actors

 Mr. Deeds Goes to Town - centrality : 0.000843 , cast : 171 actors

 Star Trek - centrality : 0.000829 , cast : 168 actors

 Bess Flowers - centrality : 0.00528 , appearances : 240 movies

 Christopher Lee - centrality : 0.003256 , appearances : 148 movies

 John Wayne - centrality : 0.00275 , appearances : 125 movies

 Samuel L. Jackson - centrality : 0.002706 , appearances : 123 movies

 Gérard Depardieu - centrality : 0.00242 , appearances :

In [10]:
print("Is the graph connected?", nx.is_connected(graph))
print("Connected components:", nx.number_connected_components(graph))

Is the graph connected? False
Connected components: 3800
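
Since the graph is disconnected, a natural follow-up is to look at how the nodes are distributed across components. The sketch below shows one way to list component sizes and extract the largest component with `nx.connected_components`. It runs on a made-up toy graph, and the variable names are hypothetical.

```python
import networkx as nx

# Toy graph with three components of sizes 4, 2 and 1.
toy = nx.Graph()
toy.add_edges_from([("m1", "a1"), ("m1", "a2"), ("m2", "a2"), ("m3", "a3")])
toy.add_node("a4")  # an isolated node forms its own component

# Sort the components from largest to smallest.
components = sorted(nx.connected_components(toy), key=len, reverse=True)
sizes = [len(c) for c in components]

# Extract the largest component as an independent graph.
giant = toy.subgraph(components[0]).copy()
share = giant.number_of_nodes() / toy.number_of_nodes()

print(sizes)
print(share)
```

The share of nodes in the largest component indicates whether the remaining components are a small fringe or a substantial part of the network.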
