**The Academy Awards**, also known as the **Oscars**, are awards given annually by the Academy of Motion Picture Arts and Sciences (AMPAS) in the United States to recognize excellence in cinematic achievement. They are considered the most prestigious awards in the film industry. The awards were first presented in 1929, and the winners are chosen by the Academy's voting membership.

To be eligible, a film has to be exhibited in a theater in Los Angeles for at least a week. Foreign-language movies are the exception: they are submitted by their country of origin and need not be shown in the US.

# **Objective:**
## **To understand the trends and patterns of winning categories in Oscar-nominated films over the past 96 years.**

##1. Importing Libraries & accessing the data

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px

In [2]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [3]:
oscar=pd.read_csv('/content/drive/MyDrive/OSCARS.csv')
oscar.head()

Unnamed: 0,year_film,year_ceremony,ceremony,category,name,film,winner
0,1927,1928,1,ACTOR,Richard Barthelmess,The Noose,False
1,1927,1928,1,ACTOR,Emil Jannings,The Last Command,True
2,1927,1928,1,ACTRESS,Louise Dresser,A Ship Comes In,False
3,1927,1928,1,ACTRESS,Janet Gaynor,7th Heaven,True
4,1927,1928,1,ACTRESS,Gloria Swanson,Sadie Thompson,False


In [4]:
oscar.shape

(10889, 7)

In [5]:
oscar.isnull().sum()

year_film          0
year_ceremony      0
ceremony           0
category           0
name               5
film             319
winner             0
dtype: int64

##2. Data Exploration & Analysis

#1. How many unique films, categories, and names are there in the dataset?

In [6]:
#How many unique films?
len(oscar['film'].unique())

5042

In [7]:
#How many unique categories?
len(oscar['category'].unique())

115

In [8]:
#How many unique people?
len(oscar['name'].unique())

7040

### Insights:
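1. The `film` column holds 5,042 distinct values over the 96 ceremonies. Since `unique()` keeps NaN and 319 rows have no film, this corresponds to 5,041 distinct titles; films that share a title are counted once.

2. Nominations span 115 distinct categories.

3. The `name` column holds 7,040 distinct values so far (7,039 excluding NaN). These are credit strings, so they cover studios, countries and joint credits as well as individuals.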
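
One caveat on these counts: `unique()` keeps missing values, so a column that contains NaN yields one extra entry. A minimal sketch on a made-up five-row frame (the `toy` data is hypothetical, not taken from OSCARS.csv) compares it with `nunique()`, which skips NaN by default:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: two real titles plus two rows with no film (NaN)
toy = pd.DataFrame({'film': ['Wings', 'Wings', '7th Heaven', np.nan, np.nan]})

with_nan = len(toy['film'].unique())   # NaN is counted as one distinct value
without_nan = toy['film'].nunique()    # NaN is excluded (dropna=True by default)

print(with_nan, without_nan)
```

Applied to the real frame, `oscar['film'].nunique()` would report distinct titles only.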

# 2. Out of the distinct films, categories and names nominated, how many won the award?

In [9]:
#Films
win_film=oscar[oscar['winner']==True]['film'].unique()
print("Number of films that won oscar award:", len(win_film))

print(round(len(win_film)/len(oscar['film'].unique()),2)*100,"%", "of nominated films won oscars.")

Number of films that won oscar award: 1329
26.0 % of nominated films won oscars.


In [10]:
#Categories
win_categories=oscar[oscar['winner']==True]['category'].unique()
print("Number of categories that won oscar award:", len(win_categories))

print(round(len(win_categories)/len(oscar['category'].unique()),2)*100,"%", "of nominated categories won oscars.")
print("Thus, each of the categories, have won an award at any certain point of time or the other.")

Number of categories that won oscar award: 115
100.0 % of nominated categories won oscars.
Thus, each of the categories, have won an award at any certain point of time or the other.


In [11]:
#Individuals
win_artist=oscar[oscar['winner']==True]['name'].unique()
print("Number of artists that won oscar award:", len(win_artist))

print(round(len(win_artist)/len(oscar['name'].unique()),2)*100,"%", "of nominated artists won oscars.")

Number of artists that won oscar award: 2082
30.0 % of nominated artists won oscars.


### Insights:

1. Out of 5,042 film titles, 1,329 have won at least one Oscar, a winning proportion of about 26%.

2. All 115 categories have at least one recorded winner.

3. 2,082 of the 7,040 distinct names (almost 30%) have won an award.
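
The percentages above are printed with `round(x, 2)*100`, which rounds the ratio to two decimals before scaling, so only whole percentage points survive. A small sketch of the alternative, scaling first and rounding afterwards, using the film counts from the output above (the variable names are illustrative only):

```python
# Counts taken from the outputs above: winning film titles and all film titles
winning_titles = 1329
all_titles = 5042

ratio = winning_titles / all_titles

# Rounding before scaling keeps only whole percentage points
coarse_pct = round(ratio, 2) * 100

# Scaling before rounding keeps two decimal places of the percentage
fine_pct = round(ratio * 100, 2)

print(coarse_pct, fine_pct)
```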

#3. What is the distribution of the 'winner' column? How many winners across all categories?


In [12]:
oscar['winner'].value_counts()

False    8424
True     2465
Name: winner, dtype: int64

In [13]:
(2465/10889)*100

22.63752410689687

### Insights:

**Approximately 22.64% of nominations resulted in a win, which shows how selective and competitive the award is.**
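
The share above is computed from hard-coded counts. A minimal sketch of deriving it directly from a boolean column, shown on a hypothetical five-value Series standing in for `oscar['winner']`:

```python
import pandas as pd

# Hypothetical toy column standing in for oscar['winner']
winner = pd.Series([True, False, False, True, False])

# The mean of a boolean Series is the fraction of True values
win_share_pct = round(winner.mean() * 100, 2)

# Equivalent view: relative frequency of each outcome
shares = winner.value_counts(normalize=True)

print(win_share_pct)
print(shares)
```

This way the percentage follows the data if the CSV is updated.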

#4. Trends over time:

##a. How many nominations were there per film year?
##b. Are there any trends in the types of categories nominated over the years?
##c. What are the distinctive Nominated Categories & its respective Winning Proportion?

##a. Number of nominations per film year

In [14]:
# Count nominations (rows) per film year, ordered from fewest to most
films_per_year=oscar['year_film'].value_counts().sort_values(ascending=True).reset_index()
films_per_year



Unnamed: 0,index,year_film
0,1927,35
1,1928,38
2,1929,42
3,1930,44
4,1931,45
...,...,...
91,1945,160
92,1944,169
93,1943,182
94,1941,183


In [15]:
px.bar(films_per_year,x='index', y='year_film', color='year_film', title='Nominations per Film Year')

##b. Trend in nominated categories over time

In [16]:
# count() tallies nomination rows per ceremony year, not distinct categories
nominated_categories_per_year=oscar.groupby('year_ceremony')['category'].count().reset_index()
nominated_categories_per_year['change%']=round(nominated_categories_per_year['category'].pct_change()*100,2)
nominated_categories_per_year

Unnamed: 0,year_ceremony,category,change%
0,1928,35,
1,1929,38,8.57
2,1930,42,10.53
3,1931,44,4.76
4,1932,45,2.27
...,...,...,...
91,2020,128,2.40
92,2021,120,-6.25
93,2022,124,3.33
94,2023,126,1.61


In [17]:

fig = px.bar(nominated_categories_per_year, x='year_ceremony', y='category', title='Nominations per Ceremony Year', height=1000)

# Add a line plot for the % change
fig.add_scatter(x=nominated_categories_per_year['year_ceremony'], y=nominated_categories_per_year['change%'], mode='lines', name='% Change')

# Update layout and axis labels
fig.update_layout(xaxis_title='Year', yaxis_title='Nomination Count')

# Show the plot
fig.show()
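
The per-year figure above comes from `count()`, which tallies nomination rows. If the question is how many different categories were in play each year, `nunique()` is the closer measure. A small sketch on hypothetical toy rows (not the real data) puts both side by side:

```python
import pandas as pd

# Hypothetical toy rows: ceremony year and category of each nomination
toy = pd.DataFrame({
    'year_ceremony': [1928, 1928, 1928, 1929, 1929],
    'category': ['ACTOR', 'ACTOR', 'ACTRESS', 'ACTOR', 'DIRECTING'],
})

per_year = toy.groupby('year_ceremony')['category'].agg(
    nominations='count',            # rows per ceremony
    distinct_categories='nunique',  # different category labels per ceremony
).reset_index()

print(per_year)
```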

##c. Distinct Nominated Categories & Winning Proportion

In [18]:
category_counts=oscar['category'].value_counts().reset_index()
category_counts.columns = ['CATEGORY', 'Total Nominations']
category_counts

Unnamed: 0,CATEGORY,Total Nominations
0,DIRECTING,469
1,FILM EDITING,450
2,ACTRESS IN A SUPPORTING ROLE,440
3,ACTOR IN A SUPPORTING ROLE,440
4,DOCUMENTARY (Short Subject),378
...,...,...
110,SPECIAL FOREIGN LANGUAGE FILM AWARD,2
111,GORDON E. SAWYER AWARD,1
112,SPECIAL ACHIEVEMENT AWARD (Sound Editing),1
113,SPECIAL ACHIEVEMENT AWARD (Sound Effects),1


In [19]:
winning_category_counts = oscar[oscar['winner'] == True].groupby('category').size().reset_index()
winning_category_counts.columns=['CATEGORY', 'Wins']
winning_category_counts

Unnamed: 0,CATEGORY,Wins
0,ACTOR,49
1,ACTOR IN A LEADING ROLE,48
2,ACTOR IN A SUPPORTING ROLE,88
3,ACTRESS,49
4,ACTRESS IN A LEADING ROLE,48
...,...,...
110,WRITING (Story and Screenplay),7
111,WRITING (Story and Screenplay--based on factua...,4
112,WRITING (Story and Screenplay--based on materi...,1
113,WRITING (Story and Screenplay--written directl...,12


In [49]:
# Merge the two DataFrames on the 'CATEGORY' column
category_data = pd.merge(category_counts, winning_category_counts, on='CATEGORY', how='left')

# Calculate the proportion of nominations that resulted in winning an award for each category
category_data['Proportion of Wins'] = category_data['Wins'] / category_data['Total Nominations']*100

# Fill NaN values with 0 (if there were no wins in a category)
category_data.fillna(0, inplace=True)

# Keep the 20 most-nominated categories (category_counts is already sorted by nominations)
WIN_CATEGORIES=category_data.head(20)
WIN_CATEGORIES.sort_values(by='Proportion of Wins', ascending=False)

Unnamed: 0,CATEGORY,Total Nominations,Wins,Proportion of Wins
6,DOCUMENTARY (Feature),345,79,22.898551
18,SHORT FILM (Live Action),226,50,22.123894
4,DOCUMENTARY (Short Subject),378,80,21.164021
17,ACTOR,232,49,21.12069
16,MUSIC (Original Song),235,49,20.851064
15,ACTRESS,236,49,20.762712
7,CINEMATOGRAPHY,338,70,20.710059
0,DIRECTING,469,95,20.255864
12,SOUND,245,49,20.0
14,ACTOR IN A LEADING ROLE,240,48,20.0


In [21]:
px.bar(WIN_CATEGORIES, x='CATEGORY', y=['Total Nominations', 'Wins'],
             title='Total Nominations vs Wins per Category',
             labels={'value': 'Count', 'variable': 'Metric', 'CATEGORY': 'Category'},
             color_discrete_map={'Total Nominations': 'black', 'Wins': 'gold'},
             barmode='group')
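
The nominations, wins and win rate per category can also be obtained in a single `groupby` instead of two frames and a merge. A hedged sketch on hypothetical toy rows; the output column names (`total_nominations`, `wins`, `win_pct`) are chosen here for illustration:

```python
import pandas as pd

# Hypothetical toy rows: one row per nomination
toy = pd.DataFrame({
    'category': ['ACTOR', 'ACTOR', 'ACTOR', 'SOUND', 'SOUND'],
    'winner':   [True, False, False, True, False],
})

rates = (
    toy.groupby('category')['winner']
       .agg(total_nominations='size', wins='sum')  # size = rows, sum = True count
       .reset_index()
)
rates['win_pct'] = (rates['wins'] / rates['total_nominations'] * 100).round(2)

print(rates)
```

Because every category appears in the grouped result, categories without a win get 0 directly and no `fillna` step is needed.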

##d. Winning proportion of nominations per film year

In [47]:
# Count nominations and winning nominations per film year
# (each row of oscar is one nomination, so 'Films' in the column names means nomination rows)
films_per_year = oscar.groupby('year_film').size().reset_index(name='Total Films')
winning_films_per_year = oscar[oscar['winner'] == True].groupby('year_film').size().reset_index(name='Winning Films')

# Merge the two DataFrames on the 'year_film' column
films_data = pd.merge(films_per_year, winning_films_per_year, on='year_film', how='left')

# Share of each film year's nominations that won, in percent
films_data['Winning Proportion'] = round(films_data['Winning Films'] / films_data['Total Films']*100,2)
films_data.sort_values(by='Winning Proportion', ascending=False)

Unnamed: 0,year_film,Total Films,Winning Films,Winning Proportion
0,1927,35,15,42.86
4,1931,45,14,31.11
5,1932,62,19,30.65
20,1948,109,32,29.36
21,1949,112,32,28.57
...,...,...,...,...
14,1942,186,30,16.13
16,1944,169,27,15.98
13,1941,183,28,15.30
12,1940,160,24,15.00


In [23]:
px.bar(films_data, x='year_film', y=['Total Films', 'Winning Films'], title='Total Nominations vs Wins per Film Year',
             labels={'value': 'Count', 'variable': 'Metric', 'year_film': 'Years'},
             color_discrete_map={'Total Films': 'black', 'Winning Films': 'gold'})

### Insights:

(a) Films released in **1942** received **186 nominations**, the highest of any film year (1941 follows with 183). Note that these are nomination rows, not counts of films released.

(b) By ceremony year, **1933** saw the largest year-over-year increase in the number of nominations (about **+38%**), whereas **1947** showed a sharp **decrease of about 35%**.

(c) The **'Directing'** category has the **highest number of nominations** (469). Among the 20 most-nominated categories, **documentaries and live-action short films have the highest winning proportions** (roughly 21% to 23%), while **Directing** has a winning proportion of about **20%**.

(d) Film year **1927** has the highest proportion of winning nominations (approximately 43%), followed by 1931 and 1932.
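
As a worked check of the year-over-year figure in (b), assume the 1932 and 1933 ceremonies had 45 and 62 nominations (the counts the tables above show for film years 1931 and 1932). The change is then (62 - 45) / 45, or about 37.78%. The helper name `pct_change_yoy` is made up for this sketch:

```python
def pct_change_yoy(previous, current):
    """Percentage change from one year's count to the next, rounded to 2 decimals."""
    return round((current - previous) / previous * 100, 2)

# Assumed counts for the 1932 and 1933 ceremonies (see lead-in)
change_1933 = pct_change_yoy(45, 62)

print(change_1933)
```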

#5. Winners and Nominees:

##a. Which films have won the most Oscars?
##b. Which individuals have won the most awards?
##c. Top 5 award winning movie analysis
##d. Top 5 award winning person/entity analysis

## (a) Which films have won the most Oscars?

In [24]:
winning_film=oscar[oscar['winner']==True]
# Group by film and count the number of wins for each film
wins_per_film = winning_film.groupby('film').size().reset_index(name='Wins')

# Sort the results in descending order based on the number of wins
wins_per_film_sorted = wins_per_film.sort_values(by='Wins', ascending=False)

# Display the films with the most wins
print("Films with the most Oscars:")
print(wins_per_film_sorted.head(15))

Films with the most Oscars:
                                               film  Wins
1215                                        Titanic    12
1274                                West Side Story    11
1055  The Lord of the Rings: The Return of the King    11
134                                         Ben-Hur    11
960                             The English Patient     9
380                                            Gigi     9
1037                               The Last Emperor     9
393                              Gone with the Wind     8
674                               On the Waterfront     8
68                                          Amadeus     8
365                           From Here to Eternity     8
199                                         Cabaret     8
804                             Slumdog Millionaire     8
369                                          Gandhi     8
638                                    My Fair Lady     8
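
Grouping by title alone merges different films that share a name, so their wins are added together. A hedged sketch of grouping by title and film year instead, on hypothetical toy rows (the win counts are invented for illustration):

```python
import pandas as pd

# Hypothetical toy rows: two different films share the title 'Titanic'
toy = pd.DataFrame({
    'film':      ['Titanic', 'Titanic', 'Titanic', 'Ben-Hur'],
    'year_film': [1953, 1997, 1997, 1959],
    'winner':    [True, True, True, True],
})

winners = toy[toy['winner']]

# Title only: both 'Titanic' films collapse into one entry
by_title = winners.groupby('film').size()

# Title plus film year: each film keeps its own tally
by_title_year = winners.groupby(['film', 'year_film']).size()

print(by_title.loc['Titanic'])
print(by_title_year.loc[('Titanic', 1997)])
```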


## (b) Which individuals have won the most awards?

In [25]:
winning_person=oscar[oscar['winner']==True]
# Group by name and count the number of wins for each name
wins_per_person = winning_person.groupby('name').size().reset_index(name='Wins')

# Sort the results in descending order based on the number of wins
wins_per_person_sorted = wins_per_person.sort_values(by='Wins', ascending=False)

# Display the names with the most wins
print("Person with the most Oscars:")
print(wins_per_person_sorted.head(15))

Person with the most Oscars:
                                                   name  Wins
1994                              Walt Disney, Producer    22
1354                                Metro-Goldwyn-Mayer    12
973                                               Italy    10
771                                              France     9
2007                                       Warner Bros.     7
246                                       Alfred Newman     7
879                       Gordon Hollingshead, Producer     6
708                             Edward Selzer, Producer     5
1529                                          Paramount     5
1355  Metro-Goldwyn-Mayer Studio Sound Department, D...     5
1104                                          John Ford     4
1158                               Joseph L. Mankiewicz     4
1187                                  Katharine Hepburn     4
2044                                      William Wyler     4
585                                      

## (c) Top 5 Award Winning Movie Analysis

###1. Titanic

There are 2 movies named 'Titanic' (1953, 1997) in this dataset.

In [26]:
Titanic=oscar[(oscar['film']=='Titanic') & (oscar['year_film']==1953)].reset_index()
Titanic

Unnamed: 0,index,year_film,year_ceremony,ceremony,category,name,film,winner
0,2780,1953,1954,26,ART DIRECTION (Black-and-White),"Art Direction: Lyle Wheeler, Maurice Ransford...",Titanic,False
1,2879,1953,1954,26,WRITING (Story and Screenplay),"Charles Brackett, Walter Reisch, Richard Breen",Titanic,True


In [27]:
Titanic=oscar[(oscar['film']=='Titanic') & (oscar['winner']==True) & (oscar['year_film']==1997)].reset_index()
Titanic

Unnamed: 0,index,year_film,year_ceremony,ceremony,category,name,film,winner
0,7692,1997,1998,70,ART DIRECTION,Art Direction: Peter Lamont; Set Decoration:...,Titanic,True
1,7696,1997,1998,70,CINEMATOGRAPHY,Russell Carpenter,Titanic,True
2,7701,1997,1998,70,COSTUME DESIGN,Deborah L. Scott,Titanic,True
3,7707,1997,1998,70,DIRECTING,James Cameron,Titanic,True
4,7722,1997,1998,70,FILM EDITING,"Conrad Buff, James Cameron, Richard A. Harris",Titanic,True
5,7735,1997,1998,70,MUSIC (Original Dramatic Score),James Horner,Titanic,True
6,7745,1997,1998,70,MUSIC (Original Song),Music by James Horner; Lyric by Will Jennings,Titanic,True
7,7750,1997,1998,70,BEST PICTURE,"James Cameron and Jon Landau, Producers",Titanic,True
8,7765,1997,1998,70,SOUND,"Gary Rydstrom, Tom Johnson, Gary Summers, Mark...",Titanic,True
9,7768,1997,1998,70,SOUND EFFECTS EDITING,"Tom Bellfort, Christopher Boyes",Titanic,True


In [28]:
#what are the unique categories for which titanic won an award?
Titanic['category'].value_counts()

ART DIRECTION                      1
CINEMATOGRAPHY                     1
COSTUME DESIGN                     1
DIRECTING                          1
FILM EDITING                       1
MUSIC (Original Dramatic Score)    1
MUSIC (Original Song)              1
BEST PICTURE                       1
SOUND                              1
SOUND EFFECTS EDITING              1
VISUAL EFFECTS                     1
Name: category, dtype: int64

In [29]:
#what are the years in which titanic won an award?
Titanic['year_ceremony'].value_counts()

1998    11
Name: year_ceremony, dtype: int64

###2. West Side Story

Two movies share this name: one was released in 1961, the other in 2021.

**(a) 1961**

In [30]:
West_side_story=oscar[(oscar['film']=='West Side Story') & (oscar['year_film']==1961)].reset_index()
West_side_story

Unnamed: 0,index,year_film,year_ceremony,ceremony,category,name,film,winner
0,3719,1961,1962,34,ACTOR IN A SUPPORTING ROLE,George Chakiris,West Side Story,True
1,3733,1961,1962,34,ACTRESS IN A SUPPORTING ROLE,Rita Moreno,West Side Story,True
2,3743,1961,1962,34,ART DIRECTION (Color),Art Direction: Boris Leven; Set Decoration: ...,West Side Story,True
3,3753,1961,1962,34,CINEMATOGRAPHY (Color),Daniel L. Fapp,West Side Story,True
4,3763,1961,1962,34,COSTUME DESIGN (Color),Irene Sharaff,West Side Story,True
5,3768,1961,1962,34,DIRECTING,"Robert Wise, Jerome Robbins",West Side Story,True
6,3780,1961,1962,34,FILM EDITING,Thomas Stanford,West Side Story,True
7,3795,1961,1962,34,MUSIC (Scoring of a Musical Picture),"Saul Chaplin, Johnny Green, Sid Ramin, Irwin K...",West Side Story,True
8,3805,1961,1962,34,BEST MOTION PICTURE,"Robert Wise, Producer",West Side Story,True
9,3820,1961,1962,34,SOUND,"Todd-AO Sound Department, Fred Hynes, Sound Di...",West Side Story,True


**(b) 2021**

In [31]:
West_side_story=oscar[(oscar['film']=='West Side Story') & (oscar['year_film']==2021)].reset_index()
West_side_story

Unnamed: 0,index,year_film,year_ceremony,ceremony,category,name,film,winner
0,10531,2021,2022,94,ACTRESS IN A SUPPORTING ROLE,Ariana DeBose,West Side Story,True
1,10544,2021,2022,94,CINEMATOGRAPHY,Janusz Kaminski,West Side Story,False
2,10549,2021,2022,94,COSTUME DESIGN,Paul Tazewell,West Side Story,False
3,10554,2021,2022,94,DIRECTING,Steven Spielberg,West Side Story,False
4,10599,2021,2022,94,BEST PICTURE,"Steven Spielberg and Kristie Macosko Krieger, ...",West Side Story,False
5,10604,2021,2022,94,PRODUCTION DESIGN,Production Design: Adam Stockhausen; Set Decor...,West Side Story,False
6,10619,2021,2022,94,SOUND,"Tod A. Maitland, Gary Rydstrom, Brian Chumney,...",West Side Story,False


###3. The Lord of the Rings: The Return of the King

In [32]:
L_rings=oscar[(oscar['film']=='The Lord of the Rings: The Return of the King')].reset_index()
L_rings

Unnamed: 0,index,year_film,year_ceremony,ceremony,category,name,film,winner
0,8366,2003,2004,76,ART DIRECTION,Art Direction: Grant Major; Set Decoration: Da...,The Lord of the Rings: The Return of the King,True
1,8376,2003,2004,76,COSTUME DESIGN,Ngila Dickson and Richard Taylor,The Lord of the Rings: The Return of the King,True
2,8380,2003,2004,76,DIRECTING,Peter Jackson,The Lord of the Rings: The Return of the King,True
3,8394,2003,2004,76,FILM EDITING,Jamie Selkirk,The Lord of the Rings: The Return of the King,True
4,8402,2003,2004,76,MAKEUP,Richard Taylor and Peter King,The Lord of the Rings: The Return of the King,True
5,8409,2003,2004,76,MUSIC (Original Score),Howard Shore,The Lord of the Rings: The Return of the King,True
6,8411,2003,2004,76,MUSIC (Original Song),Music and Lyric by Fran Walsh and Howard Shore...,The Lord of the Rings: The Return of the King,True
7,8415,2003,2004,76,BEST PICTURE,"Barrie M. Osborne, Peter Jackson and Fran Wals...",The Lord of the Rings: The Return of the King,True
8,8434,2003,2004,76,SOUND MIXING,"Christopher Boyes, Michael Semanick, Michael H...",The Lord of the Rings: The Return of the King,True
9,8438,2003,2004,76,VISUAL EFFECTS,"Jim Rygiel, Joe Letteri, Randall William Cook ...",The Lord of the Rings: The Return of the King,True


###4. Ben-Hur

In [33]:
Ben_hur=oscar[(oscar['film']=='Ben-Hur')].reset_index()
Ben_hur

Unnamed: 0,index,year_film,year_ceremony,ceremony,category,name,film,winner
0,3474,1959,1960,32,ACTOR,Charlton Heston,Ben-Hur,True
1,3478,1959,1960,32,ACTOR IN A SUPPORTING ROLE,Hugh Griffith,Ben-Hur,True
2,3498,1959,1960,32,ART DIRECTION (Color),"Art Direction: William A. Horning, Edward Car...",Ben-Hur,True
3,3508,1959,1960,32,CINEMATOGRAPHY (Color),Robert L. Surtees,Ben-Hur,True
4,3518,1959,1960,32,COSTUME DESIGN (Color),Elizabeth Haffenden,Ben-Hur,True
5,3523,1959,1960,32,DIRECTING,William Wyler,Ben-Hur,True
6,3534,1959,1960,32,FILM EDITING,"Ralph E. Winters, John D. Dunning",Ben-Hur,True
7,3543,1959,1960,32,MUSIC (Music Score of a Dramatic or Comedy Pic...,Miklos Rozsa,Ben-Hur,True
8,3559,1959,1960,32,BEST MOTION PICTURE,"Sam Zimbalist, Producer",Ben-Hur,True
9,3572,1959,1960,32,SOUND,"Metro-Goldwyn-Mayer Studio Sound Department, F...",Ben-Hur,True


###5. The English Patient

In [34]:
Eng_patient=oscar[(oscar['film']=='The English Patient')].reset_index()
Eng_patient

Unnamed: 0,index,year_film,year_ceremony,ceremony,category,name,film,winner
0,7554,1996,1997,69,ACTOR IN A LEADING ROLE,Ralph Fiennes,The English Patient,False
1,7566,1996,1997,69,ACTRESS IN A LEADING ROLE,Kristin Scott Thomas,The English Patient,False
2,7570,1996,1997,69,ACTRESS IN A SUPPORTING ROLE,Juliette Binoche,The English Patient,True
3,7574,1996,1997,69,ART DIRECTION,Art Direction: Stuart Craig; Set Decoration:...,The English Patient,True
4,7578,1996,1997,69,CINEMATOGRAPHY,John Seale,The English Patient,True
5,7585,1996,1997,69,COSTUME DESIGN,Ann Roth,The English Patient,True
6,7588,1996,1997,69,DIRECTING,Anthony Minghella,The English Patient,True
7,7603,1996,1997,69,FILM EDITING,Walter Murch,The English Patient,True
8,7616,1996,1997,69,MUSIC (Original Dramatic Score),Gabriel Yared,The English Patient,True
9,7631,1996,1997,69,BEST PICTURE,"Saul Zaentz, Producer",The English Patient,True


##(d). Topmost award winning person/entity analysis

In [35]:
WD=oscar[(oscar['name']=='Walt Disney, Producer') & (oscar['winner']==True)].reset_index()
WD

Unnamed: 0,index,year_film,year_ceremony,ceremony,category,name,film,winner
0,182,1931,1932,5,SHORT SUBJECT (Cartoon),"Walt Disney, Producer",Flowers and Trees,True
1,249,1932,1933,6,SHORT SUBJECT (Cartoon),"Walt Disney, Producer",The Three Little Pigs,True
2,308,1934,1935,7,SHORT SUBJECT (Cartoon),"Walt Disney, Producer",The Tortoise and the Hare,True
3,389,1935,1936,8,SHORT SUBJECT (Cartoon),"Walt Disney, Producer",Three Orphan Kittens,True
4,489,1936,1937,9,SHORT SUBJECT (Cartoon),"Walt Disney, Producer",The Country Cousin,True
5,610,1937,1938,10,SHORT SUBJECT (Cartoon),"Walt Disney, Producer",The Old Mill,True
6,740,1938,1939,11,SHORT SUBJECT (Cartoon),"Walt Disney, Producer",Ferdinand the Bull,True
7,885,1939,1940,12,SHORT SUBJECT (Cartoon),"Walt Disney, Producer",The Ugly Duckling,True
8,1211,1941,1942,14,SHORT SUBJECT (Cartoon),"Walt Disney, Producer",Lend a Paw,True
9,1404,1942,1943,15,SHORT SUBJECT (Cartoon),"Walt Disney, Producer",Der Fuehrer's Face,True


In [36]:
WD['category'].value_counts()

SHORT SUBJECT (Cartoon)        12
SHORT SUBJECT (Two-reel)        5
DOCUMENTARY (Feature)           2
DOCUMENTARY (Short Subject)     2
SHORT SUBJECT (Live Action)     1
Name: category, dtype: int64

In [37]:
MGM=oscar[(oscar['name']=='Metro-Goldwyn-Mayer') & (oscar['winner']==True)].reset_index()
MGM

Unnamed: 0,index,year_film,year_ceremony,ceremony,category,name,film,winner
0,64,1928,1929,2,OUTSTANDING PICTURE,Metro-Goldwyn-Mayer,The Broadway Melody,True
1,178,1931,1932,5,OUTSTANDING PRODUCTION,Metro-Goldwyn-Mayer,Grand Hotel,True
2,384,1935,1936,8,OUTSTANDING PRODUCTION,Metro-Goldwyn-Mayer,Mutiny on the Bounty,True
3,481,1936,1937,9,OUTSTANDING PRODUCTION,Metro-Goldwyn-Mayer,The Great Ziegfeld,True
4,500,1936,1937,9,SHORT SUBJECT (Two-reel),Metro-Goldwyn-Mayer,The Public Pays,True
5,619,1937,1938,10,SHORT SUBJECT (Two-reel),Metro-Goldwyn-Mayer,Torture Money,True
6,745,1938,1939,11,SHORT SUBJECT (One-reel),Metro-Goldwyn-Mayer,That Mothers Might Live,True
7,1035,1940,1941,13,SHORT SUBJECT (Cartoon),Metro-Goldwyn-Mayer,The Milky Way,True
8,1223,1941,1942,14,SHORT SUBJECT (One-reel),Metro-Goldwyn-Mayer,Of Pups and Puzzles,True
9,1228,1941,1942,14,SHORT SUBJECT (Two-reel),Metro-Goldwyn-Mayer,Main Street on the March!,True


In [38]:
MGM['category'].value_counts()

OUTSTANDING PRODUCTION        3
SHORT SUBJECT (Two-reel)      3
SHORT SUBJECT (One-reel)      2
OUTSTANDING PICTURE           1
SHORT SUBJECT (Cartoon)       1
OUTSTANDING MOTION PICTURE    1
SPECIAL EFFECTS               1
Name: category, dtype: int64

### Insights:

### **MOST AWARD-WINNING MOVIES:**
## (a) There are 2 movies named 'Titanic' (1953, 1997) in this dataset.

The **1997 Titanic** was **nominated** for **14** Oscars and **won 11**, each in a distinct category, at the 1998 ceremony.

The **1953 Titanic** was **nominated twice** at the 1954 ceremony and **won for "WRITING (Story and Screenplay)"**, an award shared by Charles Brackett, Walter Reisch and Richard Breen. This is why the ranking above lists 'Titanic' with 12 wins: it adds the 1997 film's 11 to the 1953 film's 1.

## (b) There are 2 movies named 'West Side Story': one was released in 1961, the other in 2021.

The **1961 West Side Story** was **nominated** for **11** Oscars and **won 10**, each in a distinct category, at the 1962 ceremony.

The **2021 West Side Story** was **nominated 7 times** at the 2022 ceremony and **won once, "ACTRESS IN A SUPPORTING ROLE" for Ariana DeBose.** Together the two films account for the 11 wins listed in the ranking.

## (c) The 2003 film "**The Lord of the Rings: The Return of the King**" was nominated for 11 Oscars and won all 11, each in a distinct category.

## (d) The 1959 film **Ben-Hur** was nominated 12 times and won 11 awards, each in a distinct category.

## (e) The 1996 film **The English Patient** was nominated 12 times and won 9 awards.

### **MOST AWARD-WINNING INDIVIDUALS/ENTITIES:**

##(a) **Walt Disney** (credited as 'Walt Disney, Producer') was nominated 59 times and won 22 times. Most of the wins (12 of 22, almost 55%) came in the SHORT SUBJECT (Cartoon) category.

##(b) **MGM** was nominated 64 times and won 12 times, mostly in the Outstanding Production/Picture and Short Subject categories.
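
The figures above rely on exact matches of the `name` string, so a differently worded credit for the same person or studio would be missed. A minimal sketch of a substring match, on hypothetical toy rows (whether the real data contains such variant credits is an assumption to check):

```python
import numpy as np
import pandas as pd

# Hypothetical toy rows: two credit spellings, another studio, and a missing name
toy = pd.DataFrame({
    'name': ['Walt Disney, Producer', 'Walt Disney', 'Metro-Goldwyn-Mayer', np.nan],
    'winner': [True, False, True, True],
})

# Exact equality only matches the identical credit string
exact = toy[toy['name'] == 'Walt Disney, Producer']

# regex=False treats the pattern as plain text; na=False counts missing names as non-matches
partial = toy[toy['name'].str.contains('Walt Disney', regex=False, na=False)]

print(len(exact), len(partial))
```

A substring match can also pull in unrelated credits, so the matched names are worth inspecting before counting.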

##6. Recent Trends (2010 to present)

We want to shed light on the trends among Oscar winners for film years 2010 to 2023.

In [39]:
Recent=oscar[(oscar['year_film']>2009) & (oscar['winner']==True)].reset_index()
Recent

Unnamed: 0,index,year_film,year_ceremony,ceremony,category,name,film,winner
0,9145,2010,2011,83,ACTOR IN A LEADING ROLE,Colin Firth,The King's Speech,True
1,9147,2010,2011,83,ACTOR IN A SUPPORTING ROLE,Christian Bale,The Fighter,True
2,9155,2010,2011,83,ACTRESS IN A LEADING ROLE,Natalie Portman,Black Swan,True
3,9159,2010,2011,83,ACTRESS IN A SUPPORTING ROLE,Melissa Leo,The Fighter,True
4,9164,2010,2011,83,ANIMATED FEATURE FILM,Lee Unkrich,Toy Story 3,True
...,...,...,...,...,...,...,...,...
383,10880,2023,2024,96,WRITING (Original Screenplay),Screenplay - Justine Triet and Arthur Harari,Anatomy of a Fall,True
384,10885,2023,2024,96,JEAN HERSHOLT HUMANITARIAN AWARD,,,True
385,10886,2023,2024,96,HONORARY AWARD,"To Angela Bassett, who has inspired audiences ...",,True
386,10887,2023,2024,96,HONORARY AWARD,"To Mel Brooks, for his comedic brilliance, pro...",,True


In [40]:
#which category won the highest number of awards
# Recent already contains winners only, so no further filtering is needed
Recent_cats=Recent['category'].value_counts()
Recent_cats

HONORARY AWARD                       39
ACTOR IN A LEADING ROLE              14
FILM EDITING                         14
WRITING (Original Screenplay)        14
WRITING (Adapted Screenplay)         14
VISUAL EFFECTS                       14
ACTOR IN A SUPPORTING ROLE           14
SHORT FILM (Animated)                14
BEST PICTURE                         14
MUSIC (Original Song)                14
MUSIC (Original Score)               14
SHORT FILM (Live Action)             14
ACTRESS IN A LEADING ROLE            14
DIRECTING                            14
COSTUME DESIGN                       14
CINEMATOGRAPHY                       14
ANIMATED FEATURE FILM                14
ACTRESS IN A SUPPORTING ROLE         14
DOCUMENTARY (Short Subject)          12
DOCUMENTARY (Feature)                12
PRODUCTION DESIGN                    12
MAKEUP AND HAIRSTYLING               12
JEAN HERSHOLT HUMANITARIAN AWARD     11
SOUND EDITING                        11
SOUND MIXING                         10


In [41]:
#Most award-winning film
Recent['film'].value_counts()

Oppenheimer                          7
Gravity                              7
Everything Everywhere All at Once    7
Mad Max: Fury Road                   6
La La Land                           6
                                    ..
Inocente                             1
Searching for Sugar Man              1
Phantom Thread                       1
Anna Karenina                        1
Anatomy of a Fall                    1
Name: film, Length: 205, dtype: int64

In [42]:
#Most award-winning person
Recent['name'].value_counts()

Alfonso Cuarón                                                                                           3
Emmanuel Lubezki                                                                                         3
Jenny Beavan                                                                                             2
Emma Stone                                                                                               2
Alejandro G. Iñárritu                                                                                    2
                                                                                                        ..
Music and Lyric by John Stephens and Lonnie Lynn                                                         1
Frances Hannon and Mark Coulier                                                                          1
Poland                                                                                                   1
Tom Cross                            

In [43]:
#Across which all categories did Oppenheimer win an award?
Opp=Recent[Recent['film']=='Oppenheimer']
Opp

Unnamed: 0,index,year_film,year_ceremony,ceremony,category,name,film,winner
361,10768,2023,2024,96,ACTOR IN A LEADING ROLE,Cillian Murphy,Oppenheimer,True
362,10772,2023,2024,96,ACTOR IN A SUPPORTING ROLE,Robert Downey Jr.,Oppenheimer,True
366,10793,2023,2024,96,CINEMATOGRAPHY,Hoyte van Hoytema,Oppenheimer,True
368,10802,2023,2024,96,DIRECTING,Christopher Nolan,Oppenheimer,True
371,10818,2023,2024,96,FILM EDITING,Jennifer Lame,Oppenheimer,True
374,10833,2023,2024,96,MUSIC (Original Score),Ludwig Göransson,Oppenheimer,True
376,10846,2023,2024,96,BEST PICTURE,"Emma Thomas, Charles Roven and Christopher Nol...",Oppenheimer,True


In [44]:
#Most award winning director?
Director=Recent[Recent['category']=='DIRECTING']
Dir_wins=Director['name'].value_counts()
Dir_wins

Alfonso Cuarón                      2
Alejandro G. Iñárritu               2
Tom Hooper                          1
Michel Hazanavicius                 1
Ang Lee                             1
Damien Chazelle                     1
Guillermo del Toro                  1
Bong Joon Ho                        1
Chloé Zhao                          1
Jane Campion                        1
Daniel Kwan and Daniel Scheinert    1
Christopher Nolan                   1
Name: name, dtype: int64

### Insights:

1. **'Honorary Award'** has the most winning entries in this period (39), because several honorary awards are given in most years (3 for film year 2023, for example).
2. **'Oppenheimer', 'Gravity' and 'Everything Everywhere All at Once'** are the most-awarded movies, with **7 awards** each.

Oppenheimer (2023) won awards across 7 distinct categories:

**ACTOR IN A LEADING ROLE (Cillian Murphy), ACTOR IN A SUPPORTING ROLE (Robert Downey Jr.), CINEMATOGRAPHY (Hoyte van Hoytema), DIRECTING (Christopher Nolan), FILM EDITING (Jennifer Lame), MUSIC (Original Score) (Ludwig Göransson), BEST PICTURE (the producers).**


3. **Alfonso Cuarón** and **Emmanuel Lubezki** are the most-awarded individuals of this period, with **3 awards** each since 2010.
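
Reading only the first row of a `value_counts()` result can hide a tie at the top. A small sketch of collecting every entry that shares the maximum count, on a hypothetical toy Series of placeholder names:

```python
import pandas as pd

# Hypothetical toy data: 'A' and 'B' are tied with two wins each
names = pd.Series(['A', 'A', 'B', 'B', 'C'])

counts = names.value_counts()

# Keep every name whose count equals the maximum, not just the first row
leaders = counts[counts == counts.max()]

print(sorted(leaders.index))
```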




# **Winners of 2023!!**

In [50]:
#2023 films winning an oscar
entries_2023=Recent[Recent['year_film']==2023]
entries_2023

Unnamed: 0,index,year_film,year_ceremony,ceremony,category,name,film,winner
361,10768,2023,2024,96,ACTOR IN A LEADING ROLE,Cillian Murphy,Oppenheimer,True
362,10772,2023,2024,96,ACTOR IN A SUPPORTING ROLE,Robert Downey Jr.,Oppenheimer,True
363,10779,2023,2024,96,ACTRESS IN A LEADING ROLE,Emma Stone,Poor Things,True
364,10784,2023,2024,96,ACTRESS IN A SUPPORTING ROLE,Da'Vine Joy Randolph,The Holdovers,True
365,10785,2023,2024,96,ANIMATED FEATURE FILM,Hayao Miyazaki and Toshio Suzuki,The Boy and the Heron,True
366,10793,2023,2024,96,CINEMATOGRAPHY,Hoyte van Hoytema,Oppenheimer,True
367,10799,2023,2024,96,COSTUME DESIGN,Holly Waddington,Poor Things,True
368,10802,2023,2024,96,DIRECTING,Christopher Nolan,Oppenheimer,True
369,10809,2023,2024,96,DOCUMENTARY FEATURE FILM,"Mstyslav Chernov, Michelle Mizner and Raney Ar...",20 Days in Mariupol,True
370,10813,2023,2024,96,DOCUMENTARY SHORT FILM,Ben Proudfoot and Kris Bowers,The Last Repair Shop,True


In [51]:
#no. of oscars per movie
entries_2023['film'].value_counts()

Oppenheimer                                          7
Poor Things                                          4
The Zone of Interest                                 2
The Holdovers                                        1
The Boy and the Heron                                1
20 Days in Mariupol                                  1
The Last Repair Shop                                 1
Barbie                                               1
WAR IS OVER! Inspired by the Music of John & Yoko    1
The Wonderful Story of Henry Sugar                   1
Godzilla Minus One                                   1
American Fiction                                     1
Anatomy of a Fall                                    1
Name: film, dtype: int64

In [52]:
#categories with highest no. of oscars
entries_2023['category'].value_counts().sort_values(ascending=False)

HONORARY AWARD                      3
ACTOR IN A SUPPORTING ROLE          1
ACTRESS IN A LEADING ROLE           1
ACTRESS IN A SUPPORTING ROLE        1
ANIMATED FEATURE FILM               1
CINEMATOGRAPHY                      1
COSTUME DESIGN                      1
DIRECTING                           1
DOCUMENTARY FEATURE FILM            1
DOCUMENTARY SHORT FILM              1
FILM EDITING                        1
INTERNATIONAL FEATURE FILM          1
ACTOR IN A LEADING ROLE             1
MUSIC (Original Score)              1
MUSIC (Original Song)               1
BEST PICTURE                        1
PRODUCTION DESIGN                   1
SHORT FILM (Animated)               1
SHORT FILM (Live Action)            1
SOUND                               1
VISUAL EFFECTS                      1
WRITING (Adapted Screenplay)        1
WRITING (Original Screenplay)       1
JEAN HERSHOLT HUMANITARIAN AWARD    1
MAKEUP AND HAIRSTYLING              1
Name: category, dtype: int64

In [54]:
#individuals/entities with highest number of awards
entries_2023['name'].value_counts().sort_values(ascending=False)

Cillian Murphy                                                                                                 1
Da'Vine Joy Randolph                                                                                           1
Hayao Miyazaki and Toshio Suzuki                                                                               1
Hoyte van Hoytema                                                                                              1
Holly Waddington                                                                                               1
Christopher Nolan                                                                                              1
Mstyslav Chernov, Michelle Mizner and Raney Aronson-Rath                                                       1
Ben Proudfoot and Kris Bowers                                                                                  1
Jennifer Lame                                                                                   

### Insights:
1. Oppenheimer (7 awards), Poor Things (4 awards), and The Zone of Interest (2 awards) are the most award-winning movies.

2. Honorary Award is the most-awarded category (3 awards).
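
The per-film and per-category tallies need not add up to the same total. One likely reason, stated here as an assumption, is that rows such as honorary awards carry no film title (the dataset reports 319 missing `film` values), and `value_counts` skips missing values by default. A minimal sketch with a hypothetical four-row frame named `sample_2023`:

```python
import pandas as pd

# Hypothetical rows; the honorary award has no film title (assumption about the data)
sample_2023 = pd.DataFrame({
    "category": ["DIRECTING", "CINEMATOGRAPHY", "COSTUME DESIGN", "HONORARY AWARD"],
    "film": ["Oppenheimer", "Oppenheimer", "Poor Things", None],
})

# Default behaviour: rows with a missing film are left out of the tally
per_film = sample_2023["film"].value_counts()

# dropna=False keeps the missing titles as their own bucket
per_film_with_missing = sample_2023["film"].value_counts(dropna=False)

print(per_film.sum(), per_film_with_missing.sum())
```

Comparing the two sums is a quick way to see how many awards are not attached to any film.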



## Oscar-winning movie per Decade
Finally, how many Oscar-winning movies are there in each decade?

In [45]:
import pandas as pd

# Assuming 'oscar' is your DataFrame containing the Oscars data

# 1. Keep only the records where 'winner' is True.
#    .copy() makes an independent DataFrame, so adding a column below
#    does not trigger pandas' SettingWithCopyWarning.
oscar_winners = oscar[oscar['winner'] == True].copy()

# 2. Define the decade labels and the matching bin edges.
#    pd.cut uses right-closed bins, so (1919, 1929] covers 1920-1929, and so on.
groups = ['1920s', '1930s', '1940s', '1950s', '1960s', '1970s', '1980s', '1990s', '2000s', '2010s', '2020s']
bins = [1919, 1929, 1939, 1949, 1959, 1969, 1979, 1989, 1999, 2009, 2019, 2029]

# Create a new column 'decade_group' from 'year_film'
oscar_winners['decade_group'] = pd.cut(oscar_winners['year_film'], bins=bins, labels=groups)

# 3. Count the unique number of films in each group
group_counts = oscar_winners.groupby('decade_group')['film'].nunique().reset_index()
group_counts.columns = ['Decade Group', 'Number of Films']

# Display the unique number of films in each group
group_counts


Unnamed: 0,Decade Group,Number of Films
0,1920s,20
1,1930s,101
2,1940s,170
3,1950s,149
4,1960s,143
5,1970s,134
6,1980s,135
7,1990s,138
8,2000s,147
9,2010s,148


In [46]:
px.bar(group_counts, x='Decade Group', y='Number of Films', color='Number of Films', title="Oscar-Winning Movies per Decade")

## Finally, the 1940s (spanning 1940-1949) is the most award-winning decade, with 170 distinct films winning awards across different categories.
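
As an alternative to explicit bin edges, the decade can be derived arithmetically from the year. This is only a sketch on a hypothetical frame named `winners_sample` with placeholder film titles; it is not the method used in the cell above.

```python
import pandas as pd

# Hypothetical winners; titles are placeholders, not real entries
winners_sample = pd.DataFrame({
    "year_film": [1927, 1939, 1940, 1949, 1949, 2023],
    "film": ["Film A", "Film B", "Film C", "Film D", "Film D", "Film E"],
})

# Floor-divide by 10 and multiply back to get the decade start, e.g. 1949 -> 1940
decade_start = winners_sample["year_film"] // 10 * 10
winners_sample["decade"] = decade_start.astype(str) + "s"

# Count distinct film titles per decade (a film with several awards counts once)
films_per_decade = winners_sample.groupby("decade")["film"].nunique()

print(films_per_decade)
```

This avoids maintaining a list of bin edges, at the cost of decade labels that are plain strings rather than an ordered categorical.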

# **SUMMARY**

In this comprehensive exploratory data analysis project focusing on the Oscars, **we aimed to discern the trends and patterns of winning categories in Oscar-nominated films over the past 96 years.** Our investigation yielded several significant insights:

* Firstly, we found that a total of 5,042 movies were nominated across 115 distinct categories, involving 7,040 individuals. Among these, 1,329 films emerged victorious, reflecting a winning proportion of 26.0%. Interestingly, all nominated categories have secured awards at various points in time, and approximately 30% of the individuals nominated have clinched an award.

* Moreover, our analysis revealed that about 22.64% of nominations culminated in an award, underlining the selective and competitive nature of the Oscars. Notably, the year 1942 witnessed the highest number of film releases; 1933 experienced a substantial increase in nominated categories, whereas 1947 saw a sharp decline.

* Furthermore, we identified 'Directing' as the category with the highest number of nominations, whereas 'short films' and 'documentaries' boasted the highest winning proportions (approximately 22% each). The years 1927, 1931, and 1932 stood out for their exceptionally high proportions of winning films.
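
Winning proportions like the ones quoted above can be computed by averaging the boolean `winner` column, since `True` counts as 1 and `False` as 0. A minimal sketch on a hypothetical six-row frame named `sample` (the values are made up for illustration):

```python
import pandas as pd

# Hypothetical nominations: four in DIRECTING (one win), two in SOUND (one win)
sample = pd.DataFrame({
    "category": ["DIRECTING", "DIRECTING", "DIRECTING", "DIRECTING", "SOUND", "SOUND"],
    "winner": [True, False, False, False, True, False],
})

# Share of all nominations that won
overall_win_rate = sample["winner"].mean()

# Share of nominations that won within each category, highest first
win_rate_by_category = (
    sample.groupby("category")["winner"].mean().sort_values(ascending=False)
)

print(round(overall_win_rate, 4))
print(win_rate_by_category)
```

A category with few nominees per ceremony will tend to show a higher winning proportion than one with many nominees, which is worth keeping in mind when comparing categories.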

**Delving into the most award-winning movies:**

> We highlighted notable films such as **'Titanic,' 'West Side Story,' 'The Lord of the Rings: The Return of the King,' and 'Ben-Hur,'** among others. Additionally, we recognized Walt Disney Productions and MGM as the most award-winning entities, with their notable contributions to cinema.



> Moreover, **'Honorary Award' emerged as the most award-winning category**, while 'Oppenheimer,' 'Gravity,' and 'Everything Everywhere All at Once' claimed distinction as the most award-winning movies. Alfonso Cuarón emerged as the most award-winning individual of recent times (2010 - 2023).

* Finally, we concluded that the 1940s emerged as the most award-winning decade, with 170 distinct films clinching awards across various categories, encapsulating the rich history and enduring legacy of cinematic excellence celebrated by the Oscars over the years.